
JHowardAdapt
Member

Need to merge 100k companies

I have written a script that identifies duplicate companies and sets one of each set as the primary account. After that script has run, I have another that merges each duplicate into the primary account.

 

The script works fine on my local machine, but there are more than 100k companies to merge and each merge requires its own API call. It looks like the script would have to run for over a week solid to complete. If I make the calls synchronously, it hits the API rate limit.
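For a rough sense of scale, the runtime of sequential calls is just the number of calls times the average time per call. A minimal sketch, where `estimateHours` is a hypothetical helper and the per-call durations are assumed values rather than measurements:

```javascript
// Rough runtime estimate for sequential API calls.
// secondsPerCall is an assumed average, not a measured value.
function estimateHours(callCount, secondsPerCall) {
  return (callCount * secondsPerCall) / 3600;
}

console.log(estimateHours(100000, 1).toFixed(1)); // 27.8 (hours, just over a day)
console.log(estimateHours(100000, 6).toFixed(1)); // 166.7 (hours, just under a week)
```

By this arithmetic, a run of about a week would imply roughly 6 seconds per merge on average.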

 

The other option was to put the code in a workflow and enroll each company. This also worked for a single test, but when run on all companies it also hits the API rate limit straight away.

 

Could anybody offer a better approach?

 

Thanks

Johnson

Jaycee_Lewis
Community Manager

Hey, @JHowardAdapt 👋 Can you share any additional details with the community? The specific endpoint and an example request and response are all great places to start.

 

As far as the API limits, I do not have any workarounds to offer. Here is the documentation on those limits in case it is helpful for you or anyone else who comes across this thread:

We'll leave this here to see if the community has any additional thoughts on your question. 

 

I hope this is helpful in getting you moving.

 

Have fun coding! — Jaycee


Jaycee Lewis

Developer Community Manager

Community | HubSpot

JHowardAdapt
Member


So basically I have an array of JSON objects describing duplicated companies. Each one has a "primaryId" field and a list of ids that need to be merged into that primary id. My script loops through these objects and merges each id from the list into the primary id. There can be dozens of ids in the list per object, and in total there will need to be more than 100K merges to completely dedupe our HubSpot.
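To make the data shape concrete, here is a small sketch of how such objects could be flattened into one entry per merge call. Only "primaryId" comes from the description above; the `idsToMerge` field name, the sample ids, and the `toMergePairs` helper are assumptions for illustration.

```javascript
// Hypothetical sample data: "primaryId" is from the post,
// "idsToMerge" is an assumed name for the list of duplicate ids.
const duplicateGroups = [
  { primaryId: "101", idsToMerge: ["102", "103"] },
  { primaryId: "200", idsToMerge: ["201"] },
];

// Flatten the groups into one { primaryId, toMergeId } entry per merge call.
function toMergePairs(groups) {
  return groups.flatMap((group) =>
    group.idsToMerge.map((toMergeId) => ({
      primaryId: group.primaryId,
      toMergeId,
    }))
  );
}

const pairs = toMergePairs(duplicateGroups);
console.log(pairs.length); // 3
```

A flat list like this makes it easy to count the total number of calls up front and to resume from a known position if the run is interrupted.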

 

Inside the loop over each object's list of ids, I call the merge endpoint like so:

    // primaryId and toMergeId come from the surrounding loop.
    const requestOptions = {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        redirect: 'follow',
        body: JSON.stringify({
            primaryObjectId: primaryId,
            objectIdToMerge: toMergeId
        })
    };

    await fetch(
        `https://api.hubapi.com/crm/v3/objects/companies/merge?hapikey=<APIKEY>`,
        requestOptions
    );

 

This works fine on my local machine, but because of the API limits I cannot make too many calls in quick succession. Given the number of calls that need to be made (100K), if each takes a second it will take over a day to complete.
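One way to stay under a rate limit is to pace the calls and back off when the API answers with HTTP 429 (Too Many Requests). The following is only a sketch under assumptions: `runPaced` and `mergeFn` are hypothetical names, `mergeFn` is expected to perform the POST for one `{ primaryId, toMergeId }` pair and resolve to an object with a numeric `status` (such as a fetch Response), and the 150 ms default interval is an arbitrary placeholder rather than a documented limit.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Runs one merge at a time, waiting minIntervalMs between calls.
// On a 429 response the same pair is retried, up to maxRetries times,
// with an exponentially growing wait before each retry.
async function runPaced(pairs, mergeFn, { minIntervalMs = 150, maxRetries = 5 } = {}) {
  const failed = [];
  for (const pair of pairs) {
    let attempt = 0;
    let status;
    do {
      const res = await mergeFn(pair);
      status = res.status;
      if (status !== 429) break;
      // Rate limited: wait 1x, 2x, 4x ... the base interval, then try again.
      await sleep(minIntervalMs * 2 ** attempt);
      attempt += 1;
    } while (attempt <= maxRetries);
    // Collect anything that did not end in a 2xx status so it can be re-run later.
    if (status < 200 || status >= 300) failed.push({ ...pair, status });
    await sleep(minIntervalMs);
  }
  return failed;
}
```

The returned list of failed pairs can be written to disk and fed back in on a later run, which helps if the script has to run unattended for a long time.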

 

If this is the only way, I will stick it on a separate server and let it run, but I just wanted to check with the community in case there is an obvious, more efficient solution.
