APIs & Integrations

mitani
Member

Sync Primary Objects and Associations on a nightly basis...

We do machine learning on CRM data and are looking at how to keep our data lake in sync with our customers' HubSpot data.

 

We are planning to sync primary objects like Contacts, Companies and Deals using the Search API to filter based on records that were updated since the previous sync. 
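As a rough sketch of the incremental filter described above, the request body for the CRM v3 Search API (POST /crm/v3/objects/{objectType}/search) might be built like this. The helper name build_search_body and the property-name mapping are assumptions for illustration: contacts are assumed to expose lastmodifieddate, while companies and deals are assumed to expose hs_lastmodifieddate. The default page size of 100 is also an assumption. No request is sent here; the function only builds the JSON payload.

```python
from datetime import datetime, timezone

# Assumption: last-modified property names per object type.
LAST_MODIFIED_PROPERTY = {
    "contacts": "lastmodifieddate",
    "companies": "hs_lastmodifieddate",
    "deals": "hs_lastmodifieddate",
}


def build_search_body(object_type, since, after=None, limit=100):
    """Build a JSON body for POST /crm/v3/objects/{object_type}/search.

    `since` is a timezone-aware datetime marking the previous sync.
    `after` is the paging cursor returned by the previous page, if any.
    """
    prop = LAST_MODIFIED_PROPERTY[object_type]
    since_ms = int(since.timestamp() * 1000)  # epoch milliseconds
    body = {
        "filterGroups": [
            {
                "filters": [
                    {"propertyName": prop, "operator": "GTE", "value": str(since_ms)}
                ]
            }
        ],
        # Sort oldest-first so a sync can resume from the last record seen.
        "sorts": [{"propertyName": prop, "direction": "ASCENDING"}],
        "limit": limit,
    }
    if after is not None:
        body["after"] = after
    return body


if __name__ == "__main__":
    last_sync = datetime(2021, 1, 1, tzinfo=timezone.utc)
    print(build_search_body("contacts", last_sync))
```

Sorting by the same property that is filtered on means that, if a result cap is hit, the sync can restart the search from the last timestamp it saw instead of paging further.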

 

But it's also important for us to load the relationships for each of the records. Would we need to resync all the associations on a nightly basis using

https://developers.hubspot.com/docs/api/crm/associations

 

or is there something we are missing that would allow us to filter for only the associations that were modified, created, or deleted since the last sync?

HenryCipolla
Member


Hi Mitani,

 

At Demand Sage we do the same thing. We sync all the objects nightly, and unfortunately we didn't find a public solution to this. We ended up redownloading all objects every night. (There are "most recently updated" filters for a few objects like Deals and Contacts, but those return a limited number of rows, so data gets lost.) And of course this is really painful with associations because there isn't a bulk associations API.

 

One thing that helped was that the legacy Deals API has an option to return associated companies and contacts. Between that and the other properties on contacts and companies, we are able to reconstruct the associations nightly without having to rely on a separate API.

 

-- Henry

www.demandsage.com

mitani
Member


Hi Henry,

 

Thank you for the detailed response. Did you take a look at the Search API and filtering by lastmodifieddate for the primary objects? It looks like we will go a similar route to yours and resync all the relationships on a nightly basis.

 

-- Majed 

HenryCipolla
Member


Which Search API are you referring to? Most of the APIs we looked at had a record limit, so searching or filtering didn't actually solve the problem.

zwolfson
HubSpot Employee

Hey @mitani and @HenryCipolla ,

 

You both have put together some good ideas on how to do a bi-directional sync with HubSpot. While it feels like you might have a workable solution, I figured I would lay out an approach you might take given the goal you outlined.

 

  • For an initial sync, the most straightforward path will be the CRM v3 Object APIs. These let you get all records for each object type, with properties, 100 records at a time. You can also request to include associated records in the response.
  • For updates, webhooks are the best option for most use cases. You can subscribe to learn about object creates, deletes and property changes. This will let you stay up to date with most things.
  • The biggest notable exception to this is association changes. You will have to use the batch associations API to get all associations for all your known objects and compare them against what you have stored.
  • For any other updates not covered by webhooks or associations, you can batch get known records to get the current "state of the world" for them and update your database accordingly.
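The "compare differences to what you have stored" step for associations could be sketched as a plain set difference. This is only an illustration: the function names are hypothetical, and the response shape assumed by flatten_batch_response (a "results" list whose items have a "from" object and a "to" list of objects with "id") is an assumption about the batch associations read response, not something confirmed in this thread.

```python
def flatten_batch_response(response):
    """Turn an (assumed) batch associations response into (from_id, to_id) pairs."""
    pairs = set()
    for result in response.get("results", []):
        from_id = result["from"]["id"]
        for target in result.get("to", []):
            pairs.add((from_id, target["id"]))
    return pairs


def diff_associations(stored, fetched):
    """Compare stored pairs with freshly fetched pairs.

    Returns (created, deleted): pairs that are new in HubSpot, and pairs
    that exist in the local store but no longer exist in HubSpot.
    """
    stored_set, fetched_set = set(stored), set(fetched)
    return fetched_set - stored_set, stored_set - fetched_set


if __name__ == "__main__":
    stored = {("1", "10"), ("1", "11"), ("2", "20")}
    response = {
        "results": [
            {"from": {"id": "1"}, "to": [{"id": "10"}, {"id": "12"}]},
            {"from": {"id": "2"}, "to": [{"id": "20"}]},
        ]
    }
    created, deleted = diff_associations(stored, flatten_batch_response(response))
    print("created:", created)
    print("deleted:", deleted)
```

The diff itself is cheap; the cost of this approach is in fetching associations for every known record each night, which is the pain point raised earlier in the thread.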

Hopefully that helps. If you have any suggestions on how we can improve our API to better support your use case, don't hesitate to leave them at ideas.hubspot.com

 

Thanks,

Zack

HenryCipolla
Member


Hi Zack, thanks for the feedback!

 

We didn't use the new v3 API because 100 records at a time can be more limiting for some objects than the legacy API (Deals, for example, has a limit of 250), and I also believe the rate limiting is more restrictive than in the legacy API?

 

Webhooks are a much larger engineering burden because of the need to worry about uptime, spikes, etc. We are also syncing Engagements and other objects that I don't believe are supported by the webhooks endpoints yet, and they seem to require a full sync every night (because we didn't see a way to get the most recent updates if there are more than 10K, which there often are).

 

What did you mean by batch get known records to get the current state?

zwolfson
HubSpot Employee

Hey @HenryCipolla, I see what you mean. You are correct: engagements are not currently supported by webhooks, so those require a full sync.

 

By batch get, I mean the endpoints in the CRM v3 API that look like POST /crm/v3/objects/{objectType}/batch/read. They take an array of IDs and return the set of objects you requested. That way, if you have a set of records you suspect changed, you can check just those specifically instead of scrolling through all records to find specific ones.
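A minimal sketch of preparing those batch read calls, assuming each request accepts up to 100 IDs and a body of the form {"properties": [...], "inputs": [{"id": ...}]}. The helper name batch_read_bodies and the batch size are assumptions for illustration, and the code only builds the payloads rather than sending them.

```python
def batch_read_bodies(record_ids, properties, batch_size=100):
    """Yield JSON bodies for POST /crm/v3/objects/{objectType}/batch/read.

    Splits the suspected-changed IDs into chunks of at most `batch_size`.
    The 100-per-request default is an assumption; check the current limit.
    """
    ids = list(record_ids)
    for start in range(0, len(ids), batch_size):
        chunk = ids[start:start + batch_size]
        yield {
            "properties": list(properties),
            "inputs": [{"id": str(record_id)} for record_id in chunk],
        }


if __name__ == "__main__":
    suspected = range(1, 251)  # 250 hypothetical deal IDs
    for body in batch_read_bodies(suspected, ["dealname", "amount"]):
        print(len(body["inputs"]), "ids in this request")
```

Each yielded body would then be sent as one POST, so 250 suspected records cost three requests instead of a full scroll through the object.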
