Contacts Query by Recently Modified: Not All Results Returned

Tim_Munro
Participant | Elite Partner

With regard to this endpoint:

GET /contacts/v1/lists/recently_updated/contacts/recent

 

In some situations we use this to collect details of Contacts created / modified since the last time a full sync ran. The logic used is typically:

  1. Note the time of this sync (START_TIME)
  2. Look up the time of the last successful sync (LAST_START_TIME i.e. START_TIME of last sync that ran to completion)
  3. Use the above endpoint to query for all Contacts created / modified since LAST_START_TIME minus 5 minutes - the 5 minutes allows for potential time skew between the server running the integration and HubSpot servers
  4. Save START_TIME from (1) to use as the starting time for the next sync
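
The window calculation in steps (2) and (3) can be sketched as follows. This is only an illustration of the arithmetic, not the integration's actual code: the function name `query_cutoff_ms`, the constant `SKEW_ALLOWANCE`, and the choice to return 0 when no sync has completed yet are all assumptions. How the cutoff is then applied to the endpoint's results is not shown.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical names; the 5-minute allowance is the value described in step (3).
SKEW_ALLOWANCE = timedelta(minutes=5)
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def query_cutoff_ms(last_start_time: Optional[datetime],
                    skew: timedelta = SKEW_ALLOWANCE) -> int:
    """Return the earliest modification time (epoch milliseconds) to sync from.

    last_start_time is LAST_START_TIME as a timezone-aware datetime, or None
    if no sync has completed yet (assumption: sync everything in that case).
    """
    if last_start_time is None:
        return 0
    cutoff = last_start_time - skew
    # Integer division of timedeltas avoids float rounding on the timestamp.
    return (cutoff - EPOCH) // timedelta(milliseconds=1)


# START_TIME for this run, saved for the next sync only if this run completes.
start_time = datetime.now(timezone.utc)
```

A larger `skew` widens the overlap between consecutive syncs, at the cost of re-processing more Contacts that were already synced.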

Occasionally we find that some Contact record changes are not detected by the sync: we can see that a Contact has been created/modified, but the above query "missed" passing this Contact to the sync process. It is safe to assume that the server that runs the integration has an accurate clock (i.e. START_TIME will be accurate to +/- 1 second).

  • Is there a flaw in the logic as outlined above?
  • This seems to happen in larger HubSpot accounts during times when a large number of data changes are happening, e.g. an import of thousands of Contacts that triggers lots of list / workflow updates. Consequently we are wondering whether HubSpot's internal engine may e.g. start processing a batch at START_TIME minus 1 minute, take 10 minutes to process that batch of changes, and then, at START_TIME + 9 minutes, commit ALL changed Contacts, with some of them having a last-modified timestamp that now falls outside the 5 minute window allowed for above. So: can HubSpot sometimes apply changes to a Contact and set the last-modified timestamp of that change to a time that is more than 5 minutes behind "current" wall clock time?

Thanks!

 

5 Replies
Willson
HubSpot Employee

Hey @Tim_Munro 

 

This is a great question. I am working with our team internally to get a better idea of the timings around this for you and will get back to you as soon as I know more.

 

Thanks!

Matthew Willson

HubSpot Developer Support
Tim_Munro
Participant | Elite Partner

Hey @Willson, any feedback on this? We are still experiencing this here. Thanks, Tim

Tim_Munro
Participant | Elite Partner

We spent some hours analysing this and noticed that when querying multiple pages of data from this endpoint the HubSpot API sometimes returns the following:

{
  "contacts": [],
  "has-more": true,
  "vid-offset": 0,
  "time-offset": 0
}

 

This is in response to a valid "next page" request sent by us (i.e. our request will include the count, vidOffset and timeOffset as returned from the prior page). This - seems - to happen if the 'last' HubSpot record from the prior page is updated in the time between us receiving the last page and requesting the next page.

 

For example, if the prior page returns this:

{
  "contacts": [...],
  "has-more": true,
  "vid-offset": 7578301,
  "time-offset": 1596758710496
}

If, before we request the next page, the contact with ID 7578301 is updated in HubSpot, the blank response above is returned.

 

If that is the case, I guess the question is whether there is a recommended approach for handling this scenario?

 

Our logic was treating the response that has no contacts, no vid-offset and no time-offset as though HubSpot was reporting 'no more data' when in reality there is more data to process. 
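
One way to avoid that misreading is to decide "no more data" from the has-more flag alone and to treat an empty page with has-more set to true as a separate case. The sketch below assumes the response has already been parsed into a dict with the keys shown in the samples above; the function names `is_last_page` and `is_empty_but_more` are hypothetical, and the check reflects only the behaviour observed in this thread, not documented API semantics.

```python
from typing import Any, Dict


def is_last_page(page: Dict[str, Any]) -> bool:
    """True only when HubSpot itself reports that nothing further is available."""
    return not page.get("has-more", False)


def is_empty_but_more(page: Dict[str, Any]) -> bool:
    """True for the problem response: no contacts returned, yet has-more is true.

    A page like this carries no usable offsets (both are 0 in the sample),
    so it must not be treated as the end of the data.
    """
    return bool(page.get("has-more")) and not page.get("contacts")


# The two response shapes quoted in this post.
normal_page = {"contacts": [{"vid": 7578301}], "has-more": True,
               "vid-offset": 7578301, "time-offset": 1596758710496}
blank_page = {"contacts": [], "has-more": True,
              "vid-offset": 0, "time-offset": 0}
```

With these two checks a paging loop can stop only when `is_last_page` is true, and branch into some recovery step (a retry, or a restart from an earlier offset) when `is_empty_but_more` is true.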

 

 

Willson
HubSpot Employee

Hey @Tim_Munro,

 

Thanks for your patience here, I have been out of office recently!

 

Jumping back into this one and reviewing what I've managed to find so far, I believe that when you're making your calls, you're likely hitting a "race condition".

 

A recommendation for getting around this at the moment is rewinding the starting offset to an ID that's several seconds older than the largest ID you've encountered so far.

 

I hope this helps!

Matthew Willson

HubSpot Developer Support
Tim_Munro
Participant | Elite Partner

Thanks @Willson - you mean set the timeOffset to a few seconds older than the first timeOffset and leave the vidOffset blank?

 

So if the first page of data from HubSpot looked like this:

{
  "contacts": [...],
  "has-more": true,
  "vid-offset": 7578301,
  "time-offset": 1596758710496
}

and a later response from HubSpot looks like this:

{
  "contacts": [],
  "has-more": true,
  "vid-offset": 0,
  "time-offset": 0
}

then I believe the next request we send should have timeOffset=1596758700000 and leave the vidOffset blank.
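
The arithmetic behind that proposal can be sketched as below. Note that the thread does not confirm this is the approach HubSpot intends. The function name `rewound_params`, the default rewind of 10 seconds, and the default `count` of 100 are assumptions for illustration; the parameter names `count` and `timeOffset` are the ones used in our paging requests.

```python
from typing import Dict


def rewound_params(last_time_offset_ms: int,
                   rewind_ms: int = 10_000,
                   count: int = 100) -> Dict[str, int]:
    """Build query parameters for a retry after a blank page.

    last_time_offset_ms is the last non-zero time-offset received from
    HubSpot. The result moves it rewind_ms milliseconds earlier and
    deliberately omits vidOffset, as proposed in this post.
    """
    if last_time_offset_ms <= 0:
        raise ValueError("need a non-zero time-offset from an earlier page")
    return {
        "count": count,
        "timeOffset": max(last_time_offset_ms - rewind_ms, 0),
    }


# With the sample above: 1596758710496 - 10496 = 1596758700000
params = rewound_params(1596758710496, rewind_ms=10_496)
```

Because the retried request restarts from a different point than the one HubSpot returned, some Contacts may be returned twice, so the sync would probably need to de-duplicate by Contact ID.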
