CRM Search API Endpoint - pulling more than 10K records

OKulka
Member

Hi there,

 

I'm developing a script to pull Contacts from the CRM Search API into a database so that I can work with these records further. These are not all of our contacts but a specific subset that we identify via filters.

What I've realised is that once pagination hits 10K records the requests stop working, and from reading the documentation and other posts, this seems to be a limitation of the endpoint.

 

What I still cannot figure out is what the workaround for this limitation is. We have well over 10K contacts, and what I've read so far points to two steps:

- Create a manual extraction of the 'first' 10K records as a one-off

- Alter the script to include a 'filter' that compares against the latest timestamp pulled ('createdate', 'hs_lastmodifieddate', etc.)

This is an example snippet of the payload:

 

 

payload = {
    "filterGroups": [
        {
            "filters": [
                {
                    "propertyName": "XXXXXXXXX",
                    "operator": "EQ",
                    "value": "PLACEHOLDER"
                },
                {
                    "propertyName": "createdate",
                    "operator": "LT",
                    "value": "1645484400000"
                }
            ]
        }
    ],
    "sorts": [
        {
            "propertyName": "createdate",
            "direction": "ASCENDING"
        }
    ],
    "limit": 100,
    "properties": ["x", "y", "z"]
}

 

 

 

As you can see from the second filter, I try to narrow the returned records to a specific time frame using 'createdate' as the filtering property, but I'm not sure this is the right way to go. I've read one post suggesting 'createdate' and another suggesting 'hs_lastmodifieddate'.
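For anyone reproducing the 'createdate' filter: the value in the payload above is a Unix timestamp in milliseconds. A minimal sketch of building it from a Python datetime follows; the helper name to_epoch_ms is made up for illustration and is not part of any HubSpot library.

```python
from datetime import datetime, timezone


def to_epoch_ms(moment: datetime) -> str:
    """Return a timezone-aware datetime as an epoch-milliseconds string."""
    if moment.tzinfo is None:
        # A naive datetime would be read as local time, which shifts the cutoff.
        raise ValueError("use a timezone-aware datetime")
    return str(round(moment.timestamp() * 1000))


# 2022-02-21 23:00 UTC is the instant used in the payload above.
cutoff = to_epoch_ms(datetime(2022, 2, 21, 23, 0, tzinfo=timezone.utc))

createdate_filter = {
    "propertyName": "createdate",
    "operator": "LT",
    "value": cutoff,
}
```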


1. What is the most reliable way to capture the last record pulled and then send it back in the following call, so that we do not hit the limit again?
2. Are we using the right endpoint? Perhaps a different endpoint would produce better results?


2 Replies
RMones
Contributor

Hi @OKulka ,

 

If you are using 'limit', you can add the 'after' parameter to your request.
See Paging through results at the following URL: https://developers.hubspot.com/docs/api/crm/search
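
As a rough sketch of that paging loop (not an official client): post_search below is a hypothetical callable that POSTs the payload to the search endpoint and returns the decoded JSON body. The loop assumes the response carries the records under "results" and the next cursor under "paging" > "next" > "after".

```python
from typing import Callable, Iterator


def iter_search_results(
    post_search: Callable[[dict], dict], payload: dict
) -> Iterator[dict]:
    """Yield every record of one search, following the 'after' cursor."""
    body = dict(payload)  # shallow copy so the caller's payload is not modified
    while True:
        page = post_search(body)
        yield from page.get("results", [])
        after = page.get("paging", {}).get("next", {}).get("after")
        if after is None:
            break  # no further page was offered
        body["after"] = after
```

On its own this loop still stops at the 10K cap discussed in this thread; it only covers the paging part.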


Is this what you are looking for?

 

Kind Regards,

Ronald

OKulka
Member

Hi @RMones 

Thanks for the reply. 

I am using the 'after' parameter in the payload to paginate through the contacts, 100 records at a time. However, since the endpoint is limited to 10K records and we have more than that, I am basically missing the records from 10,001 onwards.

 

As per previous posts in the forum, it seems that the workaround is to:
- do a one-time full extract and push it to the destination

- save the last known timestamp ('hs_lastmodifieddate' or 'createdate') and then add that into the payload once the script hits the threshold again. How to go about that is the question: which filters should be used, and what would be the sequence of steps to reach that solution?

Have you had any need or success doing that?
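
One possible sequence for the timestamp idea above, written as a hedged sketch rather than a confirmed HubSpot recipe: sort ascending by 'createdate', page with 'after' until the next page would cross the cap, then restart with a 'createdate' GTE filter set to the last timestamp seen. Assumptions: post_search is a hypothetical callable that POSTs the payload and returns the decoded JSON; 'createdate' comes back in the record's "properties" as an ISO 8601 string; the cap is 10,000 results per query as described in this thread.

```python
from datetime import datetime
from typing import Callable, Iterator, List


def iso_to_epoch_ms(value: str) -> int:
    """Convert e.g. '2022-02-21T23:00:00.000Z' to epoch milliseconds."""
    moment = datetime.fromisoformat(value.replace("Z", "+00:00"))
    return round(moment.timestamp() * 1000)


def iter_all_contacts(
    post_search: Callable[[dict], dict],
    base_filters: List[dict],
    properties: List[str],
    page_size: int = 100,
    window: int = 10_000,  # assumed per-query cap
) -> Iterator[dict]:
    cursor_ms = 0   # lower bound on createdate for the current window
    seen = set()    # ids already yielded; GTE re-reads the boundary records
    while True:
        payload = {
            "filterGroups": [{"filters": base_filters + [{
                "propertyName": "createdate",
                "operator": "GTE",
                "value": str(cursor_ms),
            }]}],
            "sorts": [{"propertyName": "createdate", "direction": "ASCENDING"}],
            "limit": page_size,
            "properties": list(properties) + ["createdate"],
        }
        fetched, last_ms = 0, cursor_ms
        while True:
            page = post_search(payload)
            results = page.get("results", [])
            for record in results:
                last_ms = iso_to_epoch_ms(record["properties"]["createdate"])
                if record["id"] not in seen:
                    seen.add(record["id"])
                    yield record
            fetched += len(results)
            after = page.get("paging", {}).get("next", {}).get("after")
            if after is None:
                return  # last window fully read
            if fetched + page_size > window:
                break   # the next page would cross the cap; open a new window
            payload["after"] = after
        if last_ms == cursor_ms:
            # A full window of records sharing one timestamp cannot be split this way.
            raise RuntimeError("no progress: too many records share one createdate")
        cursor_ms = last_ms
```

The seen set keeps every id in memory, which is fine for a sketch but may need replacing (for example with an upsert on the database side) for very large lists. The same pattern should work with 'hs_lastmodifieddate' for incremental runs, with the caveat that records modified during the run can move between windows.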
