<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Search API - Batch size not always 10,000 in APIs &amp; Integrations</title>
    <link>https://community.hubspot.com/t5/APIs-Integrations/Search-API-Batch-size-not-always-10-000/m-p/815366#M65422</link>
    <description>&lt;P&gt;We have download routines built in SQL Server Integration Services that are designed to download all records from certain objects. We do this by using the Search Endpoint and passing in a body like this:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;{"filterGroups": [&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; {&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; "filters": [&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; {&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; "value": {{User::RecordID}},&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; "propertyName": "hs_object_id",&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; "operator": "GTE"&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; }&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ]&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; }&lt;/P&gt;&lt;P&gt;&amp;nbsp; ],&lt;/P&gt;&lt;P&gt;&amp;nbsp; "sorts": [&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; {&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; "propertyName": "hs_object_id",&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; "direction": "ASCENDING"&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; }&lt;/P&gt;&lt;P&gt;&amp;nbsp; ],&lt;/P&gt;&lt;P&gt;&amp;nbsp; "properties": [&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; {{User::Properties}}&lt;/P&gt;&lt;P&gt;&amp;nbsp; ],&lt;/P&gt;&lt;P&gt;&amp;nbsp; "limit": 100,&lt;/P&gt;&lt;P&gt;&amp;nbsp; "after": &amp;lt;%after%&amp;gt;&lt;/P&gt;&lt;P&gt;}&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;We call this in a loop, as the Search Endpoint will only return 10,000 records. 
So we page through up to 10,000 records, and if the record count is &amp;lt; 10,000 we stop the process. If the record count is = 10,000, we grab the last ID from the last call and then make another call as above, which filters on hs_object_id &amp;gt;= last ID returned. So in effect, we are looping through in batches of 10,000 until we have returned all IDs in the object.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;We have identified, however, that on our Line Items package this routine was failing to download all IDs. After investigating, I have identified that the Search Endpoint does not always return 10,000 records.&lt;/P&gt;&lt;P&gt;In a test I completed yesterday on our portal regarding the Line Item object:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Batch 1 – Returned 10,000&lt;/P&gt;&lt;P&gt;Batch 2 – Returned 10,000&lt;/P&gt;&lt;P&gt;Batch 3 – Returned 10,000&lt;/P&gt;&lt;P&gt;Batch 4 – Returned 9,997&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Our routine then stopped, as the last batch was &amp;lt; 10,000.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;However, I then amended our criteria to stop only when the record count is &amp;lt; 9,900, and this behaviour was then witnessed:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Batch 1 – Returned 10,000&lt;/P&gt;&lt;P&gt;Batch 2 – Returned 10,000&lt;/P&gt;&lt;P&gt;Batch 3 – Returned 10,000&lt;/P&gt;&lt;P&gt;Batch 4 – Returned 9,997&lt;/P&gt;&lt;P&gt;Batch 5 – Returned 9,177&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;In this last test, we then returned all IDs.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;My question therefore is, why did Batch 4 return &amp;lt; 10,000 records? 
My expectation, after reading the HubSpot API documentation, is that the Search Endpoint will return 10,000 records if there are more records remaining from the query.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I need to understand this behaviour so I can best align our download routines for every object to ensure we do not miss any data. We have many internal reports configured from our downloaded data in our data warehouse, so this is of high importance to ensure we have correct reports being shared with our business.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Does anyone have any thoughts on why this endpoint behaved in this way?&lt;/P&gt;</description>
    <pubDate>Tue, 04 Jul 2023 11:56:26 GMT</pubDate>
    <dc:creator>BHarrison-Cook</dc:creator>
    <dc:date>2023-07-04T11:56:26Z</dc:date>
    <item>
      <title>Search API - Batch size not always 10,000</title>
      <link>https://community.hubspot.com/t5/APIs-Integrations/Search-API-Batch-size-not-always-10-000/m-p/815366#M65422</link>
      <description>&lt;P&gt;We have download routines built in SQL Server Integration Services that are designed to download all records from certain objects. We do this by using the Search Endpoint and passing in a body like this:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;{"filterGroups": [&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; {&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; "filters": [&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; {&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; "value": {{User::RecordID}},&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; "propertyName": "hs_object_id",&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; "operator": "GTE"&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; }&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ]&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; }&lt;/P&gt;&lt;P&gt;&amp;nbsp; ],&lt;/P&gt;&lt;P&gt;&amp;nbsp; "sorts": [&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; {&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; "propertyName": "hs_object_id",&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; "direction": "ASCENDING"&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; }&lt;/P&gt;&lt;P&gt;&amp;nbsp; ],&lt;/P&gt;&lt;P&gt;&amp;nbsp; "properties": [&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; {{User::Properties}}&lt;/P&gt;&lt;P&gt;&amp;nbsp; ],&lt;/P&gt;&lt;P&gt;&amp;nbsp; "limit": 100,&lt;/P&gt;&lt;P&gt;&amp;nbsp; "after": &amp;lt;%after%&amp;gt;&lt;/P&gt;&lt;P&gt;}&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;We call this in a loop, as the Search Endpoint will only return 10,000 records. 
So we page through up to 10,000 records, and if the record count is &amp;lt; 10,000 we stop the process. If the record count is = 10,000, we grab the last ID from the last call and then make another call as above, which filters on hs_object_id &amp;gt;= last ID returned. So in effect, we are looping through in batches of 10,000 until we have returned all IDs in the object.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;We have identified, however, that on our Line Items package this routine was failing to download all IDs. After investigating, I have identified that the Search Endpoint does not always return 10,000 records.&lt;/P&gt;&lt;P&gt;In a test I completed yesterday on our portal regarding the Line Item object:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Batch 1 – Returned 10,000&lt;/P&gt;&lt;P&gt;Batch 2 – Returned 10,000&lt;/P&gt;&lt;P&gt;Batch 3 – Returned 10,000&lt;/P&gt;&lt;P&gt;Batch 4 – Returned 9,997&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Our routine then stopped, as the last batch was &amp;lt; 10,000.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;However, I then amended our criteria to stop only when the record count is &amp;lt; 9,900, and this behaviour was then witnessed:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Batch 1 – Returned 10,000&lt;/P&gt;&lt;P&gt;Batch 2 – Returned 10,000&lt;/P&gt;&lt;P&gt;Batch 3 – Returned 10,000&lt;/P&gt;&lt;P&gt;Batch 4 – Returned 9,997&lt;/P&gt;&lt;P&gt;Batch 5 – Returned 9,177&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;In this last test, we then returned all IDs.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;My question therefore is, why did Batch 4 return &amp;lt; 10,000 records? 
My expectation, after reading the HubSpot API documentation, is that the Search Endpoint will return 10,000 records if there are more records remaining from the query.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I need to understand this behaviour so I can best align our download routines for every object to ensure we do not miss any data. We have many internal reports configured from our downloaded data in our data warehouse, so this is of high importance to ensure we have correct reports being shared with our business.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Does anyone have any thoughts on why this endpoint behaved in this way?&lt;/P&gt;</description>
      <pubDate>Tue, 04 Jul 2023 11:56:26 GMT</pubDate>
      <guid>https://community.hubspot.com/t5/APIs-Integrations/Search-API-Batch-size-not-always-10-000/m-p/815366#M65422</guid>
      <dc:creator>BHarrison-Cook</dc:creator>
      <dc:date>2023-07-04T11:56:26Z</dc:date>
    </item>
  </channel>
</rss>

