<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Duplicate Companies in APIs &amp; Integrations</title>
    <link>https://community.hubspot.com/t5/APIs-Integrations/Duplicate-Companies/m-p/380201#M37386</link>
    <description>&lt;P&gt;&lt;SPAN&gt;When I make a call via Python to get all my companies, I only get 250 companies. If I set the max to 500, I get 500, but HubSpot duplicates the companies, so after removing the duplicates that is still only 250. If I set the max to 1000, I get 1000, but HubSpot quadruples the values (meaning I'm still only getting 250 unique values). I should have 340 companies. Any ideas why HubSpot would do this? It seems to not be paginating properly or something.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;import requests
import json
import urllib.parse

# Maximum number of results to cycle through
max_results = 1000
hapikey = 'APIkey'
# Specify the number of companies to return in the API call (default is 100, max is 250)
limit = 250
company_list = []
# Set the properties to retrieve in the URL (&amp;amp;properties=prop_name)
get_all_companies_url = "https://api.hubapi.com/companies/v2/companies/paged?&amp;amp;properties=hs_object_id&amp;amp;"
parameter_dict = {'hapikey': hapikey, 'limit': limit}
headers = {}

# Paginate your request using offset
has_more = True
# has_more lets the program know if there are more companies to cycle through
while has_more:
	parameters = urllib.parse.urlencode(parameter_dict)
	get_url = get_all_companies_url + parameters
	r = requests.get(url= get_url, headers = headers)
	response_dict = json.loads(r.text)
	has_more = response_dict.get('has-more') or False
	company_list.extend(response_dict.get('companies') or [])
	parameter_dict['Offset']= response_dict.get('offset')
	# Exit pagination based on whatever value you've set max_results to
	if len(company_list) &amp;gt;= max_results:
		print('maximum number of results exceeded')
		break
print('loop finished')

list_length = len(company_list)

print("You've successfully parsed through {} company records and added them to a list".format(list_length))&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;I updated the post with my current code. Maybe it'll help.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;&lt;a href="https://community.hubspot.com/t5/user/viewprofilepage/user-id/13982"&gt;@dennisedson&lt;/a&gt;&amp;nbsp;I tagged you since you seem to eventually find and reply to all of my posts anyway, regardless of your Python expertise. &lt;span class="lia-unicode-emoji" title=":grinning_face_with_smiling_eyes:"&gt;😄&lt;/span&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Tue, 20 Oct 2020 15:55:23 GMT</pubDate>
    <dc:creator>taran42</dc:creator>
    <dc:date>2020-10-20T15:55:23Z</dc:date>
    <item>
      <title>Duplicate Companies</title>
      <link>https://community.hubspot.com/t5/APIs-Integrations/Duplicate-Companies/m-p/380201#M37386</link>
      <description>&lt;P&gt;&lt;SPAN&gt;When I make a call via Python to get all my companies, I only get 250 companies. If I set the max to 500, I get 500, but HubSpot duplicates the companies, so after removing the duplicates that is still only 250. If I set the max to 1000, I get 1000, but HubSpot quadruples the values (meaning I'm still only getting 250 unique values). I should have 340 companies. Any ideas why HubSpot would do this? It seems to not be paginating properly or something.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;import requests
import json
import urllib.parse

# Maximum number of results to cycle through
max_results = 1000
hapikey = 'APIkey'
# Specify the number of companies to return in the API call (default is 100, max is 250)
limit = 250
company_list = []
# Set the properties to retrieve in the URL (&amp;amp;properties=prop_name)
get_all_companies_url = "https://api.hubapi.com/companies/v2/companies/paged?&amp;amp;properties=hs_object_id&amp;amp;"
parameter_dict = {'hapikey': hapikey, 'limit': limit}
headers = {}

# Paginate your request using offset
has_more = True
# has_more lets the program know if there are more companies to cycle through
while has_more:
	parameters = urllib.parse.urlencode(parameter_dict)
	get_url = get_all_companies_url + parameters
	r = requests.get(url= get_url, headers = headers)
	response_dict = json.loads(r.text)
	has_more = response_dict.get('has-more') or False
	company_list.extend(response_dict.get('companies') or [])
	parameter_dict['Offset']= response_dict.get('offset')
	# Exit pagination based on whatever value you've set max_results to
	if len(company_list) &amp;gt;= max_results:
		print('maximum number of results exceeded')
		break
print('loop finished')

list_length = len(company_list)

print("You've successfully parsed through {} company records and added them to a list".format(list_length))&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;I updated the post with my current code. Maybe it'll help.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;&lt;a href="https://community.hubspot.com/t5/user/viewprofilepage/user-id/13982"&gt;@dennisedson&lt;/a&gt;&amp;nbsp;I tagged you since you seem to eventually find and reply to all of my posts anyway, regardless of your Python expertise. &lt;span class="lia-unicode-emoji" title=":grinning_face_with_smiling_eyes:"&gt;😄&lt;/span&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 20 Oct 2020 15:55:23 GMT</pubDate>
      <guid>https://community.hubspot.com/t5/APIs-Integrations/Duplicate-Companies/m-p/380201#M37386</guid>
      <dc:creator>taran42</dc:creator>
      <dc:date>2020-10-20T15:55:23Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate Companies</title>
      <link>https://community.hubspot.com/t5/APIs-Integrations/Duplicate-Companies/m-p/380347#M37403</link>
      <description>&lt;P&gt;&lt;a href="https://community.hubspot.com/t5/user/viewprofilepage/user-id/144301"&gt;@taran42&lt;/a&gt; , my old friend!&amp;nbsp; &lt;/P&gt;
&lt;P&gt;Is there a way to print out the second iteration of the loop to see what the offset is?&lt;/P&gt;
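&lt;P&gt;Something along these lines might help with that, though take it with a grain of salt since Python is not my strong suit. It is only a sketch: offset_in_url is a made-up helper name, not part of any HubSpot library, and it assumes the paging parameter is spelled offset in lowercase, which is how I read the Companies docs. It pulls the offset back out of whatever URL the loop is about to request, using only the standard library.&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;
from urllib.parse import urlparse, parse_qs


def offset_in_url(url):
    """Return the value of the 'offset' query parameter in url, or None.

    Hypothetical debugging helper. The lowercase parameter name 'offset'
    is an assumption; the lookup is case-sensitive.
    """
    query = parse_qs(urlparse(url).query)
    values = query.get('offset')
    # parse_qs maps each parameter name to a list of values
    return values[0] if values else None
&lt;/LI-CODE&gt;
&lt;P&gt;If you print offset_in_url(get_url) right before the requests.get call, the second pass through the loop should show the offset that came back in the first response. If it prints None instead, the offset is not making it into the request.&lt;/P&gt;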
&lt;P&gt;Have you tried to use something like Postman to test that request?&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I see what you are doing.&amp;nbsp; You are forcing me to learn Python.&amp;nbsp; Sneaky.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Calling my Python crew &lt;a href="https://community.hubspot.com/t5/user/viewprofilepage/user-id/95539"&gt;@akaiser&lt;/a&gt; , &lt;a href="https://community.hubspot.com/t5/user/viewprofilepage/user-id/86141"&gt;@khookguy&lt;/a&gt; , &lt;a href="https://community.hubspot.com/t5/user/viewprofilepage/user-id/146466"&gt;@wfong&lt;/a&gt;!&amp;nbsp; I am beginning to think that we need to set up a Python room &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt; &lt;/P&gt;
&lt;P&gt;Do you guys have any idea what is going wrong here?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks all!!&lt;/P&gt;</description>
      <pubDate>Tue, 20 Oct 2020 20:15:38 GMT</pubDate>
      <guid>https://community.hubspot.com/t5/APIs-Integrations/Duplicate-Companies/m-p/380347#M37403</guid>
      <dc:creator>dennisedson</dc:creator>
      <dc:date>2020-10-20T20:15:38Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate Companies</title>
      <link>https://community.hubspot.com/t5/APIs-Integrations/Duplicate-Companies/m-p/380370#M37406</link>
      <description>&lt;P&gt;&lt;a href="https://community.hubspot.com/t5/user/viewprofilepage/user-id/13982"&gt;@dennisedson&lt;/a&gt;&amp;nbsp;if you learned Python, would that be so bad? &lt;span class="lia-unicode-emoji" title=":grinning_face_with_smiling_eyes:"&gt;😄&lt;/span&gt;&amp;nbsp; I'm actually fairly new to Python myself. I've only been using it for the last couple of months. The problem is I was just thrown in at the deep end with it, without a chance to learn it. I come from a web development background, so I have some coding experience, but not with stuff like this.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have no problem with the Contact code, and it's set up almost exactly like Companies and Deals. How many contacts you parse through, whether they're paginated or not, is shown at the top of the prompt when you run the code. So for Contacts, it lists all 2k+ I have. With Companies and Deals, it's listing whatever I set my Max Results to.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;So just twenty minutes or so ago, a friend helped me to figure out how to get the Companies to return properly. He noted that the &lt;FONT color="#FF0000"&gt;offset&lt;/FONT&gt; value isn't defined. I added &lt;FONT color="#FF0000"&gt;offset=250&lt;/FONT&gt; under the &lt;FONT color="#FF0000"&gt;limit&lt;/FONT&gt; and I was able to get all 340 companies. However, that same fix did not work for Deals. So I'm halfway there.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have no idea why Contacts, Companies and Deals work so differently from one another when they're basically the same thing. It's like three different designers coded the API.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I use Visual Studio Code to look for errors, which works much like Postman does. I usually code in Notepad++ though.&lt;/P&gt;</description>
      <pubDate>Tue, 20 Oct 2020 21:06:55 GMT</pubDate>
      <guid>https://community.hubspot.com/t5/APIs-Integrations/Duplicate-Companies/m-p/380370#M37406</guid>
      <dc:creator>taran42</dc:creator>
      <dc:date>2020-10-20T21:06:55Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate Companies</title>
      <link>https://community.hubspot.com/t5/APIs-Integrations/Duplicate-Companies/m-p/380412#M37410</link>
      <description>&lt;P&gt;The same friend helped me with Deals. It turns out that the offset needed to be set on it as well (which I did) and that the has-more flag should be read as &lt;FONT color="#FF0000"&gt;hasMore&lt;/FONT&gt; instead of &lt;FONT color="#FF0000"&gt;has-more&lt;/FONT&gt;. The HubSpot documentation seems to be really broken, at least when it comes to the Python API calls. Everything I was using I had pulled from the documentation, but I kept running into numerous errors. Little things, like the has-more flag being spelled &lt;FONT color="#FF0000"&gt;hasMore&lt;/FONT&gt;, &lt;FONT color="#FF0000"&gt;has_more&lt;/FONT&gt; and &lt;FONT color="#FF0000"&gt;has-more&lt;FONT color="#333333"&gt; throughout Contacts, Companies and Deals. One would think they'd all be the same&lt;/FONT&gt;&lt;/FONT&gt;. Also, &lt;FONT color="#FF0000"&gt;limit&lt;/FONT&gt; and &lt;FONT color="#FF0000"&gt;count&lt;/FONT&gt; are used interchangeably. There are several other errors I struggled with while using the documentation that I got help with here and there. And being new to Python, I didn't know how to spot and fix those errors.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I know I am using V2 and HubSpot uses V3 now, but I tried V3 and still had pagination problems. So I just stuck with what I was already familiar with.&lt;/P&gt;</description>
      <pubDate>Wed, 21 Oct 2020 01:49:43 GMT</pubDate>
      <guid>https://community.hubspot.com/t5/APIs-Integrations/Duplicate-Companies/m-p/380412#M37410</guid>
      <dc:creator>taran42</dc:creator>
      <dc:date>2020-10-21T01:49:43Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate Companies</title>
      <link>https://community.hubspot.com/t5/APIs-Integrations/Duplicate-Companies/m-p/380612#M37424</link>
      <description>&lt;P&gt;&lt;a href="https://community.hubspot.com/t5/user/viewprofilepage/user-id/144301"&gt;@taran42&lt;/a&gt; , Glad you got it figured out.&amp;nbsp; The key benefit to the V3 API is to bring consistency to the APIs as you are not alone in thinking that they were built in silos. &lt;/P&gt;
&lt;P&gt;Hopefully you can figure out the issue with the pagination on the newer APIs.&amp;nbsp;&lt;/P&gt;
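&lt;P&gt;For what it is worth, here is a rough sketch of how I understand cursor paging on the newer APIs. Treat it as an assumption to check against the docs rather than the official way to do it: I am assuming each response carries a results list and, when more pages exist, a cursor under paging, next, after that gets passed back on the following request. collect_all and fetch_page are hypothetical names; fetch_page stands in for whatever function makes the actual request and returns the decoded JSON.&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;
def collect_all(fetch_page, page_size=100):
    """Collect every record from a cursor-paginated listing.

    fetch_page(limit, after) is a hypothetical callable that performs one
    request and returns the decoded JSON body as a dict. The response keys
    used here ('results', 'paging', 'next', 'after') are an assumption
    about the v3 list endpoints.
    """
    records = []
    after = None  # no cursor on the first request
    while True:
        body = fetch_page(page_size, after)
        records.extend(body.get('results') or [])
        # Cursor for the next page; absent once the last page is reached
        after = ((body.get('paging') or {}).get('next') or {}).get('after')
        if after is None:
            break
    return records
&lt;/LI-CODE&gt;
&lt;P&gt;Keeping the request in its own function means the loop can be tried against canned responses before it is pointed at a real portal.&lt;/P&gt;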
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Regardless, glad this immediate issue is resolved!&lt;/P&gt;</description>
      <pubDate>Wed, 21 Oct 2020 14:12:52 GMT</pubDate>
      <guid>https://community.hubspot.com/t5/APIs-Integrations/Duplicate-Companies/m-p/380612#M37424</guid>
      <dc:creator>dennisedson</dc:creator>
      <dc:date>2020-10-21T14:12:52Z</dc:date>
    </item>
  </channel>
</rss>

