APIs & Integrations

taran42
Contributor

Duplicate Companies

When I make a call via Python to get all my companies, I only get 250 of them. If I set the max to 500, I get 500 records, but HubSpot duplicates the companies, so there are still only 250 unique ones. If I set the max to 1000, I get 1000 records, but each company appears four times (meaning I'm still only getting 250 unique values). I should have 340 companies. Any ideas why HubSpot would do this? It seems like the pagination isn't working properly.

 

import requests
import json
import urllib

# Maximum number of results to cycle through
max_results = 1000
hapikey = 'APIkey'
# Specify the number of companies to return in the API call (default is 100, max is 250)
limit = 250
company_list = []
# Set the properties to retrieve in the URL (&property=prop_name)
get_all_companies_url = "https://api.hubapi.com/companies/v2/companies/paged?&properties=hs_object_id&"
parameter_dict = {'hapikey': hapikey, 'limit': limit}
headers = {}

# Paginate your request using offset
has_more = True
# has_more lets the program know if there are more companies to cycle through
while has_more:
	parameters = urllib.parse.urlencode(parameter_dict)
	get_url = get_all_companies_url + parameters
	r = requests.get(url= get_url, headers = headers)
	response_dict = json.loads(r.text)
	has_more = response_dict.get('has-more') or False
	company_list.extend(response_dict.get('companies') or [])
	parameter_dict['Offset']= response_dict.get('offset')
	# Exit pagination based on whatever value you've set max_results to
	if len(company_list) >= max_results:
		print('maximum number of results exceeded')
		break
print('loop finished')

list_length = len(company_list)

print("You've successfully parsed through {} company records and added them to a list".format(list_length))

 

I updated the post with my current code. Maybe it'll help.

 

@dennisedson I tagged you since you seem to eventually find and reply to all of my posts anyways, regardless of your Python expertise. 😄 

2 Accepted solutions
taran42
Solution
Contributor

@dennisedson if you learned Python, would that be so bad? 😄  I'm actually fairly new to Python myself. I've only been using it for the last couple of months. The problem is I was just thrown in at the deep end with it, without a chance to learn it first. I come from a web development background, so I have some coding experience, but not with stuff like this.

 

I have no problem with the Contact code, and it's set up almost exactly like Companies and Deals. How many contacts you parse through, whether they're paginated or not, is shown at the top of the prompt when you run the code. So for Contacts, it lists all 2k+ I have. With Companies and Deals, it's listing whatever I set my Max Results to.

 

So just twenty minutes or so ago, a friend helped me to figure out how to get the Companies to return properly. He noted that the offset value isn't defined. I added offset=250 under the limit and I was able to get all 340 companies. However that same fix did not work for Deals. So I'm halfway there. 
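
A minimal sketch of offset pagination for this endpoint, under a few assumptions: the response carries `has-more` and `offset` as the code in the question expects, and the request parameter is spelled `offset` in lowercase (the question's loop stores it under `'Offset'`, which may be why every page came back the same). `fetch_all_companies` and `fetch_page` are hypothetical names; `fetch_page` stands in for whatever performs the actual `requests.get` call and returns the parsed JSON.

```python
from typing import Callable, Dict, List


def fetch_all_companies(fetch_page: Callable[[Dict], Dict],
                        limit: int = 250,
                        max_results: int = 1000) -> List[Dict]:
    """Collect companies page by page.

    fetch_page is a hypothetical callable: it receives the query
    parameters for one request and returns the parsed JSON response.
    """
    params = {'limit': limit}  # no offset on the first request
    companies = []
    while True:
        page = fetch_page(dict(params))
        companies.extend(page.get('companies') or [])
        # Stop when the API reports no further pages or the cap is reached
        if not page.get('has-more') or len(companies) >= max_results:
            break
        # Assumption: the parameter name is lowercase 'offset'
        params['offset'] = page.get('offset')
    return companies
```

In the original script, `fetch_page` could wrap `requests.get(url, params=...)` with the `hapikey` and `properties` values added to the parameters, then return `r.json()`.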

 

I have no idea why Contacts, Companies and Deals work so differently from one another when they're basically the same thing. It's like three different designers coded the API.

 

I use Visual Studio Code to look for errors, which works much like Postman does. I usually code in Notepad++ though.

taran42
Solution
Contributor

The same friend helped me with Deals. It turns out that the offset needed to be set there as well (which I did), and that the "has more" flag has to be read as hasMore instead of has-more. The HubSpot documentation seems to be really broken, at least when it comes to the Python API calls. Everything I was using I had pulled from the documentation, but I kept running into errors. Little things: the "has more" flag shows up as hasMore, has_more and has-more across Contacts, Companies and Deals. One would think they'd all be the same. Also, limit and count are used interchangeably. There were several other errors I struggled with while using the documentation that I got help with here and there. And being new to Python, I didn't know how to spot and fix them.
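
One way to cope with the differing spellings is to check all of them in one place. This is only a sketch: `has_more_pages` is a hypothetical helper, and the three key names are the ones mentioned in this post, not a verified list of what each endpoint returns.

```python
def has_more_pages(response: dict) -> bool:
    """Return True if the response signals another page, whichever
    spelling of the flag the endpoint happens to use."""
    for key in ('has-more', 'hasMore', 'has_more'):
        if key in response:
            return bool(response[key])
    # No recognized flag: treat it as the last page rather than loop forever
    return False
```

With this, the same `while` loop can serve Companies and Deals by replacing `response_dict.get('has-more')` with `has_more_pages(response_dict)`.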

 

I know I am using V2 and HubSpot uses V3 now, but I tried V3 and still had pagination problems. So I just stuck with what I was already familiar with.
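
For comparison, a rough sketch of how V3-style pagination is usually looped. It assumes the V3 list response has a `results` array and, when more pages exist, a `paging.next.after` cursor that is sent back as the `after` parameter. `fetch_all_v3` and `fetch_page` are hypothetical names, with `fetch_page` standing in for the actual HTTP call.

```python
from typing import Callable, Dict, List


def fetch_all_v3(fetch_page: Callable[[Dict], Dict], limit: int = 100) -> List[Dict]:
    """Follow the 'after' cursor until the response stops providing one."""
    results = []
    after = None
    while True:
        params = {'limit': limit}
        if after is not None:
            params['after'] = after
        page = fetch_page(params)
        results.extend(page.get('results') or [])
        # Assumed shape: {'paging': {'next': {'after': '...'}}} when more pages exist
        after = ((page.get('paging') or {}).get('next') or {}).get('after')
        if not after:
            break
    return results
```

Unlike the V2 loop, there is no separate "has more" flag to read here; under these assumptions the absence of a cursor is what ends the loop.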

4 Replies
dennisedson
HubSpot Product Team

@taran42 , my old friend! 

Is there a way to print out the second iteration of the loop to see what the offset is?
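
A small sketch of what printing each iteration could look like. `describe_page` is a hypothetical helper, and it assumes each company record carries a `companyId` field and that the response uses the `offset` and `has-more` keys from the code in the question.

```python
def describe_page(page_number: int, response: dict, seen_ids: set) -> str:
    """Summarize one page: its offset, how many records it returned,
    and how many of those have not been seen on earlier pages."""
    companies = response.get('companies') or []
    ids = {c.get('companyId') for c in companies}
    new_ids = ids - seen_ids
    seen_ids.update(new_ids)  # remember these for the next page
    return "page {}: offset={} has-more={} returned={} new={}".format(
        page_number, response.get('offset'), response.get('has-more'),
        len(companies), len(new_ids))
```

Calling `print(describe_page(n, response_dict, seen))` inside the loop, with `seen = set()` created before it, would show whether the offset changes between iterations. A page reporting `new=0` means the same records were returned again.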

Have you tried using something like Postman to test that request?

I see what you are doing.  You are forcing me to learn Python.  Sneaky.

 

Calling my Python crew @akaiser , @khookguy , @wfong!  I am beginning to think that we need to set up a Python room 🙂

Do you guys have any idea what is going wrong here?

 

Thanks all!!

dennisedson
HubSpot Product Team

@taran42, glad you got it figured out.  The key benefit of the V3 APIs is consistency across objects; you are not alone in thinking that the older APIs were built in silos.

Hopefully you can figure out the issue with the pagination on the newer APIs. 

 

Regardless, glad this immediate issue is resolved!
