HubSpot crawling private URLs

darin1
Member

Hi there,

Our server is getting requests to private URLs with a User-Agent of "HubSpot Links Crawler 2.0 http://www.hubspot.com/".

The crawler cannot access these URLs, and I understand that it will respect robots.txt if one is provided.
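
For reference, here is a minimal sketch of how a robots.txt rule could be checked locally with Python's standard-library `urllib.robotparser`. The `/private/` path prefix, the `example.com` host, and the `is_allowed` helper are assumptions made for illustration; they are not taken from the post. The sketch only shows how a well-behaved crawler would interpret the rule, and says nothing about how HubSpot discovers the URLs.

```python
from urllib.robotparser import RobotFileParser

# User-Agent string as reported in the server logs.
CRAWLER_UA = "HubSpot Links Crawler 2.0 http://www.hubspot.com/"

# Hypothetical robots.txt: "/private/" is an assumed path prefix.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(url: str, user_agent: str = CRAWLER_UA) -> bool:
    """Return True if the rules in ROBOTS_TXT permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com stands in for the real site.
    for url in (
        "https://example.com/private/report",
        "https://example.com/public/index.html",
    ):
        print(url, "allowed" if is_allowed(url) else "disallowed")
```

Note that robots.txt is itself publicly readable and purely advisory, so it asks crawlers to stay away but does not keep the listed paths secret.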

 

What I'd like to understand is how HubSpot is getting these URLs to scrape in the first place. They should only be reachable from private pages, so HubSpot shouldn't know that they exist.

Is there anything available to help me understand how HubSpot is discovering these URLs?

Thanks,

Darin


Accepted Solutions
Derek_Gervais
Solution
HubSpot Employee

Hey @darin1,

I understand that the URLs you're referring to are private, but this would be easier to dig into with an example or two of the URLs being crawled. Would you mind sending me a private message here on the community with a couple of examples?
