HubSpot crawling private URLs
Feb 4, 2020 6:00 PM
Our server is getting requests to private URLs with a User Agent of: "HubSpot Links Crawler 2.0 http://www.hubspot.com/"
It cannot access these URLs and I understand that it will respect robots.txt if provided.
What I'd like to understand is how HubSpot is getting these URLs to scrape in the first place. They should only be reachable from private pages, so HubSpot shouldn't know these URLs exist.
Is there anything available to help understand how HubSpot is getting these URLs to scrape?
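For context, here is a minimal sketch of the kind of robots.txt rule I mean, checked with Python's standard `urllib.robotparser`. The `/private/` prefix is a placeholder assumption rather than our real path, and I've used a catch-all `User-agent: *` group because I don't know which token HubSpot's crawler matches on.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body. "/private/" is an assumed placeholder prefix,
# not the real path on our server.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

# User agent string exactly as it appears in our server logs.
HUBSPOT_UA = "HubSpot Links Crawler 2.0 http://www.hubspot.com/"


def is_allowed(path: str, user_agent: str = HUBSPOT_UA) -> bool:
    """Return True if a compliant crawler with this user agent may fetch path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for path in ("/private/report", "/public/page"):
        print(path, is_allowed(path))
```

This only shows that a compliant crawler would skip the private paths. It doesn't explain how the URLs were discovered in the first place, which is the part I'm trying to understand.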
Feb 5, 2020 5:22 PM
Hey @darin1,
I know the URLs you're referring to are private, but it would be easier to dig into this with an example or two of the URLs being crawled. Would you mind sending me a private message here on the community with a couple of examples?