Add a "no index" checkbox to the files tool

Hi,

 

Every company has some gated documents that need to be excluded from search engine crawlers. The current solution is convoluted: you have to add every single URL to the robots.txt file.
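For illustration, this is roughly what the current per-URL approach looks like in robots.txt — one Disallow line per gated file (the paths here are made-up examples):

```
# robots.txt — one rule per gated document (hypothetical paths)
User-agent: *
Disallow: /hubfs/contract-2023.pdf
Disallow: /hubfs/pricing-internal.pdf
Disallow: /hubfs/roadmap-q4.pdf
```

This has to be updated by hand every time a new gated document is uploaded, which is why it doesn't scale.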

 

It would be great to have a checkbox on each document to specify whether crawling should be disallowed for that document.

4 Replies
aarongarcia1
Regular Contributor

Hi there,

 

You're right, this is more difficult than it really should be. In practice, though, it's pretty simple to configure – you just need to set up two directories.

 

For files:

The trick is to create a specific folder within the file system whose contents should not be indexed, then add a Disallow rule for that folder to robots.txt with a wildcard (asterisk "*"). Any file that lives in that folder will be hidden from search.
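Sketched as a robots.txt rule — the folder name here is just an example, use whatever folder you created in the file tool:

```
# robots.txt — block crawling of everything in the dedicated folder
User-agent: *
Disallow: /hubfs/no-index/*
```

Since robots.txt rules are prefix matches, the trailing asterisk is technically optional, but it makes the "everything under this folder" intent explicit.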

 

For pages, same idea:

Pick a directory you would like to disable search on and disallow it in robots.txt (it can be named whatever you like on your HubSpot domain). To hide a page, custom-set the URL of the landing page so that it lives in that directory; it will then not show up on search engines.
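If you want to sanity-check a rule like this before relying on it, Python's standard urllib.robotparser can simulate how a well-behaved crawler reads it. The directory name "/hidden/" below is a hypothetical choice, not anything HubSpot-specific:

```python
from urllib import robotparser

# Simulate a crawler reading a robots.txt with a hidden-directory rule.
# "/hidden/" is a hypothetical directory name; use whatever you chose.
rp = robotparser.RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /hidden/",
])

# A landing page placed under the hidden directory is blocked...
print(rp.can_fetch("*", "https://www.example.com/hidden/internal-page"))  # False
# ...while pages outside it remain crawlable.
print(rp.can_fetch("*", "https://www.example.com/public-page"))           # True
```

Note that robots.txt only asks crawlers not to index; it doesn't actually gate access to the page itself.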

 

If this is too complex, the HubSpot support folks should be able to guide you through it.

geyries
Occasional Contributor

Thank you @aarongarcia1. Great workaround! I implemented it right away :-). 

easiware
New Member

Good idea!

yayabobi
New Member

What if I don't have a HubSpot domain?

 

Let's say I'm not using HS for my website, but I am using HS to house my files. My files then live under cdn.hubspot.com. In that case, how can I set up a robots.txt file to de-index specific files? Or can't I? @aarongarcia1