Add a "no index" checkbox to the files tool

Hi,

 

Every company has some documents that are gated and need to be excluded from search engine crawlers. The only solution at the moment is a convoluted one: adding every single URL to the robots.txt file.

 

It would be great to have a checkbox for each document to specify whether we want to disallow crawling for that document.

5 Replies
Regular Contributor

Hi there,

 

You're right, this is more difficult than it really should be. That said, the workaround is pretty simple to set up: you just need two directories.

 

For files:

The trick is to create a specific folder within the file system whose contents should not be indexed, then add that folder to the robots.txt file with a wildcard (asterisk "*"). Any file that lives in that folder will be hidden from search.

 

For pages, same idea:

Pick a directory on your HubSpot domain that you would like to disable search on (it can be named whatever you like) and disallow it in the robots.txt file. To hide a page, you have to custom-set the URL of the landing page so that it sits in that directory; it will then not show up on search engines.
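
To make the two-directory idea concrete, here is a rough sketch that checks a set of robots.txt rules with Python's standard-library parser. The folder names `/hubfs/no-index/` and `/private/` and the `www.example.com` domain are made up for illustration, so substitute whatever you actually create. Note that Python's parser matches paths by prefix and does not interpret a "*" inside a path, so the rules here rely on the trailing slash alone.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical folder names (assumptions, not HubSpot defaults):
#   /hubfs/no-index/  -> folder holding the files you want to hide
#   /private/         -> URL directory for the landing pages you want to hide
ROBOTS_TXT = """\
User-agent: *
Disallow: /hubfs/no-index/
Disallow: /private/
"""


def is_crawlable(url: str, robots_txt: str = ROBOTS_TXT, agent: str = "*") -> bool:
    """Return True if the given robots.txt rules let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Example URLs on a placeholder domain.
    for url in (
        "https://www.example.com/hubfs/no-index/pricing.pdf",
        "https://www.example.com/hubfs/public/brochure.pdf",
        "https://www.example.com/private/partner-offer",
        "https://www.example.com/blog/post",
    ):
        print(url, is_crawlable(url))
```

Anything placed under either directory is then covered by a single rule, so there is no need to list every URL individually.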

 

If this is too complex, the HubSpot people should be able to guide you through it.

Occasional Contributor

Thank you @aarongarcia1. Great workaround! I implemented it right away :-). 

New Member

Good idea!

New Member

What if I don't have a HubSpot domain?

 

Let's say I'm not using HS for my website but I am using HS to house my files. Then my files live under cdn.hubspot.com. In this case, how can I set up a robots.txt file to unindex specific files? Or can't I? @aarongarcia1

HubSpot Product Team
updated to: Delivered

Thank you for submitting your ideas and upvotes.

 

Happy to announce that we have rolled out functionality to manage whether your files are being indexed by search engines on a per-file basis: https://www.hubspot.com/product-updates/control-file-visibility-in-the-file-manager