We have been having issues with the Post Listing module lately. Google Search Console is reporting crawl errors for every post linked in the listing. The links work normally when clicked, but Google is being sent to a broken page.
These all lead to a blank page that just says “expired”.
If I search the page source, these broken links aren’t present; if I search the DOM in developer tools, they are. So they are generated by JavaScript. Has anyone else run into this problem and fixed it? The only solution I can think of is abandoning the HubSpot module and writing a new one; the custom-made listings I have on other sites don’t produce this issue. I’d like to avoid that, because this site has a lot of templates, each needing a listing from a different category. It would be great if HubSpot could just fix the issue in the standard module.
I'm having issues with this as well. I can see that up until October 4th these pages (from popular posts, etc.) were being blocked; something must have changed on the fourth.
Thanks for your patience here. I dug into this with the team, and we've removed these resources from the robots.txt files by default and now set an X-Robots-Tag: none response header (equivalent to noindex, nofollow) on the /_hcms/postlisting requests, as documented by Google. This should resolve these errors in Search Console going forward.
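If you want to verify the header on one of your own pages, here is a minimal Python sketch; the URL is a placeholder, so substitute a real /_hcms/postlisting request (including its hs-signature parameter) copied from your browser's network tab:

import urllib.error
import urllib.request

# Placeholder URL: copy a real /_hcms/postlisting request (with its
# hs-signature parameter) from your browser's network tab.
url = "https://www.example.com/_hcms/postlisting/12345"

req = urllib.request.Request(url, method="HEAD")
try:
    resp = urllib.request.urlopen(req)
except urllib.error.HTTPError as err:
    resp = err  # an HTTPError still exposes the response headers

# Expect "none" (equivalent to noindex, nofollow) if the fix is live.
print(resp.headers.get("X-Robots-Tag"))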
I just got a new batch of alerts from Google Search Console related to these errors.
Googlebot couldn't crawl your URL because your server either requires authentication to access the page, or it is blocking Googlebot from accessing your site. Learn more
I had a bunch of soft-404 errors on /_hcms/perf paths. In my case the record in robots.txt looked like this:
Disallow: /_hcms/perf/
This does not, however, match /_hcms/perf (without the trailing slash). So I changed it to:
Disallow: /_hcms/perf
Which matches the URL both with and without the trailing slash, since robots.txt Disallow rules match by prefix (see the sanity check at the end of this post).
I suppose your problem with the /_hcms/postlisting URL is caused by a similar record (in my case, HubSpot adds that record to the robots.txt by default).
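If you want to sanity-check that matching behaviour, Python's built-in robots.txt parser reproduces it; example.com is just a placeholder host:

from urllib.robotparser import RobotFileParser

def blocked(rule, url):
    # True if a robots.txt containing this single Disallow rule
    # would block the given URL.
    rp = RobotFileParser()
    rp.parse(["User-agent: *", "Disallow: " + rule])
    return not rp.can_fetch("*", url)

# With the trailing slash, the bare path is NOT blocked:
print(blocked("/_hcms/perf/", "https://example.com/_hcms/perf"))   # False
print(blocked("/_hcms/perf/", "https://example.com/_hcms/perf/"))  # True

# Without it, the rule is a prefix match and blocks both variants:
print(blocked("/_hcms/perf", "https://example.com/_hcms/perf"))    # True
print(blocked("/_hcms/perf", "https://example.com/_hcms/perf/"))   # True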
I am seeing an error on similar pages in my Search Console as well. It is listed as "Indexed, though blocked by robots.txt". Is it important for these pages to be indexed? All my posts have a similar URL in the source, but the 'hs-signature' value changes depending on the page. What is the 'hs-signature'?
Can you give me an example page where you're seeing these links? It would also be helpful if you could include a full example URL that's appearing in Search Console.
Thanks for your patience here; we've recently pushed out a fix that should prevent Google Search Console from picking up these links. Let me know if you're still seeing issues going forward (allow ~5 hours for all caches to clear).
Sorry for the delay here, I appreciate your patience. I'm currently digging into this with the team. I'll update this topic when I have more information.
Actually, the issue is no longer happening. My best guess is that it was fixed within the last day or so. I’ll tell Search Console these have been fixed and see if they pop back up again. Hopefully they won’t.
Using developer tools, you can find the offending links; they are not in the page source otherwise. Google is finding them anyway, and I suppose that is our main problem (a quick way to reproduce this check is sketched below).
The offending blog listings are using the standard HubSpot Post Listing module.
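If you want to reproduce that check outside of developer tools, here is a short Python sketch; the blog URL is a placeholder for any page that uses the Post Listing module:

import urllib.request

# Placeholder URL: substitute a page that uses the Post Listing module.
url = "https://www.example.com/blog"

html = urllib.request.urlopen(url).read().decode("utf-8", errors="replace")

# The raw server response should not contain the generated links; if
# developer tools still shows them, they were injected by JavaScript.
print("/_hcms/postlisting" in html)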