Content Search API results are unusable and related minimum score question
Jan 16, 2020 5:23 PM - edited Jan 19, 2020 9:05 AM
Hi HS community people, this is an updated question on a post from last week.
I'm hoping someone can help confirm whether there really is no option for improving the results returned. When pulling search results for a HubSpot blog using the content search API, the returned posts, in the order they're returned, are pretty much unusable.
Steps to reproduce the issue
If I search for a two-word phrase, it returns posts matching either word in the phrase by default, sorted by publish date (I think?).
Possible solution: better use of the minimum score param?
I went through the docs a few times and tinkered with how to structure the calls so that results are ordered by relevance to the query. The minimum score parameter reduces the number of returned results, which seems to help with relevance a little, though the range it uses and what the score actually means don't seem to be documented at all.
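For anyone trying to reproduce this, here is a minimal sketch of how the request URL could be built. The endpoint path, the `portalId`, `term`, `type`, and `limit` parameter names, and the `BLOG_POST` value are my assumptions about the API and should be checked against the docs; the portal ID and phrase are made up.

```python
from urllib.parse import urlencode

# Assumed endpoint for the content search API; verify against the docs.
SEARCH_ENDPOINT = "https://api.hubapi.com/contentsearch/v2/search"


def build_search_url(portal_id, term, **extra_params):
    """Build a search URL for a phrase; extra_params are passed through as-is."""
    params = {"portalId": portal_id, "term": term}
    params.update(extra_params)
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


# Hypothetical portal ID and two-word phrase, for illustration only.
url = build_search_url(123456, "content marketing", type="BLOG_POST", limit=10)
print(url)
```

Fetching that URL and listing the returned titles in order should be enough to see whether both words are required or only one.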
Minimum Score (&minScore=) Specifies the minimum score threshold to return a given result. This parameter is intentionally set low by default in order to return many results. Increase this for higher precision, but less recall.
I've tried values like 0.05, 0.01, and 0, and the returned results change, but I can't see any pattern in why or what has changed. It just seems to limit the results sometimes, for values somewhere in the 0 to 0.1 range.
But I know there is a separate limit parameter, so minScore has to be doing something different from that.
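To make the distinction concrete, here is a small sketch of how I would expect a score threshold and a count limit to differ. It runs on made-up data and only assumes that each result carries a numeric `score`; it is not a description of what the API does internally.

```python
def apply_min_score(results, min_score):
    """Keep only results whose score meets the threshold (order unchanged)."""
    return [r for r in results if r["score"] >= min_score]


def apply_limit(results, limit):
    """Keep the first `limit` results regardless of score."""
    return results[:limit]


# Made-up results, in the order a search might return them.
sample = [
    {"id": 1, "score": 0.02},
    {"id": 2, "score": 0.09},
    {"id": 3, "score": 0.004},
    {"id": 4, "score": 0.06},
]

by_score = [r["id"] for r in apply_min_score(sample, 0.05)]
by_limit = [r["id"] for r in apply_limit(sample, 2)]
print(by_score)  # [2, 4]
print(by_limit)  # [1, 2]
```

If the threshold works like this, raising it drops low-scoring posts wherever they sit in the list, while the limit only cuts the list off at a fixed length.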
Alternative solutions possibly?
If there is no good way to order results by relevance with the content search API, it's starting to look like our options are:
1. return unusable search results
2. migrate the blog to another CMS
3. use a crawler based third party search engine
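One more stopgap might be to re-rank on our side after the API responds. The sketch below is only an illustration: the `title`, `description`, and `score` fields are assumptions about the response shape, the sample results are invented, and the matching is plain substring matching rather than real relevance scoring.

```python
def rerank(results, query):
    """Keep results whose text contains every query word, highest score first."""
    words = query.lower().split()
    kept = []
    for r in results:
        # Combine the assumed text fields into one lowercase string.
        text = f"{r.get('title', '')} {r.get('description', '')}".lower()
        # Substring check: "market" would also match "marketing".
        if all(w in text for w in words):
            kept.append(r)
    return sorted(kept, key=lambda r: r.get("score", 0.0), reverse=True)


# Invented results for illustration only.
results = [
    {"title": "Content tips", "description": "Writing better posts", "score": 0.08},
    {"title": "Content marketing basics", "description": "", "score": 0.03},
    {"title": "Email marketing", "description": "Plan your content marketing calendar", "score": 0.05},
]

ranked_titles = [r["title"] for r in rerank(results, "content marketing")]
print(ranked_titles)  # ['Email marketing', 'Content marketing basics']
```

This only works on whatever page of results was fetched, so it would need a large enough limit to be useful.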
I'd really appreciate any help or feedback here!