WebHelp search algorithm

ann.jensen
Posts: 295
Joined: Wed Jun 17, 2015 10:19 am

Post by ann.jensen »

Hi,
I have read the Oxygen XML Editor Help page at http://www.oxygenxml.com/doc/versions/1 ... ption.html, which says:
The Search tab is enhanced with a rating mechanism that computes scores for every page that matches the search criteria. These scores are then translated into a 5-star rating scheme. The search results are sorted depending on the following:
• The number of keywords found in a single page (the higher the number, the better).
• The context (for example, a word found in a title scores better than a word found in unformatted text). The search ranking order, sorted by relevance, is as follows:
    o The search phrase is included in a meta keyword
    o The search phrase is in the title of the page
    o The search phrase is in bold text in a paragraph
    o The search phrase is in normal text in a paragraph


However, I have noticed that WebHelp search results also seem to give the highest priority to pages where all the search words occur in a single sentence (which is very intuitive).
Is this another part of the search algorithm?
Thanks,
Ann
bogdan_cercelaru
Posts: 222
Joined: Tue Jul 01, 2014 11:48 am

Re: WebHelp search algorithm

Post by bogdan_cercelaru »

Hello,

I've analyzed the code that searches the WebHelp output, and I can confirm that the information in the user guide is correct.
ann.jensen wrote:
"However, I have noticed that WebHelp search results also seem to give the highest priority to pages where all the search words occur in a single sentence (which is very intuitive).
Is this another part of the search algorithm?"
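What you noticed is probably a fortunate coincidence that follows from the first rule used to sort the results:
"The number of keywords found in a single page (the higher the number, the better)."
A page that contains all the search words in one sentence necessarily contains all of them on the page, so it scores well under that rule even though sentence boundaries are not checked.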
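
To illustrate the point, here is a rough Python sketch of that kind of scoring. It is not the actual WebHelp search code: the weight values, the page structure, and the names `WEIGHTS`, `score_page`, and `rank` are all made up for the example. Only the ordering of the contexts (meta keyword, title, bold, normal text) comes from the user guide.

```python
import re

# Hypothetical weights; only their order follows the user guide
# (meta keyword > title > bold text > normal text).
WEIGHTS = {"keyword": 4, "title": 3, "bold": 2, "text": 1}


def score_page(page, query):
    """Add the context weight once for every query word found in that context."""
    words = query.lower().split()
    score = 0
    for context, weight in WEIGHTS.items():
        tokens = set(re.findall(r"\w+", page.get(context, "").lower()))
        score += weight * sum(1 for w in words if w in tokens)
    return score


def rank(pages, query):
    """Return the names of matching pages, best score first."""
    scored = [(score_page(page, query), name) for name, page in pages.items()]
    ordered = sorted(scored, key=lambda item: (-item[0], item[1]))
    return [name for score, name in ordered if score > 0]


# Invented sample pages, used only for this illustration.
pages = {
    "a.html": {"title": "Publishing",
               "text": "Configure the search index for WebHelp output."},
    "b.html": {"title": "Overview", "bold": "search", "text": "Feature list."},
    "c.html": {"title": "Editing", "text": "The index is rebuilt on save."},
}

print(rank(pages, "configure search index"))
```

In this sketch, a.html comes first because all three search words occur on the page, even though they appear only in normal text. The sentence itself is never inspected; the page simply accumulates more keyword matches than the others.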
Regards,
Bogdan
Bogdan Cercelaru
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com