Webhelp search doesn't find term that is present on html-page
-
- Posts: 1
- Joined: Tue Oct 25, 2022 5:22 pm
Webhelp search doesn't find term that is present on html-page
We're transforming our DITA files and maps to the WebHelp format, and the result is just as intended. However, we noticed something peculiar when using the search function:
When searching for "farbe achsen", a page that actually contains the term "achsen" is shown as follows in the results (the red box says that the term is missing on that page):
25-10-_2022_16-29-01.png
However, it is there.
25-10-_2022_16-29-58.png
It's as if the term can't be found as a substring on this page. On other pages, however, the two search terms can be found both as a string and as a substring.
Has anyone experienced the same behavior and know how to solve it?
-
- Posts: 846
- Joined: Mon Dec 05, 2011 6:04 pm
Re: Webhelp search doesn't find term that is present on html-page
Hi mbur1,
The issue seems to be that the WebHelp indexer is indexing the content of your DITA Map in the wrong language.
For the indexer to treat the terms as German, you should explicitly set the language of your DITA Map to German.
To do that, you can either add the "xml:lang" attribute to your DITA Map's root element manually and set its value to "de" or "de-DE", or, if you run the transformation from a GUI application (like Oxygen XML Editor or Author), set the dedicated transformation scenario parameter "default.language" (you can find it by editing the WebHelp transformation scenario you are using and looking in the "Parameters" tab).
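The manual option can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `set_map_language` and the sample map are hypothetical, and in practice it is usually simpler to type the attribute directly on the map's root element.

```python
import xml.etree.ElementTree as ET

# Reserved XML namespace; ElementTree serializes it with the "xml" prefix.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def set_map_language(map_xml: str, lang: str = "de-DE") -> str:
    """Return the map markup with xml:lang set on its root element.

    Note: ElementTree drops any DOCTYPE declaration, so this is a sketch
    and not a drop-in tool for rewriting real DITA maps.
    """
    root = ET.fromstring(map_xml)
    root.set(XML_LANG, lang)
    return ET.tostring(root, encoding="unicode")


# Hypothetical minimal map, used only for illustration.
sample_map = '<map><title>Handbuch</title><topicref href="achsen.dita"/></map>'
print(set_map_language(sample_map))
```

The printed root element carries `xml:lang="de-DE"`, which is the value the indexer needs to see on the map.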
If you still encounter indexing issues after explicitly setting German as the language of your DITA Map, please send a complete but minimal DITA Map file hierarchy (the DITA Map with a few topics) to support@oxygenxml.com and we will look into it.
Kind Regards,
Costin
Costin Sandoi
oXygen XML Editor and Author Support
-
- Posts: 115
- Joined: Mon Jul 10, 2023 11:49 am
Re: Webhelp search doesn't find term that is present on html-page
Hi Costin,
I was experiencing the same or a similar issue, so I searched the forum for "search result", which is how I ended up here.
python api as keyword
For example, we have a WebHelp output where xml:lang is set to "zh" at the map and bookmap level. About 90% of the characters are Chinese, but about 10% are English or other Latin characters: API names, plugin names, function names, parameter names, configuration items, code blocks, etc. In the screenshot above, I searched for Python API. We do have a document with that title, but it was ranked 6th in the search results instead of 1st. It seems the search was carried out with the "or" operator (which is what I specified in the opt file), and the results were produced by combining the search results for "Python" and "API", that is:
searchResult1 = search "Python";
searchResult2 = search "API"
search result = searchResult1+searchResult2
I'm not sure whether I'm correct about this.
So, as a workaround, should I change the search operator from "or" to "and" to force the built-in search to join two or more space-separated words into one string? For example, would a search for A B C become a search for "ABC" or "A32B32C" (where 32 is the decimal ASCII value of the space character)?
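To make the question concrete, here is a toy model of the two operators in Python. It is not the WebHelp search code: the corpus, `search`, and `search_all` are hypothetical names, and the sketch only assumes that "or" means a union of the per-term results and "and" means an intersection.

```python
# Toy corpus: hypothetical page titles mapped to their text.
pages = {
    "Python API": "python api reference for plugin authors",
    "Python plugins": "writing a plugin in python",
    "REST API": "the rest api and its parameters",
}


def search(term: str) -> set[str]:
    """Return the titles of pages whose text contains the term as a word."""
    return {title for title, text in pages.items() if term.lower() in text.split()}


def search_all(query: str, operator: str = "or") -> set[str]:
    """Combine per-term results: union for "or", intersection for "and"."""
    results = [search(term) for term in query.split()]
    if not results:
        return set()
    combine = set.union if operator == "or" else set.intersection
    return combine(*results)


print(sorted(search_all("Python API", "or")))   # every page matching either term
print(sorted(search_all("Python API", "and")))  # only pages matching both terms
```

In this model, "and" narrows the result list to pages that contain both words, but each word is still matched on its own. It does not turn the query into the single string "Python API".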
-
- Site Admin
- Posts: 275
- Joined: Thu Dec 24, 2009 11:21 am
Re: Webhelp search doesn't find term that is present on html-page
Hello,
Search results are ranked based on a complex algorithm used by the search engine to determine the relevance of each page (topic) to the user's search query.
The search engine computes scores for every topic that matches the search criteria and uses this score to sort the search results.
The search rank of a page depends on the location and the number of occurrences of the searched terms in the content.
The locations that influence the ranking are listed below, from most to least relevant:
- Page Title, Keywords & index terms
- Short description and section headings (H1 to H6)
- Bold text
- Italic & underlined text
- Plain text
The <indexterm> and <keywords> DITA elements are an effective way to increase the ranking of a page.
The terms found in these elements add a lot of weight to the current page in the list of search results.
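A rough way to picture this is a weighted count of term occurrences per location. The Python sketch below is only an illustration: the weights, the location names, and the `score` function are hypothetical and are not the values or the algorithm used by the WebHelp search engine.

```python
# Hypothetical weights per location, ordered like the list above.
WEIGHTS = {
    "title": 50,
    "keywords": 50,
    "heading": 20,
    "bold": 10,
    "italic": 5,
    "plain": 1,
}


def score(page: dict[str, str], query: str) -> int:
    """Sum weight * occurrences for every query term in every location."""
    total = 0
    for term in query.lower().split():
        for location, text in page.items():
            occurrences = text.lower().split().count(term)
            total += WEIGHTS.get(location, 0) * occurrences
    return total


# Two made-up pages: one with the terms in its title, one only in plain text.
page_a = {"title": "Python API", "plain": "how to call the api from python"}
page_b = {"title": "Release notes", "plain": "python python python api api"}

print(score(page_a, "Python API"))
print(score(page_b, "Python API"))
```

Under these assumed weights, the page with both terms in its title outranks the page that repeats them many times in plain text, which is why terms in titles, keywords, and index terms matter so much.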
Regards,
Alin
Alin Balasa
Software Developer
<oXygen/> XML Editor
http://www.oxygenxml.com
-
- Posts: 115
- Joined: Mon Jul 10, 2023 11:49 am
Re: Webhelp search doesn't find term that is present on html-page
Hi Alin,
Thanks for the reply. I backed up the default scoring.properties file under \Oxygen XML Editor 26\frameworks\dita\DITA-OT\plugins\com.oxygenxml.webhelp.responsive\indexer, then modified the weight of each item as follows:
h1 = 50
h2 = 40
h3 = 30
h4 = 20
h5 = 10
h6 = 10
b = 10
strong = 5
em = 5
i = 5
u = 5
div.toc = 10
title = 50
div.ignore = ignored
meta_keywords = 1
meta_indexterms = 1
meta_description = 1
shortdesc = 10
As you can see, my intention was to increase the weight of search terms found in headings (h1 to h6) and in the page title (or the root topic title if the file contains nested topics). I then generated a WebHelp output with this scoring file, but the ranking for certain search terms didn't go up as I expected.
Also, I set the weights for keywords and index terms to a very small value because I haven't added much topicmeta information to the topics in my map and files. Is there a way to add meta keywords or index terms in one batch, based on topic file names or page titles?