Extend stop-words-list (for Webhelp search)

Silke
Posts: 10

Extend stop-words-list (for Webhelp search)

Thu Mar 31, 2016 2:58 pm

Hi,

is there a way to extend the stop-word list for the WebHelp output (index-1.js)?

I searched for one example from the stop list: "anderr".
- My content does not contain that string.
- The oxygen folder contains "anderr" in de-DE.dic, inside words like "Wanderrucksack".

In the webhelp-plugin folder there is a file, de_words.properties, which looks as if it is where the stop words come from, but "anderr" is not in there.

Thank you in advance
Silke
bogdan_cercelaru
Posts: 205

Re: Extend stop-words-list (for Webhelp search)

Fri Apr 01, 2016 4:34 pm

Hello,

Unfortunately, the current implementation of WebHelp does not let you change the stop-word list.
I have registered your request in our issue tracking system to be analyzed.

Regards,
Bogdan
Bogdan Cercelaru
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com
ckabstein
Posts: 58

Re: Extend stop-words-list (for Webhelp search)

Tue May 16, 2017 3:13 pm

Hi,

Given that the new WebHelp search now does the following:

"Always search for words containing three or more characters (shorter words, such as to or of are ignored). This rule does not apply to CJK (Chinese, Japanese, Korean) languages."

and

"To improve performance, the Search feature excludes certain stop words. For example, the English version of such stop words include: a, an, and, are, as, at, be, but, by, for, if, in, into, is, it, no, not, of, on, or, such, that, the, their, then, there, these, they, this, to, was, will, with."

I would be interested to know if it's now possible to
a) extend or reduce the list of stop words and where we could do this.
b) add more lists for other languages.

And finally, which rules apply to CJK languages exactly? And where are these set?
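
To make the two quoted rules concrete, here is a minimal sketch in Python. It is only an illustration of the documented behaviour, not the WebHelp implementation: the function name filter_search_terms, the min_length parameter, and the CJK character ranges are assumptions made for this example, and the quote does not say whether stop words are also skipped for CJK text.

```python
import re

# Stop words copied from the English list quoted above.
ENGLISH_STOP_WORDS = frozenset(
    "a an and are as at be but by for if in into is it no not of on or such "
    "that the their then there these they this to was will with".split()
)

# Assumption: a rough CJK test covering Hiragana, Katakana, CJK Unified
# Ideographs, and Hangul syllables. The real indexer may use other rules.
_CJK = re.compile(r"[\u3040-\u30ff\u4e00-\u9fff\uac00-\ud7af]")


def filter_search_terms(query, stop_words=ENGLISH_STOP_WORDS, min_length=3):
    """Return the query terms that would survive the two quoted rules."""
    kept = []
    for term in query.lower().split():
        is_cjk = bool(_CJK.search(term))
        # Rule 1: ignore words shorter than three characters, except CJK.
        if not is_cjk and len(term) < min_length:
            continue
        # Rule 2: ignore stop words.
        if term in stop_words:
            continue
        kept.append(term)
    return kept


print(filter_search_terms("How to extend the stop words list"))
```

With the English list, "to" is dropped by both rules and "the" by the stop-word rule, so only the remaining five words would be searched.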

Thanks,
Christina
ionela
Posts: 218

Re: Extend stop-words-list (for Webhelp search)

Wed May 17, 2017 12:56 pm

Hi Christina,

Unfortunately, the stop-words-list is not configurable.
The stop words are computed dynamically, depending on the language you choose when you publish your documentation. The search indexer computes them and writes them to the out/webhelp-responsive/oxygen-webhelp/search/index-1.js file.

We have also discussed this in the following forum topic:
https://www.oxygenxml.com/forum/post42687.html#p42687

Regards,
Ionela
Ionela Istodor
oXygen XML Editor and Author Support
ckabstein
Posts: 58

Re: Extend stop-words-list (for Webhelp search)

Wed May 17, 2017 1:01 pm

Hi Ionela,

Sorry, I didn't see that post. Thanks for pointing me towards that one.

Best,
Christina
