[oXygen-user] Feature request: Improvement of Japanese search for WebHelp

Oxygen XML Editor Support (Radu Coravu)
Fri May 8 08:36:17 CDT 2015


Hi,

We have not considered this in the past. If at some point you have 
something which works and you are willing to share it with us, please do :)

Regards,
Radu

Radu Coravu
<oXygen/>  XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com

On 4/30/2015 5:18 PM, T. Hatanaka wrote:
> Thanks for the quick turnaround.
>
> As a related side note, I'm thinking of adding simple fuzzy query matching/scoring to WebHelp, based on the Levenshtein distance ratio between the query and the index.
> Have you considered it in the past?
> I thought you might have, as it would benefit not only Far Eastern cultures but probably also Western ones (although some Western languages already have a stemmer in WebHelp).
>
> Thanks,
> T. Hatanaka
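
The fuzzy matching idea in the message above could be sketched roughly as follows. This is only an illustration, not WebHelp code: the function names (levenshtein, similarityRatio, fuzzyMatch) and the 0.6 cutoff are hypothetical, and the "ratio" is assumed here to mean 1 - distance / length of the longer string, which is one of several common definitions.

```javascript
// Classic dynamic-programming Levenshtein distance between two strings.
// Array.from splits by code point, so a surrogate pair counts as one character.
function levenshtein(a, b) {
  const s = Array.from(a);
  const t = Array.from(b);
  // prev[j] = distance between the first (i - 1) characters of s and the first j of t.
  let prev = Array.from({ length: t.length + 1 }, (_, j) => j);
  for (let i = 1; i <= s.length; i++) {
    const curr = [i];
    for (let j = 1; j <= t.length; j++) {
      const cost = s[i - 1] === t[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution (or match when cost is 0)
      );
    }
    prev = curr;
  }
  return prev[t.length];
}

// Assumed definition of the ratio: 1 for identical strings, 0 for no overlap.
function similarityRatio(a, b) {
  const maxLen = Math.max(Array.from(a).length, Array.from(b).length);
  return maxLen === 0 ? 1 : 1 - levenshtein(a, b) / maxLen;
}

// Hypothetical helper: score every indexed term against the query term,
// keep those at or above minRatio, best match first.
function fuzzyMatch(query, indexedTerms, minRatio = 0.6) {
  return indexedTerms
    .map(term => ({ term, score: similarityRatio(query, term) }))
    .filter(match => match.score >= minRatio)
    .sort((x, y) => y.score - x.score);
}
```

Comparing the query against every indexed term is linear in the index size, so a real implementation would probably need to limit the candidates first (for example by term length).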
>
> -----Original Message-----
> From:  [mailto:] On Behalf Of Support Oxygen XML Editor (Sorin Ristache)
> Sent: Thursday, April 30, 2015 10:50 PM
> To: T. Hatanaka; 
> Subject: Re: [oXygen-user] Feature request: Improvement of Japanese search for WebHelp
>
> Hi,
>
> Right, good catch! I just fixed it: in Oxygen 17.0 the search box will
> also accept search words shorter than three characters if the
> language of the content is Japanese.
>
>
> Best regards,
> Sorin
>
> <oXygen/> XML Editor
>
> http://www.oxygenxml.com
>
>
> On 4/30/2015 4:13 PM, T. Hatanaka wrote:
>> Sorin,
>>
>> I tried it out and found that the indexed terms improved as expected.
>> I also confirmed that the user dictionary was loaded, which prevented critical words from being missed.
>>
>> A simple but rather critical issue in the user experience at this moment:
>> nwSearchFnt.js expects 3 or more characters, while Japanese words usually consist of 2 or more characters.
>> As a quick fix for testing purposes, I reduced the following threshold to 1 when indexerLanguage=='ja'.
>>
>>>               if (finalArray[x].length > 2 || useCJKTokenizing){
>>
>>>       if (word.length > 2) {
>>
>>
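
A minimal sketch of the quick fix described above, written as a standalone filter rather than as a patch to nwSearchFnt.js. The helper names (minWordLength, filterSearchWords) are hypothetical, and it is assumed that "threshold reduced to 1" means the checks become `length > 1` for Japanese, so that two-character words are accepted.

```javascript
// Hypothetical helper: minimum accepted word length for a given index language.
// Assumption: Japanese accepts words of 2+ characters (length > 1);
// all other languages keep the original 3+ character rule (length > 2).
function minWordLength(indexerLanguage) {
  return indexerLanguage === "ja" ? 2 : 3;
}

// Drop search words that are too short for the language of the content.
function filterSearchWords(words, indexerLanguage) {
  const min = minWordLength(indexerLanguage);
  return words.filter(word => word.length >= min);
}
```

Keeping the limit in one function would let both length checks quoted above use the same rule instead of two hard-coded constants.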
>> There are a few other minor things that could be improved; I'll report them in due course.
>> The experience as a whole is very promising. Thank you so much!
>>
>>
>> Thanks,
>> T. Hatanaka
>>
>> -----Original Message-----
>> From:  [mailto:] On Behalf Of Support Oxygen XML Editor (Sorin Ristache)
>> Sent: Thursday, April 30, 2015 3:15 PM
>> To: T. Hatanaka; 
>> Subject: Re: [oXygen-user] Feature request: Improvement of Japanese search for WebHelp
>>
>> Hi,
>>
>> Please test the search in WebHelp pages with Japanese content using the
>> following Oxygen 17.0 beta build:
>>
>> http://www.oxygenxml.com/userFiles/oxygen-AllPlatforms-17.0-beta-build-2015042917.tar.gz
>>
>> You have to set the Japanese language either in your DITA map (by adding
>> the xml:lang="ja-jp" attribute) or in the args.default.language parameter,
>> and also follow the instructions at the end of the DITA transformation
>> output messages for adding the Kuromoji analyzer runtime file. After you
>> add the runtime file, the DITA transformation will detect it and call it
>> to index the Japanese content.
>>
>> The search terms should be separated by spaces as in European languages
>> but I think you already know that better than us because you are the one
>> who suggested we can assume that for a Japanese search string entered in
>> a web page.
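
The space-separated query convention mentioned above can be illustrated with a short sketch. The function name splitQuery is hypothetical and this is not the actual WebHelp implementation. It relies on the fact that the JavaScript `\s` class also matches the full-width ideographic space (U+3000) that a Japanese input method typically produces.

```javascript
// Split a search string into terms on any run of whitespace.
// In JavaScript, \s covers ASCII spaces as well as U+3000 (ideographic space),
// and trim() strips the same characters from both ends.
function splitQuery(query) {
  return query
    .trim()
    .split(/\s+/)
    .filter(term => term.length > 0); // an empty query yields no terms
}
```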
>>
>> A Japanese user dictionary for the Kuromoji analyzer is optional and can
>> be set in the webhelp.search.japanese.dictionary parameter.
>>
>>
>> Best regards,
>> Sorin
>>
>> <oXygen/> XML Editor
>>
>> http://www.oxygenxml.com
>>
>>
>> On 4/30/2015 8:47 AM, T. Hatanaka wrote:
>>> Super!
>>> It'll be a great bonus for 17.0, whose beta is already amazing.
>>> I'll definitely test it out as soon as it is released.
>>> If I can help in any way at this moment, please let me know.
>>>
>>> Thanks,
>>> T. Hatanaka
>>>
>>> -----Original Message-----
>>> From:  [mailto:] On Behalf Of Sorin Ristache
>>> Sent: Tuesday, April 28, 2015 11:16 PM
>>> To: T. Hatanaka; 
>>> Subject: Re: [oXygen-user] Feature request: Improvement of Japanese search for WebHelp
>>>
>>> Hi,
>>>
>>> We implemented indexing of Japanese content in the WebHelp pages with
>>> the Kuromoji analyzer. It will go in the upcoming Oxygen 17.0 release.
>>> The Kuromoji analyzer will not be included in the Oxygen install kit,
>>> but a detailed INFO message in the DITA transformation console view will
>>> tell you where you can download the analyzer runtime file and where you
>>> have to copy it in the Oxygen install directory in order to enable
>>> indexing of Japanese content.
>>>
>>> A new parameter was also added to the transformation for setting a user
>>> dictionary for the Kuromoji analyzer. The parameter is optional: the
>>> search results should be greatly improved on Japanese content even
>>> without setting a user dictionary.
>>>
>>>
>>> Regards,
>>> Sorin
>>>
>>> <oXygen/> XML Editor
>>>
>>> http://www.oxygenxml.com
>>>
>>>
>>> On 4/17/2015 2:34 PM, T. Hatanaka wrote:
>>>> Thank you for listening.
>>>> An analyzer plus a user dictionary will be a huge benefit in cultural areas like ours.
> _______________________________________________
> oXygen-user mailing list
> 
> http://www.oxygenxml.com/mailman/listinfo/oxygen-user
>

