Hi Radu,
I asked a similar question four months ago. Costin gave me a JavaScript template as a solution, and I modified that script to make it work.
See here:
post72152.html#p72152
The logic is quite simple: for every Chinese input in the search bar, split the string by inserting a space after every 2 characters. After hundreds of WebHelp transformations, I can see that searching Chinese content doesn't work when the number of characters exceeds 2. For example, "AB" or "A" works as a search term, but "ABC", "ABCD", or "ABCDE" fails.
The script is as follows:
var executed = false;
$( document ).ready(function() {
    $("#searchForm").on("submit", (e) => {
        // WebHelp triggers the submit event handler multiple times.
        if(!executed) { // We make sure that we execute it only one time.
            e.stopPropagation();
            var userQuery = $('#textToSearch').val();
            if (userQuery.trim() === '') {
                e.preventDefault();
                return false;
            }
            if (!/^[a-zA-Z]+$/.test(userQuery)) {
                userQuery = userQuery.replace(/[\u4e00-\u9fa5]{2}/g, '$& '); // if the input isn't made up only of English letters, insert a space after every two Chinese characters.
            }
            // userQuery = userQuery.replace(/[\u4e00-\u9fa5]{2}/g, '$& '); // split every given Chinese input by inserting a space after every two Chinese characters.
            //userQuery = userQuery.replace(/[\u4e00-\u9fa5]/g, '$& '); // split every given Chinese input by inserting a space after every Chinese character.
            $('#textToSearch').val(userQuery);
            executed = true;
        }
    });
});
In the script above, starting from line 13, there are three options for splitting a Chinese string entered in the search bar. The first one, which I'm using now, checks whether the input consists only of Latin letters (a-z, A-Z). If it doesn't, it inserts a space after every two Chinese characters, for example converting "ABCDEFG" into "AB CD EF G", and writes the result back to the search form as the search keywords.
The next two options, which are commented out, skip that check and apply the replacement directly. Because the regular expression matches only Chinese characters, they have an effect only when the string contains Chinese. The second option splits after every two characters and the third after every single character. See the comment after each line.
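To try the splitting logic outside WebHelp, here is a minimal sketch of the same replacement as a standalone function. The name splitQuery and the groupSize parameter are made up for illustration and are not part of the script above. Unlike the original, it also trims the result, so an even-length input doesn't end with a trailing space.

```javascript
// Same character range as the script above: common CJK ideographs.
const CJK_PAIR = /[\u4e00-\u9fa5]{2}/g;
const CJK_SINGLE = /[\u4e00-\u9fa5]/g;

// Hypothetical helper: groupSize 2 mirrors options 1 and 2, groupSize 1 mirrors option 3.
function splitQuery(query, groupSize = 2) {
  // Queries made up only of ASCII letters are left untouched (the check from option 1).
  if (/^[a-zA-Z]+$/.test(query)) {
    return query;
  }
  const pattern = groupSize === 1 ? CJK_SINGLE : CJK_PAIR;
  // '$&' is the matched text, so this appends a space after each group.
  // trim() is an addition here; the original script keeps the trailing space.
  return query.replace(pattern, '$& ').trim();
}

console.log(splitQuery('中文搜索测'));  // 中文 搜索 测
console.log(splitQuery('中文搜', 1));   // 中 文 搜
console.log(splitQuery('webhelp'));     // webhelp
```

Inside the submit handler, the value read from #textToSearch would be passed through this function before being written back.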
The JS file is referenced from an HTML file:
<!DOCTYPE html>
<html>
<script src="${oxygen-webhelp-template-dir}/js/custom.js" defer="defer"></script>
</html>
In the publishing template, the JS file is included in the fileset area, and the HTML file that loads the JS file is specified in the html-fragments area.
<html-fragments>
    <fragment file="fragments/js.html" placeholder="webhelp.fragment.after.body"/>
    <!--DO NOT DELETE THIS LINE!-->
</html-fragments>
<resources>
    <css file="styles.css"/>
    <fileset>
        <include name="resources/**/*"/>
        <exclude name="resources/**/*.svn"/>
        <exclude name="resources/**/*.git"/>
        <include name="js/**"/>
    </fileset>
</resources>
Here's the publishing package I'm using. Thanks to Costin for the great answer. I'm happy to share the solution with other writers struggling with similar issues.
chn_webhelp.zip
----------------------
The script above is quite simple and works perfectly, except that the split string, or let's say the groups of words, is also displayed in the search bar on the search results page. For example, for a given string of Chinese characters ABCDEFG, "AB CD EF G" is visible in the search bar when the search results are returned. It would be great to have a script or two to join these word groups back into one piece in the search bar after the search results are returned.
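One possible direction, which I have not tested against WebHelp, is to strip the spaces that sit between two Chinese characters before the query is shown again. The function name joinChineseGroups is made up for illustration, and whether the search results page exposes the same #textToSearch field, and at what point it is filled in, is an assumption.

```javascript
// Hypothetical helper: removes whitespace between two Chinese characters
// and leaves spaces between Latin words alone.
function joinChineseGroups(query) {
  // The lookahead keeps the following character available for the next match,
  // so single-character groups such as "中 文 搜" are joined as well.
  return query.replace(/([\u4e00-\u9fa5])\s+(?=[\u4e00-\u9fa5])/g, '$1').trim();
}

console.log(joinChineseGroups('中文 搜索 测'));    // 中文搜索测
console.log(joinChineseGroups('oxygen webhelp'));  // oxygen webhelp

// Assumed usage on the results page (not confirmed for WebHelp):
// $('#textToSearch').val(joinChineseGroups($('#textToSearch').val()));
```

One caveat: this also removes spaces the user typed on purpose between Chinese words, so the displayed query may differ from what was originally entered.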