Search Engine

The search engine has two main components:

Search indexer

Also known as a spider. This component runs when you publish your documentation to WebHelp and is responsible for creating the search index. It traverses all the HTML pages generated for the DITA topics to gather information.

Search interface
This component is an interface between the user and the search index. It helps the user to search through the search index and displays results in the search page.
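
As a rough illustration of what an indexer of this kind does, the following minimal sketch builds an inverted index (word to pages) from HTML sources. It is not the WebHelp implementation: the `build_index` function, the `TextExtractor` class, and the simple `\w+` tokenization are assumptions made for the example.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the text found between tags, ignoring the markup itself."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def build_index(pages):
    """Map each lowercase word to the set of pages that contain it.

    `pages` is a dict of page path -> HTML source (hypothetical input shape).
    """
    index = defaultdict(set)
    for path, html in pages.items():
        parser = TextExtractor()
        parser.feed(html)
        parser.close()
        text = " ".join(parser.chunks).lower()
        # Simplified tokenization; a real indexer applies its own word rules.
        for word in re.findall(r"\w+", text):
            index[word].add(path)
    return index


pages = {
    "flowers.html": "<h1>Grow flowers</h1><p>Water them daily.</p>",
    "herbs.html": "<p>Grow herbs indoors.</p>",
}
index = build_index(pages)
print(sorted(index["grow"]))
```

The search interface would then only need to look up each search term in such an index rather than scanning the pages at query time.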

Search Field and Results Page

When you enter search terms in the Search field, the results are displayed in a results page. When you click on a result, the topic is opened in the main pane and the search results are highlighted. If you want to remove the colored highlights, click the Toggle Highlights button at the top-right side of the page. The Search field also includes an autocomplete feature.

Each result includes the topic title that can be clicked to open that page. Under the title, a breadcrumb is displayed that shows the path of the topic and you can click any of the topics in the breadcrumb to open that particular page.

If you enter multiple search terms (other than stop words) and a result contains at least one of them but not all of them, the terms that were not found are listed below that result as Missing terms.
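
The Missing terms behavior can be pictured with a small sketch. The `missing_terms` function and the representation of a page as a set of words are hypothetical, and the terms are assumed to have had stop words removed already.

```python
def missing_terms(terms, page_words):
    """Return the search terms that a matching page does not contain.

    `terms` is a list of search terms with stop words already removed;
    `page_words` is the set of words indexed for one page (assumed shape).
    """
    found = [t for t in terms if t in page_words]
    if not found:
        # The page matches no term, so it is not a result at all.
        return []
    return [t for t in terms if t not in page_words]


page = {"grow", "roses", "garden"}
print(missing_terms(["grow", "flowers"], page))  # ['flowers']
```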

5-Star Rating Mechanism and Sorting

The Search feature is also enhanced with a rating mechanism that computes scores for every result that matches the search criteria. These scores are then translated into a 5-star rating scheme and the stars are displayed to the right of each result. The search results are sorted depending on the following:
  • Search entries that satisfy the phrase search criterion are presented first.
  • The number of keywords found in a single page (the higher the number, the better).
  • The context (for example, a word found in a title scores better than a word found in unformatted text). The ranking order, from most to least relevant, is as follows:
    • The search term is included in a meta keyword.
    • The search term is in the title of the page.
    • The search term is in bold text in a paragraph.
    • The search term is in normal text in a paragraph.
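
The sorting criteria above can be sketched as a composite sort key. The numeric weights, the `Hit` class, and the way scores are mapped to stars are all assumptions for illustration; only the relative order of the contexts (meta keyword, title, bold text, normal text) comes from the list above.

```python
import math
from dataclasses import dataclass

# Hypothetical weights: only their relative order is taken from the text.
CONTEXT_WEIGHT = {"keyword": 4, "title": 3, "bold": 2, "text": 1}


@dataclass
class Hit:
    page: str
    phrase_match: bool  # True if the page satisfies a phrase search
    contexts: dict      # search term -> best context in which it was found

    def terms_found(self):
        return len(self.contexts)

    def context_score(self):
        return sum(CONTEXT_WEIGHT[c] for c in self.contexts.values())


def rank(hits):
    """Phrase matches first, then more terms found, then better context."""
    return sorted(
        hits,
        key=lambda h: (h.phrase_match, h.terms_found(), h.context_score()),
        reverse=True,
    )


def stars(score, best):
    """Scale a score to 1..5 stars relative to the best score (assumed scheme)."""
    if best <= 0:
        return 0
    return max(1, math.ceil(5 * score / best))


hits = [
    Hit("a.html", False, {"grow": "title", "flowers": "text"}),
    Hit("b.html", True, {"grow": "text", "flowers": "text"}),
    Hit("c.html", False, {"grow": "keyword"}),
]
print([h.page for h in rank(hits)])  # ['b.html', 'a.html', 'c.html']
```

A tuple key keeps the three criteria strictly prioritized: a later criterion only breaks ties left by the earlier ones.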

Tag Element Scoring Values

HTML elements are also assigned scoring values, and these values are taken into account when ranking the search results. For information about editing these values, see How to Change Element Scoring in Search Results.

Search Rules

Rules that are applied during a search include:
  • You can use quotes to perform an exact search for multiple word phrases (for example, "grow flowers" will only return results if both words are found consecutively and exactly as they are typed in the search field). This type of search is known as a phrase search.
  • Boolean Search is supported using the following operators: and, or, not. When there are two adjacent search terms without an operator, or is used as the default search operator (for example, grow flowers is the same as grow or flowers).
  • The space character separates keywords (an expression such as grow flowers counts as two separate keywords).
  • Words formed by joining two or more words with colon (":"), minus ("-"), underscore ("_"), or dot (".") characters count as a single word.
  • Your search terms should contain two or more characters (note that stop words will be ignored). This rule does not apply to CJK (Chinese, Japanese, Korean) languages.
  • When searching for multi-word phrases in CJK (Chinese, Japanese, Korean) languages, where multiple words often appear in a string without a space separator, you need to add a space to separate the words. Otherwise, WebHelp will not find results. For example, Chinese uses a specialized character as a space separator, but the current WebHelp implementation cannot detect such characters, so to search for 开始之前 (which translates as "before you begin" or "before start"), you have to enter 开始 之前 (notice the space between the second and third characters) in the search field.
Tip: The <indexterm> and <keywords> DITA elements are an effective way to increase the ranking of a page (for example, content inside a keywords element weighs more than an H1 HTML element).
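
Several of the search rules above (phrase search with quotes, or as the default operator, space-separated keywords, joined words counting as one word, and the two-character minimum outside CJK) can be sketched as a small query tokenizer. This is an illustrative approximation, not the WebHelp parser: `parse_query`, `is_cjk`, and the character ranges used for CJK detection are assumptions, and stop word handling is left out.

```python
import re

# Either a quoted phrase or a run of non-space characters. Words joined by
# ":", "-", "_" or "." contain no spaces, so they stay a single token.
TOKEN = re.compile(r'"([^"]+)"|(\S+)')
OPERATORS = {"and", "or", "not"}


def is_cjk(text):
    """Rough check for CJK ideographs, kana, or Hangul (assumed ranges)."""
    return any(
        "\u4e00" <= ch <= "\u9fff"
        or "\u3040" <= ch <= "\u30ff"
        or "\uac00" <= ch <= "\ud7a3"
        for ch in text
    )


def parse_query(query):
    """Split a query into ('phrase' | 'term' | 'op', value) tuples."""
    tokens = []
    for phrase, word in TOKEN.findall(query):
        if phrase:
            token = ("phrase", phrase)
        elif word.lower() in OPERATORS:
            token = ("op", word.lower())
        elif len(word) < 2 and not is_cjk(word):
            continue  # too short to be searched (rule does not apply to CJK)
        else:
            token = ("term", word)
        # Two adjacent operands without an operator: "or" is the default.
        if token[0] != "op" and tokens and tokens[-1][0] != "op":
            tokens.append(("op", "or"))
        tokens.append(token)
    return tokens


print(parse_query("grow flowers"))
print(parse_query('"grow flowers" and soil'))
print(parse_query("开始 之前"))
```

With this sketch, grow flowers yields the same token sequence as grow or flowers, which mirrors the default-operator rule.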

Excluded Terms

To improve performance, the Search feature excludes certain stop words. For example, the English stop word list includes: a, an, and, are, as, at, be, but, by, for, if, in, into, is, it, no, not, of, on, or, such, that, the, their, then, there, these, they, this, to, was, will, with.
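
Stop word exclusion amounts to filtering the search terms against a fixed set before the index is consulted. In the minimal sketch below, the set contains the English words listed above (the actual list may contain more), and the `remove_stop_words` helper and its case-insensitive comparison are assumptions.

```python
# English stop words as listed above; the real list may be longer.
ENGLISH_STOP_WORDS = frozenset(
    "a an and are as at be but by for if in into is it no not of on or "
    "such that the their then there these they this to was will with".split()
)


def remove_stop_words(terms):
    """Drop stop words from a list of search terms (case-insensitive)."""
    return [t for t in terms if t.lower() not in ENGLISH_STOP_WORDS]


print(remove_stop_words("how to grow the flowers".split()))
# ['how', 'grow', 'flowers']
```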