Customizing IR Expert for foreign languages

IR Expert supports English, German, Chinese, Japanese, and Korean without customization. You can also configure IR Expert to perform efficiently in Spanish, French, Italian, Portuguese, or any other language by supplying stop word, stem, and suffix dictionaries in these files:

  • The stop word file: [ir_languagefiles_path]<language>.stp
  • The stem dictionary: [ir_languagefiles_path]<language>.stm
  • The suffix file: [ir_languagefiles_path]<language>.suf
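
As a rough illustration of how these names are formed, the sketch below joins the value of ir_languagefiles_path with a language name and each extension. The function name and the sample values (`spanish`, `..\irlang\`) are assumptions for illustration only, and the lowercase extensions follow the file names used in the procedure later in this topic.

```python
def language_file_paths(files_path, language):
    """Return the stop word, stem, and suffix file paths for one language.

    files_path is used as a plain prefix, mirroring
    [ir_languagefiles_path]<language>.<ext>, so it should end with a
    path separator.
    """
    extensions = {
        "stop words": "stp",
        "stem dictionary": "stm",
        "suffix file": "suf",
    }
    return {
        label: f"{files_path}{language}.{ext}"
        for label, ext in extensions.items()
    }


# Hypothetical values: a Windows-style relative path and a sample language name.
for label, path in language_file_paths("..\\irlang\\", "spanish").items():
    print(f"{label}: {path}")
```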

Note: IR search is not supported for Arabic or Hebrew.

For Chinese, Japanese, or Korean, see the analyzer files and ir_opt_path settings in the following section.

Implement IR searches for foreign languages

Applies to User roles: System Administrator

This procedure uses <language> as a placeholder. Substitute the name of the language you are implementing wherever “<language>” appears.

To enter language files into the IR system:

  1. For English, Spanish, French, Italian, Portuguese, and Japanese, create the files:

    • <language>.stp (stop words)
    • <language>.stm (stem dictionary)
    • <language>.suf (suffix dictionary)
    • <language>.nor (normals dictionary)

    For Chinese, use the following files:

    • irlang/cma_options.utf8
    • irlang/cma/*

    For Japanese, use the following files:

    • irlang/jma_options.utf8
    • irlang/jma/*

    For Korean, use the following files:

    • irlang/kma_options.utf8
    • irlang/kma/*

    Note: These file names may vary for different languages or platforms.

  2. Place those files in a unique directory. In this example, we use the directory DICT_PATH.
  3. Add the following parameters to the sm.ini file:

    ir_language:<language>

    ir_languagefiles_path:..\irlang\

    For Chinese, Japanese, or Korean, set ir_opt_path to the matching options file:

    ir_opt_path:..\irlang\cma_options.utf8 (Chinese)

    or

    ir_opt_path:..\irlang\jma_options.utf8 (Japanese)

    or

    ir_opt_path:..\irlang\kma_options.utf8 (Korean)
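
The procedure can be sanity-checked with a short script. The following is a minimal sketch, assuming Python is available on the host: it reports which of the four dictionary files from step 1 are missing from a directory and prints the sm.ini lines from step 3. The function names, the sample language name `spanish`, and the directory are hypothetical; only the parameter names and file extensions come from the steps above.

```python
from pathlib import Path

# Extensions of the dictionary files listed in step 1.
DICTIONARY_EXTENSIONS = ("stp", "stm", "suf", "nor")


def missing_dictionary_files(directory, language):
    """Return the step 1 file names that are not present in directory."""
    expected = [f"{language}.{ext}" for ext in DICTIONARY_EXTENSIONS]
    return [name for name in expected if not (Path(directory) / name).is_file()]


def sm_ini_lines(language, files_path, options_file=None):
    """Build the sm.ini parameter lines from step 3.

    options_file is only passed for Chinese, Japanese, or Korean
    (the cma, jma, or kma options file).
    """
    lines = [f"ir_language:{language}", f"ir_languagefiles_path:{files_path}"]
    if options_file:
        lines.append(f"ir_opt_path:{options_file}")
    return lines


if __name__ == "__main__":
    # Hypothetical language name and directory, for illustration only.
    missing = missing_dictionary_files("irlang", "spanish")
    if missing:
        print("Missing files:", ", ".join(missing))
    print("\n".join(sm_ini_lines("spanish", "..\\irlang\\")))
```

The script only checks file names and builds text; it does not modify sm.ini or validate the contents of the dictionary files.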

 

Related topics

Chinese, Japanese, and Korean language analyzer
Enable IR search for a file
Access IR Expert
Start IR Asynchronous mode