How IR Expert evaluates documents for relevance

IR Expert queries return results based on their relevance to the query. To determine relevance, IR Expert examines each term in an IR query and assigns the term a rank based on how often the term appears in the stored documents.

A term found in many documents has a lower rank than a term found in only a few documents. For example, if all of the incidents that customers report involve the Windows operating system, then the term “Windows” appears in almost every document and has a very low rank.
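The ranking idea above can be sketched with a smoothed inverse document frequency. This is only an illustration: the formula, the function name `term_ranks`, and the sample documents are assumptions, not the calculation IR Expert actually performs.

```python
import math

def term_ranks(query_terms, documents):
    """Rank each query term: the fewer documents contain it, the higher the rank."""
    total = len(documents)
    ranks = {}
    for term in query_terms:
        # Number of documents that contain the term at least once.
        doc_freq = sum(1 for doc in documents if term in doc)
        # Smoothed inverse document frequency (an assumed formula).
        ranks[term] = math.log((1 + total) / (1 + doc_freq))
    return ranks

# Hypothetical incident documents, already split into lower-case terms.
docs = [
    ["window", "crash", "printer"],
    ["window", "login", "timeout"],
    ["window", "printer", "driver"],
    ["window", "email", "sync"],
]
ranks = term_ranks(["window", "printer", "email"], docs)
```

In this sample, "window" occurs in every document and receives the lowest rank, while "email" occurs in only one document and receives the highest.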

After IR Expert ranks the query terms, it gives each stored document a weight based on how many of the query terms are in the document and on how often each term appears in the document. A document that contains a term twice has a greater weight than a document that contains the term once. Next, IR Expert compares the terms in the document with the terms in the query to see if there is a “phrase” match. If so, IR Expert gives that document a higher weight. Finally, IR Expert considers the most recently updated document to be the most relevant.
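A minimal sketch of the weighting steps described above, under several assumptions: the `Doc` type, the multiplicative `phrase_bonus`, and the use of the update time as a tie-breaker between equal weights are all hypothetical choices made for illustration, not documented IR Expert behavior.

```python
from dataclasses import dataclass

@dataclass
class Doc:
    doc_id: str
    tokens: list
    updated: int  # hypothetical timestamp; a larger value means more recent

def contains_phrase(tokens, phrase):
    """Return True if the phrase tokens appear consecutively in tokens."""
    n = len(phrase)
    return n > 0 and any(tokens[i:i + n] == phrase
                         for i in range(len(tokens) - n + 1))

def weight(doc, query, ranks, phrase_bonus=1.5):
    # Each occurrence of a query term adds that term's rank to the weight,
    # so a term that appears twice counts twice.
    score = sum(ranks.get(t, 0.0) * doc.tokens.count(t) for t in set(query))
    # Assumed bonus when the whole query appears as a consecutive phrase.
    if len(query) > 1 and contains_phrase(doc.tokens, query):
        score *= phrase_bonus
    return score

def search(docs, query, ranks):
    # Highest weight first; recency breaks ties (an assumption).
    return sorted(docs, key=lambda d: (weight(d, query, ranks), d.updated),
                  reverse=True)

ranks = {"printer": 1.0, "offline": 2.0}  # hypothetical term ranks
query = ["printer", "offline"]
docs = [
    Doc("a", ["printer", "is", "offline"], updated=1),
    Doc("b", ["printer", "offline", "printer"], updated=2),
    Doc("c", ["offline", "printer", "again"], updated=3),
]
ordered = [d.doc_id for d in search(docs, query, ranks)]
```

Document "b" comes first because it repeats a query term and contains the query as a phrase. Documents "a" and "c" have equal weights, so the more recently updated "c" is listed ahead of "a".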

Lexical analysis

| Symbol | Description |
| --- | --- |
| Digits | Numbers do not make good index terms, and are not usually included as tokens. In some instances, query statements consist of only digits, such as a record number. Therefore, IR Expert indexes digits along with alphabetic characters. |
| Hyphens | Words broken at the end of lines, or including hyphens, can result in multiple word fragment tokens. IR Expert considers hyphenated terms as a single token and does not break them apart. |
| Other punctuation | Other punctuation, including periods (.), commas (,), and underscores (_) are often used as parts of terms. IR Expert allows apostrophes (’), dashes (-), and periods (.) to appear within, but not at the beginning or end of a token. |
| Case | Case distinctions are important in some cases, such as programming languages. IR Expert is case-insensitive. It converts all Service Manager database terms to lower case. |
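The lexical rules above can be approximated with a single regular expression. This is a sketch, not IR Expert's tokenizer: the pattern, the restriction to ASCII letters and digits, and the treatment of commas and underscores as separators are assumptions.

```python
import re

# A token is a run of letters and digits; apostrophes, dashes, and periods
# may join runs but cannot start or end a token.
TOKEN = re.compile(r"[a-z0-9]+(?:['\u2019.\-][a-z0-9]+)*")

def tokenize(text):
    """Lower-case the text and split it into index tokens."""
    return TOKEN.findall(text.lower())

tokens = tokenize("Re-boot the user's PC, see IM10023 v2.5.")
```

For this sample, the hyphenated term and the record number each stay a single token, and the sentence-final period is dropped because it would end a token.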

Stemming

Stemming enables the user to find the variants of a term, while reducing the size of the index file. Because single stems typically correspond to several full terms, storing stems instead of full terms enables a compression factor of over 50 percent.
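To illustrate why storing stems shrinks the index, the sketch below uses a deliberately naive suffix stripper. The suffix list and the minimum stem length are assumptions chosen for the example; they are not the stemming algorithm that IR Expert uses.

```python
SUFFIXES = ("ing", "ed", "es", "s")  # assumed, illustrative suffix list

def stem(term):
    """Strip the first matching suffix, keeping a stem of at least 3 letters."""
    for suffix in SUFFIXES:
        if term.endswith(suffix) and len(term) - len(suffix) >= 3:
            return term[: -len(suffix)]
    return term

terms = ["print", "prints", "printed", "printing",
         "install", "installs", "installed", "installing"]
stems = {stem(t) for t in terms}
# Fraction of index entries saved by storing stems instead of full terms.
reduction = 1 - len(stems) / len(terms)
```

Here eight full terms collapse to two stems. The resulting reduction applies only to this toy word list and says nothing about the compression achieved on real data.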

Stop words

The stop words file naming convention in IR Expert eliminates the need for an extensive list of English words. The stop words identified are those that occur most frequently and are least useful for intelligent data retrieval.
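As a rough illustration of how stop words are applied, the sketch below drops listed words from a token list before indexing or querying. The word list and the function name are hypothetical; the actual contents of the IR Expert stop words file are not shown here.

```python
# Hypothetical stop word list; the real file contents are not reproduced here.
STOP_WORDS = frozenset({"a", "an", "and", "is", "of", "the", "to"})

def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Keep only the tokens that are not stop words, preserving order."""
    return [t for t in tokens if t not in stop_words]

kept = remove_stop_words(["the", "printer", "is", "offline"])
```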

Using IR Expert and Adaptive Learning

IR Expert allows users to relate search phrases to solution phrases in a table, so that when the same query is used again, the usage relevance score for the solution phrase increases. Additionally, IR Expert allows administrators to manually change the relevance score of individual phrases and words.

The accuracy of this feature depends on each query term and on how that term has been used, correctly or incorrectly. The data is useful only if a System Administrator monitors the adaptive learning data.
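The adaptive learning table described above can be pictured as a mapping from a pair of phrases to a score. The class below is a hypothetical sketch: the name `AdaptiveTable`, its methods, and the increment of 1 per use are assumptions and do not reflect the actual Service Manager table layout.

```python
from collections import defaultdict

class AdaptiveTable:
    """Relates (search phrase, solution phrase) pairs to a usage relevance score."""

    def __init__(self):
        self.scores = defaultdict(int)

    def record_use(self, query_phrase, solution_phrase, step=1):
        # Reusing the same query for the same solution raises the score.
        key = (query_phrase, solution_phrase)
        self.scores[key] += step
        return self.scores[key]

    def set_score(self, query_phrase, solution_phrase, value):
        # Manual override, as an administrator might apply it.
        self.scores[(query_phrase, solution_phrase)] = value

    def score(self, query_phrase, solution_phrase):
        # Unknown pairs score 0 and are not added to the table.
        return self.scores.get((query_phrase, solution_phrase), 0)

table = AdaptiveTable()
table.record_use("printer offline", "restart print spooler")
table.record_use("printer offline", "restart print spooler")
table.set_score("printer offline", "replace toner", 10)
```

Because every use changes the stored score, a mistaken association keeps its effect until someone corrects it, which is why the data needs monitoring.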

The largest value that can be applied to a query or incident record is 65000.

Related topics

IR Expert file descriptions

Access IR Expert

Start IR Asynchronous mode