Understanding an unexpected text order

HPE Service Manager stores text in Unicode. Service Request Catalog displays text according to the Unicode standard, which determines the order in which text is displayed relative to a particular base direction.

The base direction influences the order of the display of text of different directions, and the display of directionally-neutral text (i.e., characters or sequences of characters that do not have inherent directionality, as defined in the Unicode Character Standard).

In Service Request Catalog, if the language is Arabic or Hebrew, the base direction is right-to-left (RTL); otherwise, it is left-to-right (LTR).
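The rule above can be sketched as a small lookup. This is only an illustration: the function name, the use of ISO 639-1 language codes ("ar", "he"), and the handling of region suffixes are assumptions, not a description of how Service Request Catalog identifies the language internally.

```python
# Hypothetical helper: language codes are assumed to be ISO 639-1 style,
# optionally followed by a region suffix such as "ar-SA" or "he_IL".
RTL_LANGUAGES = {"ar", "he"}  # Arabic and Hebrew

def base_direction(language_code):
    """Return "RTL" for Arabic or Hebrew, and "LTR" for any other language."""
    primary = language_code.replace("_", "-").split("-")[0].lower()
    return "RTL" if primary in RTL_LANGUAGES else "LTR"

print(base_direction("he"))     # RTL
print(base_direction("ar-SA"))  # RTL
print(base_direction("en"))     # LTR
```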

For example, suppose a user writes the following text in English as the name of a catalog item. In an LTR language in Service Request Catalog, the text is displayed exactly as shown:

Printing (North America)

However, in a Hebrew or Arabic language environment, where the base direction is RTL, the text is displayed as follows:

(Printing (North America

This behavior occurs because a parenthesis is directionally neutral in the Unicode standard and does not have an inherent direction. The Unicode text engine first checks whether a parenthesis can inherit a direction from the surrounding text. If it cannot, the parenthesis defaults to the base direction. In the example text, the first parenthesis is embedded in a single run of LTR text, so it adopts the direction of its surrounding text. The second parenthesis is at the end of the text: LTR text precedes it, but nothing follows it, so it is not surrounded by text of a single direction. Therefore, the second parenthesis defaults to the RTL base direction.
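To see which characters are neutral, you can inspect the bidirectional class that the Unicode Character Database assigns to each character. The following sketch uses Python's standard unicodedata module; the helper name bidi_classes is made up for this illustration. Letters of the Latin alphabet report the strong class L, while parentheses report ON (other neutral) and spaces report WS (whitespace).

```python
import unicodedata

def bidi_classes(text):
    """Return (character, bidirectional class) pairs for each character."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]

for ch, cls in bidi_classes("g (N)"):
    print(repr(ch), cls)
# 'g' L
# ' ' WS
# '(' ON
# 'N' L
# ')' ON

# Hebrew and Arabic letters carry strong right-to-left classes.
print(unicodedata.bidirectional("\u05D0"))  # R  (HEBREW LETTER ALEF)
print(unicodedata.bidirectional("\u0627"))  # AL (ARABIC LETTER ALEF)
```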

Thus, the text can be divided into two parts with different directions, wherein the first part is LTR:

Printing (North America

The second part is RTL:

(

Because “)” displays as “(” in the RTL direction, and the two divided text parts are ordered in the RTL base direction, the text is displayed as shown:

(Printing (North America
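The reasoning above can be reproduced with a deliberately simplified model. The sketch below is not the full Unicode Bidirectional Algorithm: it treats every character that is not strongly LTR or RTL as neutral, ignores numbers and explicit embedding controls, and omits the paired-bracket handling found in newer versions of the algorithm. All function names and the small mirroring table are assumptions made for this illustration.

```python
import unicodedata

# Minimal mirroring table; the real standard defines many more pairs.
MIRROR = {"(": ")", ")": "(", "[": "]", "]": "[", "{": "}", "}": "{"}

def strong_direction(ch):
    """Return "L" or "R" for strongly directional characters, else None."""
    cls = unicodedata.bidirectional(ch)
    if cls == "L":
        return "L"
    if cls in ("R", "AL"):
        return "R"
    return None  # everything else is treated as neutral in this sketch

def resolve_directions(text, base):
    """Give every character a direction; base is "L" or "R"."""
    strong = [strong_direction(ch) for ch in text]
    resolved = []
    for i, direction in enumerate(strong):
        if direction is None:
            # Nearest strong character on each side; the start and end of
            # the text count as having the base direction.
            before = next((d for d in reversed(strong[:i]) if d), base)
            after = next((d for d in strong[i + 1:] if d), base)
            direction = before if before == after else base
        resolved.append(direction)
    return resolved

def visual_order(text, base):
    """Return the characters in the order they appear on screen, left to right."""
    runs = []  # consecutive characters that share a direction
    for ch, direction in zip(text, resolve_directions(text, base)):
        if runs and runs[-1][0] == direction:
            runs[-1][1].append(ch)
        else:
            runs.append((direction, [ch]))
    pieces = []
    for direction, chars in runs:
        if direction == "R":
            # RTL runs are laid out right to left, with brackets mirrored.
            chars = [MIRROR.get(c, c) for c in reversed(chars)]
        pieces.append("".join(chars))
    if base == "R":
        pieces.reverse()  # the first run in the text sits at the right edge
    return "".join(pieces)

name = "Printing (North America)"
print(visual_order(name, "L"))  # Printing (North America)
print(visual_order(name, "R"))  # (Printing (North America
```

With an RTL base direction, the closing parenthesis becomes its own one-character RTL run, is mirrored, and is placed at the left edge, which matches the display shown above.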

To resolve this issue, the Unicode standard provides two special characters for use in bidirectional text.

Unicode character   Code     HTML                      Name                 Description
LRM                 U+200E   &lrm; &#x200E; &#8206;    LEFT-TO-RIGHT MARK   Left-to-right zero-width character
RLM                 U+200F   &rlm; &#x200F; &#8207;    RIGHT-TO-LEFT MARK   Right-to-left zero-width character

These zero-width invisible characters act as surrounding text. These characters enable directionally-neutral characters (such as parentheses) to inherit the correct direction from the surrounding text instead of defaulting to the base direction.

In the prior example, you can add a left-to-right mark after the second parenthesis to fix the problem. In the following example, LRM stands for the invisible U+200E character:

Printing (North America)LRM

Because both parentheses are now surrounded by LTR text, the whole text is treated as a single run of LTR text. Therefore, the text is displayed as intended in a Hebrew or Arabic environment:

Printing (North America)
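If catalog item names are prepared programmatically, the mark can be appended as an ordinary character. The following is a minimal sketch in Python; the helper name with_trailing_lrm is hypothetical, and how the resulting text is entered into Service Manager is outside the scope of this example.

```python
import unicodedata

LRM = "\u200E"  # LEFT-TO-RIGHT MARK
RLM = "\u200F"  # RIGHT-TO-LEFT MARK

def with_trailing_lrm(text):
    """Append a left-to-right mark unless the text already ends with one."""
    return text if text.endswith(LRM) else text + LRM

name = with_trailing_lrm("Printing (North America)")

print(unicodedata.name(LRM))           # LEFT-TO-RIGHT MARK
print(unicodedata.bidirectional(LRM))  # L
print(unicodedata.bidirectional(RLM))  # R
print(len(name))                       # 25 (24 visible characters plus the mark)
```

Because the mark has the strong class L, the closing parenthesis now has LTR characters on both sides and no longer falls back to the base direction.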

The Unicode standard also defines other control characters for bidirectional text, but they are rarely used. For complete information about the Unicode Bidirectional Algorithm and the related control characters, see the following links:

http://www.w3.org/International/articles/inline-bidi-markup/uba-basics

http://www.unicode.org/L2/L2012/12173-bidi-paren.pdf

http://www.unicode.org/reports/tr9/