The following characters signal that a piece of text is to be treated as embedded. For example, an English quotation in the middle of an Arabic sentence could be marked as being embedded left-to-right text. If there were a Hebrew phrase in the middle of the English quotation, that phrase could be marked as being embedded right-to-left text. Embeddings can be nested one inside another, and in isolates and overrides.
Abbr. | Code Point | Name | Description |
---|---|---|---|
LRE | U+202A | LEFT-TO-RIGHT EMBEDDING | Treat the following text as embedded left-to-right. |
RLE | U+202B | RIGHT-TO-LEFT EMBEDDING | Treat the following text as embedded right-to-left. |
The effect of right-to-left line direction, for example, can be accomplished by embedding the text with RLE...PDF. (PDF is described in Section 2.3, Terminating Explicit Directional Embeddings and Overrides.)
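As a minimal sketch of the RLE...PDF wrapping mentioned above, the following Python snippet brackets a string with explicit embedding characters. The helper name `embed` is hypothetical and not part of any standard API; `unicodedata.bidirectional` from the Python standard library is used only to confirm the bidirectional classes of the added characters.

```python
import unicodedata

LRE = "\u202A"  # LEFT-TO-RIGHT EMBEDDING
RLE = "\u202B"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING


def embed(text: str, rtl: bool) -> str:
    """Wrap text in an explicit embedding (hypothetical helper).

    The opener is RLE for right-to-left and LRE for left-to-right;
    PDF closes the scope that the opener started.
    """
    opener = RLE if rtl else LRE
    return opener + text + PDF


wrapped = embed("abc 123", rtl=True)

# Show the code points and bidirectional classes of the two added characters.
for ch in (wrapped[0], wrapped[-1]):
    print(f"U+{ord(ch):04X}", unicodedata.bidirectional(ch))
```

The wrapped string only carries the formatting characters; how it is finally displayed still depends on a renderer that implements the bidirectional algorithm.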
The following characters allow the bidirectional character types to be overridden when required for special cases, such as for part numbers. They are to be avoided wherever possible, because of security concerns. For more information, see [UTR36]. Directional overrides can be nested one inside another, and in embeddings and isolates.
Abbr. | Code Point | Name | Description |
---|---|---|---|
LRO | U+202D | LEFT-TO-RIGHT OVERRIDE | Force following characters to be treated as strong left-to-right characters. |
RLO | U+202E | RIGHT-TO-LEFT OVERRIDE | Force following characters to be treated as strong right-to-left characters. |
The precise meaning of these characters will be made clear in the discussion of the algorithm. The right-to-left override, for example, can be used to force a part number made of a mixture of English letters, digits, and Hebrew letters to be written from right to left.
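Because directional overrides are best avoided for security reasons, it can be useful to detect them in incoming text. The following is a minimal sketch, assuming that a simple scan by bidirectional class is sufficient for the caller's purposes; the function name `find_overrides` is hypothetical.

```python
import unicodedata
from typing import List, Tuple


def find_overrides(text: str) -> List[Tuple[int, str]]:
    """Return (index, bidi class) for every LRO or RLO in text."""
    hits = []
    for index, ch in enumerate(text):
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("LRO", "RLO"):
            hits.append((index, bidi_class))
    return hits


# A part number forced right-to-left with RLO (U+202E) ... PDF (U+202C).
sample = "part \u202EAB-12\u202C end"
hits = find_overrides(sample)
print(hits)
```

A scan like this only reports where overrides occur. Whether to reject, strip, or merely flag them is a policy decision outside the scope of this sketch.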
The following character terminates the scope of the last LRE, RLE, LRO, or RLO whose scope has not yet been terminated.
Abbr. | Code Point | Name | Description |
---|---|---|---|
PDF | U+202C | POP DIRECTIONAL FORMATTING | End the scope of the last LRE, RLE, RLO, or LRO. |
The precise meaning of this character will be made clear in the discussion of the algorithm.
The following characters signal that a piece of text is to be treated as directionally isolated from its surroundings. They are very similar to the explicit embedding formatting characters. However, while an embedding roughly has the effect of a strong character on the ordering of the surrounding text, an isolate has the effect of a neutral character such as U+FFFC OBJECT REPLACEMENT CHARACTER, and is assigned the corresponding display position in the surrounding text. Furthermore, the text inside the isolate has no effect on the ordering of the text outside it, and vice versa.
In addition to allowing the embedding of strongly directional text without unduly affecting the bidirectional order of its surroundings, one of the isolate formatting characters also offers an extra feature: embedding text while inferring its direction heuristically from its constituent characters.
Isolates can be nested one inside another, and in embeddings and overrides.
Abbr. | Code Point | Name | Description |
---|---|---|---|
LRI | U+2066 | LEFT-TO-RIGHT ISOLATE | Treat the following text as isolated and left-to-right. |
RLI | U+2067 | RIGHT-TO-LEFT ISOLATE | Treat the following text as isolated and right-to-left. |
FSI | U+2068 | FIRST STRONG ISOLATE | Treat the following text as isolated and in the direction of its first strong directional character that is not inside a nested isolate. |
The precise meaning of these characters will be made clear in the discussion of the algorithm.
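The FSI description above can be illustrated with a rough sketch that finds the first strong directional character while skipping anything inside a nested isolate. This is not the full algorithm: it assumes the input is exactly the text between an FSI and its matching PDI, and it leaves the choice of a fallback direction (when no strong character is found) to the caller. The function name `first_strong_direction` is hypothetical.

```python
import unicodedata
from typing import Optional

ISOLATE_INITIATORS = {"LRI", "RLI", "FSI"}
# Strong bidirectional classes mapped to a direction label.
STRONG = {"L": "ltr", "R": "rtl", "AL": "rtl"}


def first_strong_direction(text: str) -> Optional[str]:
    """Return 'ltr' or 'rtl' for the first strong character that is
    not inside a nested isolate, or None if there is none."""
    depth = 0  # number of nested isolates currently open
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ISOLATE_INITIATORS:
            depth += 1
        elif bidi_class == "PDI":
            if depth > 0:
                depth -= 1
        elif depth == 0 and bidi_class in STRONG:
            return STRONG[bidi_class]
    return None


# The nested LRI...PDI is skipped, so the Hebrew letter decides.
print(first_strong_direction("\u2066abc\u2069 \u05D0"))
```

Digits and whitespace are not strong characters, so a string such as `"123 \u05D0 abc"` is classified by the Hebrew letter rather than by the leading digits.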
The following character terminates the scope of the last LRI, RLI, or FSI whose scope has not yet been terminated, as well as the scopes of any subsequent LREs, RLEs, LROs, or RLOs whose scopes have not yet been terminated.
Abbr. | Code Point | Name | Description |
---|---|---|---|
PDI | U+2069 | POP DIRECTIONAL ISOLATE | End the scope of the last LRI, RLI, or FSI. |
The precise meaning of this character will be made clear in the discussion of the algorithm.
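The difference between the two terminators can be sketched with a simple stack: PDF closes only the last open embedding or override, while PDI closes the last open isolate together with any embeddings or overrides opened after it. The following is a simplified illustration, not the algorithm itself; it ignores depth limits and paragraph boundaries, and the function name `open_scopes` is hypothetical.

```python
import unicodedata
from typing import List

EMBEDDING_OR_OVERRIDE = {"LRE", "RLE", "LRO", "RLO"}
ISOLATE = {"LRI", "RLI", "FSI"}


def open_scopes(text: str) -> List[str]:
    """Return the bidi classes of the formatting characters whose
    scopes are still open at the end of text (outermost first)."""
    stack: List[str] = []
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in EMBEDDING_OR_OVERRIDE or bidi_class in ISOLATE:
            stack.append(bidi_class)
        elif bidi_class == "PDF":
            # PDF closes the last embedding or override, never an isolate.
            if stack and stack[-1] in EMBEDDING_OR_OVERRIDE:
                stack.pop()
        elif bidi_class == "PDI":
            # PDI closes the last isolate plus anything opened after it.
            # A PDI with no open isolate is ignored.
            if any(entry in ISOLATE for entry in stack):
                while stack.pop() not in ISOLATE:
                    pass
    return stack


# RLI, then LRE, then PDI: the PDI closes both scopes.
print(open_scopes("\u2067\u202Aabc\u2069"))
# RLI followed by PDF: the isolate stays open.
print(open_scopes("\u2067abc\u202C"))
```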
The implicit directional marks are very lightweight formatting characters. They act exactly like right-to-left or left-to-right characters, except that they do not display or have any other semantic effect. Their use is more convenient than using explicit embeddings or overrides because their scope is much more local.
Abbr. | Code Point | Name | Description |
---|---|---|---|
LRM | U+200E | LEFT-TO-RIGHT MARK | Left-to-right zero-width character |
RLM | U+200F | RIGHT-TO-LEFT MARK | Right-to-left zero-width non-Arabic character |
ALM | U+061C | ARABIC LETTER MARK | Right-to-left zero-width Arabic character |
There is no special mention of the implicit directional marks in the following algorithm. That is because their effect on bidirectional ordering is exactly the same as that of a corresponding strong directional character; the only difference is that they do not appear in the display.
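The claim that the implicit marks behave like ordinary strong characters can be checked against the character properties. The short sketch below uses Python's standard `unicodedata` module and assumes a Python build whose Unicode database includes U+061C (Unicode 6.3 or later); the dictionary names are illustrative only.

```python
import unicodedata

MARKS = {"LRM": "\u200E", "RLM": "\u200F", "ALM": "\u061C"}

# Bidirectional class and general category of each implicit mark.
bidi_classes = {abbr: unicodedata.bidirectional(ch) for abbr, ch in MARKS.items()}
categories = {abbr: unicodedata.category(ch) for abbr, ch in MARKS.items()}

for abbr, ch in MARKS.items():
    print(abbr, unicodedata.name(ch), bidi_classes[abbr], categories[abbr])
```

Each mark carries a strong bidirectional class (L, R, or AL) while belonging to the format general category (Cf), which is consistent with a character that affects ordering but is not displayed.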