Commit dca45f7

Unicode art, grammar suggestions
1 parent 18b1314 commit dca45f7

1 file changed, +7 -7 lines changed
1-js/99-js-misc/06-unicode/article.md

@@ -2,7 +2,7 @@
 # Unicode, String internals
 
 ```warn header="Advanced knowledge"
-The section goes deeper into string internals. This knowledge will be useful for you if you plan to deal with emoji, rare mathematical or hieroglyphic characters or other rare symbols.
+The section goes deeper into string internals. This knowledge will be useful for you if you plan to deal with emoji, rare mathematical or hieroglyphic characters, or other rare symbols.
 ```
 
 As we already know, JavaScript strings are based on [Unicode](https://en.wikipedia.org/wiki/Unicode): each character is represented by a byte sequence of 1-4 bytes.
@@ -11,21 +11,21 @@ JavaScript allows us to insert a character into a string by specifying its hexad
 
 - `\xXX`
 
-`XX` must be two hexadecimal digits with value between `00` and `FF`, then it's character whose Unicode code is `XX`.
+`XX` must be two hexadecimal digits with a value between `00` and `FF`, then it's a character whose Unicode code is `XX`.
 
 Because the `\xXX` notation supports only two digits, it can be used only for the first 256 Unicode characters.
 
-These first 256 characters include latin alphabet, most basic syntax characters and some others. For example, `"\x7A"` is the same as `"z"` (Unicode `U+007A`).
+These first 256 characters include the latin alphabet, most basic syntax characters, and some others. For example, `"\x7A"` is the same as `"z"` (Unicode `U+007A`).
 
 ```js run
 alert( "\x7A" ); // z
 alert( "\xA9" ); // ©, the copyright symbol
 ```
 
 - `\uXXXX`
-`XXXX` must be exactly 4 hex digits with the value between `0000` and `FFFF`, then `\uXXXX` is a character whose Unicode code is `XXXX` .
+`XXXX` must be exactly 4 hex digits with the value between `0000` and `FFFF`, then `\uXXXX` is a character whose Unicode code is `XXXX`.
 
-Characters with Unicode value greater than `U+FFFF` can also be represented with this notation, but in this case we will need to use a so called surrogate pair (we will talk about surrogate pairs later in this chapter).
+Characters with Unicode values greater than `U+FFFF` can also be represented with this notation, but in this case, we will need to use a so called surrogate pair (we will talk about surrogate pairs later in this chapter).
 
 ```js run
 alert( "\u00A9" ); // ©, the same as \xA9, using the 4-digit hex notation
@@ -120,7 +120,7 @@ For instance, the letter `a` can be the base character for these characters: `à
 
 Most common "composite" characters have their own code in the Unicode table. But not all of them, because there are too many possible combinations.
 
-To support arbitrary compositions, Unicode standard allows us to use several Unicode characters: the base character followed by one or many "mark" characters that "decorate" it.
+To support arbitrary compositions, the Unicode standard allows us to use several Unicode characters: the base character followed by one or many "mark" characters that "decorate" it.
 
 For instance, if we have `S` followed by the special "dot above" character (code `\u0307`), it is shown as Ṡ.
 
@@ -167,6 +167,6 @@ alert( "S\u0307\u0323".normalize().length ); // 1
 alert( "S\u0307\u0323".normalize() == "\u1e68" ); // true
 ```
 
-In reality, this is not always the case. The reason being that the symbol `Ṩ` is "common enough", so Unicode creators included it in the main table and gave it the code.
+In reality, this is not always the case. The reason is that the symbol `Ṩ` is "common enough", so Unicode creators included it in the main table and gave it the code.
 
 If you want to learn more about normalization rules and variants -- they are described in the appendix of the Unicode standard: [Unicode Normalization Forms](https://www.unicode.org/reports/tr15/), but for most practical purposes the information from this section is enough.
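
As a rough check of the behaviour described in the lines touched by this commit, the sketch below compares the two escape notations and shows how `normalize()` collapses a base character followed by marks. The variable names are illustrative and not part of the article, and `console.log` stands in for the article's `alert` so the snippet also runs outside a browser.

```js
// \xXX and \uXXXX with the same code produce the same character.
const viaX = "\x7A";
const viaU = "\u007A";
console.log(viaX === "z", viaU === "z"); // true true

// A base character followed by two combining marks, in two different orders.
const composed = "S\u0307\u0323"; // S + dot above + dot below
const swapped = "S\u0323\u0307";  // S + dot below + dot above

console.log(composed.length);      // 3
console.log(composed === swapped); // false: the raw sequences differ

// normalize() brings both sequences to the same single-character form.
console.log(composed.normalize() === swapped.normalize()); // true
console.log(composed.normalize() === "\u1e68");            // true
```

Comparing normalized strings, rather than raw ones, is what makes the two visually identical sequences compare as equal.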
