1-js/01-getting-started/2-manuals-specifications/article.md
+2-3
@@ -3,16 +3,15 @@
3
3
4
4
This book is a *tutorial*. It aims to help you gradually learn the language. But once you're familiar with the basics, you'll need other sources.
5
5
6
-
7
6
## Specification
8
7
9
8
**The ECMA-262 specification** contains the most in-depth, detailed and formalized information about JavaScript. It defines the language.
10
9
11
-
But being that formalized, it's difficult to understand at first. So if you need the most trustworthy source of information about the language details, it's the right place. But it's not for everyday use.
10
+
But being that formalized, it's difficult to understand at first. So if you need the most trustworthy source of information about the language details, the specification is the right place. But it's not for everyday use.
12
11
13
12
The latest draft is at <https://tc39.es/ecma262/>.
14
13
15
-
To read about bleeding-edge features, that are not yet widely supported, see proposals at <https://github.com/tc39/proposals>.
14
+
To read about new bleeding-edge features that are "almost standard", see the proposals at <https://github.com/tc39/proposals>.
16
15
17
16
Also, if you're developing for the browser, then there are other specs covered in the [second part](info:browser-environment) of the tutorial.
|`\u{X…XXXXXX}` (1 to 6 hex characters)|A unicode symbol with the given UTF-32 encoding. Some rare characters are encoded with two unicode symbols, taking up to 4 bytes. This way we can insert long codes. |
84
94
85
95
Examples with unicode:
86
96
@@ -102,7 +112,7 @@ alert( 'I*!*\'*/!*m the Walrus!' ); // *!*I'm*/!* the Walrus!
102
112
103
113
As you can see, we have to prepend the inner quote by the backslash `\'`, because otherwise it would indicate the string end.
104
114
105
-
Of course, that refers only to the quotes that are the same as the enclosing ones. So, as a more elegant solution, we could switch to double quotes or backticks instead:
115
+
Of course, only the quotes that are the same as the enclosing ones need to be escaped. So, as a more elegant solution, we could switch to double quotes or backticks instead:
106
116
107
117
```js run
108
118
alert( `I'm the Walrus!` ); // I'm the Walrus!
@@ -455,7 +465,7 @@ Let's recap these methods to avoid any confusion:
455
465
```smart header="Which one to choose?"
456
466
All of them can do the job. Formally, `substr` has a minor drawback: it is described not in the core JavaScript specification, but in Annex B, which covers browser-only features that exist mainly for historical reasons. So, non-browser environments may fail to support it. But in practice it works everywhere.
457
467
458
-
The author finds themself using `slice` almost all the time.
468
+
Of the other two variants, `slice` is a little bit more flexible: it allows negative arguments and is shorter to write. So, it's enough to remember only `slice` out of these three methods.
459
469
```
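
As a minimal sketch of the difference described in the note above, the snippet below applies all three methods to the same string. The sample string and variable names are illustrative assumptions, and `console.log` stands in for `alert` so the code also runs outside the browser.

```js
const str = "stringify";

// slice(start, end): the character at `end` is not included
const a = str.slice(0, 5);      // "strin"

// slice accepts negative arguments, counted from the end of the string
const b = str.slice(-4, -1);    // "gif"

// substring treats negative arguments as 0
const c = str.substring(-4, 2); // "st"

// substr(start, length) takes a length instead of an end position
const d = str.substr(-4, 2);    // "gi"

console.log(a, b, c, d);
```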
460
470
461
471
## Comparing strings
@@ -530,7 +540,7 @@ The characters are compared by their numeric code. The greater code means that t
530
540
531
541
### Correct comparisons
532
542
533
-
The "right" algorithm to do string comparisons is more complex than it may seem, because alphabets are different for different languages. The same-looking letter may be located differently in different alphabets.
543
+
The "right" algorithm to do string comparisons is more complex than it may seem, because alphabets are different for different languages.
534
544
535
545
So, the browser needs to know the language to compare.
This method actually has two additional arguments specified in [the documentation](mdn:js/String/localeCompare), which allows it to specify the language (by default taken from the environment) and setup additional rules like case sensitivity or should `"a"` and `"á"` be treated as the same etc.
563
+
This method actually has two additional arguments specified in [the documentation](mdn:js/String/localeCompare), which allow us to specify the language (by default taken from the environment; letter order depends on the language) and set up additional rules such as case sensitivity or whether `"a"` and `"á"` should be treated as the same.
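
A short sketch of those two extra arguments, assuming the environment ships locale data for German (`de`), Swedish (`sv`) and English (`en`). The exact numbers returned may vary between engines, so only the sign matters here.

```js
// Second argument: the language. Letter order depends on it.
const german = 'ä'.localeCompare('z', 'de');  // negative: ä goes before z in German
const swedish = 'ä'.localeCompare('z', 'sv'); // positive: ä goes after z in Swedish

// Third argument: options, e.g. how strictly accents and case are compared.
const base = 'a'.localeCompare('á', 'en', { sensitivity: 'base' });     // 0: treated as the same
const accent = 'a'.localeCompare('á', 'en', { sensitivity: 'accent' }); // non-zero: treated as different

console.log(german, swedish, base, accent);
```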
554
564
555
565
## Internals, Unicode
556
566
@@ -580,7 +590,7 @@ We actually have a single symbol in each of the strings above, but the `length`
580
590
581
591
`String.fromCodePoint` and `str.codePointAt` are among the few methods that deal with surrogate pairs correctly. They recently appeared in the language. Before them, there were only [String.fromCharCode](mdn:js/String/fromCharCode) and [str.charCodeAt](mdn:js/String/charCodeAt). These methods are actually the same as `fromCodePoint/codePointAt`, but don't work with surrogate pairs.
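
To make the difference concrete, here is a small sketch that compares the two pairs of methods on one symbol stored as a surrogate pair. The symbol is written as an escape so it survives any file encoding; the variable names are illustrative.

```js
const x = '\u{1D4B3}'; // 𝒳, mathematical script capital X

const len = x.length;                        // 2: the symbol is stored as a surrogate pair
const unit = x.charCodeAt(0).toString(16);   // "d835": only the first half of the pair
const point = x.codePointAt(0).toString(16); // "1d4b3": the code of the whole symbol
const back = String.fromCodePoint(0x1d4b3);  // builds the symbol back from its code

console.log(len, unit, point, back === x);
```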
582
592
583
-
But, for instance, getting a symbol can be tricky, because surrogate pairs are treated as two characters:
593
+
Getting a symbol can be tricky, because surrogate pairs are treated as two characters:
584
594
585
595
```js run
586
596
alert( '𝒳'[0] ); // strange symbols...
@@ -608,7 +618,7 @@ In many languages there are symbols that are composed of the base character with
608
618
609
619
For instance, the letter `a` can be the base character for: `àáâäãåā`. The most common "composite" characters have their own code in the UTF-16 table. But not all of them, because there are too many possible combinations.
610
620
611
-
To support arbitrary compositions, UTF-16 allows us to use several unicode characters. The base character and one or many "mark" characters that "decorate" it.
621
+
To support arbitrary compositions, UTF-16 allows us to use several unicode characters: the base character followed by one or many "mark" characters that "decorate" it.
612
622
613
623
For instance, if we have `S` followed by the special "dot above" character (code `\u0307`), it is shown as Ṡ.
alert( 'S\u0307\u0323'=='S\u0323\u0307' ); // false, different characters (?!)
638
648
```
639
649
640
650
To solve this, there exists a "unicode normalization" algorithm that brings each string to a single "normal" form.
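
A minimal sketch of that normalization using the built-in `str.normalize()`, applied to the two strings from the comparison above. The variable names are illustrative.

```js
const s1 = 'S\u0307\u0323'; // S + dot above + dot below
const s2 = 'S\u0323\u0307'; // S + dot below + dot above

const rawEqual = s1 === s2;                          // false: different sequences of characters
const normEqual = s1.normalize() === s2.normalize(); // true: same normal form
const normLength = s1.normalize().length;            // 1: the three characters collapse into one

console.log(rawEqual, normEqual, normLength);
```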
@@ -660,7 +670,7 @@ If you want to learn more about normalization rules and variants -- they are des
660
670
661
671
## Summary
662
672
663
-
- There are 3 types of quotes. Backticks allow a string to span multiple lines and embed expressions.
673
+
- There are 3 types of quotes. Backticks allow a string to span multiple lines and embed expressions `${…}`.
664
674
- Strings in JavaScript are encoded using UTF-16.
665
675
- We can use special characters like `\n` and insert letters by their unicode using `\u...`.
666
676
- To get a character, use: `[]`.
@@ -673,6 +683,6 @@ There are several other helpful methods in strings:
673
683
674
684
- `str.trim()` -- removes ("trims") spaces from the beginning and end of the string.
675
685
- `str.repeat(n)` -- repeats the string `n` times.
676
-
- ...and more. See the [manual](mdn:js/String) for details.
686
+
- ...and more to be found in the [manual](mdn:js/String).
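
For illustration, a quick sketch of the two methods named in the list above; the sample strings are arbitrary.

```js
const trimmed = '  Hello  '.trim(); // "Hello": spaces removed from both ends
const repeated = 'ab'.repeat(3);    // "ababab": the string repeated 3 times

console.log(trimmed, repeated);
```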
677
687
678
-
Strings also have methods for doing search/replace with regular expressions. But that topic deserves a separate chapter, so we'll return to that later.
688
+
Strings also have methods for doing search/replace with regular expressions. But that's a big topic, so it's explained in a separate tutorial section <info:regular-expressions>.