
Chapter 8. Regular Expressions

程序员文章站 2022-04-22 09:44:56

1.  Regular expression patterns are written between slashes (/). Slashes inside the expression have to be escaped with backslashes (but quotes do not).

 

2.  The search method of strings works like indexOf—it returns the position at which it finds its argument—but takes a regular expression instead of a string.
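
As a minimal sketch of the first two notes (the sample strings and the name slashPattern are made up for illustration), the following shows an escaped slash inside a pattern and compares search with indexOf:

```javascript
// A slash inside the pattern is escaped with a backslash; the quotes are not.
const slashPattern = /a\/b "c"/;
console.log('x a/b "c" y'.search(slashPattern)); // 2

// search works like indexOf but takes a regular expression.
console.log("the quick fox".indexOf("quick")); // 4
console.log("the quick fox".search(/quick/)); // 4

// Like indexOf, search returns -1 when nothing is found.
console.log("the quick fox".search(/slow/)); // -1
```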

 

3.  The [ and ] characters have a special meaning inside a regular expression. They enclose a list of characters and will match when one of these characters is found.

 

4.  The dot (.) can be used to mean “any character that is not a line-break character.” An escaped d (\d) means “any digit.” An escaped w (\w) matches any “word” character, meaning alphabetic characters, digits, and the underscore character. An escaped s (\s) matches any whitespace character (things such as tabs, newlines, and spaces).

 

5.  Writing \d, \w, and \s with a capital letter (\D, \W, \S) negates their meaning: \D matches any character that is not a digit, and so on. When using [ and ], a set can be inverted by starting it with a ^ character, so [^abc] matches any character except a, b, or c.
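
The character sets and escapes from notes 3 to 5 can be tried out with search. This is only a sketch; the sample strings and variable names are invented for illustration:

```javascript
// A set matches any one of the listed characters.
const firstVowel = "rhythm and blues".search(/[aeiou]/);
console.log(firstVowel); // 7

// The dot matches any character except a line break.
console.log(/./.test("\n")); // false

// \d, \w, and \s, each followed by its capitalized negation.
console.log("room 101".search(/\d/)); // 5
console.log("2022-04".search(/\D/)); // 4
console.log("!? _x".search(/\w/)); // 3
console.log("abc def".search(/\W/)); // 3
console.log("tab\there".search(/\s/)); // 3
console.log("  x".search(/\S/)); // 2

// A ^ at the start of a set inverts it.
const firstNonLetter = "abc1".search(/[^a-z]/);
console.log(firstNonLetter); // 3
```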

 

6.  The ^ character matches the start of the string, and the $ character matches the end.

 

7.  Regular expressions are objects and have methods. The test method returns a Boolean indicating whether the given string contains a match for the expression.
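
A small sketch of notes 6 and 7 together. The patterns startsWithDigit and endsWithDot are hypothetical names chosen for this example:

```javascript
const startsWithDigit = /^\d/;
const endsWithDot = /\.$/;

console.log(startsWithDigit.test("7 days")); // true
console.log(startsWithDigit.test("day 7")); // false
console.log(endsWithDot.test("Done.")); // true
console.log(endsWithDot.test("Done. Next")); // false

// Without anchors, test succeeds if the pattern occurs anywhere.
console.log(/cat/.test("concatenate")); // true
// With both anchors, the whole string has to match.
console.log(/^cat$/.test("concatenate")); // false
```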

 

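8.  The \b escape matches a “word boundary”: a position where a word character (\w) sits next to a non-word character such as punctuation or whitespace, or next to the start or end of the string. It matches a position, not a character.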
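
To illustrate the word boundary, here is a short sketch (wholeWordCat is a name invented for this example) that finds "cat" only when it stands as a word of its own:

```javascript
const wholeWordCat = /\bcat\b/;

console.log(wholeWordCat.test("the cat sat")); // true
console.log(wholeWordCat.test("(cat)")); // true, punctuation is a boundary
console.log(wholeWordCat.test("cat")); // true, start and end of the string
console.log(wholeWordCat.test("concatenate")); // false, letters on both sides

// The first "cat" is part of "concat", so the match starts at position 7.
console.log("concat cat".search(wholeWordCat)); // 7
```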

 

9.  Putting an asterisk (*) after an element allows it to be repeated any number of times, including zero. A plus (+) does the same but requires the pattern to occur at least one time. A question mark (?) makes an element “optional”—it can occur zero or one time.

 

10.  A number between braces ({4}) gives the exact number of times that element must occur. Two numbers with a comma between them ({3,10}) indicate that the pattern must occur at least as often as the first number and at most as often as the second one. Analogously, {2,} means two or more occurrences. JavaScript has no shorthand for an upper bound alone: {,4} is not treated as a quantifier, so “four or fewer” must be written as {0,4}.
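
The repetition operators from notes 9 and 10 can be checked with anchored patterns, so that the whole string has to fit. This is a sketch with made-up sample strings. Because JavaScript does not treat a brace pair with an omitted lower bound as a quantifier, the upper-bound case is written {0,4}:

```javascript
// * allows zero or more, + requires at least one, ? allows zero or one.
console.log(/^ab*c$/.test("ac")); // true
console.log(/^ab+c$/.test("ac")); // false
console.log(/^ab+c$/.test("abbbc")); // true
console.log(/^colou?r$/.test("color")); // true
console.log(/^colou?r$/.test("colouur")); // false

// Braces give exact counts or ranges.
console.log(/^\d{4}$/.test("2022")); // true
console.log(/^\d{3,10}$/.test("12")); // false
console.log(/^\d{2,}$/.test("123456789012")); // true
console.log(/^\d{0,4}$/.test("123")); // true
console.log(/^\d{0,4}$/.test("12345")); // false

// {,4} is read as literal text, not as "up to four".
console.log(/^\d{,4}$/.test("123")); // false
```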

 

11.  It is possible to group parts of a regular expression together with parentheses and then do something with the whole group.

 

12.  After the closing slash, “options” may be added to a regular expression. The option i means that the expression is case-insensitive.
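
A brief sketch of grouping and the i option (the patterns and strings here are illustrative only). The + after the group applies to the whole parenthesized part:

```javascript
// The group (ab) is repeated as a unit.
const repeatedPair = /^(ab)+$/;
console.log(repeatedPair.test("ababab")); // true
console.log(repeatedPair.test("aba")); // false

// The i option makes the pattern ignore case.
console.log(/hello/.test("HeLLo")); // false
console.log(/hello/i.test("HeLLo")); // true

// Both together: one or more "hoo" groups after "boo", in any case.
const cartoonCrying = /boo+(hoo+)+/i;
console.log(cartoonCrying.test("Boohoooohoohooo")); // true
```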

 

13.  Strings have a method named match, which takes a regular expression as an argument. It returns null if the match failed and returns an array of matched strings if it succeeded. When the g option is not used, the first element in the returned array is the part of the string that matched the whole pattern, and the parts matched by parenthesized groups in the pattern follow it in the array.
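
The following sketch shows what match returns for a failed and for a successful match. The date format and variable names are assumptions made for this example:

```javascript
// No match: the result is null.
const noDigits = "no digits here".match(/\d+/);
console.log(noDigits); // null

// Three groups: day, month, and year.
const dateParts = "born 15/11/2003".match(/(\d\d?)\/(\d\d?)\/(\d{4})/);
console.log(dateParts[0]); // "15/11/2003" (the whole match)
console.log(dateParts[1]); // "15"
console.log(dateParts[2]); // "11"
console.log(dateParts[3]); // "2003"
```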

 

14.  The replace method of string values can be given a regular expression as its first argument. Option g stands for “global” and means that every part of the string that matches the pattern should be replaced.
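
A short sketch of the difference the g option makes (the sample word is chosen only for illustration):

```javascript
// Without g, only the first match is replaced.
const firstOnly = "Borobudur".replace(/[ou]/, "a");
console.log(firstOnly); // "Barobudur"

// With g, every match is replaced.
const everyMatch = "Borobudur".replace(/[ou]/g, "a");
console.log(everyMatch); // "Barabadar"
```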

 

15.  Sometimes we need to keep parts of the strings we replace. The $1 and $2 in the replacement string refer to the parenthesized parts in the pattern. $1 is replaced by the text that matched against the first pair of parentheses, $2 by the second, and so on. JavaScript accepts one- or two-digit group numbers here, so references beyond $9 are possible when the pattern has that many groups.
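
As a sketch of group references in the replacement string, this swaps "Last, First" names into "First Last" order. The name list is sample data invented for the example:

```javascript
const lastFirst = "Hopper, Grace\nMcCarthy, John\nRitchie, Dennis";

// $1 is the text before the comma, $2 the text after it.
const firstLast = lastFirst.replace(/([\w ]+), ([\w ]+)/g, "$2 $1");
console.log(firstLast);
// Grace Hopper
// John McCarthy
// Dennis Ritchie
```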

 

16.  When the second argument given to the replace method is a function value instead of a string, this function is called every time a match is found, and the matched text is replaced by whatever the function returns. The arguments given to the function are the matched elements, similar to the values found in the arrays returned by match: The first argument is the whole match, and after that there is an argument for every parenthesized part of the pattern.
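
A function as the second argument might be used as follows. This is a minimal sketch; the strings and the parameter names whole, amount, and unit are chosen for illustration:

```javascript
// The function receives the whole match and returns its replacement.
const agencies = "the cia and fbi".replace(/\b(fbi|cia)\b/g,
  (whole) => whole.toUpperCase());
console.log(agencies); // "the CIA and FBI"

// After the whole match comes one argument per parenthesized group.
const doubledStock = "3 apples and 12 pears".replace(/(\d+) (\w+)/g,
  (whole, amount, unit) => `${Number(amount) * 2} ${unit}`);
console.log(doubledStock); // "6 apples and 24 pears"
```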

 

17.  The first argument to the RegExp constructor is a string containing the pattern, and the second argument (which may be omitted) is a string of options, such as "i" for case-insensitivity or "g" for global matching.

 

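18.  When the pattern is written as a string for the RegExp constructor, any backslashes that must end up in the regular expression itself have to be escaped (doubled), because a single backslash is consumed by the string literal before the constructor sees it.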
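
Notes 17 and 18 might look like this in practice. The sketch builds a pattern from a value only known at run time; userName and mention are hypothetical names, and the value is assumed to contain no special regular expression characters:

```javascript
const userName = "harry";

// "\\b" in the string becomes \b in the pattern; "gi" adds both options.
const mention = new RegExp("\\b(" + userName + ")\\b", "gi");
const marked = "Harry is a suspicious character.".replace(mention, "_$1_");
console.log(marked); // "_Harry_ is a suspicious character."

// A doubled backslash reaches the pattern as \d.
console.log(new RegExp("\\d+").test("abc 42")); // true

// A single backslash is dropped by the string literal: "\d+" is just "d+".
console.log(new RegExp("\d+").test("abc 42")); // false
```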

 

19.  The split method of strings also allows a regular expression as its argument.
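
For instance, a regular expression lets split tolerate irregular spacing around a separator. The input string is made up for this sketch:

```javascript
// Split on a comma with any amount of whitespace on either side.
const words = "one, two,three ,  four".split(/\s*,\s*/);
console.log(words); // ["one", "two", "three", "four"]

// Split on runs of digits.
const letters = "a1b22c".split(/\d+/);
console.log(letters); // ["a", "b", "c"]
```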