Homogenized images (#69)

* RegExp image for French plus

* Some typo corrected
* Typographic rules (such as a space before `:`)
* égual - égal

* Images folder added, img src updated

* Paragraphs (§) wrapped at 80 columns
Nicolas Borboën 2017-08-19 11:15:19 +02:00 committed by Zeeshan Ahmed
parent 4b641cd11d
commit 9968a235b6
9 changed files with 573 additions and 112 deletions


@ -21,7 +21,7 @@ Imagina que estas escribiendo una aplicación y quieres agregar reglas para cuan
<br/><br/>
<p align="center">
<img src="http://imgur.com/EtlKH14.png" alt="Regular expression"> <img src="./img/regexp-es.png" alt="Expresión regular">
</p>
De la expresión regular anterior, se puede aceptar las cadenas 'john_doe', 'jo-hn_doe' y 'john12_as'. La expresión no coincide con el nombre de usuario 'Jo', porque es una cadena de caracteres que contiene letras mayúsculas y es demasiado corta.


@ -24,7 +24,7 @@ le pseudonyme à contenir des lettres, des nombres, des underscores et des trait
de caractères dans le pseudonyme pour qu'il n'ait pas l'air moche. Nous utilisons l'expression régulière suivante pour valider un pseudonyme:
<br/><br/>
<p align="center">
<img src="http://i.imgur.com/OGM7KV8.png" alt="Expression régulière"> <img src="./img/regexp-fr.png" alt="Expressions régulières">
</p>
L'expression régulière ci-dessus peut accepter les strings `john_doe`, `jo-hn_doe` et `john12_as`. Ça ne fonctionne pas avec `Jo` car


@ -27,7 +27,7 @@
<br/><br/>
<p align="center">
<img src="https://i.imgur.com/ekFpQUg.png" alt="Regular expression"> <img src="./img/regexp-en.png" alt="Regular expression">
</p>
この正規表現によって `john_doe, jo-hn_doe, john12_as` などは許容されることになります。

282
README.md

@ -15,20 +15,26 @@
> Regular expression is a group of characters or symbols which is used to find a specific pattern from a text.

A regular expression is a pattern that is matched against a subject string from
left to right. The term "regular expression" is a mouthful, so you will usually
find it abbreviated as "regex" or "regexp". Regular expressions are used for
replacing text within a string, validating forms, extracting a substring from a
string based on a pattern match, and much more.

Imagine you are writing an application and you want to set the rules for when a
user chooses their username. We want to allow the username to contain letters,
numbers, underscores and hyphens. We also want to limit the number of characters
in the username so it does not look ugly. We use the following regular
expression to validate a username:
<br/><br/>
<p align="center">
<img src="https://i.imgur.com/ekFpQUg.png" alt="Regular expression"> <img src="./img/regexp-en.png" alt="Regular expression">
</p>

The regular expression above accepts the strings `john_doe`, `jo-hn_doe` and
`john12_as`. It does not match `Jo` because that string contains an uppercase
letter and is also too short.
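
The username pattern itself appears only as an image, so the following is a minimal sketch under an assumption: it uses a pattern of the form `^[a-z0-9_-]{3,15}$`, where the length bounds 3 and 15 and the names `USERNAME_RE` and `is_valid_username` are illustrative choices rather than values taken from the text.

```python
import re

# Assumed pattern (the original is shown only as an image): lowercase letters,
# digits, underscores and hyphens, 3 to 15 characters long.
USERNAME_RE = re.compile(r"^[a-z0-9_-]{3,15}$")


def is_valid_username(name: str) -> bool:
    """Return True when the whole string fits the assumed pattern."""
    return USERNAME_RE.match(name) is not None


# Check the four usernames mentioned in the text.
results = {name: is_valid_username(name)
           for name in ["john_doe", "jo-hn_doe", "john12_as", "Jo"]}
print(results)
```

Under this assumed pattern, `Jo` is rejected for both reasons given in the text: it contains an uppercase letter and it is shorter than the minimum length.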
## Table of Contents
@ -61,8 +67,9 @@ contains uppercase letter and also it is too short.
## 1. Basic Matchers

A regular expression is just a pattern of characters that we use to perform a
search in a text. For example, the regular expression `the` means: the letter
`t`, followed by the letter `h`, followed by the letter `e`.
<pre>
"the" => The fat cat sat on <a href="#learn-regex"><strong>the</strong></a> mat.
</pre>
@ -70,9 +77,11 @@ A regular expression is just a pattern of characters that we use to perform sear
[Test the regular expression](https://regex101.com/r/dmRygT/1)

The regular expression `123` matches the string `123`. The regular expression is
matched against an input string by comparing each character in the regular
expression to each character in the input string, one after another. Regular
expressions are normally case-sensitive, so the regular expression `The` would
not match the string `the`.
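
As a minimal sketch of literal, case-sensitive matching, the snippet below uses Python's `re` module on the example sentence. The choice of Python and the variable names are assumptions made for illustration.

```python
import re

text = "The fat cat sat on the mat."

# The pattern "the" matches only the lowercase occurrence.
lower_hits = [m.start() for m in re.finditer(r"the", text)]
# The pattern "The" matches only the capitalized occurrence.
upper_hits = [m.start() for m in re.finditer(r"The", text)]

print(lower_hits, upper_hits)
```

Each list holds the start index of a match, so the two patterns report different positions in the same sentence.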
<pre>
"The" => <a href="#learn-regex"><strong>The</strong></a> fat cat sat on the mat.
</pre>
@ -82,9 +91,10 @@ case-sensitive so the regular expression `The` would not match the string `the`.
## 2. Meta Characters

Meta characters are the building blocks of regular expressions. Meta
characters do not stand for themselves but are instead interpreted in some
special way. Some meta characters take on a special meaning when written inside
square brackets. The meta characters are as follows:

|Meta character|Description|
|:----:|----|
@ -103,9 +113,10 @@ The meta characters are as follows:
## 2.1 Full stop

The full stop `.` is the simplest example of a meta character. The meta
character `.` matches any single character. By default it does not match a
newline character; some engines also exclude the carriage return. For example,
the regular expression `.ar` means: any character, followed by the letter `a`,
followed by the letter `r`.
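
A short sketch of the `.` meta character, assuming Python's `re` module; the variable names are illustrative only.

```python
import re

sentence = "The car parked in the garage."

# "." stands for any single character, so ".ar" finds every
# three-character run that ends in "ar".
dot_matches = re.findall(r".ar", sentence)
print(dot_matches)

# By default "." does not match a newline in Python's re module,
# so there is no character available to stand in front of "ar" here.
newline_match = re.search(r".ar", "\nar")
print(newline_match)
```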
<pre>
".ar" => The <a href="#learn-regex"><strong>car</strong></a> <a href="#learn-regex"><strong>par</strong></a>ked in the <a href="#learn-regex"><strong>gar</strong></a>age.
</pre>
@ -115,9 +126,11 @@ letter `r`.
## 2.2 Character set

Character sets are also called character classes. Square brackets are used to
specify character sets. Use a hyphen inside a character set to specify a range
of characters. The order of the characters and ranges listed inside the square
brackets doesn't matter. For example, the regular expression `[Tt]he` means: an
uppercase `T` or lowercase `t`, followed by the letter `h`, followed by the
letter `e`.
<pre>
"[Tt]he" => <a href="#learn-regex"><strong>The</strong></a> car parked in <a href="#learn-regex"><strong>the</strong></a> garage.
</pre>
@ -125,7 +138,9 @@ expression `[Tt]he` means: an uppercase `T` or lowercase `t`, followed by the le
[Test the regular expression](https://regex101.com/r/2ITLQ4/1)

A period inside a character set, however, means a literal period. The regular
expression `ar[.]` means: a lowercase character `a`, followed by the letter `r`,
followed by a period `.` character.
<pre>
"ar[.]" => A garage is a good place to park a c<a href="#learn-regex"><strong>ar.</strong></a>
</pre>
@ -135,9 +150,10 @@ A period inside a character set, however, means a literal period. The regular ex
### 2.2.1 Negated character set

In general, the caret symbol represents the start of the string, but when it is
typed immediately after the opening square bracket it negates the character
set. For example, the regular expression `[^c]ar` means: any character except
`c`, followed by the character `a`, followed by the letter `r`.
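
The three character-set behaviours described in this section (a plain set, a literal period inside a set, and a negated set) can be sketched as follows. Python's `re` module and the variable names are assumptions for illustration.

```python
import re

sentence = "The car parked in the garage."

# A set matches exactly one character out of those listed.
either_case = re.findall(r"[Tt]he", sentence)

# A leading "^" inside the brackets negates the set: any character but "c".
not_c = re.findall(r"[^c]ar", sentence)

# Inside a set, "." is a literal period rather than "any character".
literal_dot = re.findall(r"ar[.]", "A garage is a good place to park a car.")

print(either_case, not_c, literal_dot)
```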
<pre>
"[^c]ar" => The car <a href="#learn-regex"><strong>par</strong></a>ked in the <a href="#learn-regex"><strong>gar</strong></a>age.
</pre>
@ -147,14 +163,17 @@ the letter `r`.
## 2.3 Repetitions

The meta characters `+`, `*` and `?` are used to specify how many times a
subpattern can occur. These meta characters act differently in different
situations.

### 2.3.1 The Star

The symbol `*` matches zero or more repetitions of the preceding matcher. The
regular expression `a*` means: zero or more repetitions of the preceding
lowercase character `a`. If it appears after a character set or class, it
repeats the whole character set. For example, the regular expression `[a-z]*`
means: any number of lowercase letters in a row.
<pre>
"[a-z]*" => T<a href="#learn-regex"><strong>he</strong></a> <a href="#learn-regex"><strong>car</strong></a> <a href="#learn-regex"><strong>parked</strong></a> <a href="#learn-regex"><strong>in</strong></a> <a href="#learn-regex"><strong>the</strong></a> <a href="#learn-regex"><strong>garage</strong></a> #21.
</pre>
@ -162,10 +181,12 @@ character set. For example, the regular expression `[a-z]*` means: any number of
[Test the regular expression](https://regex101.com/r/7m8me5/1)

The `*` symbol can be used with the meta character `.` to match any string of
characters: `.*`. The `*` symbol can also be used with the whitespace character
`\s` to match a string of whitespace characters. For example, the expression
`\s*cat\s*` means: zero or more whitespace characters, followed by lowercase
character `c`, followed by lowercase character `a`, followed by lowercase
character `t`, followed by zero or more whitespace characters.
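
A small sketch of the star quantifier, assuming Python's `re` module. Because `*` also allows zero repetitions, `[a-z]*` can match the empty string at many positions, so the example filters those empty matches out; the variable names are illustrative.

```python
import re

# "[a-z]*" may match the empty string, so keep only the non-empty matches.
words = [m for m in re.findall(r"[a-z]*", "The car parked in the garage #21.") if m]
print(words)

# "\s*" absorbs any surrounding whitespace, but also accepts none at all,
# which is why the "cat" inside "concatenation" is matched too.
cat_matches = re.findall(r"\s*cat\s*", "The fat cat sat on the concatenation.")
print(cat_matches)
```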
<pre>
"\s*cat\s*" => The fat<a href="#learn-regex"><strong> cat </strong></a>sat on the <a href="#learn-regex">con<strong>cat</strong>enation</a>.
</pre>
@ -175,8 +196,10 @@ zero or more spaces.
### 2.3.2 The Plus

The symbol `+` matches one or more repetitions of the preceding character. For
example, the regular expression `c.+t` means: lowercase letter `c`, followed by
at least one character, followed by the lowercase character `t`. Because `+` is
greedy, the `t` that ends the match is the last `t` in the sentence.
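
The remark about the last `t` reflects greedy matching, which can be sketched with Python's `re` module. The lazy form `+?` is included only for contrast and is not discussed in the text above; the variable names are illustrative.

```python
import re

text = "The fat cat sat on the mat."

# Greedy: ".+" grabs as much as possible, so the match runs to the last "t".
greedy = re.search(r"c.+t", text).group(0)

# Lazy (shown for contrast): ".+?" stops at the first "t" it can reach.
lazy = re.search(r"c.+?t", text).group(0)

print(repr(greedy), repr(lazy))
```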
<pre>
"c.+t" => The fat <a href="#learn-regex"><strong>cat sat on the mat</strong></a>.
</pre>
@ -186,9 +209,11 @@ letter `c`, followed by at least one character, followed by the lowercase charac
### 2.3.3 The Question Mark

In regular expressions, the meta character `?` makes the preceding character
optional. This symbol matches zero or one instance of the preceding character.
For example, the regular expression `[T]?he` means: an optional uppercase
letter `T`, followed by the lowercase character `h`, followed by the lowercase
character `e`.
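
A brief sketch of the optional quantifier, assuming Python's `re` module; it compares the pattern with and without `?` on the same sentence.

```python
import re

sentence = "The car is parked in the garage."

# Without "?", the uppercase "T" is required.
required_t = re.findall(r"[T]he", sentence)

# With "?", the "T" may be absent, so the "he" inside "the" matches as well.
optional_t = re.findall(r"[T]?he", sentence)

print(required_t, optional_t)
```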
<pre>
"[T]he" => <a href="#learn-regex"><strong>The</strong></a> car is parked in the garage.
</pre>
@ -204,9 +229,10 @@ character `h`, followed by the lowercase character `e`.
## 2.4 Braces

In regular expressions, braces, which are also called quantifiers, are used to
specify the number of times that a character or a group of characters can be
repeated. For example, the regular expression `[0-9]{2,3}` means: match at least
2 digits but not more than 3 (characters in the range of 0 to 9).
<pre>
"[0-9]{2,3}" => The number was 9.<a href="#learn-regex"><strong>999</strong></a>7 but we rounded it off to <a href="#learn-regex"><strong>10</strong></a>.0.
</pre>
@ -214,8 +240,9 @@ characters in the range of 0 to 9).
[Test the regular expression](https://regex101.com/r/juM86s/1)

We can leave out the second number. For example, the regular expression
`[0-9]{2,}` means: match 2 or more digits. If we also remove the comma, the
regular expression `[0-9]{3}` means: match exactly 3 digits.
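
The three brace forms can be compared side by side. This is a minimal sketch assuming Python's `re` module, with illustrative variable names.

```python
import re

text = "The number was 9.9997 but we rounded it off to 10.0."

two_to_three = re.findall(r"[0-9]{2,3}", text)  # between 2 and 3 digits
two_or_more = re.findall(r"[0-9]{2,}", text)    # at least 2 digits
exactly_three = re.findall(r"[0-9]{3}", text)   # exactly 3 digits

print(two_to_three, two_or_more, exactly_three)
```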
<pre>
"[0-9]{2,}" => The number was 9.<a href="#learn-regex"><strong>9997</strong></a> but we rounded it off to <a href="#learn-regex"><strong>10</strong></a>.0.
</pre>
@ -231,10 +258,13 @@ the comma the regular expression `[0-9]{3}` means: Match exactly 3 digits.
## 2.5 Character Group

A character group is a group of sub-patterns written inside parentheses
`(...)`. As discussed before, a quantifier placed after a character repeats
that character. A quantifier placed after a character group repeats the whole
group. For example, the regular expression `(ab)*` matches zero or more
repetitions of the character sequence "ab". We can also use the alternation
meta character `|` inside a character group. For example, the regular
expression `(c|g|p)ar` means: lowercase character `c`, `g` or `p`, followed by
character `a`, followed by character `r`.
<pre> <pre>
@ -245,11 +275,15 @@ We can also use the alternation `|` meta character inside character group. For e
## 2.6 Alternation

In regular expressions, the vertical bar `|` is used to define alternation.
Alternation is like a condition between multiple expressions. You may be
thinking that character sets and alternation work the same way. The big
difference is that a character set works at the character level, while
alternation works at the expression level. For example, the regular expression
`(T|t)he|car` means: uppercase character `T` or lowercase `t`, followed by
lowercase character `h`, followed by lowercase character `e`, or lowercase
character `c`, followed by lowercase character `a`, followed by lowercase
character `r`.
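
Groups and alternation can be sketched together, assuming Python's `re` module. The example uses `re.finditer` with `group(0)` because `re.findall` returns only the captured group text when a pattern contains a group; the variable names are illustrative.

```python
import re

sentence = "The car is parked in the garage."

# Alternation inside a group: one of "c", "g" or "p", then "ar".
group_matches = [m.group(0) for m in re.finditer(r"(c|g|p)ar", sentence)]

# Alternation between whole expressions: "(T|t)he" or "car".
alt_matches = [m.group(0) for m in re.finditer(r"(T|t)he|car", sentence)]

# A quantifier after a group repeats the whole group.
whole_repeats = re.fullmatch(r"(ab)*", "ababab") is not None
broken_repeat = re.fullmatch(r"(ab)*", "aba") is not None

print(group_matches, alt_matches, whole_repeats, broken_repeat)
```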
<pre>
"(T|t)he|car" => <a href="#learn-regex"><strong>The</strong></a> <a href="#learn-regex"><strong>car</strong></a> is parked in <a href="#learn-regex"><strong>the</strong></a> garage.
</pre>
@ -259,12 +293,16 @@ or lowercase character `c`, followed by lowercase character `a`, followed by low
## 2.7 Escaping special character

Backslash `\` is used in regular expressions to escape the next character. This
allows us to specify a symbol as a matching character, including the reserved
characters `{ } [ ] / \ + * . $ ^ | ?`. To use a special character as a matching
character, prepend `\` to it.

For example, the regular expression `.` is used to match any character except
newline. To match a literal `.` in an input string, escape it: the regular
expression `(f|c|m)at\.?` means: lowercase letter `f`, `c` or `m`, followed by
lowercase character `a`, followed by lowercase letter `t`, followed by an
optional `.` character.
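
A minimal sketch of escaping, assuming Python's `re` module. `re.escape` is a standard-library helper that escapes special characters in a literal string; the sample string `"3.14"` and the variable names are illustrative assumptions.

```python
import re

sentence = "The fat cat sat on the mat."

# "\." is a literal period and "?" makes it optional.
escaped_matches = [m.group(0) for m in re.finditer(r"(f|c|m)at\.?", sentence)]
print(escaped_matches)

# re.escape builds a pattern that matches the given text literally.
literal = re.escape("3.14")
matches_itself = re.fullmatch(literal, "3.14") is not None
matches_other = re.fullmatch(literal, "3514") is not None
print(matches_itself, matches_other)
```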
<pre>
"(f|c|m)at\.?" => The <a href="#learn-regex"><strong>fat</strong></a> <a href="#learn-regex"><strong>cat</strong></a> sat on the <a href="#learn-regex"><strong>mat.</strong></a>
</pre>
@ -274,18 +312,22 @@ expression `(f|c|m)at\.?` means: lowercase letter `f`, `c` or `m`, followed by l
## 2.8 Anchors

In regular expressions, we use anchors to check whether the matching symbol is
the first or the last symbol of the input string. Anchors are of two types: the
caret `^` checks whether the matching character is the first character of the
input, and the dollar `$` checks whether the matching character is the last
character of the input string.

### 2.8.1 Caret

The caret `^` symbol is used to check whether the matching character is the
first character of the input string. If we apply the regular expression `^a`
(is `a` the starting symbol?) to the input string `abc`, it matches `a`. But if
we apply the regular expression `^b` to the same input string, it does not match
anything, because in the input string `abc` "b" is not the starting symbol.
Let's take a look at another regular expression, `^(T|t)he`, which means:
uppercase character `T` or lowercase character `t` is the start symbol of the
input string, followed by lowercase character `h`, followed by lowercase
character `e`.
<pre>
"(T|t)he" => <a href="#learn-regex"><strong>The</strong></a> car is parked in <a href="#learn-regex"><strong>the</strong></a> garage.
</pre>
@ -301,9 +343,10 @@ followed by lowercase character `h`, followed by lowercase character `e`.
### 2.8.2 Dollar

The dollar `$` symbol is used to check whether the matching character is the
last character of the input string. For example, the regular expression
`(at\.)$` means: a lowercase character `a`, followed by lowercase character
`t`, followed by a `.` character, and the match must be at the end of the
string.
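
Both anchors can be sketched with Python's `re` module (an assumed choice of language; the variable names are illustrative). Each anchored pattern is shown next to its unanchored form to make the effect visible.

```python
import re

# "^" pins the match to the start of the input.
starts_with_a = re.search(r"^a", "abc") is not None
starts_with_b = re.search(r"^b", "abc") is not None

sentence = "The car is parked in the garage."
unanchored_the = [m.group(0) for m in re.finditer(r"(T|t)he", sentence)]
anchored_the = [m.group(0) for m in re.finditer(r"^(T|t)he", sentence)]

# "$" pins the match to the end of the input.
ending = "The fat cat. sat. on the mat."
unanchored_at = re.findall(r"(at\.)", ending)
anchored_at = re.findall(r"(at\.)$", ending)

print(starts_with_a, starts_with_b)
print(unanchored_the, anchored_the)
print(unanchored_at, anchored_at)
```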
<pre>
"(at\.)" => The fat c<a href="#learn-regex"><strong>at.</strong></a> s<a href="#learn-regex"><strong>at.</strong></a> on the m<a href="#learn-regex"><strong>at.</strong></a>
</pre>
@ -319,8 +362,9 @@ must be end of the string.
## 3. Shorthand Character Sets

Regular expressions provide shorthands for commonly used character sets. The
shorthand character sets are as follows:

|Shorthand|Description|
|:----:|----|
@ -334,11 +378,15 @@ regular expressions. The shorthand character sets are as follows:
## 4. Lookaround

Lookbehind and lookahead (together known as lookaround) are specific types of
***non-capturing groups*** (used to match the pattern but not included in the
matching list). Lookarounds are used when we have the condition that a pattern
is preceded or followed by another certain pattern. For example, we want to get
all numbers that are preceded by the `$` character from the input string
`$4.44 and $10.88`. We will use the regular expression `(?<=\$)[0-9\.]*`, which
means: get all the numbers, which may contain the `.` character, that are
preceded by the `$` character. The following lookarounds are used in regular
expressions:

|Symbol|Description|
|:----:|----|
@ -349,12 +397,16 @@ by `$` character. Following are the lookarounds that are used in regular express
### 4.1 Positive Lookahead

The positive lookahead asserts that the first part of the expression must be
followed by the lookahead expression. The returned match only contains the text
that is matched by the first part of the expression. To define a positive
lookahead, parentheses are used. Within those parentheses, a question mark with
an equal sign is used like this: `(?=...)`. The lookahead expression is written
after the equal sign inside the parentheses. For example, the regular expression
`[T|t]he(?=\sfat)` means: match lowercase letter `t` or uppercase letter `T`,
followed by letter `h`, followed by letter `e`. In parentheses we define a
positive lookahead, which tells the regular expression engine to match only a
`The` or `the` that is followed by the word `fat`.
<pre>
"[T|t]he(?=\sfat)" => <a href="#learn-regex"><strong>The</strong></a> fat cat sat on the mat.
</pre>
@ -364,10 +416,13 @@ followed by letter `h`, followed by letter `e`. In parentheses we define positiv
### 4.2 Negative Lookahead

Negative lookahead is used when we need to get all matches from the input string
that are not followed by a pattern. A negative lookahead is defined in the same
way as a positive lookahead; the only difference is that instead of the equal
`=` character we use the negation `!` character, i.e. `(?!...)`. Let's take a
look at the following regular expression `[T|t]he(?!\sfat)`, which means: get
all `The` or `the` words from the input string that are not followed by the word
`fat` preceded by a space character.
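
The two lookaheads can be compared on the same sentence. This is a minimal sketch assuming Python's `re` module, with illustrative variable names. It also shows a side effect of writing `[T|t]`: inside a character set the `|` is a literal character, so the set accepts `|` as well as `T` and `t`.

```python
import re

sentence = "The fat cat sat on the mat."

# Positive lookahead: keep only the match that is followed by " fat".
followed_by_fat = re.findall(r"[T|t]he(?=\sfat)", sentence)

# Negative lookahead: keep only the match that is NOT followed by " fat".
not_followed_by_fat = re.findall(r"[T|t]he(?!\sfat)", sentence)

# Inside square brackets "|" is literal, so "|he" is accepted too.
pipe_literal = re.findall(r"[T|t]he", "|he")

print(followed_by_fat, not_followed_by_fat, pipe_literal)
```

In both cases the lookahead text itself is not part of the returned match.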
<pre>
"[T|t]he(?!\sfat)" => The fat cat sat on <a href="#learn-regex"><strong>the</strong></a> mat.
</pre>
@ -377,9 +432,10 @@ input string that are not followed by the word `fat` precedes by a space charact
### 4.3 Positive Lookbehind

Positive lookbehind is used to get all the matches that are preceded by a
specific pattern. Positive lookbehind is denoted by `(?<=...)`. For example, the
regular expression `(?<=[T|t]he\s)(fat|mat)` means: get all `fat` or `mat` words
from the input string that come after the word `The` or `the`.
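
For illustration, here is how the lookbehind behaves in Python's `re` module,
which accepts it because the lookbehind pattern has a fixed width. The variable
names are assumptions:

```python
import re

text = "The fat cat sat on the mat."

# Positive lookbehind: match "fat" or "mat" only when "The " or "the "
# comes immediately before it. With a single capturing group, findall
# returns the text captured by that group.
lookbehind_matches = re.findall(r"(?<=[T|t]he\s)(fat|mat)", text)
print(lookbehind_matches)  # ['fat', 'mat']
```
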
<pre>
"(?<=[T|t]he\s)(fat|mat)" => The <a href="#learn-regex"><strong>fat</strong></a> cat sat on the <a href="#learn-regex"><strong>mat</strong></a>.
@ -389,9 +445,10 @@ are after the word `The` or `the`.
### 4.4 Negative Lookbehind

Negative lookbehind is used to get all the matches that are not preceded by a
specific pattern. Negative lookbehind is denoted by `(?<!...)`. For example, the
regular expression `(?<!(T|t)he\s)(cat)` means: get all `cat` words from the
input string that do not come after the word `The` or `the`.
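
A short sketch of the negative lookbehind, again assuming Python's `re` module.
Because the pattern contains two capturing groups, the example iterates over
match objects instead of using `findall`; the names are hypothetical:

```python
import re

text = "The cat sat on cat."

# Negative lookbehind: match "cat" only when it is NOT preceded by
# "The " or "the ". Record where each match starts and what it matched.
pattern = re.compile(r"(?<!(T|t)he\s)(cat)")
negative_lookbehind_matches = [(m.start(), m.group()) for m in pattern.finditer(text)]
print(negative_lookbehind_matches)  # [(15, 'cat')]
```

Only the second `cat` (at index 15) is reported, since the first one directly
follows `The `.
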
<pre>
"(?&lt;!(T|t)he\s)(cat)" => The cat sat on <a href="#learn-regex"><strong>cat</strong></a>.
@ -401,8 +458,9 @@ are not after the word `The` or `the`.
## 5. Flags

Flags are also called modifiers because they modify the output of a regular
expression. These flags can be used in any order or combination, and are an
integral part of the RegExp.

|Flag|Description|
|:----:|----|
@ -412,10 +470,12 @@ combination, and are an integral part of the RegExp.
### 5.1 Case Insensitive

The `i` modifier is used to perform case-insensitive matching. For example, the
regular expression `/The/gi` means: uppercase letter `T`, followed by lowercase
character `h`, followed by character `e`. At the end of the regular expression,
the `i` flag tells the regular expression engine to ignore case. As you can
see, we also provided the `g` flag because we want to search for the pattern in
the whole input string.
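
The effect of the `i` flag can be sketched in Python, under the assumption that
`re.IGNORECASE` stands in for `i`. Python has no `g` flag; `re.findall` already
returns every match, so it plays that role here:

```python
import re

text = "The fat cat sat on the mat."

# Case-sensitive: only the capitalised "The" matches.
case_sensitive = re.findall(r"The", text)

# Case-insensitive: re.IGNORECASE corresponds to the `i` flag.
case_insensitive = re.findall(r"The", text, flags=re.IGNORECASE)

print(case_sensitive)    # ['The']
print(case_insensitive)  # ['The', 'the']
```
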
<pre>
"The" => <a href="#learn-regex"><strong>The</strong></a> fat cat sat on the mat.
@ -431,10 +491,11 @@ the whole input string.
### 5.2 Global search

The `g` modifier is used to perform a global match (find all matches rather than
stopping after the first match). For example, the regular expression `/.(at)/g`
means: any character except new line, followed by lowercase character `a`,
followed by lowercase character `t`. Because we provided the `g` flag at the end
of the regular expression, it now finds every match in the whole input string.
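
Python's `re` module has no `g` flag, so the difference is shown here by
choosing between two functions. This is a rough analogy rather than an exact
equivalent: `re.search` stops at the first match, while `re.finditer` walks
through all of them.

```python
import re

text = "The fat cat sat on the mat."

# Without "global": stop at the first match.
first_match = re.search(r".(at)", text).group()

# With "global": collect every match in the input string.
all_matches = [m.group() for m in re.finditer(r".(at)", text)]

print(first_match)  # fat
print(all_matches)  # ['fat', 'cat', 'sat', 'mat']
```
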
<pre>
"/.(at)/" => The <a href="#learn-regex"><strong>fat</strong></a> cat sat on the mat.
@ -450,10 +511,13 @@ string.
### 5.3 Multiline

The `m` modifier is used to perform a multi-line match. As we discussed earlier,
the anchors `(^, $)` are used to check whether a pattern is at the beginning or
the end of the input string. But if we want the anchors to work on each line, we
use the `m` flag. For example, the regular expression `/.at(.)?$/gm` means: any
character except new line, followed by lowercase character `a`, followed by
lowercase character `t`, optionally followed by any character except new line.
Because of the `m` flag, the regular expression engine now matches the pattern
at the end of each line in a string.
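
A sketch of the multi-line behaviour, assuming `re.MULTILINE` as the Python
counterpart of `m`. The three-line input string is an assumption made for this
example:

```python
import re

# Hypothetical three-line input, chosen for illustration.
text = "The fat\ncat sat\non the mat."

# Without the flag, `$` only anchors at the end of the whole string.
single_line = [m.group() for m in re.finditer(r".at(.)?$", text)]

# With re.MULTILINE, `$` also anchors at the end of every line.
multi_line = [m.group() for m in re.finditer(r".at(.)?$", text, flags=re.MULTILINE)]

print(single_line)  # ['mat.']
print(multi_line)   # ['fat', 'sat', 'mat.']
```
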
<pre>
"/.at(.)?$/" => The fat

New files added in this commit:

* img/img_original.png (binary, not shown, 6.5 KiB)
* img/regexp-en.png (binary, not shown, 31 KiB)
* img/regexp-es.png (binary, not shown, 33 KiB)
* img/regexp-fr.png (binary, not shown, 32 KiB)
* img/regexp.svg (397 changes, 35 KiB; file diff suppressed because one or more lines are too long)