diff --git a/README-es.md b/README-es.md
index 6988406..82ec6f6 100644
--- a/README-es.md
+++ b/README-es.md
@@ -21,7 +21,7 @@ Imagina que estas escribiendo una aplicación y quieres agregar reglas para cuan
-
+
-
+
-
+
-
+
 "the" => The fat cat sat on the mat.
@@ -70,9 +77,11 @@ A regular expression is just a pattern of characters that we use to perform sear
 [Test the regular expression](https://regex101.com/r/dmRygT/1)
-The regular expression `123` matches the string `123`. The regular expression is matched against an input string by comparing each
-character in the regular expression to each character in the input string, one after another. Regular expressions are normally
-case-sensitive so the regular expression `The` would not match the string `the`.
+The regular expression `123` matches the string `123`. The regular expression is
+matched against an input string by comparing each character in the regular
+expression to each character in the input string, one after another. Regular
+expressions are normally case-sensitive so the regular expression `The` would
+not match the string `the`.
 "The" => The fat cat sat on the mat.
@@ -82,9 +91,10 @@ case-sensitive so the regular expression `The` would not match the string `the`.
 ## 2. Meta Characters
-Meta characters are the building blocks of the regular expressions. Meta characters do not stand for themselves but instead are
-interpreted in some special way. Some meta characters have a special meaning and are written inside square brackets.
-The meta characters are as follows:
+Meta characters are the building blocks of the regular expressions. Meta
+characters do not stand for themselves but instead are interpreted in some
+special way. Some meta characters have a special meaning and are written inside
+square brackets. The meta characters are as follows:
 |Meta character|Description|
 |:----:|----|
@@ -103,9 +113,10 @@ The meta characters are as follows:
 ## 2.1 Full stop
-Full stop `.` is the simplest example of meta character. The meta character `.` matches any single character. It will not match return
-or newline characters. For example, the regular expression `.ar` means: any character, followed by the letter `a`, followed by the
-letter `r`.
+Full stop `.` is the simplest example of meta character. The meta character `.`
+matches any single character. It will not match return or newline characters.
+For example, the regular expression `.ar` means: any character, followed by the
+letter `a`, followed by the letter `r`.
 ".ar" => The car parked in the garage.
@@ -115,9 +126,11 @@ letter `r`.
 ## 2.2 Character set
-Character sets are also called character class. Square brackets are used to specify character sets. Use a hyphen inside a character set to
-specify the characters' range. The order of the character range inside square brackets doesn't matter. For example, the regular
-expression `[Tt]he` means: an uppercase `T` or lowercase `t`, followed by the letter `h`, followed by the letter `e`.
+Character sets are also called character class. Square brackets are used to
+specify character sets. Use a hyphen inside a character set to specify the
+characters' range. The order of the character range inside square brackets
+doesn't matter. For example, the regular expression `[Tt]he` means: an uppercase
+`T` or lowercase `t`, followed by the letter `h`, followed by the letter `e`.
 "[Tt]he" => The car parked in the garage.
@@ -125,7 +138,9 @@ expression `[Tt]he` means: an uppercase `T` or lowercase `t`, followed by the le
 [Test the regular expression](https://regex101.com/r/2ITLQ4/1)
-A period inside a character set, however, means a literal period. The regular expression `ar[.]` means: a lowercase character `a`, followed by letter `r`, followed by a period `.` character.
+A period inside a character set, however, means a literal period. The regular
+expression `ar[.]` means: a lowercase character `a`, followed by letter `r`,
+followed by a period `.` character.
 "ar[.]" => A garage is a good place to park a car.
@@ -135,9 +150,10 @@ A period inside a character set, however, means a literal period. The regular ex
 ### 2.2.1 Negated character set
-In general, the caret symbol represents the start of the string, but when it is typed after the opening square bracket it negates the
-character set. For example, the regular expression `[^c]ar` means: any character except `c`, followed by the character `a`, followed by
-the letter `r`.
+In general, the caret symbol represents the start of the string, but when it is
+typed after the opening square bracket it negates the character set. For
+example, the regular expression `[^c]ar` means: any character except `c`,
+followed by the character `a`, followed by the letter `r`.
 "[^c]ar" => The car parked in the garage.
@@ -147,14 +163,17 @@ the letter `r`.
 ## 2.3 Repetitions
-Following meta characters `+`, `*` or `?` are used to specify how many times a subpattern can occur. These meta characters act
-differently in different situations.
+Following meta characters `+`, `*` or `?` are used to specify how many times a
+subpattern can occur. These meta characters act differently in different
+situations.
 ### 2.3.1 The Star
-The symbol `*` matches zero or more repetitions of the preceding matcher. The regular expression `a*` means: zero or more repetitions
-of preceding lowercase character `a`. But if it appears after a character set or class then it finds the repetitions of the whole
-character set. For example, the regular expression `[a-z]*` means: any number of lowercase letters in a row.
+The symbol `*` matches zero or more repetitions of the preceding matcher. The
+regular expression `a*` means: zero or more repetitions of preceding lowercase
+character `a`. But if it appears after a character set or class then it finds
+the repetitions of the whole character set. For example, the regular expression
+`[a-z]*` means: any number of lowercase letters in a row.
 "[a-z]*" => The car parked in the garage #21.
@@ -162,10 +181,12 @@ character set. For example, the regular expression `[a-z]*` means: any number of
 [Test the regular expression](https://regex101.com/r/7m8me5/1)
-The `*` symbol can be used with the meta character `.` to match any string of characters `.*`. The `*` symbol can be used with the
-whitespace character `\s` to match a string of whitespace characters. For example, the expression `\s*cat\s*` means: zero or more
-spaces, followed by lowercase character `c`, followed by lowercase character `a`, followed by lowercase character `t`, followed by
-zero or more spaces.
+The `*` symbol can be used with the meta character `.` to match any string of
+characters `.*`. The `*` symbol can be used with the whitespace character `\s`
+to match a string of whitespace characters. For example, the expression
+`\s*cat\s*` means: zero or more spaces, followed by lowercase character `c`,
+followed by lowercase character `a`, followed by lowercase character `t`,
+followed by zero or more spaces.
 "\s*cat\s*" => The fat cat sat on the concatenation.
@@ -175,8 +196,10 @@ zero or more spaces.
 ### 2.3.2 The Plus
-The symbol `+` matches one or more repetitions of the preceding character. For example, the regular expression `c.+t` means: lowercase
-letter `c`, followed by at least one character, followed by the lowercase character `t`. It needs to be clarified that `t` is the last `t` in the sentence.
+The symbol `+` matches one or more repetitions of the preceding character. For
+example, the regular expression `c.+t` means: lowercase letter `c`, followed by
+at least one character, followed by the lowercase character `t`. It needs to be
+clarified that `t` is the last `t` in the sentence.
 "c.+t" => The fat cat sat on the mat.
@@ -186,9 +209,11 @@ letter `c`, followed by at least one character, followed by the lowercase charac
 ### 2.3.3 The Question Mark
-In regular expression the meta character `?` makes the preceding character optional. This symbol matches zero or one instance of
-the preceding character. For example, the regular expression `[T]?he` means: Optional the uppercase letter `T`, followed by the lowercase
-character `h`, followed by the lowercase character `e`.
+In regular expression the meta character `?` makes the preceding character
+optional. This symbol matches zero or one instance of the preceding character.
+For example, the regular expression `[T]?he` means: Optional the uppercase
+letter `T`, followed by the lowercase character `h`, followed by the lowercase
+character `e`.
 "[T]he" => The car is parked in the garage.
@@ -204,9 +229,10 @@ character `h`, followed by the lowercase character `e`.
 ## 2.4 Braces
-In regular expression braces that are also called quantifiers are used to specify the number of times that a
-character or a group of characters can be repeated. For example, the regular expression `[0-9]{2,3}` means: Match at least 2 digits but not more than 3 (
-characters in the range of 0 to 9).
+In regular expression braces that are also called quantifiers are used to
+specify the number of times that a character or a group of characters can be
+repeated. For example, the regular expression `[0-9]{2,3}` means: Match at least
+2 digits but not more than 3 ( characters in the range of 0 to 9).
 "[0-9]{2,3}" => The number was 9.9997 but we rounded it off to 10.0.
@@ -214,8 +240,9 @@ characters in the range of 0 to 9).
 [Test the regular expression](https://regex101.com/r/juM86s/1)
-We can leave out the second number. For example, the regular expression `[0-9]{2,}` means: Match 2 or more digits. If we also remove
-the comma the regular expression `[0-9]{3}` means: Match exactly 3 digits.
+We can leave out the second number. For example, the regular expression
+`[0-9]{2,}` means: Match 2 or more digits. If we also remove the comma the
+regular expression `[0-9]{3}` means: Match exactly 3 digits.
 "[0-9]{2,}" => The number was 9.9997 but we rounded it off to 10.0.
@@ -231,10 +258,13 @@ the comma the regular expression `[0-9]{3}` means: Match exactly 3 digits.
 ## 2.5 Character Group
-Character group is a group of sub-patterns that is written inside Parentheses `(...)`. As we discussed before that in regular expression
-if we put a quantifier after a character then it will repeat the preceding character. But if we put quantifier after a character group then
-it repeats the whole character group. For example, the regular expression `(ab)*` matches zero or more repetitions of the character "ab".
-We can also use the alternation `|` meta character inside character group. For example, the regular expression `(c|g|p)ar` means: lowercase character `c`,
+Character group is a group of sub-patterns that is written inside Parentheses `(...)`.
+As we discussed before that in regular expression if we put a quantifier after a
+character then it will repeat the preceding character. But if we put quantifier
+after a character group then it repeats the whole character group. For example,
+the regular expression `(ab)*` matches zero or more repetitions of the character
+"ab". We can also use the alternation `|` meta character inside character group.
+For example, the regular expression `(c|g|p)ar` means: lowercase character `c`,
 `g` or `p`, followed by character `a`, followed by character `r`.
@@ -245,11 +275,15 @@ We can also use the alternation `|` meta character inside character group. For e
 ## 2.6 Alternation
-In regular expression Vertical bar `|` is used to define alternation. Alternation is like a condition between multiple expressions. Now,
-you may be thinking that character set and alternation works the same way. But the big difference between character set and alternation
-is that character set works on character level but alternation works on expression level. For example, the regular expression
-`(T|t)he|car` means: uppercase character `T` or lowercase `t`, followed by lowercase character `h`, followed by lowercase character `e`
-or lowercase character `c`, followed by lowercase character `a`, followed by lowercase character `r`.
+In regular expression Vertical bar `|` is used to define alternation.
+Alternation is like a condition between multiple expressions. Now, you may be
+thinking that character set and alternation works the same way. But the big
+difference between character set and alternation is that character set works on
+character level but alternation works on expression level. For example, the
+regular expression `(T|t)he|car` means: uppercase character `T` or lowercase
+`t`, followed by lowercase character `h`, followed by lowercase character `e` or
+lowercase character `c`, followed by lowercase character `a`, followed by
+lowercase character `r`.
 "(T|t)he|car" => The car is parked in the garage.
@@ -259,12 +293,16 @@ or lowercase character `c`, followed by lowercase character `a`, followed by low
 ## 2.7 Escaping special character
-Backslash `\` is used in regular expression to escape the next character. This allows us to specify a symbol as a matching character
-including reserved characters `{ } [ ] / \ + * . $ ^ | ?`. To use a special character as a matching character prepend `\` before it.
+Backslash `\` is used in regular expression to escape the next character. This
+allows us to specify a symbol as a matching character including reserved
+characters `{ } [ ] / \ + * . $ ^ | ?`. To use a special character as a matching
+character prepend `\` before it.
 
-For example, the regular expression `.` is used to match any character except newline. Now to match `.` in an input string the regular
-expression `(f|c|m)at\.?` means: lowercase letter `f`, `c` or `m`, followed by lowercase character `a`, followed by lowercase letter
-`t`, followed by optional `.` character.
+For example, the regular expression `.` is used to match any character except
+newline. Now to match `.` in an input string the regular expression
+`(f|c|m)at\.?` means: lowercase letter `f`, `c` or `m`, followed by lowercase
+character `a`, followed by lowercase letter `t`, followed by optional `.`
+character.
 "(f|c|m)at\.?" => The fat cat sat on the mat.
@@ -274,18 +312,22 @@ expression `(f|c|m)at\.?` means: lowercase letter `f`, `c` or `m`, followed by l
 ## 2.8 Anchors
-In regular expressions, we use anchors to check if the matching symbol is the starting symbol or ending symbol of the
-input string. Anchors are of two types: First type is Caret `^` that check if the matching character is the start
-character of the input and the second type is Dollar `$` that checks if matching character is the last character of the
-input string.
+In regular expressions, we use anchors to check if the matching symbol is the
+starting symbol or ending symbol of the input string. Anchors are of two types:
+First type is Caret `^` that check if the matching character is the start
+character of the input and the second type is Dollar `$` that checks if matching
+character is the last character of the input string.
 ### 2.8.1 Caret
-Caret `^` symbol is used to check if matching character is the first character of the input string. If we apply the following regular
-expression `^a` (if a is the starting symbol) to input string `abc` it matches `a`. But if we apply regular expression `^b` on above
-input string it does not match anything. Because in input string `abc` "b" is not the starting symbol. Let's take a look at another
-regular expression `^(T|t)he` which means: uppercase character `T` or lowercase character `t` is the start symbol of the input string,
-followed by lowercase character `h`, followed by lowercase character `e`.
+Caret `^` symbol is used to check if matching character is the first character
+of the input string. If we apply the following regular expression `^a` (if a is
+the starting symbol) to input string `abc` it matches `a`. But if we apply
+regular expression `^b` on above input string it does not match anything.
+Because in input string `abc` "b" is not the starting symbol. Let's take a look
+at another regular expression `^(T|t)he` which means: uppercase character `T` or
+lowercase character `t` is the start symbol of the input string, followed by
+lowercase character `h`, followed by lowercase character `e`.
 "(T|t)he" => The car is parked in the garage.
@@ -301,9 +343,10 @@ followed by lowercase character `h`, followed by lowercase character `e`.
 ### 2.8.2 Dollar
-Dollar `$` symbol is used to check if matching character is the last character of the input string. For example, regular expression
-`(at\.)$` means: a lowercase character `a`, followed by lowercase character `t`, followed by a `.` character and the matcher
-must be end of the string.
+Dollar `$` symbol is used to check if matching character is the last character
+of the input string. For example, regular expression `(at\.)$` means: a
+lowercase character `a`, followed by lowercase character `t`, followed by a `.`
+character and the matcher must be end of the string.
 "(at\.)" => The fat cat. sat. on the mat.
@@ -319,8 +362,9 @@ must be end of the string.
 ## 3. Shorthand Character Sets
-Regular expression provides shorthands for the commonly used character sets, which offer convenient shorthands for commonly used
-regular expressions. The shorthand character sets are as follows:
+Regular expression provides shorthands for the commonly used character sets,
+which offer convenient shorthands for commonly used regular expressions. The
+shorthand character sets are as follows:
 |Shorthand|Description|
 |:----:|----|
@@ -334,11 +378,15 @@ regular expressions. The shorthand character sets are as follows:
 ## 4. Lookaround
-Lookbehind and lookahead sometimes known as lookaround are specific type of ***non-capturing group*** (Use to match the pattern but not
-included in matching list). Lookaheads are used when we have the condition that this pattern is preceded or followed by another certain
-pattern. For example, we want to get all numbers that are preceded by `$` character from the following input string `$4.44 and $10.88`.
-We will use following regular expression `(?<=\$)[0-9\.]*` which means: get all the numbers which contain `.` character and are preceded
-by `$` character. Following are the lookarounds that are used in regular expressions:
+Lookbehind and lookahead sometimes known as lookaround are specific type of
+***non-capturing group*** (Use to match the pattern but not included in matching
+list). Lookaheads are used when we have the condition that this pattern is
+preceded or followed by another certain pattern. For example, we want to get all
+numbers that are preceded by `$` character from the following input string
+`$4.44 and $10.88`. We will use following regular expression `(?<=\$)[0-9\.]*`
+which means: get all the numbers which contain `.` character and are preceded
+by `$` character. Following are the lookarounds that are used in regular
+expressions:
 |Symbol|Description|
 |:----:|----|
@@ -349,12 +397,16 @@ by `$` character. Following are the lookarounds that are used in regular express
 ### 4.1 Positive Lookahead
-The positive lookahead asserts that the first part of the expression must be followed by the lookahead expression. The returned match
-only contains the text that is matched by the first part of the expression. To define a positive lookahead, parentheses are used. Within
-those parentheses, a question mark with equal sign is used like this: `(?=...)`. Lookahead expression is written after the equal sign inside
-parentheses. For example, the regular expression `[T|t]he(?=\sfat)` means: optionally match lowercase letter `t` or uppercase letter `T`,
-followed by letter `h`, followed by letter `e`. In parentheses we define positive lookahead which tells regular expression engine to match
-`The` or `the` which are followed by the word `fat`.
+The positive lookahead asserts that the first part of the expression must be
+followed by the lookahead expression. The returned match only contains the text
+that is matched by the first part of the expression. To define a positive
+lookahead, parentheses are used. Within those parentheses, a question mark with
+equal sign is used like this: `(?=...)`. Lookahead expression is written after
+the equal sign inside parentheses. For example, the regular expression
+`[T|t]he(?=\sfat)` means: optionally match lowercase letter `t` or uppercase
+letter `T`, followed by letter `h`, followed by letter `e`. In parentheses we
+define positive lookahead which tells regular expression engine to match `The`
+or `the` which are followed by the word `fat`.
 "[T|t]he(?=\sfat)" => The fat cat sat on the mat.
@@ -364,10 +416,13 @@ followed by letter `h`, followed by letter `e`. In parentheses we define positiv
 ### 4.2 Negative Lookahead
-Negative lookahead is used when we need to get all matches from input string that are not followed by a pattern. Negative lookahead
-defined same as we define positive lookahead but the only difference is instead of equal `=` character we use negation `!` character
-i.e. `(?!...)`. Let's take a look at the following regular expression `[T|t]he(?!\sfat)` which means: get all `The` or `the` words from
-input string that are not followed by the word `fat` precedes by a space character.
+Negative lookahead is used when we need to get all matches from input string
+that are not followed by a pattern. Negative lookahead defined same as we define
+positive lookahead but the only difference is instead of equal `=` character we
+use negation `!` character i.e. `(?!...)`. Let's take a look at the following
+regular expression `[T|t]he(?!\sfat)` which means: get all `The` or `the` words
+from input string that are not followed by the word `fat` precedes by a space
+character.
 "[T|t]he(?!\sfat)" => The fat cat sat on the mat.
@@ -377,9 +432,10 @@ input string that are not followed by the word `fat` precedes by a space charact
 ### 4.3 Positive Lookbehind
-Positive lookbehind is used to get all the matches that are preceded by a specific pattern. Positive lookbehind is denoted by
-`(?<=...)`. For example, the regular expression `(?<=[T|t]he\s)(fat|mat)` means: get all `fat` or `mat` words from input string that
-are after the word `The` or `the`.
+Positive lookbehind is used to get all the matches that are preceded by a
+specific pattern. Positive lookbehind is denoted by `(?<=...)`. For example, the
+regular expression `(?<=[T|t]he\s)(fat|mat)` means: get all `fat` or `mat` words
+from input string that are after the word `The` or `the`.
 "(?<=[T|t]he\s)(fat|mat)" => The fat cat sat on the mat.
@@ -389,9 +445,10 @@ are after the word `The` or `the`.
 ### 4.4 Negative Lookbehind
-Negative lookbehind is used to get all the matches that are not preceded by a specific pattern. Negative lookbehind is denoted by
-`(?<!...)`. For example, the regular expression `(?<![T|t]he\s)(cat)` means: get all `cat` words from input string that
-are not after the word `The` or `the`.
+Negative lookbehind is used to get all the matches that are not preceded by a
+specific pattern. Negative lookbehind is denoted by `(?<!...)`. For example, the
+regular expression `(?<![T|t]he\s)(cat)` means: get all `cat` words from input
+string that are not after the word `The` or `the`.
 "(?<![T|t]he\s)(cat)" => The cat sat on cat.
@@ -401,8 +458,9 @@ are not after the word `The` or `the`.
 ## 5. Flags
-Flags are also called modifiers because they modify the output of a regular expression. These flags can be used in any order or
-combination, and are an integral part of the RegExp.
+Flags are also called modifiers because they modify the output of a regular
+expression. These flags can be used in any order or combination, and are an
+integral part of the RegExp.
 |Flag|Description|
 |:----:|----|
@@ -412,10 +470,12 @@ combination, and are an integral part of the RegExp.
 ### 5.1 Case Insensitive
-The `i` modifier is used to perform case-insensitive matching. For example, the regular expression `/The/gi` means: uppercase letter
-`T`, followed by lowercase character `h`, followed by character `e`. And at the end of regular expression the `i` flag tells the
-regular expression engine to ignore the case. As you can see we also provided `g` flag because we want to search for the pattern in
-the whole input string.
+The `i` modifier is used to perform case-insensitive matching. For example, the
+regular expression `/The/gi` means: uppercase letter `T`, followed by lowercase
+character `h`, followed by character `e`. And at the end of regular expression
+the `i` flag tells the regular expression engine to ignore the case. As you can
+see we also provided `g` flag because we want to search for the pattern in the
+whole input string.
 "The" => The fat cat sat on the mat.
@@ -431,10 +491,11 @@ the whole input string.
 ### 5.2 Global search
-The `g` modifier is used to perform a global match (find all matches rather than stopping after the first match). For example, the
-regular expression`/.(at)/g` means: any character except new line, followed by lowercase character `a`, followed by lowercase
-character `t`. Because we provided `g` flag at the end of the regular expression now it will find every matches from whole input
-string.
+The `g` modifier is used to perform a global match (find all matches rather than
+stopping after the first match). For example, the regular expression`/.(at)/g`
+means: any character except new line, followed by lowercase character `a`,
+followed by lowercase character `t`. Because we provided `g` flag at the end of
+the regular expression now it will find every matches from whole input string.
 "/.(at)/" => The fat cat sat on the mat.
@@ -450,10 +511,13 @@ string.
 ### 5.3 Multiline
-The `m` modifier is used to perform a multi-line match. As we discussed earlier anchors `(^, $)` are used to check if pattern is
-the beginning of the input or end of the input string. But if we want that anchors works on each line we use `m` flag. For example, the
-regular expression `/at(.)?$/gm` means: lowercase character `a`, followed by lowercase character `t`, optionally anything except new
-line. And because of `m` flag now regular expression engine matches pattern at the end of each line in a string.
+The `m` modifier is used to perform a multi-line match. As we discussed earlier
+anchors `(^, $)` are used to check if pattern is the beginning of the input or
+end of the input string. But if we want that anchors works on each line we use
+`m` flag. For example, the regular expression `/at(.)?$/gm` means: lowercase
+character `a`, followed by lowercase character `t`, optionally anything except
+new line. And because of `m` flag now regular expression engine matches pattern
+at the end of each line in a string.
 "/.at(.)?$/" => The fat
diff --git a/img/img_original.png b/img/img_original.png
new file mode 100644
index 0000000..da1c5fa
Binary files /dev/null and b/img/img_original.png differ
diff --git a/img/regexp-en.png b/img/regexp-en.png
new file mode 100644
index 0000000..f233149
Binary files /dev/null and b/img/regexp-en.png differ
diff --git a/img/regexp-es.png b/img/regexp-es.png
new file mode 100644
index 0000000..3efa6ee
Binary files /dev/null and b/img/regexp-es.png differ
diff --git a/img/regexp-fr.png b/img/regexp-fr.png
new file mode 100644
index 0000000..26e0c9e
Binary files /dev/null and b/img/regexp-fr.png differ
diff --git a/img/regexp.svg b/img/regexp.svg
new file mode 100644
index 0000000..f08690b
--- /dev/null
+++ b/img/regexp.svg
@@ -0,0 +1,397 @@
+
+
+