What is a Regular Expression?
A regular expression is a group of characters or symbols used to find a specific pattern in a text. Because "regular expression" is a mouthful, you will usually find the term abbreviated as "regex" or "regexp". Regular expressions are used for replacing text within a string, validating forms, extracting a substring from a string based on a pattern match, and much more.
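The three uses named above can be sketched in a few lines. This is only an illustration, assuming Python's built-in re module (the text itself does not name a regex engine); the sample strings and variable names are made up for the example.

```python
import re

text = "The cat sat on the mat"

# Replace: swap every occurrence of the pattern "cat" with "dog".
replaced = re.sub("cat", "dog", text)

# Validate: check that a whole string consists only of digits.
is_valid = re.fullmatch("[0-9]+", "12345") is not None

# Extract: pull out the first substring that matches the pattern.
found = re.search("[a-z]at", text)
extracted = found.group(0) if found else None

print(replaced)   # The dog sat on the mat
print(is_valid)   # True
print(extracted)  # cat
```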
Table of Contents
- Basic Matchers
- Meta Characters
- Quantifiers
- OR operator
- Character Sets
- Shorthand Character Sets
- Grouping
- Lookaheads
- Flags
1. Basic Matchers
A regular expression is just a pattern of letters and digits that we use to search a text. For example, the regular expression cat means: the letter c, followed by the letter a, followed by the letter t.
"cat" => The cat sat on the mat
The regular expression 123 matches the string "123". The regular expression is matched against an input string by comparing each character in the regular expression to each character in the input string, one after another. Regular expressions are normally case-sensitive, so the regular expression Cat would not match the string "cat".
"Cat" => The cat sat on the Cat
2. Meta Characters
Meta characters are the building blocks of regular expressions. Some meta characters take on a special meaning when they are written inside square brackets.
2.1 Full stop
Full stop . is the simplest example of a meta character. The meta character . matches any single character. By default it does not match newline characters (some engines also exclude the carriage return). For example, the regular expression .ar means: any character, followed by the letter a, followed by the letter r.
".ar" => The car parked in the garage.
2.2 Character set
Character sets are also called character classes. Square brackets are used to specify character sets. Use a hyphen inside a character set to specify a range of characters. The order of the items inside the square brackets doesn't matter, although each range itself must run from the lower to the higher character. For example, the regular expression [Tt]he means: an uppercase T or lowercase t, followed by the letter h, followed by the letter e.
"[Tt]he" => The car parked in the garage.
2.2.1 Negated character set
In general the caret symbol represents the start of the string, but when it is typed immediately after the opening square bracket it negates the character set. For example, the regular expression [^c]ar means: any character except c, followed by the letter a, followed by the letter r.
"[^c]ar" => The car parked in the garage.
2.2.2 Repeating character set
We can repeat a character class by using the +, * or ? operators. For example, the regular expression [a-z]+ means: one or more lowercase letters in a row.
"[a-z]+" => The car parked in the garage.