From 7b00db73fc8b0bdafcf129ab15428f7acb34d485 Mon Sep 17 00:00:00 2001
From: Caleb Mazalevskis
Date: Wed, 18 Oct 2017 13:32:47 +0800
Subject: [PATCH 01/24] Patch.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Sync language list (all translations).
- Sync some punctuation (Chinese ",.:?" -> ",。:?" and Japanese "," -> "、").
- Sync some spacing (various translations).
- One missing translation (Chinese).
- (PR is NOT a complete audit or proofreading).
---
 README-cn.md    | 233 ++++++++++++++++++++++++------------------------
 README-es.md    |   5 +-
 README-fr.md    |   3 +-
 README-gr.md    |   2 +-
 README-ja.md    |  29 +++---
 README-ko.md    |   3 +-
 README-pt_BR.md |   3 +-
 README-tr.md    |  15 ++--
 README.md       |   2 +-
 9 files changed, 151 insertions(+), 144 deletions(-)

diff --git a/README-cn.md b/README-cn.md
index add4168..de8d14f 100644
--- a/README-cn.md
+++ b/README-cn.md
@@ -3,36 +3,37 @@ Learn Regex


-## 翻译:
+## 翻译:
 * [English](README.md)
 * [Español](README-es.md)
 * [Français](README-fr.md)
-* [中文版](README-cn.md)
+* [Português do Brasil](README-pt_BR.md)
+* [中文(简体)版](README-cn.md)
 * [日本語](README-ja.md)
 * [한국어](README-ko.md)
 * [Turkish](README-tr.md)
 * [Greek](README-gr.md)
-## 什么是正则表达式?
+## 什么是正则表达式?
-> 正则表达式是一组由字母和符号组成的特殊文本, 它可以用来从文本中找出满足你想要的格式的句子.
+> 正则表达式是一组由字母和符号组成的特殊文本,它可以用来从文本中找出满足你想要的格式的句子。
-一个正则表达式是在一个主体字符串中从左到右匹配字符串时的一种样式.
-"Regular expression"这个词比较拗口, 我们常使用缩写的术语"regex"或"regexp".
-正则表达式可以从一个基础字符串中根据一定的匹配模式替换文本中的字符串、验证表单、提取字符串等等.
+一个正则表达式是在一个主体字符串中从左到右匹配字符串时的一种样式。
+“Regular expression”这个词比较拗口,我们常使用缩写的术语“regex”或“regexp”。
+正则表达式可以从一个基础字符串中根据一定的匹配模式替换文本中的字符串、验证表单、提取字符串等等。
-想象你正在写一个应用, 然后你想设定一个用户命名的规则, 让用户名包含字符,数字,下划线和连字符,以及限制字符的个数,好让名字看起来没那么丑.
-我们使用以下正则表达式来验证一个用户名:
+想象你正在写一个应用,然后你想设定一个用户命名的规则,让用户名包含字符,数字,下划线和连字符,以及限制字符的个数,好让名字看起来没那么丑。
+我们使用以下正则表达式来验证一个用户名:

Regular expression

-以上的正则表达式可以接受 `john_doe`, `jo-hn_doe`, `john12_as`.
-但不匹配`Jo`, 因为它包含了大写的字母而且太短了.
+以上的正则表达式可以接受 `john_doe`、`jo-hn_doe`、`john12_as`。
+但不匹配`Jo`,因为它包含了大写的字母而且太短了。
 目录
 =================
@@ -69,8 +70,8 @@
 ## 1. 基本匹配
-正则表达式其实就是在执行搜索时的格式, 它由一些字母和数字组合而成.
-例如: 一个正则表达式 `the`, 它表示一个规则: 由字母`t`开始,接着是`h`,再接着是`e`.
+正则表达式其实就是在执行搜索时的格式,它由一些字母和数字组合而成。
+例如:一个正则表达式 `the`,它表示一个规则:由字母`t`开始,接着是`h`,再接着是`e`。
 "the" => The fat cat sat on the mat.
@@ -78,9 +79,9 @@
 
 [在线练习](https://regex101.com/r/dmRygT/1)
 
-正则表达式`123`匹配字符串`123`. 它逐个字符的与输入的正则表达式做比较.
+正则表达式`123`匹配字符串`123`. 它逐个字符的与输入的正则表达式做比较。
 
-正则表达式是大小写敏感的, 所以`The`不会匹配`the`.
+正则表达式是大小写敏感的,所以`The`不会匹配`the`。
 
 
 "The" => The fat cat sat on the mat.
@@ -90,29 +91,29 @@
 
 ## 2. 元字符
 
-正则表达式主要依赖于元字符.
-元字符不代表他们本身的字面意思, 他们都有特殊的含义. 一些元字符写在方括号中的时候有一些特殊的意思. 以下是一些元字符的介绍:
+正则表达式主要依赖于元字符。
+元字符不代表他们本身的字面意思,他们都有特殊的含义。一些元字符写在方括号中的时候有一些特殊的意思. 以下是一些元字符的介绍:
 
 |元字符|描述|
 |:----:|----|
-|.|句号匹配任意单个字符除了换行符.|
-|[ ]|字符种类. 匹配方括号内的任意字符.|
-|[^ ]|否定的字符种类. 匹配除了方括号里的任意字符|
-|*|匹配>=0个重复的在*号之前的字符.|
-|+|匹配>=1个重复的+号前的字符.
-|?|标记?之前的字符为可选.|
-|{n,m}|匹配num个大括号之前的字符 (n <= num <= m).|
-|(xyz)|字符集, 匹配与 xyz 完全相等的字符串.|
-|||或运算符,匹配符号前或后的字符.|
-|\|转义字符,用于匹配一些保留的字符 [ ] ( ) { } . * + ? ^ $ \ ||
-|^|从开始行开始匹配.|
-|$|从末端开始匹配.|
+|.|句号匹配任意单个字符除了换行符。|
+|[ ]|字符种类. 匹配方括号内的任意字符。|
+|[^ ]|否定的字符种类。匹配除了方括号里的任意字符。|
+|*|匹配>=0个重复的在*号之前的字符。|
+|+|匹配>=1个重复的+号前的字符。|
+|?|标记?之前的字符为可选。|
+|{n,m}|匹配num个大括号之前的字符 (n <= num <= m)。|
+|(xyz)|字符集,匹配与 xyz 完全相等的字符串。|
+|||或运算符,匹配符号前或后的字符。|
+|\|转义字符,用于匹配一些保留的字符 [ ] ( ) { } . * + ? ^ $ \ ||
+|^|从开始行开始匹配。|
+|$|从末端开始匹配。|
 
 ## 2.1 点运算符 `.`
 
-`.`是元字符中最简单的例子.
-`.`匹配任意单个字符, 但不匹配换行符.
-例如, 表达式`.ar`匹配一个任意字符后面跟着是`a`和`r`的字符串.
+`.`是元字符中最简单的例子。
+`.`匹配任意单个字符,但不匹配换行符。
+例如,表达式`.ar`匹配一个任意字符后面跟着是`a`和`r`的字符串。
 
 
 ".ar" => The car parked in the garage.
@@ -122,11 +123,11 @@
 
 ## 2.2 字符集
 
-字符集也叫做字符类.
-方括号用来指定一个字符集.
-在方括号中使用连字符来指定字符集的范围.
-在方括号中的字符集不关心顺序.
-例如, 表达式`[Tt]he` 匹配 `the` 和 `The`.
+字符集也叫做字符类。
+方括号用来指定一个字符集。
+在方括号中使用连字符来指定字符集的范围。
+在方括号中的字符集不关心顺序。
+例如,表达式`[Tt]he` 匹配 `the` 和 `The`。
 
 
 "[Tt]he" => The car parked in the garage.
@@ -134,8 +135,8 @@
 
 [在线练习](https://regex101.com/r/2ITLQ4/1)
 
-方括号的句号就表示句号.
-表达式 `ar[.]` 匹配 `ar.`字符串
+方括号的句号就表示句号。
+表达式 `ar[.]` 匹配 `ar.`字符串。
 
 
 "ar[.]" => A garage is a good place to park a car.
@@ -145,8 +146,8 @@
 
 ### 2.2.1 否定字符集
 
-一般来说 `^` 表示一个字符串的开头, 但它用在一个方括号的开头的时候, 它表示这个字符集是否定的.
-例如, 表达式`[^c]ar` 匹配一个后面跟着`ar`的除了`c`的任意字符.
+一般来说 `^` 表示一个字符串的开头,但它用在一个方括号的开头的时候,它表示这个字符集是否定的。
+例如,表达式`[^c]ar` 匹配一个后面跟着`ar`的除了`c`的任意字符。
 
 
 "[^c]ar" => The car parked in the garage.
@@ -156,13 +157,13 @@
 
 ## 2.3 重复次数
 
-后面跟着元字符 `+`, `*` or `?` 的, 用来指定匹配子模式的次数.
-这些元字符在不同的情况下有着不同的意思.
+后面跟着元字符 `+`,`*`或`?`的,用来指定匹配子模式的次数。
+这些元字符在不同的情况下有着不同的意思。
 
 ### 2.3.1 `*` 号
 
-`*`号匹配 在`*`之前的字符出现`大于等于0`次.
-例如, 表达式 `a*` 匹配以0或更多个a开头的字符, 因为有0个这个条件, 其实也就匹配了所有的字符. 表达式`[a-z]*` 匹配一个行中所有以小写字母开头的字符串.
+`*`号匹配 在`*`之前的字符出现`大于等于0`次。
+例如,表达式 `a*` 匹配以0或更多个a开头的字符,因为有0个这个条件,其实也就匹配了所有的字符. 表达式`[a-z]*` 匹配一个行中所有以小写字母开头的字符串。
 
 
 "[a-z]*" => The car parked in the garage #21.
@@ -170,8 +171,8 @@
 
 [在线练习](https://regex101.com/r/7m8me5/1)
 
-`*`字符和`.`字符搭配可以匹配所有的字符`.*`.
-`*`和表示匹配空格的符号`\s`连起来用, 如表达式`\s*cat\s*`匹配0或更多个空格开头和0或更多个空格结尾的cat字符串.
+`*`字符和`.`字符搭配可以匹配所有的字符`.*`。
+`*`和表示匹配空格的符号`\s`连起来用,如表达式`\s*cat\s*`匹配0或更多个空格开头和0或更多个空格结尾的cat字符串。
 
 
 "\s*cat\s*" => The fat cat sat on the concatenation.
@@ -181,8 +182,8 @@
 
 ### 2.3.2 `+` 号
 
-`+`号匹配`+`号之前的字符出现 >=1 次.
-例如表达式`c.+t` 匹配以首字母`c`开头以`t`结尾,中间跟着任意个字符的字符串.
+`+`号匹配`+`号之前的字符出现 >=1 次。
+例如表达式`c.+t` 匹配以首字母`c`开头以`t`结尾,中间跟着任意个字符的字符串。
 
 
 "c.+t" => The fat cat sat on the mat.
@@ -192,8 +193,8 @@
 
 ### 2.3.3 `?` 号
 
-在正则表达式中元字符 `?` 标记在符号前面的字符为可选, 即出现 0 或 1 次.
-例如, 表达式 `[T]?he` 匹配字符串 `he` 和 `The`.
+在正则表达式中元字符 `?` 标记在符号前面的字符为可选,即出现 `0` 或 `1` 次。
+例如,表达式 `[T]?he` 匹配字符串 `he` 和 `The`。
 
 
 "[T]he" => The car is parked in the garage.
@@ -209,8 +210,8 @@
 
 ## 2.4 `{}` 号
 
-在正则表达式中 `{}` 是一个量词, 常用来一个或一组字符可以重复出现的次数.
-例如,  表达式 `[0-9]{2,3}` 匹配最少 2 位最多 3 位 0~9 的数字.
+在正则表达式中 `{}` 是一个量词,常用来一个或一组字符可以重复出现的次数。
+例如,表达式 `[0-9]{2,3}` 匹配最少 2 位最多 3 位 0~9 的数字。
 
 
 "[0-9]{2,3}" => The number was 9.9997 but we rounded it off to 10.0.
@@ -218,11 +219,11 @@
 
 [在线练习](https://regex101.com/r/juM86s/1)
 
-我们可以省略第二个参数.
-例如, `[0-9]{2,}` 匹配至少两位 0~9 的数字.
+我们可以省略第二个参数。
+例如, `[0-9]{2,}` 匹配至少两位 0~9 的数字。
 
-如果逗号也省略掉则表示重复固定的次数.
-例如, `[0-9]{3}` 匹配3位数字
+如果逗号也省略掉则表示重复固定的次数。
+例如, `[0-9]{3}` 匹配3位数字。
 
 
 "[0-9]{2,}" => The number was 9.9997 but we rounded it off to 10.0.
@@ -238,9 +239,9 @@
 
 ## 2.5 `(...)` 特征标群
 
-特征标群是一组写在 `(...)` 中的子模式. 例如之前说的 `{}` 是用来表示前面一个字符出现指定次数. 但如果在 `{}` 前加入特征标群则表示整个标群内的字符重复 N 次. 例如, 表达式 `(ab)*` 匹配连续出现 0 或更多个 `ab`.
+特征标群是一组写在 `(...)` 中的子模式. 例如之前说的 `{}` 是用来表示前面一个字符出现指定次数. 但如果在 `{}` 前加入特征标群则表示整个标群内的字符重复 N 次. 例如,表达式 `(ab)*` 匹配连续出现 0 或更多个 `ab`。
 
-我们还可以在 `()` 中用或字符 `|` 表示或. 例如, `(c|g|p)ar` 匹配 `car` 或 `gar` 或 `par`.
+我们还可以在 `()` 中用或字符 `|` 表示或。 例如, `(c|g|p)ar` 匹配 `car` 或 `gar` 或 `par`。
 
 
 "(c|g|p)ar" => The car is parked in the garage.
@@ -250,9 +251,9 @@
 
 ## 2.6 `|` 或运算符
 
-或运算符就表示或, 用作判断条件.
+或运算符就表示或,用作判断条件。
 
-例如 `(T|t)he|car` 匹配 `(T|t)he` 或 `car`.
+例如 `(T|t)he|car` 匹配 `(T|t)he` 或 `car`。
 
 
 "(T|t)he|car" => The car is parked in the garage.
@@ -262,9 +263,9 @@
 
 ## 2.7 转码特殊字符
 
-反斜线 `\` 在表达式中用于转码紧跟其后的字符. 用于指定 `{ } [ ] / \ + * . $ ^ | ?` 这些特殊字符. 如果想要匹配这些特殊字符则要在其前面加上反斜线 `\`.
+反斜线 `\` 在表达式中用于转码紧跟其后的字符。用于指定 `{ } [ ] / \ + * . $ ^ | ?` 这些特殊字符. 如果想要匹配这些特殊字符则要在其前面加上反斜线 `\`。
 
-例如 `.` 是用来匹配除换行符外的所有字符的. 如果想要匹配句子中的 `.` 则要写成 `\.` 以下这个例子 `\.?`是选择性匹配`.`
+例如 `.` 是用来匹配除换行符外的所有字符的。如果想要匹配句子中的 `.` 则要写成 `\.` 以下这个例子 `\.?`是选择性匹配`.`。
 
 
 "(f|c|m)at\.?" => The fat cat sat on the mat.
@@ -274,15 +275,15 @@
 
 ## 2.8 锚点
 
-在正则表达式中, 想要匹配指定开头或结尾的字符串就要使用到锚点. `^` 指定开头, `$` 指定结尾.
+在正则表达式中,想要匹配指定开头或结尾的字符串就要使用到锚点。 `^` 指定开头, `$` 指定结尾。
 
 ### 2.8.1 `^` 号
 
-`^` 用来检查匹配的字符串是否在所匹配字符串的开头.
+`^` 用来检查匹配的字符串是否在所匹配字符串的开头。
 
-例如, 在 `abc` 中使用表达式 `^a` 会得到结果 `a`. 但如果使用 `^b` 将匹配不到任何结果. 因为在字符串 `abc` 中并不是以 `b` 开头.
+例如,在 `abc` 中使用表达式 `^a` 会得到结果 `a`. 但如果使用 `^b` 将匹配不到任何结果. 因为在字符串 `abc` 中并不是以 `b` 开头。
 
-例如, `^(T|t)he` 匹配以 `The` 或 `the` 开头的字符串.
+例如, `^(T|t)he` 匹配以 `The` 或 `the` 开头的字符串。
 
 
 "(T|t)he" => The car is parked in the garage.
@@ -298,9 +299,9 @@
 
 ### 2.8.2 `$` 号
 
-同理于 `^` 号, `$` 号用来匹配字符是否是最后一个.
+同理于 `^` 号, `$` 号用来匹配字符是否是最后一个。
 
-例如, `(at\.)$` 匹配以 `at.` 结尾的字符串.
+例如, `(at\.)$` 匹配以 `at.` 结尾的字符串。
 
 
 "(at\.)" => The fat cat. sat. on the mat.
@@ -316,50 +317,50 @@
 
 ##  3. 简写字符集
 
-正则表达式提供一些常用的字符集简写. 如下:
+正则表达式提供一些常用的字符集简写. 如下:
 
 |简写|描述|
 |:----:|----|
-|.|除换行符外的所有字符|
-|\w|匹配所有字母数字, 等同于 `[a-zA-Z0-9_]`|
-|\W|匹配所有非字母数字, 即符号, 等同于: `[^\w]`|
-|\d|匹配数字: `[0-9]`|
-|\D|匹配非数字: `[^\d]`|
-|\s|匹配所有空格字符, 等同于: `[\t\n\f\r\p{Z}]`|
-|\S|匹配所有非空格字符: `[^\s]`|
-|\f|匹配一个换页符|
-|\n|匹配一个换行符|
-|\r|匹配一个回车符|
-|\t|匹配一个制表符|
-|\v|匹配一个垂直制表符|
-|\p|匹配 CR/LF (等同于 `\r\n`),用来匹配 DOS 行终止符|
+|.|除换行符外的所有字符。|
+|\w|匹配所有字母数字,等同于 `[a-zA-Z0-9_]`。|
+|\W|匹配所有非字母数字,即符号,等同于: `[^\w]`。|
+|\d|匹配数字: `[0-9]`。|
+|\D|匹配非数字: `[^\d]`。|
+|\s|匹配所有空格字符,等同于: `[\t\n\f\r\p{Z}]`。|
+|\S|匹配所有非空格字符: `[^\s]`。|
+|\f|匹配一个换页符。|
+|\n|匹配一个换行符。|
+|\r|匹配一个回车符。|
+|\t|匹配一个制表符。|
+|\v|匹配一个垂直制表符。|
+|\p|匹配 CR/LF (等同于 `\r\n`),用来匹配 DOS 行终止符。|
 
 ## 4. 前后关联约束(前后预查)
 
-前置约束和后置约束都属于**非捕获簇**(用于匹配不在匹配列表中的格式).
-前置约束用于判断所匹配的格式是否在另一个确定的格式之后.
+前置约束和后置约束都属于**非捕获簇**(用于匹配不在匹配列表中的格式)。
+前置约束用于判断所匹配的格式是否在另一个确定的格式之后。
 
-例如, 我们想要获得所有跟在 `$` 符号后的数字, 我们可以使用正向向后约束 `(?<=\$)[0-9\.]*`.
-这个表达式匹配 `$` 开头, 之后跟着 `0,1,2,3,4,5,6,7,8,9,.` 这些字符可以出现大于等于 0 次.
+例如,我们想要获得所有跟在 `$` 符号后的数字,我们可以使用正向向后约束 `(?<=\$)[0-9\.]*`。
+这个表达式匹配 `$` 开头,之后跟着 `0,1,2,3,4,5,6,7,8,9,.` 这些字符可以出现大于等于 0 次。
 
-前后关联约束如下:
+前后关联约束如下:
 
 |符号|描述|
 |:----:|----|
-|?=|前置约束-存在|
-|?!|前置约束-排除|
-|?<=|后置约束-存在|
-|?
 "(T|t)he(?=\sfat)" => The fat cat sat on the mat.
@@ -369,10 +370,10 @@
 
 ### 4.2 `?!...` 前置约束-排除
 
-前置约束-排除 `?!` 用于筛选所有匹配结果, 筛选条件为 其后不跟随着定义的格式
-`前置约束-排除`  定义和 `前置约束(存在)` 一样, 区别就是 `=` 替换成 `!` 也就是 `(?!...)`.
+前置约束-排除 `?!` 用于筛选所有匹配结果,筛选条件为 其后不跟随着定义的格式。
+`前置约束-排除`  定义和 `前置约束(存在)` 一样,区别就是 `=` 替换成 `!` 也就是 `(?!...)`。
 
-表达式 `(T|t)he(?!\sfat)` 匹配 `The` 和 `the`, 且其后不跟着 `(空格)fat`.
+表达式 `(T|t)he(?!\sfat)` 匹配 `The` 和 `the`,且其后不跟着 `fat`。
 
 
 "(T|t)he(?!\sfat)" => The fat cat sat on the mat.
@@ -382,8 +383,8 @@
 
 ### 4.3 `?<= ...` 后置约束-存在
 
-后置约束-存在 记作`(?<=...)` 用于筛选所有匹配结果, 筛选条件为 其前跟随着定义的格式.
-例如, 表达式 `(?<=(T|t)he\s)(fat|mat)` 匹配 `fat` 和 `mat`, 且其前跟着 `The` 或 `the`.
+后置约束-存在 记作`(?<=...)` 用于筛选所有匹配结果,筛选条件为 其前跟随着定义的格式。
+例如,表达式 `(?<=(T|t)he\s)(fat|mat)` 匹配 `fat` 和 `mat`,且其前跟着 `The` 或 `the`。
 
 
 "(?<=(T|t)he\s)(fat|mat)" => The fat cat sat on the mat.
@@ -393,8 +394,8 @@
 
 ### 4.4 `?
 "(?<!(T|t)he\s)(cat)" => The cat sat on cat.
@@ -404,19 +405,19 @@
 
 ## 5. 标志
 
-标志也叫修饰语, 因为它可以用来修改表达式的搜索结果.
-这些标志可以任意的组合使用, 它也是整个正则表达式的一部分.
+标志也叫修饰语,因为它可以用来修改表达式的搜索结果。
+这些标志可以任意的组合使用,它也是整个正则表达式的一部分。
 
 |标志|描述|
 |:----:|----|
-|i|忽略大小写.|
-|g|全局搜索.|
-|m|多行的: 锚点元字符 `^` `$` 工作范围在每行的起始.|
+|i|忽略大小写。|
+|g|全局搜索。|
+|m|多行的: 锚点元字符 `^` `$` 工作范围在每行的起始。|
 
 ### 5.1 忽略大小写 (Case Insensitive)
 
-修饰语 `i` 用于忽略大小写.
-例如, 表达式 `/The/gi` 表示在全局搜索 `The`, 在后面的 `i` 将其条件修改为忽略大小写, 则变成搜索 `the` 和 `The`, `g` 表示全局搜索.
+修饰语 `i` 用于忽略大小写。
+例如,表达式 `/The/gi` 表示在全局搜索 `The`,在后面的 `i` 将其条件修改为忽略大小写,则变成搜索 `the` 和 `The`, `g` 表示全局搜索。
 
 
 "The" => The fat cat sat on the mat.
@@ -432,8 +433,8 @@
 
 ### 5.2 全局搜索 (Global search)
 
-修饰符 `g` 常用语执行一个全局搜索匹配, 即(不仅仅返回第一个匹配的, 而是返回全部).
-例如, 表达式 `/.(at)/g` 表示搜索 任意字符(除了换行) + `at`, 并返回全部结果.
+修饰符 `g` 常用语执行一个全局搜索匹配,即(不仅仅返回第一个匹配的,而是返回全部)。
+例如,表达式 `/.(at)/g` 表示搜索 任意字符(除了换行) + `at`,并返回全部结果。
 
 
 "/.(at)/" => The fat cat sat on the mat.
@@ -449,11 +450,11 @@
 
 ### 5.3 多行修饰符 (Multiline)
 
-多行修饰符 `m` 常用语执行一个多行匹配.
+多行修饰符 `m` 常用语执行一个多行匹配。
 
-像之前介绍的 `(^,$)` 用于检查格式是否是在待检测字符串的开头或结尾. 但我们如果想要它在每行的开头和结尾生效, 我们需要用到多行修饰符 `m`.
+像之前介绍的 `(^,$)` 用于检查格式是否是在待检测字符串的开头或结尾。但我们如果想要它在每行的开头和结尾生效,我们需要用到多行修饰符 `m`。
 
-例如, 表达式 `/at(.)?$/gm` 表示在待检测字符串每行的末尾搜索 `at`后跟一个或多个 `.` 的字符串, 并返回全部结果.
+例如,表达式 `/at(.)?$/gm` 表示在待检测字符串每行的末尾搜索 `at`后跟一个或多个 `.` 的字符串,并返回全部结果。
 
 
 "/.at(.)?$/" => The fat
diff --git a/README-es.md b/README-es.md
index 48d54da..41ec2bd 100644
--- a/README-es.md
+++ b/README-es.md
@@ -9,13 +9,14 @@
 * [Español](README-es.md)
 * [Français](README-fr.md)
 * [Português do Brasil](README-pt_BR.md)
-* [中文版](README-cn.md)
+* [中文(简体)版](README-cn.md)
 * [日本語](README-ja.md)
 * [한국어](README-ko.md)
 * [Turkish](README-tr.md)
 * [Greek](README-gr.md)
 
 ## Qué es una expresión regular?
+
 > Una expresión regular es un grupo de caracteres o símbolos, los cuales son usados para buscar un patrón específico dentro de un texto.
 
 Una expresión regular es un patrón que que se compara con una cadena de caracteres de izquierda a derecha. La palabra "expresión regular" puede también ser escrita como "Regex" o "Regexp". Las expresiones regulares se utilizan para remplazar un texto dentro de una cadena de caracteres (*string*), validar formularios, extraer una porción de una cadena de caracteres (*substring*) basado en la coincidencia de una patrón, y muchas cosas más.
@@ -414,7 +415,7 @@ El modificador `g` se utiliza para realizar una coincidencia global
 Por ejemplo, la expresión regular `/.(At)/g` significa: cualquier carácter,
 excepto la nueva línea, seguido del carácter en minúscula `a`, seguido del carácter
 en minúscula `t`. Debido a que proveimos el indicador `g` al final de la expresión
-regular, ahora encontrará todas las coincidencias de toda la cadena de entrada, no sólo la 
+regular, ahora encontrará todas las coincidencias de toda la cadena de entrada, no sólo la
 primera instancia (el cual es el comportamiento normal).
 
 
diff --git a/README-fr.md b/README-fr.md
index 6e7c65d..8bd28b5 100644
--- a/README-fr.md
+++ b/README-fr.md
@@ -8,7 +8,8 @@
 * [English](README.md)
 * [Español](README-es.md)
 * [Français](README-fr.md)
-* [中文版](README-cn.md)
+* [Português do Brasil](README-pt_BR.md)
+* [中文(简体)版](README-cn.md)
 * [日本語](README-ja.md)
 * [한국어](README-ko.md)
 * [Turkish](README-tr.md)
diff --git a/README-gr.md b/README-gr.md
index 8fb7da2..1fed229 100644
--- a/README-gr.md
+++ b/README-gr.md
@@ -9,7 +9,7 @@
 * [Español](README-es.md)
 * [Français](README-fr.md)
 * [Português do Brasil](README-pt_BR.md)
-* [中文版](README-cn.md)
+* [中文(简体)版](README-cn.md)
 * [日本語](README-ja.md)
 * [한국어](README-ko.md)
 * [Turkish](README-tr.md)
diff --git a/README-ja.md b/README-ja.md
index 126cde8..aa3693f 100644
--- a/README-ja.md
+++ b/README-ja.md
@@ -8,7 +8,8 @@
 * [English](README.md)
 * [Español](README-es.md)
 * [Français](README-fr.md)
-* [中文版](README-cn.md)
+* [Português do Brasil](README-pt_BR.md)
+* [中文(简体)版](README-cn.md)
 * [日本語](README-ja.md)
 * [한국어](README-ko.md)
 * [Turkish](README-tr.md)
@@ -33,7 +34,7 @@
   Regular expression
 

-この正規表現によって `john_doe, jo-hn_doe, john12_as` などは許容されることになります。
+この正規表現によって `john_doe`、`jo-hn_doe`、`john12_as` などは許容されることになります。
 一方で `Jo` は大文字を含む上に短すぎるため許容されません。
 ## 目次
@@ -129,7 +130,7 @@
 文字集合を指定するには角括弧でくくります。
 文字の範囲を指定するにはハイフンを使用します。
 角括弧内の文字の記述順はマッチングには関係ありません。
-例えば `[Tt]he` という正規表現は大文字 `T` または小文字 `t` の後に `h`, `e` が続く文字列を表します。
+例えば `[Tt]he` という正規表現は大文字 `T` または小文字 `t` の後に `h`、 `e` が続く文字列を表します。
 "[Tt]he" => The car parked in the garage.
@@ -151,7 +152,7 @@
 通常キャレットは文字列の開始を意味するメタ文字ですが、角括弧内で最初に使用されると
 文字集合を否定する意味を持つようになります。
 例えば `[^c]ar` という正規表現は `c` 以外の任意の文字列の後に
-`a`, `r` が続く文字列を表します。
+`a`、`r` が続く文字列を表します。
 
 
 "[^c]ar" => The car parked in the garage.
@@ -161,7 +162,7 @@
 
 ## 2.3 繰り返し
 
-`+`, `*`, `?` はパターンが何回続くのかを指定するためのメタ文字になります。
+`+`、 `*`、 `?` はパターンが何回続くのかを指定するためのメタ文字になります。
 これらのメタ文字は異なるシチュエーションで異なる振る舞いをします。
 
 ### 2.3.1 アスタリスク
@@ -181,7 +182,7 @@
 任意の文字列を表現できます。
 またスペースを表す `\s` と併用することで空白文字を表現できます。
 例えば `\s*cat\s*` という正規表現は 0 個以上のスペースの後に
-小文字の `c`, `a`, `t` が続き、その後に 0 個以上のスペースが続きます。
+小文字の `c`、 `a`、 `t` が続き、その後に 0 個以上のスペースが続きます。
 
 
 "\s*cat\s*" => The fat cat sat on the concatenation.
@@ -207,7 +208,7 @@
 正規表現におけるメタ文字 `?` は直前の文字がオプションであることを意味します。
 すなわち直前の文字が 0 個または 1 個現れることを意味します。
 例えば `[T]?he` という正規表現は大文字の `T` が 0 個または 1 個出現し、
-その後に小文字の `h`, `e` が続くことを意味します。
+その後に小文字の `h`、 `e` が続くことを意味します。
 
 
 "[T]he" => The car is parked in the garage.
@@ -258,7 +259,7 @@
 文字グループ全体が繰り返すことを意味します。
 例えば、 `(ab)*` という正規表現は "ab" という文字列の 0 個以上の繰り返しにマッチします。
 文字グループ内では選言 `|` も使用することができます。
-例えば、`(c|g|p)ar` という正規表現は小文字の `c`, `g`, `p` のいずれかの後に
+例えば、`(c|g|p)ar` という正規表現は小文字の `c`、 `g`、 `p` のいずれかの後に
 `a` が続き、さらに `r` が続くことを意味します。
 
 
@@ -289,7 +290,7 @@
 記号として指定できるようになります。
 例えば `.` という正規表現は改行を除く任意の文字として使用されますが、
 `(f|c|m)at\.?` という正規表現では `.` 自体にマッチします。
-この正規表現は小文字の `f`, `c` または `m` の後に小文字の `a`, `t` が続き、
+この正規表現は小文字の `f`、 `c` または `m` の後に小文字の `a`、 `t` が続き、
 さらに `.` が 0 個または 1 個続きます。
 
 
@@ -312,7 +313,7 @@
 しかし `^b` という正規表現は前の文字列に対してはどれにもマッチしません。
 "b" は `abc` という入力文字列の開始ではないからです。
 他の例を見てみます。`^(T|t)he` は大文字の `T` または小文字の `t` から始まる文字列で
-その後に小文字の `h`, `e` が続くことを意味します。
+その後に小文字の `h`、 `e` が続くことを意味します。
 
 
 "(T|t)he" => The car is parked in the garage.
@@ -385,7 +386,7 @@
 肯定的な先読みを定義するには括弧を使用します。
 その括弧の中で疑問符と等号を合わせて `(?=...)` のようにします。
 先読みのパターンは括弧の中の等号の後に記述します。
-例えば `(T|t)he(?=\sfat)` という正規表現は小文字の `t` か大文字の `T` のどちらかの後に `h`, `e` が続きます。
+例えば `(T|t)he(?=\sfat)` という正規表現は小文字の `t` か大文字の `T` のどちらかの後に `h`、 `e` が続きます。
 括弧内で肯定的な先読みを定義していますが、これは `The` または `the` の後に
 `fat` が続くことを表しています。
 
@@ -448,7 +449,7 @@
 ### 5.1 大文字・小文字を区別しない
 
 修飾子 `i` は大文字・小文字を区別したくないときに使用します。
-例えば `/The/gi` という正規表現は大文字の `T` の後に小文字の `h`, `e` が続くという意味ですが、
+例えば `/The/gi` という正規表現は大文字の `T` の後に小文字の `h`、 `e` が続くという意味ですが、
 最後の `i` で大文字・小文字を区別しない設定にしています。
 文字列内の全マッチ列を検索したいのでフラグ `g` も渡しています。
 
@@ -469,7 +470,7 @@
 修飾子 `g` はグローバル検索(最初のマッチ列を検索する代わりに全マッチ列を検索する)を
 行うために使用します。
 例えば `/.(at)/g` という正規表現は、改行を除く任意の文字列の後に
-小文字の `a`, `t` が続きます。正規表現の最後にフラグ `g` を渡すことで、
+小文字の `a`、 `t` が続きます。正規表現の最後にフラグ `g` を渡すことで、
 最初のマッチだけではなく(これがデフォルトの動作です)、入力文字列内の全マッチ列を検索するようにしています。
 
 
@@ -489,7 +490,7 @@
 修飾子 `m` は複数行でマッチさせたいときに使用します。
 前述で `(^, $)` という入力文字列の開始と終了を示すためのアンカーについて説明しましたが、
 フラグ `m` は複数行でマッチさせるためのアンカーとして使用できます。
-例えば `/at(.)?$/gm` という正規表現は小文字の `a`, `t` に続き、改行を除く
+例えば `/at(.)?$/gm` という正規表現は小文字の `a`、 `t` に続き、改行を除く
 任意の文字が 0 個または 1 個続くという意味ですが、
 フラグ `m` を渡すことで入力文字列の各行でパターンを検索させることができます。
 
diff --git a/README-ko.md b/README-ko.md
index 7d0097e..56d59b1 100644
--- a/README-ko.md
+++ b/README-ko.md
@@ -8,7 +8,8 @@
 * [English](README.md)
 * [Español](README-es.md)
 * [Français](README-fr.md)
-* [中文版](README-cn.md)
+* [Português do Brasil](README-pt_BR.md)
+* [中文(简体)版](README-cn.md)
 * [日本語](README-ja.md)
 * [한국어](README-ko.md)
 * [Turkish](README-tr.md)
diff --git a/README-pt_BR.md b/README-pt_BR.md
index de90728..9c57b17 100644
--- a/README-pt_BR.md
+++ b/README-pt_BR.md
@@ -9,9 +9,10 @@
 * [Español](README-es.md)
 * [Français](README-fr.md)
 * [Português do Brasil](README-pt_BR.md)
-* [中文版](README-cn.md)
+* [中文(简体)版](README-cn.md)
 * [日本語](README-ja.md)
 * [한국어](README-ko.md)
+* [Turkish](README-tr.md)
 * [Greek](README-gr.md)
 
 ## O que é uma Expressão Regular?
diff --git a/README-tr.md b/README-tr.md
index 956e85e..f066ba9 100644
--- a/README-tr.md
+++ b/README-tr.md
@@ -8,7 +8,8 @@
 * [English](README.md)
 * [Español](README-es.md)
 * [Français](README-fr.md)
-* [中文版](README-cn.md)
+* [Português do Brasil](README-pt_BR.md)
+* [中文(简体)版](README-cn.md)
 * [日本語](README-ja.md)
 * [한국어](README-ko.md)
 * [Turkish](README-tr.md)
@@ -113,7 +114,7 @@ Nokta `.` meta karakterin en basit örneğidir. `.` meta karakteri satır başla
 ## 2.2 Karakter Takımı
 
 Karakter takımları aryıca Karakter sınıfı olarak bilinir. Karakter takımlarını belirtmek için köşeli ayraçlar kullanılır.
-Karakterin aralığını belirtmek için bir karakter takımında tire kullanın. Köşeli parantezlerdeki karakter aralığının sıralaması önemli değildir. 
+Karakterin aralığını belirtmek için bir karakter takımında tire kullanın. Köşeli parantezlerdeki karakter aralığının sıralaması önemli değildir.
 
 Örneğin, `[Tt]he` düzenli ifadesinin anlamı: bir büyük `T` veya küçük `t` harflerinin ardından sırasıyla `h` ve `e` harfi gelir.
 
@@ -153,7 +154,7 @@ Genellikle, şapka `^` sembolü harf öbeğinin başlangıcını temsil eder, am
 
 `*` sembolü, kendinden önce girilen eşlemenin sıfır veya daha fazla tekrarıyla eşleşir. Ama bir karakter seti ya da sınıf sonrasına girildiğinde, tüm karakter setinin tekrarlarını bulur.
 
-`a*` düzenli ifadesinin anlamı: `a` karakterinin sıfır veya daha fazla tekrarı. 
+`a*` düzenli ifadesinin anlamı: `a` karakterinin sıfır veya daha fazla tekrarı.
 `[a-z]*` düzenli ifadesinin anlamı: bir satırdaki herhangi bir sayıdaki küçük harfler.
 
 
@@ -250,7 +251,7 @@ Ayrıca karakter grubu içinde `|` meta karakterini kullanabiliriz.
 
 ## 2.6 Değişim
 
-Düzenli ifadede dik çizgi alternasyon(değişim, dönüşüm) tanımlamak için kullanılır. Alternasyon birden fazla ifade arasındaki bir koşul gibidir. Şu an, karakter grubu ve alternasyonun aynı şekilde çalıştığını düşünüyor olabilirsiniz. Ama, Karakter grubu ve alternasyon arasındaki büyük fark karakter grubu karakter düzeyinde çalışır ama alternasyon ifade düzeyinde çalışır. 
+Düzenli ifadede dik çizgi alternasyon(değişim, dönüşüm) tanımlamak için kullanılır. Alternasyon birden fazla ifade arasındaki bir koşul gibidir. Şu an, karakter grubu ve alternasyonun aynı şekilde çalıştığını düşünüyor olabilirsiniz. Ama, Karakter grubu ve alternasyon arasındaki büyük fark karakter grubu karakter düzeyinde çalışır ama alternasyon ifade düzeyinde çalışır.
 
 Örneğin, `(T|t)he|car` düzenli ifadesinin anlamı: Büyük `T` ya da küçük `t` karakteri, ardından küçük `h` karakteri, ardından küçük `e` ya da `c` karakteri, ardından küçük `a`, ardından küçük `r` karakteri gelir.
 
@@ -264,7 +265,7 @@ Düzenli ifadede dik çizgi alternasyon(değişim, dönüşüm) tanımlamak içi
 
 `\` işareti sonraki karakteri hariç tutmak için kullanılır. Bu bir semboülü ayrılmış karakterlerde `{ } [ ] / \ + * . $ ^ | ?` dahil olmak üzere eşleşen bir karakter olarak belirtmemizi sağlar. Bir özel karakteri eşleşen bir karakter olarak kullanmak için önüne `\` işareti getirin.
 
-Örneğin, `.` düzenli ifadesi yeni satır hariç herhangi bir karakteri eşleştirmek için kullanılır. 
+Örneğin, `.` düzenli ifadesi yeni satır hariç herhangi bir karakteri eşleştirmek için kullanılır.
 Bir harf öbeği içinde nokta `.` karakterini yakalamak için `.` ayrılmış karakterini hariç tutmamız gerekir. Bunun için nokta önüne `\` işaretini koymamız gereklidir.
 
 `(f|c|m)at\.?` düzenli ifadesinin anlamı: küçük `f`, `c`ya da `m` harfi, ardından küçük `a` harfi, ardından küçük `t` harfi, ardından opsiyonel `.` karakteri gelir.
@@ -277,7 +278,7 @@ Bir harf öbeği içinde nokta `.` karakterini yakalamak için `.` ayrılmış k
 
 ## 2.8 Sabitleyiciler
 
-Düzenli ifadelerde, eşleşen sembolün girilen harf öbeğinin başlangıç sembolü veya bitiş sembolü olup olmadığını kontrol etmek için sabitleyicileri kullanırız. 
+Düzenli ifadelerde, eşleşen sembolün girilen harf öbeğinin başlangıç sembolü veya bitiş sembolü olup olmadığını kontrol etmek için sabitleyicileri kullanırız.
 Sabitleyiciler iki çeşittir: İlk çeşit eşleşen karakterin girişin ilk karakteri olup olmadığını kontrol eden şapka `^` karakteri, ve ikinci çeşit eşleşen karakterin girişin son karakteri olup olmadığını kontrol eden dolar `$` karakteridir.
 
 ### 2.8.1 Şapka İşareti
@@ -285,7 +286,7 @@ Sabitleyiciler iki çeşittir: İlk çeşit eşleşen karakterin girişin ilk ka
 Şapka `^` işareti eşleşen karakterin giriş harf öbeğinin ilk karakteri olup olmadığını kontrol etmek için kullanılır.
 Eğer `^a` düzenli ifadesini `abc` harf öbeğine uygularsak `a` ile eşleşir. Ama `^b` ifadesini uygularsak bir eşleşme bulamayız. Bunun nedeni `abc` harf öbeğinde `b` karakterinin başlangıç karakteri olmamasıdır.
 
-Bir başka örnek üzerinden ilerlersek, 
+Bir başka örnek üzerinden ilerlersek,
 
 `^(T|t)he` düzenli ifadesinin anlamı: büyük `T` ya da `t` karakteri giriş harf öbeğinin ilk karakteri olmak üzere, ardından küçük `h`, ardından küçük `e` karakteri gelir.
 
diff --git a/README.md b/README.md
index 4e1a6f3..e35e3ff 100644
--- a/README.md
+++ b/README.md
@@ -9,7 +9,7 @@
 * [Español](README-es.md)
 * [Français](README-fr.md)
 * [Português do Brasil](README-pt_BR.md)
-* [中文版](README-cn.md)
+* [中文(简体)版](README-cn.md)
 * [日本語](README-ja.md)
 * [한국어](README-ko.md)
 * [Turkish](README-tr.md)

From 88c778f971d04bf70a11bdeb03ebbcab36a9ead2 Mon Sep 17 00:00:00 2001
From: Jovibor 
Date: Sun, 11 Aug 2019 20:36:53 +1000
Subject: [PATCH 02/24] Fixed typos

---
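Reviewer note: the last hunk of this patch rewords the paragraph that describes the lookbehind `(?<=\$)[0-9\.]*` applied to `$4.44 and $10.88`. A minimal sketch for sanity-checking that description is below. It assumes Python's `re` module, which the README does not prescribe, and the variable names are illustrative only.

```python
import re

# Pattern and input string quoted from the README paragraph edited in this patch.
pattern = r"(?<=\$)[0-9\.]*"
text = "$4.44 and $10.88"

# The lookbehind requires a "$" immediately before the match,
# but the "$" itself is not part of the returned text.
prices = re.findall(pattern, text)
print(prices)
```

Under those assumptions the result should contain only the two prices without their dollar signs, which is what the reworded Russian sentence claims.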
 translations/README-ru.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/translations/README-ru.md b/translations/README-ru.md
index 8e7ccde..8f3ad11 100644
--- a/translations/README-ru.md
+++ b/translations/README-ru.md
@@ -170,7 +170,7 @@
 
 ## 2.3 Повторения
 
-Символы `+`, `*` или `?` используются для обозначения того, как сколько раз появляется какой-либо подшаблон.
+Символы `+`, `*` или `?` используются для обозначения того сколько раз появляется какой-либо подшаблон.
 Данные метасимволы могут вести себя по-разному, в зависимости от ситуации.
 
 ### 2.3.1 Звёздочка
@@ -289,7 +289,7 @@
 
 [Запустить регулярное выражение](https://regex101.com/r/Rm7Me8/1)
 
-Не запоминающиеся группы пригодиться, когда они используются в функциях поиска и замены
+Незапоминающиеся группы могут пригодиться, когда они используются в функциях поиска и замены,
 или в сочетании со скобочными группами, например, для предпросмотра при создании скобочной группы или другого вида выходных данных,
 смотрите также [4. Опережающие и ретроспективные проверки](#4-опережающие-и-ретроспективные-проверки).
 
@@ -392,8 +392,8 @@
 
 Опережающие и ретроспективные проверки (в английской литературе lookbehind, lookahead) это особый вид
 ***не запоминающих скобочных групп*** (находящих совпадения, но не добавляющих в массив).
-Данные проверки используются, мы знаем, что шаблон предшествует или сопровождается другим шаблоном.
-Например, мы хотим получить получить цену в долларах `$`, из следующей входной строки
+Данные проверки используются когда мы знаем, что шаблон предшествует или сопровождается другим шаблоном.
+Например, мы хотим получить цену в долларах `$` из следующей входной строки
 `$4.44 and $10.88`. Для этого используем следующее регулярное выражение `(?<=\$)[0-9\.]*`, означающее
 получение всех дробных (с точкой `.`) цифр, которым предшествует знак доллара `$`. Существуют
 следующие виды проверок:

From 807c7071aaee4cb4ed084b9afd5cb5ba7b1c0717 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E8=B0=AD=E4=B9=9D=E9=BC=8E?= <109224573@qq.com>
Date: Mon, 7 Oct 2019 11:17:00 +0800
Subject: [PATCH 03/24] Update zh-CN

---
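Reviewer note: both hunks correct how the quantifiers are described (`a*` accepts zero repetitions; `c.+t` needs at least one character between `c` and `t`). The sketch below checks the corrected wording. It assumes Python's `re` module, and the variable names are hypothetical, not taken from the README.

```python
import re

# `a*` allows zero repetitions, so it matches even the empty string.
empty_ok = re.fullmatch(r"a*", "") is not None
run_ok = re.fullmatch(r"a*", "aaa") is not None

# `c.+t` requires at least one character between `c` and `t`.
ct_match = re.fullmatch(r"c.+t", "ct")    # nothing between c and t
cat_match = re.fullmatch(r"c.+t", "cat")  # one character between c and t

# `+` is greedy: starting at the first `c`, it extends to the last `t`.
sentence = "The fat cat sat on the mat."
longest = re.search(r"c.+t", sentence).group()
print(empty_ok, run_ok, ct_match, cat_match, longest)
```

The `ct` case is the one the old wording ("any number of characters") got wrong, so it is the most useful check when reviewing the new text.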
 translations/README-cn.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/translations/README-cn.md b/translations/README-cn.md
index 6cce5de..c885630 100644
--- a/translations/README-cn.md
+++ b/translations/README-cn.md
@@ -179,7 +179,7 @@
 ### 2.3.1 `*` 号
 
 `*`号匹配 在`*`之前的字符出现`大于等于0`次.
-例如, 表达式 `a*` 匹配以0或更多个a开头的字符, 因为有0个这个条件, 其实也就匹配了所有的字符. 表达式`[a-z]*` 匹配一个行中所有以小写字母开头的字符串.
+例如, 表达式 `a*` 匹配0或更多个以a开头的字符. 表达式`[a-z]*` 匹配一个行中所有以小写字母开头的字符串.
 
 
 "[a-z]*" => The car parked in the garage #21.
@@ -199,7 +199,7 @@
 ### 2.3.2 `+` 号
 
 `+`号匹配`+`号之前的字符出现 >=1 次.
-例如表达式`c.+t` 匹配以首字母`c`开头以`t`结尾,中间跟着任意个字符的字符串.
+例如表达式`c.+t` 匹配以首字母`c`开头以`t`结尾,中间跟着至少一个字符的字符串.
 
 
 "c.+t" => The fat cat sat on the mat.

From 73403e6fdfa9748ddfae845372ee78fa2f051afc Mon Sep 17 00:00:00 2001
From: Caleb Mazalevskis 
Date: Sun, 13 Oct 2019 21:04:51 +0800
Subject: [PATCH 04/24] Fix typo.

(See: https://www.diccionariodedudas.com/remplazar-o-reemplazar/)
---
 translations/README-es.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/translations/README-es.md b/translations/README-es.md
index dbb8f29..da6d3e1 100644
--- a/translations/README-es.md
+++ b/translations/README-es.md
@@ -34,7 +34,7 @@
 ## Qué es una expresión regular?
 > Una expresión regular es un grupo de caracteres o símbolos, los cuales son usados para buscar un patrón específico dentro de un texto.
 
-Una expresión regular es un patrón que que se compara con una cadena de caracteres de izquierda a derecha. La palabra "expresión regular" puede también ser escrita como "Regex" o "Regexp". Las expresiones regulares se utilizan para remplazar un texto dentro de una cadena de caracteres (*string*), validar formularios, extraer una porción de una cadena de caracteres (*substring*) basado en la coincidencia de una patrón, y muchas cosas más.
+Una expresión regular es un patrón que que se compara con una cadena de caracteres de izquierda a derecha. La palabra "expresión regular" puede también ser escrita como "Regex" o "Regexp". Las expresiones regulares se utilizan para reemplazar un texto dentro de una cadena de caracteres (*string*), validar formularios, extraer una porción de una cadena de caracteres (*substring*) basado en la coincidencia de una patrón, y muchas cosas más.
 
 Imagina que estás escribiendo una aplicación y quieres agregar reglas para cuando el usuario elija su nombre de usuario. Nosotros queremos permitir que el nombre de usuario contenga letras, números, guión bajo (raya), y guión medio. También queremos limitar el número de caracteres en el nombre de usuario para que no se vea feo. Para ello usamos la siguiente expresión regular para validar el nombre de usuario.
 

From 65c7e3ee7e81c20c804955723c343c0218cb3575 Mon Sep 17 00:00:00 2001
From: Roberto Ruccia 
Date: Tue, 8 Oct 2019 15:52:51 +0200
Subject: [PATCH 05/24] Fix typos

---
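Reviewer note: the hunks below touch the `{2,3}` quantifier paragraph and the greedy vs lazy section. A small sketch of both behaviours follows. It assumes Python's `re` module, and the lazy pattern `(.*?at)` is my assumption of what the section means by adding `?`; neither is stated in this patch.

```python
import re

# Quantifier example from the README: runs of two to three digits.
number_text = "The number was 9.9997 but we rounded it off to 10.0."
digits = re.findall(r"[0-9]{2,3}", number_text)

# Greedy vs lazy on the sentence used in section 6.
sentence = "The fat cat sat on the mat. "
greedy = re.search(r"(.*at)", sentence).group(1)   # as long as possible
lazy = re.search(r"(.*?at)", sentence).group(1)    # as short as possible
print(digits, greedy, lazy, sep="\n")
```

The greedy form runs to the last `at` in the sentence, while the lazy form stops at the first one, which is the distinction the reworded sentences describe.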
 README.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index 3a5c055..23a2b86 100644
--- a/README.md
+++ b/README.md
@@ -252,7 +252,7 @@ character `e`.
 In regular expression braces that are also called quantifiers are used to
 specify the number of times that a character or a group of characters can be
 repeated. For example, the regular expression `[0-9]{2,3}` means: Match at least
-2 digits but not more than 3 ( characters in the range of 0 to 9).
+2 digits but not more than 3 (characters in the range of 0 to 9).
 
 
 "[0-9]{2,3}" => The number was 9.9997 but we rounded it off to 10.0.
@@ -421,7 +421,7 @@ shorthand character sets are as follows:
 ## 4. Lookaround
 
 Lookbehind and lookahead (also called lookaround) are specific types of
-***non-capturing groups*** (Used to match the pattern but not included in matching
+***non-capturing groups*** (used to match the pattern but not included in matching
 list). Lookarounds are used when we have the condition that this pattern is
 preceded or followed by another certain pattern. For example, we want to get all
 numbers that are preceded by `$` character from the following input string
@@ -578,8 +578,8 @@ at the end of each line in a string.
 [Test the regular expression](https://regex101.com/r/E88WE2/1)
 
 ## 6. Greedy vs lazy matching
-By default regex will do greedy matching , means it will match as long as
-possible. we can use `?` to match in lazy way means as short as possible
+By default, a regex does greedy matching, which means it matches as much as
+possible. We can use `?` to match lazily, which means matching as little as possible.
 
 
 "/(.*at)/" => The fat cat sat on the mat. 

From 5e624ad0fe3dde51e9350da814e4d779575bbd2a Mon Sep 17 00:00:00 2001
From: chenyuheng
Date: Fri, 18 Oct 2019 00:41:44 +0800
Subject: [PATCH 06/24] update the Chinese text punctuations to fullwidth marks

reference: https://en.wikipedia.org/wiki/Chinese_punctuation#Marks_similar_to_European_punctuation
---
 translations/README-cn.md | 217 +++++++++++++++++++-------------------
 1 file changed, 108 insertions(+), 109 deletions(-)

diff --git a/translations/README-cn.md b/translations/README-cn.md
index c885630..06c2a36 100644
--- a/translations/README-cn.md
+++ b/translations/README-cn.md
@@ -15,7 +15,7 @@

-## 翻译:
+## 翻译:
 * [English](../README.md)
 * [Español](../translations/README-es.md)
@@ -31,25 +31,25 @@
 * [Русский](../translations/README-ru.md)
 * [Tiếng Việt](../translations/README-vn.md)
-## 什么是正则表达式?
+## 什么是正则表达式?
-> 正则表达式是一组由字母和符号组成的特殊文本, 它可以用来从文本中找出满足你想要的格式的句子.
+> 正则表达式是一组由字母和符号组成的特殊文本,它可以用来从文本中找出满足你想要的格式的句子。
+一个正则表达式是在一个主体字符串中从左到右匹配字符串时的一种样式。
+“Regular expression”这个词比较拗口,我们常使用缩写的术语“regex”或“regexp”。
+正则表达式可以从一个基础字符串中根据一定的匹配模式替换文本中的字符串、验证表单、提取字符串等等。
-一个正则表达式是在一个主体字符串中从左到右匹配字符串时的一种样式.
-"Regular expression"这个词比较拗口, 我们常使用缩写的术语"regex"或"regexp".
-正则表达式可以从一个基础字符串中根据一定的匹配模式替换文本中的字符串、验证表单、提取字符串等等.
-想象你正在写一个应用, 然后你想设定一个用户命名的规则, 让用户名包含字符,数字,下划线和连字符,以及限制字符的个数,好让名字看起来没那么丑.
-我们使用以下正则表达式来验证一个用户名:
+想象你正在写一个应用,然后你想设定一个用户命名的规则,让用户名包含字符、数字、下划线和连字符,以及限制字符的个数,好让名字看起来没那么丑。
+我们使用以下正则表达式来验证一个用户名:

+

Regular expression

-以上的正则表达式可以接受 `john_doe`, `jo-hn_doe`, `john12_as`. -但不匹配`Jo`, 因为它包含了大写的字母而且太短了. +以上的正则表达式可以接受 `john_doe`、`jo-hn_doe`、`john12_as`。 +但不匹配`Jo`,因为它包含了大写的字母而且太短了。 目录 ================= @@ -77,17 +77,17 @@ * [4.3 ?<= ... 正后发断言](#43---正后发断言) * [4.4 ?<!... 负后发断言](#44--负后发断言) * [5. 标志](#5-标志) - * [5.1 忽略大小写 (Case Insensitive)](#51-忽略大小写-case-insensitive) - * [5.2 全局搜索 (Global search)](#52-全局搜索-global-search) - * [5.3 多行修饰符 (Multiline)](#53-多行修饰符-multiline) + * [5.1 忽略大小写(Case Insensitive)](#51-忽略大小写-case-insensitive) + * [5.2 全局搜索(Global search)](#52-全局搜索-global-search) + * [5.3 多行修饰符(Multiline)](#53-多行修饰符-multiline) * [额外补充](#额外补充) * [贡献](#贡献) * [许可证](#许可证) ## 1. 基本匹配 -正则表达式其实就是在执行搜索时的格式, 它由一些字母和数字组合而成. -例如: 一个正则表达式 `the`, 它表示一个规则: 由字母`t`开始,接着是`h`,再接着是`e`. +正则表达式其实就是在执行搜索时的格式,它由一些字母和数字组合而成。 +例如:一个正则表达式 `the`,它表示一个规则:由字母`t`开始,接着是`h`,再接着是`e`。
 "the" => The fat cat sat on the mat.
@@ -95,9 +95,9 @@
 
 [在线练习](https://regex101.com/r/dmRygT/1)
 
-正则表达式`123`匹配字符串`123`. 它逐个字符的与输入的正则表达式做比较.
+正则表达式`123`匹配字符串`123`。它逐个字符的与输入的正则表达式做比较。
 
-正则表达式是大小写敏感的, 所以`The`不会匹配`the`.
+正则表达式是大小写敏感的,所以`The`不会匹配`the`。
 
 
 "The" => The fat cat sat on the mat.
@@ -107,29 +107,29 @@
 
 ## 2. 元字符
 
-正则表达式主要依赖于元字符.
-元字符不代表他们本身的字面意思, 他们都有特殊的含义. 一些元字符写在方括号中的时候有一些特殊的意思. 以下是一些元字符的介绍:
+正则表达式主要依赖于元字符。
+元字符不代表他们本身的字面意思,他们都有特殊的含义。一些元字符写在方括号中的时候有一些特殊的意思。以下是一些元字符的介绍:
 
 |元字符|描述|
 |:----:|----|
-|.|句号匹配任意单个字符除了换行符.|
-|[ ]|字符种类. 匹配方括号内的任意字符.|
-|[^ ]|否定的字符种类. 匹配除了方括号里的任意字符|
-|*|匹配>=0个重复的在*号之前的字符.|
-|+|匹配>=1个重复的+号前的字符.
+|.|句号匹配任意单个字符除了换行符。|
+|[ ]|字符种类。匹配方括号内的任意字符。|
+|[^ ]|否定的字符种类。匹配除了方括号里的任意字符|
+|*|匹配>=0个重复的在*号之前的字符。|
+|+|匹配>=1个重复的+号前的字符。
 |?|标记?之前的字符为可选.|
 |{n,m}|匹配num个大括号之前的字符 (n <= num <= m).|
-|(xyz)|字符集, 匹配与 xyz 完全相等的字符串.|
-|||或运算符,匹配符号前或后的字符.|
+|(xyz)|字符集,匹配与 xyz 完全相等的字符串.|
+|||或运算符,匹配符号前或后的字符.|
 |\|转义字符,用于匹配一些保留的字符 [ ] ( ) { } . * + ? ^ $ \ ||
 |^|从开始行开始匹配.|
 |$|从末端开始匹配.|
 
 ## 2.1 点运算符 `.`
 
-`.`是元字符中最简单的例子.
-`.`匹配任意单个字符, 但不匹配换行符.
-例如, 表达式`.ar`匹配一个任意字符后面跟着是`a`和`r`的字符串.
+`.`是元字符中最简单的例子。
+`.`匹配任意单个字符,但不匹配换行符。
+例如,表达式`.ar`匹配一个任意字符后面跟着是`a`和`r`的字符串。
 
 
 ".ar" => The car parked in the garage.
@@ -139,11 +139,11 @@
 
 ## 2.2 字符集
 
-字符集也叫做字符类.
-方括号用来指定一个字符集.
-在方括号中使用连字符来指定字符集的范围.
-在方括号中的字符集不关心顺序.
-例如, 表达式`[Tt]he` 匹配 `the` 和 `The`.
+字符集也叫做字符类。
+方括号用来指定一个字符集。
+在方括号中使用连字符来指定字符集的范围。
+在方括号中的字符集不关心顺序。
+例如,表达式`[Tt]he` 匹配 `the` 和 `The`。
 
 
 "[Tt]he" => The car parked in the garage.
@@ -151,7 +151,7 @@
 
 [在线练习](https://regex101.com/r/2ITLQ4/1)
 
-方括号的句号就表示句号.
+方括号的句号就表示句号。
 表达式 `ar[.]` 匹配 `ar.`字符串
 
 
@@ -162,8 +162,8 @@
 
 ### 2.2.1 否定字符集
 
-一般来说 `^` 表示一个字符串的开头, 但它用在一个方括号的开头的时候, 它表示这个字符集是否定的.
-例如, 表达式`[^c]ar` 匹配一个后面跟着`ar`的除了`c`的任意字符.
+一般来说 `^` 表示一个字符串的开头,但它用在一个方括号的开头的时候,它表示这个字符集是否定的。
+例如,表达式`[^c]ar` 匹配一个后面跟着`ar`的除了`c`的任意字符。
 
 
 "[^c]ar" => The car parked in the garage.
@@ -173,13 +173,13 @@
 
 ## 2.3 重复次数
 
-后面跟着元字符 `+`, `*` or `?` 的, 用来指定匹配子模式的次数.
-这些元字符在不同的情况下有着不同的意思.
+后面跟着元字符 `+`、`*` 或 `?` 的,用来指定匹配子模式的次数。
+这些元字符在不同的情况下有着不同的意思。
 
 ### 2.3.1 `*` 号
 
-`*`号匹配 在`*`之前的字符出现`大于等于0`次.
-例如, 表达式 `a*` 匹配0或更多个以a开头的字符. 表达式`[a-z]*` 匹配一个行中所有以小写字母开头的字符串.
+`*`号匹配 在`*`之前的字符出现`大于等于0`次。
+例如,表达式 `a*` 匹配0或更多个以a开头的字符。表达式`[a-z]*` 匹配一个行中所有以小写字母开头的字符串。
 
 
 "[a-z]*" => The car parked in the garage #21.
@@ -187,8 +187,8 @@
 
 [在线练习](https://regex101.com/r/7m8me5/1)
 
-`*`字符和`.`字符搭配可以匹配所有的字符`.*`.
-`*`和表示匹配空格的符号`\s`连起来用, 如表达式`\s*cat\s*`匹配0或更多个空格开头和0或更多个空格结尾的cat字符串.
+`*`字符和`.`字符搭配可以匹配所有的字符`.*`。
+`*`和表示匹配空格的符号`\s`连起来用,如表达式`\s*cat\s*`匹配0或更多个空格开头和0或更多个空格结尾的cat字符串。
 
 
 "\s*cat\s*" => The fat cat sat on the concatenation.
@@ -198,8 +198,8 @@
 
 ### 2.3.2 `+` 号
 
-`+`号匹配`+`号之前的字符出现 >=1 次.
-例如表达式`c.+t` 匹配以首字母`c`开头以`t`结尾,中间跟着至少一个字符的字符串.
+`+`号匹配`+`号之前的字符出现 >=1 次。
+例如表达式`c.+t` 匹配以首字母`c`开头以`t`结尾,中间跟着至少一个字符的字符串。
 
 
 "c.+t" => The fat cat sat on the mat.
@@ -209,8 +209,8 @@
 
 ### 2.3.3 `?` 号
 
-在正则表达式中元字符 `?` 标记在符号前面的字符为可选, 即出现 0 或 1 次.
-例如, 表达式 `[T]?he` 匹配字符串 `he` 和 `The`.
+在正则表达式中元字符 `?` 标记在符号前面的字符为可选,即出现 0 或 1 次。
+例如,表达式 `[T]?he` 匹配字符串 `he` 和 `The`。
 
 
 "[T]he" => The car is parked in the garage.
@@ -226,8 +226,8 @@
 
 ## 2.4 `{}` 号
 
-在正则表达式中 `{}` 是一个量词, 常用来一个或一组字符可以重复出现的次数.
-例如,  表达式 `[0-9]{2,3}` 匹配最少 2 位最多 3 位 0~9 的数字.
+在正则表达式中 `{}` 是一个量词,常用来限定一个或一组字符可以重复出现的次数。
+例如,表达式 `[0-9]{2,3}` 匹配最少 2 位最多 3 位 0~9 的数字。
 
 
 "[0-9]{2,3}" => The number was 9.9997 but we rounded it off to 10.0.
@@ -235,8 +235,8 @@
 
 [在线练习](https://regex101.com/r/juM86s/1)
 
-我们可以省略第二个参数.
-例如, `[0-9]{2,}` 匹配至少两位 0~9 的数字.
+我们可以省略第二个参数。
+例如,`[0-9]{2,}` 匹配至少两位 0~9 的数字。
 
 
 "[0-9]{2,}" => The number was 9.9997 but we rounded it off to 10.0.
@@ -244,8 +244,8 @@
 
 [在线练习](https://regex101.com/r/Gdy4w5/1)
 
-如果逗号也省略掉则表示重复固定的次数.
-例如, `[0-9]{3}` 匹配3位数字
+如果逗号也省略掉则表示重复固定的次数。
+例如,`[0-9]{3}` 匹配3位数字。
 
 
 "[0-9]{3}" => The number was 9.9997 but we rounded it off to 10.0.
@@ -255,9 +255,9 @@
 
 ## 2.5 `(...)` 特征标群
 
-特征标群是一组写在 `(...)` 中的子模式. 例如之前说的 `{}` 是用来表示前面一个字符出现指定次数. 但如果在 `{}` 前加入特征标群则表示整个标群内的字符重复 N 次. 例如, 表达式 `(ab)*` 匹配连续出现 0 或更多个 `ab`.
+特征标群是一组写在 `(...)` 中的子模式。例如之前说的 `{}` 是用来表示前面一个字符出现指定次数。但如果在 `{}` 前加入特征标群则表示整个标群内的字符重复 N 次。例如,表达式 `(ab)*` 匹配连续出现 0 或更多个 `ab`。
 
-我们还可以在 `()` 中用或字符 `|` 表示或. 例如, `(c|g|p)ar` 匹配 `car` 或 `gar` 或 `par`.
+我们还可以在 `()` 中用或字符 `|` 表示或。例如,`(c|g|p)ar` 匹配 `car` 或 `gar` 或 `par`。
 
 
 "(c|g|p)ar" => The car is parked in the garage.
@@ -267,9 +267,9 @@
 
 ## 2.6 `|` 或运算符
 
-或运算符就表示或, 用作判断条件.
+或运算符就表示或,用作判断条件。
 
-例如 `(T|t)he|car` 匹配 `(T|t)he` 或 `car`.
+例如 `(T|t)he|car` 匹配 `(T|t)he` 或 `car`。
 
 
 "(T|t)he|car" => The car is parked in the garage.
@@ -279,9 +279,9 @@
 
 ## 2.7 转码特殊字符
 
-反斜线 `\` 在表达式中用于转码紧跟其后的字符. 用于指定 `{ } [ ] / \ + * . $ ^ | ?` 这些特殊字符. 如果想要匹配这些特殊字符则要在其前面加上反斜线 `\`.
+反斜线 `\` 在表达式中用于转码紧跟其后的字符。用于指定 `{ } [ ] / \ + * . $ ^ | ?` 这些特殊字符。如果想要匹配这些特殊字符则要在其前面加上反斜线 `\`。
 
-例如 `.` 是用来匹配除换行符外的所有字符的. 如果想要匹配句子中的 `.` 则要写成 `\.` 以下这个例子 `\.?`是选择性匹配`.`
+例如 `.` 是用来匹配除换行符外的所有字符的。如果想要匹配句子中的 `.` 则要写成 `\.`。以下这个例子 `\.?` 是选择性匹配 `.`。
 
 
 "(f|c|m)at\.?" => The fat cat sat on the mat.
@@ -291,15 +291,15 @@
 
 ## 2.8 锚点
 
-在正则表达式中, 想要匹配指定开头或结尾的字符串就要使用到锚点. `^` 指定开头, `$` 指定结尾.
+在正则表达式中,想要匹配指定开头或结尾的字符串就要使用到锚点。`^` 指定开头,`$` 指定结尾。
 
 ### 2.8.1 `^` 号
 
-`^` 用来检查匹配的字符串是否在所匹配字符串的开头.
+`^` 用来检查匹配的字符串是否在所匹配字符串的开头。
 
-例如, 在 `abc` 中使用表达式 `^a` 会得到结果 `a`. 但如果使用 `^b` 将匹配不到任何结果. 因为在字符串 `abc` 中并不是以 `b` 开头.
+例如,在 `abc` 中使用表达式 `^a` 会得到结果 `a`。但如果使用 `^b` 将匹配不到任何结果。因为在字符串 `abc` 中并不是以 `b` 开头。
 
-例如, `^(T|t)he` 匹配以 `The` 或 `the` 开头的字符串.
+例如,`^(T|t)he` 匹配以 `The` 或 `the` 开头的字符串。
 
 
 "(T|t)he" => The car is parked in the garage.
@@ -315,9 +315,9 @@
 
 ### 2.8.2 `$` 号
 
-同理于 `^` 号, `$` 号用来匹配字符是否是最后一个.
+同理于 `^` 号,`$` 号用来匹配字符是否是最后一个。
 
-例如, `(at\.)$` 匹配以 `at.` 结尾的字符串.
+例如,`(at\.)$` 匹配以 `at.` 结尾的字符串。
 
 
 "(at\.)" => The fat cat. sat. on the mat.
@@ -333,33 +333,33 @@
 
 ##  3. 简写字符集
 
-正则表达式提供一些常用的字符集简写. 如下:
+正则表达式提供一些常用的字符集简写。如下:
 
 |简写|描述|
 |:----:|----|
 |.|除换行符外的所有字符|
-|\w|匹配所有字母数字, 等同于 `[a-zA-Z0-9_]`|
-|\W|匹配所有非字母数字, 即符号, 等同于: `[^\w]`|
-|\d|匹配数字: `[0-9]`|
-|\D|匹配非数字: `[^\d]`|
-|\s|匹配所有空格字符, 等同于: `[\t\n\f\r\p{Z}]`|
-|\S|匹配所有非空格字符: `[^\s]`|
+|\w|匹配所有字母数字,等同于 `[a-zA-Z0-9_]`|
+|\W|匹配所有非字母数字,即符号,等同于: `[^\w]`|
+|\d|匹配数字: `[0-9]`|
+|\D|匹配非数字: `[^\d]`|
+|\s|匹配所有空格字符,等同于: `[\t\n\f\r\p{Z}]`|
+|\S|匹配所有非空格字符: `[^\s]`|
 |\f|匹配一个换页符|
 |\n|匹配一个换行符|
 |\r|匹配一个回车符|
 |\t|匹配一个制表符|
 |\v|匹配一个垂直制表符|
-|\p|匹配 CR/LF (等同于 `\r\n`),用来匹配 DOS 行终止符|
+|\p|匹配 CR/LF(等同于 `\r\n`),用来匹配 DOS 行终止符|
 
-## 4. 零宽度断言(前后预查)
+## 4. 零宽度断言(前后预查)
 
-先行断言和后发断言都属于**非捕获簇**(不捕获文本 ,也不针对组合计进行计数).
-先行断言用于判断所匹配的格式是否在另一个确定的格式之前, 匹配结果不包含该确定格式(仅作为约束).
+先行断言和后发断言都属于**非捕获簇**(不捕获文本,也不针对组合计进行计数)。
+先行断言用于判断所匹配的格式是否在另一个确定的格式之前,匹配结果不包含该确定格式(仅作为约束)。
 
-例如, 我们想要获得所有跟在 `$` 符号后的数字, 我们可以使用正后发断言 `(?<=\$)[0-9\.]*`.
-这个表达式匹配 `$` 开头, 之后跟着 `0,1,2,3,4,5,6,7,8,9,.` 这些字符可以出现大于等于 0 次.
+例如,我们想要获得所有跟在 `$` 符号后的数字,我们可以使用正后发断言 `(?<=\$)[0-9\.]*`。
+这个表达式匹配 `$` 开头,之后跟着 `0,1,2,3,4,5,6,7,8,9,.` 这些字符可以出现大于等于 0 次。
 
-零宽度断言如下:
+零宽度断言如下:
 
 |符号|描述|
 |:----:|----|
@@ -370,13 +370,13 @@
 
 ### 4.1 `?=...` 正先行断言
 
-`?=...` 正先行断言, 表示第一部分表达式之后必须跟着 `?=...`定义的表达式.
+`?=...` 正先行断言,表示第一部分表达式之后必须跟着 `?=...`定义的表达式。
 
-返回结果只包含满足匹配条件的第一部分表达式.
-定义一个正先行断言要使用 `()`. 在括号内部使用一个问号和等号: `(?=...)`. 
+返回结果只包含满足匹配条件的第一部分表达式。
+定义一个正先行断言要使用 `()`。在括号内部使用一个问号和等号: `(?=...)`。
 
-正先行断言的内容写在括号中的等号后面.
-例如, 表达式 `(T|t)he(?=\sfat)` 匹配 `The` 和 `the`, 在括号中我们又定义了正先行断言 `(?=\sfat)` ,即 `The` 和 `the` 后面紧跟着 `(空格)fat`.
+正先行断言的内容写在括号中的等号后面。
+例如,表达式 `(T|t)he(?=\sfat)` 匹配 `The` 和 `the`,在括号中我们又定义了正先行断言 `(?=\sfat)`,即 `The` 和 `the` 后面紧跟着 `(空格)fat`。
 
 
 "(T|t)he(?=\sfat)" => The fat cat sat on the mat.
@@ -386,10 +386,10 @@
 
 ### 4.2 `?!...` 负先行断言
 
-负先行断言 `?!` 用于筛选所有匹配结果, 筛选条件为 其后不跟随着断言中定义的格式.
-`正先行断言`  定义和 `负先行断言` 一样, 区别就是 `=` 替换成 `!` 也就是 `(?!...)`.
+负先行断言 `?!` 用于筛选所有匹配结果,筛选条件为 其后不跟随着断言中定义的格式。
+`正先行断言` 定义和 `负先行断言` 一样,区别就是 `=` 替换成 `!` 也就是 `(?!...)`。
 
-表达式 `(T|t)he(?!\sfat)` 匹配 `The` 和 `the`, 且其后不跟着 `(空格)fat`.
+表达式 `(T|t)he(?!\sfat)` 匹配 `The` 和 `the`,且其后不跟着 `(空格)fat`。
 
 
 "(T|t)he(?!\sfat)" => The fat cat sat on the mat.
@@ -399,8 +399,8 @@
 
 ### 4.3 `?<= ...` 正后发断言
 
-正后发断言 记作`(?<=...)` 用于筛选所有匹配结果, 筛选条件为 其前跟随着断言中定义的格式.
-例如, 表达式 `(?<=(T|t)he\s)(fat|mat)` 匹配 `fat` 和 `mat`, 且其前跟着 `The` 或 `the`.
+正后发断言 记作`(?<=...)` 用于筛选所有匹配结果,筛选条件为 其前跟随着断言中定义的格式。
+例如,表达式 `(?<=(T|t)he\s)(fat|mat)` 匹配 `fat` 和 `mat`,且其前跟着 `The` 或 `the`。
 
 
 "(?<=(T|t)he\s)(fat|mat)" => The fat cat sat on the mat.
@@ -410,8 +410,8 @@
 
 ### 4.4 `?
 "(?<!(T|t)he\s)(cat)" => The cat sat on cat.
@@ -421,19 +421,19 @@
 
 ## 5. 标志
 
-标志也叫模式修正符, 因为它可以用来修改表达式的搜索结果.
-这些标志可以任意的组合使用, 它也是整个正则表达式的一部分.
+标志也叫模式修正符,因为它可以用来修改表达式的搜索结果。
+这些标志可以任意的组合使用,它也是整个正则表达式的一部分。
 
 |标志|描述|
 |:----:|----|
-|i|忽略大小写.|
-|g|全局搜索.|
-|m|多行的: 锚点元字符 `^` `$` 工作范围在每行的起始.|
+|i|忽略大小写。|
+|g|全局搜索。|
+|m|多行修饰符:锚点元字符 `^` `$` 工作范围在每行的起始。|
 
-### 5.1 忽略大小写 (Case Insensitive)
+### 5.1 忽略大小写(Case Insensitive)
 
-修饰语 `i` 用于忽略大小写.
-例如, 表达式 `/The/gi` 表示在全局搜索 `The`, 在后面的 `i` 将其条件修改为忽略大小写, 则变成搜索 `the` 和 `The`, `g` 表示全局搜索.
+修饰语 `i` 用于忽略大小写。
+例如,表达式 `/The/gi` 表示在全局搜索 `The`,在后面的 `i` 将其条件修改为忽略大小写,则变成搜索 `the` 和 `The`,`g` 表示全局搜索。
 
 
 "The" => The fat cat sat on the mat.
@@ -447,10 +447,10 @@
 
 [在线练习](https://regex101.com/r/ahfiuh/1)
 
-### 5.2 全局搜索 (Global search)
+### 5.2 全局搜索(Global search)
 
-修饰符 `g` 常用于执行一个全局搜索匹配, 即(不仅仅返回第一个匹配的, 而是返回全部).
-例如, 表达式 `/.(at)/g` 表示搜索 任意字符(除了换行) + `at`, 并返回全部结果.
+修饰符 `g` 常用于执行一个全局搜索匹配,即(不仅仅返回第一个匹配的,而是返回全部)。
+例如,表达式 `/.(at)/g` 表示搜索 任意字符(除了换行)+ `at`,并返回全部结果。
 
 
 "/.(at)/" => The fat cat sat on the mat.
@@ -464,13 +464,13 @@
 
 [在线练习](https://regex101.com/r/dO1nef/1)
 
-### 5.3 多行修饰符 (Multiline)
+### 5.3 多行修饰符(Multiline)
 
-多行修饰符 `m` 常用于执行一个多行匹配.
+多行修饰符 `m` 常用于执行一个多行匹配。
 
-像之前介绍的 `(^,$)` 用于检查格式是否是在待检测字符串的开头或结尾. 但我们如果想要它在每行的开头和结尾生效, 我们需要用到多行修饰符 `m`.
+像之前介绍的 `(^,$)` 用于检查格式是否是在待检测字符串的开头或结尾。但我们如果想要它在每行的开头和结尾生效,我们需要用到多行修饰符 `m`。
 
-例如, 表达式 `/at(.)?$/gm` 表示小写字符 `a` 后跟小写字符 `t` , 末尾可选除换行符外任意字符. 根据 `m` 修饰符, 现在表达式匹配每行的结尾.
+例如,表达式 `/at(.)?$/gm` 表示小写字符 `a` 后跟小写字符 `t`,末尾可选除换行符外任意字符。根据 `m` 修饰符,现在表达式匹配每行的结尾。
 
 
 "/.at(.)?$/" => The fat
@@ -488,7 +488,7 @@
 
 [在线练习](https://regex101.com/r/E88WE2/1)
 
-### 6. 贪婪匹配与惰性匹配 (Greedy vs lazy matching)
+### 6. 贪婪匹配与惰性匹配(Greedy vs lazy matching)
 
 正则表达式默认采用贪婪匹配模式,在该模式下意味着会匹配尽可能长的子串。我们可以使用 `?` 将贪婪匹配模式转化为惰性匹配模式。
 
@@ -500,7 +500,6 @@
 
 "/(.*?at)/" => The fat cat sat on the mat. 
- [在线练习](https://regex101.com/r/AyAdgJ/2) ## 贡献 From fbc008255920613f55ac6d80e36c6f0ec30eeb53 Mon Sep 17 00:00:00 2001 From: Yang Jin Date: Mon, 21 Oct 2019 15:33:41 +0800 Subject: [PATCH 07/24] update part of Chinese translation --- translations/README-cn.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/translations/README-cn.md b/translations/README-cn.md index 06c2a36..eac7056 100644 --- a/translations/README-cn.md +++ b/translations/README-cn.md @@ -35,7 +35,7 @@ > 正则表达式是一组由字母和符号组成的特殊文本,它可以用来从文本中找出满足你想要的格式的句子。 -一个正则表达式是在一个主体字符串中从左到右匹配字符串时的一种样式。 +一个正则表达式是一种从左到右匹配主体字符串的模式。 “Regular expression”这个词比较拗口,我们常使用缩写的术语“regex”或“regexp”。 正则表达式可以从一个基础字符串中根据一定的匹配模式替换文本中的字符串、验证表单、提取字符串等等。 From 7e5a3f43ace034ace31beb743340bf0e09432153 Mon Sep 17 00:00:00 2001 From: Hamzeh Javadi Date: Sun, 27 Oct 2019 08:47:47 +0330 Subject: [PATCH 08/24] =?UTF-8?q?=D8=AF=D9=88=20=D9=BE=D8=A7=D8=B1=D8=A7?= =?UTF-8?q?=DA=AF=D8=B1=D8=A7=D9=81=20=D8=A2=D8=BA=D8=A7=D8=B2=20=DA=A9?= =?UTF-8?q?=D8=A7=D8=B1=DB=8C?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit به همراه لینک برگردان فارسی از تمام زبانهای موجود --- README.md | 1 + translations/README-cn.md | 1 + translations/README-es.md | 1 + translations/README-fa.md | 598 +++++++++++++++++++++++++++++++++++ translations/README-fr.md | 1 + translations/README-gr.md | 1 + translations/README-hu.md | 1 + translations/README-ja.md | 1 + translations/README-ko.md | 1 + translations/README-pl.md | 1 + translations/README-pt_BR.md | 1 + translations/README-ru.md | 1 + translations/README-tr.md | 1 + translations/README-vn.md | 1 + 14 files changed, 611 insertions(+) create mode 100644 translations/README-fa.md diff --git a/README.md b/README.md index 3a5c055..f6b409b 100644 --- a/README.md +++ b/README.md @@ -29,6 +29,7 @@ * [Polish](translations/README-pl.md) * [Русский](translations/README-ru.md) * [Tiếng Việt](translations/README-vn.md) +* [قارسی](translations/README-fa.md) ## What is 
Regular Expression? diff --git a/translations/README-cn.md b/translations/README-cn.md index eac7056..f0d89cc 100644 --- a/translations/README-cn.md +++ b/translations/README-cn.md @@ -30,6 +30,7 @@ * [Polish](../translations/README-pl.md) * [Русский](../translations/README-ru.md) * [Tiếng Việt](../translations/README-vn.md) +* [قارسی](../translations/README-fa.md) ## 什么是正则表达式? diff --git a/translations/README-es.md b/translations/README-es.md index da6d3e1..782d94e 100644 --- a/translations/README-es.md +++ b/translations/README-es.md @@ -30,6 +30,7 @@ * [Polish](../translations/README-pl.md) * [Русский](../translations/README-ru.md) * [Tiếng Việt](../translations/README-vn.md) +* [قارسی](../translations/README-fa.md) ## Qué es una expresión regular? > Una expresión regular es un grupo de caracteres o símbolos, los cuales son usados para buscar un patrón específico dentro de un texto. diff --git a/translations/README-fa.md b/translations/README-fa.md new file mode 100644 index 0000000..8a7a98c --- /dev/null +++ b/translations/README-fa.md @@ -0,0 +1,598 @@ +

+
+ + Learn Regex + +

+

+ + + + + + +

+

+ +## برگردان ها: + +* [English](../README.md) +* [Español](../translations/README-es.md) +* [Français](../translations/README-fr.md) +* [Português do Brasil](../translations/README-pt_BR.md) +* [中文版](../translations/README-cn.md) +* [日本語](../translations/README-ja.md) +* [한국어](../translations/README-ko.md) +* [Turkish](../translations/README-tr.md) +* [Greek](../translations/README-gr.md) +* [Magyar](../translations/README-hu.md) +* [Polish](../translations/README-pl.md) +* [Русский](../translations/README-ru.md) +* [Tiếng Việt](../translations/README-vn.md) +* [قارسی](../translations/README-fa.md) + +
+## عبارت منظم چیست؟ +
+> عبارت منظم یک گروه از کارکترها یا نمادهاست که برای پیدا کردن یک الگوی مشخص در یک متن به کار گرفته می شود. + +یک عبارت منظم یک الگو است که با رشته ای خاص مطابقت دارد. عبارت منظم در اعتبار سنجی داده های ورودی فرم ها، پیدا کردن یک زیر متن در یک متن بزرگتر بر اساس یک الگوی ویژه و مواردی از این دست به کار گرفته می شود. عبارت "Regular expression" کمی ثقیل است، پس معمولا بیشتر مخفف آن - "regex" یا "regexp" - را به کار می برند. + +فرض کنید یک برنامه نوشته اید و می خواهید قوانینی برای گزینش نام کاربری برای کاربران بگذارید. می خواهیم اجازه دهیم که نام کاربری شامل حروف، اعداد، خط زیر و خط فاصله باشد. همچنین می خواهیم تعداد مشخصه ها یا همان کارکترها در نام کاربری را محدود کنیم. ما از چنین عبارت منظمی برای اعتبار سنجی نام کاربری استفاده می کنیم: + +

+

+ Regular expression +

+ +عبارت منظم به کار رفته در اینجا رشته `john_doe` و `jo-hn_doe` و `john12_as` می پذیرد ولی `Jo` را به دلیل کوتاه بودن بیش از حد و همچنین به کار بردن حروف بزرگ نمی پذیرد. + +## Table of Contents + +- [Basic Matchers](#1-basic-matchers) +- [Meta character](#2-meta-characters) + - [Full stop](#21-full-stop) + - [Character set](#22-character-set) + - [Negated character set](#221-negated-character-set) + - [Repetitions](#23-repetitions) + - [The Star](#231-the-star) + - [The Plus](#232-the-plus) + - [The Question Mark](#233-the-question-mark) + - [Braces](#24-braces) + - [Character Group](#25-character-group) + - [Alternation](#26-alternation) + - [Escaping special character](#27-escaping-special-character) + - [Anchors](#28-anchors) + - [Caret](#281-caret) + - [Dollar](#282-dollar) +- [Shorthand Character Sets](#3-shorthand-character-sets) +- [Lookaround](#4-lookaround) + - [Positive Lookahead](#41-positive-lookahead) + - [Negative Lookahead](#42-negative-lookahead) + - [Positive Lookbehind](#43-positive-lookbehind) + - [Negative Lookbehind](#44-negative-lookbehind) +- [Flags](#5-flags) + - [Case Insensitive](#51-case-insensitive) + - [Global search](#52-global-search) + - [Multiline](#53-multiline) +- [Greedy vs lazy matching](#6-greedy-vs-lazy-matching) + +## 1. Basic Matchers + +A regular expression is just a pattern of characters that we use to perform +search in a text. For example, the regular expression `the` means: the letter +`t`, followed by the letter `h`, followed by the letter `e`. + +
+"the" => The fat cat sat on the mat.
+
+ +[Test the regular expression](https://regex101.com/r/dmRygT/1) + +The regular expression `123` matches the string `123`. The regular expression is +matched against an input string by comparing each character in the regular +expression to each character in the input string, one after another. Regular +expressions are normally case-sensitive so the regular expression `The` would +not match the string `the`. + +
+"The" => The fat cat sat on the mat.
+
+ +[Test the regular expression](https://regex101.com/r/1paXsy/1) + +## 2. Meta Characters + +Meta characters are the building blocks of the regular expressions. Meta +characters do not stand for themselves but instead are interpreted in some +special way. Some meta characters have a special meaning and are written inside +square brackets. The meta characters are as follows: + +|Meta character|Description| +|:----:|----| +|.|Period matches any single character except a line break.| +|[ ]|Character class. Matches any character contained between the square brackets.| +|[^ ]|Negated character class. Matches any character that is not contained between the square brackets| +|*|Matches 0 or more repetitions of the preceding symbol.| +|+|Matches 1 or more repetitions of the preceding symbol.| +|?|Makes the preceding symbol optional.| +|{n,m}|Braces. Matches at least "n" but not more than "m" repetitions of the preceding symbol.| +|(xyz)|Character group. Matches the characters xyz in that exact order.| +|||Alternation. Matches either the characters before or the characters after the symbol.| +|\|Escapes the next character. This allows you to match reserved characters [ ] ( ) { } . * + ? ^ $ \ || +|^|Matches the beginning of the input.| +|$|Matches the end of the input.| + +## 2.1 Full stop + +Full stop `.` is the simplest example of meta character. The meta character `.` +matches any single character. It will not match return or newline characters. +For example, the regular expression `.ar` means: any character, followed by the +letter `a`, followed by the letter `r`. + +
+".ar" => The car parked in the garage.
+
+ +[Test the regular expression](https://regex101.com/r/xc9GkU/1) + +## 2.2 Character set + +Character sets are also called character class. Square brackets are used to +specify character sets. Use a hyphen inside a character set to specify the +characters' range. The order of the character range inside square brackets +doesn't matter. For example, the regular expression `[Tt]he` means: an uppercase +`T` or lowercase `t`, followed by the letter `h`, followed by the letter `e`. + +
+"[Tt]he" => The car parked in the garage.
+
+ +[Test the regular expression](https://regex101.com/r/2ITLQ4/1) + +A period inside a character set, however, means a literal period. The regular +expression `ar[.]` means: a lowercase character `a`, followed by letter `r`, +followed by a period `.` character. + +
+"ar[.]" => A garage is a good place to park a car.
+
+ +[Test the regular expression](https://regex101.com/r/wL3xtE/1) + +### 2.2.1 Negated character set + +In general, the caret symbol represents the start of the string, but when it is +typed after the opening square bracket it negates the character set. For +example, the regular expression `[^c]ar` means: any character except `c`, +followed by the character `a`, followed by the letter `r`. + +
+"[^c]ar" => The car parked in the garage.
+
+ +[Test the regular expression](https://regex101.com/r/nNNlq3/1) + +## 2.3 Repetitions + +Following meta characters `+`, `*` or `?` are used to specify how many times a +subpattern can occur. These meta characters act differently in different +situations. + +### 2.3.1 The Star + +The symbol `*` matches zero or more repetitions of the preceding matcher. The +regular expression `a*` means: zero or more repetitions of preceding lowercase +character `a`. But if it appears after a character set or class then it finds +the repetitions of the whole character set. For example, the regular expression +`[a-z]*` means: any number of lowercase letters in a row. + +
+"[a-z]*" => The car parked in the garage #21.
+
+ +[Test the regular expression](https://regex101.com/r/7m8me5/1) + +The `*` symbol can be used with the meta character `.` to match any string of +characters `.*`. The `*` symbol can be used with the whitespace character `\s` +to match a string of whitespace characters. For example, the expression +`\s*cat\s*` means: zero or more spaces, followed by lowercase character `c`, +followed by lowercase character `a`, followed by lowercase character `t`, +followed by zero or more spaces. + +
+"\s*cat\s*" => The fat cat sat on the concatenation.
+
+ +[Test the regular expression](https://regex101.com/r/gGrwuz/1) + +### 2.3.2 The Plus + +The symbol `+` matches one or more repetitions of the preceding character. For +example, the regular expression `c.+t` means: lowercase letter `c`, followed by +at least one character, followed by the lowercase character `t`. It needs to be +clarified that `t` is the last `t` in the sentence. + +
+"c.+t" => The fat cat sat on the mat.
+
+ +[Test the regular expression](https://regex101.com/r/Dzf9Aa/1) + +### 2.3.3 The Question Mark + +In regular expression the meta character `?` makes the preceding character +optional. This symbol matches zero or one instance of the preceding character. +For example, the regular expression `[T]?he` means: Optional the uppercase +letter `T`, followed by the lowercase character `h`, followed by the lowercase +character `e`. + +
+"[T]he" => The car is parked in the garage.
+
+ +[Test the regular expression](https://regex101.com/r/cIg9zm/1) + +
+"[T]?he" => The car is parked in the garage.
+
+ +[Test the regular expression](https://regex101.com/r/kPpO2x/1) + +## 2.4 Braces + +In regular expression braces that are also called quantifiers are used to +specify the number of times that a character or a group of characters can be +repeated. For example, the regular expression `[0-9]{2,3}` means: Match at least +2 digits but not more than 3 ( characters in the range of 0 to 9). + +
+"[0-9]{2,3}" => The number was 9.9997 but we rounded it off to 10.0.
+
+ +[Test the regular expression](https://regex101.com/r/juM86s/1) + +We can leave out the second number. For example, the regular expression +`[0-9]{2,}` means: Match 2 or more digits. If we also remove the comma the +regular expression `[0-9]{3}` means: Match exactly 3 digits. + +
+"[0-9]{2,}" => The number was 9.9997 but we rounded it off to 10.0.
+
+ +[Test the regular expression](https://regex101.com/r/Gdy4w5/1) + +
+"[0-9]{3}" => The number was 9.9997 but we rounded it off to 10.0.
+
+ +[Test the regular expression](https://regex101.com/r/Sivu30/1) + +## 2.5 Capturing Group + +A capturing group is a group of sub-patterns that is written inside Parentheses +`(...)`. Like as we discussed before that in regular expression if we put a quantifier +after a character then it will repeat the preceding character. But if we put quantifier +after a capturing group then it repeats the whole capturing group. For example, +the regular expression `(ab)*` matches zero or more repetitions of the character +"ab". We can also use the alternation `|` meta character inside capturing group. +For example, the regular expression `(c|g|p)ar` means: lowercase character `c`, +`g` or `p`, followed by character `a`, followed by character `r`. + +
+"(c|g|p)ar" => The car is parked in the garage.
+
+ +[Test the regular expression](https://regex101.com/r/tUxrBG/1) + +Note that capturing groups do not only match but also capture the characters for use in +the parent language. The parent language could be python or javascript or virtually any +language that implements regular expressions in a function definition. + +### 2.5.1 Non-capturing group + +A non-capturing group is a capturing group that only matches the characters, but +does not capture the group. A non-capturing group is denoted by a `?` followed by a `:` +within parenthesis `(...)`. For example, the regular expression `(?:c|g|p)ar` is similar to +`(c|g|p)ar` in that it matches the same characters but will not create a capture group. + +
+"(?:c|g|p)ar" => The car is parked in the garage.
+
+ +[Test the regular expression](https://regex101.com/r/Rm7Me8/1) + +Non-capturing groups can come in handy when used in find-and-replace functionality or +when mixed with capturing groups to keep the overview when producing any other kind of output. +See also [4. Lookaround](#4-lookaround). + +## 2.6 Alternation + +In a regular expression, the vertical bar `|` is used to define alternation. +Alternation is like an OR statement between multiple expressions. Now, you may be +thinking that character set and alternation works the same way. But the big +difference between character set and alternation is that character set works on +character level but alternation works on expression level. For example, the +regular expression `(T|t)he|car` means: either (uppercase character `T` or lowercase +`t`, followed by lowercase character `h`, followed by lowercase character `e`) OR +(lowercase character `c`, followed by lowercase character `a`, followed by +lowercase character `r`). Note that I put the parentheses for clarity, to show that either expression +in parentheses can be met and it will match. + +
+"(T|t)he|car" => The car is parked in the garage.
+
+ +[Test the regular expression](https://regex101.com/r/fBXyX0/1) + +## 2.7 Escaping special character + +Backslash `\` is used in regular expression to escape the next character. This +allows us to specify a symbol as a matching character including reserved +characters `{ } [ ] / \ + * . $ ^ | ?`. To use a special character as a matching +character prepend `\` before it. + +For example, the regular expression `.` is used to match any character except +newline. Now to match `.` in an input string the regular expression +`(f|c|m)at\.?` means: lowercase letter `f`, `c` or `m`, followed by lowercase +character `a`, followed by lowercase letter `t`, followed by optional `.` +character. + +
+"(f|c|m)at\.?" => The fat cat sat on the mat.
+
+ +[Test the regular expression](https://regex101.com/r/DOc5Nu/1) + +## 2.8 Anchors + +In regular expressions, we use anchors to check if the matching symbol is the +starting symbol or ending symbol of the input string. Anchors are of two types: +First type is Caret `^` that check if the matching character is the start +character of the input and the second type is Dollar `$` that checks if matching +character is the last character of the input string. + +### 2.8.1 Caret + +Caret `^` symbol is used to check if matching character is the first character +of the input string. If we apply the following regular expression `^a` (if a is +the starting symbol) to input string `abc` it matches `a`. But if we apply +regular expression `^b` on above input string it does not match anything. +Because in input string `abc` "b" is not the starting symbol. Let's take a look +at another regular expression `^(T|t)he` which means: uppercase character `T` or +lowercase character `t` is the start symbol of the input string, followed by +lowercase character `h`, followed by lowercase character `e`. + +
+"(T|t)he" => The car is parked in the garage.
+
+ +[Test the regular expression](https://regex101.com/r/5ljjgB/1) + +
+"^(T|t)he" => The car is parked in the garage.
+
+ +[Test the regular expression](https://regex101.com/r/jXrKne/1) + +### 2.8.2 Dollar + +Dollar `$` symbol is used to check if matching character is the last character +of the input string. For example, regular expression `(at\.)$` means: a +lowercase character `a`, followed by lowercase character `t`, followed by a `.` +character and the matcher must be end of the string. + +
+"(at\.)" => The fat cat. sat. on the mat.
+
+ +[Test the regular expression](https://regex101.com/r/y4Au4D/1) + +
+"(at\.)$" => The fat cat. sat. on the mat.
+
+ +[Test the regular expression](https://regex101.com/r/t0AkOd/1) + +## 3. Shorthand Character Sets + +Regular expression provides shorthands for the commonly used character sets, +which offer convenient shorthands for commonly used regular expressions. The +shorthand character sets are as follows: + +|Shorthand|Description| +|:----:|----| +|.|Any character except new line| +|\w|Matches alphanumeric characters: `[a-zA-Z0-9_]`| +|\W|Matches non-alphanumeric characters: `[^\w]`| +|\d|Matches digit: `[0-9]`| +|\D|Matches non-digit: `[^\d]`| +|\s|Matches whitespace character: `[\t\n\f\r\p{Z}]`| +|\S|Matches non-whitespace character: `[^\s]`| + +## 4. Lookaround + +Lookbehind and lookahead (also called lookaround) are specific types of +***non-capturing groups*** (Used to match the pattern but not included in matching +list). Lookarounds are used when we have the condition that this pattern is +preceded or followed by another certain pattern. For example, we want to get all +numbers that are preceded by `$` character from the following input string +`$4.44 and $10.88`. We will use following regular expression `(?<=\$)[0-9\.]*` +which means: get all the numbers which contain `.` character and are preceded +by `$` character. Following are the lookarounds that are used in regular +expressions: + +|Symbol|Description| +|:----:|----| +|?=|Positive Lookahead| +|?!|Negative Lookahead| +|?<=|Positive Lookbehind| +|? +"(T|t)he(?=\sfat)" => The fat cat sat on the mat. +
+ +[Test the regular expression](https://regex101.com/r/IDDARt/1) + +### 4.2 Negative Lookahead + +Negative lookahead is used when we need to get all matches from input string +that are not followed by a pattern. Negative lookahead is defined same as we define +positive lookahead but the only difference is instead of equal `=` character we +use negation `!` character i.e. `(?!...)`. Let's take a look at the following +regular expression `(T|t)he(?!\sfat)` which means: get all `The` or `the` words +from input string that are not followed by the word `fat` precedes by a space +character. + +
+"(T|t)he(?!\sfat)" => The fat cat sat on the mat.
+
+ +[Test the regular expression](https://regex101.com/r/V32Npg/1) + +### 4.3 Positive Lookbehind + +Positive lookbehind is used to get all the matches that are preceded by a +specific pattern. Positive lookbehind is denoted by `(?<=...)`. For example, the +regular expression `(?<=(T|t)he\s)(fat|mat)` means: get all `fat` or `mat` words +from input string that are after the word `The` or `the`. + +
+"(?<=(T|t)he\s)(fat|mat)" => The fat cat sat on the mat.
+
+ +[Test the regular expression](https://regex101.com/r/avH165/1) + +### 4.4 Negative Lookbehind + +Negative lookbehind is used to get all the matches that are not preceded by a +specific pattern. Negative lookbehind is denoted by `(? +"(?<!(T|t)he\s)(cat)" => The cat sat on cat. +
+ +[Test the regular expression](https://regex101.com/r/8Efx5G/1) + +## 5. Flags + +Flags are also called modifiers because they modify the output of a regular +expression. These flags can be used in any order or combination, and are an +integral part of the RegExp. + +|Flag|Description| +|:----:|----| +|i|Case insensitive: Sets matching to be case-insensitive.| +|g|Global Search: Search for a pattern throughout the input string.| +|m|Multiline: Anchor meta character works on each line.| + +### 5.1 Case Insensitive + +The `i` modifier is used to perform case-insensitive matching. For example, the +regular expression `/The/gi` means: uppercase letter `T`, followed by lowercase +character `h`, followed by character `e`. And at the end of regular expression +the `i` flag tells the regular expression engine to ignore the case. As you can +see we also provided `g` flag because we want to search for the pattern in the +whole input string. + +
+"The" => The fat cat sat on the mat.
+
+
+[Test the regular expression](https://regex101.com/r/dpQyf9/1)
+
+
+"/The/gi" => The fat cat sat on the mat.
+
+
+[Test the regular expression](https://regex101.com/r/ahfiuh/1)
+
+### 5.2 Global search
+
+The `g` modifier is used to perform a global match (find all matches rather than
+stopping after the first match). For example, the regular expression `/.(at)/g`
+means: any character except a newline, followed by a lowercase `a`,
+followed by a lowercase `t`. Because we provided the `g` flag at the end of
+the regular expression, it will now find all matches in the input string, not just the first one (which is the default behavior).
+
+
+"/.(at)/" => The fat cat sat on the mat.
+
+
+[Test the regular expression](https://regex101.com/r/jnk6gM/1)
+
+
+"/.(at)/g" => The fat cat sat on the mat.
+
+
+[Test the regular expression](https://regex101.com/r/dO1nef/1)
+
+### 5.3 Multiline
+
+The `m` modifier is used to perform a multi-line match. As we discussed earlier,
+the anchors `(^, $)` are used to check whether a pattern is at the beginning or
+the end of the input string. But if we want the anchors to work on each line, we use
+the `m` flag. For example, the regular expression `/.at(.)?$/gm` means: any character
+except a newline, followed by a lowercase `a`, followed by a lowercase `t`, optionally
+followed by any character except a newline. Because of the `m` flag, the regular
+expression engine now matches the pattern at the end of each line in a string.
+
+
+"/.at(.)?$/" => The fat
+                cat sat
+                on the mat.
+
+
+[Test the regular expression](https://regex101.com/r/hoGMkP/1)
+
+
+"/.at(.)?$/gm" => The fat
+                  cat sat
+                  on the mat.
+
+
+[Test the regular expression](https://regex101.com/r/E88WE2/1)
+
+## 6. Greedy vs lazy matching
+By default, a regex performs greedy matching, which means it matches as much as
+possible. We can use `?` to match lazily, which means it matches as little as possible.
+
+
+"/(.*at)/" => The fat cat sat on the mat. 
+
+
+[Test the regular expression](https://regex101.com/r/AyAdgJ/1)
+
+
+"/(.*?at)/" => The fat cat sat on the mat. 
+ + +[Test the regular expression](https://regex101.com/r/AyAdgJ/2) + + +## Contribution + +* Open pull request with improvements +* Discuss ideas in issues +* Spread the word +* Reach out with any feedback [![Twitter URL](https://img.shields.io/twitter/url/https/twitter.com/ziishaned.svg?style=social&label=Follow%20%40ziishaned)](https://twitter.com/ziishaned) + +## License + +MIT © [Zeeshan Ahmad](https://twitter.com/ziishaned) diff --git a/translations/README-fr.md b/translations/README-fr.md index 4d495e9..db28f7b 100644 --- a/translations/README-fr.md +++ b/translations/README-fr.md @@ -30,6 +30,7 @@ * [Polish](../translations/README-pl.md) * [Русский](../translations/README-ru.md) * [Tiếng Việt](../translations/README-vn.md) +* [قارسی](../translations/README-fa.md) ## Qu'est-ce qu'une expression régulière? diff --git a/translations/README-gr.md b/translations/README-gr.md index 7d62a6e..2e2141c 100644 --- a/translations/README-gr.md +++ b/translations/README-gr.md @@ -30,6 +30,7 @@ * [Polish](../translations/README-pl.md) * [Русский](../translations/README-ru.md) * [Tiếng Việt](../translations/README-vn.md) +* [قارسی](../translations/README-fa.md) ## Τι είναι μια Κανονική Έκφραση (Regular Expression); diff --git a/translations/README-hu.md b/translations/README-hu.md index 1017260..e0c5ab0 100644 --- a/translations/README-hu.md +++ b/translations/README-hu.md @@ -30,6 +30,7 @@ * [Polish](../translations/README-pl.md) * [Русский](../translations/README-ru.md) * [Tiếng Việt](../translations/README-vn.md) +* [قارسی](../translations/README-fa.md) ## Mi az a reguláris kifejezés? 
diff --git a/translations/README-ja.md b/translations/README-ja.md index cb8df84..18ed0f1 100644 --- a/translations/README-ja.md +++ b/translations/README-ja.md @@ -30,6 +30,7 @@ * [Polish](../translations/README-pl.md) * [Русский](../translations/README-ru.md) * [Tiếng Việt](../translations/README-vn.md) +* [قارسی](../translations/README-fa.md) ## 正規表現とは diff --git a/translations/README-ko.md b/translations/README-ko.md index 05ae297..577bbfb 100644 --- a/translations/README-ko.md +++ b/translations/README-ko.md @@ -30,6 +30,7 @@ * [Polish](../translations/README-pl.md) * [Русский](../translations/README-ru.md) * [Tiếng Việt](../translations/README-vn.md) +* [قارسی](../translations/README-fa.md) ## 정규표현식이란 무엇인가? diff --git a/translations/README-pl.md b/translations/README-pl.md index c5213f5..84c50ef 100644 --- a/translations/README-pl.md +++ b/translations/README-pl.md @@ -30,6 +30,7 @@ * [Polish](../translations/README-pl.md) * [Русский](../translations/README-ru.md) * [Tiếng Việt](../translations/README-vn.md) +* [قارسی](../translations/README-fa.md) ## Co to jest wyrażenie regularne? diff --git a/translations/README-pt_BR.md b/translations/README-pt_BR.md index 90170f4..d20b67b 100644 --- a/translations/README-pt_BR.md +++ b/translations/README-pt_BR.md @@ -30,6 +30,7 @@ * [Polish](../translations/README-pl.md) * [Русский](../translations/README-ru.md) * [Tiếng Việt](../translations/README-vn.md) +* [قارسی](../translations/README-fa.md) ## O que é uma Expressão Regular? diff --git a/translations/README-ru.md b/translations/README-ru.md index 8e7ccde..0dc6c67 100644 --- a/translations/README-ru.md +++ b/translations/README-ru.md @@ -29,6 +29,7 @@ * [Polish](../translations/README-pl.md) * [Русский](../translations/README-ru.md) * [Tiếng Việt](../translations/README-vn.md) +* [قارسی](../translations/README-fa.md) ## Что такое Регулярное выражение? 
diff --git a/translations/README-tr.md b/translations/README-tr.md index 8e60118..d471337 100644 --- a/translations/README-tr.md +++ b/translations/README-tr.md @@ -30,6 +30,7 @@ * [Polish](../translations/README-pl.md) * [Русский](../translations/README-ru.md) * [Tiếng Việt](../translations/README-vn.md) +* [قارسی](../translations/README-fa.md) ## Düzenli İfade Nedir? diff --git a/translations/README-vn.md b/translations/README-vn.md index 60c51e8..a292055 100644 --- a/translations/README-vn.md +++ b/translations/README-vn.md @@ -31,6 +31,7 @@ * [Polish](../translations/README-pl.md) * [Русский](../translations/README-ru.md) * [Tiếng Việt](../translations/README-vn.md) +* [قارسی](../translations/README-fa.md) ## Biểu thức chính quy là gì? From 3faa6199f48138f3f5f8ac8cd8844cc027099bfd Mon Sep 17 00:00:00 2001 From: Hamzeh Javadi Date: Sun, 27 Oct 2019 08:52:54 +0330 Subject: [PATCH 09/24] Update README-fa.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit راست به چپ کردن و اصلاح تصویر --- translations/README-fa.md | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/translations/README-fa.md b/translations/README-fa.md index 8a7a98c..7cdee38 100644 --- a/translations/README-fa.md +++ b/translations/README-fa.md @@ -31,10 +31,12 @@ * [Tiếng Việt](../translations/README-vn.md) * [قارسی](../translations/README-fa.md) -
+
## عبارت منظم چیست؟
+
> عبارت منظم یک گروه از کارکترها یا نمادهاست که برای پیدا کردن یک الگوی مشخص در یک متن به کار گرفته می شود. +
یک عبارت منظم یک الگو است که با رشته ای حاص مطابقت دارد. عبارت منظم در اعتبار سنجی داده های ورودی فرم ها، پیدا کردن یک زیر متن در یک متن بزرگتر بر اساس یک الگوی ویژ] و مواردی از این دست به کار گرفته می شود. عبارت "Regular expression" کمی ثقیل است، پس معمولا بیشتر مخفف آن - "regex" یا "regexp" - را به کار می برند. @@ -42,11 +44,11 @@

- Regular expression + Regular expression

- +
عبارت منظم به کار رفته در اینجا رشته `john_doe` و `jo-hn_doe` و `john12_as` می پذیرد ولی `Jo` را به دلیل کوتاه بودن بیش از حد و همچنین به کار بردن حروف بزرگ نمی پذیرد. - +
## Table of Contents - [Basic Matchers](#1-basic-matchers) From f8088499c869b37922edc86201bcab2a1b40af49 Mon Sep 17 00:00:00 2001 From: Hamzeh Javadi Date: Sun, 27 Oct 2019 09:06:43 +0330 Subject: [PATCH 10/24] Update README-fa.md some rtl test --- translations/README-fa.md | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/translations/README-fa.md b/translations/README-fa.md index 7cdee38..02a76be 100644 --- a/translations/README-fa.md +++ b/translations/README-fa.md @@ -32,16 +32,19 @@ * [قارسی](../translations/README-fa.md)
+ ## عبارت منظم چیست؟
+ > عبارت منظم یک گروه از کارکترها یا نمادهاست که برای پیدا کردن یک الگوی مشخص در یک متن به کار گرفته می شود.
+
یک عبارت منظم یک الگو است که با رشته ای حاص مطابقت دارد. عبارت منظم در اعتبار سنجی داده های ورودی فرم ها، پیدا کردن یک زیر متن در یک متن بزرگتر بر اساس یک الگوی ویژ] و مواردی از این دست به کار گرفته می شود. عبارت "Regular expression" کمی ثقیل است، پس معمولا بیشتر مخفف آن - "regex" یا "regexp" - را به کار می برند. فرض کنید یه برنامه نوشته اید و می خواهید قوانینی برای گزینش نام کاربری برا کاربران بگزارید. می خواهیم اجازه دهی که نام کاربری شامل حروف، اعداد، خط زیر و خط فاصله باشد. همچنین می خواهیم تعداد مشخصه ها یا همان کارکترها در نام کاربری محدود کنیم . ما از چنین عبارت منظمی برای اعتبار سنجی نام کاربری استفاده می کنیم: - +


Regular expression @@ -49,7 +52,9 @@

عبارت منظم به کار رفته در اینجا رشته `john_doe` و `jo-hn_doe` و `john12_as` می پذیرد ولی `Jo` را به دلیل کوتاه بودن بیش از حد و همچنین به کار بردن حروف بزرگ نمی پذیرد.
-## Table of Contents +
+ +## فهرست - [Basic Matchers](#1-basic-matchers) - [Meta character](#2-meta-characters) @@ -78,6 +83,7 @@ - [Global search](#52-global-search) - [Multiline](#53-multiline) - [Greedy vs lazy matching](#6-greedy-vs-lazy-matching) +
## 1. Basic Matchers From a817604833c338b71b956d8537b56f1169f4a175 Mon Sep 17 00:00:00 2001 From: Hamzeh Javadi Date: Sun, 27 Oct 2019 09:18:20 +0330 Subject: [PATCH 11/24] paragraph #1 --- translations/README-fa.md | 21 ++++++++++----------- 1 file changed, 10 insertions(+), 11 deletions(-) diff --git a/translations/README-fa.md b/translations/README-fa.md index 02a76be..e17f3bf 100644 --- a/translations/README-fa.md +++ b/translations/README-fa.md @@ -84,24 +84,23 @@ - [Multiline](#53-multiline) - [Greedy vs lazy matching](#6-greedy-vs-lazy-matching)
+
-## 1. Basic Matchers - -A regular expression is just a pattern of characters that we use to perform -search in a text. For example, the regular expression `the` means: the letter -`t`, followed by the letter `h`, followed by the letter `e`. +## 1. پایه ای ترین همخوانی +یک عبارت منظم در واقع یک الگو برای جست و جو در یک متن است. برای مثال عبارت منظم `the` به معنی : حرف +`t`, پس از آن حرف `h`, پس از آن حرف `e` است. +
 "the" => The fat cat sat on the mat.
 
-[Test the regular expression](https://regex101.com/r/dmRygT/1) -The regular expression `123` matches the string `123`. The regular expression is -matched against an input string by comparing each character in the regular -expression to each character in the input string, one after another. Regular -expressions are normally case-sensitive so the regular expression `The` would -not match the string `the`. +
+[عبارت منظم را در عمل ببینید](https://regex101.com/r/dmRygT/1) + +عبارت منظم `123` با رشته `123` مطابقت دارد. عبارت منظم با مقایسه حرف به حرف و کارکتر به کارکترش با متن مورد نظر تطابق را می یابد. همچنین عبارت منظم حساس به اندازه (بزرگی یا کوچکی حروف) هستند. بنابر این واژه ی `The` با `the` همخوان نیست. +
 "The" => The fat cat sat on the mat.

From fa254f2de75b566c8f08415bfe55d732c95158bb Mon Sep 17 00:00:00 2001
From: Hamzeh Javadi 
Date: Sun, 27 Oct 2019 09:26:46 +0330
Subject: [PATCH 12/24] =?UTF-8?q?=D8=A7=D8=B5=D9=84=D8=A7=D8=AD=20=D9=84?=
 =?UTF-8?q?=DB=8C=D9=86=DA=A9=20=D8=A8=D9=87=20=D8=A8=D8=B1=DA=AF=D8=B1?=
 =?UTF-8?q?=D8=AF=D8=A7=D9=86=20=D9=81=D8=A7=D8=B1=D8=B3=DB=8C?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 README.md                        |  2 +-
 translations/README-cn.md        |  2 +-
 translations/README-es.md        |  2 +-
 translations/README-fa.md        | 11 +++++++----
 translations/README-fr.md        |  2 +-
 translations/README-gr.md        |  2 +-
 translations/README-hu.md        |  2 +-
 translations/README-ja.md        |  2 +-
 translations/README-ko.md        |  2 +-
 translations/README-pl.md        |  2 +-
 translations/README-pt_BR.md     |  2 +-
 translations/README-ru.md        |  2 +-
 translations/README-tr.md        |  2 +-
 translations/README-vn.md        |  2 +-
 translations/README-zh-simple.md |  1 +
 15 files changed, 21 insertions(+), 17 deletions(-)

diff --git a/README.md b/README.md
index f6b409b..3c50e7c 100644
--- a/README.md
+++ b/README.md
@@ -29,7 +29,7 @@
 * [Polish](translations/README-pl.md)
 * [Русский](translations/README-ru.md)
 * [Tiếng Việt](translations/README-vn.md)
-* [قارسی](translations/README-fa.md)
+* [فارسی](translations/README-fa.md)
 
 ## What is Regular Expression?
 
diff --git a/translations/README-cn.md b/translations/README-cn.md
index f0d89cc..a16a71d 100644
--- a/translations/README-cn.md
+++ b/translations/README-cn.md
@@ -30,7 +30,7 @@
 * [Polish](../translations/README-pl.md)
 * [Русский](../translations/README-ru.md)
 * [Tiếng Việt](../translations/README-vn.md)
-* [قارسی](../translations/README-fa.md)
+* [فارسی](../translations/README-fa.md)
 
 ## 什么是正则表达式?
 
diff --git a/translations/README-es.md b/translations/README-es.md
index 782d94e..51d54a7 100644
--- a/translations/README-es.md
+++ b/translations/README-es.md
@@ -30,7 +30,7 @@
 * [Polish](../translations/README-pl.md)
 * [Русский](../translations/README-ru.md)
 * [Tiếng Việt](../translations/README-vn.md)
-* [قارسی](../translations/README-fa.md)
+* [فارسی](../translations/README-fa.md)
 
 ## Qué es una expresión regular?
 > Una expresión regular es un grupo de caracteres o símbolos, los cuales son usados para buscar un patrón específico dentro de un texto.
diff --git a/translations/README-fa.md b/translations/README-fa.md
index e17f3bf..c95869a 100644
--- a/translations/README-fa.md
+++ b/translations/README-fa.md
@@ -29,7 +29,7 @@
 * [Polish](../translations/README-pl.md)
 * [Русский](../translations/README-ru.md)
 * [Tiếng Việt](../translations/README-vn.md)
-* [قارسی](../translations/README-fa.md)
+* [فارسی](../translations/README-fa.md)
 
 
@@ -56,7 +56,7 @@ ## فهرست -- [Basic Matchers](#1-basic-matchers) +- [پایه ای ترین همخوانی](#1-basic-matchers) - [Meta character](#2-meta-characters) - [Full stop](#21-full-stop) - [Character set](#22-character-set) @@ -95,8 +95,8 @@ "the" => The fat cat sat on the mat.
-
+ [عبارت منظم را در عمل ببینید](https://regex101.com/r/dmRygT/1) عبارت منظم `123` با رشته `123` مطابقت دارد. عبارت منظم با مقایسه حرف به حرف و کارکتر به کارکترش با متن مورد نظر تطابق را می یابد. همچنین عبارت منظم حساس به اندازه (بزرگی یا کوچکی حروف) هستند. بنابر این واژه ی `The` با `the` همخوان نیست. @@ -106,7 +106,10 @@ "The" => The fat cat sat on the mat.
-[Test the regular expression](https://regex101.com/r/1paXsy/1) +
+ +[این عبارت منظم را در عمل ببینید](https://regex101.com/r/1paXsy/1) +
## 2. Meta Characters diff --git a/translations/README-fr.md b/translations/README-fr.md index db28f7b..c51668c 100644 --- a/translations/README-fr.md +++ b/translations/README-fr.md @@ -30,7 +30,7 @@ * [Polish](../translations/README-pl.md) * [Русский](../translations/README-ru.md) * [Tiếng Việt](../translations/README-vn.md) -* [قارسی](../translations/README-fa.md) +* [فارسی](../translations/README-fa.md) ## Qu'est-ce qu'une expression régulière? diff --git a/translations/README-gr.md b/translations/README-gr.md index 2e2141c..2e6dd9d 100644 --- a/translations/README-gr.md +++ b/translations/README-gr.md @@ -30,7 +30,7 @@ * [Polish](../translations/README-pl.md) * [Русский](../translations/README-ru.md) * [Tiếng Việt](../translations/README-vn.md) -* [قارسی](../translations/README-fa.md) +* [فارسی](../translations/README-fa.md) ## Τι είναι μια Κανονική Έκφραση (Regular Expression); diff --git a/translations/README-hu.md b/translations/README-hu.md index e0c5ab0..e8c5239 100644 --- a/translations/README-hu.md +++ b/translations/README-hu.md @@ -30,7 +30,7 @@ * [Polish](../translations/README-pl.md) * [Русский](../translations/README-ru.md) * [Tiếng Việt](../translations/README-vn.md) -* [قارسی](../translations/README-fa.md) +* [فارسی](../translations/README-fa.md) ## Mi az a reguláris kifejezés? 
diff --git a/translations/README-ja.md b/translations/README-ja.md index 18ed0f1..6acf9ae 100644 --- a/translations/README-ja.md +++ b/translations/README-ja.md @@ -30,7 +30,7 @@ * [Polish](../translations/README-pl.md) * [Русский](../translations/README-ru.md) * [Tiếng Việt](../translations/README-vn.md) -* [قارسی](../translations/README-fa.md) +* [فارسی](../translations/README-fa.md) ## 正規表現とは diff --git a/translations/README-ko.md b/translations/README-ko.md index 577bbfb..d4c9c0b 100644 --- a/translations/README-ko.md +++ b/translations/README-ko.md @@ -30,7 +30,7 @@ * [Polish](../translations/README-pl.md) * [Русский](../translations/README-ru.md) * [Tiếng Việt](../translations/README-vn.md) -* [قارسی](../translations/README-fa.md) +* [فارسی](../translations/README-fa.md) ## 정규표현식이란 무엇인가? diff --git a/translations/README-pl.md b/translations/README-pl.md index 84c50ef..fd1d04c 100644 --- a/translations/README-pl.md +++ b/translations/README-pl.md @@ -30,7 +30,7 @@ * [Polish](../translations/README-pl.md) * [Русский](../translations/README-ru.md) * [Tiếng Việt](../translations/README-vn.md) -* [قارسی](../translations/README-fa.md) +* [فارسی](../translations/README-fa.md) ## Co to jest wyrażenie regularne? diff --git a/translations/README-pt_BR.md b/translations/README-pt_BR.md index d20b67b..774cd21 100644 --- a/translations/README-pt_BR.md +++ b/translations/README-pt_BR.md @@ -30,7 +30,7 @@ * [Polish](../translations/README-pl.md) * [Русский](../translations/README-ru.md) * [Tiếng Việt](../translations/README-vn.md) -* [قارسی](../translations/README-fa.md) +* [فارسی](../translations/README-fa.md) ## O que é uma Expressão Regular? 
diff --git a/translations/README-ru.md b/translations/README-ru.md index 0dc6c67..ed2e668 100644 --- a/translations/README-ru.md +++ b/translations/README-ru.md @@ -29,7 +29,7 @@ * [Polish](../translations/README-pl.md) * [Русский](../translations/README-ru.md) * [Tiếng Việt](../translations/README-vn.md) -* [قارسی](../translations/README-fa.md) +* [فارسی](../translations/README-fa.md) ## Что такое Регулярное выражение? diff --git a/translations/README-tr.md b/translations/README-tr.md index d471337..41be9dd 100644 --- a/translations/README-tr.md +++ b/translations/README-tr.md @@ -30,7 +30,7 @@ * [Polish](../translations/README-pl.md) * [Русский](../translations/README-ru.md) * [Tiếng Việt](../translations/README-vn.md) -* [قارسی](../translations/README-fa.md) +* [فارسی](../translations/README-fa.md) ## Düzenli İfade Nedir? diff --git a/translations/README-vn.md b/translations/README-vn.md index a292055..0c38faa 100644 --- a/translations/README-vn.md +++ b/translations/README-vn.md @@ -31,7 +31,7 @@ * [Polish](../translations/README-pl.md) * [Русский](../translations/README-ru.md) * [Tiếng Việt](../translations/README-vn.md) -* [قارسی](../translations/README-fa.md) +* [فارسی](../translations/README-fa.md) ## Biểu thức chính quy là gì? diff --git a/translations/README-zh-simple.md b/translations/README-zh-simple.md index 8ee4847..292e877 100644 --- a/translations/README-zh-simple.md +++ b/translations/README-zh-simple.md @@ -29,6 +29,7 @@ * [Polish](../translations/README-pl.md) * [Русский](../translations/README-ru.md) * [Tiếng Việt](../translations/README-vn.md) +* [فارسی](../translations/README-fa.md) ## 什么是正则表达式? 
From 3fc465e0dd6f3e287ccbb538272cb00c13888917 Mon Sep 17 00:00:00 2001 From: hanhan9449 <1481220484@qq.com> Date: Wed, 30 Oct 2019 09:03:21 +0800 Subject: [PATCH 13/24] Fix a translation error --- translations/README-cn.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/translations/README-cn.md b/translations/README-cn.md index eac7056..77cb90e 100644 --- a/translations/README-cn.md +++ b/translations/README-cn.md @@ -118,7 +118,7 @@ |*|匹配>=0个重复的在*号之前的字符。| |+|匹配>=1个重复的+号前的字符。 |?|标记?之前的字符为可选.| -|{n,m}|匹配num个大括号之前的字符 (n <= num <= m).| +|{n,m}|匹配num个大括号之间的字符 (n <= num <= m).| |(xyz)|字符集,匹配与 xyz 完全相等的字符串.| |||或运算符,匹配符号前或后的字符.| |\|转义字符,用于匹配一些保留的字符 [ ] ( ) { } . * + ? ^ $ \ || From 2411b31babe77ce6ebe37165ecef2f6e1ba41050 Mon Sep 17 00:00:00 2001 From: Jigao Luo_X1 Date: Mon, 6 Jan 2020 21:20:47 +0100 Subject: [PATCH 14/24] 5.1, 5.2, 5.3 hyperlink fix --- translations/README-cn.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/translations/README-cn.md b/translations/README-cn.md index 55984ea..9f5dddf 100644 --- a/translations/README-cn.md +++ b/translations/README-cn.md @@ -431,7 +431,7 @@ |g|全局搜索。| |m|多行修饰符:锚点元字符 `^` `$` 工作范围在每行的起始。| -### 5.1 忽略大小写(Case Insensitive) +### 5.1 忽略大小写 (Case Insensitive) 修饰语 `i` 用于忽略大小写。 例如,表达式 `/The/gi` 表示在全局搜索 `The`,在后面的 `i` 将其条件修改为忽略大小写,则变成搜索 `the` 和 `The`,`g` 表示全局搜索。 @@ -448,7 +448,7 @@ [在线练习](https://regex101.com/r/ahfiuh/1) -### 5.2 全局搜索(Global search) +### 5.2 全局搜索 (Global search) 修饰符 `g` 常用于执行一个全局搜索匹配,即(不仅仅返回第一个匹配的,而是返回全部)。 例如,表达式 `/.(at)/g` 表示搜索 任意字符(除了换行)+ `at`,并返回全部结果。 @@ -465,7 +465,7 @@ [在线练习](https://regex101.com/r/dO1nef/1) -### 5.3 多行修饰符(Multiline) +### 5.3 多行修饰符 (Multiline) 多行修饰符 `m` 常用于执行一个多行匹配。 @@ -489,7 +489,7 @@ [在线练习](https://regex101.com/r/E88WE2/1) -### 6. 贪婪匹配与惰性匹配(Greedy vs lazy matching) +### 6. 
贪婪匹配与惰性匹配 (Greedy vs lazy matching) 正则表达式默认采用贪婪匹配模式,在该模式下意味着会匹配尽可能长的子串。我们可以使用 `?` 将贪婪匹配模式转化为惰性匹配模式。 From 65cbea955e20035434d7492903f86451a4258177 Mon Sep 17 00:00:00 2001 From: wakeheart <60534224+wakeheart@users.noreply.github.com> Date: Sat, 7 Mar 2020 14:24:18 +0800 Subject: [PATCH 15/24] Update README-cn.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit 修改了2.4和2.5的讲解,使其更加符合中文的阅读方式和思考习惯,降低初学者的理解难度。 --- translations/README-cn.md | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/translations/README-cn.md b/translations/README-cn.md index 55984ea..fcd78ac 100644 --- a/translations/README-cn.md +++ b/translations/README-cn.md @@ -227,7 +227,7 @@ ## 2.4 `{}` 号 -在正则表达式中 `{}` 是一个量词,常用来一个或一组字符可以重复出现的次数。 +在正则表达式中 `{}` 是一个量词,常用来限定一个或一组字符可以重复出现的次数。 例如, 表达式 `[0-9]{2,3}` 匹配最少 2 位最多 3 位 0~9 的数字。
@@ -256,7 +256,10 @@
 
 ## 2.5 `(...)` 特征标群
 
-特征标群是一组写在 `(...)` 中的子模式。例如之前说的 `{}` 是用来表示前面一个字符出现指定次数。但如果在 `{}` 前加入特征标群则表示整个标群内的字符重复 N 次。例如,表达式 `(ab)*` 匹配连续出现 0 或更多个 `ab`。
+## 2.5 `(...)` 特征标群
+
+特征标群是一组写在 `(...)` 中的子模式。`(...)` 中包含的内容将会被看成一个整体,和数学中小括号( )的作用相同。例如, 表达式 `(ab)*` 匹配连续出现 0 或更多个 `ab`。如果没有使用 `(...)` ,那么表达式 `ab*` 将匹配连续出现 0 或更多个 `b` 。再比如之前说的 `{}` 是用来表示前面一个字符出现指定次数。但如果在 `{}` 前加上特征标群 `(...)` 则表示整个标群内的字符重复 N 次。
+
 
 我们还可以在 `()` 中用或字符 `|` 表示或。例如,`(c|g|p)ar` 匹配 `car` 或 `gar` 或 `par`.
 

From 6436a420594a5e3d326992a00d73ffa0e8fbc1fc Mon Sep 17 00:00:00 2001
From: wakeheart <60534224+wakeheart@users.noreply.github.com>
Date: Sat, 7 Mar 2020 14:25:32 +0800
Subject: [PATCH 16/24] Update README-cn.md

---
 translations/README-cn.md | 2 --
 1 file changed, 2 deletions(-)

diff --git a/translations/README-cn.md b/translations/README-cn.md
index fcd78ac..237f4c0 100644
--- a/translations/README-cn.md
+++ b/translations/README-cn.md
@@ -256,8 +256,6 @@
 
 ## 2.5 `(...)` 特征标群
 
-## 2.5 `(...)` 特征标群
-
 特征标群是一组写在 `(...)` 中的子模式。`(...)` 中包含的内容将会被看成一个整体,和数学中小括号( )的作用相同。例如, 表达式 `(ab)*` 匹配连续出现 0 或更多个 `ab`。如果没有使用 `(...)` ,那么表达式 `ab*` 将匹配连续出现 0 或更多个 `b` 。再比如之前说的 `{}` 是用来表示前面一个字符出现指定次数。但如果在 `{}` 前加上特征标群 `(...)` 则表示整个标群内的字符重复 N 次。
 
 

From 18c6bf5a984bdc73126201858664858a7aa25cd6 Mon Sep 17 00:00:00 2001
From: Tom McAndrew <42588609+tommcandrew@users.noreply.github.com>
Date: Tue, 10 Mar 2020 17:11:31 +0000
Subject: [PATCH 17/24] Improve grammar and punctuation

---
 README.md | 300 +++++++++++++++++++++++++++---------------------------
 1 file changed, 148 insertions(+), 152 deletions(-)
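
Reviewer note (not part of the README): the quantifier examples reworded in this patch can be sanity-checked with a short script. This is only a sketch, assuming Python's standard `re` module; the variable names are illustrative and the sample sentences are the ones used in the README hunks below.

```python
import re

sentence = "The fat cat sat on the mat."

# `c.+t` is greedy: `.+` consumes as much as it can, so the match
# ends at the last `t` in the sentence rather than at the first one.
greedy = re.search(r"c.+t", sentence).group()

# `[T]?he` makes the uppercase `T` optional, so it matches "The"
# and also the "he" inside "the".
optional = re.findall(r"[T]?he", "The car is parked in the garage.")

# `[0-9]{2,3}` matches runs of at least two and at most three digits.
digits = re.findall(
    r"[0-9]{2,3}",
    "The number was 9.9997 but we rounded it off to 10.0.",
)

print(greedy)    # cat sat on the mat
print(optional)  # ['The', 'he']
print(digits)    # ['999', '10']
```

Other regex engines may differ in details such as flag syntax, so the regex101 links in the README remain the reference for each example.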

diff --git a/README.md b/README.md
index 0546c6d..6d534fb 100644
--- a/README.md
+++ b/README.md
@@ -33,62 +33,63 @@
 
 ## What is Regular Expression?
 
-> Regular expression is a group of characters or symbols which is used to find a specific pattern from a text.
+> A regular expression is a group of characters or symbols which is used to find a specific pattern in a text.
 
 A regular expression is a pattern that is matched against a subject string from
-left to right. Regular expression is used for replacing a text within a string, 
-validating form, extract a substring from a string based upon a pattern match, 
-and so much more. The word "Regular expression" is a mouthful, so you will usually
-find the term abbreviated as "regex" or "regexp". 
+left to right. Regular expressions are used for replacing text within a string,
+validating forms, extracting a substring from a string based on a pattern match, 
+and so much more. The term "regular expression" is a mouthful, so you will usually
+find the term abbreviated to "regex" or "regexp". 
 
 Imagine you are writing an application and you want to set the rules for when a
 user chooses their username. We want to allow the username to contain letters,
 numbers, underscores and hyphens. We also want to limit the number of characters
-in username so it does not look ugly. We use the following regular expression to
-validate a username:
+in the username so it does not look ugly. We can use the following regular expression to
+validate the username:
 
 

Regular expression

-Above regular expression can accept the strings `john_doe`, `jo-hn_doe` and -`john12_as`. It does not match `Jo` because that string contains uppercase +The regular expression above can accept the strings `john_doe`, `jo-hn_doe` and +`john12_as`. It does not match `Jo` because that string contains an uppercase letter and also it is too short. ## Table of Contents - [Basic Matchers](#1-basic-matchers) -- [Meta character](#2-meta-characters) - - [Full stop](#21-full-stop) - - [Character set](#22-character-set) - - [Negated character set](#221-negated-character-set) +- [Meta Characters](#2-meta-characters) + - [Full Stops](#21-full-stops) + - [Character Sets](#22-character-sets) + - [Negated Character Sets](#221-negated-character-sets) - [Repetitions](#23-repetitions) - [The Star](#231-the-star) - [The Plus](#232-the-plus) - [The Question Mark](#233-the-question-mark) - [Braces](#24-braces) - - [Character Group](#25-character-group) + - [Capturing Groups](#25-capturing-groups) + - [Non-Capturing Groups](#251-non-capturing-groups) - [Alternation](#26-alternation) - - [Escaping special character](#27-escaping-special-character) + - [Escaping Special Characters](#27-escaping-special-characters) - [Anchors](#28-anchors) - - [Caret](#281-caret) - - [Dollar](#282-dollar) + - [The Caret](#281-the-caret) + - [The Dollar Sign](#282-the-dollar-sign) - [Shorthand Character Sets](#3-shorthand-character-sets) -- [Lookaround](#4-lookaround) - - [Positive Lookahead](#41-positive-lookahead) - - [Negative Lookahead](#42-negative-lookahead) - - [Positive Lookbehind](#43-positive-lookbehind) - - [Negative Lookbehind](#44-negative-lookbehind) +- [Lookarounds](#4-lookarounds) + - [Positive Lookaheads](#41-positive-lookaheads) + - [Negative Lookaheads](#42-negative-lookaheads) + - [Positive Lookbehinds](#43-positive-lookbehinds) + - [Negative Lookbehinds](#44-negative-lookbehinds) - [Flags](#5-flags) - [Case Insensitive](#51-case-insensitive) - - [Global search](#52-global-search) + - 
[Global Search](#52-global-search) - [Multiline](#53-multiline) -- [Greedy vs lazy matching](#6-greedy-vs-lazy-matching) +- [Greedy vs Lazy Matching](#6-greedy-vs-lazy-matching) ## 1. Basic Matchers -A regular expression is just a pattern of characters that we use to perform +A regular expression is just a pattern of characters that we use to perform a search in a text. For example, the regular expression `the` means: the letter `t`, followed by the letter `h`, followed by the letter `e`. @@ -112,7 +113,7 @@ not match the string `the`. ## 2. Meta Characters -Meta characters are the building blocks of the regular expressions. Meta +Meta characters are the building blocks of regular expressions. Meta characters do not stand for themselves but instead are interpreted in some special way. Some meta characters have a special meaning and are written inside square brackets. The meta characters are as follows: @@ -132,9 +133,9 @@ square brackets. The meta characters are as follows: |^|Matches the beginning of the input.| |$|Matches the end of the input.| -## 2.1 Full stop +## 2.1 Full Stops -Full stop `.` is the simplest example of meta character. The meta character `.` +The full stop `.` is the simplest example of a meta character. The meta character `.` matches any single character. It will not match return or newline characters. For example, the regular expression `.ar` means: any character, followed by the letter `a`, followed by the letter `r`. @@ -145,11 +146,11 @@ letter `a`, followed by the letter `r`. [Test the regular expression](https://regex101.com/r/xc9GkU/1) -## 2.2 Character set +## 2.2 Character Sets -Character sets are also called character class. Square brackets are used to +Character sets are also called character classes. Square brackets are used to specify character sets. Use a hyphen inside a character set to specify the -characters' range. The order of the character range inside square brackets +characters' range. 
The order of the character range inside the square brackets doesn't matter. For example, the regular expression `[Tt]he` means: an uppercase `T` or lowercase `t`, followed by the letter `h`, followed by the letter `e`. @@ -160,7 +161,7 @@ doesn't matter. For example, the regular expression `[Tt]he` means: an uppercase [Test the regular expression](https://regex101.com/r/2ITLQ4/1) A period inside a character set, however, means a literal period. The regular -expression `ar[.]` means: a lowercase character `a`, followed by letter `r`, +expression `ar[.]` means: a lowercase character `a`, followed by the letter `r`, followed by a period `.` character.
@@ -169,7 +170,7 @@ followed by a period `.` character.
 
 [Test the regular expression](https://regex101.com/r/wL3xtE/1)
 
-### 2.2.1 Negated character set
+### 2.2.1 Negated Character Sets
 
 In general, the caret symbol represents the start of the string, but when it is
 typed after the opening square bracket it negates the character set. For
@@ -184,14 +185,14 @@ followed by the character `a`, followed by the letter `r`.
 
 ## 2.3 Repetitions
 
-Following meta characters `+`, `*` or `?` are used to specify how many times a
+The meta characters `+`, `*` or `?` are used to specify how many times a
 subpattern can occur. These meta characters act differently in different
 situations.
 
 ### 2.3.1 The Star
 
-The symbol `*` matches zero or more repetitions of the preceding matcher. The
-regular expression `a*` means: zero or more repetitions of preceding lowercase
+The `*` symbol matches zero or more repetitions of the preceding matcher. The
+regular expression `a*` means: zero or more repetitions of the preceding lowercase
 character `a`. But if it appears after a character set or class then it finds
 the repetitions of the whole character set. For example, the regular expression
 `[a-z]*` means: any number of lowercase letters in a row.
@@ -205,8 +206,8 @@ the repetitions of the whole character set. For example, the regular expression
 The `*` symbol can be used with the meta character `.` to match any string of
 characters `.*`. The `*` symbol can be used with the whitespace character `\s`
 to match a string of whitespace characters. For example, the expression
-`\s*cat\s*` means: zero or more spaces, followed by lowercase character `c`,
-followed by lowercase character `a`, followed by lowercase character `t`,
+`\s*cat\s*` means: zero or more spaces, followed by a lowercase `c`,
+followed by a lowercase `a`, followed by a lowercase `t`,
 followed by zero or more spaces.
 
 
@@ -217,10 +218,10 @@ followed by zero or more spaces.
 
 ### 2.3.2 The Plus
 
-The symbol `+` matches one or more repetitions of the preceding character. For
-example, the regular expression `c.+t` means: lowercase letter `c`, followed by
-at least one character, followed by the lowercase character `t`. It needs to be
-clarified that `t` is the last `t` in the sentence.
+The `+` symbol matches one or more repetitions of the preceding character. For
+example, the regular expression `c.+t` means: a lowercase `c`, followed by
+at least one character, followed by a lowercase `t`. It needs to be
+clarified that `t` is the last `t` in the sentence.
 
 
 "c.+t" => The fat cat sat on the mat.
@@ -230,11 +231,10 @@ clarified that `t` is the last `t` in the sentence.
 
 ### 2.3.3 The Question Mark
 
-In regular expression the meta character `?` makes the preceding character
+In regular expressions, the meta character `?` makes the preceding character
 optional. This symbol matches zero or one instance of the preceding character.
-For example, the regular expression `[T]?he` means: Optional the uppercase
-letter `T`, followed by the lowercase character `h`, followed by the lowercase
-character `e`.
+For example, the regular expression `[T]?he` means: an optional uppercase
+`T`, followed by a lowercase `h`, followed by a lowercase `e`.
 
 
 "[T]he" => The car is parked in the garage.
@@ -250,10 +250,10 @@ character `e`.
 
 ## 2.4 Braces
 
-In regular expression braces that are also called quantifiers are used to
+In regular expressions, braces (also called quantifiers) are used to
 specify the number of times that a character or a group of characters can be
 repeated. For example, the regular expression `[0-9]{2,3}` means: Match at least
-2 digits but not more than 3 (characters in the range of 0 to 9).
+2 digits, but not more than 3, ranging from 0 to 9.
 
 
 "[0-9]{2,3}" => The number was 9.9997 but we rounded it off to 10.0.
@@ -262,7 +262,7 @@ repeated. For example, the regular expression `[0-9]{2,3}` means: Match at least
 [Test the regular expression](https://regex101.com/r/juM86s/1)
 
 We can leave out the second number. For example, the regular expression
-`[0-9]{2,}` means: Match 2 or more digits. If we also remove the comma the
+`[0-9]{2,}` means: Match 2 or more digits. If we also remove the comma, the
 regular expression `[0-9]{3}` means: Match exactly 3 digits.
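
As a rough illustration of the three brace forms, the sketch below runs each one against the sentence from the example above, using Python's `re.findall` (the choice of Python is an assumption; any regex-capable language would do).

```python
import re

text = "The number was 9.9997 but we rounded it off to 10.0."

# {2,3}: at least 2 digits, at most 3
two_to_three = re.findall(r"[0-9]{2,3}", text)
# {2,}: 2 or more digits, no upper bound
two_or_more = re.findall(r"[0-9]{2,}", text)
# {3}: exactly 3 digits
exactly_three = re.findall(r"[0-9]{3}", text)

print(two_to_three)   # ['999', '10']
print(two_or_more)    # ['9997', '10']
print(exactly_three)  # ['999']
```

Note that a lone digit such as the first `9` or the final `0` never matches, because every form here requires at least two digits.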
 
 
@@ -277,16 +277,16 @@ regular expression `[0-9]{3}` means: Match exactly 3 digits.
 
 [Test the regular expression](https://regex101.com/r/Sivu30/1)
 
-## 2.5 Capturing Group
+## 2.5 Capturing Groups
 
-A capturing group is a group of sub-patterns that is written inside Parentheses 
-`(...)`. Like as we discussed before that in regular expression if we put a quantifier 
-after a character then it will repeat the preceding character. But if we put quantifier
+A capturing group is a group of sub-patterns that is written inside parentheses 
+`(...)`. As discussed before, in regular expressions, if we put a quantifier 
+after a character then it will repeat the preceding character. But if we put a quantifier
 after a capturing group then it repeats the whole capturing group. For example,
 the regular expression `(ab)*` matches zero or more repetitions of the character
-"ab". We can also use the alternation `|` meta character inside capturing group.
-For example, the regular expression `(c|g|p)ar` means: lowercase character `c`,
-`g` or `p`, followed by character `a`, followed by character `r`.
+"ab". We can also use the alternation `|` meta character inside a capturing group.
+For example, the regular expression `(c|g|p)ar` means: a lowercase `c`,
+`g` or `p`, followed by `a`, followed by `r`.
 
 
 "(c|g|p)ar" => The car is parked in the garage.
@@ -294,15 +294,15 @@ For example, the regular expression `(c|g|p)ar` means: lowercase character `c`,
 
 [Test the regular expression](https://regex101.com/r/tUxrBG/1)
 
-Note that capturing groups do not only match but also capture the characters for use in 
-the parent language. The parent language could be python or javascript or virtually any
+Note that capturing groups do not only match, but also capture, the characters for use in 
+the parent language. The parent language could be Python or JavaScript or virtually any
 language that implements regular expressions in a function definition.
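
To make "capture for use in the parent language" concrete, here is a small sketch with Python as the parent language. The variable names are illustrative only.

```python
import re

text = "The car is parked in the garage."
pattern = re.compile(r"(c|g|p)ar")

# group(0) is the whole match; group(1) is what the first
# capturing group captured.
whole = [m.group(0) for m in pattern.finditer(text)]
captured = [m.group(1) for m in pattern.finditer(text)]

print(whole)     # ['car', 'par', 'gar']
print(captured)  # ['c', 'p', 'g']
```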
 
-### 2.5.1 Non-capturing group
+### 2.5.1 Non-Capturing Groups
 
-A non-capturing group is a capturing group that only matches the characters, but 
+A non-capturing group is a capturing group that matches the characters but 
 does not capture the group. A non-capturing group is denoted by a `?` followed by a `:` 
-within parenthesis `(...)`. For example, the regular expression `(?:c|g|p)ar` is similar to 
+within parentheses `(...)`. For example, the regular expression `(?:c|g|p)ar` is similar to 
 `(c|g|p)ar` in that it matches the same characters but will not create a capture group.
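
One way to see the difference, sketched in Python: `re.findall` returns the group contents when the pattern has a capturing group, and the whole match when it has none. This behaviour is specific to Python's `re` module; other languages expose captures differently.

```python
import re

text = "The car is parked in the garage."

# With a capturing group, findall returns what the group captured.
with_capture = re.findall(r"(c|g|p)ar", text)
# With a non-capturing group, findall returns the whole match.
without_capture = re.findall(r"(?:c|g|p)ar", text)

print(with_capture)     # ['c', 'p', 'g']
print(without_capture)  # ['car', 'par', 'gar']

# The compiled pattern reports how many capturing groups it has.
print(re.compile(r"(c|g|p)ar").groups)    # 1
print(re.compile(r"(?:c|g|p)ar").groups)  # 0
```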
 
 
@@ -319,13 +319,13 @@ See also [4. Lookaround](#4-lookaround).
 
 In a regular expression, the vertical bar `|` is used to define alternation.
 Alternation is like an OR statement between multiple expressions. Now, you may be
-thinking that character set and alternation works the same way. But the big
-difference between character set and alternation is that character set works on
-character level but alternation works on expression level. For example, the
-regular expression `(T|t)he|car` means: either (uppercase character `T` or lowercase
-`t`, followed by lowercase character `h`, followed by lowercase character `e`) OR
-(lowercase character `c`, followed by lowercase character `a`, followed by
-lowercase character `r`). Note that I put the parentheses for clarity, to show that either expression
+thinking that character sets and alternation work the same way. But the big
+difference between character sets and alternation is that character sets work at the
+character level but alternation works at the expression level. For example, the
+regular expression `(T|t)he|car` means: either (an uppercase `T` or a lowercase
+`t`, followed by a lowercase `h`, followed by a lowercase `e`) OR
+(a lowercase `c`, followed by a lowercase `a`, followed by
+a lowercase `r`). Note that I included the parentheses for clarity, to show that either expression
 in parentheses can be met and it will match.
 
 
@@ -334,17 +334,15 @@ in parentheses can be met and it will match.
 
 [Test the regular expression](https://regex101.com/r/fBXyX0/1)
 
-## 2.7 Escaping special character
+## 2.7 Escaping Special Characters
 
-Backslash `\` is used in regular expression to escape the next character. This
-allows us to specify a symbol as a matching character including reserved
-characters `{ } [ ] / \ + * . $ ^ | ?`. To use a special character as a matching
-character prepend `\` before it.
+A backslash `\` is used in regular expressions to escape the next character. This
+allows us to include reserved characters such as `{ } [ ] / \ + * . $ ^ | ?` as matching characters. To use one of these special characters as a matching character, prepend it with `\`.
 
-For example, the regular expression `.` is used to match any character except
-newline. Now to match `.` in an input string the regular expression
-`(f|c|m)at\.?` means: lowercase letter `f`, `c` or `m`, followed by lowercase
-character `a`, followed by lowercase letter `t`, followed by optional `.`
+For example, the regular expression `.` is used to match any character except a
+newline. Now, to match `.` in an input string, the regular expression
+`(f|c|m)at\.?` means: a lowercase `f`, `c` or `m`, followed by a lowercase
+`a`, followed by a lowercase `t`, followed by an optional `.`
 character.
 
 
@@ -357,20 +355,20 @@ character.
 
 In regular expressions, we use anchors to check if the matching symbol is the
 starting symbol or ending symbol of the input string. Anchors are of two types:
-First type is Caret `^` that check if the matching character is the start
-character of the input and the second type is Dollar `$` that checks if matching
+The first type is the caret `^` that check if the matching character is the first
+character of the input and the second type is the dollar sign `$` which checks if a matching
 character is the last character of the input string.
 
-### 2.8.1 Caret
+### 2.8.1 The Caret
 
-Caret `^` symbol is used to check if matching character is the first character
-of the input string. If we apply the following regular expression `^a` (if a is
-the starting symbol) to input string `abc` it matches `a`. But if we apply
-regular expression `^b` on above input string it does not match anything.
-Because in input string `abc` "b" is not the starting symbol. Let's take a look
-at another regular expression `^(T|t)he` which means: uppercase character `T` or
-lowercase character `t` is the start symbol of the input string, followed by
-lowercase character `h`, followed by lowercase character `e`.
+The caret symbol `^` is used to check if a matching character is the first character
+of the input string. If we apply the following regular expression `^a` (meaning 'a' must be
+the starting character) to the string `abc`, it will match `a`. But if we apply
+the regular expression `^b` to the above string, it will not match anything,
+because in the string `abc`, the "b" is not the starting character. Let's take a look
+at another regular expression `^(T|t)he` which means: an uppercase `T` or
+a lowercase `t` must be the first character in the string, followed by a
+lowercase `h`, followed by a lowercase `e`.
 
 
 "(T|t)he" => The car is parked in the garage.
@@ -384,12 +382,12 @@ lowercase character `h`, followed by lowercase character `e`.
 
 [Test the regular expression](https://regex101.com/r/jXrKne/1)
 
-### 2.8.2 Dollar
+### 2.8.2 The Dollar Sign
 
-Dollar `$` symbol is used to check if matching character is the last character
-of the input string. For example, regular expression `(at\.)$` means: a
-lowercase character `a`, followed by lowercase character `t`, followed by a `.`
-character and the matcher must be end of the string.
+The dollar sign `$` is used to check if a matching character is the last character
+in the string. For example, the regular expression `(at\.)$` means: a
+lowercase `a`, followed by a lowercase `t`, followed by a `.`
+character and the matcher must be at the end of the string.
 
 
 "(at\.)" => The fat cat. sat. on the mat.
@@ -405,30 +403,29 @@ character and the matcher must be end of the string.
 
 ##  3. Shorthand Character Sets
 
-Regular expression provides shorthands for the commonly used character sets,
-which offer convenient shorthands for commonly used regular expressions. The
-shorthand character sets are as follows:
+There are a number of convenient shorthands for commonly used character sets/
+regular expressions:
 
 |Shorthand|Description|
 |:----:|----|
 |.|Any character except new line|
 |\w|Matches alphanumeric characters: `[a-zA-Z0-9_]`|
 |\W|Matches non-alphanumeric characters: `[^\w]`|
-|\d|Matches digit: `[0-9]`|
-|\D|Matches non-digit: `[^\d]`|
-|\s|Matches whitespace character: `[\t\n\f\r\p{Z}]`|
-|\S|Matches non-whitespace character: `[^\s]`|
+|\d|Matches digits: `[0-9]`|
+|\D|Matches non-digits: `[^\d]`|
+|\s|Matches whitespace characters: `[\t\n\f\r\p{Z}]`|
+|\S|Matches non-whitespace characters: `[^\s]`|
 
-## 4. Lookaround
+## 4. Lookarounds
 
-Lookbehind and lookahead (also called lookaround) are specific types of
-***non-capturing groups*** (used to match the pattern but not included in matching
-list). Lookarounds are used when we have the condition that this pattern is
-preceded or followed by another certain pattern. For example, we want to get all
-numbers that are preceded by `$` character from the following input string
-`$4.44 and $10.88`. We will use following regular expression `(?<=\$)[0-9\.]*`
-which means: get all the numbers which contain `.` character and  are preceded
-by `$` character. Following are the lookarounds that are used in regular
+Lookbehinds and lookaheads (also called lookarounds) are specific types of
+***non-capturing groups*** (used to match a pattern but without including it in the matching
+list). Lookarounds are used when a pattern must be
+preceded or followed by another pattern. For example, imagine we want to get all
+numbers that are preceded by the `$` character from the string
+`$4.44 and $10.88`. We will use the following regular expression `(?<=\$)[0-9\.]*`
+which means: get all the numbers which contain the `.` character and are preceded
+by the `$` character. These are the lookarounds that are used in regular
 expressions:
 
 |Symbol|Description|
@@ -438,18 +435,18 @@ expressions:
 |?<=|Positive Lookbehind|
 |?
 "(T|t)he(?=\sfat)" => The fat cat sat on the mat.
@@ -457,15 +454,14 @@ or `the` which are followed by the word `fat`.
 
 [Test the regular expression](https://regex101.com/r/IDDARt/1)
 
-### 4.2 Negative Lookahead
+### 4.2 Negative Lookaheads
 
-Negative lookahead is used when we need to get all matches from input string
-that are not followed by a pattern. Negative lookahead is defined same as we define
-positive lookahead but the only difference is instead of equal `=` character we
-use negation `!` character i.e. `(?!...)`. Let's take a look at the following
+Negative lookaheads are used when we need to get all matches from an input string
+that are not followed by a certain pattern. A negative lookahead is written the same way as a
+positive lookahead. The only difference is, instead of an equals sign `=`, we
+use an exclamation mark `!` to indicate negation i.e. `(?!...)`. Let's take a look at the following
 regular expression `(T|t)he(?!\sfat)` which means: get all `The` or `the` words
-from input string that are not followed by the word `fat` precedes by a space
-character.
+from the input string that are not followed by a space character and the word `fat`.
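
The two lookahead forms can be compared side by side. The following is a minimal sketch in Python; recording the match position is an addition made here so that the two results are easy to tell apart.

```python
import re

text = "The fat cat sat on the mat."

# Positive lookahead: "The"/"the" only when followed by " fat".
positive = [(m.group(0), m.start())
            for m in re.finditer(r"(T|t)he(?=\sfat)", text)]
# Negative lookahead: "The"/"the" only when NOT followed by " fat".
negative = [(m.group(0), m.start())
            for m in re.finditer(r"(T|t)he(?!\sfat)", text)]

print(positive)  # [('The', 0)]
print(negative)  # [('the', 19)]
```

In both cases the lookahead text itself is not part of the match: only `The` or `the` is returned.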
 
 
 "(T|t)he(?!\sfat)" => The fat cat sat on the mat.
@@ -473,12 +469,12 @@ character.
 
 [Test the regular expression](https://regex101.com/r/V32Npg/1)
 
-### 4.3 Positive Lookbehind
+### 4.3 Positive Lookbehinds
 
-Positive lookbehind is used to get all the matches that are preceded by a
-specific pattern. Positive lookbehind is denoted by `(?<=...)`. For example, the
+Positive lookbehinds are used to get all the matches that are preceded by a
+specific pattern. Positive lookbehinds are written `(?<=...)`. For example, the
 regular expression `(?<=(T|t)he\s)(fat|mat)` means: get all `fat` or `mat` words
-from input string that are after the word `The` or `the`.
+from the input string that come after the word `The` or `the`.
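
A short Python sketch of positive lookbehind follows. It also runs the `(?<=\$)[0-9\.]*` pattern from the start of this section. One caveat: Python's `re` module only accepts fixed-width lookbehind patterns, which both of these happen to be.

```python
import re

text = "The fat cat sat on the mat."

# "fat" or "mat" only when preceded by "The " or "the ".
words = [m.group(0)
         for m in re.finditer(r"(?<=(T|t)he\s)(fat|mat)", text)]
print(words)  # ['fat', 'mat']

# Numbers preceded by a dollar sign; the "$" is not part of the match.
amounts = re.findall(r"(?<=\$)[0-9\.]*", "$4.44 and $10.88")
print(amounts)  # ['4.44', '10.88']
```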
 
 
 "(?<=(T|t)he\s)(fat|mat)" => The fat cat sat on the mat.
@@ -486,11 +482,11 @@ from input string that are after the word `The` or `the`.
 
 [Test the regular expression](https://regex101.com/r/avH165/1)
 
-### 4.4 Negative Lookbehind
+### 4.4 Negative Lookbehinds
 
-Negative lookbehind is used to get all the matches that are not preceded by a
-specific pattern. Negative lookbehind is denoted by `(?
@@ -507,17 +503,17 @@ integral part of the RegExp.
 
 |Flag|Description|
 |:----:|----|
-|i|Case insensitive: Sets matching to be case-insensitive.|
-|g|Global Search: Search for a pattern throughout the input string.|
-|m|Multiline: Anchor meta character works on each line.|
+|i|Case insensitive: Match will be case-insensitive.|
+|g|Global Search: Match all instances, not just the first.|
+|m|Multiline: Anchor meta characters work on each line.|
 
 ### 5.1 Case Insensitive
 
 The `i` modifier is used to perform case-insensitive matching. For example, the
-regular expression `/The/gi` means: uppercase letter `T`, followed by lowercase
-character `h`, followed by character `e`. And at the end of regular expression
+regular expression `/The/gi` means: an uppercase `T`, followed by a lowercase
+`h`, followed by an `e`. And at the end of the regular expression
 the `i` flag tells the regular expression engine to ignore the case. As you can
-see we also provided `g` flag because we want to search for the pattern in the
+see, we also provided the `g` flag because we want to search for the pattern in the
 whole input string.
 
 
@@ -532,13 +528,13 @@ whole input string.
 
 [Test the regular expression](https://regex101.com/r/ahfiuh/1)
 
-### 5.2 Global search
+### 5.2 Global Search
 
-The `g` modifier is used to perform a global match (find all matches rather than
+The `g` modifier is used to perform a global match (finds all matches rather than
 stopping after the first match). For example, the regular expression`/.(at)/g`
-means: any character except new line, followed by lowercase character `a`,
-followed by lowercase character `t`. Because we provided `g` flag at the end of
-the regular expression now it will find all matches in the input string, not just the first one (which is the default behavior).
+means: any character except a new line, followed by a lowercase `a`,
+followed by a lowercase `t`. Because we provided the `g` flag at the end of
+the regular expression, it will now find all matches in the input string, not just the first one (which is the default behavior).
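
The `/.../g` and `/.../i` notation is the JavaScript-style literal syntax. As a hedged sketch of the same ideas in Python, which has no `g` flag: `re.search` stops at the first match, `re.findall` returns all of them, and case-insensitivity is requested with `re.IGNORECASE`. The pattern is simplified to `.at` here so that `findall` returns whole matches.

```python
import re

text = "The fat cat sat on the mat."

# Without "global" behaviour: only the first match.
first = re.search(r".at", text).group(0)
# With "global" behaviour: every match in the string.
all_matches = re.findall(r".at", text)
# Case-insensitive matching, comparable to the `i` flag.
case_insensitive = re.findall(r"the", text, flags=re.IGNORECASE)

print(first)             # fat
print(all_matches)       # ['fat', 'cat', 'sat', 'mat']
print(case_insensitive)  # ['The', 'the']
```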
 
 
 "/.(at)/" => The fat cat sat on the mat.
@@ -554,12 +550,12 @@ the regular expression now it will find all matches in the input string, not jus
 
 ### 5.3 Multiline
 
-The `m` modifier is used to perform a multi-line match. As we discussed earlier
-anchors `(^, $)` are used to check if pattern is the beginning of the input or
-end of the input string. But if we want that anchors works on each line we use
-`m` flag. For example, the regular expression `/at(.)?$/gm` means: lowercase
-character `a`, followed by lowercase character `t`, optionally anything except
-new line. And because of `m` flag now regular expression engine matches pattern
+The `m` modifier is used to perform a multi-line match. As we discussed earlier,
+anchors `(^, $)` are used to check if a pattern is at the beginning of the input or
+the end. But if we want the anchors to work on each line, we use
+the `m` flag. For example, the regular expression `/at(.)?$/gm` means: a lowercase
+`a`, followed by a lowercase `t` and, optionally, anything except
+a new line. And because of the `m` flag, the regular expression engine now matches patterns
 at the end of each line in a string.
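
The effect of the `m` flag can be sketched in Python with `re.MULTILINE`. The three-line input below is an assumption modelled on the sentence used throughout this guide, and the pattern is written as `at.?$` (without the group) so that `findall` returns whole matches.

```python
import re

# Three-line input (an assumption for this sketch).
text = "The fat\ncat sat\non the mat."

# Without the flag, `$` only matches at the end of the whole string.
single_line = re.findall(r"at.?$", text)
# With the flag, `$` also matches at the end of each line.
multi_line = re.findall(r"at.?$", text, flags=re.MULTILINE)

print(single_line)  # ['at.']
print(multi_line)   # ['at', 'at', 'at.']
```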
 
 
@@ -578,9 +574,9 @@ at the end of each line in a string.
 
 [Test the regular expression](https://regex101.com/r/E88WE2/1)
 
-## 6. Greedy vs lazy matching
-By default regex will do greedy matching which means it will match as long as
-possible. We can use `?` to match in lazy way which means as short as possible.
+## 6. Greedy vs Lazy Matching
+By default, a regex will perform a greedy match, which means the match will be as long as
+possible. We can use `?` to match in a lazy way, which means the match should be as short as possible.
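
A minimal Python sketch of the difference, using the same sentence as the example that follows:

```python
import re

text = "The fat cat sat on the mat."

# Greedy: `.*` grabs as much as possible before the final "at".
greedy = re.search(r"(.*at)", text).group(1)
# Lazy: `.*?` grabs as little as possible before the first "at".
lazy = re.search(r"(.*?at)", text).group(1)

print(greedy)  # The fat cat sat on the mat
print(lazy)    # The fat
```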
 
 
 "/(.*at)/" => The fat cat sat on the mat. 
@@ -597,7 +593,7 @@ possible. We can use `?` to match in lazy way which means as short as possible. ## Contribution -* Open pull request with improvements +* Open a pull request with improvements * Discuss ideas in issues * Spread the word * Reach out with any feedback [![Twitter URL](https://img.shields.io/twitter/url/https/twitter.com/ziishaned.svg?style=social&label=Follow%20%40ziishaned)](https://twitter.com/ziishaned) From 060cf5c5143552aa00fcd79e844e4cc3638bb103 Mon Sep 17 00:00:00 2001 From: Tom McAndrew <42588609+tommcandrew@users.noreply.github.com> Date: Tue, 10 Mar 2020 17:13:25 +0000 Subject: [PATCH 18/24] Correct grammar in intro title --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 6d534fb..055550b 100644 --- a/README.md +++ b/README.md @@ -31,7 +31,7 @@ * [Tiếng Việt](translations/README-vn.md) * [فارسی](translations/README-fa.md) -## What is Regular Expression? +## What are Regular Expressions? > A regular expression is a group of characters or symbols which is used to find a specific pattern in a text. From 41e1eefca7c196f9265cacbca59480bad021ac89 Mon Sep 17 00:00:00 2001 From: Tom McAndrew <42588609+tommcandrew@users.noreply.github.com> Date: Tue, 10 Mar 2020 17:14:48 +0000 Subject: [PATCH 19/24] Make section 2.1 title singular --- README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index 055550b..75d0639 100644 --- a/README.md +++ b/README.md @@ -60,7 +60,7 @@ letter and also it is too short. - [Basic Matchers](#1-basic-matchers) - [Meta Characters](#2-meta-characters) - - [Full Stops](#21-full-stops) + - [The Full Stop](#21-the-full-stops) - [Character Sets](#22-character-sets) - [Negated Character Sets](#221-negated-character-sets) - [Repetitions](#23-repetitions) @@ -133,7 +133,7 @@ square brackets. 
The meta characters are as follows: |^|Matches the beginning of the input.| |$|Matches the end of the input.| -## 2.1 Full Stops +## 2.1 The Full Stop The full stop `.` is the simplest example of a meta character. The meta character `.` matches any single character. It will not match return or newline characters. From f7e4c53376f90651a656bbbd098aada82d064fe7 Mon Sep 17 00:00:00 2001 From: Tom McAndrew <42588609+tommcandrew@users.noreply.github.com> Date: Tue, 10 Mar 2020 17:17:18 +0000 Subject: [PATCH 20/24] Make section 4 titles singular --- README.md | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/README.md b/README.md index 75d0639..4ac6e0f 100644 --- a/README.md +++ b/README.md @@ -77,10 +77,10 @@ letter and also it is too short. - [The Dollar Sign](#282-the-dollar-sign) - [Shorthand Character Sets](#3-shorthand-character-sets) - [Lookarounds](#4-lookarounds) - - [Positive Lookaheads](#41-positive-lookaheads) - - [Negative Lookaheads](#42-negative-lookaheads) - - [Positive Lookbehinds](#43-positive-lookbehinds) - - [Negative Lookbehinds](#44-negative-lookbehinds) + - [Positive Lookahead](#41-positive-lookahead) + - [Negative Lookahead](#42-negative-lookahead) + - [Positive Lookbehind](#43-positive-lookbehind) + - [Negative Lookbehind](#44-negative-lookbehind) - [Flags](#5-flags) - [Case Insensitive](#51-case-insensitive) - [Global Search](#52-global-search) @@ -435,7 +435,7 @@ expressions: |?<=|Positive Lookbehind| |? Date: Wed, 18 Mar 2020 18:06:14 +0900 Subject: [PATCH 21/24] Update README-ko.md --- translations/README-ko.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/translations/README-ko.md b/translations/README-ko.md index d4c9c0b..6d29495 100644 --- a/translations/README-ko.md +++ b/translations/README-ko.md @@ -77,7 +77,7 @@ - [대소문자 구분없음](#51-대소문자-구분없음) - [전체 검색](#52-전체-검색) - [멀티 라인](#53-멀티-라인) -- [탐욕적 vs 게으른 매칭](#6-탐욕적-vs-게으른 매칭) +- [탐욕적 vs 게으른 매칭](#6-탐욕적-vs-게으른-매칭) ## 1. 
기본 매쳐 From ceb3d3bd74601f72d381fa9c993eae2157dc8cbc Mon Sep 17 00:00:00 2001 From: cuiyaocy Date: Wed, 1 Apr 2020 09:29:09 +0800 Subject: [PATCH 22/24] modify the desc of {n,m} --- translations/README-cn.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/translations/README-cn.md b/translations/README-cn.md index 55984ea..6ff3f4f 100644 --- a/translations/README-cn.md +++ b/translations/README-cn.md @@ -119,7 +119,7 @@ |*|匹配>=0个重复的在*号之前的字符。| |+|匹配>=1个重复的+号前的字符。 |?|标记?之前的字符为可选.| -|{n,m}|匹配num个大括号之间的字符 (n <= num <= m).| +|{n,m}|匹配num个大括号之前的字符或字符集 (n <= num <= m).| |(xyz)|字符集,匹配与 xyz 完全相等的字符串.| |||或运算符,匹配符号前或后的字符.| |\|转义字符,用于匹配一些保留的字符 [ ] ( ) { } . * + ? ^ $ \ || From e3945a4cbcdf68baf8f6a734642ed3ca4f55e7be Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ja=CC=81nos=20Orcsik?= Date: Tue, 30 Jun 2020 00:16:07 +0200 Subject: [PATCH 23/24] fix typo --- README.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/README.md b/README.md index 1428d17..55da5ce 100644 --- a/README.md +++ b/README.md @@ -32,7 +32,7 @@ * [Tiếng Việt](translations/README-vn.md) * [فارسی](translations/README-fa.md) -## What are Regular Expressions? +## What is Regular Expression? > A regular expression is a group of characters or symbols which is used to find a specific pattern in a text. @@ -280,7 +280,7 @@ regular expression `[0-9]{3}` means: Match exactly 3 digits. ## 2.5 Capturing Groups -A capturing group is a group of sub-patterns that is written inside parentheses +A capturing group is a group of subpatterns that is written inside parentheses `(...)`. As discussed before, in regular expressions, if we put a quantifier after a character then it will repeat the preceding character. But if we put a quantifier after a capturing group then it repeats the whole capturing group. For example, @@ -356,7 +356,7 @@ character. In regular expressions, we use anchors to check if the matching symbol is the starting symbol or ending symbol of the input string. 
Anchors are of two types: -The first type is the caret `^` that check if the matching character is the first +The first type is the caret `^` that checks if the matching character is the first character of the input and the second type is the dollar sign `$` which checks if a matching character is the last character of the input string. From 006b7a1a984b5f8cbd5dc6a76373cad123ff042e Mon Sep 17 00:00:00 2001 From: Daniel de Andrade Lopes Date: Sat, 4 Jul 2020 08:58:16 -0300 Subject: [PATCH 24/24] Update Readme.md I see some errors in writing and in one expression, so I open this PR to improvements. --- translations/README-pt_BR.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/translations/README-pt_BR.md b/translations/README-pt_BR.md index 774cd21..1c8e474 100644 --- a/translations/README-pt_BR.md +++ b/translations/README-pt_BR.md @@ -36,7 +36,7 @@ > Expressão Regular é um grupo de caracteres ou símbolos utilizado para encontrar um padrão específico a partir de um texto. -Uma expressão regular é um padrão que é comparado com uma cadeia de caracteres da esquerda para a direita. A expressão "Expressão regular" é longa e difícil de falar; você geralmente vai encontrar o termo abreviado como "regex" ou "regexp". Expressões regulares são usadas para substituir um texto dentro de uma string, validar formulários, extrair uma parte de uma string baseada em um padrão encontrado e muito mais. +Uma expressão regular é um padrão que é comparado com uma cadeia de caracteres da esquerda para a direita. O termo "Expressão regular" é longo e difícil de falar; você geralmente vai encontrar o termo abreviado como "regex" ou "regexp". Expressões regulares são usadas para substituir um texto dentro de uma string, validar formulários, extrair uma parte de uma string baseada em um padrão encontrado e muito mais. Imagine que você está escrevendo uma aplicação e quer colocar regras para quando um usuário escolher seu username. 
Nós queremos permitir que o username contenha letras, números, underlines e hífens. Nós também queremos limitar o número de caracteres para não ficar muito feio. Então usamos a seguinte expressão regular para validar o username: @@ -307,7 +307,7 @@ As expressões regulares fornecem abreviações para conjuntos de caracteres com ## 4. Olhar ao Redor -Lookbehind (olhar atrás) e lookahead (olhar à frente), às vezes conhecidos como lookarounds (olhar ao redor), são tipos específicos de ***grupo de não captura*** (utilizado para encontrar um padrão, mas não incluí-lo na lista de ocorrêncoas). Lookarounds são usados quando temos a condição de que determinado padrão seja precedido ou seguido de outro padrão. Por exemplo, queremos capturar todos os números precedidos do caractere `$` da seguinte string de entrada: `$4.44 and $10.88`. Vamos usar a seguinte expressão regular `(?<=\$)[0-9\.]*` que significa: procure todos os números que contêm o caractere `.` e são precedidos pelo caractere `$`. A seguir estão os lookarounds que são utilizados em expressões regulares: +Lookbehind (olhar atrás) e lookahead (olhar à frente), às vezes conhecidos como lookarounds (olhar ao redor), são tipos específicos de ***grupo de não captura*** (utilizado para encontrar um padrão, mas não incluí-lo na lista de ocorrências). Lookarounds são usados quando temos a condição de que determinado padrão seja precedido ou seguido de outro padrão. Por exemplo, queremos capturar todos os números precedidos do caractere `$` da seguinte string de entrada: `$4.44 and $10.88`. Vamos usar a seguinte expressão regular `(?<=\$)[0-9\.]*` que significa: procure todos os números que contêm o caractere `.` e são precedidos pelo caractere `$`. 
A seguir estão os lookarounds que são utilizados em expressões regulares: |Símbolo|Descrição| |:----:|----| @@ -318,7 +318,7 @@ Lookbehind (olhar atrás) e lookahead (olhar à frente), às vezes conhecidos co ### 4.1 Lookahead Positivo -O lookahead positivo impõe que a primeira parte da expressão deve ser seguida pela expressão lookahead. A combinação retornada contém apenas o texto que encontrado pela primeira parte da expressão. Para definir um lookahead positivo, deve-se usar parênteses. Dentro desses parênteses, é usado um ponto de interrogação seguido de um sinal de igual, dessa forma: `(?=...)`. Expressões lookahead são escritas depois do sinal de igual dentro do parênteses. Por exemplo, a expressão regular `[T|t]he(?=\sfat)` significa: encontre a letra minúscula `t` ou a letra maiúscula `T`, seguida da letra `h`, seguida da letra `e`. Entre parênteses, nós definimos o lookahead positivo que diz para o motor de expressões regulares para encontrar `The` ou `the` que são seguidos pela palavra `fat`. +O lookahead positivo impõe que a primeira parte da expressão deve ser seguida pela expressão lookahead. A combinação retornada contém apenas o texto que é encontrado pela primeira parte da expressão. Para definir um lookahead positivo, deve-se usar parênteses. Dentro desses parênteses, é usado um ponto de interrogação seguido de um sinal de igual, dessa forma: `(?=...)`. Expressões lookahead são escritas depois do sinal de igual dentro do parênteses. Por exemplo, a expressão regular `[T|t]he(?=\sfat)` significa: encontre a letra minúscula `t` ou a letra maiúscula `T`, seguida da letra `h`, seguida da letra `e`. Entre parênteses, nós definimos o lookahead positivo que diz para o motor de expressões regulares para encontrar `The` ou `the` que são seguidos pela palavra `fat`.
 "[T|t]he(?=\sfat)" => The fat cat sat on the mat.
@@ -400,7 +400,7 @@ O modificador `g` é usado para realizar uma busca global (encontrar todas as oc
 
 ### 5.3 Multilinhas
 
-O modificador `m` é usado para realizar uma busca em várias linhas. Como falamos antes, as âncoras `(^, $)` são usadas para verificar se o padrão está no início ou no final da string de entrada. Mas se queremos que as âncoras funcionem em cada uma das linhas, usamos a flag `m`. Por exemplo, a expressão regular `/at(.)?$/gm` significa: o caractere minúsculo `a`, seguido do caractere minúsculo `t`, opcionalmente seguido por qualquer caractere, exceto nova linha. E por causa da flag `m`, agora o motor de expressões regulares encontra o padrão no final de cada uma das linhas da string.
+O modificador `m` é usado para realizar uma busca em várias linhas. Como falamos antes, as âncoras `(^, $)` são usadas para verificar se o padrão está no início ou no final da string de entrada respectivamente. Mas se queremos que as âncoras funcionem em cada uma das linhas, usamos a flag `m`. Por exemplo, a expressão regular `/.at(.)?$/gm` significa: o caractere minúsculo `a`, seguido do caractere minúsculo `t`, opcionalmente seguido por qualquer caractere, exceto nova linha. E por causa da flag `m`, agora o motor de expressões regulares encontra o padrão no final de cada uma das linhas da string.
 
 
 "/.at(.)?$/" => The fat