diff --git a/content/en/functions/findRe.md b/content/en/functions/findRe.md index e84a0cf76..0b8978ec6 100644 --- a/content/en/functions/findRe.md +++ b/content/en/functions/findRe.md @@ -9,13 +9,25 @@ keywords: [regex] signature: - "findRE PATTERN INPUT [LIMIT]" - "strings.FindRE PATTERN INPUT [LIMIT]" -relatedfuncs: [replaceRE] +relatedfuncs: [findRESubmatch, replaceRE] --- -By default, the `findRE` function finds all matches. You can limit the number of matches with an optional LIMIT parameter. +By default, `findRE` finds all matches. You can limit the number of matches with an optional LIMIT parameter. When specifying the regular expression, use a raw [string literal] (backticks) instead of an interpreted string literal (double quotes) to simplify the syntax. With an interpreted string literal you must escape backslashes. -The syntax of the regular expression is the same general syntax used by Perl, Python, and other languages. More precisely, it is the syntax accepted by [RE2] except for `\C`. +[string literal]: https://go.dev/ref/spec#String_literals + +This function uses the [RE2] regular expression library. See the [RE2 syntax documentation] for details. Note that the RE2 `\C` escape sequence is not supported. + +[RE2]: https://github.com/google/re2/ +[RE2 syntax documentation]: https://github.com/google/re2/wiki/Syntax/ + +{{% note %}} +The RE2 syntax is a subset of that accepted by [PCRE], roughly speaking, and with various [caveats]. + +[caveats]: https://swtch.com/~rsc/regexp/regexp3.html#caveats +[PCRE]: https://www.pcre.org/ +{{% /note %}} This example returns a slice of all second level headings (`h2` elements) within the rendered `.Content`: @@ -34,25 +46,3 @@ To limit the number of matches to one: {{% note %}} You can write and test your regular expression using [regex101.com](https://regex101.com/). Be sure to select the Go flavor before you begin. 
{{% /note %}} - -## findRESubmatch - -In Hugo 0.110.0 we added a variant of `findRE` that returns a slice of strings holding the text of the leftmost match of the regular expression in s and the matches, if any, of its subexpressions. - -This: - -```go-html-template -{{ findRESubmatch `<a\s*href="(.+?)">(.+?)</a>` `<li><a href="#foo">Foo</a></li> <li><a href="#bar">Bar</a></li>` | print | safeHTML }} -``` - -Will print: - -``` -[[<a href="#foo">Foo</a> #foo Foo] [<a href="#bar">Bar</a> #bar Bar]] -``` - -{{< new-in "0.110.0" >}} - - -[RE2]: https://github.com/google/re2/wiki/Syntax -[string literal]: https://go.dev/ref/spec#String_literals diff --git a/content/en/functions/findresubmatch.md b/content/en/functions/findresubmatch.md new file mode 100644 index 000000000..ebf00f14d --- /dev/null +++ b/content/en/functions/findresubmatch.md @@ -0,0 +1,102 @@ +--- +title: findRESubmatch +description: Returns a slice of strings holding the text of the leftmost match of the regular expression and the matches, if any, of its subexpressions +categories: [functions] +menu: + docs: + parent: functions +keywords: [regex] +signature: + - "findRESubmatch PATTERN INPUT [LIMIT]" + - "strings.FindRESubmatch PATTERN INPUT [LIMIT]" +relatedfuncs: [findRE, replaceRE] +--- + +By default, `findRESubmatch` finds all matches. You can limit the number of matches with an optional LIMIT parameter. A return value of nil indicates no match. + +When specifying the regular expression, use a raw [string literal] (backticks) instead of an interpreted string literal (double quotes) to simplify the syntax. With an interpreted string literal you must escape backslashes. + +[string literal]: https://go.dev/ref/spec#String_literals + +This function uses the [RE2] regular expression library. See the [RE2 syntax documentation] for details. Note that the RE2 `\C` escape sequence is not supported. + +[RE2]: https://github.com/google/re2/ +[RE2 syntax documentation]: https://github.com/google/re2/wiki/Syntax/ + +{{% note %}} +The RE2 syntax is a subset of that accepted by [PCRE], roughly speaking, and with various [caveats].
+ +[caveats]: https://swtch.com/~rsc/regexp/regexp3.html#caveats +[PCRE]: https://www.pcre.org/ +{{% /note %}} + +## Demonstrative examples + +```go-html-template +{{ findRESubmatch `a(x*)b` "-ab-" }} → [["ab" ""]] +{{ findRESubmatch `a(x*)b` "-axxb-" }} → [["axxb" "xx"]] +{{ findRESubmatch `a(x*)b` "-ab-axb-" }} → [["ab" ""] ["axb" "x"]] +{{ findRESubmatch `a(x*)b` "-axxb-ab-" }} → [["axxb" "xx"] ["ab" ""]] +{{ findRESubmatch `a(x*)b` "-axxb-ab-" 1 }} → [["axxb" "xx"]] +``` + +## Practical example + +This markdown: + +```text +- [Example](https://example.org) +- [Hugo](https://gohugo.io) +``` + +Produces this HTML: + +```html +<ul> +<li><a href="https://example.org">Example</a></li> +<li><a href="https://gohugo.io">Hugo</a></li> +</ul> +``` + +To match the anchor elements, capturing the link destination and text: + +```go-html-template +{{ $regex := `<a\s*href="(.+?)">(.+?)</a>` }} +{{ $matches := findRESubmatch $regex .Content }} +``` + +Viewed as JSON, the data structure of `$matches` in the code above is: + +```json +[ + [ + "<a href=\"https://example.org\">Example</a>", + "https://example.org", + "Example" + ], + [ + "<a href=\"https://gohugo.io\">Hugo</a>", + "https://gohugo.io", + "Hugo" + ] +] +``` + +To render the `href` attributes: + +```go-html-template +{{ range $matches }} + {{ index . 1 }} +{{ end }} +``` + +Result: + +```text +https://example.org +https://gohugo.io +``` + +{{% note %}} +You can write and test your regular expression using [regex101.com](https://regex101.com/). Be sure to select the Go flavor before you begin. +{{% /note %}} diff --git a/content/en/functions/replacere.md b/content/en/functions/replacere.md index ef0ffb4fa..22f81a2f5 100644 --- a/content/en/functions/replacere.md +++ b/content/en/functions/replacere.md @@ -5,17 +5,29 @@ categories: [functions] menu: docs: parent: functions -keywords: [replace regex] +keywords: [regex] signature: - "replaceRE PATTERN REPLACEMENT INPUT [LIMIT]" - "strings.ReplaceRE PATTERN REPLACEMENT INPUT [LIMIT]" -relatedfuncs: [replace,findRE] +relatedfuncs: [findRE, FindRESubmatch, replace] --- -By default, the `replaceRE` function replaces all matches. 
You can limit the number of matches with an optional LIMIT parameter. +By default, `replaceRE` replaces all matches. You can limit the number of matches with an optional LIMIT parameter. When specifying the regular expression, use a raw [string literal] (backticks) instead of an interpreted string literal (double quotes) to simplify the syntax. With an interpreted string literal you must escape backslashes. -The syntax of the regular expression is the same general syntax used by Perl, Python, and other languages. More precisely, it is the syntax accepted by [RE2] except for `\C`. +[string literal]: https://go.dev/ref/spec#String_literals + +This function uses the [RE2] regular expression library. See the [RE2 syntax documentation] for details. Note that the RE2 `\C` escape sequence is not supported. + +[RE2]: https://github.com/google/re2/ +[RE2 syntax documentation]: https://github.com/google/re2/wiki/Syntax/ + +{{% note %}} +The RE2 syntax is a subset of that accepted by [PCRE], roughly speaking, and with various [caveats]. + +[caveats]: https://swtch.com/~rsc/regexp/regexp3.html#caveats +[PCRE]: https://www.pcre.org/ +{{% /note %}} This example replaces two or more consecutive hyphens with a single hyphen:
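
The hunk stops before the template example that this sentence introduces. As a rough illustration of the same operation outside Hugo, the sketch below uses Go's standard `regexp` package, whose syntax follows RE2. The helper name `collapseHyphens`, the pattern `-{2,}`, and the sample input are assumptions made for illustration and are not taken from the diff.

```go
package main

import (
	"fmt"
	"regexp"
)

// hyphenRun matches two or more consecutive hyphens.
// The pattern is an illustrative assumption, written as a raw string literal.
var hyphenRun = regexp.MustCompile(`-{2,}`)

// collapseHyphens is a hypothetical helper (not part of Hugo) that replaces
// every run of two or more hyphens in s with a single hyphen.
func collapseHyphens(s string) string {
	return hyphenRun.ReplaceAllString(s, "-")
}

func main() {
	fmt.Println(collapseHyphens("a-b--c---d")) // prints "a-b-c-d"
}
```

Compiling the pattern once at package level avoids recompiling it on every call, and the raw string literal keeps the pattern free of escaped backslashes, in line with the advice above.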