Xe Iaso ef52550e70
fix(config): remove trailing newlines in regexes (#373)
Closes #372

Fun YAML fact of the day:

What is the difference between how these two expressions are parsed?

```yaml
foo: >
  bar
```

```yaml
foo: >-
  bar
```

They are invisible in yaml, but when you evaluate them to JSON the
difference is obvious:

```json
{
  "foo": "bar\n"
}
```

```json
{
  "foo": "bar"
}
```

User-Agent strings, URL path values, and HTTP headers _do_ end in
newlines in HTTP/1.1 wire form, but that newline is usually stripped
before the server actually handles it. Also HTTP/2 is a thing and does
not terminate header values with newlines.

This change makes Anubis more aggressively detect mistaken uses of the
yaml `>` operator and nudges the user into using the yaml `>-` operator
which does not append the trailing newline.

I had honestly forgotten about this YAML behavior because it wasn't
relevant for so long. Oops! Glad I released a beta.

Whenever you get into this state, Anubis will throw a config parsing
error and then give you a message hinting at the folly of your ways.

```
config.Bot: regular expression ends with newline (try >- instead of > in yaml)
```

Big thanks to https://yaml-multiline.info, this helped me realize my
folly instantly.

@aiverson, this is official permission to say "told you so".

Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-04-26 14:01:15 +00:00
2025-03-17 19:33:07 -04:00
2025-03-19 09:10:29 -04:00
2025-04-21 00:09:27 +00:00
2025-04-21 00:09:27 +00:00
2025-03-17 19:33:07 -04:00
2025-04-25 07:53:54 -04:00
2025-04-10 00:07:14 +00:00

Anubis

A smiling chibi dark-skinned anthro jackal with brown hair and tall ears looking victorious with a thumbs-up

enbyware GitHub Issues or Pull Requests by label GitHub go.mod Go version language count repo size

Sponsors

Anubis is brought to you by sponsors and donors like:

Distrust

Overview

Anubis weighs the soul of your connection using a proof-of-work challenge in order to protect upstream resources from scraper bots.

This program is designed to help protect the small internet from the endless storm of requests that flood in from AI companies. Anubis is as lightweight as possible to ensure that everyone can afford to protect the communities closest to them.

Anubis is a bit of a nuclear response. This will result in your website being blocked from smaller scrapers and may inhibit "good bots" like the Internet Archive. You can configure bot policy definitions to explicitly allowlist them and we are working on a curated set of "known good" bots to allow for a compromise between discoverability and uptime.

In most cases, you should not need this and can probably get by using Cloudflare to protect a given origin. However, for circumstances where you can't or won't use Cloudflare, Anubis is there for you.

If you want to try this out, connect to anubis.techaro.lol.

Support

If you run into any issues running Anubis, please open an issue. Please include all the information I would need to diagnose your issue.

For live chat, please join the Patreon and ask in the Patron discord in the channel #anubis.

Star History

Star History Chart

Packaging Status

Packaging status

Contributors

Made with contrib.rocks.

Description
Weighs the soul of incoming HTTP requests using proof-of-work to stop AI crawlers
Readme MIT 19 MiB
Languages
Go 87.4%
JavaScript 5.4%
Shell 4%
templ 1.9%
CSS 0.7%
Other 0.5%