* Add check endpoint which can be used with nginx' auth_request function
* feat(cmd): allow configuring redirect domains
* test: add test environment for the nginx_auth PR
This is a full local setup of the nginx_auth PR including HTTPS so that
it's easier to validate in isolation.
This requires an install of k3s (https://k3s.io) with traefik set to
listen on localhost. This will be amended in the future but for now this
works enough to ship it.
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(cmd|lib): allow empty redirect domains variable
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(test): add space to target variable in anubis container
Signed-off-by: Xe Iaso <me@xeiaso.net>
* docs(admin): rewrite subrequest auth docs, make generic
* docs(install): document REDIRECT_DOMAINS flag
Signed-off-by: Xe Iaso <me@xeiaso.net>
* feat(lib): clamp redirects to the same HTTP host
Only if REDIRECT_DOMAINS is not set.
Signed-off-by: Xe Iaso <me@xeiaso.net>
---------
Signed-off-by: Xe Iaso <me@xeiaso.net>
Co-authored-by: Xe Iaso <me@xeiaso.net>
* feat(config): support importing bot policy snippets
This changes the grammar of the Anubis bot policy config to allow
importing from internal shared rules or external rules on the
filesystem.
This lets you create a file at `/data/policies/block-evilbot.yaml` and
then import it with:
```yaml
bots:
- import: /data/policies/block-evilbot.yaml
```
This also explodes the default policy file into a bunch of composable
snippets.
Thank you @Aibrew for your example gitea Atom / RSS feed rules!
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(data): update botPolicies.json to use imports
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(cmd/anubis): extract bot policies with --extract-resources
This allows a user that doesn't have anything but the Anubis binary to
figure out what the default configuration does.
* docs(data/botPolices.yaml): document import syntax in-line
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(lib/policy): better test importing from JSON snippets
Signed-off-by: Xe Iaso <me@xeiaso.net>
* docs(admin): Add import syntax documentation
This documents the import syntax and is based on the block comment at
the top of the default bot policy file.
* docs(changelog): add note about importing snippets
Signed-off-by: Xe Iaso <me@xeiaso.net>
* style(lib/policy/config): use an error value instead of an inline error
Signed-off-by: Xe Iaso <me@xeiaso.net>
---------
Signed-off-by: Xe Iaso <me@xeiaso.net>
* feat(cmd/anubis): compute full XFF header
this one is pretty important to not pass
through blindly, as many applications and
frameworks will trust them
* feat(cmd/anubis): skip XFF compute if remote address is loopback
* docs: update CHANGELOG
* fix: improve error handling for resource closing and JSON encoding in MakeChallenge
* chore: update CHANGELOG with recent changes and improvements
* refactor: simplify RenderIndex function and improve error handling
---------
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
This makes each check into its own type that has encapsulated check
logic, meaning that it's easier to add new checker implementations in
the future.
Signed-off-by: Xe Iaso <me@xeiaso.net>
Change the parsing of repository and tag to match the last colon. This fixes container builds when the repository already contains an earlier colon.
Signed-off-by: rayer <70722312+rayes0@users.noreply.github.com>
* feat: Add Open Graph tag support (og-tags)
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* Fix: Prevent nil pointer dereference in test (og-tags)
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* feat!: Implement Open Graph tag caching and passthrough functionality (WIP)
I'm going to sleep. currently tags are passed to renderIndex.
see https://github.com/TecharoHQ/anubis/issues/131
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* feat: Add configuration for air tool with build and logger settings
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* feat: Move OG tags to base template (og-tags)
Moves the Open Graph (OG) tags from the index template to
the base template. This allows OG tags to be set on any
page, not just the index. Also adds a
BaseWithOGTags function to the web package to allow
passing OG tags to the base template. Removes the
ogTags parameter from the Index function and template.
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* Delete CHANGELOG.md
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* feat: Add language attribute to HTML tag in template
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* fix(tests): Fix nil pointer ref
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* feat(og-tags): Add timeout to http client (og-tags)
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* style: fix line endings & indentation
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* style: add inspection comment for GoBoolExpressions in UnchangingCache
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* feat(og-tags): Implement Open Graph tag fetching and caching
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* fix(og-tags): Simplify Open Graph tag extraction logic
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* fix(og-tags): Add nil check in isOGMetaTag and enhance test cases
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* feat(og-tags): Add approved tags and prefixes for Open Graph extraction
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* test(og-tags): Update tests with approved tags and improve clarity
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* chore: Add changelog notes
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* fix: Improve stability of the target fetcher?
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* fix: Update template error handling and improve Open Graph tag integration
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* style: format files and remove deubg logs
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* feat: Credit CELPHASE for mascot design (og-tags)
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* feat: Credit CELPHASE for mascot design (og-tags)
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* feat: Allow twitter prefixed OG tags by default
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* chore: replace /tmp with /var
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* Update docs/docs/CHANGELOG.md
Co-authored-by: Xe Iaso <me@xeiaso.net>
Signed-off-by: Jason Cameron <jasoncameron.all@gmail.com>
* Update docs/docs/admin/configuration/open-graph.mdx
Co-authored-by: Xe Iaso <me@xeiaso.net>
Signed-off-by: Jason Cameron <jasoncameron.all@gmail.com>
* chore: add fediverse to default prefixes (#og-tags)
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* feat(og-tags): Remove og-query-distinct flag
This commit removes the `og-query-distinct` flag and
associated logic. URLs with different query parameters
will now always be treated as the same cache key for Open
Graph tags. This simplifies the caching logic and
improves performance.
Additionally, the http client used for fetching OG tags
is now a member of the OGTagCache struct, rather than a
global variable. This improves testability and allows
for more flexible configuration in the future.
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* Update docs/docs/admin/configuration/open-graph.mdx
Co-authored-by: Xe Iaso <me@xeiaso.net>
Signed-off-by: Jason Cameron <jasoncameron.all@gmail.com>
* docs: remove og tags references
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* refactor: rename url > u to not overlap package name
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* Update internal/ogtags/cache.go
Co-authored-by: Xe Iaso <me@xeiaso.net>
Signed-off-by: Jason Cameron <jasoncameron.all@gmail.com>
* Update internal/ogtags/cache.go
Co-authored-by: Xe Iaso <me@xeiaso.net>
Signed-off-by: Jason Cameron <jasoncameron.all@gmail.com>
* fix(tests): Don't use network when network access is disabled
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* Fix: Handle nil URL in GetOGTags (og-tags)
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* chore: sort installation docs alphabetically
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* fix(tests): validate that no duplicate requests are made
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* style(tests): remove unused ok var
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* docs: convert to table fmt
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* feat(og-tags): Enhance OG tag fetching and caching
Adds additional approved OG tags (`keywords`, `author`), improves
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* chore: update generated templ's after format
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* fix(tests): update integration_test.go to reflect the new behavior of fetchHTMLDocument
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* Revert "data/botPolicies: allow iMessage scraper by default (#178)"
This reverts commit 21a9d777
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* Fix: Simplify ogTags access in cache test.
Didn't know this was possible! wow!
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* Fix: Handle request timeouts when fetching OG tags (#og-tags)
Cache a nil result for half the TTL to avoid repeatedly
requesting a timed-out URL.
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* Fix: make OG tags passthrough option function.
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* Fix: Handle timeouts and non-200 responses when fetching OG tags (og-tags)
- Cache empty results for timeouts and non-200 status codes
to avoid spamming the server.
- Use a non-nil empty map to represent empty results in the
cache, as nil would be a cache miss.
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* feat(og-tags): switch to http.MaxBytesReader
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* chore(og-tags): add noindex, nofollow meta tag and update error line numbers
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
---------
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
Signed-off-by: Jason Cameron <jasoncameron.all@gmail.com>
Co-authored-by: Xe Iaso <me@xeiaso.net>
* fix: Correctly format listener address (https://github.com/TecharoHQ/anubis/issues/93)
Handle addresses that include a hostname, not just ports. If
the address starts with a colon, assume it's just a port and
prefix it with "http://localhost". Otherwise, prefix the
entire address with "http://". This ensures that the listener
URL is correctly formatted regardless of whether it includes
a hostname or just a port.
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* chore(docs): add changelog entry
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
---------
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* cmd/anubis: add a debug option for benchmarking hashrate
Having the ability to benchmark different proof-of-work implementations
is useful for extending Anubis. This adds a flag `--debug-benchmark-js`
(and its associated environment variable `DEBUG_BENCHMARK_JS`) for
serving a tool to do so.
Internally, a there is a new policy action, "DEBUG_BENCHMARK", which
serves the benchmarking tool instead of a challenge. The flag then
replaces all bot rules with a special rule matching every request
to that action. The benchmark page makes heavy use of inline styles,
because currently all global styles are shared across all pages. This
could be fixed, but I wanted to avoid major changes to the templates.
* web/js: add signal for aborting an active proof-of-work algorithm
Both proof-of-work algorithms now take an optional `AbortSignal`, which
immediately terminates all workers and returns `false` if aborted before
the challenge is complete.
* web/js: add algorithm comparison to the benchmark page
"Compare:" is added to the benchmark page for testing the relative
performance between two algorithms. Since benchmark runs generally have
high variance, it may take a while for the averages to converge on a
stable difference.
---------
Signed-off-by: Xe Iaso <me@xeiaso.net>
Co-authored-by: Xe Iaso <me@xeiaso.net>
* Add periodic cleanup job for DecayMap
see https://github.com/TecharoHQ/anubis/issues/8
* Refactor: Improve DecayMap cleanup tests and add Len method
- Refactored DecayMap cleanup tests to use the new Len method
for more precise assertions.
- Added a Len method to DecayMap to retrieve the number of
entries.
- Simplified conditional checks in Get method.
* chore(changelog): add entry
* fix(tests): Use Impl.expire for decaymap cleanup
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
---------
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
Using TrimRight will remove all characters from `*dockerRepo` from right
to left that match a character contained on `"/"+filepath.Base(*dockerRepo)`
(the cutset) until it doesn't matches anymore.
So for example, if `dockerRepo` is `example.com/fijxu/anubis`, and
`"/"+filepath.Base(*dockerRepo)` is `/anubis`, it will remove
`u/anubis` and not just `/anubis` from `dockerRepo` because `u` is a character inside the
cutoff.
* Change how to make Anubis work without a reverse proxy
* Apply suggestions from code review
Co-authored-by: Xe Iaso <me@xeiaso.net>
Signed-off-by: Henri Vasserman <henv@hot.ee>
* add support for unix sockets.
* add env var docs
* lib: fix tests
Signed-off-by: Xe Iaso <me@xeiaso.net>
---------
Signed-off-by: Henri Vasserman <henv@hot.ee>
Signed-off-by: Xe Iaso <me@xeiaso.net>
Co-authored-by: Xe Iaso <me@xeiaso.net>
* cmd/anubis: allow setting key bytes in flag/envvar
Docs are updated to generate a random key on load and when people press
the recycle button.
Signed-off-by: Xe Iaso <me@xeiaso.net>
* review feedback fixups
Signed-off-by: Xe Iaso <me@xeiaso.net>
* Update cmd/anubis/main.go
Signed-off-by: Xe Iaso <me@xeiaso.net>
* Apply suggestions from code review
Co-authored-by: Ryan Cao <70191398+ryanccn@users.noreply.github.com>
Signed-off-by: Xe Iaso <me@xeiaso.net>
---------
Signed-off-by: Xe Iaso <me@xeiaso.net>
Co-authored-by: Ryan Cao <70191398+ryanccn@users.noreply.github.com>
* Refactor anubis to split business logic into a lib, and cmd to just be direct usage.
* Post-rebase fixes.
* Update changelog, remove unnecessary one.
* lib: refactor this
This is mostly based on my personal preferences for how Go code should
be laid out. I'm not sold on the package name "lib" (I'd call it anubis
but that would stutter), but people are probably gonna import it as
libanubis so it's likely fine.
Packages have been "flattened" to centralize implementation with area of
concern. This goes against the Java-esque style that many people like,
but I think this helps make things simple.
Most notably: the dnsbl client (which is a hack) is an internal package
until it's made more generic. Then it can be made external.
I also fixed the logic such that `go generate` works and rebased on
main.
* internal/test: run tests iff npx exists and DONT_USE_NETWORK is not set
Signed-off-by: Xe Iaso <me@xeiaso.net>
* internal/test: install deps
Signed-off-by: Xe Iaso <me@xeiaso.net>
* .github/workflows: verbose go tests?
Signed-off-by: Xe Iaso <me@xeiaso.net>
* internal/test: sleep 2
Signed-off-by: Xe Iaso <me@xeiaso.net>
* internal/test: nix this test so CI works
Signed-off-by: Xe Iaso <me@xeiaso.net>
* internal/test: warmup per browser?
Signed-off-by: Xe Iaso <me@xeiaso.net>
* internal/test: disable for now :(
Signed-off-by: Xe Iaso <me@xeiaso.net>
* lib/anubis: do not apply bot rules if address check fails
Closes#83
---------
Signed-off-by: Xe Iaso <me@xeiaso.net>
Co-authored-by: Xe Iaso <me@xeiaso.net>
* Cleanup regex
Were were going overkill on the escape characters
* Update docs/docs/CHANGELOG.md
Co-authored-by: Xe Iaso <me@xeiaso.net>
Signed-off-by: Dennis ten Hoove <36002865+dennis1248@users.noreply.github.com>
---------
Signed-off-by: Dennis ten Hoove <36002865+dennis1248@users.noreply.github.com>
Co-authored-by: Xe Iaso <me@xeiaso.net>
The example/default bot policy document had a rule to allow RSS readers
through based on paths that end with ".rss", ".xml", ".atom", or
".json". Frameworks like Rails will treat these specially, meaning that
going to /things/12345-whateverhaha.json could bypass Anubis.
I checked the history of this rule and it was present in the original
example policy file in Xe/x. This rule is likely a mistake and it has
been removed. I think it was for making my blog still work with RSS
readers.
Thanks to Graham Sutherland for reporting this over email.
Signed-off-by: Xe Iaso <me@xeiaso.net>
hash.Write never returns error so removing it from
the results simplifies usage and eliminates dead error handling.
Signed-off-by: Alexander Yastrebov <yastrebov.alex@gmail.com>
* Added the possibility to define rules for remote addresses
* Added change in changelog
* Added check for X-Real-Ip and X-Forwarded-For when checking for remote address filtering
* cmd/anubis: refine IP filtering logic
* Optimize the configuration so that the IP trie is created once at
application start instead of dynamically being created every request.
* Document the changes in the changelog and docs site.
* Allow pure IP range filtering.
* Allow user agent based IP range filtering.
* Allow path based IP range filtering.
* Create --debug-x-real-ip-default flag for testing Anubis locally
without a HTTP load balancer.
---------
Co-authored-by: Xe Iaso <me@xeiaso.net>
Closes#30
Introduces the "challenge" field in bot rule definitions:
```json
{
"name": "generic-bot-catchall",
"user_agent_regex": "(?i:bot|crawler)",
"action": "CHALLENGE",
"challenge": {
"difficulty": 16,
"report_as": 4,
"algorithm": "slow"
}
}
```
This makes Anubis return a challenge page for every user agent with
"bot" or "crawler" in it (case-insensitively) with difficulty 16 using
the old "slow" algorithm but reporting in the client as difficulty 4.
This is useful when you want to make certain clients in particular
suffer.
Additional validation and testing logic has been added to make sure
that users do not define "impossible" challenge settings.
If no algorithm is specified, Anubis defaults to the "fast" algorithm.
Signed-off-by: Xe Iaso <me@xeiaso.net>
* feat: allow binding to unix domain sockets
this is useful when the user does not want to expose more tcp ports than
needed. also simplifes configuration in some situation, like with nixos
modules as the socket paths can be automatically configured.
docs updated with additional configuration flags.
Signed-off-by: Cassie Cheung <me@soopy.moe>
* feat: graceful shutdown and cleanup on signal
this is needed to clean up left-over unix sockets, else on the next boot
listener panics with `address already in use`.
Co-authored-by: cat <cat@gensokyo.uk>
Signed-off-by: Cassie Cheung <me@soopy.moe>
* feat: support unix socket upstream targets
adds support for proxying unix socket upstreams, essentially allowing
anubis to run without listening on tcp sockets at all*.
*for metrics, neither prometheus and victoriametrics supports scraping
from unix sockets. if metrics are desired, tcp sockets are still needed.
Co-authored-by: cat <cat@gensokyo.uk>
Signed-off-by: Cassie Cheung <me@soopy.moe>
* docs: add changelog entry
---------
Signed-off-by: Cassie Cheung <me@soopy.moe>
Co-authored-by: cat <cat@gensokyo.uk>
* fix: no duplicate work when exceeding that 1xxx number
* run go generate and update CHANGELOG
Signed-off-by: Xe Iaso <me@xeiaso.net>
---------
Signed-off-by: Xe Iaso <me@xeiaso.net>
Co-authored-by: Xe Iaso <me@xeiaso.net>
* cmd/anubis: drastically optimize proof of work
Closes#12Closes#17
This drastically optimizes the proof of work check by removing the
stringify call at every iteration. Additionally, this optimizes the
checks by running them in parallel for as many threads as the browser
has available (according to navigator.hardwareConcurrency).
This also changes the redirect lag to 250 milliseconds instead of 2000
milliseconds in order to be perceptually faster. This is below the
reaction time threshold of many people, so this will make the post-check
success phase perceptually instant.
Testing on an iPhone 7 Plus has shown that this can clear a difficulty 4
check in 3.4 seconds.
This actually optimizes the check so much it may be a logistical concern
for operators.
* cmd/anubis/js: fix happy cachebuster logic
Signed-off-by: Xe Iaso <me@xeiaso.net>
---------
Signed-off-by: Xe Iaso <me@xeiaso.net>