Jason Cameron 77436207e6
feat: Add Open Graph tag support (#195)
* feat: Add Open Graph tag support (og-tags)

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* Fix: Prevent nil pointer dereference in test (og-tags)

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* feat!: Implement Open Graph tag caching and passthrough functionality (WIP)

I'm going to sleep. currently tags are passed to renderIndex.

see https://github.com/TecharoHQ/anubis/issues/131

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* feat: Add configuration for air tool with build and logger settings

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* feat: Move OG tags to base template (og-tags)

Moves the Open Graph (OG) tags from the index template to
the base template. This allows OG tags to be set on any
page, not just the index.  Also adds a
BaseWithOGTags function to the web package to allow
passing OG tags to the base template.  Removes the
ogTags parameter from the Index function and template.

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* Delete CHANGELOG.md

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* feat: Add language attribute to HTML tag in template

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* fix(tests):  Fix nil pointer ref

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* feat(og-tags): Add timeout to http client (og-tags)

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* style: fix line endings & indentation

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* style: add inspection comment for GoBoolExpressions in UnchangingCache

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* feat(og-tags): Implement Open Graph tag fetching and caching

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* fix(og-tags): Simplify Open Graph tag extraction logic

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* fix(og-tags): Add nil check in isOGMetaTag and enhance test cases

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* feat(og-tags): Add approved tags and prefixes for Open Graph extraction

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* test(og-tags): Update tests with approved tags and improve clarity

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* chore: Add changelog notes

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* fix: Improve stability of the target fetcher?

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* fix: Update template error handling and improve Open Graph tag integration

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* style: format files and remove deubg logs

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* feat: Credit CELPHASE for mascot design (og-tags)

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* feat: Credit CELPHASE for mascot design (og-tags)

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* feat: Allow twitter prefixed OG tags by default

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* chore: replace /tmp with /var

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* Update docs/docs/CHANGELOG.md

Co-authored-by: Xe Iaso <me@xeiaso.net>
Signed-off-by: Jason Cameron <jasoncameron.all@gmail.com>

* Update docs/docs/admin/configuration/open-graph.mdx

Co-authored-by: Xe Iaso <me@xeiaso.net>
Signed-off-by: Jason Cameron <jasoncameron.all@gmail.com>

* chore: add fediverse to default prefixes (#og-tags)

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* feat(og-tags): Remove og-query-distinct flag

This commit removes the `og-query-distinct` flag and
associated logic.  URLs with different query parameters
will now always be treated as the same cache key for Open
Graph tags.  This simplifies the caching logic and
improves performance.

Additionally, the http client used for fetching OG tags
is now a member of the OGTagCache struct, rather than a
global variable. This improves testability and allows
for more flexible configuration in the future.

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* Update docs/docs/admin/configuration/open-graph.mdx

Co-authored-by: Xe Iaso <me@xeiaso.net>
Signed-off-by: Jason Cameron <jasoncameron.all@gmail.com>

* docs: remove og tags references

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* refactor: rename url > u to not overlap package name

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* Update internal/ogtags/cache.go

Co-authored-by: Xe Iaso <me@xeiaso.net>
Signed-off-by: Jason Cameron <jasoncameron.all@gmail.com>

* Update internal/ogtags/cache.go

Co-authored-by: Xe Iaso <me@xeiaso.net>
Signed-off-by: Jason Cameron <jasoncameron.all@gmail.com>

* fix(tests): Don't use network when network access is disabled

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* Fix: Handle nil URL in GetOGTags (og-tags)

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* chore: sort installation docs alphabetically

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* fix(tests): validate that no duplicate requests are made

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* style(tests): remove unused ok var

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* docs: convert to table fmt

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* feat(og-tags): Enhance OG tag fetching and caching

Adds additional approved OG tags (`keywords`, `author`), improves

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* chore: update generated templ's after format

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* fix(tests): update integration_test.go to reflect the new behavior of fetchHTMLDocument

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* Revert "data/botPolicies: allow iMessage scraper by default (#178)"

This reverts commit 21a9d777

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* Fix: Simplify ogTags access in cache test.

Didn't know this was possible! wow!

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* Fix: Handle request timeouts when fetching OG tags (#og-tags)

Cache a nil result for half the TTL to avoid repeatedly
requesting a timed-out URL.

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* Fix: make OG tags passthrough option function.

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* Fix: Handle timeouts and non-200 responses when fetching OG tags (og-tags)

- Cache empty results for timeouts and non-200 status codes
  to avoid spamming the server.
- Use a non-nil empty map to represent empty results in the
  cache, as nil would be a cache miss.

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* feat(og-tags): switch to http.MaxBytesReader

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* chore(og-tags): add noindex, nofollow meta tag and update error line numbers

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

---------

Signed-off-by: Jason Cameron <git@jasoncameron.dev>
Signed-off-by: Jason Cameron <jasoncameron.all@gmail.com>
Co-authored-by: Xe Iaso <me@xeiaso.net>
2025-04-06 20:02:12 -04:00
2025-04-06 12:44:52 +00:00
2025-03-17 19:33:07 -04:00
2025-03-19 09:10:29 -04:00
2025-04-06 20:02:12 -04:00
2025-04-06 00:28:08 -04:00
2025-03-17 19:33:07 -04:00
2025-04-06 12:44:52 +00:00
2025-04-02 00:44:51 -04:00
2025-04-06 12:44:52 +00:00

Anubis

A smiling chibi dark-skinned anthro jackal with brown hair and tall ears looking victorious with a thumbs-up

enbyware GitHub Issues or Pull Requests by label GitHub go.mod Go version language count repo size

Anubis weighs the soul of your connection using a sha256 proof-of-work challenge in order to protect upstream resources from scraper bots.

Installing and using this will likely result in your website not being indexed by some search engines. This is considered a feature of Anubis, not a bug.

This is a bit of a nuclear response, but AI scraper bots scraping so aggressively have forced my hand. I hate that I have to do this, but this is what we get for the modern Internet because bots don't conform to standards like robots.txt, even when they claim to.

In most cases, you should not need this and can probably get by using Cloudflare to protect a given origin. However, for circumstances where you can't or won't use Cloudflare, Anubis is there for you.

If you want to try this out, connect to anubis.techaro.lol.

Support

If you run into any issues running Anubis, please open an issue. Please include all the information I would need to diagnose your issue.

For live chat, please join the Patreon and ask in the Patron discord in the channel #anubis.

Star History

Star History Chart

Packaging Status

Packaging status

Description
Weighs the soul of incoming HTTP requests using proof-of-work to stop AI crawlers
Readme MIT 18 MiB
Languages
Go 87.4%
JavaScript 5.4%
Shell 4%
templ 1.9%
CSS 0.7%
Other 0.5%