mirror of
https://github.com/TecharoHQ/anubis.git
synced 2025-08-03 01:38:14 -04:00

* feat: Add Open Graph tag support (og-tags) Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* Fix: Prevent nil pointer dereference in test (og-tags) Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* feat!: Implement Open Graph tag caching and passthrough functionality (WIP) I'm going to sleep. currently tags are passed to renderIndex. see https://github.com/TecharoHQ/anubis/issues/131 Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* feat: Add configuration for air tool with build and logger settings Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* feat: Move OG tags to base template (og-tags) Moves the Open Graph (OG) tags from the index template to the base template. This allows OG tags to be set on any page, not just the index. Also adds a BaseWithOGTags function to the web package to allow passing OG tags to the base template. Removes the ogTags parameter from the Index function and template. Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* Delete CHANGELOG.md Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* feat: Add language attribute to HTML tag in template Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* fix(tests): Fix nil pointer ref Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* feat(og-tags): Add timeout to http client (og-tags) Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* style: fix line endings & indentation Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* style: add inspection comment for GoBoolExpressions in UnchangingCache Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* feat(og-tags): Implement Open Graph tag fetching and caching Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* fix(og-tags): Simplify Open Graph tag extraction logic Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* fix(og-tags): Add nil check in isOGMetaTag and enhance test cases Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* feat(og-tags): Add approved tags and prefixes for Open Graph extraction Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* test(og-tags): Update tests with approved tags and improve clarity Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* chore: Add changelog notes Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* fix: Improve stability of the target fetcher? Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* fix: Update template error handling and improve Open Graph tag integration Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* style: format files and remove deubg logs Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* feat: Credit CELPHASE for mascot design (og-tags) Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* feat: Credit CELPHASE for mascot design (og-tags) Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* feat: Allow twitter prefixed OG tags by default Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* chore: replace /tmp with /var Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* Update docs/docs/CHANGELOG.md Co-authored-by: Xe Iaso <me@xeiaso.net> Signed-off-by: Jason Cameron <jasoncameron.all@gmail.com>
* Update docs/docs/admin/configuration/open-graph.mdx Co-authored-by: Xe Iaso <me@xeiaso.net> Signed-off-by: Jason Cameron <jasoncameron.all@gmail.com>
* chore: add fediverse to default prefixes (#og-tags) Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* feat(og-tags): Remove og-query-distinct flag This commit removes the `og-query-distinct` flag and associated logic. URLs with different query parameters will now always be treated as the same cache key for Open Graph tags. This simplifies the caching logic and improves performance. Additionally, the http client used for fetching OG tags is now a member of the OGTagCache struct, rather than a global variable. This improves testability and allows for more flexible configuration in the future. Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* Update docs/docs/admin/configuration/open-graph.mdx Co-authored-by: Xe Iaso <me@xeiaso.net> Signed-off-by: Jason Cameron <jasoncameron.all@gmail.com>
* docs: remove og tags references Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* refactor: rename url > u to not overlap package name Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* Update internal/ogtags/cache.go Co-authored-by: Xe Iaso <me@xeiaso.net> Signed-off-by: Jason Cameron <jasoncameron.all@gmail.com>
* Update internal/ogtags/cache.go Co-authored-by: Xe Iaso <me@xeiaso.net> Signed-off-by: Jason Cameron <jasoncameron.all@gmail.com>
* fix(tests): Don't use network when network access is disabled Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* Fix: Handle nil URL in GetOGTags (og-tags) Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* chore: sort installation docs alphabetically Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* fix(tests): validate that no duplicate requests are made Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* style(tests): remove unused ok var Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* docs: convert to table fmt Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* feat(og-tags): Enhance OG tag fetching and caching Adds additional approved OG tags (`keywords`, `author`), improves Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* chore: update generated templ's after format Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* fix(tests): update integration_test.go to reflect the new behavior of fetchHTMLDocument Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* Revert "data/botPolicies: allow iMessage scraper by default (#178)" This reverts commit 21a9d777 Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* Fix: Simplify ogTags access in cache test. Didn't know this was possible! wow! Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* Fix: Handle request timeouts when fetching OG tags (#og-tags) Cache a nil result for half the TTL to avoid repeatedly requesting a timed-out URL. Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* Fix: make OG tags passthrough option function. Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* Fix: Handle timeouts and non-200 responses when fetching OG tags (og-tags) - Cache empty results for timeouts and non-200 status codes to avoid spamming the server. - Use a non-nil empty map to represent empty results in the cache, as nil would be a cache miss. Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* feat(og-tags): switch to http.MaxBytesReader Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* chore(og-tags): add noindex, nofollow meta tag and update error line numbers Signed-off-by: Jason Cameron <git@jasoncameron.dev>
---------
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
Signed-off-by: Jason Cameron <jasoncameron.all@gmail.com>
Co-authored-by: Xe Iaso <me@xeiaso.net>
197 lines
7.4 KiB
Plaintext
package web

import (
	"github.com/TecharoHQ/anubis"
	"github.com/TecharoHQ/anubis/xess"
)

templ base(title string, body templ.Component, ogTags map[string]string) {
	<!DOCTYPE html>
	<html lang="en">
		<head>
			<title>{ title }</title>
			<link rel="stylesheet" href={ xess.URL }/>
			<meta name="viewport" content="width=device-width, initial-scale=1.0"/>
			<meta name="robots" content="noindex,nofollow"/>
			for key, value := range ogTags {
				<meta property={ key } content={ value }/>
			}
			<style>
				body,
				html {
					height: 100%;
					display: flex;
					justify-content: center;
					align-items: center;
					margin-left: auto;
					margin-right: auto;
				}

				.centered-div {
					text-align: center;
				}

				#status {
					font-variant-numeric: tabular-nums;
				}

				#progress {
					display: none;
					width: min(20rem, 90%);
					height: 2rem;
					border-radius: 1rem;
					overflow: hidden;
					margin: 1rem 0 2rem;
					outline-color: #b16286;
					outline-offset: 2px;
					outline-style: solid;
					outline-width: 4px;
				}

				.bar-inner {
					background-color: #b16286;
					height: 100%;
					width: 0;
					transition: width 0.25s ease-in;
				}
			</style>
			@templ.JSONScript("anubis_version", anubis.Version)

		</head>
		<body id="top">
			<main>
				<center>
					<h1 id="title" class="centered-div">{ title }</h1>
				</center>
				@body
				<footer>
					<center>
						<p>
							Protected by <a href="https://github.com/TecharoHQ/anubis">Anubis</a> from <a href="https://techaro.lol">Techaro</a>. Made with ❤️ in 🇨🇦.
						</p>
						<p>Mascot design by <a href="https://bsky.app/profile/celphase.bsky.social">CELPHASE</a>.</p>
					</center>
				</footer>
			</main>
		</body>
	</html>
}

templ index() {
	<div class="centered-div">
		<img
			id="image"
			style="width:100%;max-width:256px;"
			src={ "/.within.website/x/cmd/anubis/static/img/pensive.webp?cacheBuster=" + anubis.Version }
		/>
		<img
			style="display:none;width:100%;max-width:256px;"
			src={ "/.within.website/x/cmd/anubis/static/img/happy.webp?cacheBuster=" + anubis.Version }
		/>
		<p id="status">Loading...</p>
		<script async type="module" src={ "/.within.website/x/cmd/anubis/static/js/main.mjs?cacheBuster=" + anubis.Version }></script>
		<div id="progress" role="progressbar" aria-labelledby="status">
			<div class="bar-inner"></div>
		</div>
		<details>
			<summary>Why am I seeing this?</summary>
			<p>You are seeing this because the administrator of this website has set up
				<a href="https://github.com/TecharoHQ/anubis">Anubis</a> to protect the server against the scourge of
				<a href="https://thelibre.news/foss-infrastructure-is-under-attack-by-ai-companies/">AI companies aggressively scraping websites</a>.
				This can and does cause downtime for the websites, which makes their resources inaccessible for everyone.</p>
			<p>Anubis is a compromise. Anubis uses a
				<a href="https://anubis.techaro.lol/docs/design/why-proof-of-work">Proof-of-Work</a> scheme in the vein of
				<a href="https://en.wikipedia.org/wiki/Hashcash">Hashcash</a>, a proposed proof-of-work scheme for reducing email spam.
				The idea is that at individual scales the additional load is ignorable, but at mass scraper levels it adds up and makes scraping much more expensive.</p>
			<p>Ultimately, this is a hack whose real purpose is to give a "good enough" placeholder solution so that more
				time can be spent on fingerprinting and identifying headless browsers (EG: via how they do font rendering)
				so that the challenge proof of work page doesn't need to be presented to users that are much more likely to
				be legitimate.</p>
			<p>Please note that Anubis requires the use of modern JavaScript features that plugins like
				<a href="https://jshelter.org/">JShelter</a> will disable. Please disable JShelter or other such
				plugins for this domain.</p>
		</details>
		<noscript>
			<p>
				Sadly, you must enable JavaScript to get past this challenge. This is required because AI companies have changed
				the social contract around how website hosting works. A no-JS solution is a work-in-progress.
			</p>
		</noscript>
		<div id="testarea"></div>
	</div>
}

templ errorPage(message string) {
	<div class="centered-div">
		<img
			id="image"
			alt="Sad Anubis"
			style="width:100%;max-width:256px;"
			src={ "/.within.website/x/cmd/anubis/static/img/reject.webp?cacheBuster=" + anubis.Version }
		/>
		<p>{ message }.</p>
		<button onClick="window.location.reload();">Try again</button>
		<p><a href="/">Go home</a></p>
	</div>
}

templ bench() {
	<div style="height:20rem;display:flex">
		<table style="margin-top:1rem;display:grid;grid-template:auto 1fr/auto auto;gap:0 0.5rem">
			<thead style="border-bottom:1px solid black;padding:0.25rem 0;display:grid;grid-template:1fr/subgrid;grid-column:1/-1">
				<tr id="table-header" style="display:contents">
					<th style="width:4.5rem">Time</th>
					<th style="width:4rem">Iters</th>
				</tr>
				<tr id="table-header-compare" style="display:none">
					<th style="width:4.5rem">Time A</th>
					<th style="width:4rem">Iters A</th>
					<th style="width:4.5rem">Time B</th>
					<th style="width:4rem">Iters B</th>
				</tr>
			</thead>
			<tbody
				id="results"
				style="padding-top:0.25rem;display:grid;grid-template-columns:subgrid;grid-auto-rows:min-content;grid-column:1/-1;row-gap:0.25rem;overflow-y:auto;font-variant-numeric:tabular-nums"
			></tbody>
		</table>
		<div class="centered-div">
			<img
				id="image"
				style="width:100%;max-width:256px;"
				src={ "/.within.website/x/cmd/anubis/static/img/pensive.webp?cacheBuster=" + anubis.Version }
			/>
			<p id="status" style="max-width:256px">Loading...</p>
			<script async type="module" src={ "/.within.website/x/cmd/anubis/static/js/bench.mjs?cacheBuster=" + anubis.Version }></script>
			<div id="sparkline"></div>
			<noscript>
				<p>Running the benchmark tool requires JavaScript to be enabled.</p>
			</noscript>
		</div>
	</div>
	<form id="controls" style="position:fixed;top:0.5rem;right:0.5rem">
		<div style="display:flex;justify-content:end">
			<label for="difficulty-input" style="margin-right:0.5rem">Difficulty:</label>
			<input id="difficulty-input" type="number" name="difficulty" style="width:3rem"/>
		</div>
		<div style="margin-top:0.25rem;display:flex;justify-content:end">
			<label for="algorithm-select" style="margin-right:0.5rem">Algorithm:</label>
			<select id="algorithm-select" name="algorithm"></select>
		</div>
		<div style="margin-top:0.25rem;display:flex;justify-content:end">
			<label for="compare-select" style="margin-right:0.5rem">Compare:</label>
			<select id="compare-select" name="compare">
				<option value="NONE">-</option>
			</select>
		</div>
	</form>
}