fix(web/fast): remove event loop thrashing (#880)

Fixes #877

Continuing from #879: event loop thrashing can cause stack space
exhaustion on ia32 systems. Previously this code would thrash the event
loop in Firefox and Firefox-derived browsers such as Pale Moon. I suspect
that this is the ultimate root cause of the bizarre, irreproducible bugs
that Pale Moon (and possibly Cromite) users have been reporting since at
least #87 was merged.

The root cause is an invalid boolean statement:

```js
// send a progress update every 1024 iterations. since each thread checks
// separate values, one simple way to do this is by bit masking the
// nonce for multiples of 1024. unfortunately, if the number of threads
// is not prime, only some of the threads will be sending the status
// update and they will get behind the others. this is slightly more
// complicated but ensures an even distribution between threads.
if (
  (nonce > oldNonce) | 1023 && // we've wrapped past 1024
  (nonce >> 10) % threads === threadId // and it's our turn
) {
  postMessage(nonce);
}
```

The logic here looks fine but is subtly wrong, as reported in #877
by a user in the Pale Moon community. Consider the following scenario:

`nonce` is a counter that starts at the worker's ID and increments by the
worker count on every loop iteration. This is intended to spread the load
between CPU cores as follows:

| Iteration | Worker ID | Nonce |
| :-------- | :-------- | :---- |
| 1         | 0         | 0     |
| 1         | 1         | 1     |
| 2         | 0         | 2     |
| 2         | 1         | 3     |

And so on.
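In sketch form (using the `threadId`, `threads`, and `nonce` names from the worker code below; the hashing body is elided), the striding looks like this:

```js
// Each worker starts at its own ID and strides by the total worker count,
// so no two workers ever test the same nonce.
let nonce = threadId; // threadId is 0..threads-1
while (true) {
  // ... hash `data + nonce` and check it against the difficulty ...
  nonce += threads; // with 2 workers: worker 0 tests 0, 2, 4, ...; worker 1 tests 1, 3, 5, ...
}
```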

The incorrect part is the boolean logic, specifically the bitwise or
(`|`). I think the intent was to use a logical or (`||`), but because `|`
binds tighter than `&&` and ORing anything with `1023` always yields a
truthy value, the "wrapped past 1024" check never filters anything and the
`postMessage` handler fires on every iteration. The intent of this snippet
(as the comment clearly indicates) is to make sure that the main event loop
is only updated with the worker status every 1024 iterations per worker.
Instead, it had the opposite effect, causing a flood of messages to be sent
from the workers to the parent JavaScript context.

This is bad for the event loop.
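For the record, here is a minimal snippet (runnable in any JavaScript console) showing why the first clause of the guard can never be false:

```js
// Bitwise | binds tighter than && and coerces the boolean to 0 or 1,
// so ORing it with 1023 always yields 1023, which is truthy.
console.log((5 > 3) | 1023); // 1023
console.log((3 > 5) | 1023); // 1023
// The "wrapped past 1024" check therefore never filters anything,
// and the guard collapses to the second clause alone.
```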

Instead, I have ripped out that statement and replaced it with a much
simpler increment-only counter that fires every 1024 iterations.
Additionally, only the first thread communicates back to the parent
context. This does mean that in theory the other workers could be ahead
of the first thread (posting a message out of a worker has a nonzero
cost), but in practice I don't think this will be anywhere near as much of
an issue as the current behaviour.
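Boiled down (mirroring the diff below), the new reporting path is just:

```js
// Worker-local counter: only worker 0 reports progress, once per 1024 iterations.
let localIterationCount = 0;
while (true) {
  // ... hash and check the nonce as before ...
  nonce += threads;
  if (threadId == 0 && localIterationCount === 1024) {
    postMessage(nonce);
    localIterationCount = 0;
  }
  localIterationCount++;
}
```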

The root cause of the stack exhaustion is likely the pressure caused by
all of the pending postMessage calls piling up. Maybe the larger stack
size in 64-bit environments makes this a non-issue there, or maybe the
newer hardware typically found in 64-bit systems can handle events fast
enough to keep up with the pressure.

Either way, thanks much to @wolfbeast and the Pale Moon community for
finding this. This will make Anubis faster for everyone!

Signed-off-by: Xe Iaso <xe.iaso@techaro.lol>

View File

```diff
@@ -26,6 +26,17 @@ Anubis now supports the [`missingHeader`](./admin/configuration/expressions.mdx#
 ### Fixes
 
+#### Fix event loop thrashing when solving a proof of work challenge
+
+Previously the "fast" proof of work solver had a fragment of JavaScript that attempted to only post an update about proof of work progress to the main browser window every 1024 iterations. This fragment of JavaScript was subtly incorrect in a way that passed review but actually made the workers send an update back to the main thread every iteration. This caused a pileup of unhandled async calls (similar to a socket accept() backlog pileup in Unix) that caused stack space exhaustion.
+
+This has been fixed in the following ways:
+
+1. The complicated boolean logic has been totally removed in favour of a worker-local iteration counter.
+2. The progress bar is updated by worker `0` instead of all workers.
+
+Hopefully this should limit the event loop thrashing and let ia32 browsers (as well as any environment with a smaller stack size than amd64 and aarch64 seem to have) function normally when processing Anubis proof of work challenges.
+
 #### Fix potential memory leak when discovering a solution
 In some cases, the parallel solution finder in Anubis could cause all of the worker promises to leak due to the fact the promises were being improperly terminated. This was fixed by having Anubis debounce worker termination instead of allowing it to potentially recurse infinitely.
```

View File

```diff
@@ -3,7 +3,7 @@ export default function process(
   difficulty = 5,
   signal = null,
   progressCallback = null,
-  threads = navigator.hardwareConcurrency || 1,
+  threads = Math.max(navigator.hardwareConcurrency / 2, 1),
 ) {
   console.debug("fast algo");
   return new Promise((resolve, reject) => {
@@ -89,6 +89,7 @@ function processTask() {
     let threads = event.data.threads;
 
     const threadId = nonce;
+    let localIterationCount = 0;
 
     while (true) {
       const currentHash = await sha256(data + nonce);
@@ -114,21 +115,15 @@ function processTask() {
        break;
      }
 
-      const oldNonce = nonce;
       nonce += threads;
 
-      // send a progress update every 1024 iterations. since each thread checks
-      // separate values, one simple way to do this is by bit masking the
-      // nonce for multiples of 1024. unfortunately, if the number of threads
-      // is not prime, only some of the threads will be sending the status
-      // update and they will get behind the others. this is slightly more
-      // complicated but ensures an even distribution between threads.
-      if (
-        (nonce > oldNonce) | 1023 && // we've wrapped past 1024
-        (nonce >> 10) % threads === threadId // and it's our turn
-      ) {
+      // send a progress update every 1024 iterations so that the user can be informed of
+      // the state of the challenge.
+      if (threadId == 0 && localIterationCount === 1024) {
         postMessage(nonce);
+        localIterationCount = 0;
       }
+      localIterationCount++;
     }
 
     postMessage({
```