fix(web/fast): remove event loop thrashing (#880)

Fixes #877

Continuing from #879: event loop thrashing can cause stack space
exhaustion on ia32 systems. Previously this code would thrash the event
loop in Firefox and Firefox-derived browsers such as Pale Moon. I suspect
that this is the ultimate root cause of the bizarre, irreproducible bugs
that Pale Moon (and possibly Cromite) users have been reporting since at
least #87 was merged.

The root cause is an invalid boolean statement:

```js
// send a progress update every 1024 iterations. since each thread checks
// separate values, one simple way to do this is by bit masking the
// nonce for multiples of 1024. unfortunately, if the number of threads
// is not prime, only some of the threads will be sending the status
// update and they will get behind the others. this is slightly more
// complicated but ensures an even distribution between threads.
if (
  (nonce > oldNonce) | 1023 && // we've wrapped past 1024
  (nonce >> 10) % threads === threadId // and it's our turn
) {
  postMessage(nonce);
}
```

The logic here looks fine but is subtly wrong, as reported in #877
by a user in the Pale Moon community. Consider the following scenario:

`nonce` is a counter that starts at the worker's ID and increments by the
worker count on every loop iteration. This is intended to spread the load
between CPU cores as follows:

| Iteration | Worker ID | Nonce |
| :-------- | :-------- | :---- |
| 1         | 0         | 0     |
| 1         | 1         | 1     |
| 2         | 0         | 2     |
| 2         | 1         | 3     |

And so on.
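In sketch form (using the `threadId`, `threads`, and `nonce` names from the worker code below; the hashing body is elided), the striding looks like this:

```js
// Each worker starts at its own ID and strides by the total worker count,
// so no two workers ever test the same nonce.
let nonce = threadId; // threadId is 0..threads-1
while (true) {
  // ... hash `data + nonce` and check it against the difficulty ...
  nonce += threads; // with 2 workers: worker 0 tests 0, 2, 4, ...; worker 1 tests 1, 3, 5, ...
}
```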

The incorrect part is the boolean logic, specifically the bitwise or
(`|`). I think the intent was to use a logical or (`||`), but because `|`
binds tighter than `&&` and ORing anything with `1023` always yields a
truthy value, the "wrapped past 1024" check never filters anything and the
`postMessage` handler fires on every iteration. The intent of this snippet
(as the comment clearly indicates) is to make sure that the main event loop
is only updated with the worker status every 1024 iterations per worker.
Instead, it had the opposite effect, causing a flood of messages to be sent
from the workers to the parent JavaScript context.

This is bad for the event loop.
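For the record, here is a minimal snippet (runnable in any JavaScript console) showing why the first clause of the guard can never be false:

```js
// Bitwise | binds tighter than && and coerces the boolean to 0 or 1,
// so ORing it with 1023 always yields 1023, which is truthy.
console.log((5 > 3) | 1023); // 1023
console.log((3 > 5) | 1023); // 1023
// The "wrapped past 1024" check therefore never filters anything,
// and the guard collapses to the second clause alone.
```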

Instead, I have ripped out that statement and replaced it with a much
simpler increment-only counter that fires every 1024 iterations.
Additionally, only the first thread communicates back to the parent
context. This does mean that in theory the other workers could be ahead
of the first thread (posting a message out of a worker has a nonzero
cost), but in practice I don't think this will be anywhere near as much of
an issue as the current behaviour.
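Boiled down (mirroring the diff below), the new reporting path is just:

```js
// Worker-local counter: only worker 0 reports progress, once per 1024 iterations.
let localIterationCount = 0;
while (true) {
  // ... hash and check the nonce as before ...
  nonce += threads;
  if (threadId == 0 && localIterationCount === 1024) {
    postMessage(nonce);
    localIterationCount = 0;
  }
  localIterationCount++;
}
```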

The root cause of the stack exhaustion is likely the pressure caused by
all of the pending postMessage calls piling up. Maybe the larger stack
size in 64-bit environments makes this a non-issue there, or maybe the
newer hardware typically found in 64-bit systems can handle events fast
enough to keep up with the pressure.

Either way, thanks much to @wolfbeast and the Pale Moon community for
finding this. This will make Anubis faster for everyone!

Signed-off-by: Xe Iaso <xe.iaso@techaro.lol>

View File

```diff
@@ -26,6 +26,17 @@ Anubis now supports the [`missingHeader`](./admin/configuration/expressions.mdx#
 ### Fixes
 
+#### Fix event loop thrashing when solving a proof of work challenge
+
+Previously the "fast" proof of work solver had a fragment of JavaScript that attempted to only post an update about proof of work progress to the main browser window every 1024 iterations. This fragment of JavaScript was subtly incorrect in a way that passed review but actually made the workers send an update back to the main thread every iteration. This caused a pileup of unhandled async calls (similar to a socket accept() backlog pileup in Unix) that caused stack space exhaustion.
+
+This has been fixed in the following ways:
+
+1. The complicated boolean logic has been totally removed in favour of a worker-local iteration counter.
+2. The progress bar is updated by worker `0` instead of all workers.
+
+Hopefully this should limit the event loop thrashing and let ia32 browsers (as well as any environment with a smaller stack size than amd64 and aarch64 seem to have) function normally when processing Anubis proof of work challenges.
+
 #### Fix potential memory leak when discovering a solution
 In some cases, the parallel solution finder in Anubis could cause all of the worker promises to leak due to the fact the promises were being improperly terminated. This was fixed by having Anubis debounce worker termination instead of allowing it to potentially recurse infinitely.
```

View File

```diff
@@ -3,7 +3,7 @@ export default function process(
   difficulty = 5,
   signal = null,
   progressCallback = null,
-  threads = navigator.hardwareConcurrency || 1,
+  threads = Math.max(navigator.hardwareConcurrency / 2, 1),
 ) {
   console.debug("fast algo");
   return new Promise((resolve, reject) => {
@@ -89,6 +89,7 @@ function processTask() {
     let threads = event.data.threads;
 
     const threadId = nonce;
+    let localIterationCount = 0;
 
     while (true) {
       const currentHash = await sha256(data + nonce);
@@ -114,21 +115,15 @@ function processTask() {
        break;
      }
 
-      const oldNonce = nonce;
       nonce += threads;
 
-      // send a progress update every 1024 iterations. since each thread checks
-      // separate values, one simple way to do this is by bit masking the
-      // nonce for multiples of 1024. unfortunately, if the number of threads
-      // is not prime, only some of the threads will be sending the status
-      // update and they will get behind the others. this is slightly more
-      // complicated but ensures an even distribution between threads.
-      if (
-        (nonce > oldNonce) | 1023 && // we've wrapped past 1024
-        (nonce >> 10) % threads === threadId // and it's our turn
-      ) {
+      // send a progress update every 1024 iterations so that the user can be informed of
+      // the state of the challenge.
+      if (threadId == 0 && localIterationCount === 1024) {
         postMessage(nonce);
+        localIterationCount = 0;
       }
+      localIterationCount++;
     }
 
     postMessage({
```