mirror of
https://github.com/cuberite/libdeflate.git
synced 2025-08-04 02:06:31 -04:00
Assume fast unaligned access on WebAssembly
I saw this tweet claiming this flag makes libdeflate run 20% faster on WebAssembly: https://twitter.com/Algunenano/status/1317098341377900550. Indeed, when tried even in a complex PNG compression benchmark I've observed 10-15% improvement when this flag is enabled. Even though WebAssembly might be running on top of a variety of underlying platforms, the spec requires it to support unaligned access, and on majority of platforms it will translate to a faster code. Hence, I think it makes sense to enable this flag by default.
This commit is contained in:
parent
e1803dc145
commit
a07ed5824a
@ -175,7 +175,16 @@ typedef unsigned char __v32qu __attribute__((__vector_size__(32)));
|
||||
# define bswap64 __builtin_bswap64
|
||||
#endif
|
||||
|
||||
#if defined(__x86_64__) || defined(__i386__) || defined(__ARM_FEATURE_UNALIGNED) || defined(__powerpc64__)
|
||||
#if defined(__x86_64__) || defined(__i386__) || \
|
||||
defined(__ARM_FEATURE_UNALIGNED) || defined(__powerpc64__) || \
|
||||
/*
|
||||
* For all compilation purposes, WebAssembly behaves like any other CPU
|
||||
* instruction set. Even though WebAssembly engine might be running on top
|
||||
* of different actual CPU architectures, the WebAssembly spec itself
|
||||
* permits unaligned access and it will be fast on most of those platforms,
|
||||
* and simulated at the engine level on others, so it's worth treating it
|
||||
* as a CPU architecture with fast unaligned access.
|
||||
*/ defined(__wasm__)
|
||||
# define UNALIGNED_ACCESS_IS_FAST 1
|
||||
#endif
|
||||
|
||||
|
Loading…
x
Reference in New Issue
Block a user