Consider distributing zlib-ng #464
Comments
Note that on macOS we don't currently build zlib from source. I can't definitively recall the historical reason for this. But I think it had something to do with macOS being guaranteed to have a copy of libz at I mention this because if we do ship zlib-ng it could create [unwanted] divergence between macOS and !macOS distributions.
Of minor note, we recently switched from
I wasn't aware of the pure Rust zlib-rs crate! That does look promising. I see some use of AVX2 in there, which could make it performance-competitive with zlib-ng. However, it appears to be using compile-time CPU feature detection via e.g. zlib-ng, by contrast, uses runtime CPU feature detection: the binary compiles in different versions of functions using different assembly instructions and then picks the best one at runtime. This is vastly more reliable than build-time targeting since one binary runs optimally on ~every x86-64 CPU microarchitecture level. (You can achieve similar functionality in Rust by using a crate like https://crates.io/crates/multiversion.) While I haven't reproduced it, I'd be surprised if a vanilla x86-64 target build of zlib-rs was performance-competitive with zlib-ng without using AVX2 instructions. It is likely that whoever conducted the performance benchmark you linked (@charliermarsh?) had a If zlib-rs -AVX2 is actually performance-competitive with zlib-ng +AVX2, this would be worthy of tech blog headlines since it means the Rust compiler at the vanilla x86-64 instruction set is performance-competitive with C + handwritten AVX2 assembly for similar functionality. (I doubt this is the case and it is more likely the benchmarking methodology was flawed.)
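To make the compile-time vs. runtime distinction concrete, here is a minimal Rust sketch using only the standard library. It is not taken from zlib-rs or zlib-ng: the function names (`sum_scalar`, `sum_avx2`, `sum_bytes`, `avx2_enabled_at_compile_time`) and the byte-summing workload are hypothetical stand-ins for a real hot loop such as a checksum. The only real APIs assumed are `std::is_x86_feature_detected!`, the `#[target_feature]` attribute, and `cfg!(target_feature = ...)`.

```rust
/// Portable fallback that compiles and runs on every target.
fn sum_scalar(data: &[u8]) -> u64 {
    data.iter().map(|&b| u64::from(b)).sum()
}

/// Same logic, but this one function is compiled with AVX2 enabled, so the
/// compiler is allowed to emit AVX2 instructions for it even when the rest
/// of the binary targets baseline x86-64. (Hypothetical stand-in for a
/// hand-tuned kernel.)
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn sum_avx2(data: &[u8]) -> u64 {
    data.iter().map(|&b| u64::from(b)).sum()
}

/// Runtime dispatch: one binary carries both versions and picks the best
/// one for the CPU it is actually running on.
pub fn sum_bytes(data: &[u8]) -> u64 {
    #[cfg(target_arch = "x86_64")]
    {
        if std::is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was verified on this CPU just above.
            return unsafe { sum_avx2(data) };
        }
    }
    sum_scalar(data)
}

/// Compile-time detection: true only if the whole build was told to assume
/// AVX2 (for example via `-C target-feature=+avx2` or `-C target-cpu=native`).
pub fn avx2_enabled_at_compile_time() -> bool {
    cfg!(target_feature = "avx2")
}
```

On a default x86-64 build, `avx2_enabled_at_compile_time()` returns `false`, yet `sum_bytes` can still take the AVX2 path on CPUs that support it. On non-x86-64 targets such as Apple Silicon, the dispatch block is compiled out and only the scalar path exists.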
I ran it on my M3, so it seems more likely that none of the x86 feature sets were involved at all? Still might be worth it in our case to remove our CMake dependency.
Yup. This is closer to a Rust vs C benchmark. Or a comparison of zlib-ng vs zlib-rs ARM support. TBH I'm unsure how optimized each is on ARM: I've mostly heard about the various x86-64 assembly optimizations.
Makes sense! (Maybe we'll restore
We build and distribute our own zlib library.
We're currently using the canonical zlib library available from zlib.net.
zlib-ng (https://github.com/zlib-ng/zlib-ng) has various optimizations for zlib that can realize >10% perf wins while maintaining API compatibility. API compatibility means that zlib-ng should just work.
I reckon we could safely compile against and distribute zlib-ng so end-users automagically see significant performance wins.