libdivide-2.0
I am happy to announce the release of libdivide-2.0 🎉
Libdivide finally supports AVX2 and AVX512 vector division on x86 CPUs. Libdivide now also works with the clang-cl compiler and the Intel C++ compiler on Windows. There have been many small incremental improvements, which should provide minor speedups for many use cases.
Since libdivide is now nearly 10 years old and many features have been added over the years, it has become necessary to remove some rarely used functionality. I have removed the unswitch functionality, since it was a large amount of code that, as far as I am aware, nobody has ever used. So overall, even with the added support for AVX2 and AVX512, libdivide.h now contains fewer lines of code than the previous release and compiles faster in both C and C++.
- BREAKING
  - Removed unswitch functionality (#46)
  - Renamed macro LIBDIVIDE_USE_SSE2 to LIBDIVIDE_SSE2
  - Renamed divider::recover_divisor() to divider::recover()
- BUG FIXES
- ENHANCEMENT
- TESTING
  - tester.cpp: Convert to modern C++
  - tester.cpp: Add more test cases
  - benchmark_branchfree.cpp: Convert to modern C++
  - benchmark.c: Prevent compilers from optimizing too much
- BUILD
  - Automatically detect SSE2/AVX2/AVX512
- DOCS
  - doc/C-API.md: Add C API reference
  - doc/CPP-API.md: Add C++ API reference
  - README.md: Add vector division and performance tips sections
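
For projects whose build scripts still pass the old macro name, a small shim along the following lines may ease the move to 2.0. It is a sketch, not something shipped with libdivide: the `#define LIBDIVIDE_USE_SSE2` line merely stands in for an existing build flag, and `kSse2Requested` is a hypothetical constant used to show the result. The `recover_divisor()` rename has no preprocessor equivalent and is probably best handled by replacing the call with `recover()` at each call site.

```cpp
// Stand-in for a legacy build flag such as -DLIBDIVIDE_USE_SSE2 (1.x name).
// In a real project this definition would come from the build system.
#define LIBDIVIDE_USE_SSE2

// Map the 1.x macro onto its 2.0 name. This would be placed before
// #include "libdivide.h" so the header sees the new name.
#if defined(LIBDIVIDE_USE_SSE2) && !defined(LIBDIVIDE_SSE2)
#  define LIBDIVIDE_SSE2
#endif

// Hypothetical constant (not part of libdivide): true when the 2.0 macro
// ended up defined, either directly or through the shim above.
#if defined(LIBDIVIDE_SSE2)
constexpr bool kSse2Requested = true;
#else
constexpr bool kSse2Requested = false;
#endif
```

Once every build script uses `LIBDIVIDE_SSE2` directly, the shim can simply be deleted.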