Merge for v2 (#77)
* Start tet1d module

* Update tet1d module

* Add CUDA support for tet1d module

* Add scalar input support in tet1d module

* simplify command line arguments for hatch.py

* Add support for generation of modules

* Make tet1d a module into the new system

* Add scalar version of some function from libm + reinterpret

* Fix refactoring

* Before merging master

* Fixes after merge

* For backup

* Fix CUDA

* Add forgotten files

* Working => backup

* Fixes

* All tests are passing f16 included

* For backup

* COVID-19

* Fixes

* Fixes

* Fixes: all test compile with nvcc

* ROCm support, addition on f32 and f16 are compiling

* TET1D tests are compiling with both nvcc and hipcc

* Merge CUDA and ROCm when code is the same

* Forgot files

* Now we can list generated files

* Forgot to merge nsimd.h

* Forgot to push

* Update .gitignore with the new file generated by the tet1d module.

* Return allocated arrays for tests

* Increase the minimum size of the tests array

* Fix segfault

* Fix segfault

* Add mask[oz]_load[zu] and mask_store[au] operators for CPU

* For backup

* For backup

* Fix for SSE

* Fix fma for C89

* Remove warning from GCC when using long long in C89 and C++98

* Fix warnings for C89 and C++98 and AVX512

* Add set1l, iota, mask_for_loop_tail for ARM

* Before merging master

* Fix ARM mask[oz]_load[au]

* Fixes for ARM SVE

* Fix warning when using __f16's

* Add alignment-templated masked loads/stores

* Rewrite friendly_but_not_optimized stuff

* Forgot file

* Fix ARM

* Fix ARM

* Cosmetic

* Backup

* Backup

* Backup

* Backup

* Forgot file

* For backup

* For backup

* Refactoring of documentation

* Add build.nsconfig + fix warning in fixed_point exp

* Fix warning in SPMD module

* Add forgotten file

* Fixes for CUDA

* Fixes for CPU

* Fixes

* Add gather/scatter for cpu and x86

* Add gather/scatter for arm (not tested yet)

* Fix gather/scatter for arm

* Deactivate tet1d module

* Cleanup

* Add scripts for building

* Fix setup and build script for Linux

* Changing computer

* Backup

* Fix script/setup.sh

* Fixes for fixed size SVE

* Fix Windows scripts

* Fix scripts for Linux

* Fix Makefile.nix for md2html

* Fix Makefile.win for md2html

* Fix generation of documentation

* Add mask scatter for cpu

* Add mask_scatter for x86

* Forgot a file

* Add mask_scatter for arm

* Add masked gather for cpu

* Add masked gather for x86

* Add masked gather for arm

* Fix masked gather for f16's

* Adapt SVE typedefs to new GCC 10

* Fixes for x86

* Fix tet1d tests for CUDA

* Fixes for HIP

* Fix warning for ROCm/HIP

* Various fixes

* Fix tests for rec11, rec8, rsqrt11 and rsqrt8

* Fix rec11, rec8, rsqrt11, rsqrt8 tests

* Improve gather/scatter for neon128 and aarch64

* Add gather_linear + scatter_linear and remove masked gather and scatter

* Add linear gather + scatter

* Fix gather_linear for neon128 + aarch64

* Improve gather on aarch64 + neon128

* Add documentation for module TET1d

* Update README

* Add documentation for module TET1d

* Improve README with nsconfig stuff

* Improve README

* Improve README

* Improve README

* Improve README

* Fix warning for armclang

* Fix warning when compiling with Clang and C++98/03

* Fix generation of benches

* For backup

* First version (not finished yet)

* Add support for non closed operators

* Improve doc

* Improve documentation

* More fixes

* Fix broken link in README

* Add CONTRIBUTING.md

* Improve documentation

* Improve documentation

* Improve documentation + simplify scoped_aligned_mem_for

* Fix scoped_aligned_mem

* Fixed errors in nsimd.h

* Improve documentation

* Improve documentation

* Improve documentation

* Replace some leftover print calls with common.myprint

* Fixed multiple declarations

* Let benches generate despite the new function set1l

* Add a module offering a vectorized random generator

* Only generate rand module if flags passed from hatch are correct

* Removed F-strings

* Fix build.nsconfig

* Fix generation of rand module

* Building the library does not require C++14 anymore, C++98 is more than sufficient

* Update README

* Update README

* setup.sh clones nstools using the same protocol as nsimd

* Add possibility to ignore tests/benches/...

* Add C++20 concepts to nsimd.h

* Add C++20 concepts to cxx_adv_api.hpp

* Add C++20 concepts to Python-generated functions

* Fix C++20 concepts

* Prepare support for oneAPI

* Add C++20 concepts doc

* Modify the rand module to allow generation with python 3.5 and earlier

* Improve doc + rename module rand --> random

* Fix menu of doc of random module

* Fix availability of scoped_mem...

* Fix tests to_pack*

* Tests are dependent on the SIMD architecture

* Improvements for Intel + Fixes for KNL

* More fixes for KNL and C89

* More fixes

* Fix fms/fnms for aarch64

* Fixes for SVE

* Fix warning when compiling for 32-bit targets

* Cleaning in tests generation

* Fix ULP bounds for some operators

* Almost all tests are passing on 32-bits platform

* No more warning for 32-bits compilations

* Forgot a file

* Fix last errors in philox

* First version of quick'n'dirty CI

* Fix warnings

* Fix more warnings

* Fix Python generation for module/random

* Fix fnms for SSE2 and SSE42

* Try again to fix warnings for GCC

* Fix warnings for Clang

* Add variable to compile for a given CUDA GPU

* Fix warnings for ROCm/HIP

* Fix CUDA f16 implementation

* Fix CUDA f16 implementation

* Fix CUDA f16 implementation

* Reduce size of arrays for GPU testing

* Reduce size of arrays for GPU testing

* Compile .so with nvcc and hipcc for binary compatibility

* Fix build.nsconfig

* Fix build.nsconfig

* Fix build.nsconfig

* Fix build.nsconfig

* Improve CI script + add static in NSIMD_INLINE

* Fix build.nsconfig for HIP

* Last fixes

* Fix issue: __popcnt64 not available in 32-bits mode

* Fix DLL specifier of *logulps*

* Fix MSVC 32-bits related issues

* Cosmetic

* Add __vectorcall for MSVC 32-bits

* Update .gitignore

Co-authored-by: Lénaïc Bagnères <[email protected]>
Co-authored-by: Lénaïc Bagnères <[email protected]>
Co-authored-by: Paul Gannay <[email protected]>
Co-authored-by: c <[email protected]>
Co-authored-by: Adrien Arnaud <[email protected]>
Co-authored-by: Rodolphe Cargnello <[email protected]>
7 people authored Dec 11, 2020
1 parent df84e57 commit eb11949
Showing 113 changed files with 17,285 additions and 9,171 deletions.
2 changes: 2 additions & 0 deletions .clang-format
@@ -0,0 +1,2 @@
Standard: Cpp03
ColumnLimit: 79
45 changes: 38 additions & 7 deletions .gitignore
@@ -1,34 +1,65 @@
## Build system
build
# Common build dirs
build*/

## Auto-generated
# Dependencies
nstools/

# Binaries
*.o
*.so
*.pyc
*.exe
*.dll
*.dylib

# Generated files
## API
src/api_*.cpp
src/api_*

## Plateform specific code
include/nsimd/arm
include/nsimd/cpu
include/nsimd/cxx_adv_api_functions.hpp
include/nsimd/friendly_but_not_optimized.hpp
include/nsimd/functions.h
include/nsimd/ppc
include/nsimd/x86
src/api_*

## Tests
tests/c_base
tests/cxx_base
tests/cxx_adv
tests/modules/tet1d/
tests/modules/fixed_point/
tests/modules/rand/*.cpp
tests/modules/spmd/
tests/modules/random/

## Benches
benches/cxx_adv
_deps
_install
doc/html

## Modules
include/nsimd/modules/tet1d/
include/nsimd/modules/spmd/
include/nsimd/modules/fixed_point/
include/nsimd/scalar_utilities.h

## Doc
doc/html
doc/markdown/overview.md
doc/markdown/api.md
doc/markdown/api_*.md
doc/markdown/module_fixed_point_api*.md
doc/markdown/module_fixed_point_overview.md
doc/markdown/module_spmd_api*.md
doc/markdown/module_spmd_overview.md
doc/markdown/module_memory_management_overview.md
doc/md2html
doc/tmp.html

## Ulps
ulps/

## CI
_ci/
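
A rough way to sanity-check a few of the ignore patterns listed above is sketched below in Python. This is only an approximation built on `fnmatch`, not git's actual matching algorithm (for example, `fnmatch`'s `*` also matches `/`), and the file names used in the usage note are hypothetical, chosen purely for illustration. For authoritative answers, `git check-ignore -v <path>` in a real checkout is the better tool.

```python
from fnmatch import fnmatchcase

# A handful of patterns copied from the .gitignore listing above.
PATTERNS = ["build*/", "src/api_*", "doc/markdown/api_*.md", "ulps/", "_ci/"]


def is_ignored(path, patterns=PATTERNS):
    """Approximate gitignore matching for a '/'-separated relative path.

    Assumption: this is a simplification for illustration only; it does not
    handle negation ('!'), '**', anchoring rules, or nested .gitignore files.
    """
    dirs = path.split("/")[:-1]
    # Every leading directory chain of the path, e.g. "a", "a/b", ...
    prefixes = ["/".join(dirs[:i]) for i in range(1, len(dirs) + 1)]
    for pat in patterns:
        if pat.endswith("/"):
            # Directory pattern: ignored if any parent directory matches.
            if any(fnmatchcase(p, pat[:-1]) for p in prefixes):
                return True
        elif fnmatchcase(path, pat):
            return True
    return False
```

For instance, `is_ignored("build-gcc/libnsimd.so")` and `is_ignored("src/api_cpu.cpp")` both return `True` under this approximation, while `is_ignored("src/memory.cpp")` returns `False`.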
286 changes: 0 additions & 286 deletions CMakeLists.txt

This file was deleted.
