Switch to the `*_neon` functions #183

aqjune-aws · 2025-01-16T22:16:14Z

This patch makes the `*_neon` functions replace their original scalar implementations. This partially resolves the divergence between the sets of functions supported on Arm and on x86. A few functions still differ, namely `bignum_emontredc_8n_cdiff` and `bignum_copy_row_from_table_*`, which exist only on Arm, but all other functions have converged.

The original scalar functions are moved to the `unopt/` directories. Their proofs are merged into the `*_neon.ml` proofs, which are in turn renamed to the original `*.ml`.
All `_NEON` and `_neon` suffixes are removed.

Also, this patch applies the NIST P-256 optimized field operations to `p256_scalarmulbase`, which previously did not use them.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

aqjune-aws · 2025-01-16T22:29:42Z

To check that the functions are correctly renamed, I ran the benchmark with the following script:

suffix=_neon # Enable this line before applying this patch
suffix=  # ... after applying this patch

./benchmark bignum_sqr_8_16${suffix}
./benchmark bignum_mul_8_16${suffix}
./benchmark bignum_ksqr_16_32${suffix}
./benchmark bignum_ksqr_32_64${suffix}
./benchmark bignum_kmul_16_32${suffix}
./benchmark bignum_kmul_32_64${suffix}

./benchmark bignum_emontredc_8n${suffix}

for bits in 256 384 521; do
  ./benchmark bignum_montmul_p${bits}${suffix}
  ./benchmark bignum_montsqr_p${bits}${suffix}
done

for rowsz in 8n 16 32; do
  ./benchmark bignum_copy_row_from_table_${rowsz}${suffix}
done

./benchmark p256_scalarmulbase # Interestingly this performance result did not change much :/
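
The script above is switched between the pre-patch and post-patch names by editing the `suffix` line by hand. A minimal sketch of the same check with the suffix passed as an argument is shown here. The helper names `bench_names` and `run_benchmarks` are hypothetical and not part of the repository, and the sketch assumes the same `./benchmark` executable used above is in the current directory.

```shell
#!/bin/sh
# Sketch only: bench_names and run_benchmarks are hypothetical helpers.

# Print every benchmark target from the script above, one per line.
# $1 is the suffix: "_neon" before this patch, empty after it.
bench_names() {
  for f in bignum_sqr_8_16 bignum_mul_8_16 \
           bignum_ksqr_16_32 bignum_ksqr_32_64 \
           bignum_kmul_16_32 bignum_kmul_32_64 \
           bignum_emontredc_8n; do
    echo "${f}$1"
  done
  for bits in 256 384 521; do
    echo "bignum_montmul_p${bits}$1"
    echo "bignum_montsqr_p${bits}$1"
  done
  for rowsz in 8n 16 32; do
    echo "bignum_copy_row_from_table_${rowsz}$1"
  done
  # This name has no suffix either before or after the patch.
  echo "p256_scalarmulbase"
}

# Run ./benchmark once per target (assumes ./benchmark exists here).
run_benchmarks() {
  bench_names "$1" | while read -r name; do
    ./benchmark "$name"
  done
}
```

With these definitions, `run_benchmarks _neon` would correspond to the pre-patch run and `run_benchmarks ""` to the post-patch run.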


aqjune-aws force-pushed the neon_by_default branch from 7ec42ae to 8d4a6af Compare January 17, 2025 04:46
