Performance regression come again on dotnet 9 #111016

Open
kingsznhone opened this issue Jan 1, 2025 · 4 comments
Assignees
Labels
area-System.Numerics tenet-performance Performance related issue untriaged New issue has not been triaged by the area owner

Comments

@kingsznhone

Description

#95954
Using the temporary workaround from that issue, I added this segment to fix the .NET 8 performance regression. It brought performance back to about 90% of .NET 7:
https://github.com/kingsznhone/VSOP2013.NET/blob/8a9e03fd734d9c29de788877724e32022b042d21/VSOP2013.NET/Calculator.cs#L151

A few days ago I tried running the perf test on .NET 9, and the same situation occurred as before.

Whether I apply that fix or not, performance is still very bad.

This line causes the heavy performance drop, as before:
(su, cu) = Math.SinCos(u);

Data

BenchmarkDotNet v0.13.11, Windows 11 (10.0.22631.4602/23H2/2023Update/SunValley3)
12th Gen Intel Core i9-12950HX, 1 CPU, 16 logical and 16 physical cores
.NET SDK 9.0.101
[Host] : .NET 8.0.11 (8.0.1124.51707), X64 RyuJIT AVX2 [AttachedDebugger]
.NET 8.0 : .NET 8.0.11 (8.0.1124.51707), X64 RyuJIT AVX2
.NET 9.0 : .NET 9.0.0 (9.0.24.52809), X64 RyuJIT AVX2

| Method  | Job      | Runtime  | Mean        | Error     | StdDev    | Ratio | Allocated | Alloc Ratio |
|-------- |--------- |--------- |------------:|----------:|----------:|------:|----------:|------------:|
| Compute | .NET 8.0 | .NET 8.0 |    739.2 μs |   8.05 μs |   7.14 μs |  1.00 |   2.91 KB |        1.00 |
|         |          |          |             |           |           |       |           |             |
| Compute | .NET 9.0 | .NET 9.0 | 29,018.1 μs | 183.88 μs | 172.00 μs |  1.00 |   2.92 KB |        1.00 |


@kingsznhone kingsznhone added the tenet-performance Performance related issue label Jan 1, 2025
@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Jan 1, 2025
@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Jan 1, 2025
@vcsjones vcsjones added area-System.Numerics and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Jan 1, 2025
Contributor

Tagging subscribers to this area: @dotnet/area-system-numerics
See info in area-owners.md if you want to be subscribed.

@tannergooding
Member

I do measure a regression, but nowhere near what you're seeing (on Intel or AMD)

This is due to an MSVC correctness bug that exists in the native implementation: https://developercommunity.visualstudio.com/t/MSVCs-sincos-implementation-is-incorrec/10582378. Because of it, .NET 9 was fixed to no longer use the /fp:fast implementation and invokes the /fp:precise one instead. .NET 8 retains the buggy behavior and will return an incorrect Cos result for some large inputs.
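
A rough way to see whether a given runtime shows the discrepancy described above is to compare `Math.SinCos` against separate `Math.Sin` and `Math.Cos` calls. This is only a sketch: the class name `SinCosCheck` and the sample inputs are made up for illustration, and the thread does not say which inputs are actually affected, so a small result here does not prove the runtime is free of the bug.

```csharp
using System;

public static class SinCosCheck
{
    // Largest absolute difference between Math.SinCos and separate
    // Math.Sin / Math.Cos calls over the given inputs.
    public static double MaxDifference(ReadOnlySpan<double> inputs)
    {
        double worst = 0.0;
        foreach (double x in inputs)
        {
            (double s, double c) = Math.SinCos(x);
            worst = Math.Max(worst, Math.Abs(s - Math.Sin(x)));
            worst = Math.Max(worst, Math.Abs(c - Math.Cos(x)));
        }
        return worst;
    }

    public static void Main()
    {
        // Arbitrary sample inputs (an assumption); the affected values are not listed in the thread.
        double[] inputs = { 0.5, 1e3, 1e6, 1e9, 1e12, 1e15, 1e18 };
        Console.WriteLine($"Runtime: {Environment.Version}");
        Console.WriteLine($"Max |SinCos - (Sin, Cos)| = {SinCosCheck.MaxDifference(inputs):E3}");
    }
}
```

Running the same program on .NET 8 and .NET 9 and comparing the printed maximum would show whether the two runtimes disagree for these particular inputs.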

The exact performance is a bit dependent on the input, but it should be pessimizing to approximately the same performance as if you invoked Sin and Cos independently, which is what I'm getting locally.

Intel

BenchmarkDotNet v0.13.11, Windows 11 (10.0.26100.2605)
Intel Xeon Platinum 8370C CPU 2.80GHz, 1 CPU, 16 logical and 8 physical cores
.NET SDK 9.0.100
  [Host]   : .NET 9.0.0 (9.0.24.52809), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  .NET 8.0 : .NET 8.0.10 (8.0.1024.46610), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  .NET 9.0 : .NET 9.0.0 (9.0.24.52809), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
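| Method  | Job      | Runtime  | Mean     | Error     | StdDev    | Ratio | Code Size | Allocated | Alloc Ratio |
|-------- |--------- |--------- |---------:|----------:|----------:|------:|----------:|----------:|------------:|
| Compute | .NET 8.0 | .NET 8.0 | 1.397 ms | 0.0119 ms | 0.0117 ms |  1.00 |     314 B |    2.9 KB |        1.00 |
| Compute | .NET 9.0 | .NET 9.0 | 2.281 ms | 0.0068 ms | 0.0064 ms |  1.00 |     314 B |    2.9 KB |        1.00 |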

AMD

BenchmarkDotNet v0.13.11, Windows 11 (10.0.26100.2605)
AMD Ryzen 9 7950X, 1 CPU, 32 logical and 16 physical cores
.NET SDK 9.0.200-preview.0.24575.35
  [Host]   : .NET 9.0.0 (9.0.24.52809), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  .NET 8.0 : .NET 8.0.11 (8.0.1124.51707), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  .NET 9.0 : .NET 9.0.0 (9.0.24.52809), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
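| Method  | Job      | Runtime  | Mean       | Error   | StdDev  | Ratio | Code Size | Allocated | Alloc Ratio |
|-------- |--------- |--------- |-----------:|--------:|--------:|------:|----------:|----------:|------------:|
| Compute | .NET 8.0 | .NET 8.0 |   714.2 us | 2.00 us | 1.56 us |  1.00 |     314 B |   2.92 KB |        1.00 |
| Compute | .NET 9.0 | .NET 9.0 | 1,314.8 us | 5.29 us | 4.69 us |  1.00 |     314 B |    2.9 KB |        1.00 |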
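
To isolate the claim that `Math.SinCos` on .NET 9 costs roughly the same as calling `Sin` and `Cos` independently, a reduced benchmark along these lines could be used. It is a minimal sketch, not the `Compute` benchmark from the repository: the class `SinCosBenchmarks`, the input range, and the array size are assumptions, and the project is assumed to reference the BenchmarkDotNet package and to multi-target `net8.0` and `net9.0`.

```csharp
using System;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Jobs;
using BenchmarkDotNet.Running;

// Marking the .NET 8 job as baseline is meant to make the Ratio column
// compare the two runtimes rather than each job against itself.
[SimpleJob(RuntimeMoniker.Net80, baseline: true)]
[SimpleJob(RuntimeMoniker.Net90)]
public class SinCosBenchmarks
{
    private double[] _inputs = Array.Empty<double>();

    [GlobalSetup]
    public void Setup()
    {
        // Fixed seed so both runtimes see the same inputs (values are arbitrary).
        var rng = new Random(42);
        _inputs = new double[10_000];
        for (int i = 0; i < _inputs.Length; i++)
            _inputs[i] = rng.NextDouble() * 1_000.0;
    }

    [Benchmark]
    public double SinCos()
    {
        double sum = 0.0;
        foreach (double u in _inputs)
        {
            (double su, double cu) = Math.SinCos(u);
            sum += su + cu;
        }
        return sum; // returned so the work is not optimized away
    }

    [Benchmark]
    public double SeparateSinAndCos()
    {
        double sum = 0.0;
        foreach (double u in _inputs)
            sum += Math.Sin(u) + Math.Cos(u);
        return sum;
    }
}

public static class Program
{
    public static void Main(string[] args) => BenchmarkRunner.Run<SinCosBenchmarks>();
}
```

As noted above, the exact performance depends on the input, so the input range in `Setup` would need to resemble the real workload for the numbers to be meaningful.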

@kingsznhone
Author

kingsznhone commented Jan 2, 2025

I notice that your platforms have AVX-512 support, but my laptop only has AVX2. Or it might be a bug exclusive to 12th/13th/14th gen Core CPUs, lol.

As you say, I think I should call Sin and Cos separately in the future to avoid strange behaviour. Thanks.
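
The change described here could look like the following sketch, which replaces `(su, cu) = Math.SinCos(u);` with two independent calls. The helper name `SinCosSeparate` and the class `Trig` are hypothetical and not taken from the VSOP2013.NET code. Whether this actually helps on a given machine would need measuring, since the earlier comment suggests .NET 9's `SinCos` should already perform about like separate calls.

```csharp
using System;

public static class Trig
{
    // Hypothetical helper: computes sine and cosine with two independent
    // calls instead of Math.SinCos.
    public static (double Sin, double Cos) SinCosSeparate(double u)
        => (Math.Sin(u), Math.Cos(u));

    public static void Main()
    {
        double u = 1.25; // arbitrary example angle in radians
        (double su, double cu) = SinCosSeparate(u);
        Console.WriteLine($"sin = {su}, cos = {cu}");
    }
}
```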

@jeffhandley
Member

Assigned to @PranavSenthilnathan for triage.


5 participants