Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement pairwise distance on 2d grid #306

Merged

Conversation

AlexanderKalistratov
Copy link
Collaborator

@AlexanderKalistratov AlexanderKalistratov commented Oct 21, 2023

Numbers before fix (size=8192, igpu, float32)

sycl 1d numba-dpex 2d wrong indexes numba-mlir 2d wrong indexes
300ms 1s 1s

After fix:

sycl 2d numba-dpex 2d numba-mlir 2d
17ms 23ms 17ms

Fixes #290

@adarshyoga
Copy link
Contributor

The faster performance is due to greater parallelism - this kernel is executed with a n*n range whereas the previous version ran on n range for the same input size.
But this precludes us from running on large matrices. For instance, there is a failure with preset=L because the range is greater than the limits set by sycl.

@AlexanderKalistratov
Copy link
Collaborator Author

Blocker IntelPython/numba-dpex#1191

@AlexanderKalistratov
Copy link
Collaborator Author

@adarshyoga @ZzEeKkAa i need an approve to continue

@adarshyoga
Copy link
Contributor

@adarshyoga @ZzEeKkAa i need an approve to continue

@AlexanderKalistratov , I am fine with the changes in this PR for SYCL implementation. But, I don't think we should be changing the numba_dpex_k and numba_mlir_k implementations.
Our goal in dpbench should be to maintain the same logic across kernel and sycl implementations. If there is a performance gap in dpex and mlir implementations because of the way indexing in handled, then we should perhaps be addressing the gap through changes in the frameworks.

@adarshyoga adarshyoga merged commit 362d006 into IntelPython:main Oct 31, 2023
40 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update pairwise distance sycl implementation
2 participants