Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add benchmark for CellList #734

Merged
merged 1 commit into from
Jan 18, 2025
Merged

Conversation

padix-key
Copy link
Member

No description provided.

Copy link

codspeed-hq bot commented Jan 7, 2025

CodSpeed Performance Report

Merging #734 will not alter performance

Comparing padix-key:celllist-benchmark (242a5cc) with main (226f2b4)

Summary

✅ 59 untouched benchmarks

@Croydon-Brixton
Copy link
Contributor

Hey @padix-key , I've been wondering this for a while but never had the time to investigate and thought you might know this: How does biotite's internal CellList performance compare to scipy's KDTrees (https://docs.scipy.org/doc/scipy-1.15.0/reference/generated/scipy.spatial.KDTree.html)?

The functionality they provide seem very similar (querying nearest neighbors, querying all points within a distance X, looping around in case of periodic boundary conditions, etc.) but I suspect there might have been a deeper reason to implement CellList instead of deferring to scipy's KDTrees?

@padix-key
Copy link
Member Author

padix-key commented Jan 13, 2025

This is a good question. There are multiple reasons:

  • Years ago, when I implemented the CellList I did not know about scipy.spatial.KDTree 😉.
  • The implementation is inspired by MD programs. Specifically it is able to handle periodic boundary conditions, while I do not know if K-d trees can handle them as well. (It could if, simply adjacent periodic copies of the molecule are added to the K-d tree.)
  • Neighbour search in a cell list has O(1) time complexity while for a K-d tree it is O(log n) (if I am not wrong).

That being said, I would still be curious how they compare with each other in terms of performance, especially over a range of structure sizes!

@Croydon-Brixton
Copy link
Contributor

Thank you for the quick response - that's very helpful and makes a lot of sense!

I agree that it'd be interesting to compare the performance. I've used it on AtomArray's before when processing the entire PDB and it was very fast, but I never did a proper comparison with CellList.

I'll put it on my backlog of things to try out -- if it were to indeed work as fast and provide the equivalent functionality, we could consider switching in KD-trees to reduce the code volume we'd need to actively maintain.

@padix-key
Copy link
Member Author

I generally agree. However one additional aspect to consider is that it would add scipy as mandatory dependency (which would probably be no big deal).

@padix-key padix-key merged commit 886b446 into biotite-dev:main Jan 18, 2025
28 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants