champ for_each_chunk_p #270
Codecov Report
@@ Coverage Diff @@
## master #270 +/- ##
==========================================
+ Coverage 90.53% 90.54% +0.01%
==========================================
Files 119 119
Lines 12144 12203 +59
==========================================
+ Hits 10994 11049 +55
- Misses 1150 1154 +4
This is a great contribution, thank you! Once again, sorry for the delays in reviewing this...
...one of the reasons it took me so long to review is that I wanted to book time to understand the implementation properly, as a recursive implementation would have been easier to understand.
Have you tried to benchmark this to see if there is an actual performance benefit and, if so, how much? It feels to me that you're doing more or less the same work as the code the compiler generates when using recursion normally (just a bit less, since you don't save the pointer to the code position). But I ask out of genuine curiosity, as I really can't predict what performance would be like here; modern compilers and processors are sometimes surprising when it comes to micro-optimizing...
Hi @arximboldi, I might do some benchmarks to show some numbers, but I currently cannot tell you when I will find the time to do this.
My argument in this case is not that the compiler would inline the code, but that the alternative code does something very similar to what the compiler does when making a normal function call (i.e. pushing the local vars onto the stack), so you're kind of doing the compiler's job manually for no potential performance gain. I may be wrong of course, as this kind of micro-optimization is very nuanced.
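For anyone who wants to put numbers on this question, here is a minimal sketch of how the two variants could be compared. It uses a hypothetical toy tree (`bench_node`, `make_tree`), not immer's `champ` nodes, and the names `sum_recursive`, `sum_explicit`, and `average_microseconds` are made up for illustration. The fixed stack capacity of 32 frames is an assumption that only holds for shallow trees.

```cpp
#include <array>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical toy tree used only for this comparison; not immer's champ node.
struct bench_node
{
    long value = 0;
    std::vector<std::unique_ptr<bench_node>> children;
};

// Builds a complete tree with the given fan-out; depth 0 is a single node.
std::unique_ptr<bench_node> make_tree(int fanout, int depth)
{
    auto n = std::make_unique<bench_node>();
    n->value = 1;
    if (depth > 0)
        for (int i = 0; i < fanout; ++i)
            n->children.push_back(make_tree(fanout, depth - 1));
    return n;
}

// Variant 1: plain recursion. Locals and the return address live on the
// call stack managed by the compiler.
long sum_recursive(const bench_node& n)
{
    long total = n.value;
    for (const auto& c : n.children)
        total += sum_recursive(*c);
    return total;
}

// Variant 2: explicit stack of (node, next child index) frames.
// The capacity of 32 is an assumed bound on tree depth.
long sum_explicit(const bench_node& root)
{
    struct frame
    {
        const bench_node* node;
        std::size_t next;
    };
    std::array<frame, 32> stack{};
    std::size_t depth = 0;
    long total = root.value;
    stack[depth++] = frame{&root, 0};
    while (depth > 0) {
        frame& f = stack[depth - 1];
        if (f.next == f.node->children.size()) {
            --depth; // all children visited: pop, like returning from a call
            continue;
        }
        const bench_node* child = f.node->children[f.next++].get();
        total += child->value;
        assert(depth < stack.size());
        stack[depth++] = frame{child, 0};
    }
    return total;
}

// Runs fn `repeats` times and returns the average wall time per run in
// microseconds. Results are accumulated into `sink` so the calls are not
// optimized away.
template <typename Fn>
double average_microseconds(Fn&& fn, int repeats, long& sink)
{
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < repeats; ++i)
        sink += fn();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::micro>(stop - start).count() /
           repeats;
}
```

Calling `average_microseconds` once with a lambda around `sum_recursive` and once with one around `sum_explicit` on the same tree gives a first rough comparison. A proper benchmark would still need warm-up runs, realistic node layouts, and an optimized build.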
Currently, the immer `set`, `map`, and `table` containers only support a subset of the available algorithms; in particular, `immer::all_of` is not supported. That is because the underlying `champ` does not implement `for_each_chunk_p`.
This PR adds an implementation of `for_each_chunk_p` to the above-mentioned `champ`. Should be related to #171.
Design considerations:
`const node_t *`. However, this has the drawback of potential memory allocation, so the implementation uses an explicit stack (`std::array`) that models the required parts of the call-stack from `for_each_chunk_traversal`.
`std::invoke` the callback if compiled with C++17 or higher?
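To illustrate the design considerations above, here is a minimal sketch of a predicate traversal with an explicit, fixed-size stack. It is not the code from this PR: `toy_node`, `toy_for_each_chunk_p`, and `max_depth` are hypothetical names, the tree layout is a simplification, and the depth bound of 16 is an assumption. The callback is called through `std::invoke` only when compiling as C++17 or later.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical toy node, not immer's champ node: every node may hold a
// contiguous chunk of values and any number of children.
struct toy_node
{
    std::vector<const toy_node*> children;
    std::vector<int> values;
};

// Assumed upper bound on tree depth, so the stack fits in a std::array.
constexpr std::size_t max_depth = 16;

// Calls fn(first, last) for each non-empty chunk, parents before children.
// Stops and returns false as soon as fn returns false; returns true
// otherwise. Uses neither recursion nor heap allocation.
template <typename Fn>
bool toy_for_each_chunk_p(const toy_node* root, Fn&& fn)
{
    // One frame models the locals a recursive call would keep alive.
    struct frame
    {
        const toy_node* node;
        std::size_t next_child;
    };

    auto call = [&](const toy_node* n) -> bool {
        if (n->values.empty())
            return true; // nothing to visit in this node
        const int* first = n->values.data();
        const int* last  = first + n->values.size();
#if __cplusplus >= 201703L
        return std::invoke(fn, first, last);
#else
        return fn(first, last);
#endif
    };

    if (!root)
        return true;
    if (!call(root))
        return false;

    std::array<frame, max_depth> stack{};
    std::size_t depth = 0;
    stack[depth++] = frame{root, 0};
    while (depth > 0) {
        frame& f = stack[depth - 1];
        if (f.next_child == f.node->children.size()) {
            --depth; // all children done: pop, like returning from a call
            continue;
        }
        const toy_node* child = f.node->children[f.next_child++];
        if (!call(child))
            return false; // early exit, the point of the _p variant
        assert(depth < max_depth);
        stack[depth++] = frame{child, 0};
    }
    return true;
}
```

An `all_of`-style check then reduces to passing a callback that returns false for the first chunk containing an element that fails the predicate.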