Skip hash bucketing for small partitions #264
Functionality-wise, this looks good. I think a bit of refactoring is needed to make this code easier to maintain in the future.
Looks good mostly. Couple of minor comments.
LGTM!
This PR adds a feature to skip hash bucketing if the user-supplied hash bucket count is set to 1.
Changes
- The `merge` step is invoked directly to compact tables.
- Because the `merge` step could be dealing with delta file envelope object refs (remote) or pure delta file envelopes (local), an abstraction layer is added for delta file envelope retrieval when invoked by `merge`.
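
The two changes above can be illustrated with a minimal, self-contained sketch. It is not the code in this PR: every name here (`DeltaFileEnvelope`, `EnvelopeRef`, `get_envelopes`, `merge`, `compact`) is a hypothetical stand-in, a plain dict plays the role of the remote object store, and the merge logic is assumed to be "latest row per primary key wins".

```python
import zlib
from dataclasses import dataclass
from typing import Callable, Dict, List, Union


@dataclass
class DeltaFileEnvelope:
    """Hypothetical stand-in for a delta file envelope."""
    stream_position: int
    rows: List[dict]


@dataclass(frozen=True)
class EnvelopeRef:
    """Hypothetical handle to an envelope kept in a remote object store."""
    key: str


EnvelopeOrRef = Union[DeltaFileEnvelope, EnvelopeRef]
Fetch = Callable[[EnvelopeRef], DeltaFileEnvelope]


def get_envelopes(items: List[EnvelopeOrRef], fetch: Fetch) -> List[DeltaFileEnvelope]:
    """Abstraction layer: return concrete envelopes for refs and envelopes alike."""
    return [fetch(i) if isinstance(i, EnvelopeRef) else i for i in items]


def merge(items: List[EnvelopeOrRef], primary_key: str, fetch: Fetch) -> List[dict]:
    """Keep the most recent row per primary key (assumed merge semantics)."""
    latest: Dict[object, dict] = {}
    envelopes = get_envelopes(items, fetch)
    for env in sorted(envelopes, key=lambda e: e.stream_position):
        for row in env.rows:
            latest[row[primary_key]] = row
    return list(latest.values())


def compact(
    envelopes: List[DeltaFileEnvelope],
    hash_bucket_count: int,
    primary_key: str,
    store: Dict[EnvelopeRef, DeltaFileEnvelope],
) -> List[dict]:
    if hash_bucket_count == 1:
        # Skip hash bucketing: merge receives the local envelopes directly.
        return merge(envelopes, primary_key, store.__getitem__)

    # Otherwise, split each envelope's rows by hash of the primary key and
    # hand merge a list of refs per bucket.
    buckets: List[List[EnvelopeOrRef]] = [[] for _ in range(hash_bucket_count)]
    for env in envelopes:
        per_bucket: Dict[int, List[dict]] = {}
        for row in env.rows:
            idx = zlib.crc32(str(row[primary_key]).encode()) % hash_bucket_count
            per_bucket.setdefault(idx, []).append(row)
        for idx, rows in per_bucket.items():
            # Assumes stream positions are unique across input envelopes.
            ref = EnvelopeRef(f"{env.stream_position}-{idx}")
            store[ref] = DeltaFileEnvelope(env.stream_position, rows)
            buckets[idx].append(ref)

    merged: List[dict] = []
    for refs in buckets:
        merged.extend(merge(refs, primary_key, store.__getitem__))
    return merged
```

In this sketch, `merge` never needs to know which path produced its input: with a bucket count of 1 it is handed envelopes, otherwise it is handed refs, and `get_envelopes` hides the difference.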
Initial Benchmarks