Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for decimal256 and allow rounding decimal to scale #439

Merged
merged 6 commits into from
Jan 3, 2025

Conversation

raghumdani
Copy link
Collaborator

Summary

Arrow cannot support reading CSV using decimal256. We add support by first reading them as strings and then casting to decimal256. Secondly, we support rounding decimal values to scale specified in the schema if the decimal values in the CSV does not conform to scale.

Rationale

The behavior will be same as Spark compaction.

Changes

  • added tests
  • Changed pyarrow.py

Impact

This change is backward compatible.

Testing

Changes were tested in a functional test suite as well reading individual problematic files in beta.

Regression Risk

Low

Checklist

  • Unit tests covering the changes have been added

    • If this is a bugfix, regression tests have been added
  • E2E testing has been performed

Additional Notes

Any additional information or context relevant to this PR.

@raghumdani raghumdani changed the title Decimalbugfix Add support for decimal256 and allow rounding decimal to scale Jan 3, 2025
Copy link
Collaborator

@yankevn yankevn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also test on a table that was failing previously?

deltacat/utils/pyarrow.py Outdated Show resolved Hide resolved
deltacat/utils/pyarrow.py Show resolved Hide resolved
@raghumdani
Copy link
Collaborator Author

Could you also test on a table that was failing previously?

Yes, tested on two tables and the reads are working fine. The resultant tables are same as what spark would produce.

Copy link
Collaborator

@yankevn yankevn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@raghumdani raghumdani merged commit a1b8159 into main Jan 3, 2025
3 checks passed
@raghumdani raghumdani deleted the decimalbugfix branch January 3, 2025 19:53
@raghumdani raghumdani mentioned this pull request Jan 8, 2025
3 tasks
@pfaraone pfaraone mentioned this pull request Jan 22, 2025
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants