Match the Postgres NaN behavior in vectorized filters #7598

Merged: 5 commits, Jan 22, 2025

akuzm
Member

@akuzm akuzm commented Jan 16, 2025

Postgres has some nonstandard NaN comparison rules that don't match IEEE 754 floats: it treats NaN as equal to NaN and greater than every non-NaN value.

Fixes #6884
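
To make the difference concrete, here is a minimal Python sketch of the ordering that PostgreSQL documents for floating-point values (NaN equals NaN and sorts above every non-NaN value), next to a plain IEEE comparison. The helper names `pg_float_cmp` and `pg_filter_gt` are hypothetical and exist only for illustration. This is not the code from this PR, which lives in the C vectorized filter implementation.

```python
import math


def pg_float_cmp(a: float, b: float) -> int:
    """Three-way comparison using the PostgreSQL float ordering.

    Returns -1, 0 or 1. NaN compares equal to NaN and greater than
    every non-NaN value (including +infinity).
    """
    a_nan, b_nan = math.isnan(a), math.isnan(b)
    if a_nan or b_nan:
        if a_nan and b_nan:
            return 0
        return 1 if a_nan else -1
    return (a > b) - (a < b)


def pg_filter_gt(values, threshold):
    """Row-at-a-time equivalent of `WHERE value > threshold` (hypothetical helper)."""
    return [v for v in values if pg_float_cmp(v, threshold) > 0]


nan = float("nan")
values = [1.0, nan, 5.0, float("inf")]

# Plain IEEE comparison: `nan > 3.0` is False, so the NaN row is dropped.
ieee_result = [v for v in values if v > 3.0]

# PostgreSQL ordering: NaN is greater than 3.0, so the NaN row is kept.
pg_result = pg_filter_gt(values, 3.0)
```

Under these rules the same `> 3.0` predicate keeps two rows with IEEE semantics and three rows with the PostgreSQL ordering, which is the kind of mismatch a filter has to avoid if it is to return the same rows as plain Postgres.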


codecov bot commented Jan 16, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 82.26%. Comparing base (59f50f2) to head (ae2be13).
Report is 705 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7598      +/-   ##
==========================================
+ Coverage   80.06%   82.26%   +2.19%     
==========================================
  Files         190      238      +48     
  Lines       37181    44135    +6954     
  Branches     9450    11102    +1652     
==========================================
+ Hits        29770    36307    +6537     
- Misses       2997     3449     +452     
+ Partials     4414     4379      -35     


@akuzm akuzm marked this pull request as ready for review January 20, 2025 11:31
Contributor

@erimatnor erimatnor left a comment

Looks like this is the same kind of fix that I made earlier for vector aggs, but now applied to filters. Is that correct?

@akuzm
Member Author

akuzm commented Jan 22, 2025

> Looks like this is the same kind of fix that I made earlier for vector aggs, but now applied to filters. Is that correct?

yeah, something like that

@akuzm akuzm enabled auto-merge (squash) January 22, 2025 10:34
@akuzm akuzm merged commit f0996a4 into timescale:main Jan 22, 2025
49 of 51 checks passed
@akuzm akuzm deleted the nan branch January 22, 2025 10:51
github-actions bot pushed a commit that referenced this pull request Jan 22, 2025
It has some nonstandard rules that don't match the IEEE floats.

(cherry picked from commit f0996a4)
timescale-automation pushed a commit that referenced this pull request Jan 22, 2025
It has some nonstandard rules that don't match the IEEE floats.

(cherry picked from commit f0996a4)
svenklemm pushed a commit to pallavisontakke/timescaledb that referenced this pull request Jan 23, 2025
It has some nonstandard rules that don't match the IEEE floats.
svenklemm pushed a commit to pallavisontakke/timescaledb that referenced this pull request Jan 23, 2025
This release introduces the ability to add secondary indexes to the columnstore, improves group by and filtering performance through columnstore vectorization, and adds transition table support, a highly upvoted community request. We recommend that you upgrade at the next available opportunity.

**Highlighted features in TimescaleDB v2.18.0**

* The ability to add secondary indexes to the columnstore through the new hypercore table access method.
* Significant performance improvements through vectorization (`SIMD`) for aggregations using a group by with one column and/or using a filter clause when querying the columnstore.
* Hypertables support triggers for transition tables, which is one of the most upvoted community feature requests.
* Updated methods to manage Timescale's hybrid row-columnar store (hypercore). The new names highlight the columnstore, which includes both an optimized columnar format and compression.

**Dropping support for Bitnami images**

After the recent change in Bitnami’s [LTS support policy](bitnami/containers#75671), we are no longer building Bitnami images for TimescaleDB. We recommend using the [official TimescaleDB Docker image](https://hub.docker.com/r/timescale/timescaledb-ha).

**Deprecation Notice**

We are deprecating the following parameters, functions, procedures and views. They will be removed with the next major release of TimescaleDB. Please find the replacements in the table below:

| Deprecated | Replacement | Type |
| --- | --- | --- |
| decompress_chunk | convert_to_rowstore | Procedure |
| compress_chunk | convert_to_columnstore | Procedure |
| add_compression_policy | add_columnstore_policy | Function |
| remove_compression_policy | remove_columnstore_policy | Function |
| hypertable_compression_stats | hypertable_columnstore_stats | Function |
| chunk_compression_stats | chunk_columnstore_stats | Function |
| hypertable_compression_settings | hypertable_columnstore_settings | View |
| chunk_compression_settings | chunk_columnstore_settings | View |
| compression_settings | columnstore_settings | View |
| timescaledb.compress | timescaledb.enable_columnstore | Parameter |
| timescaledb.compress_segmentby | timescaledb.segmentby | Parameter |
| timescaledb.compress_orderby | timescaledb.orderby | Parameter |
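
As a rough aid for migrating, the deprecation table can be turned into a small lookup that flags deprecated names in a piece of SQL text. The sketch below is an illustration only: `REPLACEMENTS` is copied from the table, while `find_deprecated` and the sample statement are hypothetical and not part of TimescaleDB. It matches names textually, so it does not understand SQL syntax, comments, or string literals.

```python
import re

# Deprecated name -> replacement, copied from the deprecation table.
REPLACEMENTS = {
    "decompress_chunk": "convert_to_rowstore",
    "compress_chunk": "convert_to_columnstore",
    "add_compression_policy": "add_columnstore_policy",
    "remove_compression_policy": "remove_columnstore_policy",
    "hypertable_compression_stats": "hypertable_columnstore_stats",
    "chunk_compression_stats": "chunk_columnstore_stats",
    "hypertable_compression_settings": "hypertable_columnstore_settings",
    "chunk_compression_settings": "chunk_columnstore_settings",
    "compression_settings": "columnstore_settings",
    "timescaledb.compress": "timescaledb.enable_columnstore",
    "timescaledb.compress_segmentby": "timescaledb.segmentby",
    "timescaledb.compress_orderby": "timescaledb.orderby",
}


def find_deprecated(sql: str) -> dict:
    """Return {deprecated_name: replacement} for each deprecated name found in sql.

    A name only matches as a whole identifier, so `compress_chunk` is not
    reported for text that merely contains `decompress_chunk`.
    """
    hits = {}
    for old, new in REPLACEMENTS.items():
        pattern = r"(?<!\w)" + re.escape(old) + r"(?!\w)"
        if re.search(pattern, sql):
            hits[old] = new
    return hits


# Hypothetical statement using the old names.
sample_sql = (
    "ALTER TABLE metrics SET (timescaledb.compress, "
    "timescaledb.compress_segmentby = 'device'); "
    "SELECT decompress_chunk(c) FROM show_chunks('metrics') c;"
)
found = find_deprecated(sample_sql)
```

Here `found` maps `decompress_chunk`, `timescaledb.compress`, and `timescaledb.compress_segmentby` to their replacements. The replacement may differ in kind from the old object (the table lists `convert_to_rowstore` as a procedure), so each hit still needs a manual rewrite rather than a blind rename.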

**Features**
* timescale#7341: Vectorized aggregation with grouping by one fixed-size by-value compressed column (such as arithmetic types).
* timescale#7104: Hypercore table access method.
* timescale#6901: Add hypertable support for transition tables.
* timescale#7482: Optimize recompression of partially compressed chunks.
* timescale#7458: Support vectorized aggregation with aggregate `filter` clauses that are also vectorizable.
* timescale#7433: Add support for merging chunks.
* timescale#7271: Push down `order by` in real-time continuous aggregate queries.
* timescale#7455: Support `drop not null` on compressed hypertables.
* timescale#7295: Support `alter table set access method` on hypertable.
* timescale#7411: Change parameter name to enable hypercore table access method.
* timescale#7436: Add index creation on `order by` columns.
* timescale#7443: Add hypercore function and view aliases.
* timescale#7521: Add optional `force` argument to `refresh_continuous_aggregate`.
* timescale#7528: Transform sorting on `time_bucket` to sorting on time for compressed chunks in some cases.
* timescale#7565: Add hint when hypertable creation fails.
* timescale#7390: Disable custom `hashagg` planner code.
* timescale#7587: Add `include_tiered_data` parameter to `add_continuous_aggregate_policy` API.
* timescale#7486: Prevent building against PostgreSQL versions with broken ABI.
* timescale#7412: Add [GUC](https://www.postgresql.org/docs/current/acronyms.html#:~:text=GUC) for the `hypercore_use_access_method` default.
* timescale#7413: Add GUC for segmentwise recompression.

**Bugfixes**
* timescale#7378: Remove obsolete job referencing `policy_job_error_retention`.
* timescale#7409: Update `bgw_job` table when altering procedure.
* timescale#7410: Fix the `aggregated compressed column not found` error on aggregation query.
* timescale#7426: Fix `datetime` parsing error in chunk constraint creation.
* timescale#7432: Verify that the heap tuple is valid before using.
* timescale#7434: Fix the segfault when internally setting the replica identity for a given chunk.
* timescale#7488: Emit error for transition table trigger on chunks.
* timescale#7514: Fix the error: `invalid child of chunk append`.
* timescale#7517: Fix the performance regression on the `cagg_migrate` procedure.
* timescale#7527: Restart scheduler on error.
* timescale#7557: Fix null handling for in-memory tuple filtering.
* timescale#7566: Improve transaction check in CAGG refresh.
* timescale#7584: Fix NaN-handling for vectorized aggregation.
* timescale#7598: Match the Postgres NaN comparison behavior in WHERE clause over compressed tables.

**Thanks**
* @bharrisau for reporting the segfault when creating chunks.
* @jakehedlund for reporting the incompatible NaN behavior in WHERE clause over compressed tables.
* @k-rus for suggesting that we add a hint when hypertable creation fails.
* @staticlibs for sending the pull request that improves the transaction check in CAGG refresh.
* @uasiddiqi for reporting the `aggregated compressed column not found` error.
Linked issue: [Bug]: NaN behavior changes in compressed tables