update my branch #305

Merged
61 commits
98d001d
add macro to process new prejoin list syntax
tkiehn Nov 19, 2024
4b0e02d
add process_prejoined_columns macro to top-level stage macro
tkiehn Nov 19, 2024
07ec2de
change prejoin-logic to perform less joins
tkiehn Nov 20, 2024
fa4087e
add check and compilation error if a prejoined column is defined twice
tkiehn Nov 27, 2024
54b8720
add amount of extract_columns and aliases to amount-mismatch compilat…
tkiehn Nov 27, 2024
582af24
Merge branch 'main' into 287-feature-extract-multiple-columns-from-on…
tkiehn Nov 27, 2024
60b2a98
add prejoin with source to processing-macro
tkiehn Nov 27, 2024
9f1a06b
removed unnecessary md file
tkirschke Dec 2, 2024
d18d28b
First version of yaml parsing, needs to be turned into macro
tkirschke Dec 2, 2024
24ee6ba
First version of macro, and Control_snap_v0 adjusted to use it.
tkirschke Dec 2, 2024
956e71a
Update yaml_metadata_parser.sql, fix if-nesting, rename metadata_dict
tkiehn Dec 3, 2024
3a94499
Fixed warnings
tkirschke Dec 3, 2024
f3bc4a7
Merge branch '181-modify-macros-to-directly-work-with-yaml-metadata' …
tkirschke Dec 3, 2024
3fed963
move stage_processing_macros.sql into staging folder
tkiehn Dec 9, 2024
edf3dc4
change extract_input_columns, process_prejoined_columns. add extract_…
tkiehn Dec 11, 2024
da7d000
add staging.yml with descriptions of process_prejoined_columns and ex…
tkiehn Dec 11, 2024
a970a0a
postgres: modify stage to handle new prejoin syntax and simplify sett…
tkiehn Dec 11, 2024
f8767ae
bigquery: stage: implement new prejoin syntax
tkiehn Dec 11, 2024
90dc5c8
databricks: stage: implement new prejoin syntax
tkiehn Dec 11, 2024
706b2af
exasol stage: implement new prejoin syntax
tkiehn Dec 11, 2024
20d012c
fabric stage: implement new prejoin syntax
tkiehn Dec 11, 2024
36617f3
oracle stage: implement new prejoin syntax
tkiehn Dec 11, 2024
7adba69
postgres stage add prepend_generated_by()
tkiehn Dec 11, 2024
3070e0a
redshift stage: implement new prejoin syntax
tkiehn Dec 11, 2024
ef8b375
snowflake stage: implement new prejoin syntax
tkiehn Dec 11, 2024
e891046
synapse stage: implement new prejoin syntax
tkiehn Dec 11, 2024
a065dc1
First batch of modified front-end macros
tkirschke Dec 11, 2024
71bfee3
synapse stage: remove column name escaping in ghost record macro call
tkiehn Dec 12, 2024
e683e45
fabric stage fix escape column names
tkiehn Dec 12, 2024
3a86469
synapse, fabric stages: fix derived input columns
tkiehn Dec 12, 2024
57e21da
oracle stage: include col_size to ghost records
tkiehn Dec 12, 2024
c8df4d3
synapse stage: fix prejoin_column_names
tkiehn Dec 12, 2024
f484bbb
Finished yaml implementation, not tested yet
tkirschke Dec 13, 2024
1642057
Fixed returning none for optional parameters
tkirschke Dec 13, 2024
2f9e218
Fixed ref satellite parameter
tkirschke Dec 19, 2024
5e203ca
unify formatting of yaml_metadata_parser calls
tkiehn Jan 6, 2025
aa0ea49
Update hash_standardization.sql
tkirschke Jan 7, 2025
1e2f721
Update hash_standardization.sql
tkiehn Jan 7, 2025
de6861f
Move warnings to log and improve content of the messages
tkiehn Jan 7, 2025
ec1366c
add ghost record for DATETIME on BigQuery
tkiehn Jan 7, 2025
bef34c1
fix oracle ghost records for timestamps
tkiehn Jan 7, 2025
78af272
Merge remote-tracking branch 'origin/issue-295-tkirschke-patch' into …
tkiehn Jan 8, 2025
659d8ea
fix behaviour of derived columns throughout the CTEs for all adapers …
tkiehn Jan 8, 2025
c859311
fix fabric ghost records for datetime2
tkiehn Jan 8, 2025
0557fbf
Merge branch '287-feature-extract-multiple-columns-from-one-prejoined…
tkiehn Jan 8, 2025
6c380d3
Update oracle stage, remove AS for alias of joined tables
tkiehn Jan 8, 2025
e1703c7
Update oracle ghost_record_per_datatype, simplify detection of (var)c…
tkiehn Jan 9, 2025
c0d40bf
redshift: use qualify statement instead of prep CTEs throughout all m…
tkiehn Jan 9, 2025
7f13d91
change placeholder alias to actual alias in deduplicated_incoming cte…
tkiehn Jan 9, 2025
d143732
small fixes on exasol stage
tkiehn Jan 9, 2025
1570800
Update ghost_record_per_datatype.sql, fix databricks hash_default_values
tkiehn Jan 9, 2025
76c5ccf
Merge pull request #300 from ScalefreeCOM/test
tkiehn Jan 10, 2025
3f9c9f8
Merge branch 'main' into fix-databricks-ghost-records-binary-defaults
tkiehn Jan 10, 2025
8eb950c
Merge remote-tracking branch 'origin/main' into 298-bug-redshift-not-…
tkiehn Jan 10, 2025
e397ce4
Merge pull request #301 from ScalefreeCOM/fix-databricks-ghost-record…
tkiehn Jan 10, 2025
261d7ab
Merge branch 'main' into 298-bug-redshift-not-using-qualify-for-lates…
tkiehn Jan 10, 2025
27f64ae
Merge pull request #302 from ScalefreeCOM/298-bug-redshift-not-using-…
tkiehn Jan 10, 2025
c524a57
Added optional parameter "union_strategy" and implemented it in all n…
tkirschke Jan 13, 2025
72e0085
Added optional parameter "union_strategy" and implemented it in all n…
tkirschke Jan 13, 2025
78d6364
fix nh_link union strategy condition syntax
tkiehn Jan 13, 2025
208139c
Merge pull request #303 from ScalefreeCOM/177-nh-link-add-union_strat…
tkiehn Jan 13, 2025
8 changes: 0 additions & 8 deletions General_Features.md

This file was deleted.

21 changes: 21 additions & 0 deletions macros/internal/metadata_processing/metadata_processing.yml
@@ -0,0 +1,21 @@
version: 2

macros:
  - name: yaml_metadata_parser
    description: A macro to parse yaml-metadata into single parameters. Used in top-level front-end macros.
    arguments:
      - name: name
        type: string
        description: The name of the parameter you want to extract from the yaml-metadata.
      - name: yaml_metadata
        type: string
        description: The yaml-string that holds the definition of other parameters. Needs to be in yaml format.
      - name: parameter
        type: variable
        description: The forwarded parameter of the top-level macro. Used if the yaml-metadata is none or does not define this parameter.
      - name: required
        type: boolean
        description: Whether this parameter is required for the top-level macro. Default is False.
      - name: documentation
        type: string
        description: A string that holds documentation of this parameter.
28 changes: 28 additions & 0 deletions macros/internal/metadata_processing/yaml_metadata_parser.sql
@@ -0,0 +1,28 @@
{% macro yaml_metadata_parser(name=none, yaml_metadata=none, parameter=none, required=False, documentation=none) %}

    {% if datavault4dbt.is_something(yaml_metadata) %}
        {%- set metadata_dict = fromyaml(yaml_metadata) -%}
        {% if name in metadata_dict.keys() %}
            {% set return_value = metadata_dict.get(name) %}
            {% if datavault4dbt.is_something(parameter) %}
                {{ log("[" ~ this ~ "] Parameter '" ~ name ~ "' defined both in yaml-metadata and separately. Value from yaml-metadata will be used, and separate parameter is ignored.", info=False) }}
            {% endif %}
        {% elif datavault4dbt.is_something(parameter) %}
            {% set return_value = parameter %}
            {{ log("[" ~ this ~ "] yaml-metadata given, but parameter '" ~ name ~ "' not defined in there. Applying '" ~ parameter ~ "' which is either a parameter passed separately or the default value.", info=False) }}
        {% elif required %}
            {{ exceptions.raise_compiler_error("[" ~ this ~ "] Error: yaml-metadata given, but required parameter '" ~ name ~ "' not defined in there or outside in the parameter. \n Description of parameter '" ~ name ~ "': \n" ~ documentation ) }}
        {% else %}
            {% set return_value = None %}
        {% endif %}
    {% elif datavault4dbt.is_something(parameter) %}
        {% set return_value = parameter %}
    {% elif required %}
        {{ exceptions.raise_compiler_error("[" ~ this ~ "] Error: Required parameter '" ~ name ~ "' not defined. Define it either directly, or inside yaml-metadata. \n Description of parameter '" ~ name ~ "': \n" ~ documentation ) }}
    {% else %}
        {% set return_value = None %}
    {% endif %}

    {{ return(return_value) }}

{% endmacro %}
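
The precedence that `yaml_metadata_parser` applies can be hard to read off the nested branches. The following is a minimal Python sketch of the same decision order, not the macro itself. Assumptions: `resolve_parameter` is a hypothetical name, the `metadata` argument stands in for the already parsed result of `fromyaml(yaml_metadata)`, `is_something` is approximated by a plain truthiness check, and a `ValueError` stands in for the dbt compiler error. Logging is left out.

```python
from typing import Any, Optional


def is_something(value: Any) -> bool:
    # Assumption: approximates datavault4dbt.is_something with a truthiness check.
    return bool(value)


def resolve_parameter(
    name: str,
    metadata: Optional[dict] = None,
    parameter: Any = None,
    required: bool = False,
    documentation: Optional[str] = None,
) -> Any:
    """Hypothetical mirror of the branch order in yaml_metadata_parser."""
    if is_something(metadata):
        if name in metadata:
            # A key in the yaml-metadata wins over a separately passed parameter.
            return metadata.get(name)
        if is_something(parameter):
            # yaml-metadata given, but this key is missing: fall back to the parameter.
            return parameter
        if required:
            raise ValueError(
                f"Required parameter '{name}' not defined in yaml-metadata "
                f"or as a separate parameter. Description: {documentation}"
            )
        return None
    if is_something(parameter):
        return parameter
    if required:
        raise ValueError(
            f"Required parameter '{name}' not defined. Description: {documentation}"
        )
    return None


# Both sources set: the yaml-metadata value is used.
value_from_yaml = resolve_parameter("hashkey", {"hashkey": "hk_a"}, "hk_b")
# Key missing in yaml-metadata: the separately passed parameter is used.
value_from_parameter = resolve_parameter("hashkey", {"src_ldts": "ldts"}, "hk_b")
```

Under these assumptions, an optional parameter that is defined nowhere resolves to `None`, while a required one raises.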
183 changes: 101 additions & 82 deletions macros/staging/bigquery/stage.sql

Large diffs are not rendered by default.

193 changes: 108 additions & 85 deletions macros/staging/databricks/stage.sql

Large diffs are not rendered by default.

183 changes: 108 additions & 75 deletions macros/staging/exasol/stage.sql

Large diffs are not rendered by default.

177 changes: 99 additions & 78 deletions macros/staging/fabric/stage.sql

Large diffs are not rendered by default.

185 changes: 102 additions & 83 deletions macros/staging/oracle/stage.sql

Large diffs are not rendered by default.

188 changes: 104 additions & 84 deletions macros/staging/postgres/stage.sql

Large diffs are not rendered by default.

185 changes: 103 additions & 82 deletions macros/staging/redshift/stage.sql

Large diffs are not rendered by default.

182 changes: 101 additions & 81 deletions macros/staging/snowflake/stage.sql

Large diffs are not rendered by default.

116 changes: 76 additions & 40 deletions macros/staging/stage.sql

Large diffs are not rendered by default.

@@ -54,24 +54,27 @@
{# Do nothing. No source column required. #}
{%- elif value is mapping and value.is_hashdiff -%}
{%- do extracted_input_columns.append(value['columns']) -%}
{%- elif value is mapping and 'this_column_name' in value.keys() -%}
{%- if datavault4dbt.is_list(value['this_column_name'])-%}
{%- for column in value['this_column_name'] -%}
{%- do extracted_input_columns.append(column) -%}
{%- endfor -%}
{%- else -%}
{%- do extracted_input_columns.append(value['this_column_name']) -%}
{%- endif -%}
{%- else -%}
{%- do extracted_input_columns.append(value) -%}
{%- endif -%}
{%- endfor -%}

{%- do return(extracted_input_columns) -%}

{%- elif datavault4dbt.is_list(columns_dict) -%}
{% for prejoin in columns_dict %}
{%- if datavault4dbt.is_list(prejoin['this_column_name'])-%}
{%- for column in prejoin['this_column_name'] -%}
{%- do extracted_input_columns.append(column) -%}
{%- endfor -%}
{%- else -%}
{%- do extracted_input_columns.append(prejoin['this_column_name']) -%}
{%- endif -%}
{% endfor %}
{%- else -%}
{%- do return([]) -%}
{%- endif -%}

{%- do return(extracted_input_columns) -%}

{%- endmacro -%}


@@ -123,4 +126,89 @@
{%- endif %}
{%- endfor -%}

{%- endmacro -%}


{%- macro process_prejoined_columns(prejoined_columns=none) -%}
    {# Check whether the old dictionary syntax is used for prejoined columns.
       If so, convert it to the new list syntax. #}

    {% if datavault4dbt.is_list(prejoined_columns) %}
        {# New syntax: nothing to do. #}
        {% do return(prejoined_columns) %}
    {% else %}
        {% set output = [] %}

        {% for key, value in prejoined_columns.items() %}
            {% set ref_model = value.get('ref_model') %}
            {% set src_name = value.get('src_name') %}
            {% set src_table = value.get('src_table') %}
            {# Default the join operator to AND if none is given. #}
            {%- if 'operator' not in value.keys() -%}
                {%- do value.update({'operator': 'AND'}) -%}
                {%- set operator = 'AND' -%}
            {%- else -%}
                {%- set operator = value.get('operator') -%}
            {%- endif -%}

            {# Look for an already collected prejoin with the same target and join condition. #}
            {% set match_criteria = (
                ref_model and output | selectattr('ref_model', 'equalto', ref_model) or
                src_name and output | selectattr('src_name', 'equalto', src_name) | selectattr('src_table', 'equalto', src_table)
            ) | selectattr('this_column_name', 'equalto', value.this_column_name)
              | selectattr('ref_column_name', 'equalto', value.ref_column_name)
              | selectattr('operator', 'equalto', value.operator)
              | list | first %}

            {% if match_criteria %}
                {# Same prejoin: only add the extracted column and its alias. #}
                {% do match_criteria['extract_columns'].append(value.bk) %}
                {% do match_criteria['aliases'].append(key) %}
            {% else %}
                {% set new_item = {
                    'extract_columns': [value.bk],
                    'aliases': [key],
                    'this_column_name': value.this_column_name,
                    'ref_column_name': value.ref_column_name,
                    'operator': operator
                } %}

                {% if ref_model %}
                    {% do new_item.update({'ref_model': ref_model}) %}
                {% elif src_name and src_table %}
                    {% do new_item.update({'src_name': src_name, 'src_table': src_table}) %}
                {% endif %}

                {% do output.append(new_item) %}
            {% endif %}
        {% endfor %}
    {% endif %}

    {%- do return(output) -%}

{%- endmacro -%}


{%- macro extract_prejoin_column_names(prejoined_columns=none) -%}

    {%- set extracted_column_names = [] -%}

    {% if not datavault4dbt.is_something(prejoined_columns) %}
        {%- do return(extracted_column_names) -%}
    {% endif %}

    {% for prejoin in prejoined_columns %}
        {% if datavault4dbt.is_list(prejoin['aliases']) %}
            {% for alias in prejoin['aliases'] %}
                {%- do extracted_column_names.append(alias) -%}
            {% endfor %}
        {% elif datavault4dbt.is_something(prejoin['aliases']) %}
            {%- do extracted_column_names.append(prejoin['aliases']) -%}
        {% elif datavault4dbt.is_list(prejoin['extract_columns']) %}
            {% for column in prejoin['extract_columns'] %}
                {%- do extracted_column_names.append(column) -%}
            {% endfor %}
        {% else %}
            {%- do extracted_column_names.append(prejoin['extract_columns']) -%}
        {% endif %}
    {%- endfor -%}

    {%- do return(extracted_column_names) -%}

{%- endmacro -%}
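
To illustrate what `process_prejoined_columns` does with the old dictionary syntax, here is a minimal Python sketch of the same grouping idea. It is a simplified mirror, not the macro: the function name `convert_prejoins` and all model, source, and column names in the sample are hypothetical, and the sketch assumes every old-style entry has either `ref_model` or both `src_name` and `src_table`.

```python
from typing import Any, Union


def convert_prejoins(prejoined_columns: Union[list, dict]) -> list:
    """Hypothetical mirror: old dict-of-dicts syntax to new list-of-dicts syntax."""
    if isinstance(prejoined_columns, list):
        # New syntax is passed through unchanged.
        return prejoined_columns

    output: list = []
    for alias, spec in prejoined_columns.items():
        operator = spec.get("operator", "AND")  # default join operator
        if spec.get("ref_model"):
            target: dict[str, Any] = {"ref_model": spec["ref_model"]}
        else:
            target = {"src_name": spec.get("src_name"), "src_table": spec.get("src_table")}

        # Find an already collected prejoin with the same target and join condition.
        match = next(
            (
                item
                for item in output
                if all(item.get(k) == v for k, v in target.items())
                and item["this_column_name"] == spec["this_column_name"]
                and item["ref_column_name"] == spec["ref_column_name"]
                and item["operator"] == operator
            ),
            None,
        )
        if match is not None:
            # Same join: only add the extracted column and its alias.
            match["extract_columns"].append(spec["bk"])
            match["aliases"].append(alias)
        else:
            output.append(
                {
                    "extract_columns": [spec["bk"]],
                    "aliases": [alias],
                    "this_column_name": spec["this_column_name"],
                    "ref_column_name": spec["ref_column_name"],
                    "operator": operator,
                    **target,
                }
            )
    return output


# Hypothetical old-syntax input: two columns from the same model, one from a source.
old_syntax = {
    "contract_number": {"ref_model": "contract", "bk": "number",
                        "this_column_name": "contract_id", "ref_column_name": "id"},
    "contract_type": {"ref_model": "contract", "bk": "type",
                      "this_column_name": "contract_id", "ref_column_name": "id"},
    "account_key": {"src_name": "crm", "src_table": "account", "bk": "key",
                    "this_column_name": "account_id", "ref_column_name": "id"},
}
new_syntax = convert_prejoins(old_syntax)
```

In this sample the two `contract` entries share one target and one join condition, so they collapse into a single item with two `extract_columns` and two `aliases`, which is what lets the stage perform fewer joins.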
23 changes: 23 additions & 0 deletions macros/staging/staging.yml
@@ -0,0 +1,23 @@
version: 2

macros:
  - name: process_prejoined_columns
    description: >
      A macro to process prejoined columns. If a list of dictionaries (new syntax) is provided, it does nothing and returns the list.
      If a dictionary of dictionaries is provided (old syntax), it is transformed to the new syntax.
      When multiple columns are to be extracted from the same prejoin target with the same conditions (columns and operator), they are combined into one item.
    arguments:
      - name: prejoined_columns
        type: list or dictionary
        description: The value of the prejoined_columns as defined in the yaml_metadata of the stage-model.

  - name: extract_prejoin_column_names
    description: >
      A macro to extract the names of the prejoined columns of each staging-model.
      Takes a list of prejoins and adds the aliases of the prejoins to the return-list.
      If no aliases are present, it returns the names of the extracted columns.
      Returns an empty list if the passed parameter is empty.
    arguments:
      - name: prejoined_columns
        type: list
        description: The prejoined_columns as processed by the process_prejoined_columns macro.
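
The alias fallback described for `extract_prejoin_column_names` can be sketched in Python as follows. This is an illustrative mirror of the branch order in the macro, with the hypothetical function name `prejoin_column_names` and made-up sample values; missing keys are read with `dict.get` here, which is an assumption about how undefined values behave.

```python
from typing import Optional


def prejoin_column_names(prejoined_columns: Optional[list]) -> list:
    """Hypothetical mirror: collect the output column names of all prejoins."""
    names: list = []
    if not prejoined_columns:
        return names

    for prejoin in prejoined_columns:
        aliases = prejoin.get("aliases")
        columns = prejoin.get("extract_columns")
        if isinstance(aliases, list):
            names.extend(aliases)       # a list of aliases takes precedence
        elif aliases:
            names.append(aliases)       # a single alias given as a string
        elif isinstance(columns, list):
            names.extend(columns)       # no aliases: use the extracted column names
        else:
            names.append(columns)
    return names


# Hypothetical prejoins in the new list syntax.
sample = [
    {"extract_columns": ["number", "type"], "aliases": ["contract_number", "contract_type"]},
    {"extract_columns": ["key"]},
    {"extract_columns": "name", "aliases": "account_name"},
]
column_names = prejoin_column_names(sample)
```

Aliases win whenever they are present, so the extracted column names only appear in the result for prejoins that define no alias.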