Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simple Iceberg support #815

Open
wants to merge 3 commits into
base: 1.9.latest
Choose a base branch
from
Open

Simple Iceberg support #815

wants to merge 3 commits into from

Conversation

benc-db
Copy link
Collaborator

@benc-db benc-db commented Oct 1, 2024

Description

Adds table_format: iceberg, to match support of other dbt-adapters. This config sets a table or incremental materialization as iceberg compatible via UniForm. Note: once a table is made iceberg compatible, removing the table_format will not remove the tblproperties, as we currently find doing tblproperty deletes on behalf of the user to be too fragile for most use-cases.

Checklist

  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • I have updated the CHANGELOG.md and added information about my change to the "dbt-databricks next" section.

@benc-db benc-db changed the base branch from main to 1.9.latest October 1, 2024 19:08
@@ -106,6 +108,7 @@
@dataclass
class DatabricksConfig(AdapterConfig):
file_format: str = "delta"
table_format: TableFormat = TableFormat.DEFAULT
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In future, we may fold file_format into table_format; adding table_format for consistency with other dbt adapters.

@@ -180,6 +183,20 @@ def __init__(self, config: Any, mp_context: SpawnContext) -> None:
def _behavior_flags(self) -> List[BehaviorFlag]:
return [USE_INFO_SCHEMA_FOR_COLUMNS]

@available.parse(lambda *a, **k: 0)
def check_iceberg(self, table_format: TableFormat, file_format: str, materialized: str) -> bool:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Iceberg V2 support is only available in 14.3 and later. Since iceberg support in Databricks lives on top of delta, we will accept delta as file_format, but any other file format is incompatible. Restricting to incremental and table, as the documentation I can find only shows Delta tables in the UniForm iceberg support documentation.

@@ -74,11 +74,19 @@ def from_relation_results(cls, results: RelationResults) -> TblPropertiesConfig:

@classmethod
def from_relation_config(cls, relation_config: RelationConfig) -> TblPropertiesConfig:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file handles iceberg settings as part of tblproperties for incremental tables. It could maybe be a separate component, but since iceberg support is accomplished via tblproperties, including here.

@@ -1,4 +1,5 @@
{% macro get_create_sql_tblproperties(tblproperties) %}
{%- set _ = adapter.check_iceberg(config.get('table_format'), config.get('file_format'), model.config.materialized) -%}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we check and fail the compilation of incremental tblproperties if we are misconfigured. Otherwise, since this is used only for incremental, the tblproperties will already have the iceberg properties included.

{%- if tblproperties is not none %}
{%- set tblproperties = config.get('tblproperties', {}) -%}
{%- set is_iceberg = adapter.check_iceberg(config.get('table_format'), config.get('file_format'), model.config.materialized) -%}
{%- if is_iceberg -%}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is used by table/view (but not incremental) materialization. We will fail compilation if the prereqs are violated, but if they are not, and iceberg is set, add the iceberg properties to the dictionary of tblproperties.

@@ -15,6 +19,10 @@

{% macro apply_tblproperties(relation, tblproperties) -%}
{% if tblproperties %}
{%- set is_iceberg = adapter.check_iceberg(config.get('table_format'), config.get('file_format'), model.config.materialized) -%}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, but this statement is used to alter tables constructed in other ways, such as via python models.

"table_built_on_iceberg_table.sql": fixtures.ref_iceberg,
}

def test_iceberg_refs(self, project):
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ensure that iceberg can ref other models and other models can ref an iceberg table.

def models(self):
return {"first_model.sql": fixtures.basic_view}

def test_iceberg_swaps(self, project):
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test that we can replace a view with iceberg table, and swap to incrementing that table with incremental. If tblproperties weren't so hard to manage, we would also remove iceberg when the user removes the table_format config; however, here we do not. Will need to address when I write the docs.

sql = self.render_create_table_as(template_bundle)

expected = (
f"create or replace table {template_bundle.relation} "
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test that we smoothly incorporate iceberg properties with other tblproperties.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant