perf: reuse columns from process_schema_changes in incremental materialization by moomindani · Pull Request #1412 · databricks/dbt-databricks

moomindani · 2026-04-21T08:26:38Z

Summary

Closes #1411.

The incremental materialization currently discards the return value of
process_schema_changes, so each downstream strategy macro (merge,
append, delete+insert) issues a second DESCRIBE TABLE EXTENDED
on the target relation even though check_for_schema_changes has just
DESCRIBEd it. This PR reuses those columns, eliminating one metadata
round-trip per incremental model, per run.
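
The round-trip being removed can be illustrated with a small Python model of the control flow. This is only a sketch under stated assumptions, not the adapter's Jinja: `CountingAdapter`, `merge_sql`, `run_before`, and `run_after` are hypothetical names, the column list is made up, and `process_schema_changes` is assumed to hand back the target's columns as the description above implies.

```python
class CountingAdapter:
    """Hypothetical stand-in for the adapter; counts target DESCRIBEs."""

    def __init__(self):
        self.target_describes = 0

    def get_columns_in_relation(self, relation):
        # Each call models one DESCRIBE TABLE EXTENDED round-trip.
        self.target_describes += 1
        return ["id", "updated_at"]  # made-up columns for illustration


def process_schema_changes(adapter, on_schema_change, target):
    # Assumption taken from the PR text: {} for 'ignore', otherwise the
    # target is DESCRIBEd and its columns are returned.
    if on_schema_change == "ignore":
        return {}
    return adapter.get_columns_in_relation(target)


def merge_sql(adapter, target, dest_columns=None):
    # Strategy side: only DESCRIBE when no columns were supplied.
    if dest_columns is None:
        dest_columns = adapter.get_columns_in_relation(target)
    return f"merge into {target} ({', '.join(dest_columns)})"


def run_before(adapter, target):
    process_schema_changes(adapter, "fail", target)  # return value discarded
    return merge_sql(adapter, target)  # strategy DESCRIBEs again


def run_after(adapter, target):
    columns = process_schema_changes(adapter, "fail", target)
    return merge_sql(adapter, target, dest_columns=columns)  # reuse
```

Running `run_before` and `run_after` against fresh `CountingAdapter` instances leaves `target_describes` at 2 and 1 respectively, which is the difference this PR targets.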

Changes

dbt/include/databricks/macros/materializations/incremental/incremental.sql
- Capture columns from process_schema_changes in both V1 and V2 paths.
- When on_schema_change == 'ignore' (returns {}), fall back to a
  single adapter.get_columns_in_relation(existing_relation).
- Thread the result through strategy_arg_dict['dest_columns']
  (previously hard-coded to none).
- Extend get_build_sql with a dest_columns=none parameter so the
  V2 path can pass through.
dbt/include/databricks/macros/materializations/incremental/strategies.sql
- databricks__get_merge_sql: only DESCRIBE the target when
  dest_columns is none.
- get_delete_insert_sql: honor arg_dict['dest_columns'] when set.
- get_insert_into_sql: accept a dest_columns=none parameter and
  honor it; databricks__get_incremental_append_sql now passes
  arg_dict['dest_columns'] through.
CHANGELOG.md: new entry under ## dbt-databricks next → Under the Hood.
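
The two halves of the change (resolving the columns once in the materialization, then honoring them in each strategy) might be sketched in Python as follows. The function names are hypothetical and the argument dictionary is reduced to two keys; the real macros are Jinja and carry more arguments. `describe_target` is an assumed zero-argument callable standing in for `adapter.get_columns_in_relation(existing_relation)`.

```python
def resolve_dest_columns(schema_change_columns, describe_target):
    """Pick the columns to hand to the strategy macro.

    schema_change_columns models the return value of
    process_schema_changes, which is {} when on_schema_change == 'ignore'.
    """
    if schema_change_columns:
        return schema_change_columns
    # Fallback: a single DESCRIBE on the existing relation.
    return describe_target()


def build_strategy_args(target_relation, schema_change_columns, describe_target):
    # Simplified stand-in for strategy_arg_dict; 'dest_columns' was
    # previously hard-coded to none.
    return {
        "target_relation": target_relation,
        "dest_columns": resolve_dest_columns(schema_change_columns, describe_target),
    }


def columns_for_strategy(arg_dict, describe_target):
    # Strategy side: DESCRIBE only when the caller supplied nothing, so
    # callers that never pass dest_columns keep their old behavior.
    dest_columns = arg_dict.get("dest_columns")
    if dest_columns is None:
        return describe_target()
    return dest_columns
```

Keeping the `None` default on the strategy side is what lets existing callers keep working unchanged while the materialization opts in to the reuse.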

Behavior

- When on_schema_change is 'fail', 'sync_all_columns', or
  'append_new_columns': process_schema_changes has already DESCRIBEd
  both relations, so we reuse its result, saving one DESCRIBE.
- When on_schema_change == 'ignore': we issue exactly one DESCRIBE on
  the existing relation, matching today's total count for that path.
- Existing public macro signatures are preserved. get_build_sql gains
  an optional keyword argument that defaults to none.
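
The per-mode accounting above can be written out as a small Python function. This is a counting model only, with a hypothetical function name; it encodes the behavior stated in this section and does not inspect the macros themselves.

```python
SCHEMA_CHANGE_MODES = ("ignore", "fail", "append_new_columns", "sync_all_columns")


def target_describes(on_schema_change, reuse_columns):
    """Count target DESCRIBEs for one incremental model run (model only)."""
    # 'ignore' returns {} from process_schema_changes and supplies no
    # columns; the other modes DESCRIBE the target during the check.
    from_check = 0 if on_schema_change == "ignore" else 1
    if reuse_columns and from_check:
        from_strategy = 0  # columns come from process_schema_changes
    else:
        # Before the PR: the strategy macro DESCRIBEs.
        # After the PR with 'ignore': the single fallback DESCRIBE.
        from_strategy = 1
    return from_check + from_strategy


for mode in SCHEMA_CHANGE_MODES:
    before = target_describes(mode, reuse_columns=False)
    after = target_describes(mode, reuse_columns=True)
    print(f"{mode}: {before} -> {after}")
```

Under this model, 'ignore' stays at one DESCRIBE while the other three modes drop from two to one.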

Test plan

Manually verified on a live Databricks SQL Warehouse with a project of
9 incremental stg models (on_schema_change: 'fail', 7 merge + 2
append strategies).

Target DESCRIBE TABLE EXTENDED … AS JSON count per incremental model:

| Path | Before | After |
| --- | --- | --- |
| V1 (`use_materialization_v2: false`) | 2 | 1 |
| V2 (`use_materialization_v2: true`) | 2 | 1 |

Wall-clock impact on a full dbt run is within measurement noise at
this scale (9 small models, 16 threads); the saved round-trips get
absorbed by parallelism. The win here is fewer metadata round-trips
(lower warehouse load, less API traffic), not a dramatic wall-clock
speedup.
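
For scale, the totals implied by the table for this test project work out as below. This is simple arithmetic on the figures reported above (9 models, 2 target DESCRIBEs per model before, 1 after); the variable names are illustrative.

```python
MODELS = 9            # incremental stg models in the test project
BEFORE_PER_MODEL = 2  # target DESCRIBEs per model before this PR
AFTER_PER_MODEL = 1   # target DESCRIBEs per model after this PR

before_total = MODELS * BEFORE_PER_MODEL
after_total = MODELS * AFTER_PER_MODEL
saved = before_total - after_total

print(f"target DESCRIBEs per run: {before_total} -> {after_total} (saved {saved})")
```

That is 9 fewer metadata round-trips per run for this project, which is consistent with a change too small to show up in wall-clock time at 16 threads.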

Ruff lint clean on changed files.

…alization

Incremental materialization previously discarded the return value of `process_schema_changes`, causing each strategy macro (`merge`, `append`, `delete+insert`) to issue a second `DESCRIBE TABLE EXTENDED` on the target relation even though `check_for_schema_changes` had just DESCRIBEd it.

This change:
- captures the columns returned by `process_schema_changes` in both V1 and V2 paths
- falls back to a single `adapter.get_columns_in_relation(existing_relation)` when `on_schema_change == 'ignore'`
- threads the result through `strategy_arg_dict['dest_columns']`
- teaches `databricks__get_merge_sql`, `get_delete_insert_sql`, and `get_insert_into_sql` to honor a pre-supplied `dest_columns` and skip their own `DESCRIBE` when provided

Net effect: one fewer `DESCRIBE TABLE EXTENDED … AS JSON` round-trip per incremental model, per run.

Verified on a project with 9 incremental stg models (V1 path, `on_schema_change: 'fail'`): target DESCRIBE count drops from 2 to 1 per model across merge, append, and delete+insert strategies.

Resolves databricks#1411

Co-authored-by: Isaac

moomindani requested review from benc-db, jprakash-db, sd-db and tejassp-db as code owners April 21, 2026 08:26

moomindani wants to merge 1 commit into databricks:main from moomindani:feat/reuse-schema-change-columns
