[AURON #2176] Implement native support for lead window function #2188

Open
weimingdiit wants to merge 3 commits into apache:master from weimingdiit:feat/native-window-lead
@weimingdiit

Which issue does this PR close?

Closes #2176

Rationale for this change

Auron’s native window support previously covered rank-like functions and a subset of aggregate window functions, but did not support offset-based window functions such as lead(...).

This PR extends native window coverage with a conservative first step:

  • support lead(...)
  • preserve Spark-compatible behavior for input, offset, and default
  • keep unsupported semantics out of the native path rather than approximating them incorrectly

What changes are included in this PR?

This PR:

  • adds Lead handling in NativeWindowBase
  • extends the protobuf/planner window function enum with LEAD
  • adds native planner support to decode LEAD into the native window plan
  • introduces a native LeadProcessor in datafusion-ext-plans
  • evaluates lead using Spark-compatible offset/default/null behavior
  • adds a full-partition processing path for lead so that lookahead works correctly across input batches
  • adds Rust regression coverage for cross-batch lead
  • adds Scala regression tests for:
    • native lead(...) execution
    • Spark fallback for lead(...) IGNORE NULLS
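
The full-partition path listed above matters because a row near the end of one input batch may need a value that only arrives in the next batch. The sketch below illustrates the problem with plain `Vec<Option<i64>>` values standing in for Arrow batches. The function names are hypothetical and are not taken from this PR, and all batches are assumed to belong to a single window partition in window order.

```rust
/// Lead over one contiguous slice of rows; `None` models SQL NULL.
/// Assumption: `offset` is non-negative.
fn lead(rows: &[Option<i64>], offset: usize, default: Option<i64>) -> Vec<Option<i64>> {
    (0..rows.len())
        .map(|row| match row.checked_add(offset).and_then(|t| rows.get(t)) {
            Some(value) => *value, // target row exists (its value may be NULL)
            None => default,       // target row is past the end of the slice
        })
        .collect()
}

/// Naive approach: evaluate each batch in isolation.
/// Rows near the end of a batch cannot see into the next batch.
fn lead_per_batch(
    batches: &[Vec<Option<i64>>],
    offset: usize,
    default: Option<i64>,
) -> Vec<Option<i64>> {
    batches
        .iter()
        .flat_map(|batch| lead(batch, offset, default))
        .collect()
}

/// Full-partition approach: concatenate every batch first, then evaluate once.
fn lead_full_partition(
    batches: &[Vec<Option<i64>>],
    offset: usize,
    default: Option<i64>,
) -> Vec<Option<i64>> {
    let partition: Vec<Option<i64>> = batches.iter().flatten().copied().collect();
    lead(&partition, offset, default)
}
```

With batches `[1, 2]` and `[3]`, an offset of 1, and a null default, the per-batch version yields `[2, NULL, NULL]`, while the full-partition version yields `[2, 3, NULL]`. Only the second result matches the expected lead output for the partition `[1, 2, 3]`.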

The native implementation supports Spark semantics for:

  • lead(input)

    • default offset is 1
    • default value is null
  • lead(input, offset, default)

    • returns the value of input at the offset-th row after the current row in the same window partition
    • if the target row exists and input there is null, returns null
    • if the target row does not exist, returns default
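
The rules above can be sketched as a small standalone function. This is an illustration of the described semantics only, not the `LeadProcessor` code from this PR: the name `lead`, the generic signature, and the use of `Option` to model SQL NULL are assumptions, and negative offsets are not modeled.

```rust
/// Evaluate `lead(input, offset, default)` over one window partition.
///
/// `partition` holds the `input` column in window order; `None` models SQL NULL.
/// Assumption: `offset` is non-negative.
fn lead<T: Clone>(partition: &[Option<T>], offset: usize, default: Option<T>) -> Vec<Option<T>> {
    (0..partition.len())
        .map(|row| match row.checked_add(offset).and_then(|t| partition.get(t)) {
            // Target row exists: return its value, even when that value is NULL.
            Some(value) => value.clone(),
            // Target row is past the end of the partition: return the default.
            None => default.clone(),
        })
        .collect()
}
```

For the partition `[10, NULL, 30]` with offset 1 and default -1, this returns `[NULL, 30, -1]`: the first row sees the NULL in the second row, and only the last row, whose target does not exist, receives the default.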

Supported scope in this PR:

  • standard RESPECT NULLS behavior

Not supported natively in this PR:

  • IGNORE NULLS

Unsupported IGNORE NULLS queries continue to fall back to Spark to preserve correctness.
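
To illustrate why approximating IGNORE NULLS with the RESPECT NULLS logic would be incorrect, the sketch below contrasts the two. It assumes IGNORE NULLS means that NULL inputs are skipped when counting `offset` rows ahead, which is the usual reading of that null treatment. Both function names are hypothetical and the code is not part of this PR.

```rust
/// RESPECT NULLS: the target is the row exactly `offset` positions ahead.
fn lead_respect_nulls(rows: &[Option<i64>], offset: usize, default: Option<i64>) -> Vec<Option<i64>> {
    (0..rows.len())
        .map(|row| match row.checked_add(offset).and_then(|t| rows.get(t)) {
            Some(value) => *value, // may be NULL
            None => default,
        })
        .collect()
}

/// IGNORE NULLS (assumed semantics): the target is the `offset`-th non-NULL
/// value after the current row. Assumption: `offset >= 1`.
fn lead_ignore_nulls(rows: &[Option<i64>], offset: usize, default: Option<i64>) -> Vec<Option<i64>> {
    assert!(offset >= 1, "this sketch only models offsets of at least 1");
    (0..rows.len())
        .map(|row| {
            rows[row + 1..]
                .iter()
                .filter_map(|value| *value) // keep only non-NULL values after the current row
                .nth(offset - 1)
                .or(default) // not enough non-NULL values: fall back to the default
        })
        .collect()
}
```

For the partition `[1, NULL, 3]` with offset 1 and a null default, the RESPECT NULLS version returns `[NULL, 3, NULL]`, while the IGNORE NULLS version returns `[3, 3, NULL]`. The results differ on the first row, so routing IGNORE NULLS queries through the RESPECT NULLS path would silently change query results.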

Are there any user-facing changes?

Yes.
Queries using lead(...) can now remain on Auron’s native window execution path when they use supported semantics.
Queries using unsupported lead(...) IGNORE NULLS behavior will continue to fall back to Spark.

How was this patch tested?

CI.

@weimingdiit weimingdiit changed the title from "[AURON #2176]Implement native support for lead window function" to "[AURON #2176] Implement native support for lead window function" on April 9, 2026
@weimingdiit weimingdiit force-pushed the feat/native-window-lead branch from efe00fc to 5fd852f on April 9, 2026 12:09
Signed-off-by: weimingdiit <weimingdiit@gmail.com>
@weimingdiit weimingdiit force-pushed the feat/native-window-lead branch from 5fd852f to 7f1fef3 on April 9, 2026 13:20
Signed-off-by: weimingdiit <weimingdiit@gmail.com>
@weimingdiit weimingdiit marked this pull request as ready for review April 10, 2026 01:47
@slfan1989 slfan1989 requested a review from Copilot April 10, 2026 23:47

Copilot AI left a comment


Pull request overview

Adds native execution support for Spark’s lead(...) window function in Auron (RESPECT NULLS only), extending the Spark->protobuf planner path and implementing the corresponding native window processor in the DataFusion-based engine.

Changes:

  • Extend Spark-side native window plan encoding to recognize Lead and reject IGNORE NULLS for native execution.
  • Add LEAD to the planner protobuf + decode it in the native planner into a new WindowFunction::Lead.
  • Implement a native LeadProcessor and add regression tests (Scala + Rust), including cross-input-batch lookahead behavior.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| spark-extension/src/main/scala/org/apache/spark/sql/execution/auron/plan/NativeWindowBase.scala | Adds Spark Lead detection, encodes it into the native protobuf plan, and blocks IGNORE NULLS from native path. |
| spark-extension-shims-spark/src/test/scala/org/apache/auron/AuronWindowSuite.scala | Adds Scala regression tests for native lead and Spark fallback on IGNORE NULLS. |
| native-engine/datafusion-ext-plans/src/window/window_context.rs | Adds a flag to indicate whether any window expr requires full-partition processing. |
| native-engine/datafusion-ext-plans/src/window/processors/mod.rs | Registers the new lead processor module. |
| native-engine/datafusion-ext-plans/src/window/processors/lead_processor.rs | Implements the native lead evaluation (offset/default/partition-boundary behavior). |
| native-engine/datafusion-ext-plans/src/window/mod.rs | Adds WindowFunction::Lead, wires processor creation, and flags lead as requiring full-partition mode. |
| native-engine/datafusion-ext-plans/src/window_exec.rs | Introduces a full-partition execution path (concat all batches then process) and adds a cross-batch regression test. |
| native-engine/auron-planner/src/planner.rs | Decodes protobuf WindowFunction::Lead into the native window plan. |
| native-engine/auron-planner/proto/auron.proto | Extends the protobuf window function enum with LEAD. |

Signed-off-by: weimingdiit <weimingdiit@gmail.com>