[AURON #2176] Implement native support for lead window function #2188
Open
weimingdiit wants to merge 3 commits into apache:master
Signed-off-by: weimingdiit <weimingdiit@gmail.com>
Pull request overview
Adds native execution support for Spark’s lead(...) window function in Auron (RESPECT NULLS only), extending the Spark->protobuf planner path and implementing the corresponding native window processor in the DataFusion-based engine.
Changes:
- Extend Spark-side native window plan encoding to recognize `Lead` and reject `IGNORE NULLS` for native execution.
- Add `LEAD` to the planner protobuf and decode it in the native planner into a new `WindowFunction::Lead`.
- Implement a native `LeadProcessor` and add regression tests (Scala + Rust), including cross-input-batch lookahead behavior.
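
For readers following the planner path, the decode step and the two decisions around it (native eligibility and full-partition mode) could look roughly like the sketch below. Both enums are hypothetical stand-ins rather than the generated protobuf type or Auron's real `WindowFunction`; the `RowNumber`, `Rank`, and `DenseRank` variants, the function names, and the assumption that only `Lead` needs full-partition mode are illustrative. In the PR the `IGNORE NULLS` check lives on the Spark side (Scala); it is shown in Rust here only to keep one language.

```rust
/// Hypothetical stand-in for the protobuf window-function enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ProtoWindowFunction {
    RowNumber,
    Rank,
    DenseRank,
    Lead,
}

/// Hypothetical stand-in for the native plan's window-function enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum WindowFunction {
    RowNumber,
    Rank,
    DenseRank,
    Lead,
}

/// Maps the wire-level enum onto the native plan enum, one variant each.
fn decode(func: ProtoWindowFunction) -> WindowFunction {
    match func {
        ProtoWindowFunction::RowNumber => WindowFunction::RowNumber,
        ProtoWindowFunction::Rank => WindowFunction::Rank,
        ProtoWindowFunction::DenseRank => WindowFunction::DenseRank,
        ProtoWindowFunction::Lead => WindowFunction::Lead,
    }
}

/// Assumption: only `Lead` looks ahead, so only it needs the whole partition.
fn requires_full_partition(func: WindowFunction) -> bool {
    matches!(func, WindowFunction::Lead)
}

/// `lead(...) IGNORE NULLS` is not handled natively and must fall back to Spark.
fn can_run_natively(func: WindowFunction, ignore_nulls: bool) -> bool {
    !(func == WindowFunction::Lead && ignore_nulls)
}
```

With these definitions, `can_run_natively(WindowFunction::Lead, true)` is `false`, which models the fallback to Spark.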
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| spark-extension/src/main/scala/org/apache/spark/sql/execution/auron/plan/NativeWindowBase.scala | Adds Spark Lead detection, encodes it into the native protobuf plan, and blocks IGNORE NULLS from native path. |
| spark-extension-shims-spark/src/test/scala/org/apache/auron/AuronWindowSuite.scala | Adds Scala regression tests for native lead and Spark fallback on IGNORE NULLS. |
| native-engine/datafusion-ext-plans/src/window/window_context.rs | Adds a flag to indicate whether any window expr requires full-partition processing. |
| native-engine/datafusion-ext-plans/src/window/processors/mod.rs | Registers the new lead processor module. |
| native-engine/datafusion-ext-plans/src/window/processors/lead_processor.rs | Implements the native lead evaluation (offset/default/partition-boundary behavior). |
| native-engine/datafusion-ext-plans/src/window/mod.rs | Adds WindowFunction::Lead, wires processor creation, and flags lead as requiring full-partition mode. |
| native-engine/datafusion-ext-plans/src/window_exec.rs | Introduces a full-partition execution path (concat all batches then process) and adds a cross-batch regression test. |
| native-engine/auron-planner/src/planner.rs | Decodes protobuf WindowFunction::Lead into the native window plan. |
| native-engine/auron-planner/proto/auron.proto | Extends the protobuf window function enum with LEAD. |
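
The full-partition path noted for `window_exec.rs` matters because a row's lead value may sit in a later input batch. The sketch below illustrates the problem with plain vectors instead of Arrow record batches. It assumes all batches belong to one window partition, fixes the offset at 1 with a `null` default, and uses `None` for SQL `NULL`. All function names are made up for the example.

```rust
/// `lead` with offset 1 and a NULL default; `None` models SQL NULL.
fn lead1(rows: &[Option<i64>]) -> Vec<Option<i64>> {
    (0..rows.len())
        // A missing next row and a NULL next row both yield `None` here.
        .map(|i| rows.get(i + 1).cloned().flatten())
        .collect()
}

/// Evaluates each batch in isolation: the last row of a batch cannot see
/// the first row of the next batch, so it wrongly gets the default.
fn lead_per_batch(batches: &[Vec<Option<i64>>]) -> Vec<Option<i64>> {
    batches.iter().flat_map(|batch| lead1(batch)).collect()
}

/// Concatenates every batch of the partition first, then evaluates once.
fn lead_full_partition(batches: &[Vec<Option<i64>>]) -> Vec<Option<i64>> {
    let all_rows: Vec<Option<i64>> = batches.iter().flatten().cloned().collect();
    lead1(&all_rows)
}
```

For the batches `[1, 2]` and `[3]`, per-batch evaluation gives `[2, NULL, NULL]`, while full-partition evaluation gives `[2, 3, NULL]`. The trade-off is that the whole partition has to be held in memory before any output is produced.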
Which issue does this PR close?
Closes #2176
Rationale for this change
Auron’s native window support previously covered rank-like functions and a subset of aggregate window functions, but did not support offset-based window functions such as `lead(...)`.

This PR extends native window coverage with a conservative first step:
- native support for `lead(...)`
- support for the `input`, `offset`, and `default` arguments

What changes are included in this PR?
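This PR:
- adds `Lead` handling in `NativeWindowBase`
- extends the planner protobuf with `LEAD`
- decodes `LEAD` into the native window plan
- adds a `LeadProcessor` in `datafusion-ext-plans`
- evaluates `lead` using Spark-compatible offset/default/null behavior
- uses full-partition processing for `lead` so that lookahead works correctly across input batches
- adds regression tests for `lead`:
  - native `lead(...)` execution
  - Spark fallback for `lead(...) IGNORE NULLS`

The native implementation supports Spark semantics for: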
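- `lead(input)`
  - default offset is `1`
  - default value is `null`
- `lead(input, offset, default)`
  - returns the value of `input` at the `offset`th row after the current row in the same window partition
  - if the value of `input` there is `null`, returns `null`
  - if no such row exists, returns `default`

Supported scope in this PR: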
- `RESPECT NULLS` behavior

Not supported natively in this PR:
- `IGNORE NULLS`

Unsupported `IGNORE NULLS` queries continue to fall back to Spark to preserve correctness.

Are there any user-facing changes?
Yes.
Queries using `lead(...)` can now remain on Auron’s native window execution path when they use supported semantics.

Queries using unsupported `lead(...) IGNORE NULLS` behavior will continue to fall back to Spark.

How was this patch tested?
CI.
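
For reference, the Spark `lead` semantics listed in this description (offset, default, and `RESPECT NULLS` handling within one partition) can be written out in a few lines. This is a minimal sketch, not the `LeadProcessor` from the PR. It works on a plain slice instead of Arrow arrays, assumes a non-negative offset, and uses `None` for SQL `NULL`. The function name and signature are illustrative only.

```rust
/// Spark-style `lead` over the rows of a single window partition
/// (RESPECT NULLS only). `None` models SQL NULL.
fn lead<T: Clone>(
    partition: &[Option<T>],
    offset: usize,
    default: Option<T>,
) -> Vec<Option<T>> {
    (0..partition.len())
        .map(|i| match i.checked_add(offset).and_then(|j| partition.get(j)) {
            // A row exists `offset` rows ahead: return its value, NULL included.
            Some(value) => value.clone(),
            // No such row inside the partition: return the default.
            None => default.clone(),
        })
        .collect()
}
```

Calling `lead(&[Some(1), None, Some(3)], 1, Some(-1))` yields `[NULL, 3, -1]`: the first row sees a real `NULL`, and the last row has no successor, so it gets the default. Passing offset `1` and `None` as the default corresponds to the one-argument form `lead(input)`.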