[AURON #2183] Implement native support for ORC InsertIntoHiveTable writes by weimingdiit · Pull Request #2191 · apache/auron

weimingdiit · 2026-04-12T08:22:00Z

Which issue does this PR close?

Rationale for this change

Auron already supports native Parquet InsertIntoHiveTable writes, but ORC Hive writes still fall back to Spark’s regular execution path. This leaves native write coverage incomplete for a common Hive storage format.

This PR adds native support for ORC InsertIntoHiveTable writes so eligible Hive ORC write workloads can stay on the native path instead of falling back.

What changes are included in this PR?

This PR:

- adds native ORC sink support in the native engine
- adds planner / proto support for ORC sink execution
- adds Spark-side physical plan support for native ORC InsertIntoHiveTable
- extends AuronConverters to convert supported Hive ORC write plans to the native path
- adds ORC sink utilities for task output path generation and output completion
- preserves dynamic partition write handling on the native ORC write path
- adapts input batches to the expected ORC/Hive output schema before writing
- records output row and byte metrics for native ORC writes
- adds execution coverage in AuronExecSuite
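
To make the dynamic partition and output path items above more concrete, here is a minimal sketch of a Hive-style partition directory layout. It is an illustration only, not the PR's ORC sink utilities: `PartitionPaths`, `partitionDir`, and `outputFile` are hypothetical names, the file name is supplied by the caller, and the escaping of special characters that a real Hive writer applies is omitted.

```scala
// Illustrative only: a simplified Hive-style dynamic partition layout.
// All names here are hypothetical and do not come from the Auron code base.
object PartitionPaths {
  // Hive's conventional directory name for null or empty partition values.
  val DefaultPartitionName = "__HIVE_DEFAULT_PARTITION__"

  /** Builds "col1=v1/col2=v2" from ordered (column, value) pairs. */
  def partitionDir(spec: Seq[(String, Option[String])]): String =
    spec.map { case (col, value) =>
      val v = value.filter(_.nonEmpty).getOrElse(DefaultPartitionName)
      s"$col=$v"
    }.mkString("/")

  /** Joins a table root, the partition directory (if any), and a file name. */
  def outputFile(
      root: String,
      spec: Seq[(String, Option[String])],
      fileName: String): String = {
    val dir = partitionDir(spec)
    // Skip the partition segment entirely for unpartitioned writes.
    val segments = Seq(root.stripSuffix("/")) ++ Seq(dir).filter(_.nonEmpty) :+ fileName
    segments.mkString("/")
  }
}
```

For example, `PartitionPaths.outputFile("/warehouse/t", Seq("dt" -> Some("2026-04-12")), "part-00000.orc")` places the file under a `dt=2026-04-12` directory, while an empty partition spec writes directly under the table root.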

Are there any user-facing changes?

Yes.

Hive table writes using ORC may now remain on the native execution path when they match the supported InsertIntoHiveTable write pattern, instead of falling back to Spark’s regular write execution.

How was this patch tested?

CI, including the execution coverage added in AuronExecSuite.

cxzl25 · 2026-04-13T07:34:23Z

spark-extension/src/main/scala/org/apache/spark/sql/auron/AuronConverters.scala

-        Shims.get.createNativeParquetInsertIntoHiveTableExec(cmd, sortedChild)
+        Shims.get.createNativeParquetInsertIntoHiveTableExec(cmd, sortInsertChild(cmd, child))
+
+      case DataWritingCommandExec(cmd: InsertIntoHiveTable, child)


Currently Auron only has auron.enable.data.writing to control whether writes are converted to native execution; there is no way to enable or disable it per format. I recommend adding separate per-format controls.

Good point. I changed the write gating to support separate format-level controls on top of the existing global spark.auron.enable.data.writing switch. The converter now checks spark.auron.enable.data.writing.parquet and spark.auron.enable.data.writing.orc before converting InsertIntoHiveTable.
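
The two-level gating described in this reply could be sketched as follows. This is a minimal illustration under stated assumptions, not the converter code from the PR: `WriteGating` and `nativeWriteEnabled` are hypothetical names, the configuration is modelled as a plain `Map` rather than a Spark conf, and the defaults (global switch off, format switches on when unset) are assumptions. Only the three configuration keys are taken from the discussion above.

```scala
// Hypothetical sketch of global + per-format write gating.
// Not Auron's actual implementation; defaults are assumed for illustration.
object WriteGating {
  // Configuration keys quoted in the review reply.
  val GlobalKey = "spark.auron.enable.data.writing"
  val ParquetKey = "spark.auron.enable.data.writing.parquet"
  val OrcKey = "spark.auron.enable.data.writing.orc"

  // Reads a boolean flag, falling back to the given (assumed) default.
  private def flag(conf: Map[String, String], key: String, default: Boolean): Boolean =
    conf.get(key).map(_.trim.toLowerCase == "true").getOrElse(default)

  /** True only when the global switch and the format-level switch are both on. */
  def nativeWriteEnabled(conf: Map[String, String], format: String): Boolean = {
    val formatKey: Option[String] = format.toLowerCase match {
      case "parquet" => Some(ParquetKey)
      case "orc"     => Some(OrcKey)
      case _         => None // unknown formats are never converted
    }
    flag(conf, GlobalKey, default = false) &&
      formatKey.exists(key => flag(conf, key, default = true))
  }
}
```

With this shape, setting the global key to `true` and the ORC key to `false` keeps Parquet writes eligible for native conversion while ORC writes fall back to Spark's regular path.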

github-actions bot added the spark and native labels on Apr 12, 2026

weimingdiit force-pushed the feat/orc-sink_native_iceberg branch 4 times, most recently from 4682ae0 to 65f7ae7 Compare April 13, 2026 05:27

cxzl25 reviewed Apr 13, 2026

weimingdiit force-pushed the feat/orc-sink_native_iceberg branch from 65f7ae7 to bc8c720 Compare April 13, 2026 09:44

[AURON apache#2183] Implement native support for ORC InsertIntoHiveTa…

28946b0

…ble writes Signed-off-by: weimingdiit <weimingdiit@gmail.com>

weimingdiit force-pushed the feat/orc-sink_native_iceberg branch from bc8c720 to 28946b0 Compare April 13, 2026 11:20

weimingdiit marked this pull request as ready for review April 13, 2026 12:31

weimingdiit wants to merge 1 commit into apache:master from weimingdiit:feat/orc-sink_native_iceberg
