Flink: Support writing shredded variant in Flink #15596
Guosmilesmile wants to merge 7 commits into apache:main from
Conversation
.tableProperty(TableProperties.PARQUET_SHRED_VARIANTS)
.defaultValue(TableProperties.PARQUET_SHRED_VARIANTS_DEFAULT)
How will we handle when ORC supports shredding variants?
Good catch. I renamed shred-variants to parquet-shred-variants to clarify that this feature only supports Parquet. If ORC supports shredding later, we can add another config.
Let's do parquet for now since we followed that pattern for the Spark implementation.
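For readers following along, a minimal usage sketch of how this Parquet-only option might be enabled from a Flink job, assuming the usual FlinkSink builder pipeline; the input stream and tableLoader are placeholders, and only the option key "parquet-shred-variants" is taken from this PR's FlinkWriteOptions:

// Hypothetical sketch, not code from this PR: enable shredded variant writes via the
// Flink write option; `input` (DataStream<RowData>) and `tableLoader` are assumed to exist.
FlinkSink.forRowData(input)
    .tableLoader(tableLoader)
    .set("parquet-shred-variants", "true")  // Parquet-only; ORC would need its own option
    .append();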
FlinkParquetReaders.buildReader(icebergSchema, fileSchema, idToConstant)));
FlinkParquetReaders.buildReader(icebergSchema, fileSchema, idToConstant),
new FlinkVariantShreddingAnalyzer(),
(row, rowType) -> new RowDataSerializer(rowType).copy(row)));
Isn't it costly to recreate this every time we copy a row?
It will increase the cost, but without copying there would be data corruption issues when buffering rows. We ran into this during early development, and the unit tests can reproduce it.
Can we reuse the RowDataSerializer?
With the current BiFunction, (row, rowType) -> new RowDataSerializer(rowType).copy(row) creates a new RowDataSerializer for every buffered row (default buffer = 100). This construction is not free, as it involves walking rowType.getChildren(), building a TypeSerializer[] via InternalSerializers.create, a BinaryRowDataSerializer, and a RowData.FieldGetter[]. Since the engine schema is fixed for the entire file, a factory allows us to build it once and reuse it. Using the Factory Pattern, we can avoid recreating the serializer for a given table schema with every incoming record.
Yes, we can use Function<S, UnaryOperator<D>> instead of BiFunction<D, S, D> to implement this.
+1. We should be able to reuse RowDataSerializer so we don't need to create a new instance for every row.
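To make the suggestion concrete, a minimal sketch (not the PR's exact code) of the factory shape discussed in this thread, using org.apache.flink.table.runtime.typeutils.RowDataSerializer: the serializer is built once per engine schema and the resulting copy function is reused for every buffered row.

// Sketch only: build the RowDataSerializer once per RowType, reuse it per buffered row.
Function<RowType, UnaryOperator<RowData>> copyFuncFactory =
    rowType -> {
      RowDataSerializer serializer = new RowDataSerializer(rowType); // constructed once per schema
      return serializer::copy;                                       // applied to each buffered row
    };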
talatuyarer left a comment
Thanks for putting this together. The structure mirrors the Spark side cleanly, and the test coverage of inference behaviors (tie-breaking, decimal fallback, cross-file types) is genuinely valuable. I believe we should address reconstructing RowDataSerializer on every buffered row.
FlinkParquetReaders.buildReader(icebergSchema, fileSchema, idToConstant)));
FlinkParquetReaders.buildReader(icebergSchema, fileSchema, idToConstant),
new FlinkVariantShreddingAnalyzer(),
(row, rowType) -> new RowDataSerializer(rowType).copy(row)));
With the current BiFunction, (row, rowType) -> new RowDataSerializer(rowType).copy(row) creates a new RowDataSerializer for every buffered row (default buffer = 100). This construction is not free, as it involves walking rowType.getChildren(), building a TypeSerializer[] via InternalSerializers.create, a BinaryRowDataSerializer, and a RowData.FieldGetter[]. Since the engine schema is fixed for the entire file, a factory allows us to build it once and reuse it. Using the Factory Pattern, we can avoid recreating the serializer for a given table schema with every incoming record.
aihuaxu left a comment
Minor comments. It mirrors what Spark does, and the implementation looks clean to me in general.
.parse();
}

public int variantInferenceBufferSize() {
Should this be Parquet specific as well?
Thanks for pointing that out. Initially I intended to make it a general parameter, but after your suggestion I realized that the table property it references is Parquet-specific, so it still needs to be defined as a Parquet-only parameter.
ConfigOptions.key("parquet-shred-variants").booleanType().defaultValue(false);

public static final ConfigOption<Integer> VARIANT_INFERENCE_BUFFER_SIZE =
ConfigOptions.key("variant-inference-buffer-size").intType().defaultValue(10);
Maybe default to 100 to align with TableProperties.PARQUET_VARIANT_BUFFER_SIZE_DEFAULT value?
You are right. The initial Spark implementation used 10, and it was later changed to 100. I have made it the same.
FlinkParquetReaders.buildReader(icebergSchema, fileSchema, idToConstant)));
FlinkParquetReaders.buildReader(icebergSchema, fileSchema, idToConstant),
new FlinkVariantShreddingAnalyzer(),
(row, rowType) -> new RowDataSerializer(rowType).copy(row)));
+1. We should be able to reuse RowDataSerializer so we don't need to create a new instance for every row.
SparkParquetReaders.buildReader(icebergSchema, fileSchema, idToConstant),
new SparkVariantShreddingAnalyzer(),
InternalRow::copy));
structType -> InternalRow::copy));
We don't need this change anymore. Can you revert it?
Since we changed UnaryOperator<D> to Function<S, UnaryOperator<D>>, the Spark part should be adjusted accordingly.
SparkParquetReaders.buildReader(icebergSchema, fileSchema, idToConstant),
new SparkVariantShreddingAnalyzer(),
InternalRow::copy));
structType -> InternalRow::copy));
We don't need this change anymore. Can you revert it?
Same as above.
}

@TestTemplate
public void testDecimalFallbackAfterBuffer() {
My understanding is that this test verifies that rows which don't fit the inferred shredded type are written to the unshredded value field. However, the test only performs a round trip verification. The schema assertion would be the most informative part of this test.
Good point. I will add an assertion to verify the schema.
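As a hedged sketch of what that assertion could look like, the written file's footer can be inspected with the standard Parquet API; the variant column name "data" and the metadata/value/typed_value layout are assumptions to adapt to the actual test:

// Sketch only: verify the shredded schema rather than relying on the round trip alone.
try (ParquetFileReader reader =
    ParquetFileReader.open(HadoopInputFile.fromPath(new Path(parquetFile), new Configuration()))) {
  MessageType fileSchema = reader.getFooter().getFileMetaData().getSchema();
  GroupType variant = fileSchema.getType("data").asGroupType(); // "data" is a placeholder column name
  // rows that do not fit the inferred shredded type should still land in the plain value field
  assertThat(variant.containsField("value")).isTrue();
  assertThat(variant.containsField("typed_value")).isTrue();
}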
}

@TestTemplate
public void testInfrequentFieldPruning() throws IOException {
The test is implicitly coupled to MIN_FIELD_FREQUENCY = 0.10. The ratio of 1/11 (≈ 0.0909) is just below this threshold. If this constant is adjusted, the test's outcome will change without an obvious failure. Either reference the constant or add a comment to document this dependency.
Good point. Added a comment to clarify it.
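For illustration, the other option mentioned above (referencing the constant directly) could look like the sketch below; the constant's owning class is an assumption here:

// Sketch only: 1 occurrence in 11 rows ≈ 0.0909, which must stay below the pruning threshold.
// MIN_FIELD_FREQUENCY is assumed to live on VariantShreddingAnalyzer for this illustration.
assertThat(1.0 / 11).isLessThan(VariantShreddingAnalyzer.MIN_FIELD_FREQUENCY);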
.tableProperty(TableProperties.PARQUET_SHRED_VARIANTS)
.defaultValue(TableProperties.PARQUET_SHRED_VARIANTS_DEFAULT)
Let's do parquet for now since we followed that pattern for the Spark implementation.
public int parquetVariantInferenceBufferSize() {
return confParser
.intConf()
.option(FlinkWriteOptions.PARQUET_VARIANT_INFERENCE_BUFFER_SIZE.key())
Could you add these to the flink documentation similar to the #14297 for spark?
Right, will add it.
public static final ConfigOption<Boolean> PARQUET_SHRED_VARIANTS =
ConfigOptions.key("parquet-shred-variants").booleanType().defaultValue(false);

public static final ConfigOption<Integer> PARQUET_VARIANT_INFERENCE_BUFFER_SIZE =
Could you rename this to VARIANT_INFERENCE_BUFFER_SIZE to be consistent with Spark?
Right, changed it.
private final boolean isBatchReader;
private final VariantShreddingAnalyzer<D, S> variantAnalyzer;
private final UnaryOperator<D> copyFunc;
private final Function<S, UnaryOperator<D>> copyFuncFactory;
Based on this comment thread we decided to keep the UnaryOperator: #14297 (comment). @pvary suggested BiFunction too. Worth exploring which approach is best, or we can keep it as UnaryOperator.
The previous version used BiFunction, which caused RowDataSerializer to be repeatedly created. To avoid this, we use Function<S, UnaryOperator<D>> instead, so that RowDataSerializer can be initialized once before BufferedFileAppender is created.
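To illustrate why both engines are touched by this signature, a hedged sketch of the two factory implementations discussed in this PR (abbreviated, not the exact code): Spark's copy needs no schema-derived state, while Flink derives the serializer from the RowType once and reuses it.

// Sketch only, types abbreviated for illustration.
Function<StructType, UnaryOperator<InternalRow>> sparkCopy = structType -> InternalRow::copy;
Function<RowType, UnaryOperator<RowData>> flinkCopy =
    rowType -> new RowDataSerializer(rowType)::copy; // serializer created once per schema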
GenericParquetReaders.buildReader(icebergSchema, fileSchema),
testAnalyzer,
record -> record);
unused -> oriRecord -> oriRecord);
We changed UnaryOperator<D> to Function<S, UnaryOperator<D>>.
Thanks for cleaning up this PR @Guosmilesmile. Do we need to back-port this to additional Flink versions as well? CC: @pvary @stevenzwu
@nssalian Flink supports the variant type in 2.1, so we only add this feature for the 2.1 version.
This PR mainly adds support in Flink for writing shredded variant data to Iceberg tables, based on #14297.
This PR is based on #14297 and will be adjusted in sync with it.