Support independent lake partition spec for lakehouse tiering #3309

@litiliu

Search before asking

  • I searched in the issues and found nothing similar.

Motivation

For Kafka-like event log use cases, users often want a Fluss log table to be non-partitioned and bucketed by an event key, for example DISTRIBUTED BY (id), so that all events of the same key are written to the same bucket and can be consumed in order.

However, when tiering the same data into a lake table such as Iceberg, users often want the lake table to be partitioned by event date or event time, for example days(event_time) or dt, to support query pruning, partition-level retention, and efficient analytical queries.

These two layout requirements are different:

Fluss log layout:
optimized for keyed ordering and streaming replay

Lake table layout:
optimized for analytical query pruning and lifecycle management

Currently, Fluss derives the lake table partition spec directly from the Fluss table partition keys. This forces users to choose between:

  1. Fluss non-partitioned table + bucket by key
    -> preserves per-key ordering
    -> lake table cannot be independently partitioned by date/time

  2. Fluss partitioned table by date + bucket by key
    -> lake table can be date partitioned
    -> per-key ordering is only guaranteed within each Fluss partition, not across dates

For example:

2026-05-01, id=1, login
2026-05-02, id=1, order
2026-05-02, id=1, ship

With a Kafka-like layout, id=1 is always routed to the same partition/bucket, so consumers can observe:

login -> order -> ship

But if the Fluss table is partitioned by day, the same id=1 is written to different TableBuckets:
TableBucket(partition=2026-05-01, bucket=hash(id=1))
TableBucket(partition=2026-05-02, bucket=hash(id=1))

This means Fluss only guarantees ordering within each partition + bucket, not across daily partitions. That makes it hard to use a partitioned Fluss log table as a Kafka-like event log while also keeping the lake table efficiently partitioned by date.
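The ordering problem described above can be reproduced with a small sketch. It is only an illustration: `TableBucket` here is a simplified stand-in for the Fluss class, and `bucket_for` uses CRC32 as a hypothetical hash, which is not necessarily the function Fluss uses for bucketing.

```python
import zlib
from typing import NamedTuple, Optional

NUM_BUCKETS = 64  # same bucket count as the example table in this issue


class TableBucket(NamedTuple):
    """Simplified stand-in for a Fluss TableBucket: (partition, bucket)."""
    partition: Optional[str]  # None for a non-partitioned table
    bucket: int


def bucket_for(key: str) -> int:
    # Hypothetical hash for illustration; Fluss's real bucketing may differ.
    return zlib.crc32(key.encode("utf-8")) % NUM_BUCKETS


def route(event_date: str, key: str, partitioned_by_day: bool) -> TableBucket:
    # A day-partitioned table scopes the bucket to the event's date partition.
    partition = event_date if partitioned_by_day else None
    return TableBucket(partition, bucket_for(key))


events = [
    ("2026-05-01", "1", "login"),
    ("2026-05-02", "1", "order"),
    ("2026-05-02", "1", "ship"),
]

# Non-partitioned: all events of id=1 share one TableBucket (one ordered log).
flat = {route(d, k, partitioned_by_day=False) for d, k, _ in events}

# Partitioned by day: id=1 is split across one TableBucket per date.
daily = {route(d, k, partitioned_by_day=True) for d, k, _ in events}

print(len(flat), len(daily))  # 1 2
```

The bucket number is identical in both layouts; only the partition component differs, and that is what breaks ordering across dates.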

Solution

Add support for an independent lake partition spec, decoupled from Fluss table partition keys.

For example:

CREATE TABLE event_log (
    id STRING,
    event_time TIMESTAMP_LTZ(3),
    event_type STRING,
    payload STRING
)
DISTRIBUTED BY (id) INTO 64 BUCKETS
WITH (
    'table.datalake.enabled' = 'true',
    'table.datalake.format' = 'iceberg',
    'table.datalake.iceberg.partition-spec' = 'days(event_time)'
);

This would allow:

Fluss table:
non-partitioned, bucketed by id
-> preserves per-key ordering

Iceberg table:
partitioned by days(event_time) or a dt column
-> supports efficient day-based query pruning and retention

When creating a lake-enabled Fluss table:

  • Fluss table partition keys continue to control Fluss physical partitioning.
  • The lake partition spec controls the Iceberg/Paimon physical partitioning.
  • If no lake partition spec is configured, current behavior remains unchanged.
  • Lake tiering still tracks progress by Map<TableBucket, Long>, independent of the lake partition layout.
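The rules above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the Fluss tiering implementation: the option key is the one proposed in this issue, `resolve_lake_partition_spec` and `tier` are hypothetical helpers, the spec is treated as a single expression, and the progress value is assumed to be the next offset to tier for each bucket. The `days` helper follows Iceberg's day transform, which yields days since 1970-01-01.

```python
from datetime import datetime, timezone
from typing import Dict, List, NamedTuple, Optional

# Option name proposed in this issue; it is not an existing Fluss option.
LAKE_SPEC_OPTION = "table.datalake.iceberg.partition-spec"
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


class TableBucket(NamedTuple):
    """Simplified stand-in for a Fluss TableBucket."""
    partition: Optional[str]
    bucket: int


class Record(NamedTuple):
    offset: int
    id: str
    event_time: datetime


def resolve_lake_partition_spec(
    options: Dict[str, str], fluss_partition_keys: List[str]
) -> List[str]:
    spec = options.get(LAKE_SPEC_OPTION)
    if spec is None:
        # Not configured: keep current behavior and mirror the Fluss keys.
        return list(fluss_partition_keys)
    # Configured: the lake layout is independent of the Fluss partition keys.
    return [spec.strip()]


def days(ts: datetime) -> int:
    # Days since 1970-01-01 UTC, as produced by a day transform.
    return (ts - EPOCH).days


def tier(
    bucket: TableBucket,
    records: List[Record],
    progress: Dict[TableBucket, int],
    lake: Dict[int, List[Record]],
) -> None:
    for rec in records:
        # The lake partition is derived per record from event_time.
        lake.setdefault(days(rec.event_time), []).append(rec)
        # Progress stays keyed by TableBucket, whatever the lake layout is.
        progress[bucket] = rec.offset + 1


bucket = TableBucket(partition=None, bucket=7)
records = [
    Record(0, "1", datetime(2026, 5, 1, 9, 0, tzinfo=timezone.utc)),
    Record(1, "1", datetime(2026, 5, 2, 10, 0, tzinfo=timezone.utc)),
    Record(2, "1", datetime(2026, 5, 2, 11, 0, tzinfo=timezone.utc)),
]
progress: Dict[TableBucket, int] = {}
lake: Dict[int, List[Record]] = {}
tier(bucket, records, progress, lake)

spec = resolve_lake_partition_spec({LAKE_SPEC_OPTION: "days(event_time)"}, [])
print(spec, len(lake), progress[bucket])
```

One Fluss bucket fans out into two lake partitions here, while the progress map still holds a single entry for that bucket.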

This is a common ingestion pattern in streaming lakehouse architectures:

Kafka / Fluss log:
key-partitioned for ordering and replay

Iceberg / Paimon lake table:
time-partitioned for analytical query efficiency

Anything else?

No response

Willingness to contribute

  • I'm willing to submit a PR!
