[core] When scan.primary-branch is set, allow primary branch to be an append only table#7614
[core] When scan.primary-branch is set, allow primary branch to be an append only table#7614tsreaper wants to merge 3 commits intoapache:masterfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR updates branch read validation so that when scan.primary-branch is configured, the “primary” branch can be append-only while the main branch can have primary keys, and adds an IT case covering batch reads for this scenario.
Changes:
- Add an integration test verifying
scan.primary-branchreads partitions from the primary branch first, falling back to main for missing partitions. - Adjust schema validation in
FallbackReadFileStoreTableto validate primary-key compatibility based on the configured read priority order.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| paimon-flink/paimon-flink-common/src/test/java/org/apache/paimon/flink/BranchSqlITCase.java | Adds IT coverage for primary-branch precedence + PK(main)/append-only(primary) behavior. |
| paimon-core/src/main/java/org/apache/paimon/table/FallbackReadFileStoreTable.java | Reorders schema/PK validation using priority ordering to allow PKs only on the fallback side. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| FileStoreTable first = wrappedFirst ? wrapped : other; | ||
| FileStoreTable second = wrappedFirst ? other : wrapped; | ||
|
|
||
| String firstBranch = first.coreOptions().branch(); | ||
| String secondBranch = second.coreOptions().branch(); |
There was a problem hiding this comment.
first/second (and related variables) are ambiguous in a method that enforces asymmetric rules depending on read priority. Renaming them to something semantic like primary/fallback (or prioritized/fallback) would make it much harder to misread or accidentally break the intended validation behavior in future edits.
| List<String> firstPrimaryKeys = first.schema().primaryKeys(); | ||
| List<String> secondPrimaryKeys = second.schema().primaryKeys(); |
There was a problem hiding this comment.
first/second (and related variables) are ambiguous in a method that enforces asymmetric rules depending on read priority. Renaming them to something semantic like primary/fallback (or prioritized/fallback) would make it much harder to misread or accidentally break the intended validation behavior in future edits.
| if (!firstPrimaryKeys.isEmpty()) { | ||
| if (secondPrimaryKeys.isEmpty()) { | ||
| throw new IllegalArgumentException( | ||
| "Branch " | ||
| + mainBranch | ||
| + firstBranch | ||
| + " has primary keys while branch " | ||
| + otherBranch | ||
| + secondBranch | ||
| + " does not. This is not allowed."); | ||
| } | ||
| Preconditions.checkArgument( | ||
| mainPrimaryKeys.equals(otherPrimaryKeys), | ||
| firstPrimaryKeys.equals(secondPrimaryKeys), | ||
| "Branch %s and %s both have primary keys but are not the same.\n" | ||
| + "Primary keys of %s are %s.\n" | ||
| + "Primary keys of %s are %s.", | ||
| mainBranch, | ||
| otherBranch, | ||
| mainBranch, | ||
| mainPrimaryKeys, | ||
| otherBranch, | ||
| otherPrimaryKeys); | ||
| firstBranch, | ||
| secondBranch, | ||
| firstBranch, | ||
| firstPrimaryKeys, | ||
| secondBranch, | ||
| secondPrimaryKeys); | ||
| } |
There was a problem hiding this comment.
With the new logic, the method intentionally allows firstPrimaryKeys to be empty while secondPrimaryKeys is non-empty (i.e., append-only prioritized branch + PK fallback). That’s a non-obvious invariant for validateSchema(); adding a short comment explaining why this asymmetry is valid (and that the ordering is derived from read priority via wrappedFirst) would reduce the risk of a future refactor “fixing” it back to symmetric PK checks.
| List<String> secondPrimaryKeys = second.schema().primaryKeys(); | ||
| if (!firstPrimaryKeys.isEmpty()) { | ||
| if (secondPrimaryKeys.isEmpty()) { | ||
| throw new IllegalArgumentException( |
There was a problem hiding this comment.
Why do we need to restrict the situation of primary keys?
There was a problem hiding this comment.
Good question. I suppose this is because when we first design the fallback branch feature, we only think of a main primary key branch + fallback append only branch. Maybe we should allow all combinations.
32a40a5 to
5a4ca19
Compare
Purpose
In
scan.fallback-branch, we allow main branch to be append only, and fallback branch to have primary key. So inscan.primary-branch, we should allow main branch to have primary key, and primary branch to be append only.Tests
BranchSqlITCase