refactor: enforce tilebuf-only pipeline#612
refactor: enforce tilebuf-only pipeline#612Zhendong404 wants to merge 1 commit intohw-native-sys:mainfrom
Conversation
There was a problem hiding this comment.
Code Review
This pull request transitions the PTOAS compilation pipeline to a tilebuf-native model, replacing legacy memref dependencies with a unified tile_buf/view/ptr system. This involves introducing a semantic extraction layer and updating PlanMemory and InsertSync to consume these native types. Feedback identifies several critical issues: a unit mismatch between bits and bytes in address planning, incorrect stride calculations for column-major subviews in the sync translator, and the breakage of global workspace planning due to hardcoded local-memory logic. Additionally, the reviewer noted that scalar load/store operations are missing from liveness analysis and pointed out redundant logic for buffer alias extraction that should be unified.
| SmallVector<uint64_t> offsets = getAllocatedOffsets( | ||
| op, currentMemRefType, buffer2Offsets, fallbackNextOffset); | ||
| op, tileType, buffer2Offsets, fallbackNextOffset); |
There was a problem hiding this comment.
There is a unit mismatch between PlanMemory and AllocToPointerCast. PlanMemory calculates offsets in bits (using semantics.constBits), while AllocToPointerCast's fallback logic and alignment constants use bytes. If a buffer is planned, its offset in buffer2Offsets will be in bits, but it is used here as a byte address, leading to incorrect hardware addresses. Ensure that offsets are consistently handled in bytes or bits throughout the pipeline, or perform the necessary conversion here.
| SmallVector<int64_t, 4> strides(shape.size(), 1); | ||
| for (int i = static_cast<int>(shape.size()) - 2; i >= 0; --i) { | ||
| if (shape[i + 1] == ShapedType::kDynamic) | ||
| return {-1, -1}; | ||
| strides[i] = strides[i + 1] * shape[i + 1]; | ||
| } |
There was a problem hiding this comment.
The stride calculation for pto.subview assumes a row-major layout. If the source tile has a column-major layout (specified in its TileBufType config), the calculated totalOffset will be incorrect, leading to wrong hazard detection in sync insertion. The calculation should check the blayout attribute of the source tile type and adjust the stride accumulation accordingly.
| auto memorySpaceAttr = getPlanningBufferSpaceAttr(operand); | ||
| if (isLocalMemPlan() && isLocalBuffer(memorySpaceAttr)) { | ||
| if (!memorySpaceAttr.has_value()) | ||
| llvm::report_fatal_error("local buffer must have memory space"); | ||
| return GetBufferInfo(op, operand, | ||
| memorySpaceAttr.value().getAddressSpace()); | ||
| return GetBufferInfo(op, operand, memorySpaceAttr.value().getAddressSpace(), | ||
| out); | ||
| } | ||
| llvm_unreachable("buffer must has BufferInfo !"); | ||
| op->emitError("expects local tile buffer result for PlanMemory"); | ||
| return failure(); | ||
| } |
There was a problem hiding this comment.
GenerateBufferInfo is currently hardcoded to only support local memory planning. If planMode is GLOBAL_WORKSPACE_PLAN, the function will emit an error and return failure for any buffer, effectively breaking global workspace planning. Additionally, getPlanningBufferSpaceAttr does not handle GM views, which are the primary targets for global workspace planning. The logic should be updated to handle non-local buffers when in global workspace planning mode and to correctly resolve the GM address space for views.
| } else if (auto tprintOp = dyn_cast<pto::TPrintOp>(op)) { | ||
| // TPrintOp only reads from buffer, similar to LoadOp | ||
| OpKillHandle(curOpInfo, live, op->getBlock()); |
There was a problem hiding this comment.
pto::LoadScalarOp and pto::StoreScalarOp are not handled in RecursionIR. Since they read/write buffers, they should trigger OpKillHandle or UpdateOpGenInfo to ensure accurate liveness analysis. Skipping them may lead to incorrect memory planning if they are the last users of a buffer.
} else if (auto tprintOp = dyn_cast<pto::TPrintOp>(op)) {
// TPrintOp only reads from buffer, similar to LoadOp
OpKillHandle(curOpInfo, live, op->getBlock());
} else if (auto loadScalarOp = dyn_cast<pto::LoadScalarOp>(op)) {
OpKillHandle(curOpInfo, live, op->getBlock());
} else if (auto storeScalarOp = dyn_cast<pto::StoreScalarOp>(op)) {
UpdateStoreOpInfo(curOpInfo, storeScalarOp.getPtr(), live);| // Returns (alias_result, source) for planning aliases, including tile views. | ||
| std::optional<std::pair<Value, Value>> getBufferAliasInfo(Operation *op) { | ||
| if (auto genericAlias = getOperationAliasInfo(op)) | ||
| return genericAlias; | ||
|
|
||
| if (auto bindOp = dyn_cast<pto::BindTileOp>(op)) | ||
| return std::make_pair(bindOp.getResult(), bindOp.getSource()); | ||
| if (auto subviewOp = dyn_cast<pto::SubViewOp>(op)) | ||
| return std::make_pair(subviewOp.getResult(), subviewOp.getSource()); | ||
| if (auto bitcastOp = dyn_cast<pto::BitcastOp>(op)) | ||
| return std::make_pair(bitcastOp.getResult(), bitcastOp.getSrc()); | ||
| if (auto treshapeOp = dyn_cast<pto::TReshapeOp>(op)) | ||
| return std::make_pair(treshapeOp.getResult(), treshapeOp.getSrc()); | ||
| return std::nullopt; | ||
| } |
There was a problem hiding this comment.
getBufferAliasInfo is redundant with getOperationAliasInfo defined in Utils.cpp. Both functions now handle the same set of PTO ops (BindTileOp, SubViewOp, BitcastOp, TReshapeOp). These should be unified to avoid duplicate logic and potential inconsistencies.
std::optional<std::pair<Value, Value>> getBufferAliasInfo(Operation *op) {
return getOperationAliasInfo(op);
}
Codex Review该评论由 review 机器人自动更新。
SummaryPR #612 introduces memref-compatibility regressions in the pipe pipeline, a silent address-0 aliasing bug in standalone EmitC lowering, and an unversioned C API type-helper break. Findings
The PR removes the legacy memref bridge from the default ptoas pipeline (
The exported C API symbol |
5daaa8e to
c3e1f7c
Compare
No description provided.