feat(drive): pre-flight 10000-rune total cap for +add-comment reply_elements#605
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughEnforces a server-aligned global rune budget for comment replies: all Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #605 +/- ##
==========================================
+ Coverage 64.19% 64.60% +0.41%
==========================================
Files 507 516 +9
Lines 44869 45791 +922
==========================================
+ Hits 28803 29583 +780
- Misses 13524 13624 +100
- Partials 2542 2584 +42 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
🚀 PR Preview Install Guide🧰 CLI updatenpm i -g https://pkg.pr.new/larksuite/cli/@larksuite/cli@0ef47490b50c886cf92df00188950d0705e7b77b🧩 Skill updatenpx skills add larksuite/cli#feat/add-comment-length-precheck -y -g |
4740361 to
1216876
Compare
fangshuyu-768
left a comment
There was a problem hiding this comment.
Two concerns on the byte-limit pre-flight.
1. Byte check happens before HTML escaping (shortcuts/drive/drive_add_comment.go, lines ~579 and ~590). len(input.Text) is measured before escapeCommentText runs, but each < / > becomes 4 bytes (< / >) after escaping. A payload with even one < at exactly 300 bytes raw becomes 303 bytes after escaping, which the server may still reject with the same opaque [1069302]. Suggest measuring the escaped output:
```go
escaped := escapeCommentText(input.Text)
if byteLen := len(escaped); byteLen > maxCommentTextElementBytes {
// ...
}
```
and storing the already-escaped value to avoid double work later.
2. mention_user and link elements bypass the byte limit (lines ~569–586). If the server enforces a per-element limit regardless of type, those should be validated too. Worth confirming whether the limit is text-only, and either adding the same check or documenting the exemption.
fangshuyu-768
left a comment
There was a problem hiding this comment.
Inline note; expands on the earlier summary review.
1216876 to
8aa340e
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
shortcuts/drive/drive_add_comment_test.go (1)
352-360: Optional: add a symmetric escaped-byte overflow case for>You already cover
<escaped-length overflow; adding>would harden both escape branches equally.Optional diff
{ // Regression: byte check must measure the post-escape length so // '<' / '>' (which expand to '<' / '>', 4 bytes each) // don't slip through the limit into the opaque server-side // [1069302]. 99 ASCII chars + 75 '<' = raw 174 bytes, escaped // 99 + 75*4 = 399 bytes (over 300). name: "raw under but escaped over byte limit is rejected", input: `[{"type":"text","text":"` + strings.Repeat("a", 99) + strings.Repeat("<", 75) + `"}]`, wantErr: "(399 bytes)", }, + { + name: "greater-than chars also overflow by escaped byte length", + input: `[{"type":"text","text":"` + strings.Repeat("a", 99) + strings.Repeat(">", 75) + `"}]`, + wantErr: "(399 bytes)", + },🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@shortcuts/drive/drive_add_comment_test.go` around lines 352 - 360, Add a symmetric test case to drive_add_comment_test.go that mirrors the existing "raw under but escaped over byte limit is rejected" case but uses '>' characters instead of '<' to exercise the other escape branch; duplicate the entry with name like "raw under but escaped over byte limit is rejected (gt)" and set input to strings.Repeat("a", 99) + strings.Repeat(">", 75) and wantErr "(399 bytes)" so the test validates that '>'-to-'>' escaping also triggers the byte-limit rejection in the same way the existing case using '<' does.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@shortcuts/drive/drive_add_comment_test.go`:
- Around line 352-360: Add a symmetric test case to drive_add_comment_test.go
that mirrors the existing "raw under but escaped over byte limit is rejected"
case but uses '>' characters instead of '<' to exercise the other escape branch;
duplicate the entry with name like "raw under but escaped over byte limit is
rejected (gt)" and set input to strings.Repeat("a", 99) + strings.Repeat(">",
75) and wantErr "(399 bytes)" so the test validates that '>'-to-'>' escaping
also triggers the byte-limit rejection in the same way the existing case using
'<' does.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d83495a6-acb4-44f1-b634-e1ef20a067fd
📒 Files selected for processing (2)
shortcuts/drive/drive_add_comment.goshortcuts/drive/drive_add_comment_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- shortcuts/drive/drive_add_comment.go
The open-platform comment API returns an opaque [1069302] Invalid or missing parameters whenever a single reply_elements[i] text exceeds its implicit byte budget. The error does not name which element failed or that length is the cause, so callers resort to binary-search debugging. Empirically: Chinese text up to ~80 chars (~240 bytes) lands; ~130 chars (~390 bytes) fails. Set the pre-flight limit to 300 bytes which sits safely inside the known-good zone. - parseCommentReplyElements now rejects any text element whose UTF-8 byte length exceeds 300, with an ExitError naming the element index (#N, 1-based) and both the rune and byte counts, plus an ErrWithHint recommending the correct remediation (split into multiple text elements — the comment UI renders them as one contiguous comment). - The previous 1000-rune check is removed: it was too lenient (a Chinese text under that cap would still fail server-side). - skills/lark-drive/references/lark-drive-add-comment.md documents the per-element limit and the correct split pattern so agents avoid constructing oversized single elements upstream. Addresses Case 12 in the 踩坑列表 doc.
`escapeCommentText` only expands `<` and `>` (each → 4 bytes via `<` / `>`); `&` is intentionally left as-is. Both the over-limit hint and the inline comment in `parseCommentReplyElements` previously claimed `&` was also escaped, with a "4-5 bytes each" range that implicitly assumed `&` (5 bytes) — a string of 300 `&` chars would actually fit in the budget, but a user reading the hint would think otherwise and pre-emptively split it. Code: - Hint string ends with `Note: '<' and '>' are HTML-escaped and counted in their escaped form (4 bytes each).` (was: included `&` and "4-5 bytes") - Inline comment above the budget check now matches: `escapeCommentText only expands '<' and '>' (each becomes 4 bytes: < / >); '&' is intentionally left as-is.` Tests (regression): - New `300 ampersands accepted (escapeCommentText leaves '&' as-is)` subtest pins that 300 `&` chars stay within budget. Without the fix this also passed (function was always correct), but the hint was lying — the test pins the budget contract loud and clear. - New `TestParseCommentReplyElementsHintMatchesEscape` asserts the hint string itself: must mention `'<' and '>'` / `4 bytes`, must NOT mention `'&'` / `&` / `4-5 bytes`. Catches a future drift if `escapeCommentText` is changed without updating the hint, or vice-versa. The skill md (`skills/lark-drive/references/lark-drive-add-comment.md`) already had the right wording (`每个 < 或 > 占 4 字节`), so it was the in-Go strings that drifted; this commit aligns code with doc.
8aa340e to
8069a39
Compare
…vior
The original PR set a 300-byte per-element pre-flight check, justified
by the empirical pattern "~80 Chinese chars succeeds, ~130 fails". A
fresh round of probing the live `/open-apis/drive/v1/files/{token}/
new_comments` endpoint with a real docx shows that pattern does not
reproduce, and the actual contract is very different:
- 10000 ASCII / 10000 Chinese / 10000 '<' (escaped to 40000 bytes)
in a single text element: all OK
- 10001 of any of the above in a single text element: [1069302]
- 5000 + 5000 across two text elements (total 10000): OK
- 5000 + 5001 across two text elements (total 10001): [1069302]
- 4000 + 4000 + 4000 across three (total 12000): [1069302]
Two consequences:
1. The cap is *10000 runes total across all reply_elements text*, not
300 bytes per element. The old check rejected legitimate input
anywhere from ~100 to 10000 Chinese chars (≈100x too aggressive).
2. The hint that recommended "split the content across multiple
{\"type\":\"text\",\"text\":\"...\"} elements" was actively wrong —
splitting doesn't bypass a total cap. A user told to split a
10001-char message into 5000+5001 hits the same opaque [1069302].
This commit:
- Replaces `maxCommentTextElementBytes = 300` with
`maxCommentTotalRunes = 10000`. The constant's doc comment records
the probe matrix above so future maintainers know how it was
derived.
- Switches the measurement from `len(escapeCommentText(input.Text))`
to `utf8.RuneCountInString(input.Text)`. Server counts raw runes;
byte width and post-escape form are irrelevant. The escape itself
still happens — `<` and `>` still get rendered literally — but it
no longer participates in the length check.
- Tracks a running `totalRunes` across the whole reply_elements array
and bails at the first element that pushes the cumulative total
over the 10000-rune budget, with index reporting that points at the
offending element.
- Rewrites the over-cap hint to (a) name the actual 10000-rune budget,
(b) explicitly say splitting does NOT help, (c) drop the wrong
"comment UI still renders them as one contiguous comment" framing
that implied splitting was a workaround.
- Adds a `TestParseCommentReplyElementsHintForbidsSplitAdvice`
watchdog that fails if any future drift puts the discredited split
advice back into the hint.
Tests: 11 cases on TestParseCommentReplyElementsTextLength covering
single-element boundary (ASCII / Chinese / angle brackets at exactly
10000 and at 10001), multi-element total cap (5000+5000 OK, 5000+5001
rejected with index pointing at element #2), early-element-overshoot
indexing (first element at 10001 reports index #1, not the trailing
element), and mention_user not double-counting toward the cap.
Skill md updated: removes the 300-byte / "split into multiple
elements" advice; documents the 10000-rune total cap with a note that
the schema currently advertises 1-1000 chars and is out of date,
plus a procedure for re-probing if the server-side limit ever moves.
Manual API verification: rebuilt binary and posted comments at
boundary lengths — all OK cases (100 / 5000 / 10000 chars, 5000+5000
split) accepted by server; over-cap cases (10001 / 10100 single, and
5000+5001 split) rejected by the new pre-flight before reaching the
network.
87fed6f to
0ef4749
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@shortcuts/drive/drive_add_comment.go`:
- Around line 653-658: The error and hint returned from the validation branch
using output.ErrWithHint("text_too_long", ...) incorrectly state the cap applies
across "ALL reply_elements" even though the validator only sums elements with
type:"text"; update both the main message (the fmt.Sprintf that mentions "across
ALL reply_elements") and the hint (the fmt.Sprintf that tells the user about the
cap and splitting) to explicitly state the limit applies to text elements (e.g.,
"type=text" or "text elements") and reference the same maxCommentTotalRunes
variable and existing context variables (totalRunes, index, runes) so the
message remains accurate and consistent with the validator behavior in
drive_add_comment.go.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f0816e71-9e7e-49b2-95f5-92530af28db3
📒 Files selected for processing (3)
shortcuts/drive/drive_add_comment.goshortcuts/drive/drive_add_comment_test.goskills/lark-drive/references/lark-drive-add-comment.md
🚧 Files skipped from review as they are similar to previous changes (1)
- skills/lark-drive/references/lark-drive-add-comment.md
Summary
The open-platform comment endpoint
/open-apis/drive/v1/files/{token}/new_commentsrejects oversized payloads with an opaque[1069302] Invalid or missing parameters— no field name, no length information, no indication that length is even the cause. Callers (especially agents) end up binary-searching the content to figure out what hit the cap.This PR adds a pre-flight check inside
parseCommentReplyElementssodrive +add-commentrejects over-cap input client-side with a structured, actionable error before any network round-trip.Behavior
type=textelements in--contentmust be ≤ 10000.<chars all sit at the limit, even though they encode to 10000 / 30000 / 40000 bytes respectively.mention_user(open_id) andlink(URL) don't count toward the budget.text_too_longExitError naming the offending element index and the running total at that point. Hint walks the caller through the fix.Empirical basis
Probed via
drive file.comments create_v2against a fresh docx, bypassing the shortcut so the server's actual responses are visible:[1069302][1069302]<chars (would-be 40000 B if escaped)&chars[1069302][1069302]Two takeaways: (1) the cap is on total runes, not per-element bytes; (2) splitting doesn't bypass the cap, so the previous "split into multiple elements" advice was actively misleading and is no longer in the code, the tests, or the skill md.
The schema doc currently advertises "1-1000 characters" for
textelements. The live API accepts 10× that. The discrepancy is documented in a comment onmaxCommentTotalRunes; if the schema is later corrected to 10000 (or any other value), re-probe withdrive file.comments create_v2and update the constant.Changes
shortcuts/drive/drive_add_comment.go—parseCommentReplyElementsnow tracks a running rune total across reply_elements and rejects at the first element pushing the cumulative count past 10000. Counts the raw input viautf8.RuneCountInString; escape (<→<,>→>) still happens for rendering but no longer participates in the length check. ThemaxCommentTotalRunes = 10000constant carries an inline doc comment with the probe matrix.shortcuts/drive/drive_add_comment_test.go—TestParseCommentReplyElementsTextLengthrewritten with 11 cases: single-element boundaries (ASCII / Chinese / angle brackets at exactly 10000 and 10001), multi-element total-cap (5000+5000 OK, 5000+5001 rejected with the rejection index pointing at element readme 优化建议 #2), early-element overshoot (first element at 10001 reports index Can I join the meeting via CLI? #1, not the trailing element), andmention_usernot double-counting. NewTestParseCommentReplyElementsHintForbidsSplitAdvicewatchdog fails CI if anyone re-introduces the discredited "split into multiple text elements" advice into the hint.skills/lark-drive/references/lark-drive-add-comment.md— replaces the old "300-byte per-element / split as workaround" bullets with two short ones describing the 10000-rune total cap and themention_user/linkexemption.Test Plan
go build ./...cleango vet ./...cleango test ./shortcuts/drive/... -count=1passes; race-mode pass cleangolangci-lint run --new-from-rev=origin/main— 0 issuesnode scripts/skill-format-check/index.js skillspassesgo test $(go list ./... | grep -v cli_e2e) -count=1) — all packages green