
fix(dev-env): recompute mydumper section sizes after search-replace #2872

Open
WRasada wants to merge 1 commit into trunk from fix/dev-env-mydumper-search-replace-sizes



@WRasada WRasada commented Jun 5, 2026

Contributor

Problem

vip dev-env import sql --search-replace=... on a MyDumper-format dump imports nothing. myloader floods the output with:

** Message: Different file size in metadata.header. Should be: -1 | Written: <n>. But continuing

and exits "successfully" with zero tables restored. vip dev-env sync sql is equally affected for multisite environments (it always search-replaces site URLs through the same code path).

Root cause

A MyDumper stream is a sequence of -- <filename> <size> section headers, each followed by that file's content. Search-replace changes content lengths, so fixMyDumperTransform() (the old implementation on trunk) rewrote every header size to -1.

myloader parses the size with g_ascii_strtoull(), so -1 wraps to ULLONG_MAX, and uses the parsed value to distinguish a real header from header-looking content: while fewer bytes than the declared size have been written, a header line is treated as content (myloader_stream.c#L291-L309). Against ULLONG_MAX that condition is always true, so every header after the first is swallowed into metadata.header and no sections are ever processed.
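To make the wrap-around concrete, here is a minimal sketch in TypeScript. It only models the behaviour described above; `parseUnsigned64` and `headerIsTreatedAsContent` are hypothetical names for illustration and are not taken from myloader or this PR.

```typescript
// Emulates parsing a decimal string into an unsigned 64-bit integer with
// strtoull-style wrap-around: negative input is reduced modulo 2^64.
function parseUnsigned64(text: string): bigint {
  return BigInt.asUintN(64, BigInt(text));
}

// Simplified model of the check described above (not myloader's actual code):
// a header-looking line counts as content while fewer bytes than the
// declared size have been written.
function headerIsTreatedAsContent(bytesWritten: bigint, declaredSize: bigint): boolean {
  return bytesWritten < declaredSize;
}

const declared = parseUnsigned64("-1");
console.log(declared.toString()); // 18446744073709551615

// With a declared size of ULLONG_MAX, no realistic byte count ends the section.
console.log(headerIsTreatedAsContent(BigInt(1024), declared)); // true

// With a real size, the section ends once enough bytes were written.
console.log(headerIsTreatedAsContent(BigInt(1024), parseUnsigned64("512"))); // false
```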

This is the same failure that previously forced a mydumper downgrade in vip-container-images#1116 (refs PLTFRM-22, closed without a durable fix); the 0.21.3 upgrade re-exposed it.

The old transform also processed raw chunks rather than lines, so headers straddling a chunk boundary were silently missed — fixed here as well.
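The chunk-boundary problem can be reproduced with a small model. The sketch below is illustrative only: the `HEADER` pattern and both counting functions are assumptions made for this example, not the code on trunk or in this PR.

```typescript
// Assumed header shape for illustration: "-- <filename> <size>".
const HEADER = /^-- (\S+) (-?\d+)$/;

// Per-chunk scanning: a header split across two chunks matches in neither.
function countHeadersPerChunk(chunks: string[]): number {
  let count = 0;
  for (const chunk of chunks) {
    for (const line of chunk.split("\n")) {
      if (HEADER.test(line)) count++;
    }
  }
  return count;
}

// Line-buffered scanning: the incomplete tail of each chunk is carried over
// and prepended to the next chunk, so only complete lines are tested.
function countHeadersLineBuffered(chunks: string[]): number {
  let count = 0;
  let pending = "";
  for (const chunk of chunks) {
    const lines = (pending + chunk).split("\n");
    pending = lines.pop() ?? "";
    for (const line of lines) {
      if (HEADER.test(line)) count++;
    }
  }
  if (HEADER.test(pending)) count++;
  return count;
}

const stream = "-- db.t1-schema.sql 5\nabcd\n\n-- db.t1.sql 3\nxy\n";
// Split in the middle of the second header ("--" | " db.t1.sql 3").
const chunks = [stream.slice(0, 30), stream.slice(30)];

console.log(countHeadersPerChunk(chunks)); // 1
console.log(countHeadersLineBuffered(chunks)); // 2
```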

Fix

MyDumperSectionSizeTransform (src/lib/database.ts) replaces the -1 hack:

  1. Line-buffered parsing (headers can never be split across chunk boundaries)
  2. Emits each header with a fixed-width 20-digit zero-padded size placeholder
  3. Counts each section's actual post-replacement bytes as they stream through and records the placeholder's byte offset
  4. After the pipeline finishes, patchMyDumperSectionSizes() overwrites the placeholders in place (same byte length → offsets stay valid)
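The four steps above can be sketched in memory as follows. This is a simplified illustration, not the code in src/lib/database.ts: `emitWithPlaceholders`, `patchSizes`, and the `HEADER` pattern are hypothetical, the output is held in a byte array rather than a file, and any line matching the pattern is assumed to be a header (the real transform's handling of header-lookalike content is not modelled).

```typescript
const encoder = new TextEncoder();
const decoder = new TextDecoder();
const HEADER = /^-- (\S+) -?\d+$/; // assumed header shape
const WIDTH = 20; // fixed placeholder width, as in step 2

interface Section {
  placeholderOffset: number; // byte offset of the size field in the output
  size: number; // content bytes counted so far
}

function emitWithPlaceholders(chunks: string[]): { bytes: Uint8Array; sections: Section[] } {
  const parts: string[] = [];
  const sections: Section[] = [];
  let written = 0; // bytes emitted so far
  let pending = ""; // incomplete trailing line carried between chunks

  const emit = (text: string): number => {
    parts.push(text);
    const n = encoder.encode(text).length;
    written += n;
    return n;
  };

  const handleLine = (line: string, newline: string): void => {
    const match = HEADER.exec(line);
    const current = sections[sections.length - 1];
    if (match) {
      // The newline separating sections is not part of the previous section.
      if (current && current.size > 0) current.size -= 1;
      emit(`-- ${match[1]} `);
      sections.push({ placeholderOffset: written, size: 0 });
      emit("0".repeat(WIDTH) + newline);
    } else {
      const n = emit(line + newline);
      if (current) current.size += n;
    }
  };

  for (const chunk of chunks) {
    const lines = (pending + chunk).split("\n");
    pending = lines.pop() ?? "";
    for (const line of lines) handleLine(line, "\n");
  }
  if (pending !== "") handleLine(pending, "");
  return { bytes: encoder.encode(parts.join("")), sections };
}

// Step 4: overwrite each placeholder in place; lengths match, so offsets hold.
function patchSizes(bytes: Uint8Array, sections: Section[]): void {
  for (const section of sections) {
    const digits = String(section.size).padStart(WIDTH, "0");
    bytes.set(encoder.encode(digits), section.placeholderOffset);
  }
}

// Stale sizes (999) stand in for headers invalidated by a replacement.
const input = "-- db.t1.sql 999\nhello\n\n-- db.t2.sql 999\nab\n";
const result = emitWithPlaceholders([input.slice(0, 7), input.slice(7)]);
patchSizes(result.bytes, result.sections);
console.log(decoder.decode(result.bytes));
```

In this example the two sections are patched to sizes 6 and 3, matching `hello\n` and `ab\n`.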

Size convention verified against real mydumper output: a section's size counts its content bytes including the content's own trailing newline, excluding the single separator newline before the next header; the final section runs to end of stream.
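The convention can be expressed as a small checker that walks a stream using the declared sizes. This is a sketch under assumptions: `checkSectionSizes` and the `HEADER` pattern are hypothetical, and sizes are parsed as JavaScript numbers, which is only exact below 2^53.

```typescript
const utf8 = new TextEncoder();
const text = new TextDecoder();
const HEADER = /^-- (\S+) (\d+)$/; // assumed header shape
const NEWLINE = 0x0a;

// Returns null when every declared size matches the bytes that follow,
// otherwise a short description of the first mismatch.
function checkSectionSizes(stream: Uint8Array): string | null {
  let pos = 0;
  for (;;) {
    const lineEnd = stream.indexOf(NEWLINE, pos);
    if (lineEnd === -1) return `expected a section header at byte ${pos}`;
    const match = HEADER.exec(text.decode(stream.subarray(pos, lineEnd)));
    if (!match) return `expected a section header at byte ${pos}`;

    // Size counts content bytes, including the content's own trailing newline.
    const contentEnd = lineEnd + 1 + Number(match[2]);
    if (contentEnd > stream.length) return `${match[1]}: size runs past end of stream`;
    // The final section runs to the end of the stream.
    if (contentEnd === stream.length) return null;
    // Otherwise exactly one separator newline precedes the next header.
    if (stream[contentEnd] !== NEWLINE) return `${match[1]}: no separator after declared size`;
    pos = contentEnd + 1;
  }
}

const good = utf8.encode("-- a.sql 6\nhello\n\n-- b.sql 3\nab\n");
const stale = utf8.encode("-- a.sql 5\nhello!\n\n-- b.sql 3\nab\n");
const multibyte = utf8.encode("-- a.sql 3\né\n"); // "é" is 2 bytes in UTF-8

console.log(checkSectionSizes(good)); // null
console.log(checkSectionSizes(stale)); // a.sql: no separator after declared size
console.log(checkSectionSizes(multibyte)); // null
```

The multibyte case shows why sizes have to be counted in bytes rather than characters.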

Wired into both consumers: searchAndReplace() (src/lib/search-and-replace.ts) and DevEnvSyncSQLCommand.runSearchReplace() (src/commands/dev-env-sync-sql.ts).

Also adds a guard rejecting compressed (.gz) input to searchAndReplace(): the replacement operates on raw bytes, so a compressed file previously passed through with no replacements applied and no indication of failure. It now errors with instructions to decompress first. (The dev-env import path is unaffected: it already decompresses before search-replace.)
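A guard of this kind might look like the sketch below. How the PR actually detects compressed input is not stated here, so both checks (the .gz extension and the gzip magic bytes 0x1f 0x8b) and the names `looksGzipped` and `assertNotCompressed` are assumptions for illustration.

```typescript
// Hypothetical detection: either the file name ends in .gz or the content
// starts with the two-byte gzip magic number.
function looksGzipped(fileName: string, head: Uint8Array): boolean {
  const hasGzExtension = fileName.toLowerCase().endsWith(".gz");
  const hasGzipMagic = head.length >= 2 && head[0] === 0x1f && head[1] === 0x8b;
  return hasGzExtension || hasGzipMagic;
}

// Fails loudly instead of passing compressed bytes through unmodified.
function assertNotCompressed(fileName: string, head: Uint8Array): void {
  if (looksGzipped(fileName, head)) {
    throw new Error(
      `${fileName} appears to be gzip-compressed; decompress it before running search-replace.`
    );
  }
}

const plainHead = new TextEncoder().encode("-- metadata.header 42\n");
const gzipHead = new Uint8Array([0x1f, 0x8b, 0x08, 0x00]);

assertNotCompressed("dump.sql", plainHead); // passes silently

try {
  assertNotCompressed("dump.sql", gzipHead);
} catch (error) {
  console.log((error as Error).message);
}
```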

Testing

  • New unit tests (__tests__/lib/database.js): stale-size recomputation, byte-exact content preservation, chunk-boundary splits at widths 1/3/7/64, header-lookalike content lines, final section without trailing newline
  • New test: compressed input is rejected with a clear error
  • New end-to-end test (__tests__/lib/search-and-replace.js): full searchAndReplace() over a MyDumper fixture with a length-changing replacement; asserts every header size matches the content that follows
  • Updated __fixtures__/dev-env-e2e/mydumper-detection.expected.sql — the old fixture asserted the -1 output as expected behavior; new sizes were derived with an independent script and match the transform's output
  • Oracle test against a real 30GB production-scale dump (201,724 sections): recomputed sizes byte-identical to mydumper's originals on unmodified content; ~280MB/s with adversarial 7,777-byte chunking; the size patch pass takes ~2s for 201k headers
  • Full import with --search-replace against a 201k-table dump completes and restores all tables (previously: zero)

Compatibility

  • Non-MyDumper (mysqldump) dumps: unaffected — the transform is only attached for MyDumper inputs
  • Imports without --search-replace: unaffected — the transform never runs

Changelog Description

Fixed

  • Dev-env: Fixed --search-replace producing an import that silently restores no tables for MyDumper-format SQL backups (affects vip dev-env import sql and vip dev-env sync sql)
  • Fixed vip search-replace writing MyDumper files that could not be imported with myloader (section sizes are now recomputed when output goes to a file)
  • Fixed search-replace silently applying no replacements when given a compressed (.gz) file; it now errors and asks for the file to be decompressed first

Related

🤖 Generated with Claude Code


github-actions Bot commented Jun 5, 2026

Contributor

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

Comment thread src/lib/database.ts Dismissed
The transform rewrote section header sizes to -1, which myloader >= 0.20
parses as ULLONG_MAX and then swallows every subsequent header as file
content, importing nothing. Emit fixed-width placeholders while counting
each section's post-replacement bytes, then patch the real sizes into the
output file. Also fixes header detection across chunk boundaries.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
WRasada force-pushed the fix/dev-env-mydumper-search-replace-sizes branch from cf6ef16 to 9183d6d on June 5, 2026 04:43

sonarqubecloud Bot commented Jun 5, 2026

