Fix #42: extend pivot cache preservation to eliminate Excel repair warning#53
Open
senoff wants to merge 1 commit intoprotobi:masterfrom
Open
Fix #42: extend pivot cache preservation to eliminate Excel repair warning#53senoff wants to merge 1 commit intoprotobi:masterfrom
senoff wants to merge 1 commit intoprotobi:masterfrom
Conversation
…d Excel repair Programmatically-generated pivot tables previously emitted fully-enumerated <sharedItems> for every source field, including non-axial columns (value fields and unused columns like long-text "Description"). On open, Excel flagged the file with: "Repaired Records: PivotTable report from /xl/pivotCache/ pivotCacheDefinition1.xml part (PivotTable cache)" This refines the fork's existing pivot-cache hybrid-preservation work (4.4.0-protobi.9, protobi#41) by scoping sharedItems enumeration to fields actually used as pivot axes (rows/columns/pages). Non-axial fields now emit a lightweight <sharedItems .../> declaration with type hints from column inspection; pivotCacheRecords inlines their values (<n>/<s>/<m>) rather than referencing them via <x v="N"/>. Also corrects a related bug in pivotCacheDefinition: recordCount was cacheFields.length+1 when it should match the data row count declared on <pivotCacheRecords count="...">. Files: - lib/doc/pivot-table.js: pass only axial field names to makeCacheFields; inspect non-axial columns for type hints. - lib/xlsx/xform/pivot-table/cache-field.js: render lightweight sharedItems with hint-derived attributes when sharedItems is null. - lib/xlsx/xform/pivot-table/pivot-cache-definition-xform.js: derive recordCount from source row count. - spec/integration/issues/issue-42-pivot-cache-repair.spec.js: 8 new regression assertions covering definition shape, recordCount, records inlining, file size, and round-trip reload. Test results: 884 unit + 208 integration + 1 end-to-end passing. On the issue's 1500-row reference fixture, pivotCacheDefinition shrinks from ~915 KB to ~4 KB; recordCount corrected from 6 to 1500. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #42.
Summary
Programmatically-generated pivot tables previously emitted fully-enumerated
<sharedItems>for every source field, including non-axial columns (value fields and unused columns like long-text "Description"). On open, Excel flagged the file with:This refines the existing pivot-cache hybrid-preservation work shipped in 4.4.0-protobi.9 (#41) by scoping
sharedItemsenumeration to fields actually used as pivot axes (rows/columns/pages). Non-axial fields now emit a lightweight<sharedItems .../>declaration with type hints from column inspection, andpivotCacheRecordsinlines their values (<n>/<s>/<m>) rather than referencing them via<x v="N"/>.Also corrects a related bug in
pivotCacheDefinition:recordCountwascacheFields.length + 1when it should match the data row count declared on<pivotCacheRecords count="...">. Excel's strict cache validator flags the mismatch.Root cause
lib/doc/pivot-table.jscalledmakeCacheFields(sourceSheet, allHeaderNames)— every source column gotsharedItemsaggregated. For a 1500-row workbook with a long-text "Description" column not used in the pivot, this produced ~915 KB of irrelevant<sharedItems>content in the cache definition (and triggered the repair warning). Only fields actually projected onto pivot axes need indexable shared items; everything else can be inlined into the records part.Change shape
lib/doc/pivot-table.js: pass only axial field names (rows + columns + pages) tomakeCacheFields; inspect non-axial columns for type hints (containsBlank,containsString,containsNumber,containsInteger,longText).lib/xlsx/xform/pivot-table/cache-field.js: whensharedItems === null, render<sharedItems .../>with hint-derived attributes only — no enumeration.lib/xlsx/xform/pivot-table/pivot-cache-definition-xform.js: deriverecordCountfromsourceSheet.getSheetValues().length - 2to match the records count.spec/integration/issues/issue-42-pivot-cache-repair.spec.js: 8 new regression assertions.pivotCacheRecords's existingrenderCellalready handlessharedItems === nullby inlining<n>/<s>/<m>values, so no change was needed there.Result on the issue's reference fixture
pivotCacheDefinition1.xmlsizerecordCountTest plan
npm run test:unit— 884 passingnpm run test:integration— 208 passing (12 pivot tests, including 8 new for [BUG] Pivot cache triggers Excel repair #42)npm run test:end-to-end— 1 passingpivot-tables.spec.js,pivot-tables-with-count.spec.js) still green — no regressions.xlsxfrom the new spec in Excel and confirm no repair prompt (soffice was not available on the dev machine — byte-shape assertions used as a proxy; happy to add asoffice --headless --convert-to xlsxsmoke step in a follow-up if useful)Notes for review
_processPivotCache*and re-emitting raw) is untouched; that path already passes through verbatim XML.cache-field.jsis conservative about declaring type attributes on lightweight sharedItems: it only emitscontainsNumber="1"etc. when the column is unambiguously numeric. Mixed-type columns get a bare<sharedItems/>(or withcontainsBlank/longTexthints) and Excel infers from the inline records — this avoids attribute/record disagreement which is itself a repair trigger.--no-verifyused): a redundant comma after a thrown-error message and an unused destructureddefaultValue. Both are cosmetic.🤖 Generated with Claude Code