Skip to content

fix(dpp): remove erroneous keywords field from document-meta schema and fix contract keywords docs#3471

Open
thephez wants to merge 5 commits intov3.1-devfrom
docs/keyword-clarification
Open

fix(dpp): remove erroneous keywords field from document-meta schema and fix contract keywords docs#3471
thephez wants to merge 5 commits intov3.1-devfrom
docs/keyword-clarification

Conversation

@thephez
Copy link
Copy Markdown
Collaborator

@thephez thephez commented Apr 9, 2026

Issue being fixed or feature implemented

#2523 accidentally added a keywords field to the document type meta schema. The correct location is contract-level (DataContractV1.keywords). Neither DocumentTypeV0 nor DocumentTypeV1 has a keywords field, and try_from_schema never reads it — so document-type keywords were silently accepted and discarded.

Protocol v12 set document_type_schema: 1 (v4.rs:42), making the v1 meta schema the active validation path. The errant field needs to be removed from the v1 schema and its v12-migration allowlist in lockstep.

Separately, the DataContractV1 doc comment incorrectly stated the contract-level keyword limit as 20 when the validators and TooManyKeywordsError both enforce 50.

What was done?

  • Removed the keywords block from packages/rs-dpp/schema/meta_schemas/document/v1/document-meta.json (the active meta schema)
  • Removed "keywords" from ALLOWED_DOCUMENT_SCHEMA_V1_PROPERTIES in allowed_top_level_properties.rs — required to keep the allowlist in sync with the schema (enforced by the allowlist_matches_v1_meta_schema_properties parity test)
  • Left the v0 meta schema unchanged — it's a frozen historical schema and not on the active validation path
  • Fixed the doc comment on DataContractV1 to state the correct contract-level max of 50 keywords
  • Clarified inline validator comments to say "contract keywords"

How Has This Been Tested?

  • cargo test -p dpp — all 2462 dpp lib tests pass, including allowlist_matches_v1_meta_schema_properties
  • cargo fmt --all clean
  • No new tests added — change is a schema + allowlist removal backed by existing parity test coverage

Breaking Changes

Document-type keywords is no longer accepted by v1 meta-schema validation. Any existing stored contract that happens to have keywords on a document type will have it stripped during the v12 migration via strip_unknown_document_schema_properties. This is safe — keywords on document types has never been read by Rust, so stripping it cannot affect runtime semantics. Contract-level keywords (DataContractV1.keywords) are unaffected.

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added or updated relevant unit/integration/functional/e2e tests
  • I have added "!" to the title and described breaking changes in the corresponding section if my code contains any
  • I have made corresponding changes to the documentation if needed

For repository code-owners and collaborators only

  • I have assigned this pull request to a milestone

Summary by CodeRabbit

Release Notes

  • Documentation

    • Clarified keyword field constraints in validation documentation.
    • Updated maximum keywords limit from 20 to 50 in field documentation.
  • Changes

    • Removed the keywords property from document schema definitions. Documents will no longer support keyword fields.

@thephez thephez requested a review from QuantumExplorer as a code owner April 9, 2026 20:42
@github-actions github-actions bot added this to the v3.1.0 milestone Apr 9, 2026
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 9, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0e7d8275-b8cc-49ed-b7fd-76e71bcdec0c

📥 Commits

Reviewing files that changed from the base of the PR and between 2b3585f and 85081a8.

📒 Files selected for processing (5)
  • packages/rs-dpp/schema/meta_schemas/document/v1/document-meta.json
  • packages/rs-dpp/src/data_contract/document_type/schema/allowed_top_level_properties.rs
  • packages/rs-dpp/src/data_contract/methods/validate_update/v0/mod.rs
  • packages/rs-dpp/src/data_contract/v1/data_contract.rs
  • packages/rs-drive-abci/src/execution/validation/state_transition/state_transitions/data_contract_create/basic_structure/v0/mod.rs
💤 Files with no reviewable changes (2)
  • packages/rs-dpp/src/data_contract/document_type/schema/allowed_top_level_properties.rs
  • packages/rs-dpp/schema/meta_schemas/document/v1/document-meta.json
✅ Files skipped from review due to trivial changes (3)
  • packages/rs-dpp/src/data_contract/v1/data_contract.rs
  • packages/rs-drive-abci/src/execution/validation/state_transition/state_transitions/data_contract_create/basic_structure/v0/mod.rs
  • packages/rs-dpp/src/data_contract/methods/validate_update/v0/mod.rs

📝 Walkthrough

Walkthrough

Documentation updates clarifying keywords constraints for data contracts (increasing stated maximum from 20 to 50), combined with removal of the keywords field from document meta-schema and its corresponding allowlist entry to prevent documents from accepting keywords.

Changes

Cohort / File(s) Summary
Documentation & Comment Updates
packages/rs-dpp/src/data_contract/.../validate_update/v0/mod.rs, packages/rs-dpp/src/data_contract/v1/data_contract.rs, packages/rs-drive-abci/src/execution/validation/.../data_contract_create/basic_structure/v0/mod.rs
Updated inline comments and documentation: clarified "contract keywords" terminology and increased documented maximum keywords limit from 20 to 50 for data contract validation.
Schema & Allowlist Removal
packages/rs-dpp/schema/meta_schemas/document/v1/document-meta.json, packages/rs-dpp/src/data_contract/document_type/schema/allowed_top_level_properties.rs
Removed keywords property definition from document meta-schema and removed "keywords" entry from the v1 document schema properties allowlist, preventing documents from including keywords fields.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related issues

Poem

🐰 Keywords once scattered in documents free,
Now live in contracts, where they should be!
From twenty to fifty, the contracts now scale,
While schemas stay clean—a harmonious tale! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main changes: removing the erroneous keywords field from document-meta schema and fixing contract keywords documentation.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch docs/keyword-clarification

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@thephez thephez force-pushed the docs/keyword-clarification branch from 4390ccf to b98d374 Compare April 9, 2026 20:45
@thepastaclaw
Copy link
Copy Markdown
Collaborator

thepastaclaw commented Apr 9, 2026

Review Gate

Commit: 85081a85

  • Debounce: 16m ago (need 30m)

  • CI checks: checks still running (2 pending)

  • CodeRabbit review: comment found

  • Off-peak hours: peak window (5am-11am PT) — currently 08:00 AM PT Thursday

  • Run review now (check to override)

@QuantumExplorer
Copy link
Copy Markdown
Member

This might break testnet/mainnet (we will need to verify it doesn't)

Copy link
Copy Markdown
Collaborator

@thepastaclaw thepastaclaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Clean, well-scoped PR that removes an erroneous keywords property from the document-meta JSON schema and fixes doc comments. The one substantive finding — modifying a runtime-compiled schema in place without protocol version gating — is technically valid but low practical risk. The doc comment nitpick is dropped because the author already considered and explicitly reverted a more verbose comment.

Reviewed commit: b98d374

🟡 1 suggestion(s)

1 additional finding

🟡 suggestion: In-place modification of runtime-compiled v0 meta-schema

packages/rs-dpp/schema/meta_schemas/document/v0/document-meta.json (lines 591-594)

Verified: DOCUMENT_META_JSON_V0 is loaded via include_str! at meta_validators/mod.rs:39 and compiled into DOCUMENT_META_SCHEMA_V0 at line 89. Both try_from_schema/v0 (line 136) and try_from_schema/v1 (line 155) use this single compiled schema to validate document type definitions at runtime.

The root meta-schema has no additionalProperties: false — the additionalProperties at line 591 is a property definition (constraining what document types must set), not a root-level restriction. So unknown properties were already silently accepted. The removed keywords definition only constrained malformed keyword values (wrong item types, >20 items, etc.).

During a rolling upgrade, a document type with malformed keywords (e.g. non-string items) would be rejected by old binaries but accepted by new ones — a theoretical consensus divergence. Practical risk is minimal since no code reads keywords from document types and the field was always discarded. Only one version of this schema exists (no v1), so there's no version-gating mechanism to leverage.

If the team's upgrade process guarantees atomic binary upgrades (all nodes update before processing new blocks), this is a non-issue. Otherwise, consider whether this warrants a versioned schema.

🤖 Prompt for all review comments with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `packages/rs-dpp/schema/meta_schemas/document/v0/document-meta.json`:
- [SUGGESTION] lines 591-594: In-place modification of runtime-compiled v0 meta-schema
  Verified: `DOCUMENT_META_JSON_V0` is loaded via `include_str!` at `meta_validators/mod.rs:39` and compiled into `DOCUMENT_META_SCHEMA_V0` at line 89. Both `try_from_schema/v0` (line 136) and `try_from_schema/v1` (line 155) use this single compiled schema to validate document type definitions at runtime.

The root meta-schema has no `additionalProperties: false` — the `additionalProperties` at line 591 is a *property definition* (constraining what document types must set), not a root-level restriction. So unknown properties were already silently accepted. The removed `keywords` definition only constrained *malformed* keyword values (wrong item types, >20 items, etc.).

During a rolling upgrade, a document type with malformed `keywords` (e.g. non-string items) would be rejected by old binaries but accepted by new ones — a theoretical consensus divergence. Practical risk is minimal since no code reads keywords from document types and the field was always discarded. Only one version of this schema exists (no v1), so there's no version-gating mechanism to leverage.

If the team's upgrade process guarantees atomic binary upgrades (all nodes update before processing new blocks), this is a non-issue. Otherwise, consider whether this warrants a versioned schema.

@thephez
Copy link
Copy Markdown
Collaborator Author

thephez commented Apr 13, 2026

Maybe good to pair this change with #3475 in the same release? Prior to that PR, keywords (or any additional property) would be accepted, but that PR makes things more strict.

If AI correctly stated that extra document schema entries were actually just silently dropped, there shouldn't be breaking issues. But I'm not familiar enough with the inner workings to be sure on that so it should be reviewed by someone more knowledgeable.

Copy link
Copy Markdown
Collaborator

@thepastaclaw thepastaclaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

The comment/doc cleanups are correct, but the schema fix is incomplete. This PR only removes the erroneous document-type keywords field from the v0 document meta-schema, while the current protocol path still validates document types against the v1 meta-schema that retains the same field.

Reviewed commit: 2b3585f

🔴 1 blocking

1 additional finding

🔴 blocking: The live document meta-schema still accepts document-level `keywords`

packages/rs-dpp/schema/meta_schemas/document/v1/document-meta.json (lines 591-600)

Removing keywords from document/v0/document-meta.json does not fix the active validation path. DocumentType::try_from_schema() selects DOCUMENT_META_SCHEMA_V1 whenever document_type_schema == 1, and that is the version used for the current protocol line, so contracts created today still validate successfully with document-level keywords because packages/rs-dpp/schema/meta_schemas/document/v1/document-meta.json still defines the field. The v12 migration allowlist in packages/rs-dpp/src/data_contract/document_type/schema/allowed_top_level_properties.rs also still treats keywords as a valid document-type property, so the cleanup needs to update the v1 schema and its synced allowlist together.

🤖 Prompt for all review comments with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `packages/rs-dpp/schema/meta_schemas/document/v1/document-meta.json`:
- [BLOCKING] lines 591-600: The live document meta-schema still accepts document-level `keywords`
  Removing `keywords` from `document/v0/document-meta.json` does not fix the active validation path. `DocumentType::try_from_schema()` selects `DOCUMENT_META_SCHEMA_V1` whenever `document_type_schema == 1`, and that is the version used for the current protocol line, so contracts created today still validate successfully with document-level `keywords` because `packages/rs-dpp/schema/meta_schemas/document/v1/document-meta.json` still defines the field. The v12 migration allowlist in `packages/rs-dpp/src/data_contract/document_type/schema/allowed_top_level_properties.rs` also still treats `keywords` as a valid document-type property, so the cleanup needs to update the v1 schema and its synced allowlist together.

@thephez thephez marked this pull request as draft April 16, 2026 13:01
thephez and others added 3 commits April 16, 2026 09:02
… comments

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The field was added by mistake in #2523 — keywords belong to the
contract level (DataContractV1), not document types. DocumentTypeV0/V1
have no keywords field and try_from_schema never reads it, so the
schema entry was silently accepted then discarded.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@thephez thephez force-pushed the docs/keyword-clarification branch from 2b3585f to 6d234e9 Compare April 16, 2026 13:02
thephez and others added 2 commits April 16, 2026 10:41
v0 is a frozen historical meta schema no longer on the active
validation path (protocol v12 switched to v1). Removing it from v0
had no runtime effect but risked diverging from what was validated
when legacy contracts were created. The real fix belongs on the v1
schema, which is applied in a follow-up commit.

Reverts the schema portion of 9e56d4d.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
PR #2523 mistakenly placed keywords in the document-type meta schema
(correct location is contract-level, on DataContractV1). Neither
DocumentTypeV0 nor DocumentTypeV1 has a keywords field, and
try_from_schema never reads it.

v1 is the active meta schema (protocol v12 sets document_type_schema
to 1), so document-type keywords were still silently accepted in
new contracts after the earlier v0 fix. Removing the field from the
v1 schema and its v12-migration allowlist in lockstep — the parity
test (allowlist_matches_v1_meta_schema_properties) requires both to
change together.

Any existing stored contract that happens to have document-type
keywords will be cleanly stripped by the v12 migration in
strip_unknown_document_schema_properties; safe because keywords on
document types has never been read by Rust.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@thephez thephez marked this pull request as ready for review April 16, 2026 14:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants