fix: consistent xsd:decimal/xsd:integer datatype across index and novelty (#1329) #1386
Open
bplatz wants to merge 3 commits into
Arena-served xsd:decimal rendered with an empty `@type` while the same
value in novelty rendered as a bare string. NUM_BIG_OVERFLOW names no
single datatype from o_type alone (its arena holds both overflow
xsd:integer/BigInt and xsd:decimal/BigDecimal), so resolve_datatype_sid
returned None and callers fell back to Sid::new(0, ""); novelty copies
still carry o_type=XSD_DECIMAL and resolved correctly.
Add BinaryIndexStore::resolve_datatype_sid_for_value, which derives the
datatype from the decoded FlakeValue (Decimal->xsd:decimal,
BigInt->xsd:integer) when o_type cannot name it. Wire it into every
flake/binding builder that has the decoded value in scope: binary_range,
binary_history, object_binding, fast_group_count_firsts, block_fetch.
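
A minimal sketch of the value-based fallback described above. The types here (`Sid`, `OType`, `FlakeValue`), the namespace code, and the free-function form are simplified stand-ins assumed for illustration; the real method lives on `BinaryIndexStore` and its exact signature is not shown in this PR text.

```rust
// Hypothetical, simplified stand-ins for the real types.
#[derive(Debug, Clone, PartialEq)]
struct Sid {
    namespace_code: u16,
    name: String,
}

const XSD_NS: u16 = 2; // assumed namespace code for xsd:

fn xsd(name: &str) -> Sid {
    Sid { namespace_code: XSD_NS, name: name.to_string() }
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum OType {
    XsdString,
    XsdDecimal,
    NumBigOverflow, // arena holds both BigInt and BigDecimal values
}

#[derive(Debug, Clone, PartialEq)]
enum FlakeValue {
    Str(String),
    BigInt(String),  // real code would hold an arbitrary-precision integer
    Decimal(String), // real code would hold an arbitrary-precision decimal
}

// o_type-only resolution: NumBigOverflow cannot name a single datatype.
fn resolve_datatype_sid(o_type: OType) -> Option<Sid> {
    match o_type {
        OType::XsdString => Some(xsd("string")),
        OType::XsdDecimal => Some(xsd("decimal")),
        OType::NumBigOverflow => None,
    }
}

// Prefer o_type; fall back to the decoded value variant only when
// o_type alone is ambiguous.
fn resolve_datatype_sid_for_value(o_type: OType, val: &FlakeValue) -> Option<Sid> {
    resolve_datatype_sid(o_type).or_else(|| match val {
        FlakeValue::Decimal(_) => Some(xsd("decimal")),
        FlakeValue::BigInt(_) => Some(xsd("integer")),
        FlakeValue::Str(_) => None,
    })
}
```

Under these assumptions, a caller that previously wrote `resolve_datatype_sid(o_type).unwrap_or(Sid::new(0, ""))` would pass the decoded value as well, so the empty-`@type` fallback is only reached when neither source can name the datatype.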
Also fix the pre-decode datatype filter in binary_scan: it ran on o_type
alone and dropped indexed big numerics from datatype-constrained patterns
(e.g. {"@value":"?p","@type":"xsd:decimal"}) while novelty copies passed.
It now decodes to disambiguate only in the rare ambiguous case, keeping
the common path decode-free.
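
The "decode only when ambiguous" filter can be sketched roughly as follows. This is not the actual `binary_scan` code: the `Datatype` enum, the simplified `OType` and `FlakeValue`, and the closure-based `decode` parameter are all assumptions chosen to show that the common path never pays for a decode.

```rust
// Hypothetical, simplified stand-ins for the real types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Datatype {
    XsdString,
    XsdInteger,
    XsdDecimal,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum OType {
    XsdString,
    XsdDecimal,
    NumBigOverflow, // ambiguous: BigInt or BigDecimal
}

#[derive(Debug, Clone, PartialEq)]
enum FlakeValue {
    BigInt(String),
    Decimal(String),
    Other,
}

// What o_type can say on its own, without touching the arena.
fn datatype_from_o_type(o_type: OType) -> Option<Datatype> {
    match o_type {
        OType::XsdString => Some(Datatype::XsdString),
        OType::XsdDecimal => Some(Datatype::XsdDecimal),
        OType::NumBigOverflow => None,
    }
}

// `decode` is invoked lazily, and only when o_type is ambiguous.
fn matches_datatype_constraint<F>(o_type: OType, wanted: Datatype, decode: F) -> bool
where
    F: FnOnce() -> FlakeValue,
{
    match datatype_from_o_type(o_type) {
        // Common path: decided from o_type alone, no decode.
        Some(dt) => dt == wanted,
        // Rare path: look at the decoded variant to disambiguate.
        None => match decode() {
            FlakeValue::BigInt(_) => wanted == Datatype::XsdInteger,
            FlakeValue::Decimal(_) => wanted == Datatype::XsdDecimal,
            FlakeValue::Other => false,
        },
    }
}
```

Passing the decode step as a closure is one way to keep the cost off the hot path; the real filter may structure this differently.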
…erics

BigInt (overflow xsd:integer) and BigDecimal share the NUM_BIG arena and the same late-materialized EncodedLit, whose dt_id is hardcoded to decimal. Materializing by dt_id alone mislabeled an indexed > i64 integer as xsd:decimal.

Add FlakeValue::overflow_numeric_datatype_sid, which recovers the datatype from the decoded value variant, and use it when materializing EncodedLit in both the query materializer and the API formatter. resolve_datatype_sid_for_value now delegates to it as well.
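
As a rough illustration of the materialization-time recovery, the sketch below consults the decoded value before falling back to the literal's `dt_id`. `DtId`, the one-field `EncodedLit`, the string-IRI return type, and `datatype_for_dt_id` are assumptions for the example; the real `overflow_numeric_datatype_sid` presumably returns a `Sid`.

```rust
// Hypothetical, simplified stand-ins for the real types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DtId {
    String,
    Decimal,
}

// Late-materialized literal: created before the value is decoded, so the
// NumBig arena can only stamp a fixed dt_id on it.
struct EncodedLit {
    dt_id: DtId,
}

enum FlakeValue {
    Str(String),
    BigInt(String),  // real code would hold an arbitrary-precision integer
    Decimal(String), // real code would hold an arbitrary-precision decimal
}

impl FlakeValue {
    // Returns a datatype only for the two overflow numeric variants.
    fn overflow_numeric_datatype_sid(&self) -> Option<&'static str> {
        match self {
            FlakeValue::BigInt(_) => Some("xsd:integer"),
            FlakeValue::Decimal(_) => Some("xsd:decimal"),
            FlakeValue::Str(_) => None,
        }
    }
}

fn datatype_for_dt_id(dt_id: DtId) -> &'static str {
    match dt_id {
        DtId::String => "xsd:string",
        DtId::Decimal => "xsd:decimal",
    }
}

// Consult the decoded value first; use dt_id only when the value is not
// an overflow numeric.
fn materialized_datatype(lit: &EncodedLit, decoded: &FlakeValue) -> &'static str {
    decoded
        .overflow_numeric_datatype_sid()
        .unwrap_or_else(|| datatype_for_dt_id(lit.dt_id))
}
```

With this ordering, an overflow integer whose `EncodedLit` says `Decimal` still reports `xsd:integer`, while all other literals keep their `dt_id`-derived datatype.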
Summary
Fixes #1329: an `xsd:decimal` served from the binary index rendered with an empty `@type` (`{"@value":"19.99","@type":""}`) while the same value in novelty rendered as a bare string. Also fixes a sibling mislabel where indexed `> i64` integers came back as `xsd:decimal`.

Root cause

`OType::NUM_BIG_OVERFLOW` names no single datatype from `o_type` alone: its arena holds both overflow `xsd:integer` (BigInt) and `xsd:decimal` (BigDecimal). So `resolve_datatype_sid(o_type)` returns `None`, and every flake/binding builder fell back to `Sid::new(0, "")` → empty `@type`. Novelty copies still carry `o_type = XSD_DECIMAL`, so they resolved correctly, hence the index-vs-novelty asymmetry.

A second, related issue: the late-materialized `EncodedLit` for the NumBig arena hardcodes `dt_id = DECIMAL` (it has no decoded value at creation time), so an indexed overflow integer materialized as `xsd:decimal`.

Fix

- `BinaryIndexStore::resolve_datatype_sid_for_value(o_type, &val)` derives the datatype from the decoded `FlakeValue` (`Decimal` → `xsd:decimal`, `BigInt` → `xsd:integer`) when `o_type` can't name it. Wired into every flake/binding builder that has the decoded value in scope: `binary_range` (x2), `binary_history`, `object_binding`, `fast_group_count_firsts` (x2), `block_fetch`.
- `binary_scan::matches_datatype_constraint` is a pre-decode filter; it dropped indexed big numerics from datatype-constrained patterns (e.g. `{"@value":"?p","@type":"xsd:decimal"}`) while novelty copies passed. It now decodes to disambiguate only in the rare ambiguous case, keeping the common path decode-free.
- `FlakeValue::overflow_numeric_datatype_sid()` recovers the datatype at materialization time (the EncodedLit can't be fixed at creation). Consulted before the `dt_id` lookup in both the query materializer and the API formatter; `resolve_datatype_sid_for_value` delegates to it too.

Because `xsd:decimal` is an inferable datatype, the consistent output is the bare-string form across crawl and tabular lanes (matching the existing convention). `export.rs` already had per-variant fallbacks, so it needed no change.

Tests

- `jsonld_decimal_renders_consistently_across_index_and_novelty`: crawl + tabular SPARQL + datatype-constrained pattern, asserting the indexed and novelty decimals render identically and both match an `xsd:decimal`-constrained query.
- `indexed_overflow_integer_reports_xsd_integer_not_decimal`: pins the late-materialization datatype recovery.

Both confirmed to fail before the fix and pass after. Full `fluree-db-api` suite, `fluree-db-query` / `fluree-db-core` unit tests, and clippy (all-features) are green.
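
For readers unfamiliar with the bare-string convention mentioned in this description, here is a small sketch of how an inferable datatype changes the rendered JSON-LD. `render_literal` and `is_inferable` are hypothetical helpers, not the formatter's actual API; only `xsd:decimal` is treated as inferable because that is the only case this description confirms, and JSON string escaping is omitted.

```rust
// Assumption: only xsd:decimal is listed here; the real set of inferable
// datatypes is defined elsewhere in the codebase.
fn is_inferable(datatype: &str) -> bool {
    datatype == "xsd:decimal"
}

// Renders a literal as a bare JSON string when its datatype is inferable,
// otherwise as an explicit JSON-LD value object. No escaping is performed.
fn render_literal(lexical: &str, datatype: &str) -> String {
    if is_inferable(datatype) {
        format!("\"{}\"", lexical)
    } else {
        format!("{{\"@value\":\"{}\",\"@type\":\"{}\"}}", lexical, datatype)
    }
}
```

In this model, the pre-fix index path corresponds to calling `render_literal("19.99", "")`, which produces the value object with an empty `@type`, while the novelty path corresponds to `render_literal("19.99", "xsd:decimal")`, which produces the bare string. Once both paths resolve the same datatype, both produce the same output.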