feat: Complete SPARQL property-path support (ZeroOrOne, negated sets, composite & inverse transitive) #1379

Open
bplatz wants to merge 7 commits into main from feature/sparql-property-paths

@bplatz bplatz commented Jun 25, 2026

Contributor

Rounds out the remaining SPARQL 1.1 property-path forms, so the W3C property-path grammar now evaluates in the engine instead of returning HTTP 400, with one intentional exception noted below. The new forms are wired into both the SPARQL and JSON-LD (@path) lowering paths, which share the same IR and traversal operator.

What's now supported

  • p? (ZeroOrOne): at top level and as a sequence step (p?/q, ^p?/q). New PathModifier::ZeroOrOne; the traversal operator short-circuits to the node itself plus a single hop.
  • Negated property sets: !p, !(a|b), !(^p), and mixed. Lowered to a fresh predicate-variable triple plus a FILTER excluding the listed predicates, unioning forward and inverse directions as needed.
  • Nested modifiers: ((p)*)*, (p+)?, (p?)?, … all collapse algebraically to a single modifier.
  • Transitive over a composite sub-path: (a/b)+, (a/b)*, (a/b)?. Each hop follows the whole sub-path. Steps may be alternations of simple predicates and may be inverse ((^a/b)+, (a/^b)+, (^(a|b)/c)+).
  • Parenthesized inverse-transitive: (^p)+, (^p)*, (^p)? (and the JSON-LD array form), via the identity (^X)+ ≡ ^(X+).
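
For reviewers who want the nested-modifier collapse spelled out, here is a minimal sketch of the algebra. It is not the code in this PR: the `Modifier` enum and the `collapse` / `collapse_all` helpers are hypothetical names, and only `PathModifier::ZeroOrOne` is taken from the description above. The table relies on `*`, `+`, and `?` having set semantics, so only "may be zero length" and "may repeat" matter.

```rust
/// Hypothetical stand-in for the engine's path modifier. Only `ZeroOrOne`
/// is named in the PR text; the other variant names are assumptions.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Modifier {
    ZeroOrOne,  // p?
    ZeroOrMore, // p*
    OneOrMore,  // p+
}

/// Collapse `(p<inner>)<outer>` into one equivalent modifier.
fn collapse(inner: Modifier, outer: Modifier) -> Modifier {
    use Modifier::*;
    match (inner, outer) {
        // (p+)+ still needs at least one hop and may repeat.
        (OneOrMore, OneOrMore) => OneOrMore,
        // (p?)? allows zero or one hop, never more.
        (ZeroOrOne, ZeroOrOne) => ZeroOrOne,
        // Every other pairing allows both zero hops and repetition.
        _ => ZeroOrMore,
    }
}

/// Collapse a chain of modifiers listed innermost first.
/// Returns `None` when the chain is empty (a bare path).
fn collapse_all(modifiers: &[Modifier]) -> Option<Modifier> {
    modifiers.iter().copied().reduce(collapse)
}
```

Under these assumptions, `(p+)?` and `(p?)+` both reduce to `p*`, while `(p?)?` stays `p?`.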

The one remaining grammar corner — a nested transitive step inside a composite unit ((a+/b)+) — is intentionally rejected and guarded by a negative test. It is not exercised by any W3C conformance test.

Implementation notes

  • PropertyPathPattern gained direction-aware composite steps (PathStep { predicates, inverse } + first_inverse); the traversal operator's read_step maps step direction to a SPOT or POST read (and the opposite index when retreating for backward traversal). Single-predicate count fast paths bail on composite, so hot paths are untouched.
  • Negated-set filtering reuses the predicate-variable FILTER(?p != IRI(...)) comparison, matching how a hand-written filter lowers.
  • The both-unbound composite closure is correct but unoptimized (per-start BFS, no shared adjacency map) — noted in code for a future pass.
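
The direction-aware composite hop and the per-start BFS mentioned above can be illustrated with a small in-memory sketch. This is an assumption-laden model rather than the engine code: triples live in a slice instead of SPOT/POST indexes, the `PathStep` field names follow the notes above but their types are guessed, and `read_step`, `hop`, and `closure_plus` are illustrative signatures only.

```rust
use std::collections::{BTreeSet, VecDeque};

/// (subject, predicate, object); a stand-in for the real index reads.
type Triple = (&'static str, &'static str, &'static str);

/// One step of the repeated unit. Field names come from the PR notes,
/// the concrete types are assumptions.
struct PathStep {
    predicates: Vec<&'static str>,
    inverse: bool,
}

/// Nodes reachable from `node` over a single step, honoring its direction.
fn read_step(triples: &[Triple], node: &str, step: &PathStep) -> BTreeSet<&'static str> {
    let mut out = BTreeSet::new();
    for &(s, p, o) in triples {
        if !step.predicates.contains(&p) {
            continue;
        }
        // An inverse step walks object -> subject.
        let (from, to) = if step.inverse { (o, s) } else { (s, o) };
        if from == node {
            out.insert(to);
        }
    }
    out
}

/// One hop of the composite unit: follow every step in order.
fn hop(triples: &[Triple], start: &'static str, unit: &[PathStep]) -> BTreeSet<&'static str> {
    let mut frontier = BTreeSet::from([start]);
    for step in unit {
        frontier = frontier
            .iter()
            .flat_map(|n| read_step(triples, n, step))
            .collect();
    }
    frontier
}

/// `(unit)+` from a bound start: BFS where each edge is a whole-unit hop.
fn closure_plus(triples: &[Triple], start: &'static str, unit: &[PathStep]) -> BTreeSet<&'static str> {
    let mut seen = BTreeSet::new();
    let mut queue = VecDeque::from([start]);
    while let Some(node) = queue.pop_front() {
        for next in hop(triples, node, unit) {
            // `insert` returns false for nodes already visited, so cycles terminate.
            if seen.insert(next) {
                queue.push_back(next);
            }
        }
    }
    seen
}
```

Running `closure_plus` once per candidate start node gives the unoptimized both-unbound behavior described above; a shared adjacency map would avoid recomputing `hop` for nodes visited from several starts.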

Tests & docs

  • New SPARQL and JSON-LD parity tests for every form above (forward, backward, both-unbound, and reachability modes), mirroring the relevant W3C property-path / nps_* cases.
  • Updated docs/query/sparql.md, docs/query/jsonld-query.md, docs/query/graph-crawl.md, and docs/reference/compatibility.md; corrected stale Rules text that predated this work.

CI: clippy (all features, all targets) + nextest (workspace, all features).

bplatz added 7 commits June 25, 2026 11:47
…d modifiers

Adds three previously-unsupported SPARQL 1.1 property-path forms, wired in
both the SPARQL and JSON-LD lowering paths so they share the same IR and
executor:

- ZeroOrOne (p?): new PathModifier::ZeroOrOne; the traversal operator
  short-circuits to the node itself plus a single hop (no closure).
- Negated property sets (!p, !(a|b), !(^p), mixed): lowered to a fresh
  predicate-variable triple plus a FILTER excluding the listed predicates,
  with forward and inverse directions unioned as needed.
- Nested modifiers ((p*)*, (p+)?, (p?)?): collapsed algebraically to a
  single modifier over the innermost path.
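
As a rough illustration of what the negated-set lowering has to compute, the sketch below evaluates a negated property set over an in-memory triple list. It models the semantics only, not the predicate-variable-plus-FILTER plan the commit builds. `NegatedSet` and `eval_negated` are hypothetical names, and the sketch assumes an empty list means that direction was not requested.

```rust
use std::collections::BTreeSet;

/// (subject, predicate, object)
type Triple = (&'static str, &'static str, &'static str);

/// `!(f1|f2|...|^i1|^i2|...)` split by direction (hypothetical type).
struct NegatedSet {
    /// Predicates excluded when reading subject -> object.
    forward: Vec<&'static str>,
    /// Predicates excluded when reading object -> subject.
    inverse: Vec<&'static str>,
}

/// All (start, end) pairs connected by a predicate outside the excluded set.
fn eval_negated(triples: &[Triple], nps: &NegatedSet) -> BTreeSet<(&'static str, &'static str)> {
    let mut pairs = BTreeSet::new();
    for &(s, p, o) in triples {
        // Forward branch: any triple whose predicate is not listed.
        if !nps.forward.is_empty() && !nps.forward.contains(&p) {
            pairs.insert((s, o));
        }
        // Inverse branch: same test, but the pair is emitted with endpoints swapped.
        if !nps.inverse.is_empty() && !nps.inverse.contains(&p) {
            pairs.insert((o, s));
        }
    }
    pairs
}
```

With this shape, `!p` fills only `forward`, `!(^p)` fills only `inverse`, and a mixed set such as `!(a|^b)` yields the union of both branches.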

Transitive over a composite sub-path ((a/b)+) remains unsupported and now
has a focused negative test guarding it.

Updates docs/query/sparql.md and docs/reference/compatibility.md.

Extends property paths so a +, *, or ? modifier can apply to a forward
sequence: (p1/p2/...)+ now follows the whole sub-path on every hop.

- PropertyPathPattern gains sequence_steps; new_composite() builds the
  per-step predicate sets. Simple/alternation paths are unchanged
  (sequence_steps empty), and single_predicate() reports None for a
  composite so the single-predicate count fast paths bail.
- The traversal operator composes hops out of per-step forward/backward
  reads; the both-unbound closure seeds from first-step subjects.
- SPARQL and JSON-LD lowering both detect a sequence inner (after nested-
  modifier collapse) and build a composite pattern; non-forward steps
  (inverse, nested transitive) are still rejected.

Updates docs/query/sparql.md and docs/reference/compatibility.md.

Add zero-or-one, composite-transitive, and the nested-modifier collapse to
the JSON-LD @path operator tables (jsonld-query.md) and the graph-crawl
operator reference; narrow their not-yet-supported notes to inverse steps
inside a composite repeated unit.

A composite-transitive step may now run backward, e.g. (^a/b)+. Each step
of the repeated sub-path carries its own direction.

- PathStep { predicates, inverse } replaces the forward-only step vec; the
  pattern gains first_inverse for the leading step. The traversal operator
  reads SPOT or POST per step by direction (and the opposite index when
  retreating), so backward traversal and the both-unbound closure seed from
  the correct endpoint.
- SPARQL and JSON-LD lowering both record per-step direction; a nested
  transitive step inside the unit (a+/b) remains rejected.

Updates docs/query/sparql.md, docs/query/jsonld-query.md, and
docs/reference/compatibility.md.

Addresses three property-path review findings:

- ZeroOrOne (?) is now accepted as a sequence step (p?/q, ^p?/q), not just
  at top level, in both SPARQL and JSON-LD sequence-step lowering.
- Both-unbound composite */? closure now emits the zero-length self-pair
  for every node in the path's domain, not only first-step subjects —
  matching the simple-path closure (which tracks subjects and objects).
- Corrects the SPARQL doc Rules section: both-constant transitive paths are
  reachability tests (not rejected), and modifiers ARE allowed on simple
  sequence steps.

Also notes in code that the both-unbound composite closure is unoptimized
(per-start BFS, no shared adjacency map).
…erage

- (^p)+, (^p)*, (^p)? (and the JSON-LD array form) now lower correctly:
  an inverse wrapper around a repeated unit is peeled by swapping endpoints
  ((^X)+ == ^(X+)), so it reduces to a forward path. Covers (^(a|b))+ and
  (^(a/b))+ too. Previously these fell through to the forward extractor and
  were rejected.
- Tests: parenthesized inverse +/*/?; both-unbound (^p/q)+ exercising the
  object-side seed; inverse non-leading step (p/^q) forward and backward;
  inverse alternation step (^(p|q)/r); JSON-LD string and array parity.
- Docs: note (^p)+ == ^p+; clarify composite steps may be inverse.
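
The endpoint-swap identity used above, (^X)+ == ^(X+), can be sanity-checked on a toy relation. The sketch below is only a demonstration of the identity, not the lowering code: nodes are plain `u32` ids, and `plus` and `inverse` are hypothetical helpers built on a naive fixpoint.

```rust
use std::collections::BTreeSet;

/// A (from, to) pair over integer node ids.
type Edge = (u32, u32);

/// One-or-more closure of `edges`, computed as a naive fixpoint.
fn plus(edges: &BTreeSet<Edge>) -> BTreeSet<Edge> {
    let mut closure = edges.clone();
    loop {
        let mut added = false;
        // Snapshot so the set can grow while we extend known paths by one edge.
        let snapshot: Vec<Edge> = closure.iter().copied().collect();
        for &(a, b) in &snapshot {
            for &(c, d) in edges {
                if b == c && closure.insert((a, d)) {
                    added = true;
                }
            }
        }
        if !added {
            return closure;
        }
    }
}

/// Swap the endpoints of every pair: the relational meaning of `^`.
fn inverse(pairs: &BTreeSet<Edge>) -> BTreeSet<Edge> {
    pairs.iter().map(|&(a, b)| (b, a)).collect()
}
```

For any edge set `x`, `plus(&inverse(&x))` and `inverse(&plus(&x))` produce the same pairs, which is why peeling the inverse wrapper and swapping the endpoints is safe.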
@bplatz bplatz requested review from aaj3f and zonotope June 25, 2026 21:50