Skip to content

Add server-side BFS entity graph endpoint#37

Open
damienriehl wants to merge 21 commits intodevfrom
entity-graph-endpoint
Open

Add server-side BFS entity graph endpoint#37
damienriehl wants to merge 21 commits intodevfrom
entity-graph-endpoint

Conversation

@damienriehl
Copy link
Copy Markdown
Contributor

@damienriehl damienriehl commented Apr 6, 2026

Summary

  • New endpoint: GET /projects/{id}/ontology/classes/{iri}/graph — server-side BFS graph traversal
  • New schema: EntityGraphResponse with GraphNode and GraphEdge models
  • BFS traversal: configurable ancestors_depth, descendants_depth, max_nodes, include_see_also
  • Lineage-based node types: focus, root, class, individual, property, external, unexplored

Companion PR

Frontend: CatholicOS/ontokit-web#88

Test plan

  • GET /projects/{id}/ontology/classes/{iri}/graph returns nodes + edges for a valid class IRI
  • Truncation flag set when node count exceeds max_nodes
  • Ancestors/descendants depth limits are respected
  • include_see_also=false excludes seeAlso cross-links
  • 404 returned for non-existent class IRI

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Project-scoped entity-graph endpoint to fetch a focused class graph (nodes, edges, focus metadata, pagination/truncation) with configurable branch, depths, node limits, and optional cross-reference inclusion; returns 404 if class not found.
  • Bug Fixes

    • Deletion now returns 404 when the target class is missing instead of reporting success.
  • Chores

    • Removed legacy class-hierarchy endpoint and added typed graph response schemas.
  • Tests

    • Extensive unit tests for graph building, helpers, traversal, relations, see-also, truncation, validation, and edge cases.

Closes #82

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 6, 2026

📝 Walkthrough

Walkthrough

Adds a server-side BFS entity-graph feature: new Pydantic graph schemas, helper functions for rdfs:seeAlso extraction, an implemented OntologyService.build_entity_graph traversal with see-also handling and truncation, a project-scoped route to return EntityGraphResponse (404 when not found), removal of the old class-hierarchy endpoint, and comprehensive unit tests.

Changes

Entity Graph Feature

Layer / File(s) Summary
Schema
ontokit/schemas/graph.py
Adds GraphNodeType/GraphEdgeType and Pydantic models GraphNode, GraphEdge, EntityGraphResponse.
Helpers
ontokit/services/entity_graph_helpers.py
Adds get_see_also_targets(graph, uri) and get_see_also_referrers(graph, uri) to extract rdfs:seeAlso targets/referrers (including OWL restriction encodings) with deduplication and preservation of discovery order.
Core Service
ontokit/services/ontology.py
Removes get_class_hierarchy; adds build_entity_graph(...) implementing BFS ancestors/descendants with depth and max_nodes validation, node/edge construction (subClassOf, equivalentClass, disjointWith, optional seeAlso), node classification (focus/root/secondary_root/class/individual/property/external), definition/label selection, child counts, truncation flag, total_concept_count, and use of the new see-also helpers. Also adds EXTERNAL_NAMESPACES, TYPE_CHECKING imports, and updates LabelPreference.parse annotation.
API — Project Route
ontokit/api/routes/projects.py
Adds GET /{project_id}/ontology/classes/graph (get_ontology_class_graph) which resolves branch (query or git.get_default_branch), ensures ontology loaded, calls ontology.build_entity_graph(...) with parameters and label preferences, returns EntityGraphResponse or raises 404 when result is None.
API — Ontology Routes
ontokit/api/routes/classes.py
Updates delete_class to raise 404 when deletion returns falsy; removes the old GET /ontologies/{ontology_id}/classes/{class_iri:path}/hierarchy endpoint.
Tests — Service & Helpers
tests/unit/test_entity_graph.py, tests/unit/test_entity_graph_helpers.py
Adds extensive async unit tests for build_entity_graph covering focus-missing, traversal, depths, max_nodes/truncation, seeAlso (direct/reverse/restriction encodings), equivalent/disjoint edges, node classification, label/definition selection, validation errors, and many edge cases; adds unit tests for the see-also helper functions.
Tests — Route
tests/unit/test_graph_routes.py
Adds route tests for project-scoped graph endpoint: success (200) with EntityGraphResponse, not-found (404) when service returns None, and passing resolved default branch into build_entity_graph.

Sequence Diagram

sequenceDiagram
    participant Client
    participant ProjectAPI as Project API
    participant Loader as _ensure_ontology_loaded
    participant Service as OntologyService
    participant RDF as RDF Graph
    participant Response as EntityGraphResponse

    Client->>ProjectAPI: GET /{project_id}/ontology/classes/graph?class_iri=...
    ProjectAPI->>ProjectAPI: resolve branch (query or git.get_default_branch)
    ProjectAPI->>Loader: ensure ontology loaded for branch
    Loader-->>ProjectAPI: ontology instance
    ProjectAPI->>Service: build_entity_graph(class_iri, ancestors_depth, ...)
    Service->>RDF: validate focus class exists
    alt focus not found
        RDF-->>Service: not found
        Service-->>ProjectAPI: None
        ProjectAPI-->>Client: HTTP 404 "Class not found"
    else focus found
        Service->>RDF: BFS ancestors (up to ancestors_depth)
        RDF-->>Service: ancestor nodes/edges
        Service->>RDF: BFS descendants (up to descendants_depth)
        RDF-->>Service: descendant nodes/edges
        opt include_see_also
            Service->>RDF: discover rdfs:seeAlso + restriction encodings (get_see_also_* helpers)
            RDF-->>Service: seeAlso targets/referrers
            Service->>RDF: expand ancestor lineage for seeAlso-discovered nodes
            RDF-->>Service: additional nodes/edges
        end
        Service->>Service: build GraphNode/GraphEdge lists, compute truncated/total_concept_count
        Service-->>ProjectAPI: EntityGraphResponse
        ProjectAPI-->>Client: HTTP 200 + JSON
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related issues

Possibly related PRs

Poem

🐇
I nibble IRIs beneath the moon,
I map the roots and kin by noon,
Ancestors, children, see-also threads,
I stitch bright nodes with tiny heads,
Hop—behold the graph now croon! 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 54.37% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the primary change: adding a server-side BFS entity graph endpoint, which is the main feature introduced across multiple files.
Linked Issues check ✅ Passed The PR implementation fully satisfies all coding requirements from issue #82: new endpoint added, configurable BFS parameters implemented, EntityGraphResponse schema created with GraphNode/GraphEdge models, node classification system implemented, and truncation flag included.
Out of Scope Changes check ✅ Passed A secondary change removes the legacy GET /ontologies/{ontology_id}/classes/{class_iri:path}/hierarchy endpoint and replaces it with the new project-scoped graph endpoint, representing a natural refactoring to consolidate graph functionality.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch entity-graph-endpoint

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
Review rate limit: 0/1 reviews remaining, refill in 60 minutes.

Comment @coderabbitai help to get the list of available commands and usage tips.

@damienriehl damienriehl force-pushed the entity-graph-endpoint branch 2 times, most recently from c7fe862 to b37648d Compare April 6, 2026 17:12
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (4)
ontokit/schemas/graph.py (2)

22-29: Consider using Literal type for edge_type.

Similar to node_type, the edge_type field has defined values (subClassOf, equivalentClass, disjointWith, seeAlso). A Literal type would provide better validation:

+EdgeType = Literal["subClassOf", "equivalentClass", "disjointWith", "seeAlso"]
+
 class GraphEdge(BaseModel):
     """An edge in the entity graph."""
 
     id: str
     source: str
     target: str
-    edge_type: str
+    edge_type: EdgeType
     label: str | None = None
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ontokit/schemas/graph.py` around lines 22 - 29, The GraphEdge.model currently
types edge_type as a plain string; change it to a Literal of the allowed values
to enforce validation: update the GraphEdge class's edge_type annotation to use
Literal['subClassOf','equivalentClass','disjointWith','seeAlso'] and add the
necessary import (typing.Literal or typing_extensions.Literal) at the top of the
module so Pydantic will validate those exact values.

8-19: Consider using Literal type for node_type to enforce valid values.

Per PR objectives, node_type has defined values: focus, root, class, individual, property, external, unexplored. Using a Literal type provides validation and better documentation:

+from typing import Literal
+
+NodeType = Literal["focus", "root", "class", "individual", "property", "external", "unexplored"]
+
 class GraphNode(BaseModel):
     """A node in the entity graph."""
 
     id: str
     label: str
     iri: str
     definition: str | None = None
     is_focus: bool = False
     is_root: bool = False
     depth: int = 0
-    node_type: str = "class"
+    node_type: NodeType = "class"
     child_count: int | None = None
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ontokit/schemas/graph.py` around lines 8 - 19, The node_type field should be
narrowed to an explicit Literal to enforce allowed values: change
GraphNode.node_type's annotation from str to
Literal["focus","root","class","individual","property","external","unexplored"]
(keeping the default "class"), and add the necessary import for Literal (from
typing or typing_extensions if needed) at the top of the file so Pydantic
validation and type checkers will enforce and document the valid node types;
update any related usages/tests expecting a plain str if necessary.
ontokit/api/routes/classes.py (1)

138-138: Consider wrapping include_see_also in Query() for consistency.

Other query parameters use Query(default=...) for explicit declaration, but include_see_also uses a bare default value. While functionally equivalent, wrapping it maintains consistency:

-    include_see_also: bool = True,
+    include_see_also: bool = Query(default=True),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ontokit/api/routes/classes.py` at line 138, The include_see_also parameter is
using a bare default True while other query params use Query(...); update the
route handler signature to wrap include_see_also with Query(default=True) (i.e.,
change include_see_also: bool = True to include_see_also: bool =
Query(default=True)) and ensure Query is imported from fastapi in
ontokit/api/routes/classes.py so the parameter declaration is consistent with
the other query parameters.
ontokit/services/ontology.py (1)

380-387: Consider extracting EXTERNAL_NAMESPACES as a module-level constant.

This tuple is defined inside the method but doesn't depend on instance state. Moving it to module level improves readability and avoids recreating the tuple on each call:

+# Namespaces considered external for graph node classification
+EXTERNAL_NAMESPACES = (
+    "http://www.w3.org/2000/01/rdf-schema#",
+    "http://www.w3.org/2002/07/owl#",
+    "http://xmlns.com/foaf/0.1/",
+    "http://purl.org/dc/elements/1.1/",
+    "http://purl.org/dc/terms/",
+    "http://www.w3.org/2004/02/skos/core#",
+)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ontokit/services/ontology.py` around lines 380 - 387, Extract the
EXTERNAL_NAMESPACES tuple out of the method and declare it as a module-level
constant (uppercase) at the top of ontokit/services/ontology.py so it is created
once; remove the local definition inside the method and ensure all usages in
that method reference the module-level EXTERNAL_NAMESPACES constant instead of a
local variable.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@ontokit/services/ontology.py`:
- Around line 354-364: The return annotation EntityGraphResponse on
build_entity_graph is undefined at function-definition time; fix by importing
the type at module scope (e.g., add from ontokit.schemas.graph import
EntityGraphResponse near the top of the module) so the annotation resolves, and
remove or keep the redundant local import inside build_entity_graph only if
needed for runtime types like GraphNode/GraphEdge; alternatively you may add
from __future__ import annotations at the top to defer annotation evaluation if
you prefer that approach.
- Around line 413-427: Remove the unused depth parameter from the _classify_node
function signature and body (change def _classify_node(uri: URIRef, is_focus:
bool): ...), and update every call site that currently passes a depth argument
to only pass the uri and is_focus parameters; specifically locate and fix the
call that invoked _classify_node with three args (the call mentioned in the
review) to supply only the two required arguments. Ensure any references in
higher-order uses (e.g., map/filter or callbacks) match the new two-argument
signature and run tests/type checks to confirm no remaining usages of the
removed parameter.
- Around line 540-566: The seeAlso outgoing and incoming loops incorrectly use
break when either the item is not a URIRef or the sa_count limit is reached;
change the logic so non-URIRef items simply continue (e.g., check isinstance and
continue) and only exit the loop when sa_count >= max_see_also_per_node
(separate check before processing or break after incrementing); update both
loops that iterate graph.objects(node_uri, RDFS.seeAlso) and
graph.subjects(RDFS.seeAlso, node_uri) and ensure you still call _make_node,
append to see_also_nodes, and call _add_edge exactly as before when items are
valid.

---

Nitpick comments:
In `@ontokit/api/routes/classes.py`:
- Line 138: The include_see_also parameter is using a bare default True while
other query params use Query(...); update the route handler signature to wrap
include_see_also with Query(default=True) (i.e., change include_see_also: bool =
True to include_see_also: bool = Query(default=True)) and ensure Query is
imported from fastapi in ontokit/api/routes/classes.py so the parameter
declaration is consistent with the other query parameters.

In `@ontokit/schemas/graph.py`:
- Around line 22-29: The GraphEdge.model currently types edge_type as a plain
string; change it to a Literal of the allowed values to enforce validation:
update the GraphEdge class's edge_type annotation to use
Literal['subClassOf','equivalentClass','disjointWith','seeAlso'] and add the
necessary import (typing.Literal or typing_extensions.Literal) at the top of the
module so Pydantic will validate those exact values.
- Around line 8-19: The node_type field should be narrowed to an explicit
Literal to enforce allowed values: change GraphNode.node_type's annotation from
str to
Literal["focus","root","class","individual","property","external","unexplored"]
(keeping the default "class"), and add the necessary import for Literal (from
typing or typing_extensions if needed) at the top of the file so Pydantic
validation and type checkers will enforce and document the valid node types;
update any related usages/tests expecting a plain str if necessary.

In `@ontokit/services/ontology.py`:
- Around line 380-387: Extract the EXTERNAL_NAMESPACES tuple out of the method
and declare it as a module-level constant (uppercase) at the top of
ontokit/services/ontology.py so it is created once; remove the local definition
inside the method and ensure all usages in that method reference the
module-level EXTERNAL_NAMESPACES constant instead of a local variable.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a1a7daa8-6c2e-45c9-b31c-075e27fc2657

📥 Commits

Reviewing files that changed from the base of the PR and between 1a14032 and c7fe862.

📒 Files selected for processing (4)
  • ontokit/api/routes/classes.py
  • ontokit/api/routes/projects.py
  • ontokit/schemas/graph.py
  • ontokit/services/ontology.py

Comment thread ontokit/services/ontology.py Outdated
Comment thread ontokit/services/ontology.py Outdated
Comment on lines +413 to +427
def _classify_node(uri: URIRef, is_focus: bool, depth: int) -> str:
iri = str(uri)
if is_focus:
return "focus"
if _is_external(iri):
return "external"
# Check if individual (instance, not a class)
if (uri, RDF.type, OWL.Class) not in graph:
for rdf_type in graph.objects(uri, RDF.type):
if rdf_type in (OWL.ObjectProperty, OWL.DatatypeProperty, OWL.AnnotationProperty):
return "property"
return "individual"
if _is_root_class(uri):
return "root"
return "class"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Remove unused depth parameter from _classify_node (ARG001).

The pipeline reports ARG001: Unused function argument 'depth'. The parameter is declared but never referenced in the function body:

-        def _classify_node(uri: URIRef, is_focus: bool, depth: int) -> str:
+        def _classify_node(uri: URIRef, is_focus: bool) -> str:

Also update the call site on line 461:

-                node_type=_classify_node(uri, is_focus, depth),
+                node_type=_classify_node(uri, is_focus),
🧰 Tools
🪛 GitHub Actions: Distribution

[error] 413-413: ruff check (ARG001) Unused function argument: depth in _classify_node(uri: URIRef, is_focus: bool, depth: int).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ontokit/services/ontology.py` around lines 413 - 427, Remove the unused depth
parameter from the _classify_node function signature and body (change def
_classify_node(uri: URIRef, is_focus: bool): ...), and update every call site
that currently passes a depth argument to only pass the uri and is_focus
parameters; specifically locate and fix the call that invoked _classify_node
with three args (the call mentioned in the review) to supply only the two
required arguments. Ensure any references in higher-order uses (e.g., map/filter
or callbacks) match the new two-argument signature and run tests/type checks to
confirm no remaining usages of the removed parameter.

Comment thread ontokit/services/ontology.py Outdated
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 9, 2026

Codecov Report

❌ Patch coverage is 94.94585% with 14 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
ontokit/services/ontology.py 94.97% 1 Missing and 9 partials ⚠️
ontokit/services/entity_graph_helpers.py 90.24% 0 Missing and 4 partials ⚠️

📢 Thoughts on this report? Let us know!

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@ontokit/services/ontology.py`:
- Around line 353-358: Add explicit validation at the start of
build_entity_graph: ensure max_nodes > 0, ancestors_depth >= 0,
descendants_depth >= 0, max_see_also_per_node >= 0 and that include_see_also is
a boolean; if any check fails, raise a clear ValueError (or return a
deterministic error) with a descriptive message so invalid params do not cause a
silent None/focus-node failure. Apply the same guardrail checks to the other
related entry points that accept these traversal params (the overloads/variants
referenced around the other occurrences) so all callers uniformly validate
inputs before graph construction.
- Around line 565-579: The reverse discovery in _get_see_also_referrers only
looks for restrictions using OWL.someValuesFrom; update it to also consider
OWL.allValuesFrom and OWL.hasValue so incoming restrictions encoded with those
patterns are found. Specifically, when iterating restrictions that point to the
target URI, check subjects(...) for OWL.someValuesFrom, OWL.allValuesFrom, and
OWL.hasValue, then for each restriction get OWL.onProperty and ensure it equals
RDFS.seeAlso before collecting classes found via RDFS.subClassOf whose (cls,
RDF.type, OWL.Class) holds; update references to OWL.allValuesFrom and
OWL.hasValue alongside the existing
OWL.someValuesFrom/OWL.onProperty/RDFS.seeAlso/RDFS.subClassOf/RDF.type/OWL.Class
symbols.
- Around line 453-461: The node's is_root is being determined by calling
_is_root_class(uri) for every URI even when the node is not a class; compute the
node type first via _classify_node(uri, is_focus, depth) (or use an existing
_is_class predicate) and only call _is_root_class(uri) or mark is_root True when
node_type indicates a class; update the _make_node GraphNode construction so
node_type is determined before is_root and set is_root based on node_type ==
"class" (or equivalent) to avoid marking properties/individuals/external nodes
as roots.

In `@tests/unit/test_entity_graph.py`:
- Around line 380-383: The tests currently use conditional checks like ext_node
= next((n for n in result.nodes if n.iri == str(ext)), None) followed by if
ext_node: assert ext_node.node_type == "external", which makes the assertion a
no-op when the node is missing; change these to assert the node exists first
(e.g., assert ext_node is not None) before asserting its attributes so the test
fails if the node is absent—apply the same fix for prop_node and ind_node checks
referenced in the diff (the next(...) lookups and subsequent conditional
assertions).
- Around line 354-368: The test test_see_also_max_per_node currently asserts
len(sa_edges) <= 3 which is too permissive; update the assertion to require
exactly 3 (since 10 seeAlso targets were added and max_see_also_per_node=3) so
change the check on sa_edges (the list comprehension filtering result.edges by
edge_type "seeAlso") to assert len(sa_edges) == 3 to ensure the limiter is
enforced.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 91a3b656-dd00-4582-9e99-da0cdb722e4d

📥 Commits

Reviewing files that changed from the base of the PR and between c7fe862 and 69f933e.

📒 Files selected for processing (5)
  • ontokit/api/routes/classes.py
  • ontokit/api/routes/projects.py
  • ontokit/schemas/graph.py
  • ontokit/services/ontology.py
  • tests/unit/test_entity_graph.py
✅ Files skipped from review due to trivial changes (1)
  • ontokit/schemas/graph.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • ontokit/api/routes/projects.py
  • ontokit/api/routes/classes.py

Comment thread ontokit/services/ontology.py
Comment thread ontokit/services/ontology.py Outdated
Comment thread ontokit/services/ontology.py Outdated
Comment thread tests/unit/test_entity_graph.py
Comment thread tests/unit/test_entity_graph.py
@JohnRDOrazio JohnRDOrazio force-pushed the entity-graph-endpoint branch from 69f933e to 5c81f84 Compare April 11, 2026 20:56
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (4)
tests/unit/test_entity_graph.py (2)

380-383: ⚠️ Potential issue | 🟠 Major

Assert node existence before type assertions.

These checks are currently optional due to if ext_node/prop_node/ind_node; missing nodes won’t fail the tests.

Suggested fix
         ext_node = next((n for n in result.nodes if n.iri == str(ext)), None)
-        if ext_node:
-            assert ext_node.node_type == "external"
+        assert ext_node is not None
+        assert ext_node.node_type == "external"
@@
         prop_node = next((n for n in result.nodes if n.iri == str(EX.myProp)), None)
-        if prop_node:
-            assert prop_node.node_type == "property"
+        assert prop_node is not None
+        assert prop_node.node_type == "property"
@@
         ind_node = next((n for n in result.nodes if n.iri == str(EX.john)), None)
-        if ind_node:
-            assert ind_node.node_type == "individual"
+        assert ind_node is not None
+        assert ind_node.node_type == "individual"

Also applies to: 392-395, 404-406

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/test_entity_graph.py` around lines 380 - 383, The tests currently
skip failures when nodes are missing because they conditionally check node_type
only inside "if ext_node/prop_node/ind_node"; replace those optional checks with
explicit existence assertions first (e.g., assert ext_node is not None, assert
prop_node is not None, assert ind_node is not None) and then assert attributes
like node_type, so use the variables ext_node, prop_node, ind_node (and the
result.nodes lookup) to first assert presence and only then assert their
properties.

354-368: ⚠️ Potential issue | 🟡 Minor

Make the seeAlso cap assertion strict.

Line 367 uses <= 3, which can still pass if the implementation returns fewer than expected edges. Assert the exact expected count for this fixture.

Suggested fix
         sa_edges = [e for e in result.edges if e.edge_type == "seeAlso"]
-        assert len(sa_edges) <= 3
+        assert len(sa_edges) == 3
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/test_entity_graph.py` around lines 354 - 368, In
test_see_also_max_per_node change the non-strict assertion to assert the exact
expected count: after calling svc.build_entity_graph with
max_see_also_per_node=3, compute sa_edges = [e for e in result.edges if
e.edge_type == "seeAlso"] and replace the current assert len(sa_edges) <= 3 with
assert len(sa_edges) == 3 so the test verifies the cap behavior exactly
(references: test_see_also_max_per_node, build_entity_graph,
max_see_also_per_node, sa_edges, result.edges).
ontokit/services/ontology.py (2)

453-461: ⚠️ Potential issue | 🟠 Major

is_root is incorrectly set for non-class nodes.

At Line 459, is_root is computed for every URI, so properties/individuals/external nodes can be marked as roots.

Suggested fix
         def _make_node(uri: URIRef, depth: int) -> GraphNode | None:
@@
             is_focus = uri == class_uri
+            is_class = (uri, RDF.type, OWL.Class) in graph
+            node_type = _classify_node(uri, is_focus, depth)
             node = GraphNode(
                 id=iri,
                 label=_get_label(uri),
                 iri=iri,
                 definition=_get_definition(uri),
                 is_focus=is_focus,
-                is_root=_is_root_class(uri),
+                is_root=is_class and _is_root_class(uri),
                 depth=depth,
-                node_type=_classify_node(uri, is_focus, depth),
+                node_type=node_type,
                 child_count=_child_count(uri),
             )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ontokit/services/ontology.py` around lines 453 - 461, The GraphNode
constructor sets is_root for every URI, marking properties/individuals/external
nodes as roots; compute the node_type first (using _classify_node or whatever
returns the node category) and then set is_root = _is_root_class(uri) only when
node_type indicates a class (e.g., node_type == "class" or the GraphNode
class-type enum), and pass that boolean into GraphNode; update the ordering in
the node creation block so node_type is determined before calling
_is_root_class.

565-579: ⚠️ Potential issue | 🟡 Minor

Incoming restriction-based seeAlso discovery is incomplete.

This path only checks OWL.someValuesFrom, so incoming links encoded with OWL.allValuesFrom or OWL.hasValue are missed.

Suggested fix
         def _get_see_also_referrers(uri: URIRef) -> list[URIRef]:
@@
-            # Find restrictions that someValuesFrom -> uri
-            for restriction in graph.subjects(OWL.someValuesFrom, uri):
-                on_prop = next(graph.objects(restriction, OWL.onProperty), None)
-                if on_prop == RDFS.seeAlso:
-                    for cls in graph.subjects(RDFS.subClassOf, restriction):
-                        if isinstance(cls, URIRef) and (cls, RDF.type, OWL.Class) in graph:
-                            referrers.append(cls)
+            # Find restrictions that point to uri via some/all/has value
+            for value_predicate in (OWL.someValuesFrom, OWL.allValuesFrom, OWL.hasValue):
+                for restriction in graph.subjects(value_predicate, uri):
+                    on_prop = next(graph.objects(restriction, OWL.onProperty), None)
+                    if on_prop == RDFS.seeAlso:
+                        for cls in graph.subjects(RDFS.subClassOf, restriction):
+                            if isinstance(cls, URIRef) and (cls, RDF.type, OWL.Class) in graph:
+                                referrers.append(cls)
             return referrers
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ontokit/services/ontology.py` around lines 565 - 579, In
_get_see_also_referrers, the code only inspects restrictions that use
OWL.someValuesFrom and thus misses incoming restrictions expressed with
OWL.allValuesFrom or OWL.hasValue; update the logic that iterates
graph.subjects(OWL.someValuesFrom, uri) to also iterate graph.subjects for
OWL.allValuesFrom and OWL.hasValue, then for each found restriction use
next(graph.objects(restriction, OWL.onProperty), None) to check if on_prop ==
RDFS.seeAlso and collect the corresponding classes from
graph.subjects(RDFS.subClassOf, restriction) (same URIRef/type checks as
existing code).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@ontokit/services/ontology.py`:
- Around line 591-603: The loop over _get_see_also_targets(node_uri) increments
sa_count even when the target was already seen and the edge is deduped, which
wastes the per-node budget; change the logic in the block (and the analogous
block at the other occurrence) so sa_count is only incremented when a new unique
related IRI actually results in adding/recording the edge (e.g., check was_new
or whether related_iri is in visited before appending to see_also_nodes and
before calling _add_edge, and only bump sa_count when you add a new edge/record
the node), ensuring _add_edge and visited/see_also_nodes updates remain
consistent.

---

Duplicate comments:
In `@ontokit/services/ontology.py`:
- Around line 453-461: The GraphNode constructor sets is_root for every URI,
marking properties/individuals/external nodes as roots; compute the node_type
first (using _classify_node or whatever returns the node category) and then set
is_root = _is_root_class(uri) only when node_type indicates a class (e.g.,
node_type == "class" or the GraphNode class-type enum), and pass that boolean
into GraphNode; update the ordering in the node creation block so node_type is
determined before calling _is_root_class.
- Around line 565-579: In _get_see_also_referrers, the code only inspects
restrictions that use OWL.someValuesFrom and thus misses incoming restrictions
expressed with OWL.allValuesFrom or OWL.hasValue; update the logic that iterates
graph.subjects(OWL.someValuesFrom, uri) to also iterate graph.subjects for
OWL.allValuesFrom and OWL.hasValue, then for each found restriction use
next(graph.objects(restriction, OWL.onProperty), None) to check if on_prop ==
RDFS.seeAlso and collect the corresponding classes from
graph.subjects(RDFS.subClassOf, restriction) (same URIRef/type checks as
existing code).

In `@tests/unit/test_entity_graph.py`:
- Around line 380-383: The tests currently skip failures when nodes are missing
because they conditionally check node_type only inside "if
ext_node/prop_node/ind_node"; replace those optional checks with explicit
existence assertions first (e.g., assert ext_node is not None, assert prop_node
is not None, assert ind_node is not None) and then assert attributes like
node_type, so use the variables ext_node, prop_node, ind_node (and the
result.nodes lookup) to first assert presence and only then assert their
properties.
- Around line 354-368: In test_see_also_max_per_node change the non-strict
assertion to assert the exact expected count: after calling
svc.build_entity_graph with max_see_also_per_node=3, compute sa_edges = [e for e
in result.edges if e.edge_type == "seeAlso"] and replace the current assert
len(sa_edges) <= 3 with assert len(sa_edges) == 3 so the test verifies the cap
behavior exactly (references: test_see_also_max_per_node, build_entity_graph,
max_see_also_per_node, sa_edges, result.edges).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 170047c0-ba93-4648-ac95-b57dda5f76ad

📥 Commits

Reviewing files that changed from the base of the PR and between 69f933e and 5c81f84.

📒 Files selected for processing (5)
  • ontokit/api/routes/classes.py
  • ontokit/api/routes/projects.py
  • ontokit/schemas/graph.py
  • ontokit/services/ontology.py
  • tests/unit/test_entity_graph.py
✅ Files skipped from review due to trivial changes (1)
  • ontokit/schemas/graph.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • ontokit/api/routes/projects.py

Comment thread ontokit/services/ontology.py
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
ontokit/services/ontology.py (1)

499-499: Consider using collections.deque for BFS queues.

list.pop(0) is O(n) because it shifts all elements. For BFS with potentially many nodes, collections.deque.popleft() provides O(1) dequeue operations.

♻️ Suggested improvement
+from collections import deque
+
         # BFS upward (ancestors)
-        ancestor_queue: list[tuple[URIRef, int]] = [(class_uri, 0)]
+        ancestor_queue: deque[tuple[URIRef, int]] = deque([(class_uri, 0)])
         ancestor_visited: set[str] = {class_iri}
         while ancestor_queue:
-            current_uri, current_depth = ancestor_queue.pop(0)
+            current_uri, current_depth = ancestor_queue.popleft()

Apply the same pattern to descendant_queue and sa_queue.

Also applies to: 518-518

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ontokit/services/ontology.py` at line 499, The code uses list.pop(0) for BFS
queues which is O(n); change ancestor_queue, descendant_queue and sa_queue to
use collections.deque for O(1) dequeues: import deque (from collections import
deque), initialize those queues as deque([...]) or deque() and replace all
.pop(0) calls with .popleft() so BFS remains efficient.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@ontokit/services/ontology.py`:
- Line 499: The code uses list.pop(0) for BFS queues which is O(n); change
ancestor_queue, descendant_queue and sa_queue to use collections.deque for O(1)
dequeues: import deque (from collections import deque), initialize those queues
as deque([...]) or deque() and replace all .pop(0) calls with .popleft() so BFS
remains efficient.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e31e9161-5112-4e3b-a817-9005ed87f84d

📥 Commits

Reviewing files that changed from the base of the PR and between 5c81f84 and 6bbfaff.

📒 Files selected for processing (3)
  • ontokit/services/ontology.py
  • tests/unit/test_entity_graph.py
  • tests/unit/test_graph_routes.py

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@ontokit/services/ontology.py`:
- Around line 605-632: The see-also seeding logic only appends a node to
see_also_nodes when it is new, so already-visited seeAlso targets never get
their ancestor pass run; change the blocks that handle related/referrer nodes so
that see_also_nodes.append(related) and see_also_nodes.append(referrer) occur
regardless of was_new (but keep the existing _make_node(...) call only when
was_new and continue when it returns None), i.e., always enqueue the seeAlso IRI
for the ancestor traversal while still avoiding duplicate node creation; update
both the _get_see_also_targets loop and the _get_see_also_referrers loop
accordingly.
- Around line 457-463: The function _make_node increments total_discovered
before enforcing max_nodes and only memoizes retained nodes in visited, causing
repeated counts for omitted URIs; fix by tracking first-seen URIs separately
(e.g., a new set like discovered_or_seen) or by checking visited first and only
incrementing total_discovered when the URI has not been seen before, and ensure
you also memoize omitted URIs in a separate map/set so subsequent encounters
return a non-counting sentinel (or cached None) rather than incrementing again;
update logic in _make_node (referencing _make_node, visited, total_discovered,
max_nodes, and visited[iri]) to: check seen/discovered first, increment
total_discovered only on first discovery, and record omitted URIs to prevent
double-counting.

In `@tests/unit/test_entity_graph.py`:
- Around line 677-692: The test test_duplicate_see_also_edge_not_counted doesn't
actually trigger the duplicate-target budget path because EX.Animal is visited
but the seeAlso edge to it is new; update the test to ensure the same target is
presented through two distinct extraction paths so the dedupe logic must block
the second consumption. Specifically, modify the graph built by _base_graph()/g
to add two separate seeAlso edges (or introduce an intermediate node that yields
EX.Animal via another path) so build_entity_graph(PROJECT_ID, str(EX.Person),
BRANCH, max_see_also_per_node=1) will be forced to encounter EX.Animal twice and
only allow EX.Related if deduping works; keep references to EX.Animal,
EX.Related, max_see_also_per_node, test_duplicate_see_also_edge_not_counted,
_service_with_graph and build_entity_graph to locate and change the test.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 72d6ddf5-317c-4f60-add4-4985ac1853e7

📥 Commits

Reviewing files that changed from the base of the PR and between 6bbfaff and a5e4124.

📒 Files selected for processing (2)
  • ontokit/services/ontology.py
  • tests/unit/test_entity_graph.py

Comment thread ontokit/services/ontology.py
Comment thread ontokit/services/ontology.py
Comment thread tests/unit/test_entity_graph.py
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
ontokit/services/ontology.py (2)

554-581: Consider deduplicating targets within _get_see_also_targets.

The function can return the same URI multiple times if it appears both as a direct rdfs:seeAlso triple and within an OWL restriction. While downstream code handles duplicates gracefully (via edge_ids and sa_visited), deduplicating here would reduce redundant processing.

♻️ Suggested refactor
         def _get_see_also_targets(uri: URIRef) -> list[URIRef]:
-            targets: list[URIRef] = []
+            targets: set[URIRef] = set()
             # Direct rdfs:seeAlso triples
             for obj in graph.objects(uri, RDFS.seeAlso):
                 if isinstance(obj, URIRef):
-                    targets.append(obj)
+                    targets.add(obj)
             # OWL restrictions...
             for sc in graph.objects(uri, RDFS.subClassOf):
                 # ... existing logic ...
                 if on_prop == RDFS.seeAlso:
                     for val in graph.objects(sc, OWL.someValuesFrom):
                         if isinstance(val, URIRef):
-                            targets.append(val)
+                            targets.add(val)
                     # ... same for allValuesFrom, hasValue ...
-            return targets
+            return list(targets)

Apply the same pattern to _get_see_also_referrers.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ontokit/services/ontology.py` around lines 554 - 581, _get_see_also_targets
collects duplicate URIRefs from direct RDFS.seeAlso triples and OWL restriction
values; change it to deduplicate before returning (e.g., build targets using an
ordered set or track seen URIs while appending) so each URIRef appears only
once, and apply the same deduplication pattern to _get_see_also_referrers to
avoid redundant downstream processing.

386-393: Consider extracting EXTERNAL_NAMESPACES to module level.

This tuple is recreated on every build_entity_graph call. Moving it to a module-level constant (alongside LABEL_PROPERTY_MAP and similar constants) would marginally improve performance and improve discoverability for future maintainers.

♻️ Suggested refactor

Add near the top of the file with other constants:

# Namespaces considered external (W3C standards, Dublin Core, etc.)
EXTERNAL_NAMESPACES = (
    "http://www.w3.org/2000/01/rdf-schema#",
    "http://www.w3.org/2002/07/owl#",
    "http://xmlns.com/foaf/0.1/",
    "http://purl.org/dc/elements/1.1/",
    "http://purl.org/dc/terms/",
    "http://www.w3.org/2004/02/skos/core#",
)

Then reference it directly in _is_external.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ontokit/services/ontology.py` around lines 386 - 393, The EXTERNAL_NAMESPACES
tuple is being recreated inside build_entity_graph on each call; move that tuple
to a module-level constant (near LABEL_PROPERTY_MAP and other constants) named
EXTERNAL_NAMESPACES, then update _is_external (and any local references in
build_entity_graph) to reference the module-level EXTERNAL_NAMESPACES instead of
the inline tuple so the constant is reused across calls and is discoverable by
maintainers.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@ontokit/services/ontology.py`:
- Around line 554-581: _get_see_also_targets collects duplicate URIRefs from
direct RDFS.seeAlso triples and OWL restriction values; change it to deduplicate
before returning (e.g., build targets using an ordered set or track seen URIs
while appending) so each URIRef appears only once, and apply the same
deduplication pattern to _get_see_also_referrers to avoid redundant downstream
processing.
- Around line 386-393: The EXTERNAL_NAMESPACES tuple is being recreated inside
build_entity_graph on each call; move that tuple to a module-level constant
(near LABEL_PROPERTY_MAP and other constants) named EXTERNAL_NAMESPACES, then
update _is_external (and any local references in build_entity_graph) to
reference the module-level EXTERNAL_NAMESPACES instead of the inline tuple so
the constant is reused across calls and is discoverable by maintainers.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b508a75c-8b22-4961-b94a-5ffa368a148a

📥 Commits

Reviewing files that changed from the base of the PR and between 6bbfaff and 7808813.

📒 Files selected for processing (2)
  • ontokit/services/ontology.py
  • tests/unit/test_entity_graph.py
✅ Files skipped from review due to trivial changes (1)
  • tests/unit/test_entity_graph.py

@JohnRDOrazio JohnRDOrazio added the enhancement New feature or request label Apr 13, 2026
@JohnRDOrazio JohnRDOrazio modified the milestones: v0.3.0, v0.4.0 Apr 13, 2026
@damienriehl
Copy link
Copy Markdown
Contributor Author

Peer review summary (companion review on the frontend at CatholicOS/ontokit-web#88):

Findings: 2 BLOCKING + 4 HIGH + 7 MEDIUM + 9 LOW.

The PR is fundamentally sound: BFS is correct, recent commits fixed real bugs around dedup/seeAlso ancestor traversal, max_nodes truncation flag is accurate via the separate seen set, and the test suite is genuinely substantive. The two blockers were:

  1. The /api/v1/ontologies/{id}/classes/graph route in classes.py was unauthenticated AND non-functional (raised 500 on every call).
  2. project.label_preferences was silently ignored, so multilingual projects would see English graph labels but their preferred language everywhere else.

🤖 Review assisted by Claude Code

damienriehl added a commit that referenced this pull request May 2, 2026
fix(graph): address blocker + high findings from PR #37 review
damienriehl and others added 4 commits May 3, 2026 20:20
Add GET /projects/{id}/ontology/classes/{iri}/graph endpoint that builds
a multi-hop entity graph via BFS traversal. Returns nodes and edges for
visualization with lineage-based node types (focus, root, class, etc.).

Configurable: ancestors_depth, descendants_depth, max_nodes, include_see_also.
Truncation detection when node count exceeds max_nodes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The entity graph BFS only checked outgoing rdfs:seeAlso (graph.objects),
missing incoming connections (graph.subjects). For example, "Proceeding
Closed / Disposed seeAlso Motion to Dismiss" was invisible because only
MTD's outgoing seeAlso was checked, not what points TO MTD.

Now checks both directions, surfacing all cross-branch root ancestors
(e.g., "Service", "Status") that connect via seeAlso to visited nodes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The {class_iri:path}/graph route pattern was broken because FastAPI's
:path converter greedily captures /graph as part of the IRI. Move
class_iri to a query parameter and reorder routes before path-parameter
routes. Also removes the dead hierarchy endpoint (unimplemented stub),
fixes mypy no-any-return errors with TYPE_CHECKING import, fixes ruff
ARG001 unused depth param, and adds comprehensive tests for
build_entity_graph.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Validate max_nodes, ancestors_depth, descendants_depth,
max_see_also_per_node, and include_see_also at the service layer
to prevent silent failures from invalid parameters.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
JohnRDOrazio and others added 17 commits May 3, 2026 20:20
_get_see_also_referrers previously only checked OWL.someValuesFrom
when finding incoming restriction-based seeAlso references. Now also
checks OWL.allValuesFrom and OWL.hasValue, matching the forward
_get_see_also_targets implementation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Compute node_type before is_root so that properties, individuals,
and external nodes are not unnecessarily checked for root status.
is_root is now only set to True when node_type indicates a class.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Change conditional node-type assertions (if node: assert ...) to
assert the node exists first (assert node is not None) so tests
fail when expected nodes are absent. Also tighten
test_see_also_max_per_node from <= 3 to == 3 to verify the limiter
produces exactly the expected count.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
_add_edge now returns a bool indicating whether a new edge was created.
The seeAlso collection loops (outgoing and incoming) only increment
sa_count when the edge is genuinely new, avoiding wasted budget on
duplicate edges for already-visited nodes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add tests for all five ValueError branches in build_entity_graph
parameter validation. Add tests for incoming seeAlso discovery via
allValuesFrom and hasValue OWL restrictions in _get_see_also_referrers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cover both graph route handlers (classes.py get_class_graph and
projects.py get_ontology_class_graph) with success, not-found, and
default-branch tests to close the diff coverage gap.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add tests for: descendant diamond (visited node reuse), max_nodes
truncation in ancestors/seeAlso/incoming referrers, seeAlso ancestor
depth limits, equivalentClass both ordering directions, disjointWith
forward direction, BNode child skipping, and incoming seeAlso budget
exhaustion.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The focus node is always the first node created (empty visited dict,
max_nodes >= 1 from validation) and class existence is already
verified, so _make_node can never return None here. Remove the
unreachable guard.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace list.pop(0) (O(n)) with collections.deque.popleft() (O(1))
for all three BFS queues: ancestor, descendant, and seeAlso ancestor.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ncestor traversal

Two fixes in build_entity_graph:

1. Track all seen URIs in a separate set so _make_node only increments
   total_discovered on first encounter, preventing inflated counts when
   the same URI is rejected multiple times due to max_nodes.

2. Always enqueue seeAlso targets for ancestor BFS regardless of whether
   they were already visited, so targets discovered as descendants still
   get their own ancestor branch explored.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The previous test used a single seeAlso edge to an already-visited
node, which was always a new edge (different edge_type). Rewrite to
have two visited nodes (Person and Animal) both seeAlso the same
target, so the second _add_edge call returns False, verifying the
budget is not wasted on duplicate edges.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…eAlso results

Move the EXTERNAL_NAMESPACES tuple from inside build_entity_graph to a
module-level constant for reuse and discoverability. Deduplicate
_get_see_also_targets and _get_see_also_referrers using ordered sets
to avoid redundant downstream processing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The endpoint had no auth dependency and constructed a fresh OntologyService
per request via get_ontology_service — that service never loads any graph,
so calling the endpoint would raise ValueError("Graph for project ... not
loaded") and 500 the request. The frontend uses the project-scoped route
at /projects/{id}/ontology/classes/graph instead, which is correctly auth'd
via verify_project_access and goes through _ensure_ontology_loaded.

Removes the dead route + its companion test class.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Schema previously typed node_type/edge_type as `str`, but the BFS only ever
emits one of: focus / root / secondary_root / class / individual / property /
external for nodes, and subClassOf / equivalentClass / disjointWith / seeAlso
for edges. Frontend already has GraphNodeType / GraphEdgeType union types in
lib/graph/types.ts; the API contract should match.

Adds GraphNodeType and GraphEdgeType Literal aliases. Pydantic enforces them
at serialization, OpenAPI generates proper enums for clients, and a typo in
the service code (e.g., "focuss") fails type-check instead of shipping
silently.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The original _get_see_also_targets and _get_see_also_referrers were inner
closures inside build_entity_graph (a 330-line function with 9 nested
closures). They could only be exercised through the full BFS path, requiring
fixture graphs and assertions on downstream BFS state.

Extracts both as top-level functions in `entity_graph_helpers.py` so they
can be unit-tested directly. Adds 12 targeted tests covering FOLIO-style
restriction encoding (someValuesFrom / allValuesFrom / hasValue), direct
rdfs:seeAlso triples, dedup across encodings, named-superclass exclusion,
restriction-with-unrelated-property exclusion, and class-only filtering for
referrers.

build_entity_graph still wraps them as thin closures (so the BFS code reads
the same) but the heavy lifting is now reusable + independently tested.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Bundle of related changes to build_entity_graph:

- B2: thread project.label_preferences through. _get_label now passes them
  to select_preferred_label so multilingual projects see graph labels in
  their preferred language. _get_definition derives a preferred language
  from the same preferences and prefers literals matching that tag for
  skos:definition / rdfs:comment, falling back to any literal.
- H2: drop the unreachable `isinstance(include_see_also, bool)` runtime
  check — FastAPI coerces query params before they reach this code, and
  type hints handle direct service callers.
- H3: replace the Python-2-style `total_discovered = [0]` closure
  workaround with `nonlocal total_discovered`.
- H4: replace inner closures with calls to top-level
  get_see_also_targets / get_see_also_referrers from entity_graph_helpers.
- node_type / edge_type closures now return GraphNodeType / GraphEdgeType
  to match the tightened schema.

Drops the test that asserted the removed isinstance check.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…e preference

Codecov flagged 81% patch coverage on PR #104; the gap was concentrated
in build_entity_graph where the label_preferences plumbing and the
new language-aware _get_definition branches had no tests.

Adds six cases:

- label_preferences reaches select_preferred_label so graph node labels
  resolve in the project's preferred language.
- _get_definition prefers preferred-language literals when both en and
  es are present.
- _get_definition falls back to any literal when no preferred-language
  literal exists for the same predicate.
- SKOS definition still wins over rdfs:comment in the same language.
- An unparseable label_preferences entry (no matching property) is
  ignored — definition behavior degrades to first-literal-wins, which
  is the no-preference default.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@damienriehl damienriehl force-pushed the entity-graph-endpoint branch from 3df1cf5 to 1feeea8 Compare May 4, 2026 01:22
@damienriehl
Copy link
Copy Markdown
Contributor Author

Rebased onto current dev (no conflicts). Quality gates clean: 1586 unit tests pass, ruff clean, mypy clean on ontology.py / entity_graph_helpers.py / schemas/graph.py. New HEAD: 1feeea8.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
tests/unit/test_graph_routes.py (1)

46-57: ⚡ Quick win

Add a regression test for label_preferences forwarding.

This fixture always returns label_preferences=None, so the suite would miss the multilingual-label regression that was already called out in the PR discussion. Please add one case with a non-empty preference list and assert build_entity_graph() receives it.

Also applies to: 91-105

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/test_graph_routes.py` around lines 46 - 57, Update the unit test
fixtures so they cover forwarding of label_preferences: adjust the AsyncMock
return for mock_project_svc.get to include at least one non-empty
label_preferences list (e.g., ["en","fr"]) and add assertions in the test that
verify build_entity_graph(...) was called with that label_preferences value
(inspect the mock for build_entity_graph or the call args). Apply the same
change to the duplicate fixture block referenced (the block around lines 91-105)
so both test cases include a non-empty label_preferences and assert it is
forwarded to build_entity_graph.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@ontokit/services/entity_graph_helpers.py`:
- Around line 63-65: The direct reverse branch currently adds any URIRef found
via graph.subjects(RDFS.seeAlso, uri) which disagrees with the restriction
branch that only includes owl:Class subjects; update that loop so it only calls
_add(subj) for subjects that are URIRef and have an RDF.type of OWL.Class (e.g.,
check isinstance(subj, URIRef) and that OWL.Class appears in graph.objects(subj,
RDF.type)) so both branches return the same set of class referrers.

---

Nitpick comments:
In `@tests/unit/test_graph_routes.py`:
- Around line 46-57: Update the unit test fixtures so they cover forwarding of
label_preferences: adjust the AsyncMock return for mock_project_svc.get to
include at least one non-empty label_preferences list (e.g., ["en","fr"]) and
add assertions in the test that verify build_entity_graph(...) was called with
that label_preferences value (inspect the mock for build_entity_graph or the
call args). Apply the same change to the duplicate fixture block referenced (the
block around lines 91-105) so both test cases include a non-empty
label_preferences and assert it is forwarded to build_entity_graph.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0cd91389-44ec-4673-a7e6-6ff75dd68a51

📥 Commits

Reviewing files that changed from the base of the PR and between 6bbfaff and 1feeea8.

📒 Files selected for processing (8)
  • ontokit/api/routes/classes.py
  • ontokit/api/routes/projects.py
  • ontokit/schemas/graph.py
  • ontokit/services/entity_graph_helpers.py
  • ontokit/services/ontology.py
  • tests/unit/test_entity_graph.py
  • tests/unit/test_entity_graph_helpers.py
  • tests/unit/test_graph_routes.py
💤 Files with no reviewable changes (1)
  • ontokit/api/routes/classes.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • ontokit/api/routes/projects.py
  • tests/unit/test_entity_graph.py

Comment on lines +63 to +65
for subj in graph.subjects(RDFS.seeAlso, uri):
if isinstance(subj, URIRef):
_add(subj)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Make direct and restriction referrer filtering agree.

The direct reverse branch adds any URIRef, while the restriction branch only returns owl:Class subjects. That means the graph shape changes based on RDF encoding for the same seeAlso link, and it also contradicts this helper’s docstring. Either filter the direct path to classes too, or relax the contract/tests in both places.

Suggested fix
     # Direct reverse rdfs:seeAlso
     for subj in graph.subjects(RDFS.seeAlso, uri):
-        if isinstance(subj, URIRef):
+        if isinstance(subj, URIRef) and (subj, RDF.type, OWL.Class) in graph:
             _add(subj)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ontokit/services/entity_graph_helpers.py` around lines 63 - 65, The direct
reverse branch currently adds any URIRef found via graph.subjects(RDFS.seeAlso,
uri) which disagrees with the restriction branch that only includes owl:Class
subjects; update that loop so it only calls _add(subj) for subjects that are
URIRef and have an RDF.type of OWL.Class (e.g., check isinstance(subj, URIRef)
and that OWL.Class appears in graph.objects(subj, RDF.type)) so both branches
return the same set of class referrers.

Comment on lines +482 to +507
def _make_node(uri: URIRef, depth: int) -> GraphNode | None:
nonlocal total_discovered
iri = str(uri)
if iri in visited:
return visited[iri]
if iri not in seen:
seen.add(iri)
total_discovered += 1
if len(visited) >= max_nodes:
return None
is_focus = uri == class_uri
node_type = _classify_node(uri, is_focus, depth)
is_root = _is_root_class(uri) if node_type in ("class", "root") else False
node = GraphNode(
id=iri,
label=_get_label(uri),
iri=iri,
definition=_get_definition(uri),
is_focus=is_focus,
is_root=is_root,
depth=depth,
node_type=node_type,
child_count=_child_count(uri),
)
visited[iri] = node
return node
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Truncation never yields an unexplored node.

Once _make_node() starts returning None, the response only flips truncated=True; it never tells the client which branches were cut. That leaves the endpoint without the unexplored lineage type described in the linked issue/PR objectives, so the companion graph UI cannot render a deterministic truncation placeholder.

Also applies to: 653-661

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: server-side BFS entity graph endpoint

2 participants