diff --git a/.changeset/afraid-dryers-try.md b/.changeset/afraid-dryers-try.md
new file mode 100644
index 00000000..f03b4b11
--- /dev/null
+++ b/.changeset/afraid-dryers-try.md
@@ -0,0 +1,10 @@
+---
+'rushdb-dashboard': minor
+'rushdb-core': minor
+'rushdb-docs': minor
+'@rushdb/javascript-sdk': minor
+'@rushdb/mcp-server': minor
+'@rushdb/skills': minor
+---
+
+Add relationship pattern suggestions
diff --git a/docs/docs/concepts/bring-your-own-vectors.mdx b/docs/docs/concepts/bring-your-own-vectors.mdx
new file mode 100644
index 00000000..77e555c2
--- /dev/null
+++ b/docs/docs/concepts/bring-your-own-vectors.mdx
@@ -0,0 +1,143 @@
+---
+sidebar_position: 10
+---
+
+# Bring Your Own Vectors (BYOV)
+
+By default, RushDB generates embeddings server-side when you create an **embedding index** on a string property. With **BYOV** (Bring Your Own Vectors) you compute the embeddings yourself and push them alongside your records. RushDB stores, indexes, and searches them — you stay in full control of the model and the pipeline.
+
+## Why BYOV?
+
+| Scenario | Why BYOV helps |
+| ----------------------------------- | ------------------------------------------------------------------------------------------------------------------------------- |
+| Domain-specific or fine-tuned model | Use any model — a locally fine-tuned LLM, a multimodal encoder, a document-structure model — without configuring it server-side |
+| Compliance / data residency | Raw text never leaves your infrastructure; only the numeric vector is sent to RushDB |
+| Multimodal embeddings | Encode images, audio, or structured documents into vectors before storing them |
+| Existing ML pipeline | Re-use vectors already produced by your data pipeline |
+| Reproducibility | Lock embedding logic to a specific model version; no coupling to server-side model upgrades |
+
+## Managed vs. External
+
+| Aspect | Managed | External (BYOV) |
+| ------------------------------- | ---------------------------------- | --------------------------------------------------------- |
+| `sourceType` | `managed` (default) | `external` |
+| Who generates embeddings | RushDB server | Your application |
+| Search input | Natural-language `query` string | Pre-computed `queryVector` array |
+| `dimensions` required on create | No — uses server default | **Yes** — must match your model |
+| Initial index status | `pending` → `ready` after backfill | `awaiting_vectors` → `ready` once first vector is written |
+| Backfill existing records | Automatic | Manual via `upsertVectors` or inline writes |
+
+Both index types can coexist on the same `(label, propertyName)` pair.
+
+## Write Flows
+
+There are two ways to push vectors into an external index.
+
+### Option A — Inline at write time
+
+Attach vectors directly inside any record create or import call. The index must already exist before vectors are written.
+
+```typescript
+await db.records.create('Article', {
+ title: 'Understanding Graph RAG',
+ body: 'Graphs provide context that plain vector search lacks...',
+ __vectors: [
+ {
+ propertyName: 'body',
+ vector: await embed('Graphs provide context that plain vector search lacks...') // your embedding function, applied to the body text
+ }
+ ]
+})
+```
+
+This is the lowest-latency path: one round-trip creates the record and stores its vector.
+
+### Option B — Upsert after the fact
+
+Push vectors in a separate call. This is useful for seeding an index from an existing dataset or syncing after a batch embedding job.
+
+```typescript
+await db.ai.indexes.upsertVectors(indexId, {
+ items: [
+ { recordId: 'rec_001', vector: [0.1, 0.2, ...] },
+ { recordId: 'rec_002', vector: [0.7, 0.8, ...] }
+ ]
+})
+```
+
+The upsert call is **idempotent** — re-running it with the same `recordId` replaces the stored vector.
+
+## Searching with a Pre-computed Vector
+
+Once vectors are stored, search with `queryVector` instead of `query`:
+
+```typescript
+const results = await db.ai.search({
+ label: 'Article',
+ propertyName: 'body',
+ queryVector: await embed('graph databases and retrieval'), // your embedding function
+ limit: 10
+})
+```
+
+The result shape is the same as a managed semantic search — records ranked by cosine (or euclidean) similarity, with an optional `__score` field.
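
For intuition about what the ranking means, here is a minimal client-side sketch of the two metrics named above. The helper names (`cosineSimilarity`, `euclideanDistance`, `rankByCosine`) and the `score` field are illustrative assumptions, not part of the RushDB SDK, and the server may normalize scores differently.

```typescript
// Illustrative helpers only; none of these names come from the RushDB SDK.

// Both vectors must have the same, non-zero number of dimensions.
function assertSameDimensions(a: number[], b: number[]): void {
  if (a.length === 0 || a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`)
  }
}

// Cosine similarity: 1 for the same direction, 0 for orthogonal vectors, -1 for opposite.
function cosineSimilarity(a: number[], b: number[]): number {
  assertSameDimensions(a, b)
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0
    const y = b[i] ?? 0
    dot += x * y
    normA += x * x
    normB += y * y
  }
  if (normA === 0 || normB === 0) return 0 // a zero vector has no direction
  return dot / Math.sqrt(normA * normB)
}

// Euclidean distance: 0 for identical vectors, larger means further apart.
function euclideanDistance(a: number[], b: number[]): number {
  assertSameDimensions(a, b)
  let sum = 0
  for (let i = 0; i < a.length; i++) {
    const d = (a[i] ?? 0) - (b[i] ?? 0)
    sum += d * d
  }
  return Math.sqrt(sum)
}

// Rank candidates by cosine similarity to the query vector, best match first.
function rankByCosine<T extends { vector: number[] }>(
  queryVector: number[],
  candidates: T[],
  limit: number
): Array<{ item: T; score: number }> {
  return candidates
    .map((item) => ({ item, score: cosineSimilarity(queryVector, item.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit)
}
```

The dimension check matters in practice: a `queryVector` produced by a different model than the stored vectors usually has a different length, and even when the lengths match the scores are not comparable.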
+
+## Lifecycle
+
+```mermaid
+graph LR
+ A["Create external index sourceType: external dimensions: N"] --> B["awaiting_vectors"]
+ B --> C["Write first vector (inline or upsertVectors)"]
+ C --> D["ready"]
+ D --> E["Search with queryVector"]
+```
+
+An external index stays in `awaiting_vectors` until at least one vector has been written. After that it is `ready` and searchable.
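
The lifecycle and the idempotent upsert described earlier can be modelled in a few lines. This is a toy in-memory model for reasoning about client code, not RushDB's implementation: the class name `ExternalIndexModel` is hypothetical, and rejecting a whole batch when one vector has the wrong length is an assumption made for the sketch.

```typescript
// Toy in-memory model of an external (BYOV) index. Hypothetical, for illustration only.
type IndexStatus = 'awaiting_vectors' | 'ready'

interface VectorItem {
  recordId: string
  vector: number[]
}

class ExternalIndexModel {
  readonly dimensions: number
  private readonly vectors = new Map<string, number[]>()

  constructor(dimensions: number) {
    if (!Number.isInteger(dimensions) || dimensions <= 0) {
      throw new Error('dimensions must be a positive integer')
    }
    this.dimensions = dimensions
  }

  // The index is searchable as soon as at least one vector has been written.
  get status(): IndexStatus {
    return this.vectors.size === 0 ? 'awaiting_vectors' : 'ready'
  }

  // Idempotent: writing the same recordId again replaces the stored vector.
  upsertVectors(items: VectorItem[]): void {
    // Validate everything first so a bad item cannot cause a partial write
    // (a design choice of this sketch, not documented server behavior).
    for (const item of items) {
      if (item.vector.length !== this.dimensions) {
        throw new Error(
          `record ${item.recordId}: expected ${this.dimensions} dimensions, got ${item.vector.length}`
        )
      }
    }
    for (const item of items) {
      this.vectors.set(item.recordId, [...item.vector])
    }
  }

  get(recordId: string): number[] | undefined {
    return this.vectors.get(recordId)
  }

  get size(): number {
    return this.vectors.size
  }
}
```

Re-running a seeding job against this model leaves one vector per `recordId`, which is the property that makes retries after a failed batch safe.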
+
+---
+
+## Implementation Reference
+
+
diff --git a/docs/docs/concepts/index.mdx b/docs/docs/concepts/index.mdx
new file mode 100644
index 00000000..a9ff5373
--- /dev/null
+++ b/docs/docs/concepts/index.mdx
@@ -0,0 +1,307 @@
+---
+sidebar_position: 0
+title: Main Concepts
+description: The fundamental ideas behind RushDB — how it stores, links, and queries your data.
+---
+
+import Tabs from '@site/src/components/LanguageTabs'
+import TabItem from '@theme/TabItem'
+
+# Main Concepts
+
+RushDB is a **graph database built for developers and AI agents**. You push raw data — JSON objects, nested trees, flat arrays — and RushDB automatically infers types, decomposes nested structures into linked records, and builds a fully traversable graph. No schema definitions. No migration files. No manual relationship wiring.
+
+This page gives you the mental model you need to work with RushDB effectively. Each section links to a deeper reference when you're ready for it.
+
+---
+
+## Records
+
+A **Record** is the fundamental unit of data in RushDB. Think of it as a typed row in a table, a document in a document store, or a node in a graph — but with seamless relationship traversal built in.
+
+Every record has:
+
+- A **label** — a category name like `User`, `Article`, or `Product`
+- **Properties** — typed key-value fields (`name: "John"`, `score: 4.9`, `active: true`)
+- A system-generated **ID** (UUIDv7) — lexicographically sortable, embeds a creation timestamp
+
+```typescript
+{
+ // ── generated on write ──────────────────────────────────────
+ "__id": "01968aa4-22c1-781a-8e8c-8fe6be6c3fd4", // UUIDv7; embeds creation timestamp
+ "__label": "User", // record type
+ "__proptypes": { // types inferred from your data
+ "name": "string",
+ "email": "string",
+ "rating": "number",
+ "emailConfirmed": "boolean",
+ "registeredAt": "datetime"
+ },
+
+ // ── written by you ──────────────────────────────────────────
+ "name": "John Galt",
+ "email": "john.galt@example.com",
+ "rating": 4.98,
+ "emailConfirmed": true,
+ "registeredAt": "2022-07-19T08:30:28.000Z"
+}
+```
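
Because a UUIDv7 carries its creation time in the first 48 bits (milliseconds since the Unix epoch), the timestamp can be recovered from `__id` on the client. The sketch below assumes the standard UUIDv7 layout; `uuidV7Timestamp` is an illustrative helper, not an SDK function.

```typescript
// Illustrative helper: read the creation time embedded in a UUIDv7 string.
// Assumes the standard layout: 48-bit Unix timestamp in ms, then the version nibble.
function uuidV7Timestamp(id: string): Date {
  const hex = id.replace(/-/g, '').toLowerCase()
  if (!/^[0-9a-f]{32}$/.test(hex)) {
    throw new Error('not a UUID')
  }
  // Character 12 of the undashed form is the version nibble.
  if (hex[12] !== '7') {
    throw new Error('not a UUIDv7')
  }
  // 12 hex characters = 48 bits, which fits safely in a JavaScript number.
  const millis = parseInt(hex.slice(0, 12), 16)
  return new Date(millis)
}
```

The same layout is what makes the IDs sortable: comparing two IDs as strings orders them by creation time first.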
+
+Records never require a predefined schema. RushDB infers the type of every value at write time.
+
+→ [Records in depth](./records.md)
+
+---
+
+## Labels
+
+A **Label** is the type name assigned to a record — `User`, `Car`, `Invoice`. Labels work like table names in a relational database but without the rigidity: there is no table to define ahead of time.
+
+Key characteristics:
+
+- Every record has exactly **one custom label** (required)
+- Labels are **case-sensitive** (`User` ≠ `user`)
+- When importing nested JSON, child objects automatically inherit their label from the **parent key name** — no manual assignment needed
+
+Labels are the primary lens for filtering. For example, a search query can pass `labels: ["User", "Admin"]` to restrict results by label.
+
+→ [Labels in depth](./labels.md)
+
+---
+
+## Properties
+
+A **Property** is a named, typed field shared across all records that carry it. In RushDB's internal graph model, properties are first-class nodes — not just columns — which means the same `color` property node connects every `Car`, `Jacket`, and `House` record that has a color field.
+
+Property nodes hold **only the `(name, type)` pair** — no values. Actual values live exclusively on the Record node. This means Property nodes act as schema/metadata guardrails: they define what fields exist and what types they carry, while records remain the sole source of truth for user-defined data.
+
+This design enables a capability unique to RushDB: **discovering relationships between otherwise unconnected records** by finding records that share the same property value — without duplicating any data in the property layer.
+
+Supported types: `string`, `number`, `boolean`, `datetime`, `null`, and arrays of a consistent type.
+
+→ [Properties in depth](./properties.md)
+
+---
+
+## Relationships
+
+**Relationships** are the edges that connect records in the graph. They are stored as first-class edges rather than as foreign-key references or join tables, so the cost of a traversal depends on the edges it follows rather than on the total size of the dataset.
+
+RushDB manages two relationship types automatically:
+
+| Type | Created by | Purpose |
+| --------------------------------------------- | --------------- | --------------------------------------- |
+| **Default** (`__RUSHDB__RELATION__DEFAULT__`) | Data ingestion | Parent → child links for nested objects |
+| **Value** (`__RUSHDB__RELATION__VALUE__`) | Property system | Property node → Record connections |
+
+You can also define **custom relationships** of any type and direction between any two records — created, updated, and deleted through the API or SDKs at any time.
+
+→ [Relationships in depth](./relationships.mdx)
+
+---
+
+## Data Ingestion
+
+RushDB accepts data the way it arrives — no upfront schema required. The ingestion pipeline does five things automatically:
+
+1. **Parse** — Walk the input with a breadth-first algorithm. Each nested object becomes a separate record.
+2. **Infer types** — Every value is classified as `string`, `number`, `boolean`, `datetime`, or `null`.
+3. **Assign labels** — Top-level records use the label you provide; nested objects derive theirs from the parent key name.
+4. **Wire relationships** — Parent and child records are linked with default relationships.
+5. **Index properties** — Property nodes are created and connected to each record via value relationships.
+
+A single `importJson` call on this input:
+
+```json
+{
+ "company": {
+ "name": "Acme Corp",
+ "founded": "2020-01-15T00:00:00Z",
+ "departments": [
+ { "name": "Engineering", "headcount": 42 },
+ { "name": "Design", "headcount": 12 }
+ ]
+ }
+}
+```
+
+produces **3 linked records** (`Company`, `Department` × 2), all typed and all connected.
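
The parse, infer, label, and link steps listed above can be sketched as a small breadth-first walk. This is a simplified illustration, not RushDB's pipeline: `decompose`, `inferType`, and the ISO-8601 pattern are assumptions, the parent key is used verbatim as the child label (no case or plural normalization), and property nodes are left out.

```typescript
// Simplified sketch of breadth-first decomposition. Hypothetical names throughout.
type Scalar = string | number | boolean | null
type Json = Scalar | Json[] | { [key: string]: Json }
type JsonObject = { [key: string]: Json }
type PropType = 'string' | 'number' | 'boolean' | 'datetime' | 'null'

interface SketchRecord {
  id: number
  label: string
  parentId: number | null // stands in for the default parent -> child relationship
  properties: Record<string, Scalar | Scalar[]>
  proptypes: Record<string, PropType>
}

// Assumption: datetimes are recognized as ISO-8601 strings with a time and an offset.
const ISO_DATETIME = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})$/

function inferType(value: Scalar): PropType {
  if (value === null) return 'null'
  if (typeof value === 'number') return 'number'
  if (typeof value === 'boolean') return 'boolean'
  return ISO_DATETIME.test(value) ? 'datetime' : 'string'
}

function isObject(value: Json): value is JsonObject {
  return typeof value === 'object' && value !== null && !Array.isArray(value)
}

function decompose(rootLabel: string, root: JsonObject): SketchRecord[] {
  const records: SketchRecord[] = []
  const queue: Array<{ label: string; node: JsonObject; parentId: number | null }> = [
    { label: rootLabel, node: root, parentId: null }
  ]
  while (queue.length > 0) {
    const current = queue.shift()! // FIFO order gives the breadth-first walk
    const record: SketchRecord = {
      id: records.length,
      label: current.label,
      parentId: current.parentId,
      properties: {},
      proptypes: {}
    }
    records.push(record)
    for (const [key, value] of Object.entries(current.node)) {
      if (isObject(value)) {
        // A nested object becomes a child record labelled by its key.
        queue.push({ label: key, node: value, parentId: record.id })
      } else if (Array.isArray(value)) {
        // Objects inside an array become child records; scalars stay on the parent.
        for (const child of value.filter(isObject)) {
          queue.push({ label: key, node: child, parentId: record.id })
        }
        const scalars = value.filter((v): v is Scalar => !isObject(v) && !Array.isArray(v))
        if (scalars.length > 0) {
          record.properties[key] = scalars
          record.proptypes[key] = inferType(scalars[0] ?? null)
        }
      } else {
        record.properties[key] = value
        record.proptypes[key] = inferType(value)
      }
    }
  }
  return records
}
```

Calling `decompose('company', input.company)` on the payload above yields one root record followed by one record per department, each carrying the root's `id` as `parentId`.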
+
+For flat, row-shaped data (CSV-like), use `createMany` instead — it skips the BFS decomposition and is the fastest write path.
+
+→ [Data Ingestion in depth](./data-ingestion.mdx)
+
+---
+
+## Search
+
+All queries in RushDB use a single, composable **SearchQuery** structure. The `where` clause supports exact match, range, text contains, negation, boolean logic, and **multi-hop graph traversal** — querying across relationships in the same expression:
+
+
+
+
+```typescript
+const { data } = await db.records.find({
+ labels: ['Article'],
+ where: {
+ status: 'published',
+ Author: {
+ // traverse the relationship to Author records
+ country: 'Germany'
+ }
+ },
+ orderBy: { registeredAt: 'desc' },
+ limit: 20
+})
+```
+
+
+
+
+```python
+results = db.records.find({
+ "labels": ["Article"],
+ "where": {
+ "status": "published",
+ "Author": { # traverse the relationship to Author records
+ "country": "Germany"
+ }
+ },
+ "orderBy": { "registeredAt": "desc" },
+ "limit": 20
+})
+```
+
+
+
+
+```bash
+curl -X POST "https://api.rushdb.com/api/v1/records/search" \
+ -H "Authorization: Bearer $RUSHDB_API_KEY" \
+ -H "Content-Type: application/json" \
+ -d '{
+ "labels": ["Article"],
+ "where": {
+ "status": "published",
+ "Author": { "country": "Germany" }
+ },
+ "orderBy": { "registeredAt": "desc" },
+ "limit": 20
+ }'
+```
+
+
+
+
+→ [Search in depth](./search/introduction.md)
+
+---
+
+## Semantic Search
+
+RushDB lets you search by **meaning**, not just exact values. Create an embedding index on any string property and every record that carries that property becomes searchable by natural-language similarity — while still composing with all standard field filters.
+
+
+
+
+```typescript
+const { data } = await db.ai.search({
+ propertyName: 'description',
+ query: 'space exploration',
+ labels: ['Movie'],
+ where: { genre: 'sci-fi', year: { $gte: 2000 } },
+ limit: 10
+})
+```
+
+
+
+
+```python
+results = db.ai.search({
+ "propertyName": "description",
+ "query": "space exploration",
+ "labels": ["Movie"],
+ "where": { "genre": "sci-fi", "year": { "$gte": 2000 } },
+ "limit": 10
+})
+```
+
+
+
+
+```bash
+curl -X POST "https://api.rushdb.com/api/v1/ai/search" \
+ -H "Authorization: Bearer $RUSHDB_API_KEY" \
+ -H "Content-Type: application/json" \
+ -d '{
+ "propertyName": "description",
+ "query": "space exploration",
+ "labels": ["Movie"],
+ "where": { "genre": "sci-fi", "year": { "$gte": 2000 } },
+ "limit": 10
+ }'
+```
+
+
+
+
+Two modes are available:
+
+- **Managed** — RushDB generates and stores embeddings automatically.
+- **External (BYOV)** — Your application supplies pre-computed vectors; RushDB stores and indexes them.
+
+Both modes use Neo4j's native vector index and compose fully with the structured `where` clause.
+
+→ [Semantic Search in depth](./semantic-search.mdx) · [Bring Your Own Vectors](./bring-your-own-vectors.mdx)
+
+---
+
+## Transactions
+
+A **Transaction** groups any number of read and write operations into a single atomic unit. Either all operations succeed and are committed, or none of them persist.
+
+RushDB transactions are built on Neo4j's native ACID guarantees:
+
+- **Atomicity** — All-or-nothing. No partial writes.
+- **Consistency** — The database moves from one valid state to another.
+- **Isolation** — Concurrent transactions do not interfere with each other.
+- **Durability** — Committed changes survive system failures.
+
+Transactions have a configurable TTL. If the client does not commit before expiry, the transaction is automatically rolled back.
+
+→ [Transactions in depth](./transactions.mdx)
+
+---
+
+## Storage Architecture
+
+RushDB uses a **dual storage model**:
+
+- **Neo4j** — stores all records, properties, and relationships. Graph traversal, vector similarity search, and ACID transactions all run here.
+- **SQL** (SQLite locally, PostgreSQL in production) — stores operational metadata: users, workspaces, projects, and API tokens.
+
+This separation keeps account management in a familiar relational layer while keeping all knowledge graph operations in Neo4j, where they are most efficient.
+
+→ [Storage in depth](./storage.md)
+
+---
+
+## RushDB as Agent Memory
+
+RushDB is designed to serve as **structured long-term memory for AI agents**. It provides three memory layers out of the box:
+
+| Layer | What it stores | RushDB primitive |
+| -------------- | -------------------------------------------------------- | ------------------------------ |
+| **Episodic** | Facts, events, entities, and their connections | Records + Relationships |
+| **Semantic** | Meaning encoded as dense vectors | Vector Properties + AI Indexes |
+| **Structural** | Schema: what labels, properties, and relationships exist | Ontology API |
+
+A typical agent retrieval flow:
+
+1. Call the **Ontology API** to discover what labels and properties exist in the current project.
+2. Use a structured **`where` filter** to narrow candidates by known field values.
+3. Apply a **`vector.similarity`** aggregation to re-rank by semantic relevance.
+4. Return scored, fully structured records — not raw text chunks — to the agent.
+
+→ [Agent Memory Model in depth](./agent-memory-model.md) · [Ontology & Schema Discovery](./ontology-schema-discovery.md)
diff --git a/docs/docs/concepts/pricing.md b/docs/docs/concepts/pricing.md
deleted file mode 100644
index 8b74182a..00000000
--- a/docs/docs/concepts/pricing.md
+++ /dev/null
@@ -1,95 +0,0 @@
----
-sidebar_position: 9
----
-
-# Pricing
-
-## Simple, Predictable Pricing
-
-RushDB pricing is based on [Knowledge Units (KU)](./knowledge-units.md) — a single unit that represents the structured knowledge created and maintained from your data. No infrastructure tiers, no node counts, no storage pricing.
-
-```
-You pay for knowledge created. Nothing else.
-```
-
-## Plans
-
-### Free
-
-- **100,000 KU / month** included
-- Up to 2 projects
-- Self-hosted support
-- Bring Your Own Cloud (BYOC) — connect to your own Neo4j or Aura instance
-- Community support
-- No credit card required
-
-Perfect for prototypes, side projects, and getting started.
-
-### Pro — $29/month
-
-- **10,000,000 KU / month** included
-- Overage at **$3 per additional million KU** — no hard stop, apps keep running
-- Unlimited projects
-- Priority support
-- Team members (up to 3, then $10/member)
-- Bring Your Own Cloud (BYOC) — connect to your own Neo4j or Aura instance
-
-Ideal for production applications and growing teams. Predictable base cost, pay-as-you-go beyond the included allowance.
-
-### Scale — from $99/month
-
-- **Usage-based** — $99 platform fee + **$2 per million KU** consumed
-- No included KU bundle — cheaper per-KU rate than Pro at volume
-- SLA guarantee
-- Advanced support
-- Unlimited team members
-- Bring Your Own Cloud (BYOC) — connect to your own Neo4j or Aura instance
-
-For high-volume or highly variable workloads where you want the lowest per-KU rate without worrying about tiers. The $2/M KU rate on Scale is 33% cheaper than Pro's overage rate.
-
-### Enterprise
-
-- **Platform license** — flat fee, unlimited KU
-- Bring Your Own Cloud (BYOC)
-- Embedded / OEM use
-- Dedicated support and SLA
-- Custom contract
-
-For organisations embedding RushDB into their products or needing full data sovereignty.
-
-## Estimating Your KU Usage
-
-Use this formula to estimate your monthly KU consumption:
-
-```
-estimated KU ≈ records_per_day × 30 × avg_fields_per_record × nesting_factor
-```
-
-**Example:**
-- 1,000 records/day
-- 10 fields per record on average
-- Flat structure (nesting factor ≈ 1.0)
-
-```
-1,000 × 30 × 10 × 1.0 = 300,000 KU/month → Pro plan
-```
-
-The interactive KU Calculator on the [pricing page](https://rushdb.com/pricing) can help you get a more precise estimate.
-
-## Self-Hosted
-
-Running RushDB on your own infrastructure? Self-hosted mode is **free and unlimited** — no KU limits, no billing. See the [self-hosting guide](../get-started/quick-tutorial) to get started.
-
-## FAQ
-
-**Can I exceed my plan's KU limit?**
-On the Free plan, writes are blocked when the limit is reached — reads always continue. On Pro, overage is billed at $3 per million KU beyond the 10M included. On Scale there is no hard limit — you pay $2 per million KU consumed on top of the $99/month base.
-
-**Does deleting data reduce my KU usage?**
-KU from creation operations is never reversed. However, once data is deleted, its ongoing stored footprint stops contributing to KU from that point forward.
-
-**Do read operations consume KU?**
-Standard read and search operations do not consume KU. Heavy analytical operations (multi-hop traversals, vector similarity search at scale) may consume a small amount of KU.
-
-**Is there a free trial for paid plans?**
-Yes — start on the Free plan with no credit card. Upgrade at any time and your remaining free KU carries over for the rest of the billing period.
diff --git a/docs/docs/concepts/relationships.mdx b/docs/docs/concepts/relationships.mdx
index 388a0812..93db5dc3 100644
--- a/docs/docs/concepts/relationships.mdx
+++ b/docs/docs/concepts/relationships.mdx
@@ -1,6 +1,7 @@
---
sidebar_position: 5
---
+
# Relationships
In RushDB, relationships are the connections that link Records together, creating a powerful graph structure that represents both the data itself and how different pieces of data relate to one another. These connections enable intuitive data modeling that aligns with how we naturally think about information and its associations.
@@ -20,6 +21,7 @@ graph TD
```
These relationships are automatically created during the data import process when nested objects are detected. Learn more at [REST API - Import Data](../rest-api/records/import-data) or through the language-specific SDKs:
+
- [TypeScript SDK](../typescript-sdk/records/import-data)
- [Python SDK](../python-sdk/records/import-data)
@@ -42,16 +44,18 @@ This structure allows for finding connections between otherwise unrelated record
Beyond the built-in relationships that RushDB creates automatically during data import, users can define and reconstruct relationships manually in any direction and of any type needed. This flexibility enables sophisticated data modeling that precisely captures your domain's relationship semantics.
You can create, modify, and delete relationships programmatically using the [REST API](../rest-api/relationships) or through the language-specific SDKs:
+
- [TypeScript SDK](../typescript-sdk/relationships)
- [Python SDK](../python-sdk/relationships)
This capability allows you to:
+
- Define domain-specific relationship types (e.g., "BELONGS_TO", "MANAGES", "DEPENDS_ON")
- Create relationships between previously unconnected records
- Build complex graph structures that evolve over time
- Restructure relationships as your data model changes
-Bulk creation and many-to-many caution
+### Bulk creation and many-to-many caution
RushDB supports a bulk relationship creation endpoint (`POST /relationships/create-many`) that can either:
@@ -60,6 +64,101 @@ RushDB supports a bulk relationship creation endpoint (`POST /relationships/crea
The many-to-many mode is opt-in and guarded: the request must set a flag (e.g. `manyToMany`) and provide non-empty `where` filters for both sides; otherwise the server requires keys to perform a safe equality join. This prevents accidental, unbounded cartesian products which can be expensive to execute and store.
+## Suggested Relationships in the Dashboard
+
+Imported data is often structurally useful but semantically incomplete.
+
+For nested JSON, RushDB can already see the parent-child structure and creates default relationships automatically. For flat data imported from systems like MongoDB, PostgreSQL exports, CSV, or external APIs, related records may arrive as separate collections with reference fields such as `userId`, `orderId`, or `addressRef`. In both cases, the ontology can describe the labels, properties, and existing edges, but it may not know the domain-specific relationship names you want to keep long term.
+
+The **Relationships** tab in the Dashboard helps close that gap. RushDB analyzes the project ontology after writes, suggests relationship patterns, and lets you approve only the patterns that match your model.
+
+> Suggested relationships require LLM analysis to be configured for the project. Without it, RushDB still creates default relationships for nested imports and supports manual relationship creation through the API and SDKs.
+
+```mermaid
+sequenceDiagram
+ participant App as App / Import
+ participant RushDB
+ participant Ontology as Ontology API
+ participant LLM as Relationship Analyzer
+ participant UI as Dashboard
+
+ App->>RushDB: Write or import records
+ RushDB-->>App: Write response returned
+ RushDB->>Ontology: Refresh schema snapshot
+ RushDB->>LLM: Analyze labels, fields, relationships
+ LLM-->>RushDB: Suggested patterns
+ UI->>RushDB: User approves a pattern
+ RushDB->>RushDB: Create or retype matching relationships
+ RushDB->>RushDB: Apply approved pattern to future writes
+```
+
+### Two kinds of suggestions
+
+RushDB distinguishes between two common cases.
+
+| Suggestion kind | When it appears | What approval does |
+| ------------------- | ------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------- |
+| **Match fields** | Two labels have reference-like fields, such as `ORDER.userId` and `USER.userId` | Creates relationships between matching records now and applies the same pattern to future writes |
+| **Rename existing** | Nested JSON already created default relationships, but the edge type is generic | Replaces matching default relationships with a semantic type, such as `DEPARTMENT` -> `HAS_PROJECT` -> `PROJECT` |
+
+### Example: flat imported collections
+
+If users and orders are imported separately, no graph edge exists yet:
+
+```json
+{
+ "USER": [
+ { "userId": "usr_001", "name": "Ava Chen" },
+ { "userId": "usr_002", "name": "Noah Smith" }
+ ],
+ "ORDER": [
+ { "orderId": "ord_101", "userId": "usr_001", "total": 129.5 },
+ { "orderId": "ord_102", "userId": "usr_002", "total": 48.0 }
+ ]
+}
+```
+
+The analyzer can suggest a **Match fields** pattern:
+
+```mermaid
+graph LR
+ U["USER userId = usr_001"] -->|"PLACED_ORDER"| O["ORDER userId = usr_001"]
+```
+
+Approving the pattern creates matching relationships for existing records and keeps applying the same rule as more orders arrive.
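
Conceptually, applying a **Match fields** pattern is an equality join on the reference field. The sketch below is an in-memory illustration under assumed shapes (`FlatRecord`, `MatchFieldsPattern`, `applyMatchFields` are hypothetical names); it is not how RushDB executes the pattern.

```typescript
// In-memory illustration of a "Match fields" pattern. All names are hypothetical.
interface FlatRecord {
  id: string
  label: string
  fields: Record<string, unknown>
}

interface MatchFieldsPattern {
  sourceLabel: string
  sourceField: string
  targetLabel: string
  targetField: string
  type: string
}

interface Edge {
  from: string
  to: string
  type: string
}

function applyMatchFields(records: FlatRecord[], pattern: MatchFieldsPattern): Edge[] {
  // Index source records by their key value so the join is a lookup,
  // not a comparison of every source against every target.
  const sourcesByKey = new Map<unknown, string[]>()
  for (const record of records) {
    if (record.label !== pattern.sourceLabel) continue
    const key = record.fields[pattern.sourceField]
    if (key === undefined || key === null) continue // no key, nothing to match on
    const ids = sourcesByKey.get(key) ?? []
    ids.push(record.id)
    sourcesByKey.set(key, ids)
  }

  const edges: Edge[] = []
  for (const record of records) {
    if (record.label !== pattern.targetLabel) continue
    const key = record.fields[pattern.targetField]
    for (const fromId of sourcesByKey.get(key) ?? []) {
      edges.push({ from: fromId, to: record.id, type: pattern.type })
    }
  }
  return edges
}
```

With the users and orders above and a pattern of `USER.userId` to `ORDER.userId` typed `PLACED_ORDER`, each order links to exactly one user. Running the same function over newly arrived orders is the in-memory analogue of applying the pattern to future writes.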
+
+### Example: nested import with generic edges
+
+Nested payloads already contain structure:
+
+```json
+{
+ "DEPARTMENT": [
+ {
+ "name": "Engineering",
+ "PROJECT": [{ "name": "Search relevance" }, { "name": "Ontology explorer" }]
+ }
+ ]
+}
+```
+
+RushDB imports this as `DEPARTMENT` connected to `PROJECT` through default relationships. The analyzer should not invent a field match like `DEPARTMENT.name` -> `PROJECT.name`; those fields are descriptive, not references. Instead, it can suggest a **Rename existing** pattern:
+
+```mermaid
+graph LR
+ BeforeA["DEPARTMENT"] -->|"__RUSHDB__RELATION__DEFAULT__"| BeforeB["PROJECT"]
+ AfterA["DEPARTMENT"] -->|"HAS_PROJECT"| AfterB["PROJECT"]
+```
+
+Approving this kind of pattern retypes the existing default relationships into semantic relationships. Future nested imports with the same structure can be upgraded the same way.
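
A **Rename existing** pattern touches only default edges between the two named labels. The following sketch shows that selection rule on plain objects; `LabeledEdge`, `RenamePattern`, and `applyRename` are hypothetical names used for illustration, not RushDB APIs.

```typescript
// Illustration of a "Rename existing" pattern on plain edge objects. Hypothetical names.
const DEFAULT_RELATION = '__RUSHDB__RELATION__DEFAULT__'

interface LabeledEdge {
  fromId: string
  fromLabel: string
  toId: string
  toLabel: string
  type: string
}

interface RenamePattern {
  fromLabel: string
  toLabel: string
  newType: string
}

// Returns a new array: matching default edges get the semantic type,
// every other edge is passed through unchanged.
function applyRename(edges: LabeledEdge[], pattern: RenamePattern): LabeledEdge[] {
  return edges.map((edge) =>
    edge.type === DEFAULT_RELATION &&
    edge.fromLabel === pattern.fromLabel &&
    edge.toLabel === pattern.toLabel
      ? { ...edge, type: pattern.newType }
      : edge
  )
}
```

Matching on the default type as well as on both labels keeps custom relationships between the same labels untouched, so approving a rename cannot overwrite an edge someone created by hand.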
+
+### Why this helps
+
+- **Less manual stitching:** You do not need to write one-off relationship scripts for every imported collection.
+- **Safer graph evolution:** Suggestions stay in draft form until approved, and ignored suggestions can be removed later if you want them reconsidered.
+- **Better ontology for agents:** Semantic relationship types make schema discovery more useful. An agent can reason over `USER -> PLACED_ORDER -> ORDER` more reliably than over a generic default edge.
+- **Lower write latency:** Relationship discovery and application run as side effects after writes, so record writes can return without waiting for graph enrichment to finish.
+
## Nested Data Example
Consider this JSON structure:
@@ -84,6 +183,7 @@ Consider this JSON structure:
```
When imported into RushDB, this is transformed into a graph structure with:
+
- 3 Records (Person, Contact, and Address)
- 8 Properties (Name, Age, Email, Phone, Street, City, State, ZipCode)
- Default relationships connecting Person to Contact and Person to Address
@@ -130,17 +230,32 @@ This relationship model provides several advantages:
Each interface covers creating, querying, and managing relationships — pick the one that fits your stack:
-
diff --git a/docs/docs/concepts/semantic-search.mdx b/docs/docs/concepts/semantic-search.mdx
index e7480dad..2155d545 100644
--- a/docs/docs/concepts/semantic-search.mdx
+++ b/docs/docs/concepts/semantic-search.mdx
@@ -2,6 +2,9 @@
sidebar_position: 9
---
+import Tabs from '@site/src/components/LanguageTabs'
+import TabItem from '@theme/TabItem'
+
# Semantic Search
RushDB lets you search records by **meaning**, not just exact field values. Create an embedding index on any string property and every record that carries that property becomes searchable by natural-language similarity — while still supporting all the usual field filters, pagination, and graph traversal.
@@ -23,25 +26,75 @@ graph LR
## Managed vs. External Indexes
-| Aspect | Managed | External (BYOV) |
-|---|---|---|
-| Embeddings generated by | RushDB (server-side) | Your application |
-| Write flow | Automatic on record create/update | Supply vectors via `upsertVectors` or inline on write |
-| Search input | Natural-language `query` string | Pre-computed `queryVector` array |
-| Model control | RushDB-managed model | Any model, any dimension |
+| Aspect | Managed | External (BYOV) |
+| ----------------------- | --------------------------------- | ----------------------------------------------------- |
+| Embeddings generated by | RushDB (server-side) | Your application |
+| Write flow | Automatic on record create/update | Supply vectors via `upsertVectors` or inline on write |
+| Search input | Natural-language `query` string | Pre-computed `queryVector` array |
+| Model control | RushDB-managed model | Any model, any dimension |
Both types store vectors on the value relationship between the property node and the record node, using Neo4j's native vector index for fast retrieval.
+## Why Vectors Live on Property–Record Edges
+
+RushDB stores embeddings **on the edge** between the Property node and the Record node, not on the record itself. This is a deliberate design choice with several concrete benefits:
+
+- **One index spans every label.** A Property node (e.g. `description:string`) is shared across all records that carry that field, regardless of their label. A vector index on the Property→Record edge therefore covers `Article`, `Product`, `Movie`, and any other label with a `description` field — from a single index, with no duplication.
+- **Records stay clean.** Record nodes contain only typed scalar properties. Embedding arrays — which can be hundreds or thousands of floats — live on the relationship layer. This keeps record storage lean and read performance high for non-vector queries.
+- **Graph traversal and similarity compose naturally.** Because the vector is a property of a graph edge, Neo4j can traverse relationships and apply cosine scoring in the same query step. Multi-hop traversal followed by semantic re-ranking requires no extra join or post-processing pass.
+- **Automatic candidate scoping.** Only records actually connected to the indexed Property node are ever considered for similarity ranking. Tenant isolation, label filtering, and `where` prefiltering all reduce the candidate set through normal graph traversal — no metadata tricks required.
+
## Combining with Field Filters
Semantic search is not an either/or — it composes with RushDB's structured query capabilities. Pass a `where` clause to pre-filter candidates before similarity ranking:
+
+
+
+```typescript
+const { data } = await db.ai.search({
+ propertyName: 'description',
+ query: 'space exploration',
+ labels: ['Movie'],
+ where: { genre: 'sci-fi', year: { $gte: 2000 } },
+ limit: 10
+})
+// results ranked by cosine similarity, scoped to sci-fi films from 2000+
```
-Search "space exploration"
- WHERE genre = "sci-fi" AND year >= 2000
- LIMIT 10
+
+
+
+
+```python
+results = db.ai.search({
+ "propertyName": "description",
+ "query": "space exploration",
+ "labels": ["Movie"],
+ "where": { "genre": "sci-fi", "year": { "$gte": 2000 } },
+ "limit": 10
+})
+# results.data ranked by cosine similarity, scoped to sci-fi films from 2000+
+```
+
+
+
+
+```bash
+curl -X POST "https://api.rushdb.com/api/v1/ai/search" \
+ -H "Authorization: Bearer $RUSHDB_API_KEY" \
+ -H "Content-Type: application/json" \
+ -d '{
+ "propertyName": "description",
+ "query": "space exploration",
+ "labels": ["Movie"],
+ "where": { "genre": "sci-fi", "year": { "$gte": 2000 } },
+ "limit": 10
+ }'
```
+
+
+
This narrows the vector search to only matching records, keeping results precise and fast.
## Two Ways to Search Semantically
@@ -62,21 +115,21 @@ For advanced use cases, add a `vector.similarity.cosine` or `vector.similarity.e
## Index Lifecycle
-| State | Description |
-|---|---|
-| `pending` | Index created, backfill not yet started. |
-| `indexing` | Backfill in progress — existing records are being embedded. |
-| `ready` | All records indexed. New records are embedded on write automatically. |
+| State | Description |
+| ---------- | --------------------------------------------------------------------- |
+| `pending` | Index created, backfill not yet started. |
+| `indexing` | Backfill in progress — existing records are being embedded. |
+| `ready` | All records indexed. New records are embedded on write automatically. |
You can check index status at any time and list all indexes for a project.
## When to Use Semantic Search
-| Scenario | Approach |
-|---|---|
-| User knows the exact value | Structured `where` filter |
-| User describes what they want in natural language | `db.ai.search()` with `query` |
-| Combine meaning + exact constraints | `db.ai.search()` with `where` pre-filter |
+| Scenario | Approach |
+| -------------------------------------------------------- | -------------------------------------------------------- |
+| User knows the exact value | Structured `where` filter |
+| User describes what they want in natural language | `db.ai.search()` with `query` |
+| Combine meaning + exact constraints | `db.ai.search()` with `where` pre-filter |
| Need groupBy, collect, or multi-hop alongside similarity | `db.records.find()` with `vector.similarity` aggregation |
→ See also [Agent Memory Model](./agent-memory-model.md) for how semantic search fits into the three-layer retrieval stack.
@@ -87,17 +140,32 @@ You can check index status at any time and list all indexes for a project.
Each interface covers search, indexing, and BYOV — pick the one that fits your stack:
-
+ RushDB analyzes your ontology after writes and suggests relationship patterns between records
+ that look connected. Some patterns match reference fields; others rename imported default
+ relationships when nested data already created the structure. Approving a pattern applies it now
+ and keeps applying it to future writes.
+