35 changes: 33 additions & 2 deletions reference/analytics/overview.md
@@ -196,9 +196,40 @@ Applications can record custom metrics using the `server.recordAnalytics()` API.

## Analytics Configuration

The `analytics.aggregatePeriod` configuration option controls how frequently aggregate summaries are written. See [Configuration Overview](../configuration/overview.md) for details.
The `analytics` configuration section controls aggregation, replication, and storage-volume sampling. All options are optional.

```yaml
analytics:
  aggregatePeriod: 60
  storageInterval: 10
  replicate: false
  logging:
    level: info
```

### `analytics.aggregatePeriod`

Type: `number` (seconds)  •  Default: `60`
Suggested change
Type: `number` (seconds)    Default: `60`
Type: `number` (seconds) - Default: `60`

Not going to block on this, but this whole double space and special dot character (at least it's not on my keyboard) seems extra. Most of the docs use regular characters like `-` here.


How frequently Harper aggregates raw per-second entries into the `hdb_analytics` summary table. Lowering this gives higher-resolution aggregate data at the cost of more frequent aggregation work and more rows in `hdb_analytics`.

### `analytics.storageInterval`

Type: `number`  •  Default: `10`

Number of aggregation cycles between disk-volume measurements. With the default `aggregatePeriod` of `60` and `storageInterval` of `10`, Harper records `database-size`, `table-size`, and `storage-volume` metrics every 10 minutes. Set to `0` to disable storage-volume sampling entirely — useful when running on systems where `statfs` is expensive or unavailable (e.g., some FUSE mounts).
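
The sampling cadence described above is the product of the two settings. The sketch below works through that arithmetic in shell; `storage_sample_seconds` is a hypothetical helper name used only for illustration and is not part of Harper.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of Harper): seconds between storage-volume
# samples, given analytics.aggregatePeriod (seconds) and
# analytics.storageInterval (aggregation cycles).
storage_sample_seconds() {
  local aggregate_period="$1" storage_interval="$2"
  if [ "$storage_interval" -eq 0 ]; then
    echo "disabled"   # storageInterval: 0 turns sampling off
    return 0
  fi
  echo $(( aggregate_period * storage_interval ))
}

storage_sample_seconds 60 10   # prints 600 (10 minutes, the defaults)
storage_sample_seconds 30 4    # prints 120
storage_sample_seconds 60 0    # prints disabled
```

Lowering `aggregatePeriod` without raising `storageInterval` therefore also shortens the time between disk-volume measurements.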

### `analytics.replicate`

Type: `boolean`  •  Default: `false`

When enabled, aggregate analytics entries are replicated across the cluster so a single peer can answer aggregate queries for the whole topology. Raw per-thread entries (`hdb_raw_analytics`) are always node-local. Enable when running a centralized analytics consumer; leave disabled in large clusters to avoid replication overhead for high-cardinality metrics.

### `analytics.logging`

Type: `object`

Per-component analytics logging can be configured via `analytics.logging`. See [Logging Configuration](../logging/configuration.md) for details.
Per-subsystem logging override for the analytics writer. See [Logging Configuration — `analytics.logging`](../logging/configuration.md#analyticslogging).

## Related

111 changes: 111 additions & 0 deletions reference/configuration/debugging.md
@@ -0,0 +1,111 @@
---
title: Worker Thread Debugging
---

# Worker Thread Debugging

Harper runs as a main thread plus a pool of worker threads (configurable via `threads.count`). The `threads.debug` option exposes the Node.js inspector on each thread so you can attach Chrome DevTools, VS Code, or any [Chrome DevTools Protocol](https://chromedevtools.github.io/devtools-protocol/) (CDP) client to step through component code, inspect heap state, or capture CPU profiles.

For the worker thread architecture, see [Architecture Overview](../database/overview.md#architecture-overview).

## Enabling the Debugger

The simplest form starts the inspector on the main thread at the default port (`9229`):

```yaml
threads:
  debug: true
```

For per-thread debugging, expand to an object:

```yaml
threads:
  debug:
    startingPort: 9229
    host: 127.0.0.1
    waitForDebugger: false
```

| Property | Type | Default | Description |
| ----------------- | --------- | ------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `port` | `integer` | `9229` | Port for the main thread inspector. Use this when you only need to debug startup or main-thread behavior. |
| `startingPort` | `integer` | _(none)_ | When set, each worker thread gets a sequential inspector port starting from this value. Thread N uses port `startingPort + N`. The main thread keeps `port`. |
| `host`            | `string`  | `127.0.0.1`   | Interface the inspector binds to. Leave on loopback in production and reach it through an SSH tunnel; bind to `0.0.0.0` only on a trusted network.                           |
| `waitForDebugger` | `boolean` | `false` | Pause each thread at startup until a debugger attaches. Useful for catching bugs that occur during component initialization. |

## Attaching Chrome DevTools

1. Set `threads.debug.startingPort: 9230` (so worker threads use 9230, 9231, … and the main thread keeps 9229).
2. Start Harper.
3. Open `chrome://inspect` in Chrome.
4. Click **Configure** and add `localhost:9229`, `localhost:9230`, … for each thread you want to inspect.
5. The threads appear under **Remote Target**. Click **inspect** to open DevTools for that thread.
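
The list of targets to enter in step 4 can be generated rather than typed by hand. This is a minimal sketch, assuming the first worker listens on `startingPort` itself, as in step 1; `inspector_targets` is a hypothetical helper, not a Harper command.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of Harper): print one chrome://inspect target
# per line for the main thread and the first N worker threads.
# Assumption: worker ports start at startingPort and increase by one.
inspector_targets() {
  local main_port="$1" starting_port="$2" worker_count="$3"
  echo "localhost:${main_port}"
  local i=0
  while [ "$i" -lt "$worker_count" ]; do
    echo "localhost:$(( starting_port + i ))"
    i=$(( i + 1 ))
  done
}

# Main thread on 9229, two workers starting at 9230.
inspector_targets 9229 9230 2
```

If your deployment numbers worker ports differently, adjust the offset in the loop to match what you observe.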

## Attaching VS Code

Add an entry per thread to `.vscode/launch.json`. The example below attaches to the main thread and two workers:

```json
{
  "version": "0.2.0",
  "configurations": [
    { "type": "node", "request": "attach", "name": "Harper main", "port": 9229 },
    { "type": "node", "request": "attach", "name": "Harper worker 1", "port": 9230 },
    { "type": "node", "request": "attach", "name": "Harper worker 2", "port": 9231 }
  ],
  "compounds": [
    { "name": "Harper (all threads)", "configurations": ["Harper main", "Harper worker 1", "Harper worker 2"] }
  ]
}
```

Run the compound configuration to attach to every thread at once.

## Debugging Remote Instances

Inspector ports must remain on `127.0.0.1` in production. To reach them from a developer workstation, tunnel each port over SSH:

```bash
ssh -L 9229:127.0.0.1:9229 \
    -L 9230:127.0.0.1:9230 \
    -L 9231:127.0.0.1:9231 \
    harper.example.com
```

Then point Chrome DevTools or VS Code at `localhost:9229–9231` as if they were local.
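
With many worker threads, writing one `-L` flag per port by hand gets tedious. The sketch below builds the same kind of command for a contiguous port range; `ssh_tunnel_command` is a hypothetical helper introduced here for illustration, and it only prints the command so you can review it before running it.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of Harper): print an ssh command that forwards
# every port from first_port to last_port on the remote loopback interface to
# the same port on the local machine.
ssh_tunnel_command() {
  local host="$1" first_port="$2" last_port="$3"
  local cmd="ssh" port="$first_port"
  while [ "$port" -le "$last_port" ]; do
    cmd="$cmd -L ${port}:127.0.0.1:${port}"
    port=$(( port + 1 ))
  done
  echo "$cmd $host"
}

# Prints the equivalent of the three-port command shown above.
ssh_tunnel_command harper.example.com 9229 9231
```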

## Waiting for the Debugger at Startup

When a bug only reproduces during component initialization, set `waitForDebugger: true`. Each thread starts paused on its first line until a debugger attaches and resumes execution. It is also the most reliable way to debug an initialization sequence that finishes before you can attach manually.

```yaml
threads:
  debug:
    startingPort: 9230
    waitForDebugger: true
```

**Health checks will fail** while the threads are paused, so a load balancer may mark the instance unhealthy. Enable `waitForDebugger` only in dedicated debug environments.

## Heap Snapshots Near the Limit

### `threads.heapSnapshotNearLimit`

Type: `boolean`  •  Default: `false`

When the V8 heap approaches the limit set by `threads.maxHeapMemory`, the thread writes a `.heapsnapshot` file to the Harper root directory before the process exits with an out-of-memory error. The snapshot can be loaded into Chrome DevTools (Memory tab → **Load profile**) to identify retained objects responsible for the leak.

```yaml
threads:
  maxHeapMemory: 1024
  heapSnapshotNearLimit: true
```

Snapshots can be large (often a sizable fraction of the heap limit) and writing them blocks the thread briefly — leave disabled for normal operation and enable only when investigating an out-of-memory pattern.
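
After an out-of-memory exit, a quick way to see whether a snapshot was written is to list `.heapsnapshot` files in the root directory. This is a minimal sketch: `list_heap_snapshots` is a hypothetical helper, and `HARPER_ROOT` is an assumed environment variable standing in for wherever your Harper root directory lives.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of Harper): list .heapsnapshot files directly
# under the given directory, one path per line, sorted by name.
list_heap_snapshots() {
  local root_dir="$1"
  find "$root_dir" -maxdepth 1 -type f -name '*.heapsnapshot' | sort
}

# HARPER_ROOT is an assumption for this example; it falls back to the
# current directory when unset.
list_heap_snapshots "${HARPER_ROOT:-.}"
```

Copy the listed file to your workstation before loading it into DevTools, since the snapshot can be close to the size of the heap itself.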

## Related

- [Configuration Options — `threads`](./options.md#threads) — full thread configuration reference
- [Architecture Overview](../database/overview.md#architecture-overview) — how worker threads fit into Harper
- [Node.js Inspector documentation](https://nodejs.org/en/learn/getting-started/debugging) — debugger protocol details
12 changes: 8 additions & 4 deletions reference/configuration/options.md
@@ -63,8 +63,8 @@ threads:

- `count` — Number of worker threads; _Default_: CPU count minus one
- `maxHeapMemory` — Heap limit per thread (MB)
- `heapSnapshotNearLimit` — Take heap snapshot when approaching limit
- `debug` — Enable debugging; sub-options: `port`, `startingPort`, `host`, `waitForDebugger`
- `heapSnapshotNearLimit` — Write a `.heapsnapshot` file when a thread nears its heap limit (loadable in Chrome DevTools Memory tab); _Default_: `false`. See [Worker Thread Debugging](./debugging.md#heap-snapshots-near-the-limit)
- `debug` — Enable Node.js inspector; sub-options: `port`, `startingPort`, `host`, `waitForDebugger`. See [Worker Thread Debugging](./debugging.md)

---

@@ -212,7 +212,7 @@ replication:

## `storage`

Database storage configuration. See [Database Overview](../database/overview.md) and [Compaction](../database/compaction.md).
Database storage configuration. See [Storage Tuning](../database/storage-tuning.md) for guidance on tuning these options for production workloads, [Database Overview](../database/overview.md) for general database concepts, and [Compaction](../database/compaction.md) for reclaiming space inside existing files.

```yaml
storage:
@@ -235,7 +235,9 @@ storage:
- `path` — Database files directory; _Default_: `<rootPath>/database`
- `blobPaths` — Blob storage directory or directories; _Default_: `<rootPath>/blobs` (Added in: v4.5.0)
- `pageSize` — Database page size (bytes); _Default_: OS default
- `reclamation.threshold` / `reclamation.interval` / `reclamation.evictionFactor` — Background storage reclamation settings (Added in: v4.5.0)
- `reclamation.threshold` — Free-space ratio below which reclamation begins evicting from caching tables; _Default_: `0.4` (Added in: v4.5.0)
- `reclamation.interval` — Free-space check interval; _Default_: `1h`
- `reclamation.evictionFactor` — Heuristic factor for early eviction under disk pressure; _Default_: `100000`. See [Storage Tuning — Reclamation](../database/storage-tuning.md#storage-reclamation)

---

@@ -270,7 +272,9 @@ analytics:
```

- `aggregatePeriod` — Aggregation interval (seconds); _Default_: `60` (Added in: v4.5.0)
- `storageInterval` — Aggregation cycles between storage-volume measurements (`0` disables); _Default_: `10`
- `replicate` — Replicate analytics data across cluster; _Default_: `false`
- `logging` — Per-subsystem logger override for analytics writes. See [Logging Configuration](../logging/configuration.md#analyticslogging)

---

7 changes: 7 additions & 0 deletions reference/configuration/overview.md
@@ -227,3 +227,10 @@ Safe mode skips:
- Loading package-based extensions defined in component configs

Built-in plugins (REST, HTTP, operations API, etc.) and the core database are unaffected.

## See Also

- [Configuration Options](./options.md) — complete reference for every `harper-config.yaml` key
- [Worker Thread Debugging](./debugging.md) — attaching Node.js inspector to Harper's worker threads
- [Storage Tuning](../database/storage-tuning.md) — production tuning of `storage.*` options
- [Logging Configuration](../logging/configuration.md) — main and per-subsystem log settings
1 change: 1 addition & 0 deletions reference/database/overview.md
@@ -110,6 +110,7 @@ For deeper coverage of each database feature, see the dedicated pages in this se
- **[API](./api.md)** — The `tables`, `databases`, `transaction()`, and `createBlob()` globals for interacting with the database from code
- **[Data Loader](./data-loader.md)** — Loading seed or initial data into tables as part of component deployment
- **[Storage Algorithm](./storage-algorithm.md)** — How Harper stores data using LMDB with universal indexing and ACID compliance
- **[Storage Tuning](./storage-tuning.md)** — Tuning `storage.*` options for production: durability vs. throughput, compression, blob paths, reclamation
- **[Jobs](./jobs.md)** — Asynchronous bulk data operations (CSV import/export, S3 import/export)
- **[System Tables](./system-tables.md)** — Harper internal tables for analytics, data loader state, and other system features
- **[Compaction](./compaction.md)** — Reducing database file size by eliminating fragmentation and free space