From 62e45a6a518b7ff6e78875f490b2742026ee517f Mon Sep 17 00:00:00 2001
From: Loki FastStart <loki@faststart.internal>
Date: Sun, 10 May 2026 10:56:20 +0000
Subject: [PATCH 1/3] docs: add telemetry opt-out to Step 1 install section

Shows what's collected (install outcome + OS/arch/duration), how to opt out
(touch ~/.lowkey/telemetry-off or LOWKEY_TELEMETRY=0), and links
to full privacy details.
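
One detail worth spelling out for the environment-variable route: with
a piped install, the variable has to be attached to `bash`, not to
`curl`. The sketch below uses a stand-in one-line script (hypothetical,
not the real installer) and assumes the installer reads
LOWKEY_TELEMETRY from its own environment.

```shell
# Stand-in for the installer script: prints the value the script would see.
fake_installer='echo "LOWKEY_TELEMETRY=${LOWKEY_TELEMETRY:-unset}"'

# Start from a clean environment so the comparison is meaningful.
unset LOWKEY_TELEMETRY

# Variable attached to the left-hand command only: bash never sees it.
wrong=$(LOWKEY_TELEMETRY=0 printf '%s\n' "$fake_installer" | bash)

# Variable attached to bash: the piped script sees it.
right=$(printf '%s\n' "$fake_installer" | LOWKEY_TELEMETRY=0 bash)

echo "$wrong"
echo "$right"
```

Applied to the real command, that would presumably read
`curl -sfL install.lowkey.run | LOWKEY_TELEMETRY=0 bash`.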
---
 README.md | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/README.md b/README.md
index f7b211b..425d733 100644
--- a/README.md
+++ b/README.md
@@ -52,6 +52,12 @@
 
 Run `curl -sfL install.lowkey.run | bash` — the installer walks you through **pack**, **profile**, **instance size**, and **deploy method** (CloudFormation or Terraform).
 
+> **📊 Telemetry opt-out:** The installer sends anonymous install telemetry (start/success/failure + OS/arch/duration — no code, credentials, IPs, or hostnames). To opt out before installing:
+> ```bash
+> mkdir -p ~/.lowkey && touch ~/.lowkey/telemetry-off
+> ```
+> Or set `LOWKEY_TELEMETRY=0` when running the installer. [Full privacy details →](https://docs.lowkey.run/reference/telemetry-privacy)
+
 **CLI flags for non-interactive deploys:**
 
 | Flag | Description |

From 124c966868cfde0a6b0e51b4113cd56b9f7a42dc Mon Sep 17 00:00:00 2001
From: Roy Osherove <575051+royosherove@users.noreply.github.com>
Date: Mon, 11 May 2026 09:01:14 +0000
Subject: [PATCH 2/3] Remove memory-management bootstrap + fix alarm
 false-positive sources

- Delete BOOTSTRAP-MEMORY-SEARCH.md (bedrockify-backed semantic memory search)
- Drop BedrockifyAlive/loki-bedrockify-down references from BOOTSTRAP-ALARMS
  (the health-check script never published the metric, so the alarm sat in
  INSUFFICIENT_DATA or flapped: a pure false positive)
- Renumber tier-3 sections after removal
- Add explicit 'rebind on instance replacement' warning: custom alarms are
  scoped to the InstanceId dimension; when the EC2 instance is replaced,
  the alarms must be redeployed against the new ID or they sit in
  INSUFFICIENT_DATA forever. Recommend TreatMissingData=missing to avoid
  spurious paging.

Live fix applied alongside this commit:
- Rebound 9 loki-* alarms from i-04d527003d0a094ff (old) to i-0229529f514ef6fd7
  (current) with TreatMissingData=missing.
- Deleted obsolete loki-bedrockify-down alarm.
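
For reference, a rebind of one alarm can be sketched as below. Since
`put-metric-alarm` creates or updates an alarm by name, re-issuing it
with the new InstanceId is enough. The helper name is hypothetical, and
the statistic and comparison operator are assumptions rather than
values taken from the runbook. The function only prints the command so
it can be reviewed first.

```shell
# Hypothetical helper: print (do not run) the CLI call that rebinds one alarm.
# --statistic and --comparison-operator are assumed values, not from the runbook.
rebind_failed_units_alarm() {
  instance_id="$1"
  sns_topic_arn="$2"
  printf '%s ' \
    aws cloudwatch put-metric-alarm \
    --alarm-name loki-failed-units \
    --namespace Custom/Loki \
    --metric-name FailedUnits \
    --dimensions "Name=InstanceId,Value=${instance_id}" \
    --statistic Maximum \
    --period 60 \
    --evaluation-periods 1 \
    --threshold 0 \
    --comparison-operator GreaterThanThreshold \
    --treat-missing-data missing \
    --alarm-actions "${sns_topic_arn}"
  printf '\n'
}

# Dry run against the current instance; the topic ARN is a placeholder.
rebind_failed_units_alarm i-0229529f514ef6fd7 arn:aws:sns:us-east-1:ACCOUNT_ID:TOPIC_NAME
```

Printing instead of executing keeps the sketch safe to run anywhere;
piping the reviewed output to `sh` would apply it.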
---
 bootstraps/essential/BOOTSTRAP-ALARMS.md      |  22 +-
 .../essential/BOOTSTRAP-MEMORY-SEARCH.md      | 238 ------------------
 deploy/README.md                              |   1 -
 3 files changed, 5 insertions(+), 256 deletions(-)
 delete mode 100644 bootstraps/essential/BOOTSTRAP-MEMORY-SEARCH.md

diff --git a/bootstraps/essential/BOOTSTRAP-ALARMS.md b/bootstraps/essential/BOOTSTRAP-ALARMS.md
index 816286c..60935aa 100644
--- a/bootstraps/essential/BOOTSTRAP-ALARMS.md
+++ b/bootstraps/essential/BOOTSTRAP-ALARMS.md
@@ -12,6 +12,8 @@ Alarms to deploy on every EC2 instance running a Loki agent. Designed to catch t
 - SNS topic for notifications (create one or pass existing ARN)
 - Instance ID and region known at deploy time
 
+> ⚠️ **Rebind on instance replacement.** All custom alarms (Tier 3) are scoped to a specific `InstanceId` dimension. When the EC2 instance is replaced (manual rebuild, ASG refresh, etc.), you **must** redeploy the alarms against the new instance ID; otherwise they stay in `INSUFFICIENT_DATA` indefinitely (or go to `ALARM`, depending on `TreatMissingData`). All custom alarms here set `TreatMissingData=missing` to avoid spurious paging on short metric gaps.
+
 ## Tier 1 — Instance Survival (auto-recover)
 
 These use built-in EC2/CloudWatch metrics. No agent needed.
@@ -120,18 +122,7 @@ Action: SNS notify
 
 ### Common Service Checks (All Agents)
 
-### 3.2 Bedrockify Alive
-
-Both OpenClaw and Hermes depend on bedrockify. Monitor it on all instances.
-
-```
-Metric: Custom/Loki BedrockifyAlive
-Value: 1 = systemd active + HTTP 200 on health endpoint (port 8090), 0 = down
-Threshold: < 1 for 2 consecutive periods (1 min each)
-Action: SNS notify
-```
-
-### 3.3 Systemd Failed Units
+### 3.2 Systemd Failed Units
 
 Catches: any crash-looping service, not just the ones we know about.
 **Would have caught the bedrock-embed-proxy crash-loop immediately.**
@@ -143,7 +134,7 @@ Threshold: > 0 for 1 period (1 min)
 Action: SNS notify
 ```
 
-### 3.4 Bedrock API Reachable
+### 3.3 Bedrock API Reachable
 
 Catches: credential expiry, region issues, service disruptions, model access revoked.
 
@@ -172,8 +163,7 @@ Pushes all Tier 3 custom metrics in a single `put-metric-data` call (batched).
 **What it checks:**
 1. **OpenClaw instances:** `pgrep -f openclaw-gatewa` — OpenClaw gateway process alive
    **Hermes instances:** `pgrep -f hermes` — Hermes agent process alive
-2. `systemctl is-active bedrockify` + `curl -sf localhost:8090/` — Bedrockify alive + healthy (required for all agents)
-3. `systemctl list-units --failed --no-legend | wc -l` — Failed unit count
+2. `systemctl list-units --failed --no-legend | wc -l` — Failed unit count
 4. `df --output=pcent / | tail -1` — Root disk percent
 5. `free | awk '/Mem/ {printf "%.0f", $3/$2*100}'` — Memory percent
 6. Quick Bedrock `InvokeModel` with tiny payload (1 embedding, cached model) — API reachable
@@ -248,7 +238,6 @@ Provides a single-pane view of all alarms, service health, compute resources, ne
           "arn:aws:cloudwatch:us-east-1:ACCOUNT_ID:alarm:loki-instance-status-check-failed",
           "arn:aws:cloudwatch:us-east-1:ACCOUNT_ID:alarm:loki-openclaw-down",
           "arn:aws:cloudwatch:us-east-1:ACCOUNT_ID:alarm:loki-hermes-down",
-          "arn:aws:cloudwatch:us-east-1:ACCOUNT_ID:alarm:loki-bedrockify-down",
           "arn:aws:cloudwatch:us-east-1:ACCOUNT_ID:alarm:loki-bedrock-unreachable",
           "arn:aws:cloudwatch:us-east-1:ACCOUNT_ID:alarm:loki-failed-units",
           "arn:aws:cloudwatch:us-east-1:ACCOUNT_ID:alarm:loki-cpu-high",
@@ -284,7 +273,6 @@ Provides a single-pane view of all alarms, service health, compute resources, ne
       "properties": {
         "title": "⚡ Bedrockify",
         "metrics": [
-          [ "Custom/Loki", "BedrockifyAlive", "InstanceId", "INSTANCE_ID", { "label": "Bedrockify Alive", "color": "#1f77b4" } ]
         ],
         "view": "timeSeries", "stacked": false, "region": "us-east-1",
         "period": 60, "stat": "Minimum",
diff --git a/bootstraps/essential/BOOTSTRAP-MEMORY-SEARCH.md b/bootstraps/essential/BOOTSTRAP-MEMORY-SEARCH.md
deleted file mode 100644
index f208908..0000000
--- a/bootstraps/essential/BOOTSTRAP-MEMORY-SEARCH.md
+++ /dev/null
@@ -1,238 +0,0 @@
-# BOOTSTRAP-MEMORY-SEARCH.md — Enable Semantic Memory Search with Bedrock Embeddings
-
-> **Applies to:** All agents (with agent-specific sections below)
-
-> **Run this once to enable memory search.** If `memory/.bootstrapped-memory-search` exists, skip — you've already done this.
-
-## Overview
-
-Semantic memory search uses an OpenAI-compatible embeddings API. [bedrockify](https://github.com/inceptionstack/bedrockify) — already installed as a dependency of all agent packs — provides `/v1/embeddings` on localhost, translating OpenAI embedding calls into Amazon Bedrock embedding calls. No external API keys needed — uses the EC2 instance profile.
-
-```
-memory_search → http://127.0.0.1:8090/v1/embeddings → bedrockify → Bedrock Titan Embed v2 → vector results
-```
-
-## Prerequisites
-
-- EC2 instance with IAM role that has `bedrock:InvokeModel` permission
-- Bedrock model access enabled for `amazon.titan-embed-text-v2:0` in us-east-1
-- **bedrockify already running** — installed and started by the bedrockify pack (dependency of both OpenClaw and Hermes)
-
-## Step 1: Verify bedrockify Is Running
-
-bedrockify is installed as a systemd service by the bedrockify pack. No separate installation needed.
-
-```bash
-# Check service status
-systemctl status bedrockify
-# Should show: active (running)
-
-# Health check
-curl -s http://127.0.0.1:8090/
-# Expected: {"status":"ok",...}
-```
-
-If bedrockify is not running, check the service:
-
-```bash
-sudo journalctl -u bedrockify -n 20
-sudo systemctl restart bedrockify
-```
-
-## Step 2: Verify Embeddings Endpoint
-
-**Single embedding:**
-```bash
-curl -s -X POST http://127.0.0.1:8090/v1/embeddings \
-  -H "Content-Type: application/json" \
-  -d '{"input": "test embedding", "model": "amazon.titan-embed-text-v2:0"}' \
-  | jq '{object, model, dims: (.data[0].embedding | length)}'
-# Expected: {"object":"list","model":"amazon.titan-embed-text-v2:0","dims":1024}
-```
-
-**Batch embeddings:**
-```bash
-curl -s -X POST http://127.0.0.1:8090/v1/embeddings \
-  -H "Content-Type: application/json" \
-  -d '{"input": ["first text", "second text"], "model": "amazon.titan-embed-text-v2:0"}' \
-  | jq '{results: (.data | length), dims: [.data[].embedding | length]}'
-# Expected: {"results":2,"dims":[1024,1024]}
-```
-
-## OpenClaw-Specific Configuration
-
-### Step 3: Configure OpenClaw Memory Search
-
-Add this to your `openclaw.json` under `agents.defaults`:
-
-```json
-"memorySearch": {
-  "enabled": true,
-  "provider": "openai",
-  "remote": {
-    "baseUrl": "http://127.0.0.1:8090/v1/",
-    "apiKey": "not-needed"
-  },
-  "fallback": "none",
-  "model": "amazon.titan-embed-text-v2:0",
-  "query": {
-    "hybrid": {
-      "enabled": true,
-      "vectorWeight": 0.7,
-      "textWeight": 0.3
-    }
-  },
-  "cache": {
-    "enabled": true,
-    "maxEntries": 50000
-  }
-}
-```
-
-Then restart the OpenClaw gateway.
-
-### Step 4: Verify End-to-End
-
-Ask the agent to run `memory_search` with any query. It should return ranked results from workspace memory files using hybrid search (70% vector, 30% text).
-
-### Step 5: Backfill Existing Memory
-
-After enabling semantic search for the first time, existing memory files are **not** automatically indexed. Run:
-
-```bash
-openclaw memory index --force
-```
-
-This vectorizes all current memory files. Without this step, `memory_search` will only find content written *after* the setup — silently missing everything prior.
-
-### Memory Quality Matters
-
-Vector search ranks results by cosine similarity. **Low-signal, repetitive content tanks the scores of everything nearby** — making recall unreliable even for genuinely useful memories.
-
-#### What hurts search quality
-
-High-frequency repetitive content compresses into a dense cluster in vector space. This raises the effective similarity floor and pushes useful content below the retrieval threshold.
-
-The biggest offender is **heartbeat logs**:
-
-```markdown
-## Heartbeat 02:19 UTC
-### Apps: ✅ frontend + api healthy
-### Security Hub: no change — 1 CRITICAL, 94 HIGH
-## Heartbeat 02:49 UTC
-### Apps: ✅ frontend + api healthy
-### Security Hub: no change — 1 CRITICAL, 94 HIGH
-```
-
-A single daily memory file can contain 40–50 of these. They're semantically near-identical, contribute nothing to recall, and dilute chunk quality across the entire index.
-
-#### Rule: only write what changed
-
-**Don't write:**
-- "no change", "all healthy", "nothing to report"
-- Repeated status confirmations
-- Routine cron completions with no notable outcome
-
-**Do write:**
-- App went down or returned unexpected status
-- Security finding count changed (new CVE, severity shift)
-- A decision was made
-- A bug was found or fixed
-- A TODO was started or completed autonomously
-
-#### Keep heartbeat files out of the index
-
-If your agent writes verbose heartbeat logs that are useful for audit but not for recall, route them to a separate file pattern and exclude from indexing:
-
-```bash
-# Write heartbeat noise here (not indexed)
-memory/heartbeat-YYYY-MM-DD.md
-
-# Keep daily notes clean (indexed)
-memory/YYYY-MM-DD.md
-```
-
-To exclude a pattern from indexing, configure the memory sources in `openclaw.json`:
-
-```json
-"memorySearch": {
-  "sources": {
-    "exclude": ["memory/heartbeat-*.md"]
-  }
-}
-```
-
-## Hermes-Specific Configuration
-
-Hermes has its own built-in memory system:
-
-- **MEMORY.md** (~2,200 chars) — agent's personal notes, environment facts, lessons learned
-- **USER.md** (~1,375 chars) — user preferences, communication style
-- **Session search** — FTS5 full-text search across all past sessions in `~/.hermes/state.db`
-
-Hermes memory is managed via the `memory` tool (add/replace/remove) and injected into the system prompt at session start. Session search uses `session_search` for finding past conversations.
-
-**Bedrockify embeddings** are still available on `localhost:8090` for custom embedding workflows or MCP-based memory extensions:
-
-```bash
-curl -s -X POST http://127.0.0.1:8090/v1/embeddings \
-  -H "Content-Type: application/json" \
-  -d '{"input": "your text here", "model": "amazon.titan-embed-text-v2:0"}'
-```
-
-To configure Hermes memory limits:
-
-```yaml
-# In ~/.hermes/config.yaml
-memory:
-  memory_enabled: true
-  user_profile_enabled: true
-  memory_char_limit: 2200
-  user_char_limit: 1375
-```
-
-## Supported Models
-
-bedrockify supports embedding models based on the `--embed-model` flag set at install time. The default is `amazon.titan-embed-text-v2:0`. Common options:
-
-| Model | ID | Dims |
-|-------|----|------|
-| **Titan Embed Text V2** (default) | `amazon.titan-embed-text-v2:0` | 1024 |
-| Titan Embed G1 Text | `amazon.titan-embed-g1-text-02` | 1536 |
-| Cohere Embed English v3 | `cohere.embed-english-v3` | 1024 |
-| Cohere Embed Multilingual v3 | `cohere.embed-multilingual-v3` | 1024 |
-
-To change the embedding model, update the bedrockify service configuration:
-
-```bash
-# Edit the bedrockify systemd service to change --embed-model
-sudo systemctl edit bedrockify
-# Add override for ExecStart with your preferred --embed-model
-sudo systemctl restart bedrockify
-```
-
-## Pi-Specific Configuration
-
-Pi has no built-in memory system. There is no `memory_search` tool or persistent session storage.
-
-To build custom memory, use bedrockify's `/v1/embeddings` endpoint (available on `localhost:8090`) to generate and store vectors in a file or SQLite database, then query them manually via a Pi extension or bash tool. This is opt-in and requires custom implementation.
-
-## IronClaw-Specific Configuration
-
-IronClaw has a built-in state database (PostgreSQL or embedded libSQL at `~/.ironclaw/state.db`). It may provide its own memory/session search — check IronClaw's documentation for available search tools.
-
-Bedrockify embeddings are also available on `localhost:8090` for custom semantic search workflows:
-
-```bash
-curl -s -X POST http://127.0.0.1:8090/v1/embeddings \
-  -H "Content-Type: application/json" \
-  -d '{"input": "your text here", "model": "amazon.titan-embed-text-v2:0"}'
-```
-
-These can be used alongside IronClaw's native state DB for hybrid retrieval if needed.
-
-## Finish
-
-```bash
-mkdir -p memory && echo "Memory search bootstrapped $(date -u +%Y-%m-%dT%H:%M:%SZ)" > memory/.bootstrapped-memory-search
-```
diff --git a/deploy/README.md b/deploy/README.md
index dc71c27..62073b5 100644
--- a/deploy/README.md
+++ b/deploy/README.md
@@ -85,7 +85,6 @@ These set up security baselines, coding guidelines, MCP tools, memory search, an
 | `BOOTSTRAP-SECURITY.md` | Security hardening + AWS Budgets alerts |
 | `BOOTSTRAP-SKILLS.md` | Installs AWS infrastructure skills |
 | `BOOTSTRAP-MCPORTER.md` | Sets up MCP server tooling |
-| `BOOTSTRAP-MEMORY-SEARCH.md` | Enables semantic memory search via Bedrock embeddings |
 | `BOOTSTRAP-CODING-GUIDELINES.md` | Coding standards and project conventions |
 | `BOOTSTRAP-SECRETS-AWS.md` | AWS Secrets Manager integration |
 | `BOOTSTRAP-PLAYWRIGHT.md` | Browser automation via Playwright MCP |

From 9dbc3f44a750d7c357ee87602d4b5faea751dff4 Mon Sep 17 00:00:00 2001
From: Roy Osherove <575051+royosherove@users.noreply.github.com>
Date: Mon, 11 May 2026 09:05:19 +0000
Subject: [PATCH 3/3] fix(alarms): exclude systemd-coredump@* transient units
 from FailedUnits count

systemd-coredump@<uid>-<pid>-<n>.service units are one-shot transient units
that systemd spawns to handle a coredump and then leaves in 'failed' state
after exit. They are not real service failures but they inflate the
FailedUnits metric and cause loki-failed-units to fire on any box that
has recently dumped a core.

Patch the documented health-check command to grep them out. (Live fix also
applied to /usr/local/bin/loki-health-check.sh on the current instance.)
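
The effect of the filter can be checked without a crashing service. The
sketch below feeds sample lines (made up for illustration, shaped like
`systemctl list-units --failed --no-legend` output) through the same
grep/wc pipeline; the helper name is hypothetical.

```shell
# Hypothetical helper: count failed units on stdin, ignoring coredump handlers.
count_failed_units() {
  # tr strips the padding some wc implementations add around the number.
  grep -v 'systemd-coredump@' | wc -l | tr -d ' '
}

# Sample input: one real failure plus one lingering coredump handler unit.
sample='nginx.service loaded failed failed A web server
systemd-coredump@0-1234-0.service loaded failed failed Process Core Dump (PID 1234/UID 0)'

# Only nginx.service should be counted.
printf '%s\n' "$sample" | count_failed_units
```

With the sample above the pipeline prints 1 instead of 2, which is what
keeps loki-failed-units quiet after a one-off core dump.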
---
 bootstraps/essential/BOOTSTRAP-ALARMS.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/bootstraps/essential/BOOTSTRAP-ALARMS.md b/bootstraps/essential/BOOTSTRAP-ALARMS.md
index 60935aa..ef665a8 100644
--- a/bootstraps/essential/BOOTSTRAP-ALARMS.md
+++ b/bootstraps/essential/BOOTSTRAP-ALARMS.md
@@ -163,7 +163,7 @@ Pushes all Tier 3 custom metrics in a single `put-metric-data` call (batched).
 **What it checks:**
 1. **OpenClaw instances:** `pgrep -f openclaw-gatewa` — OpenClaw gateway process alive
    **Hermes instances:** `pgrep -f hermes` — Hermes agent process alive
-2. `systemctl list-units --failed --no-legend | wc -l` — Failed unit count
+2. `systemctl list-units --failed --no-legend | grep -v 'systemd-coredump@' | wc -l` — Failed unit count (excludes transient coredump handler units, which linger in `failed` state after handling any crash)
 4. `df --output=pcent / | tail -1` — Root disk percent
 5. `free | awk '/Mem/ {printf "%.0f", $3/$2*100}'` — Memory percent
 6. Quick Bedrock `InvokeModel` with tiny payload (1 embedding, cached model) — API reachable