East-west agent: identity MMDB, declared-edge microseg, behavioral + direction fields #410

Closed
pigri wants to merge 14 commits into main from feat/east-west-agent

pigri commented Jun 26, 2026

Contributor

Draft — blocked on the amygdala scheme fields (see depends-on below).

Adds the agent half of the east-west / lateral-movement detection chain: make internal pod-to-pod traffic visible, enrich it with direction + behavior + workload identity, and express microsegmentation as a declared-edge allow-list.

What's here

  • Direction + behavioral fields — populates ids.dst_home_net/ids.dst_external_net, ids.src_pod_net/ids.dst_pod_net, and the flow.* behavioral group (unique_dst_ports, flows_per_min, dst_port_entropy, …) at both eval sites: kernel_pump (XDP) and the AF_PACKET fallback, so enrichment never silently no-ops on fallback nodes. POD_NETS/is_pod_ip (in synapse-access-rules) distinguishes true pod-to-pod from node-to-node-over-public-IP (both are in HOME_NET, so HOME_NET alone can't). New BlockSource::Microseg / Behavioral for source attribution.

  • Workload identity (pod IP → workload/namespace): IdentityMmdbWorker pulls identity.mmdb on the existing threat-MMDB rails (push-aware via the config SSE channel, with an interval-poll fallback). A security::identity lookup module resolves src/dst IPs and surfaces the result as id.* fields. The agent needs no in-cluster RBAC, because identity.mmdb is a downloaded artifact like the threat MMDB.

  • Declared-edge microsegmentation: security::edge_set consumes a policy-edges allow-list; edge.declared / edge.policy_violation are evaluated per flow, so a rule like edge.policy_violation == 1 enforces microseg at the wire.

  • Per-rule log smart-firewall action — matches and records a notice but installs no kernel block, regardless of enforce_block. Lets a single new rule dry-run while others stay enforcing (staged alert-first rollout).

  • Overlay VXLAN/Geneve decap in the IDS inspect path so pod-to-pod overlay frames decode to their inner Ethernet frame.
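
As a rough illustration of the declared-edge bullet above, the sketch below evaluates one flow against an allow-list of workload pairs. The field names on `EdgeVerdict`, the `(source, destination)` string-pair representation, and the treatment of unresolved endpoints are assumptions made for the example; the real `security::edge_set` parser and policy-edges format are not shown in this PR description.

```rust
use std::collections::HashSet;

/// Hypothetical verdict shape mirroring the edge.declared /
/// edge.policy_violation fields named in the description.
#[derive(Debug, PartialEq, Eq)]
pub struct EdgeVerdict {
    pub declared: bool,
    pub policy_violation: bool,
}

/// Hypothetical allow-list of (source workload, destination workload) edges.
pub struct EdgeSet {
    edges: HashSet<(String, String)>,
}

impl EdgeSet {
    pub fn new(pairs: &[(&str, &str)]) -> Self {
        let edges = pairs
            .iter()
            .map(|(src, dst)| (src.to_string(), dst.to_string()))
            .collect();
        Self { edges }
    }

    /// Evaluate one flow. Assumption: a flow with an unresolved endpoint is
    /// reported as neither declared nor violating, so unknown IPs do not
    /// trip the microseg rule by themselves.
    pub fn evaluate(&self, src: Option<&str>, dst: Option<&str>) -> EdgeVerdict {
        match (src, dst) {
            (Some(s), Some(d)) => {
                let declared = self.edges.contains(&(s.to_string(), d.to_string()));
                EdgeVerdict { declared, policy_violation: !declared }
            }
            _ => EdgeVerdict { declared: false, policy_violation: false },
        }
    }
}
```

With `EdgeSet::new(&[("web", "api")])`, a `web -> api` flow comes back declared, while `web -> db` comes back with `policy_violation` set, which is the condition a rule such as `edge.policy_violation == 1` would match.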

Depends on

Requires the matching amygdala scheme fields — ids.dst_*, ids.*_pod_net, flow.*, id.*, edge.*, and Action::Log (gen0sec/amygdala#9). Until that merges and publishes, this won't build; the amygdala dependency version + Cargo.lock need bumping at that point. Kept as a draft for that reason.

Notes for review

  • Branched off the merge-base this work was authored against; main has since advanced (0.7.4 + the classifier HOME_NET skip), so rebase onto main before un-drafting.
  • Cargo.lock is intentionally unchanged — regenerate after the amygdala bump.
  • Alert-first throughout: nothing drops until a rule is authored and enforce_block is flipped; the is_protected_ip never-ban guard still wraps every block.
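
The alert-first behaviour in the notes above can be summarised as a small decision function. This is a sketch under stated assumptions: `RuleAction`, `Outcome`, and `resolve` are hypothetical names, and the choice to still record a notice when a block is suppressed is an assumption rather than something the description states.

```rust
/// Hypothetical per-rule action; `Log` corresponds to the new `log` action.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RuleAction {
    Block,
    Log,
}

/// Hypothetical result of a rule match.
#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    /// A notice is recorded; nothing is installed in the kernel.
    NoticeOnly,
    /// A notice is recorded and a kernel block is installed.
    KernelBlock,
}

/// Decide what a matching rule does.
/// - `Log` never blocks, regardless of `enforce_block`.
/// - `Block` only blocks once `enforce_block` is set and the target is not
///   covered by the never-ban guard (`is_protected`).
pub fn resolve(action: RuleAction, enforce_block: bool, is_protected: bool) -> Outcome {
    match action {
        RuleAction::Log => Outcome::NoticeOnly,
        RuleAction::Block if enforce_block && !is_protected => Outcome::KernelBlock,
        RuleAction::Block => Outcome::NoticeOnly,
    }
}
```

Under this reading, a newly authored rule can ship as `Log` while existing `Block` rules keep enforcing, and flipping `enforce_block` never overrides the protected-IP guard.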

pigri and others added 14 commits June 26, 2026 17:29
…ral + direction fields

Adds the agent half of the east-west / lateral-movement detection chain.

- Direction + behavior: populate ids.dst_home_net/dst_external_net,
  ids.src_pod_net/dst_pod_net and the flow.* behavioral group (unique_dst_ports,
  flows_per_min, dst_port_entropy, ...) at BOTH eval sites (kernel_pump XDP and
  the AF_PACKET fallback). POD_NETS/is_pod_ip distinguishes true pod-to-pod from
  node-to-node-over-public-IP. New BlockSource::Microseg/Behavioral.

- Workload identity: IdentityMmdbWorker pulls identity.mmdb on the threat-MMDB
  rails (push-aware via config SSE + interval-poll fallback); a security::identity
  lookup module resolves src/dst IP -> workload/namespace, surfaced as id.* fields.

- Declared-edge microsegmentation: security::edge_set consumes the policy-edges
  allow-list; edge.declared / edge.policy_violation evaluated per flow.

- Per-rule `log` smart-firewall action (records a notice, installs no kernel
  block) for staged alert-first rollout.

- Overlay VXLAN/Geneve decap in the IDS inspect path so pod-to-pod overlay
  frames decode.

Requires the matching amygdala scheme fields (ids.dst_*, ids.*_pod_net, flow.*,
id.*, edge.*, Action::Log). The amygdala dependency + Cargo.lock need bumping once
that publishes; build is blocked until then.

Claude-Session: https://claude.ai/code/session_01MAY62VuZHhkeXgQ9feF3fJ
…path scans

is_pod_ip and is_protected_ip each took an RwLock read and walked a Vec<IpNet>
linearly, so the capture hot path paid four locked O(N) scans per flow
(src/dst x pod/protected).

Add a precompiled SegmentTable (`classify_ip` -> {protected, pod}) returning both
memberships in one pass, behind an ArcSwap rebuilt on every HOME_NET / POD_NET
mutation — reads are now lock-free and a hot-reload is a single pointer swap.
is_pod_ip / is_protected_ip delegate to it (API unchanged).

Representation is adaptive: a flat scan at or below 16 CIDRs (the common k8s
handful, where hashing overhead would lose), a prefix-length-bucketed hash above
it (O(distinct prefix lengths), for the many-subnet case). std-only, no new
lookup crate. Measured ~7.7x on a ~200-CIDR set; parity on small sets.

Fuse the two enrichment sites (kernel_pump XDP + AF_PACKET fallback) to two
classify_ip calls instead of four is_*_ip calls.

Tests cover both representations, equivalence vs the legacy linear scan, IPv6,
and hot-reload; an ignored micro-bench records the crossover.

Claude-Session: https://claude.ai/code/session_01MAY62VuZHhkeXgQ9feF3fJ
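
A minimal sketch of the single-pass idea in the commit above: one scan returns both memberships, so each address costs one `classify_ip` call instead of separate `is_pod_ip` and `is_protected_ip` scans. It is IPv4-only and std-only for brevity; `Cidr4`, `Segment`, and the constructor arguments are hypothetical, and the lock-free `ArcSwap` wrapper and the bucketed representation are deliberately left out.

```rust
use std::net::Ipv4Addr;

/// Both memberships for one address, returned together.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub struct Segment {
    pub protected: bool,
    pub pod: bool,
}

/// Hypothetical IPv4-only CIDR (the commit also covers IPv6).
#[derive(Debug, Clone, Copy)]
pub struct Cidr4 {
    net: u32,
    mask: u32,
}

impl Cidr4 {
    pub fn new(addr: Ipv4Addr, prefix: u8) -> Self {
        assert!(prefix <= 32, "IPv4 prefix length must be 0..=32");
        // A shift by 32 would overflow, so /0 is handled separately.
        let mask = if prefix == 0 { 0 } else { u32::MAX << (32 - u32::from(prefix)) };
        Self { net: u32::from(addr) & mask, mask }
    }

    pub fn contains(&self, ip: Ipv4Addr) -> bool {
        (u32::from(ip) & self.mask) == self.net
    }
}

/// Flat table: every CIDR carries the flags it contributes.
pub struct SegmentTable {
    entries: Vec<(Cidr4, Segment)>,
}

impl SegmentTable {
    pub fn new(protected: &[Cidr4], pod: &[Cidr4]) -> Self {
        let mut entries = Vec::with_capacity(protected.len() + pod.len());
        for c in protected {
            entries.push((*c, Segment { protected: true, pod: false }));
        }
        for c in pod {
            entries.push((*c, Segment { protected: false, pod: true }));
        }
        Self { entries }
    }

    /// One pass over the table yields both answers; stop once both are set.
    pub fn classify_ip(&self, ip: Ipv4Addr) -> Segment {
        let mut out = Segment::default();
        for (cidr, flags) in &self.entries {
            if cidr.contains(ip) {
                out.protected |= flags.protected;
                out.pod |= flags.pod;
                if out.protected && out.pod {
                    break;
                }
            }
        }
        out
    }
}
```

An enrichment site would then call `classify_ip` once for the source and once for the destination, which is the four-to-two call fusion the commit describes.
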
The per-source flow accumulator used SipHash everywhere: the global
DashMap<IpAddr, Entry> plus the per-entry dst_port_counts / src_ports maps,
all touched on the capture hot path.

Switch to ahash. The outer map is keyed by source IP (attacker-influenceable),
so it takes a per-process random seed (RandomState::new) for hash-flood
resistance; the inner port maps are bounded by PORT_CAP, so a default seed is
fine. Public API and behaviour unchanged — the existing aggregation / entropy /
TTL / port-cap tests cover it.

Claude-Session: https://claude.ai/code/session_01MAY62VuZHhkeXgQ9feF3fJ
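
For orientation, here is a rough sketch of the kind of per-source entry the commit above refers to: a bounded destination-port count map from which `unique_dst_ports` and `dst_port_entropy` can be derived. It uses the std `HashMap` rather than ahash to stay dependency-free, the `PORT_CAP` value is a placeholder, and Shannon entropy in bits is an assumption about how the entropy field is defined.

```rust
use std::collections::HashMap;

/// Placeholder value; the real PORT_CAP is not stated in the commit message.
const PORT_CAP: usize = 1024;

/// Hypothetical per-source accumulator entry.
#[derive(Default)]
pub struct Entry {
    dst_port_counts: HashMap<u16, u64>,
}

impl Entry {
    /// Count one flow towards `dst_port`. Once the map holds PORT_CAP
    /// distinct ports, unseen ports are ignored so the map stays bounded.
    pub fn observe(&mut self, dst_port: u16) {
        if let Some(n) = self.dst_port_counts.get_mut(&dst_port) {
            *n += 1;
        } else if self.dst_port_counts.len() < PORT_CAP {
            self.dst_port_counts.insert(dst_port, 1);
        }
    }

    pub fn unique_dst_ports(&self) -> usize {
        self.dst_port_counts.len()
    }

    /// Shannon entropy (bits) of the destination-port distribution.
    /// Assumption: this is how dst_port_entropy is defined.
    pub fn dst_port_entropy(&self) -> f64 {
        let total: u64 = self.dst_port_counts.values().sum();
        if total == 0 {
            return 0.0;
        }
        self.dst_port_counts
            .values()
            .map(|&n| {
                let p = n as f64 / total as f64;
                -p * p.log2()
            })
            .sum()
    }
}
```

A source that only ever talks to one port scores zero entropy, while a scanner spreading flows evenly over many ports scores high, which is what makes the field useful for behavioral rules.
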
…rary

The IdentityInfo/IdentityClient model + MMDB reader and the EdgeSet parser/
evaluator move to the hippocampus crate; synapse-core keeps the process-global
client singletons + capability registration as thin wrappers that re-export the
library types. Public API (lookup/evaluate/init_*/refresh_*/get_version_cache,
IdentityInfo/EdgeVerdict) is unchanged, so the workers and enrichment call sites
are untouched.

hippocampus is a dev path dependency for now; switch to the gen0sec registry
once it is published.

Every src/dst identity lookup did a memmap B-tree walk + maxminddb decode (three
String allocations), and the same pod/node IPs recur across every connection, so
the walk was repeated constantly for a small stable set of IPs.

Add an IP-keyed cache (positive and negative results) in front of the walk,
keyed with a per-process-seeded ahash map. Identity is stable for an IP between
MMDB refreshes; the cache is cleared wholesale on every refresh so a re-labelled
IP is never served stale. Bounded by a coarse cap (clear-on-full) so a
many-IP scan can't grow it without limit. Public `lookup` API unchanged.

Tested with a counting fake resolver: resolves once per IP, caches negatives,
re-resolves after clear (refresh), and stays bounded.

Claude-Session: https://claude.ai/code/session_01MAY62VuZHhkeXgQ9feF3fJ
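
The cache described in the commit above could look roughly like the following. This is a sketch, not the branch's code: `IdentityCache`, the closure-based resolver, and the `IdentityInfo` fields are assumptions, the real cache sits behind a process-global client, and it uses a per-process-seeded ahash map rather than the std `HashMap` used here.

```rust
use std::collections::HashMap;
use std::net::IpAddr;

/// Hypothetical identity record (pod IP -> workload/namespace).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct IdentityInfo {
    pub workload: String,
    pub namespace: String,
}

/// IP-keyed cache in front of an expensive resolver (the MMDB walk).
/// `None` values are negative results and are cached like positive ones.
pub struct IdentityCache<R: Fn(IpAddr) -> Option<IdentityInfo>> {
    resolver: R,
    cap: usize,
    entries: HashMap<IpAddr, Option<IdentityInfo>>,
}

impl<R: Fn(IpAddr) -> Option<IdentityInfo>> IdentityCache<R> {
    pub fn new(resolver: R, cap: usize) -> Self {
        Self { resolver, cap, entries: HashMap::new() }
    }

    pub fn lookup(&mut self, ip: IpAddr) -> Option<IdentityInfo> {
        if let Some(hit) = self.entries.get(&ip) {
            return hit.clone();
        }
        let resolved = (self.resolver)(ip);
        // Coarse bound: drop everything when full instead of tracking LRU order.
        if self.entries.len() >= self.cap {
            self.entries.clear();
        }
        self.entries.insert(ip, resolved.clone());
        resolved
    }

    /// Called on every MMDB refresh so a re-labelled IP is never served stale.
    pub fn clear(&mut self) {
        self.entries.clear();
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }
}
```

Clear-on-full trades a brief burst of re-resolution for having no eviction bookkeeping on the hot path, which fits the stated expectation of a small, stable set of pod and node IPs.
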
The default release profile already sets lto="thin" + codegen-units=1. Add an
opt-in release-lto profile (inherits release, upgrades to fat LTO) for the
shipping image build only — fat LTO is materially slower to build, so dev/CI
keep using the thin default.

Measured on the synapse agent binary (default features):
  - size: 52 MiB (thin) -> 46 MiB (fat), -5.2 MiB / -10.3% (better cross-crate
    dead-code elimination)
  - build: ~196s -> ~354s for the binary
  - compatibility unchanged: no target-cpu/target-feature is set, output stays
    baseline x86-64 (ELF for GNU/Linux 3.2.0), so it runs on every node; the
    SSL-uprobe symbol table is preserved (strip="debuginfo" inherited). LTO is
    link-time only and does not change the target ISA.

Build the shipping image with `cargo build --profile release-lto`.

Claude-Session: https://claude.ai/code/session_01MAY62VuZHhkeXgQ9feF3fJ
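
A profile matching the description above would look roughly like this in the workspace `Cargo.toml`. It is a sketch of the shape only; the exact contents in the branch are not shown here, and every other setting (`codegen-units`, `strip`) is assumed to come from `release` through `inherits`.

```toml
# Opt-in profile for the shipping image; dev/CI keep the default release profile.
[profile.release-lto]
inherits = "release"   # keeps codegen-units = 1 and strip = "debuginfo"
lto = "fat"            # upgrade from the thin LTO set in [profile.release]
```

Cargo writes the output of a custom profile to `target/release-lto/`, so an image build that copies the binary needs to read from that directory rather than `target/release/`.
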
…istic bench)

A realistic east-west benchmark (bench_realistic_enrichment: pod/node/external
mix, 32-CIDR HOME_NET) showed the original LINEAR_THRESHOLD=16 was a net ~1.6x
REGRESSION: a real HOME_NET (broad pod /16 + service /12 + node /32s) tipped over
16 entries into the Bucketed hash, which pays a fixed per-prefix-length probe
count and cannot exploit the short-circuit that real pod traffic gets on the
broad CIDR. The earlier 7.7x was on adversarial all-miss input, not real traffic.

Fix: raise the threshold to 64 (above any realistic cluster's node/LB count) and
sort the Linear representation broad-prefix-first so the scan short-circuits on
the broad pod/service CIDR regardless of config order. The real win is the
lock-free ArcSwap read + 4->2 call fusion, not the data structure. Bucketed is
kept only for pathological large no-broad-prefix configs.

Full-path realistic result now: OLD RwLock+4 linear 64 ns/pkt -> NEW ArcSwap+2
classify 49 ns/pkt = ~1.3x; large disjoint sets still ~7.9x; small parity.

Claude-Session: https://claude.ai/code/session_01MAY62VuZHhkeXgQ9feF3fJ
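
The broad-prefix-first ordering from the commit above can be illustrated with a small probe-counting scan. The `Cidr` tuple type and both helper functions are hypothetical, the sketch is IPv4-only, and it assumes prefix lengths are at most 32.

```rust
use std::net::Ipv4Addr;

/// Hypothetical IPv4 CIDR as (network address, prefix length <= 32).
type Cidr = (Ipv4Addr, u8);

fn contains((net, prefix): Cidr, ip: Ipv4Addr) -> bool {
    // A shift by 32 would overflow, so /0 is handled separately.
    let mask = if prefix == 0 { 0 } else { u32::MAX << (32 - u32::from(prefix)) };
    (u32::from(ip) & mask) == (u32::from(net) & mask)
}

/// Put broad prefixes (small prefix length) first, independent of config order.
pub fn sort_broad_first(cidrs: &mut Vec<Cidr>) {
    cidrs.sort_by_key(|&(_, prefix)| prefix);
}

/// Linear membership scan; returns (hit, number of CIDRs probed).
pub fn scan(cidrs: &[Cidr], ip: Ipv4Addr) -> (bool, usize) {
    for (i, &cidr) in cidrs.iter().enumerate() {
        if contains(cidr, ip) {
            return (true, i + 1);
        }
    }
    (false, cidrs.len())
}
```

With node /32 entries listed ahead of a pod /16 in the config, a pod address is only matched after every /32 has been probed; after `sort_broad_first` it matches on the first probe, which is the short-circuit the commit relies on for real pod traffic.
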
hippocampus is published, so synapse-core depends on it via the registry instead
of a local path. A commented [patch.gen0sec] entry keeps the local-path dev
override available, matching the amygdala/cortex pattern.

amygdala 0.1.6 publishes the id./edge./flow./ids. wirefilter enrichment fields
this branch depends on; synapse-app now builds against the registry crate.

Adds bench_concurrent_classify: N threads hammering the lookup, old RwLock
read+scan vs the new lock-free ArcSwap classify_ip, measuring how aggregate
throughput scales with thread count.

Result (8 threads): the RwLock baseline does not scale (25.7 -> 16.6 -> ~28
Mops/s for 1/2/8 threads; concurrent readers bounce the lock's reader-count
cache line), while ArcSwap scales near-linearly (23 -> 45 -> 86 -> 174 Mops/s),
6.3x the RwLock at 8 threads. This is the real system-level justification for the
lock-free change: per-packet single-thread cost is a wash, but a multi-core
sharded capture path serializes on the old RwLock and not on ArcSwap.

…lpm-segment-fusion

# Conflicts:
#	crates/synapse-core/src/security/identity/mod.rs

access-rules: lock-free concurrency-scaling segment classifier + ahash, identity cache, opt-in fat-LTO

# Conflicts:
#	crates/synapse-app/src/kernel_pump.rs

pigri commented Jul 2, 2026

Contributor Author

Superseded by #420, which contains this work rebased onto current main plus the follow-on additions (service-graph visibility, dns.fqdn egress, pod-label selectors, log/allow rule actions) and the published amygdala 0.1.9 / hippocampus 0.0.3 dependencies. Closing in favor of #420.

@pigri pigri closed this Jul 2, 2026