
fix: compaction pairwise fallback, conflict-aware merge prompt, metadata caps, and litellm pickle compat#273

Open
Piotr1215 wants to merge 7 commits into redis:main from Piotr1215:fix/compaction-pairwise-merge-and-conflict-prompt

Conversation


@Piotr1215 Piotr1215 commented Apr 8, 2026

Problem

Four related issues affecting memory compaction reliability and quality:

1. Compaction deadlock in dense memory clusters (merges 0 memories)

_semantic_merge_group_is_cohesive() rejects all merge groups when memories form dense topical clusters. Each memory has "extra neighbors" outside the candidate group, triggering the ambiguous-group skip. Any user who creates memories about related topics (e.g., multiple commits on the same feature) will see compaction scan hundreds of memories and merge exactly zero.

Logs show dozens of Skipping ambiguous semantic merge group entries per run, with Merged 0 memories at the end.
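The deadlock mechanics can be sketched in a few lines. This is a hypothetical simplification — the real _semantic_merge_group_is_cohesive operates on vector-search results and distances, and the function name, sets, and memory IDs below are illustrative only:

```python
# Hypothetical sketch of the ambiguous-group skip, NOT the real implementation.
def group_is_cohesive(candidate_ids: set[str], neighbors: dict[str, set[str]]) -> bool:
    """Reject the group if any member has close neighbors outside it."""
    for member in candidate_ids:
        extra_ids = neighbors.get(member, set()) - candidate_ids
        if extra_ids:
            return False  # member is also close to outsiders -> "ambiguous"
    return True

# In a dense topical cluster every memory has at least one outside neighbor,
# so every candidate group gets rejected and compaction merges nothing.
neighbors = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}}
print(group_is_cohesive({"a", "b", "c"}, neighbors))  # False: c is close to d
```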

2. Merge prompt does not handle conflicting memories

The current merge prompt ("merge similar or duplicate memories into a single, coherent memory") blends everything together. When memories contain contradictory information — common when iterating on a solution across commits — the merged result preserves both sides of the contradiction instead of resolving it.

Example: Memory A says "use tool X for task Y", Memory B (created later) says "switched from X to Z for task Y because X had issues." The current prompt merges them into a single memory that mentions both tools without clarity on which is current.

3. Unbounded topic/entity growth on successive merges

merge_memories_with_llm unions all topics and entities from source memories without limit. Over successive compaction rounds where merged memories get re-merged, topic counts inflate to 20+ and entity counts to 30+, mostly with LLM-generated verbose synonyms of the same concept.

4. Docket worker crashes on litellm exception serialization

When a litellm BadRequestError is raised and retries are exhausted, the Docket worker tries to serialize the exception via cloudpickle.dumps(). Litellm exception classes have mandatory __init__ args (message, model, llm_provider) and contain unpickleable _thread.RLock objects, causing TypeError: cannot pickle '_thread.RLock' object. This crashes the worker process repeatedly.
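The failure mode can be reproduced without litellm at all. The class below is a stand-in, not the real litellm exception, and stdlib pickle is used for portability (the Docket worker hits the same TypeError through cloudpickle.dumps()):

```python
import pickle
import threading

class FakeBadRequestError(Exception):
    """Stand-in mimicking litellm's mandatory __init__ args and hidden lock."""
    def __init__(self, message: str, model: str, llm_provider: str):
        super().__init__(message)
        self.model = model
        self.llm_provider = llm_provider
        self._lock = threading.RLock()  # the unpickleable attribute

try:
    pickle.dumps(FakeBadRequestError("bad request", "gpt-4o", "openai"))
except TypeError as e:
    print(e)  # cannot pickle '_thread.RLock' object
```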

Fix

1. Pairwise merge fallback in deduplicate_by_semantic_search

When the full candidate group fails the cohesion check, fall back to merging just the anchor memory with its single closest neighbor (by vector distance). The pair is re-validated through _semantic_merge_group_is_cohesive — a pair has far fewer "extra neighbors" than a large group, so it passes where the group did not.

This breaks the deadlock gradually: each compaction run merges the tightest pairs, reducing the cluster density for the next run.
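Stripped of storage and distance details, the control flow looks roughly like this (a sketch with simplified names, not the PR's actual code in deduplicate_by_semantic_search):

```python
def choose_merge_group(anchor, candidates, is_cohesive):
    """Prefer the full candidate group; fall back to the tightest pair.

    `candidates` is assumed sorted by ascending vector distance from `anchor`.
    """
    group = [anchor, *candidates]
    if is_cohesive(group):
        return group  # existing behavior: cohesive full-group merge
    if candidates:
        pair = [anchor, candidates[0]]  # anchor + single closest neighbor
        if is_cohesive(pair):
            return pair  # pairwise fallback fires
    return []  # skip this anchor; cluster density may loosen on the next run
```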

2. Conflict-aware merge prompt with recency precedence

The merge prompt now:

  • Includes creation timestamps (UTC-normalized) for each memory
  • Instructs the LLM to prefer newer information when facts conflict
  • Drops superseded information rather than blending contradictions
  • Preserves concrete details (commit hashes, file paths, version numbers)
  • Enforces plain text output (no markdown formatting in merged text)
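A minimal sketch of assembling such a prompt follows. The wording and data model in the real merge_memories_with_llm will differ; build_merge_prompt and the dict shape are assumptions for illustration:

```python
from datetime import datetime, timezone

def build_merge_prompt(memories: list[dict]) -> str:
    """Assemble a conflict-aware merge prompt with per-memory UTC timestamps."""
    lines = [
        "Merge the memories below into one coherent memory.",
        "When facts conflict, prefer the newer memory and drop superseded",
        "information. Preserve concrete details (commit hashes, file paths,",
        "version numbers). Output plain text only, no markdown.",
        "",
    ]
    for i, mem in enumerate(memories, 1):
        created = mem["created_at"].astimezone(timezone.utc)  # UTC-normalize
        lines.append(f"Memory {i} (created {created:%Y-%m-%d %H:%M:%S} UTC):")
        lines.append(mem["text"])
        lines.append("")
    return "\n".join(lines)

prompt = build_merge_prompt([
    {"text": "use tool X for task Y",
     "created_at": datetime(2026, 4, 1, tzinfo=timezone.utc)},
    {"text": "switched from X to Z for task Y because X had issues",
     "created_at": datetime(2026, 4, 7, tzinfo=timezone.utc)},
])
```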

3. Topic and entity caps in merge_memories_with_llm

Module-level constants MAX_MERGED_TOPICS (15) and MAX_MERGED_ENTITIES (20) cap metadata after union. When over the limit, shorter strings are kept preferentially — shorter identifiers tend to be more precise while longer ones are often LLM-generated verbose synonyms.
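The cap-after-union step can be sketched as follows; cap_metadata is a hypothetical helper, and only the constants and the shorter-is-better rule come from the PR:

```python
MAX_MERGED_TOPICS = 15    # caps quoted from the PR description
MAX_MERGED_ENTITIES = 20

def cap_metadata(values: set[str], limit: int) -> list[str]:
    """Keep at most `limit` values, preferring shorter strings."""
    return sorted(values, key=len)[:limit]

topics = {f"topic-{'x' * n}" for n in range(20)}  # 20 topics, growing length
print(len(cap_metadata(topics, MAX_MERGED_TOPICS)))  # 15
```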

4. Litellm pickle compatibility patch

New module litellm_pickle_compat.py patches 14 litellm exception classes with custom __reduce__ methods that filter out unpickleable attributes before serialization. Auto-applied via side-effect import in docket_tasks.py. Note: This is the same fix as PR branch bsb/fix-litellm-pickle-231 (cherry-picked commits 2b7febd, 90a8122, 708f91a) — included here for completeness since that branch hasn't been merged yet.
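The patching approach can be illustrated on a stand-in class. This is a sketch only — the real litellm_pickle_compat covers 14 litellm classes, and stdlib pickle stands in for cloudpickle here:

```python
import pickle
import threading

class FakeLiteLLMError(Exception):
    """Stand-in with litellm-style mandatory args and an unpickleable lock."""
    def __init__(self, message, model, llm_provider):
        super().__init__(message)
        self.message = message
        self.model = model
        self.llm_provider = llm_provider
        self._lock = threading.RLock()

def _safe_reduce(self):
    # Re-supply the mandatory __init__ args and drop unpickleable state.
    state = {k: v for k, v in self.__dict__.items()
             if not isinstance(v, type(threading.RLock()))}
    return (type(self), (self.message, self.model, self.llm_provider), state)

FakeLiteLLMError.__reduce__ = _safe_reduce  # monkey-patch, as the PR does

restored = pickle.loads(pickle.dumps(FakeLiteLLMError("boom", "gpt-4o", "openai")))
print(restored.model)  # gpt-4o
```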

Test plan

  • Run compaction on a memory store with dense topical clusters — verify non-zero merges
  • Verify worker no longer crashes from pickle errors after sustained operation
  • Create two memories with conflicting facts (different timestamps), trigger merge, verify newer fact wins
  • Create memories with 20+ topics each, merge them, verify output has ≤15 topics
  • Verify existing full-group cohesive merges still work (pairwise fallback only fires when full group fails)
  • Verify "Pairwise merged memory X with Y" appears in logs for the fallback path
  • Verify cloudpickle round-trip for litellm exceptions (covered by test_docket_serialization.py)

Note

Medium Risk
Changes semantic deduplication/compaction behavior (including new pairwise-merge fallback) and LLM merge prompting, which can alter what memories get merged and deleted. It also monkey-patches third-party litellm exception classes at import time, which could have unexpected serialization or logging side effects if the state capture misses attributes.

Overview
Semantic compaction now has a pairwise fallback: when a candidate merge group fails the cohesion check, deduplicate_by_semantic_search retries a single merge with the closest neighbor (within threshold) to avoid dense-cluster “no merges ever happen” deadlocks.

LLM-based merging is made more conflict-aware by including per-memory UTC creation timestamps in the prompt and explicitly instructing the model to prefer newer facts and drop outdated contradictions; merged metadata growth is also bounded via caps on merged topics/entities.

Docket failure handling is hardened by adding litellm_pickle_compat (auto-imported in docket_tasks.py) to monkey-patch key litellm exception classes with custom __reduce__ so task exceptions round-trip through cloudpickle; new tests cover argument/exception serialization and adjust semantic compaction expectations for the fallback behavior.

Reviewed by Cursor Bugbot for commit c8da3eb.

…pickle-231

worker crashes with "TypeError: cannot pickle '_thread.RLock' object" when
docket tries to serialize litellm BadRequestError after exhausted retries.
patches litellm exception classes with custom __reduce__ to bypass
unpickleable attributes. cherry-picked 2b7febd, 90a8122, 708f91a.
compaction was merging 0 memories because _semantic_merge_group_is_cohesive
rejects all groups in dense clusters (every memory has "extra neighbors").
now falls back to pairwise merge with the single closest neighbor when the
full group is non-cohesive, breaking the deadlock.

also caps merged topics at 15 and entities at 20 to prevent unbounded
growth across successive merge rounds. shorter strings kept preferentially
as they tend to be more precise identifiers vs LLM-generated synonyms.
memories created from successive commits often conflict (commit A says
"use approach X", commit B supersedes with "use approach Y"). the old
merge prompt just blended both into one text, preserving contradictions.

new prompt includes creation timestamps and instructs the LLM to prefer
newer memories when facts conflict, drop superseded information, and
preserve concrete details (hashes, paths, versions). also enforces plain
text output to prevent markdown formatting in merged memory text.
Copilot AI review requested due to automatic review settings April 8, 2026 19:30
Contributor

Copilot AI left a comment


Pull request overview

This PR improves long-term memory semantic compaction behavior to make merges progress in dense clusters, produce conflict-aware merged text, and prevent topic/entity metadata from growing without bound; it also adds a Docket/cloudpickle compatibility patch for LiteLLM exceptions.

Changes:

  • Add pairwise merge fallback in deduplicate_by_semantic_search when a full candidate group fails the cohesion check.
  • Update merge_memories_with_llm to (a) cap merged topics/entities and (b) use a conflict/recency-aware merge prompt that includes timestamps.
  • Introduce litellm_pickle_compat monkey-patch and add tests validating cloudpickle round-tripping for Docket task args/exceptions.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

File Description
agent_memory_server/long_term_memory.py Adds topic/entity caps, conflict-aware merge prompt with timestamps, and pairwise merge fallback for dense semantic clusters.
agent_memory_server/litellm_pickle_compat.py Adds a pickle-safe __reduce__ patch for LiteLLM exception classes (auto-applied on import).
agent_memory_server/docket_tasks.py Imports the LiteLLM pickle-compat module for side-effect patching in Docket task workers.
tests/test_docket_serialization.py Adds regression tests for cloudpickle serialization of task arguments and LiteLLM exceptions after patching.


Comment thread agent_memory_server/long_term_memory.py Outdated
Comment on lines 7 to 10
from docket import Docket

import agent_memory_server.litellm_pickle_compat # noqa: F401 — patches litellm exceptions for cloudpickle
from agent_memory_server.config import settings

Copilot AI Apr 8, 2026


This PR’s title/description focuses on semantic compaction changes, but this file also introduces a global litellm exception monkey-patch (side-effect import) to address cloudpickle/Docket serialization, plus adds dedicated tests. Please update the PR description/title to mention this additional behavior change, since it affects runtime behavior for all Docket tasks and is orthogonal to compaction.

- move MAX_MERGED_TOPICS/ENTITIES to module level (ruff N806)
- fix comment indentation inside if-raise block (dead code appearance)
- use explicit UTC normalization for timestamps instead of isoformat[:19]
@Piotr1215 Piotr1215 changed the title fix: pairwise merge fallback, conflict-aware prompt, and metadata caps in compaction fix: compaction pairwise fallback, conflict-aware merge prompt, metadata caps, and litellm pickle compat Apr 8, 2026
the pairwise fallback intentionally changes behavior: dense clusters
that previously blocked all merges now allow closest-pair merges.
tests updated to verify the anti-snowball property (no mega-memories)
rather than asserting zero merges. fake_merge returns known text so
re-indexing after merge doesn't fail on ControlledEmbeddings lookup.
bridge pair (bridge+diet) passes pairwise cohesion because the distant
cluster member (sports) is outside the distance threshold from diet's
perspective. test now accepts either outcome and verifies at-most-one
merge. chain with 35-degree polar spacing allows adjacent-pair merges
so >= 2 surviving memories is the correct anti-snowball bound.

@cursor cursor Bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.



user_id_filter=user_id_filter,
session_id_filter=session_id_filter,
vector_distance_threshold=vector_distance_threshold,
)


Pairwise fallback ineffective in dense cluster scenario

Medium Severity

The pairwise fallback re-validates through _semantic_merge_group_is_cohesive, but for a pair, the extra_ids check is actually stricter than for the full group, not more lenient as intended. With candidate_memories=[closest], candidate_ids is just {closest.id}. The extra_ids = related_ids - candidate_ids computation subtracts a much smaller set than in the full-group case, so MORE of closest's neighbors appear as "extra." In the dense cluster scenario this fix targets — where every memory has neighbors outside the candidate group — the pair will almost always also fail cohesion, making the fallback a no-op and the deadlock unresolved.




@Piotr1215 Piotr1215 Apr 8, 2026


Re: Cursor Bugbot's concern about pairwise fallback being a no-op in dense clusters

The analysis is correct in the abstract but doesn't account for the distance threshold filtering. For a pair (anchor, closest), the cohesion check searches from closest and only finds neighbors within vector_distance_threshold (0.35). In dense topical clusters, the closest neighbor's other neighbors are often just outside that threshold.

Production evidence: first compaction run after the fix merged 17 memories (7 via pairwise fallback) from a store where the previous behavior consistently merged 0. The CI test suite (bridge, chain, snowball scenarios) also confirms the fallback fires correctly.

The key geometric insight: dense clusters in real embedding space don't form perfect cliques, they have varying inter-memory distances, so a tight pair often has a clean neighborhood even when the full N-member group does not.
