fix(rewards): recover missing reward disbursements and prevent future loss #826
Merged
… loss

Resolves a ~55k-row gap between challenge_disbursements and sol_reward_disbursements traced to three independent causes:

1. ProcessTransaction errors were silently discarded in the live program indexer and Backfiller, so per-tx failures advanced the slot checkpoint without ever reaching the retry queue. Both paths now surface errors so the retry queue (live) and zap logs (backfill) see them.

2. No cron ever invoked the existing Backfiller, so subscription gaps (e.g. the 17h45m outage on 2026-03-23) were never recovered. Added a gap-detection job that scans sol_slot_checkpoints, merges intervals, and dispatches Backfill on each gap via a Backfillable interface that indexers can implement per their own subscription shape.

3. Migration 0152's INNER JOIN on user_bank_accounts excluded ~29k rows from the original challenge_disbursements -> sol_reward_disbursements backfill. Migration 0201 re-runs the backfill using the latest sol_claimable_account per (wallet, AUDIO mint), recovering rows whose user is still current. Rows for hard-deleted users are intentionally skipped (no recoverable relational state).

Also moves the Backfiller into the program package since it's program-indexer-specific (walks GetSignaturesForAddress for hardcoded program IDs); other indexers will own their own Backfill implementations.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
raymondjacobson approved these changes on May 19, 2026
Summary
- When `ProcessTransaction` returns an error: `solana/indexer/program/indexer.go:113` now surfaces the error so it lands on the existing retry queue, and `solana/indexer/program/backfiller.go:210` now logs failures instead of advancing the cursor in silence.
- Adds a gap-detection job (`jobs/checkpoint_gap_backfill.go`, scheduled every 1h in `solana/indexer/solana_indexer.go`) that scans `sol_slot_checkpoints`, finds uncovered slot ranges per indexer name, and dispatches `Backfill(fromSlot, toSlot)` on anything satisfying the new `jobs.Backfillable` interface. Each indexer owns its own backfill strategy; the program indexer's lives next to it via the moved `Backfiller`.
- Adds `0201_backfill_missing_reward_disbursements.sql`, a one-shot recovery of ~29k legacy `challenge_disbursements` rows excluded by migration 0152's `INNER JOIN user_bank_accounts`. Uses a `LATERAL` against `sol_claimable_accounts` to pick the current AUDIO claimable account per user. Rows for hard-deleted users (~20k rows) are intentionally skipped because they have no recoverable relational state.

Background
~55k `challenge_disbursements` rows were missing from `sol_reward_disbursements`. Investigation split the gap into three causes, each addressed independently:

1. Errors from `ProcessTransaction` were discarded by both the live `HandleUpdate` path and `Backfiller.backfillAddressTransactions`. The slot checkpoint kept advancing past the failed tx, leaving no record. Fixed at the two source lines above.
2. The `Backfiller` type was never invoked in production: no caller existed outside its unit tests. The new cron job wires it up via the `Backfillable` interface and writes a checkpoint row after each successful gap fill so subsequent runs don't re-trigger.
3. The original `challenge_disbursements` → `sol_reward_disbursements` backfill inner-joined `user_bank_accounts`, dropping any disbursement whose author lacked a current user_bank entry. Migration 0201 re-runs that backfill using `sol_claimable_accounts` (the modern source for user banks) and `LATERAL` deduping.

Test plan
- `go test ./solana/indexer/program/ ./jobs/ ./solana/indexer/...` locally: all pass.
- Confirm the gap-detection job runs (look for `CheckpointGapBackfillJob` in indexer logs).
- Confirm the `sol_reward_disbursements` row count increases by ~29k and that `v_challenge_disbursements` now returns the previously-missing rewards for affected users.
- Confirm an `EvaluateAttestations` transaction with a per-tx processing error lands in `sol_retry_queue` instead of being silently dropped (can be confirmed against `sol_retry_queue.error` after deploy).
- `markGapFilled` should write a checkpoint row that suppresses re-triggering. If you see duplicate runs, the subscription_hash logic in `markGapFilled` needs review.

🤖 Generated with Claude Code