Skip to content

Add LastValidBlockTimestamp to BlockUndoSignal#754

Draft
Copilot wants to merge 4 commits intodevelopfrom
copilot/blo-832-add-lastvalidblocktimestamp
Draft

Add LastValidBlockTimestamp to BlockUndoSignal#754
Copilot wants to merge 4 commits intodevelopfrom
copilot/blo-832-add-lastvalidblocktimestamp

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented Apr 13, 2026

  • Update proto file: add last_valid_block_timestamp field to BlockUndoSignal in v2 proto
  • Regenerate Go protobuf pb files from updated proto
  • Add reversibleBlockTimestamps map to Pipeline struct for tracking block timestamps
  • Update handleStepNew to always store clock timestamps (remove nil guard, only skip for final blocks)
  • Update handleStepPartial (isLast) to always store clock timestamps (remove nil guard)
  • Update handleStepFinal to clean up timestamps when blocks become irreversible
  • Update handleStepStalled to clean up timestamps
  • Update handleUndo to clean up the undone block's timestamp and populate LastValidBlockTimestamp in the undo signal
  • Use ok idiom (found) to distinguish "block never tracked" from "block tracked with nil timestamp", avoiding false warnings
  • Upgrade missing-timestamp log from Debug to Warn
  • Leak analysis: all map entries added by handleStepNew and handleStepPartial are cleaned up by handleStepFinal, handleStepStalled, or handleUndo — no leaks found
  • Run tests to validate changes - all tests pass
Original prompt

Issue Title: Add LastValidBlockTimestamp on BlockUndoSignal messages
Issue Description: Requested from DataNexus, felt it was reasonable to add it

Thoughts on adding LastValidBlockTimestamp to BlockUndoSignal  ?

This would help large databases use indexing on block timestamp to help performance when scanning rows that are now invalid due to a reorg.

type BlockUndoSignal struct {
    LastValidBlock  *v1.BlockRef
    LastValidCursor string
}

We need to validate that the information is already available, but if it's the case, should be relatively easy to implement technically.

This is to be performed on https://github.com/streamingfast/substreams
Fixes https://linear.app/streamingfast/issue/BLO-832
Branch Name should include the identifier "BLO-832"

Issue Comments:
Comment by User (2026-04-13T17:25:00.581Z):
This thread is for an agent session with githubcopilot.

Copilot AI changed the title [WIP] Add LastValidBlockTimestamp on BlockUndoSignal messages Add LastValidBlockTimestamp to BlockUndoSignal Apr 13, 2026
Copilot AI requested a review from maoueh April 13, 2026 17:53
Copy link
Copy Markdown
Contributor

@maoueh maoueh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot Please also perform a references analysis of p.reversibleBlockTimestamps to ensure that we do not have places where the array could leak.

Comment thread pipeline/process_block.go Outdated
targetClock := blockRefToPB(reorgJunctionBlock)
lastValidBlockTimestamp := p.reversibleBlockTimestamps[reorgJunctionBlock.ID()]
if lastValidBlockTimestamp == nil {
reqctx.Logger(p.ctx).Debug("no timestamp found for reorg junction block, last_valid_block_timestamp will be unset in undo signal", zap.Stringer("reorg_junction_block", reorgJunctionBlock))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a valid warning here instead of a "silent" debug.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to Warn in e6c862e.

Comment thread pipeline/process_block.go Outdated
Comment on lines +393 to +395
if clock.Timestamp != nil {
p.reversibleBlockTimestamps[clock.Id] = clock.Timestamp
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assign in all cases.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in e6c862e — removed the nil guard in both handleStepPartial and handleStepNew. Also updated the lookup to use the ok idiom so storing a nil timestamp (block with no timestamp) doesn't falsely trigger the warning; the warning now only fires when the block ID was never tracked at all.

Copy link
Copy Markdown
Contributor Author

Copilot AI commented Apr 14, 2026

@copilot Please also perform a references analysis of p.reversibleBlockTimestamps to ensure that we do not have places where the array could leak.

Completed the reference analysis — no leaks found. Here's a summary:

Entries are added in two places:

  • handleStepNew (for non-final blocks only — isFinalBlock=false)
  • handleStepPartial (only when isLast=true, i.e. the completing partial block)

Every entry is removed by exactly one of three cleanup paths:

  • handleStepFinal — when the block becomes irreversible (StepIrreversible or the final phase of StepNewIrreversible)
  • handleStepStalled — when the block is stalled
  • handleUndo — deletes the undone block's entry (clock.Id) before looking up the junction block's timestamp

Additional notes:

  • StepNewIrreversible calls handleStepNew with isFinalBlock=true (no entry added) followed by handleStepFinal — safe.
  • Non-last partial blocks tracked in partialProcessingState.processedPartials never get entries in the map (only last partials do), so undoPartialStates() iterating over them doesn't cause any orphaning.
  • The junction block's entry is intentionally kept in the map after an undo — it remains the last valid block and will be cleaned up by its own final/stalled/undo event.

Copilot AI requested a review from maoueh April 14, 2026 17:38
@sduchesneau
Copy link
Copy Markdown
Contributor

This is very dangerous because there are cases where we have not seen the blocks and would send an UNDO signal with last_valid_block_timestamp=0 (ex: when resolving a cursor from oneblock files in bstream)
The answer will have to come from bstream itself which emits that signal, so that there is no chance that it is empty (a customer could add logic that delete all their data in database with timestamp after that... imagine the disaster)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants