Add timeout check for smartstacks #463

Open
timbeccue wants to merge 2 commits into main from feature/stacking-timeout

Conversation

timbeccue commented May 16, 2026

Contributor

Summary

Adds fixed timeout handling for smart-stack subframes.

The basic behavior is that an in-progress smart stack enters a timeout state if more than 15 minutes elapse between reduced subframes. Entering that state will trigger the final stack and cleanup routine for the smart stack (not yet implemented). The purpose is to prevent a hanging stack from indefinitely blocking the stack worker from picking up subsequent stack jobs.
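
As a rough illustration of the rule above, the check reduces to comparing the age of the newest reduced subframe against a fixed threshold. This is a minimal sketch, not the code in this PR: `has_timed_out` and its parameters are hypothetical names, and only the 15-minute value and the `STACK_TIMEOUT_SECONDS` constant name come from the change itself.

```python
from datetime import datetime, timedelta, timezone

# Fixed 15-minute threshold described in the summary.
STACK_TIMEOUT_SECONDS = 15 * 60


def has_timed_out(latest_reduced_at, now=None):
    """Return True if more than the fixed timeout has passed since the
    newest reduced subframe arrived (hypothetical helper)."""
    if now is None:
        now = datetime.now(timezone.utc)
    elapsed = (now - latest_reduced_at).total_seconds()
    # Strictly greater than: exactly 15 minutes is not yet a timeout.
    return elapsed > STACK_TIMEOUT_SECONDS
```

Passing `now` explicitly keeps the check deterministic in tests, which is presumably why a `now=None` parameter is a convenient shape for this kind of function.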

Details

  • Adds fixed latest-reduced-row timeout detection for active smart stacks
  • Marks incomplete stale stacks as timeout
  • Keeps complete stacks marked as complete before considering timeout
  • Makes get_subframes() return rows ordered by stack_num
  • Keeps timeout bookkeeping based only on reduced subframe rows with required filepaths
  • Updates smart-stacking tests and architecture docs for the fixed timeout behavior
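
The ordering of the checks in the list above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation in `banzai/stacking.py`: the `Subframe` dataclass, the `is_reduced` flag, the `expected_count` argument, and the rule that "complete" means the usable row count has reached `expected_count` are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

STACK_TIMEOUT_SECONDS = 15 * 60


@dataclass
class Subframe:
    """Hypothetical stand-in for a subframe database row."""
    stack_num: int
    created_at: datetime
    filepath: Optional[str]
    is_reduced: bool


def classify_stack(subframes, expected_count, now):
    """Return 'complete', 'timeout', or 'active' for one stack."""
    # Only reduced rows that have a filepath count toward bookkeeping.
    usable = [s for s in subframes if s.is_reduced and s.filepath]
    # Completeness is checked first, so a finished stack is never
    # reported as timed out, however old its last subframe is.
    if len(usable) >= expected_count:
        return "complete"
    # Assumption: with no usable rows there is no timestamp to measure from.
    if not usable:
        return "active"
    latest = max(s.created_at for s in usable)
    if (now - latest).total_seconds() > STACK_TIMEOUT_SECONDS:
        return "timeout"
    return "active"
```

Filtering before measuring matters: a row without a filepath would otherwise refresh the "latest" timestamp and delay the timeout without contributing anything stackable.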

timbeccue force-pushed the feature/stacking-timeout branch from ebbdffb to 79b9afd May 16, 2026 23:47
timbeccue force-pushed the feature/separate-site-deps branch from adf8300 to 5f760f6 May 16, 2026 23:47
timbeccue requested a review from cmccully May 18, 2026 17:57
timbeccue changed the title Add adaptive smart-stack timeouts Add timeout check for smartstacks May 20, 2026
timbeccue force-pushed the feature/stacking-timeout branch from 79b9afd to 396f4f8 May 20, 2026 00:08
timbeccue force-pushed the feature/separate-site-deps branch from 84e206a to c421b9e June 22, 2026 17:19

Copilot AI left a comment

Pull request overview

This PR adds a fixed “latest reduced subframe” timeout mechanism to smart-stacking so that incomplete stacks don’t remain active indefinitely (which can otherwise stall per-camera stacking workers). The stacking worker now performs a DB sweep after draining Redis notifications to finalize stacks that are either complete or have exceeded the fixed 15-minute threshold.

Changes:

  • Add fixed 15-minute timeout detection based on the newest reduced subframe created_at, with timeout/complete finalization during a per-camera sweep.
  • Make get_subframes() return rows ordered by stack_num, and add a helper to fetch active subframes for a camera.
  • Update unit tests, E2E tests, and architecture docs to reflect the new sweep/timeout behavior.
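
The "drain, then sweep" flow described above might look roughly like the following. It is only a sketch of the control flow: `worker_tick`, `group_by_stack`, the callback parameters, and the `stack_id` key are hypothetical, a `deque` stands in for the Redis notification queue, and plain dicts stand in for database rows.

```python
from collections import defaultdict, deque


def group_by_stack(rows):
    """Group rows by stack, with each group ordered by stack_num."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["stack_id"]].append(row)
    return {
        stack_id: sorted(group, key=lambda r: r["stack_num"])
        for stack_id, group in groups.items()
    }


def worker_tick(pending, fetch_active_rows, process_notification, finalize_if_due):
    """Run one worker iteration for a single camera."""
    # 1. Drain every queued notification first.
    while pending:
        process_notification(pending.popleft())
    # 2. Sweep the active rows so that complete or stale stacks are
    #    finalized even when no further notification ever arrives.
    finalized = []
    for stack_id, rows in group_by_stack(fetch_active_rows()).items():
        if finalize_if_due(stack_id, rows):
            finalized.append(stack_id)
    return finalized
```

Running the sweep after the drain means a stack whose last subframe never shows up is still picked up on the next tick, rather than waiting on a notification that will not come.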

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| docs/smartstacking_architecture.md | Documents the new fallback sweep and fixed timeout behavior (incl. updated flow). |
| banzai/stacking.py | Introduces STACK_TIMEOUT_SECONDS, timeout detection, and per-tick timeout sweep in the worker loop. |
| banzai/dbs.py | Orders get_subframes() by stack_num; adds get_active_subframes_for_camera() for sweep queries. |
| banzai/tests/test_smart_stacking.py | Adds unit tests for ordering and fixed timeout behavior; updates worker-loop resilience tests. |
| banzai/tests/site_e2e/test_site_e2e.py | Adds an E2E test that forces staleness via created_at update and verifies timeout. |

Comment thread banzai/dbs.py
Comment on lines +670 to +672
def get_active_subframes_for_camera(db_address, camera):
    """Get active subframe records for a camera ordered by stack and arrival time."""
    with get_session(db_address, site_deploy=True) as session:
Comment thread banzai/stacking.py
Comment on lines +146 to +148
def check_timeouts(db_address, camera, now=None):
    """Finalize active stacks that are complete or have exceeded the fixed timeout."""
    active_subframes = dbs.get_active_subframes_for_camera(db_address, camera)
timbeccue changed the base branch from feature/separate-site-deps to main June 24, 2026 21:02