Fix workflow stop in online mode by sending stop command to RQ worker #383
t0mdavid-m merged 2 commits into main
Job.cancel() only updates Redis registries; for jobs already executing in a worker it leaves the work-horse running, so the workflow keeps producing log output and the UI shows inconsistent state after Stop is pressed. cancel_job now sends rq.command.send_stop_job_command to the worker for started jobs, treats already-canceled/stopped jobs as success (idempotent double-clicks), and maps RQ's 'stopped' status to CANCELED in get_job_info so stopped jobs don't appear stuck in 'queued'.
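The control flow described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the function signature, the `send_stop` injection parameter, and the `_status_name` helper are assumptions made so the sketch can be exercised without a Redis server. The RQ calls it refers to (`rq.command.send_stop_job_command(connection, job_id)`, `Job.get_status()`, `Job.cancel()`) are real. Handling of `InvalidJobOperation` and `NoSuchJobError` is left out for brevity.

```python
def _status_name(job):
    """Return the job status as a plain string (RQ may return an enum or a str)."""
    status = job.get_status()
    return getattr(status, "value", status)


def cancel_job(job, connection, send_stop=None):
    """Stop a job; return True if it is, or already was, canceled or stopped.

    `send_stop` is a hypothetical injection point for testing. When it is
    omitted, the real RQ command is imported lazily.
    """
    if send_stop is None:
        from rq.command import send_stop_job_command as send_stop

    status = _status_name(job)

    # Idempotent: a second Stop click on an already stopped job is a success.
    if status in ("canceled", "stopped"):
        return True

    # Job.cancel() alone only updates Redis registries, so a job that is
    # already executing also needs the stop command sent to its worker.
    if status == "started":
        send_stop(connection, job.id)

    job.cancel()
    return True
```

Passing a stand-in for `send_stop` keeps the branch logic testable in isolation; in production the default path would use the real RQ command over the Redis connection.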
After PR #383 the RQ worker actually terminates on Stop, but the UI kept showing "running" and a second Stop click rendered "Errors occurred".

Two causes:
1. stop_workflow cleared .job_id on success, so get_workflow_status fell through to the local-mode pid_dir fallback. The killed worker left stale child PID files there, so the fallback flipped running back to True forever.
2. The static log-display branch only knew "WORKFLOW FINISHED" vs error, so a cancelled run was misreported as an error.

Fix: stop_workflow now writes a "WORKFLOW CANCELLED" marker via Logger and removes the stale pid_dir; .job_id is kept so the queue status flow stays authoritative and renders the Cancelled pill. StreamlitUI's static display dispatches through a new pure helper classify_log_outcome (finished/cancelled/error). Also fills in the missing canceled branch in _show_queue_status so the queue pill actually renders for canceled jobs.
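A pure helper like classify_log_outcome might look roughly like the sketch below. The two marker strings come from the description above; the signature, the return values as plain strings, and the choice to check the cancelled marker first are assumptions rather than the merged implementation.

```python
def classify_log_outcome(log_text: str) -> str:
    """Classify a finished log as 'finished', 'cancelled', or 'error'.

    Pure function: it only inspects the text it is given, so it can be
    unit-tested without Streamlit or a running queue.
    """
    # Assumption: the cancel marker wins if both markers are present.
    if "WORKFLOW CANCELLED" in log_text:
        return "cancelled"
    if "WORKFLOW FINISHED" in log_text:
        return "finished"
    # Neither marker present: treat the run as having ended with an error.
    return "error"
```

Keeping the classification separate from the rendering code means the display branch only has to map three outcomes to three UI states.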
Summary
Fixes a bug where clicking "Stop Workflow" in online mode (vendor queue) would not actually interrupt a running workflow. The worker would continue executing while the UI showed inconsistent state.
Problem
When QueueManager.cancel_job() was called on a job in the "started" state, it only called Job.cancel(), which marks the job as canceled in Redis registries but does not interrupt the worker that is actively executing the workflow. This left the worker running the job to completion while the UI showed the job as canceled.
Solution
- [...] rq.command.send_stop_job_command() to message the worker over Redis pubsub and interrupt the work-horse process.
- [...] Job.cancel() [...]
- [...] InvalidJobOperation errors [...]
- [...] False instead of raising
- [...] JobStatus.CANCELED mapping in get_job_info() so the UI correctly displays stopped jobs as canceled rather than queued.
Key Changes
- [...] cancel_job() to detect started jobs and send stop command before canceling
- [...] InvalidJobOperation and NoSuchJobError

https://claude.ai/code/session_01Ny1NgFejDt9w6mNnNpEFuB
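The status mapping in get_job_info() could be sketched along these lines. The JobStatus enum here is a hypothetical stand-in for the application's own type, the map_rq_status name is invented for illustration, and the fallback to QUEUED for unknown statuses is an assumption inferred from the description of stopped jobs appearing as queued before the fix.

```python
from enum import Enum


class JobStatus(Enum):
    """Hypothetical application-level status shown in the UI."""
    QUEUED = "queued"
    STARTED = "started"
    FINISHED = "finished"
    FAILED = "failed"
    CANCELED = "canceled"


# RQ status strings mapped to the application status.
RQ_STATUS_MAP = {
    "queued": JobStatus.QUEUED,
    "started": JobStatus.STARTED,
    "finished": JobStatus.FINISHED,
    "failed": JobStatus.FAILED,
    "canceled": JobStatus.CANCELED,
    # A job interrupted by the stop command ends up with RQ status 'stopped'.
    "stopped": JobStatus.CANCELED,
}


def map_rq_status(rq_status: str) -> JobStatus:
    """Translate an RQ status string; unknown values fall back to QUEUED (assumed)."""
    return RQ_STATUS_MAP.get(rq_status, JobStatus.QUEUED)
```

Without the explicit "stopped" entry, a fallback of this kind would silently report a stopped job as queued, which matches the symptom the PR describes.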