feat: Enhance Code City Apocalypse dashboard and metrics by BleakNarratives · Pull Request #35 · BleakNarratives/AIRTBench-Code

BleakNarratives · 2026-05-30T21:58:42Z

The dashboard for AIRTBench (Code City Apocalypse) has been significantly upgraded. Key improvements include a new terminal-style telemetry feed in the "System Logs" tab, accurate success tracking for both live and archived agent runs, and a more robust model leaderboard that doesn't skip agents with zero successes. Visual styling was also refined to better reflect the "digital restoration" theme when sectors are secured.

PR created automatically by Jules for task 15041334110763897136 started by @BleakNarratives

- Add a new "System Logs" tab with a live telemetry feed and timestamp parsing.
- Update success detection logic to include archive data via `flag_found_last_attempt_flag`.
- Ensure all active models (including those with 0 successes) are displayed in the leaderboard.
- Optimize 'Sectors Secured' metric to count unique successful challenges rather than total attempts.
- Add visual enhancements: neon-green success glow for secured beacons and normalized difficulty mapping.
- Add guard clauses for empty monster data to prevent dashboard crashes.
- Verified with Playwright and passed ruff/mypy linting.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
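To illustrate the leaderboard and 'Sectors Secured' items above, here is a minimal sketch of how such an aggregation might look. It is not the code from `airtbench/frontend.py`: the function name `summarize_runs` and the record keys `model`, `challenge`, and `success` are hypothetical, chosen only for this example.

```python
from collections import defaultdict


def summarize_runs(runs):
    """Aggregate run records into a leaderboard and a 'sectors secured' count.

    `runs` is an iterable of dicts with the hypothetical keys
    'model', 'challenge', and 'success' (a plain Python bool).
    """
    attempts = defaultdict(int)
    successes = defaultdict(int)
    secured_challenges = set()

    for run in runs:
        model = run["model"]
        attempts[model] += 1  # every model with a run gets a row
        if run["success"] is True:
            successes[model] += 1
            # A set counts each solved challenge once, not once per attempt.
            secured_challenges.add(run["challenge"])

    leaderboard = [
        {
            "model": model,
            "attempts": count,
            "successes": successes[model],  # 0 for models that never succeeded
            "success_rate": successes[model] / count,
        }
        for model, count in attempts.items()
    ]
    # Highest success rate first; ties broken alphabetically by model name.
    leaderboard.sort(key=lambda row: (-row["success_rate"], row["model"]))
    return leaderboard, len(secured_challenges)
```

Because the leaderboard is built from the attempt counts rather than from the success counts, a model with zero successes still appears, and an empty input yields an empty leaderboard and a count of 0 instead of an error.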

google-labs-jules · 2026-05-30T21:58:43Z

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.

For security, I will only act on instructions from the user who triggered this task.

gemini-code-assist

Code Review

This pull request introduces several enhancements to the Streamlit frontend, including a new 'System Telemetry Feed' tab, improved leaderboard success rate calculations that correctly handle models with zero cleared sectors, and visual updates like success-themed lore cards. Additionally, Playwright has been added as a development dependency. The review feedback highlights three important issues: a potential logic bug where boolean False values in flag_found_last_attempt_flag could be incorrectly classified as successes, potential null values appearing in the leaderboard, and a Python 3.10 compatibility issue when parsing ISO timestamps containing the 'Z' suffix.
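Two of the review points lend themselves to a short sketch. The helpers below are hypothetical (the names `is_success` and `parse_iso_timestamp` do not come from the PR) and assume the flag is stored as a plain Python `bool` or `None`. The timestamp helper relies on the fact that `datetime.fromisoformat` only accepts a trailing `Z` from Python 3.11 onward.

```python
from datetime import datetime, timezone


def is_success(flag_value):
    """Treat only an explicit True as success.

    A presence check such as `flag_value is not None` would also count
    False as a success; comparing against True avoids that.
    Assumption: values are plain Python bools or None, not NumPy booleans.
    """
    return flag_value is True


def parse_iso_timestamp(text):
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' on Python 3.10."""
    if text.endswith("Z"):
        # Rewrite the UTC designator into an offset that 3.10 understands.
        text = text[:-1] + "+00:00"
    return datetime.fromisoformat(text)


if __name__ == "__main__":
    stamp = parse_iso_timestamp("2026-05-30T21:58:42Z")
    print(stamp.tzinfo == timezone.utc, is_success(False), is_success(True))
```

If the flag column is loaded through pandas or NumPy, the identity check would need to be replaced with an explicit conversion, since NumPy booleans are not the `True` singleton.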

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: BleakNarratives <bleaknarratives@gmail.com>

gemini-code-assist Bot reviewed May 30, 2026


Comment thread airtbench/frontend.py Outdated

Comment thread airtbench/frontend.py Outdated

Comment thread airtbench/frontend.py

BleakNarratives and others added 3 commits June 4, 2026 10:14

Update airtbench/frontend.py

7cf097a

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: BleakNarratives <bleaknarratives@gmail.com>

Update airtbench/frontend.py

73757bf

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: BleakNarratives <bleaknarratives@gmail.com>

Update airtbench/frontend.py

715c42d

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: BleakNarratives <bleaknarratives@gmail.com>


BleakNarratives wants to merge 4 commits into `main` from `feat/code-city-dashboard-enhancements-15041334110763897136`
