
Refine and Enhance Code City Apocalypse Frontend #41

Open
BleakNarratives wants to merge 1 commit into main from frontend-enhancements-code-city-9407084791095251838

Conversation

@BleakNarratives
Owner

This PR improves the AIRTBench frontend dashboard with new interactive features: a 'PURGE RECORD' button for mission log management, an 'INITIATE LIVE TELEMETRY' simulation tool, and search/filter capabilities for the Digital Bestiary and Monster Manifest. It also refines the sidebar state indicators to better distinguish between live monitoring and archive modes.


PR created automatically by Jules for task 9407084791095251838 started by @BleakNarratives

- Implement 'PURGE RECORD' button in Fallout Inspector.
- Add 'INITIATE LIVE TELEMETRY' button for dashboard simulation.
- Add text search to Digital Bestiary.
- Add Danger Level filter to Monster Manifest.
- Improve sidebar state feedback (ACTIVE vs ARCHIVE).
- Fix Ruff SIM102 linting issue.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
@google-labs-jules

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.


@gemini-code-assist (bot) left a comment


Code Review

This pull request introduces several interactive features to the frontend, including danger level filtering for the monster manifest, search functionality for the digital bestiary, a run purging mechanism, and a button to initiate live telemetry with mock events. The review feedback highlights critical security and robustness improvements: first, validating and resolving paths before calling shutil.rmtree to prevent directory traversal, along with handling potential OSError exceptions; second, clearing the cached failure data and handling file-system errors when initiating live telemetry to avoid displaying stale data.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread airtbench/frontend.py
Comment on lines +1035 to +1039
if st.button("🗑️ PURGE RECORD", use_container_width=True) and run_path.exists():
    shutil.rmtree(run_path)
    st.toast(f"Sector {selected_id} purged from history.", icon="🔥")
    time.sleep(1)
    st.rerun()

high

The shutil.rmtree call is performed on run_path without validating that the path is strictly within the expected RUNS_DIR directory. This could potentially allow directory traversal if the selectbox state is manipulated. Additionally, file system operations can fail due to permission issues or locked files, raising an unhandled OSError that would crash the Streamlit application. Wrapping the deletion in a try-except block and resolving the paths to verify the directory structure improves both security and robustness.

Suggested change
if st.button("🗑️ PURGE RECORD", use_container_width=True) and run_path.exists():
    shutil.rmtree(run_path)
    st.toast(f"Sector {selected_id} purged from history.", icon="🔥")
    time.sleep(1)
    st.rerun()

if st.button("🗑️ PURGE RECORD", use_container_width=True) and run_path.exists():
    try:
        resolved_runs_dir = RUNS_DIR.resolve()
        resolved_run_path = run_path.resolve()
        if resolved_runs_dir in resolved_run_path.parents:
            shutil.rmtree(resolved_run_path)
            st.toast(f"Sector {selected_id} purged from history.", icon="🔥")
            time.sleep(1)
            st.rerun()
        else:
            st.error("Invalid directory path.")
    except OSError as e:
        st.error(f"Failed to purge record: {e}")
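The containment check in the suggestion can be tried outside Streamlit. The following is a minimal sketch, not the project's code: `is_strictly_inside` and `purge_run` are hypothetical helper names, and the run directory layout (one subdirectory per run id) is an assumption. It relies only on `Path.resolve()`, `Path.parents`, and `shutil.rmtree` from the standard library.

```python
import shutil
from pathlib import Path


def is_strictly_inside(base: Path, candidate: Path) -> bool:
    """Return True only if candidate resolves to a location below base."""
    resolved_base = base.resolve()
    resolved_candidate = candidate.resolve()
    # Path.parents lists ancestors only, so base itself is rejected too.
    return resolved_base in resolved_candidate.parents


def purge_run(runs_dir: Path, run_id: str) -> bool:
    """Delete runs_dir/run_id if it is a directory inside runs_dir.

    Hypothetical helper: returns True when a directory was removed and
    False when the path was rejected or missing. Any OSError raised by
    shutil.rmtree is left for the caller to handle.
    """
    target = runs_dir / run_id
    if not is_strictly_inside(runs_dir, target) or not target.is_dir():
        return False
    shutil.rmtree(target.resolve())
    return True
```

With this shape, an id such as `".."` or an absolute path resolves outside the runs directory and is refused, while the UI layer only has to catch `OSError` and report it.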

Comment thread airtbench/frontend.py
Comment on lines +1145 to +1173
if st.sidebar.button("📡 INITIATE LIVE TELEMETRY"):
    BRIDGE_DIR.mkdir(parents=True, exist_ok=True)
    with LIVE_EVENTS_FILE.open("w") as f:
        # Add mock events for the ticker
        mock_events = [
            {
                "event_type": "beacon_found",
                "challenge": "bear1",
                "model_display": "Jules Agent v1",
                "timestamp": time.time() - 60,
            },
            {
                "error_type": "kernel_timeout",
                "challenge": "bear2",
                "model_display": "Broken Agent",
                "timestamp": time.time() - 30,
            },
            {
                "error_type": "found_flag",
                "challenge": "bear3",
                "model_display": "Jules Agent v1",
                "timestamp": time.time(),
            },
        ]
        for event in mock_events:
            f.write(json.dumps(event) + "\n")
    st.sidebar.success("Telemetry uplink established.")
    time.sleep(1)
    st.rerun()

medium

When initiating live telemetry, the mock events are written to LIVE_EVENTS_FILE. However, load_failure_data is cached using @st.cache_data(ttl=5). If the user initiates telemetry, the dashboard might still display stale cached data for up to 5 seconds after the rerun. Explicitly clearing the cache with load_failure_data.clear() ensures the new telemetry is loaded immediately. Additionally, wrapping the file operations in a try-except block handles potential OSError exceptions gracefully.

        if st.sidebar.button("📡 INITIATE LIVE TELEMETRY"):
            try:
                BRIDGE_DIR.mkdir(parents=True, exist_ok=True)
                with LIVE_EVENTS_FILE.open("w") as f:
                    # Add mock events for the ticker
                    mock_events = [
                        {
                            "event_type": "beacon_found",
                            "challenge": "bear1",
                            "model_display": "Jules Agent v1",
                            "timestamp": time.time() - 60,
                        },
                        {
                            "error_type": "kernel_timeout",
                            "challenge": "bear2",
                            "model_display": "Broken Agent",
                            "timestamp": time.time() - 30,
                        },
                        {
                            "error_type": "found_flag",
                            "challenge": "bear3",
                            "model_display": "Jules Agent v1",
                            "timestamp": time.time(),
                        },
                    ]
                    for event in mock_events:
                        f.write(json.dumps(event) + "\n")
                st.sidebar.success("Telemetry uplink established.")
                load_failure_data.clear()
                time.sleep(1)
                st.rerun()
            except OSError as e:
                st.sidebar.error(f"Failed to initiate telemetry: {e}")
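The stale-read behaviour described in this comment can be illustrated without Streamlit. The sketch below swaps in a tiny hand-written TTL cache (`TTLCache` is a hypothetical stand-in, not `st.cache_data`) and an in-memory list in place of `LIVE_EVENTS_FILE`. It only shows why a read inside the TTL window returns the old data until the cache is cleared.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """Hypothetical stand-in for a TTL cache around a zero-argument loader."""

    def __init__(self, loader: Callable[[], Any], ttl: float) -> None:
        self._loader = loader
        self._ttl = ttl
        self._value: Any = None
        self._loaded_at: Optional[float] = None

    def __call__(self) -> Any:
        now = time.monotonic()
        # Reload only when nothing is cached or the entry has expired.
        if self._loaded_at is None or now - self._loaded_at >= self._ttl:
            self._value = self._loader()
            self._loaded_at = now
        return self._value

    def clear(self) -> None:
        # Drop the cached entry so the next call reloads immediately.
        self._value = None
        self._loaded_at = None


# In-memory substitute for the events file (an assumption for this sketch).
events = ["old_event"]
load_failure_data = TTLCache(lambda: list(events), ttl=5.0)

first = load_failure_data()      # populates the cache
events.append("beacon_found")    # the underlying data changes
stale = load_failure_data()      # still inside the TTL: cached copy
load_failure_data.clear()        # what the review suggests doing
fresh = load_failure_data()      # reloaded, sees the new event
```

Here `stale` still holds only the old event, while `fresh` includes the new one, which mirrors the up-to-5-second delay the comment warns about.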

