Show child process cpu usage in dtop #1880

Open
aclauer wants to merge 11 commits into dev from andrew/feat/dtop-subprocess-cpu-usage

@aclauer
Collaborator

@aclauer aclauer commented Apr 18, 2026

Problem

dtop only shows CPU usage for the Python workers spawned by DimOS. Native modules spawned by a worker don't show up in the CPU statistics. This PR also adds a --log flag to log dtop statistics and a dtop-plot command to generate plots of CPU usage.

Closes DIM-XXX

Solution

Read the PIDs of the processes each worker spawns and show their CPU usage in a drop-down under that worker's row.
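
For reviewers who want the shape of the data in front of them, here is a minimal sketch of how per-child CPU could be folded into a worker's total. The class names `ChildProcessStats` and `WorkerStats` come from this PR, but every field shown here and the `fold_children` helper are assumptions made for illustration, not the definitions in `stats.py` or `monitor.py`.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ChildProcessStats:
    # Field names are illustrative assumptions, not the PR's actual fields.
    pid: int
    name: str
    cpu_percent: float


@dataclass
class WorkerStats:
    pid: int
    cpu_percent: float  # worker's own CPU plus the CPU of its children
    children: list[ChildProcessStats] = field(default_factory=list)


def fold_children(
    worker_pid: int, own_cpu: float, children: list[ChildProcessStats]
) -> WorkerStats:
    """Hypothetical helper: add each child's CPU to the worker's own CPU."""
    total = own_cpu + sum(child.cpu_percent for child in children)
    # Copy the list so later mutation by the caller does not alter the stats.
    return WorkerStats(pid=worker_pid, cpu_percent=total, children=list(children))
```

With this layout the worker row can show the folded total while the drop-down lists each entry in `children` separately.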

dtop

Breaking Changes

None

How to Test

dimos --dtop --replay --replay-db=go2_bigoffice run unitree-go2

and

dtop

When dimos spawns the viewer, it will show up as a subprocess of the rerun bridge worker.

Contributor License Agreement

  • I have read and approved the CLA.

@aclauer aclauer changed the title from "Initial subprocess display" to "Show child process cpu usage in dtop" Apr 18, 2026
Comment thread dimos/core/resource_monitor/stats.py
Comment thread dimos/utils/cli/dtop.py Outdated
@aclauer aclauer marked this pull request as ready for review April 24, 2026 21:13
@aclauer aclauer requested a review from paul-nechifor April 24, 2026 21:13
@greptile-apps
Contributor

greptile-apps Bot commented Apr 24, 2026

Greptile Summary

This PR extends dtop to collect and display CPU usage of child processes spawned by each worker, adds a --log flag for JSONL output, and introduces a new dtop-plot CLI for generating matplotlib plots from those logs. The core logic—collecting children via proc.children(), caching them through _get_process, and folding their CPU into the parent's total—is sound and well-structured.

Confidence Score: 5/5

Safe to merge; all findings are P2 memory-growth issues that don't affect correctness.

Both new comments are P2 (unbounded growth in _proc_cache and _child_cpu_history for terminated child PIDs). These don't cause crashes or incorrect data in typical usage and are not regressions — they only matter for very long-running sessions with high child-process churn. No P0 or P1 issues found.
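
One possible way to close the growth gap described above is to prune PID-keyed state after each poll and to cap per-child history length. The sketch below is only an assumption about how that might look: `prune_dead_pids` and `record_cpu` are hypothetical helpers, and the plain dicts stand in for `_proc_cache` and `_child_cpu_history`, whose real types are not shown in this thread.

```python
from __future__ import annotations

from collections import deque


def prune_dead_pids(live_pids: set[int], *pid_keyed: dict) -> int:
    """Delete entries whose PID is not in live_pids; return how many were removed."""
    removed = 0
    for mapping in pid_keyed:
        # Build the list of stale keys first so the dict is not mutated mid-iteration.
        for pid in [p for p in mapping if p not in live_pids]:
            del mapping[pid]
            removed += 1
    return removed


def record_cpu(history: dict[int, deque], pid: int, cpu: float, maxlen: int = 60) -> None:
    """Append a sample, keeping at most maxlen samples per PID."""
    history.setdefault(pid, deque(maxlen=maxlen)).append(cpu)
```

If the cache is shared between workers and children, `live_pids` would need to contain the worker PIDs as well as the children seen in the current poll, otherwise live workers would be evicted too.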

See dimos/core/resource_monitor/stats.py and dimos/utils/cli/dtop.py for the cache-cleanup gaps.

Important Files Changed

| Filename | Overview |
| --- | --- |
| dimos/core/resource_monitor/stats.py | Adds ChildProcessStats dataclass and collect_children_stats() to gather per-child CPU usage; extends WorkerStats with a children field. Child PIDs are cached via _get_process but dead child entries are never evicted from _proc_cache. |
| dimos/utils/cli/dtop.py | Adds child-process rows beneath each worker in the TUI, extracts CPU rendering into _cpu_metric, adds --log flag for JSONL output. _child_cpu_history grows without bound as children come and go. |
| dimos/utils/cli/dtop_plot.py | New dtop-plot CLI tool that reads a JSONL log and produces matplotlib plots. Works correctly for well-formed logs; previously-flagged NaN/KeyError edge cases remain unaddressed. |
| dimos/core/resource_monitor/monitor.py | Calls the new collect_children_stats after collect_process_stats and folds children's CPU into the parent's cpu_percent; passes children list to WorkerStats. Logic is straightforward and correct. |
| pyproject.toml | Registers dtop-plot as a console script entry point pointing to the new dtop_plot:main. |

Sequence Diagram

sequenceDiagram
    participant M as StatsMonitor
    participant S as stats.py
    participant LC as LCM
    participant D as dtop TUI

    loop every poll interval
        M->>S: collect_process_stats(worker_pid)
        S-->>M: ProcessStats (parent CPU, mem, …)
        M->>S: collect_children_stats(worker_pid)
        S->>S: proc.children(recursive=False)
        loop each child pid
            S->>S: _get_process(child.pid) → cached Process
            S->>S: cpu_percent(interval=None)
        end
        S-->>M: list[ChildProcessStats]
        M->>M: ps_dict[cpu_percent] += Σ child cpu
        M->>M: WorkerStats(…, children=[…])
        M->>LC: publish ResourceStatsMessage
    end

    LC-->>D: _on_msg(msg)
    D->>D: write JSONL line (if --log)
    D->>D: _refresh() → render child rows under each worker

Reviews (6): Last reviewed commit: "Add log flush"

Comment thread dimos/utils/cli/dtop.py
Comment thread dimos/utils/cli/dtop.py
Comment thread dimos/utils/cli/dtop.py
Comment thread dimos/utils/cli/dtop_plot.py
Comment thread dimos/utils/cli/dtop_plot.py
@aclauer aclauer requested a review from leshy May 3, 2026 00:39
3 participants