Predictive soft-cooling for Linux. Local realtime ML that smooths thermal spikes before the reactive controller wakes up.
Read-only dashboard at http://127.0.0.1:18889/ — newspaper-style instrument log over live thermal telemetry.
When the predictor sees a load spike in the next thirty seconds, the fans get a head start. The boost limit lifts before silicon heats. By the time the workload actually lands, RPM is already at target — no impeller inertia, no radiator warm-up lag. The mirror trick on the way down: drop the boost limit as soon as load falls, before the fan curve starts chasing a temperature that's already dropping.
The reactive controller is not broken. It is structurally late, because fans have inertia and silicon doesn't. coolstep operates in the few seconds the reactive loop can't reach.
→ docs/concept.md · docs/physics-rationale.md
The same KNN runs on a laptop and on a rack. The signals differ — fan RPM and EPP versus inlet temperature and BMC power — but the inference is the same. coolstep auto-detects which side of the line you're on and arranges itself: quiet and responsive on the desktop, foresight on the server. One CLI flag overrides the detection when needed.
A laptop user asks "why is the fan loud right now?" A rack admin asks "when will this chassis degrade?" Same daemon, different question, different presentation.
asusctl, ryzenadj, nbfc-linux, coolercontrol,
nct6687-driver-dkms-git — these already work, already shipped, already
loved by their communities. coolstep is the conductor, not the violin.
When a driver is missing on your distro, you get a pointer to install
it, not a warning to ignore. The orchestrator never writes to hwmon
directly; it wraps the tools that know what they're doing.
→ docs/stack-decisions.md (ADR-010)
You can run coolstep as a read-only analytical dashboard and never opt in to hardware writes. Many users do exactly that. The "armed" mode — actually nudging the fan curve — is gated behind eight calibration checks and stays disabled by default until you decide otherwise. The dashboard is the product; the actuator is the privilege.
The predictor cockpit tile makes one promise: every number is paired with the parameter that gives it meaning, so the operator never reads a naked figure they have to translate in their head.
| where | what it shows | paired with |
|---|---|---|
| LIVE NOW | latest sample, °C | Δ — predicted change over the +5 s horizon |
| TREND | phrase: →78° in 6s · cooling −10°/21s · asymptote eq ≈ 76° · past knee · steady |
raw dT/dt printed underneath for the operator who wants the °C/s anyway |
| Canvas | past 30 s actual (gold trail) overlaid with forecast +5 s (dashed cool) | σ-corridor shaded around the forecast, 78 °C knee + 90 °C danger as dashed reference lines |
| Past predictions | hollow rings where we said the chip would land | thin segment to where it actually did, coloured by |residual| (≤ 2° / 2–5° / > 5°) |
| ⚡ Spike chip | visible when the predictor was surprised: |residual| ≥ 5 °C for 2 consecutive ticks | duration · max |residual| · workload label; closure writes an Incident(kind="predictor_spike") so future zen / Steam / speedtest fingerprints get recognised |
| ±err 15 M / 30 S | twin pill — slow rolling quality vs what this workload is doing right now | same colour ladder, so the split reads as "long-haul fine, transient spike" at a glance |
Canvas detail — gold actual trail descends from a past plateau; hollow rings mark where we previously said it would land, with thin segments to where it actually did (red = miss > 5°). Forecast at now sits inside the σ-corridor; the knee reference line is the dashed band cutting horizontally across.
The math underneath is intentionally pedestrian. Newton-cooling
saturation through T₀ + s·τ·(1 − e^(−h/τ)) for the +5 s forecast,
a Bayesian shrinkage prior on the per-bucket residual correction so
two agreeing samples don't report σ = 0.01 °C confidence, one OLS
slope over the last five frames for the live dT/dt readout. Stdlib
only, no scipy. The forecast curve anchors on the meta-corrected
endpoint so the dashed line and the predicted ring agree — operator
never sees "line shoots to 91°, ring sits at 85°".
→ docs/stack-decisions.md (ADR-020, ADR-021) ·
screenshots in v0.5.4
- v0.5.5 release notes
Nothing leaves the host. No model upload, no telemetry, no cloud sync. Thermal data correlates with what you're doing — your apps, your schedule, your day. coolstep treats that as personal data and keeps it on disk where it belongs.
→ docs/stack-decisions.md (ADR-008)
Coolstep shares the thermal/power surface with other actors —
power-profiles-daemon, TLP, asusctl, thermald, game-mode.service,
Feral gamemoded, BMC firmware, and (worst of all) another coolstep
instance. v0.5.1 introduces a two-layer test pillar that makes those
coordination points visible:
Hardware matrix — 12 synthetic snapshots replay representative
/sys + /proc + /etc trees through detect_caps() end-to-end:
| Class | Snapshot |
|---|---|
| ASUS gaming laptop | TUF A15 FA507XV (Ryzen 9 7940HS + 4060M) |
| AMD desktop + dGPU | Ryzen 9 7950X + RTX 4090, Fedora + KDE |
| Intel ultrabook | ThinkPad X1 Carbon Gen 11, Ubuntu + GNOME |
| AMD laptop, non-ASUS | Framework 13 (Ryzen 7 7840U), Debian + Hyprland |
| Handheld | Steam Deck OLED (SteamOS, gamescope) |
| Intel desktop (Mac hw) | Mac Mini i5-4278U, Fedora + applesmc |
| ARM SBC | Raspberry Pi 5 (BCM2712) |
| Rack server (Intel) | Dell PowerEdge R750 + iDRAC9 Redfish |
| Rack server (Intel) | HP ProLiant DL380 Gen10 + iLO5 |
| Rack server (AMD) | Supermicro H12 dual EPYC 7763 |
| Cloud ARM (KVM) | Oracle Cloud Free Tier Ampere A1 |
| Container | Alpine LXC (degenerate /sys) |
Interference matrix — 17 scenarios assert what happens when coolstep collides with another actor that owns part of the same surface. 9 covered gates · 4 open gaps · 4 structural separations. The matrix is honest: known unfixed conflicts carry explicit fix proposals and stay visible until closed.
→ docs/interference-matrix.md ·
docs/hw-matrix.md
The dashboard binds 127.0.0.1:18889. Loopback is the gate; remote
exposure is your choice and your auth layer — SSH tunnel for one user,
reverse proxy with basic-auth / Authelia / oauth2-proxy for shared
access, WireGuard for closed admin networks. The daemon refuses to
bind a non-loopback address without an explicit --allow-public flag,
ships transport-layer hardening middleware, and writes every mode
switch to the actuator journal.
→ docs/security.md · docs/headless-deployment.md · SECURITY.md
Two install paths. Both put a coolstep binary on your PATH that
runs from anywhere, plus two daemon entry points (coolstep-collector,
coolstep-dashboard) that you opt into separately.
⚠ Never
sudo pip install. coolstep is a user-level tool. The daemon runs as your regular user; data and config live under~/.local/and~/.config/. Installing as root puts files where the user daemon can't reach them and breaks env-var inheritance from your shell. If you ever need privileged access for hardware writes (fan curves, RAPL, ryzenadj), it is granted per-capability via systemd unit overrides — never by running the whole tool as root. See docs/privileges.md.
Heads-up: install only puts the binaries on
PATH. It does not start the daemon. After install,coolstep compatworks immediately as a read-only diagnostic; the daemon is a deliberate second step. See What runs, what doesn't.
pipx install git+https://github.com/zzallirog/coolstepWhy this path. pipx puts coolstep in its own isolated venv at
~/.local/share/pipx/venvs/coolstep and symlinks the binaries into
~/.local/bin/. Distro-agnostic, no PEP 668 conflicts, no clashes
with system Python packages. Easy to upgrade
(pipx upgrade coolstep) and remove (pipx uninstall coolstep).
Expected output. A progress bar, then:
installed package coolstep 0.5.4, installed using Python 3.12+
These apps are now available:
- coolstep
- coolstep-collector
- coolstep-dashboard
done! ✨ 🌟 ✨
If pipx is missing. pacman -S python-pipx (Arch) /
apt install pipx (Debian, Ubuntu) / dnf install pipx (Fedora).
If pipx warns about existing files in ~/.local/bin/ — you have
a pip --user install (path B below) holding the same shim names.
Either run pip uninstall --break-system-packages coolstep first and
retry, or stay on path B; the two are mutually exclusive.
pip install --user --break-system-packages git+https://github.com/zzallirog/coolstepWhy this path. Smallest possible install — no extra tool, no extra
venv, just packages added to your ~/.local/lib/python*/site-packages/
tree. Use this if you already manage system Python deliberately and
know what you're opting into.
Why the awkward flag. Modern Arch, Debian and Fedora ship system
Python with PEP 668 enabled, which refuses bare pip install by
default to protect distro-managed packages from being shadowed.
--break-system-packages opts out for this one command only; --user
keeps the install in ~/.local/ and never touches system packages.
Expected output. Standard pip output, ending in:
Successfully installed coolstep-0.5.4
Right after install — regardless of path — three things are true:
-
coolstep compatworks immediately. Run it from any directory. It prints the platform report (which collectors activated, what's available, what's missing, how to fix gaps). Read-only, never touches hardware. This is the fastest sanity check that the install succeeded. -
The daemon is not running. Neither
coolstep-collectornorcoolstep-dashboardis started. The dashboard athttp://127.0.0.1:18889/does not respond yet. -
No systemd units are enabled. They're installed and visible to
systemctl --user list-unit-files | grep coolstep, but neither is active. You decide when to enable them.
To bring the daemon up:
coolstep install-units # writes systemd unit files (pipx/pip only)
systemctl --user daemon-reload
systemctl --user enable --now coolstep-collector coolstep-dashboard
xdg-open http://127.0.0.1:18889/coolstep install-units drops the unit files into
~/.config/systemd/user/. pipx and pip don't install systemd units
automatically, so this step is needed for non-AUR installs. If you
installed via a future AUR package, the units are already in
/usr/lib/systemd/user/ and you can skip this command.
enable --now does two things at once: marks the units to start at
login and starts them immediately. Without --now, they'd only fire
on next login. Without enable, they'd run this session and forget on
reboot.
To verify daemon health:
systemctl --user status coolstep-collector coolstep-dashboard
coolstep doctor # full health checkTo stop and disable:
systemctl --user disable --now coolstep-collector coolstep-dashboardHardware actuators stay in dry-run by default. Even with the daemon
enabled, no fan curves move and no sysfs nodes are written until you
explicitly set COOLSTEP_ACTUATOR_ENABLE=true in the systemd unit
override — and even then, only after the eight calibration gates
(docs/calibration-gates.md) clear.
| Command | Purpose |
|---|---|
coolstep adapters |
List discovered collectors and actuators |
coolstep tail |
Live-stream telemetry ticks to the terminal |
coolstep stats |
Summary statistics over a time window |
coolstep drift |
Show model-drift indicators |
coolstep efficiency |
Thermal-efficiency report |
coolstep export |
Export telemetry to CSV/JSON |
coolstep doctor |
Full health check (daemon, units, hardware) |
coolstep export-telemetry |
Export raw telemetry frames |
coolstep history |
Decision-history log |
coolstep export-profile |
Export current workload profile |
coolstep import-profile |
Import a saved workload profile |
coolstep compat |
Platform compatibility report |
coolstep install-units |
Write systemd user units (pipx/pip installs) |
coolstep predict-debug |
Inspect predictor buckets and weights |
coolstep predict-replay |
Replay predictions over historical data |
- Upgrade.
pipx upgrade coolstep(Path A) orpip install --user --break-system-packages -U git+https://github.com/zzallirog/coolstep(Path B), thensystemctl --user restart coolstep-collector coolstep-dashboard. State-file migration contract + known schema breaks →docs/upgrade.md. - Uninstall.
pipx uninstall/pip uninstallremove the package only. Unit files in~/.config/systemd/user/and runtime state in~/coolstep/data/(up to ~2 GB) remain orphan — full purge recipe →docs/uninstall.md. - Storage. Default 1 Hz collector × 14d retention ≈ 1.5–2 GB
steady-state for
store.db. Knobs (--period,COOLSTEP_HOME) and concrete production numbers →docs/headless-deployment.md#storage-footprint.
- Linux kernel ≥ 5.10
- Python 3.10–3.13 recommended. Python 3.14 triggers a known
chromadbrust-bindings segfault — the daemon detects and falls back toAlwaysIdleBaselineautomatically (no KNN, predictions return 0.0), but you lose the predictor entirely. Either downgrade to 3.13 or setCOOLSTEP_CHROMA_DISABLED=1explicitly to silence the warning and run dashboard-only. See troubleshooting. - One of
pipx,pip, oruv(most distros have at least one pre-installed; minimal hosts like Proxmox base or Alpine may need a one-time install — see troubleshooting) - systemd (only for the daemon mode —
coolstep compatworks without it on any init system) - No root required for the dashboard,
coolstep compat, or any read-only collector. Some optional collectors and all hardware-write actuators do want extra capabilities — see docs/privileges.md for the full table and the per-capability systemd drop-in templates.
| Distro | Status |
|---|---|
| Arch Linux (Hyprland / KDE / GNOME) | ✅ verified — daily-driver target |
| Debian 12 / Ubuntu 22.04+ | ✅ verified — headless deployment |
| Fedora 39+ / RHEL 9+ | |
| openSUSE Tumbleweed / Leap | |
| Alpine, NixOS | |
| macOS, Windows | ❌ P4+ roadmap |
docs/hw-matrix.md includes Fedora / RHEL / Ubuntu / Debian static
/sys + /proc + /etc snapshots — these exercise detect_caps()
parsing, not end-to-end runtime. Real install reports from non-Arch /
non-Debian distros are welcome via
GitHub Discussions.
| Phase | Status |
|---|---|
| P0 — foundation, four collectors, dashboard (now 12 collectors / 6 actuators) | ✅ |
| P1 — calibration window, throttle FSM, audit closure | ✅ |
| P2 — actuator stack with sandbox-first defaults | ✅ |
| P2.5 — perf and ML/control hardening | ✅ |
| P2.6 — predictor cockpit relational metrics + 30 s err chip | ✅ v0.5.4 |
| P2.7 — spike-driven training archive (detector + incident plumbing) | ✅ v0.5.4 |
| P3 — two deployment targets, three-layer manifest, community pointers | ✅ v0.5.0 |
| P2.8 — memory layers (chromadb pin, warm-start reindex, embedder-stats persistence) | ✅ v0.5.6 |
| P2.9 — cockpit (10 Hz daemon, dual-curve canvas, multi-horizon toggle, rAF glide) | ✅ v0.5.6–v0.5.8 |
P2.10 — workload awareness rework (Profile.BROWSER, steam_app_* prefix, reactive slope_preload, background-vote resolver) |
✅ v0.5.13 |
| P2.11 — trajectory features in KNN embedding (26→29-dim) + sparse-index read-path robustness | ✅ v0.5.15 |
| P3.1 — dashboard target-aware tile reordering | ⚪ |
P3.2 — coolstep manifest update (community feed sync) |
⚪ |
P3.3 — coolstep doctor --facet=server |
⚪ |
| P3.4 — Prometheus export endpoint | ⚪ |
| P4 — Windows / macOS adapters | ⚪ future |
| File | Topic |
|---|---|
docs/concept.md |
Why this exists at all |
docs/physics-rationale.md |
Arrhenius and Poole–Frenkel, why peaks beat averages |
docs/architecture.md |
Module map, the tick loop, the three independent layers |
docs/telemetry-schema.md |
Every field, every collector, how the frame composes |
docs/calibration-gates.md |
The eight gates between dry-run and armed mode |
docs/efficiency-curve.md |
work_per_degree, the sweet spot, and the knee |
docs/curve-ownership.md |
Who manages the fan curve at each layer — BIOS, vendor tool, your profile, coolstep bias — and where coolstep's authority ends |
docs/drift-detection.md |
Seven indicators that the model has gone stale |
docs/stack-decisions.md |
Twenty-one ADRs covering why this stack and not another |
docs/p3-plan.md |
The two-target design and the three-layer manifest |
docs/troubleshooting.md |
Every warning coolstep compat can print, with per-distro fixes |
docs/privileges.md |
What needs root, why, and how to grant the minimum safely |
docs/upgrade.md |
Upgrade commands per install path, state-file migration contract, known schema breaks |
docs/uninstall.md |
Full purge recipe — package, units, drop-ins, runtime state, journals |
CONTRIBUTING.md |
What we accept readily and what needs discussion |
CHANGELOG.md |
Feature history by version |
- Something broke or behaves wrong?
Run
coolstep compat --install-planfirst — it prints the per-distro fix for most missing-dependency cases. If that's not enough, the troubleshooting guide covers every warningcoolstep compatcan emit, with per-distro commands. - Question, idea, show-and-tell? GitHub Discussions is the canonical place. Lower friction than issues, easier to find later.
- Bug you can reproduce?
Open an issue.
The bug-report template asks for
coolstep compat --json— including that one piece is the difference between a triaged issue and one that sits open for a week. - Hardware not detected?
Use the hardware-support issue template.
Single-chip additions usually land within 48 h. Most additions are
one entry in
coolstep/compat/core.json.
MIT — see LICENSE.



