
fix(onboard): bind gateway to 0.0.0.0 on Docker Desktop WSL (#5513) #5534

Open
abhi-0906 wants to merge 1 commit into NVIDIA:main from abhi-0906:fix/wsl-docker-desktop-gateway-bind

Conversation

@abhi-0906

@abhi-0906 abhi-0906 commented Jun 17, 2026

Contributor

Summary

On Docker Desktop + WSL2, nemoclaw onboard step [2/8] starts the host-mode OpenShell gateway bound to NemoClaw's default 127.0.0.1. Sandbox containers reach the gateway via Docker's host-gateway route, which Docker Desktop maps to its own bridge IP rather than the WSL distro loopback — so the [2/8] sandbox-bridge reachability probe fails with Connection refused, 100% of the time, blocking onboarding on every Docker Desktop WSL host.

Root cause

NemoClaw chooses the gateway bind address and passes it to OpenShell via OPENSHELL_BIND_ADDRESS. In host mode that value defaulted to 127.0.0.1 (src/lib/core/gateway-address.ts, src/lib/onboard/docker-driver-gateway-env.ts) with no Docker Desktop WSL branch. The containerized compatibility path already binds 0.0.0.0 for exactly this reason; the host-mode gateway (used on modern-glibc WSL distros) did not.

Fix

Resolve the effective bind address with Docker Desktop WSL awareness:

  • No explicit NEMOCLAW_GATEWAY_BIND_ADDRESS override → bind 0.0.0.0 on Docker Desktop WSL so the host-gateway route reaches the gateway; 127.0.0.1 everywhere else.
  • An explicit override still wins (including forcing 127.0.0.1 back on).
  • Clients still connect over loopback (getGatewayConnectHost maps 0.0.0.0 → 127.0.0.1); only the listen surface widens, and the existing wildcard-bind warning now fires for the auto-widened case.
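
The precedence in the list above can be sketched as follows. This is a minimal illustration, not the PR's code: `resolveBindAddress`, `connectHostFor`, the `Env` type, and the trim-only handling of the override are assumptions made for the example, and the real `parseGatewayBindAddress` may validate the value differently.

```typescript
// Hypothetical sketch; function names and signatures are illustrative only.
type Env = Record<string, string | undefined>;

const LOOPBACK = "127.0.0.1";
const WILDCARD = "0.0.0.0";

// Priority: explicit override, then Docker Desktop WSL wildcard, then loopback.
function resolveBindAddress(env: Env, isDockerDesktopWsl: () => boolean): string {
  const override = env["NEMOCLAW_GATEWAY_BIND_ADDRESS"]?.trim();
  if (override) {
    // An explicit override always wins, including forcing loopback back on.
    // The detector is never called on this path.
    return override;
  }
  return isDockerDesktopWsl() ? WILDCARD : LOOPBACK;
}

// Clients never dial the wildcard address; they connect over loopback instead.
function connectHostFor(bindAddress: string): string {
  return bindAddress === WILDCARD ? LOOPBACK : bindAddress;
}
```

Passing the detector as a function keeps the override path from ever probing Docker, which is one plausible way to get the "explicit override still wins" behavior without side effects.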

The decision is computed in the gateway env layer so it flows consistently into the launch, the runtime drift identity/marker, the preflight port check, and the systemd env file (no drift-driven restart loop). Detection is memoized so onboard runs docker info at most once. To keep the detector out of source-level unit tests, wsl-docker-desktop-gpu now lazy-loads the Docker adapter only on an actual probe.
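
The memoization mentioned above might look roughly like this. It is a sketch under assumptions: `resolveStatusOnce`, `resetStatusCacheForTests`, and the `WslStatus` values are hypothetical stand-ins, and the real detector (which runs docker info) is replaced by an injected function.

```typescript
// Hypothetical sketch of a module-level memoized probe; names are illustrative.
type WslStatus = "docker-desktop" | "other";

// undefined means "not probed yet"; any other value is the cached result.
let cachedStatus: WslStatus | undefined;

// Runs the (potentially expensive) detector at most once per process.
function resolveStatusOnce(detect: () => WslStatus): WslStatus {
  if (cachedStatus === undefined) {
    cachedStatus = detect();
  }
  return cachedStatus;
}

// Test-only escape hatch so each test case can start from a cold cache.
function resetStatusCacheForTests(): void {
  cachedStatus = undefined;
}
```

Because the cache lives at module level, every consumer in the same onboard run (launch, port check, env file) sees the same answer, which is what avoids a drift-driven restart loop in this sketch.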

Testing

  • New unit tests in docker-driver-gateway-env.test.ts cover wildcard bind on Docker Desktop WSL, loopback on native Linux, explicit-override precedence in both directions, and the start-env / port-check / warning wiring.
  • tsc -p tsconfig.src.json clean; gateway/onboard suites pass (remaining failures are pre-existing Windows-only chmod/path tests, identical on main).

Fixes #5513.

Summary by CodeRabbit

Release Notes

  • New Features

    • Gateway bind address now adapts dynamically to the Docker Desktop WSL environment, while still honoring an explicit NEMOCLAW_GATEWAY_BIND_ADDRESS override.
    • Updated gateway startup and connectivity checks to use the resolved bind address consistently.
  • Tests

    • Added/expanded coverage for Docker Desktop WSL vs native Linux behavior, override handling (including wildcard cases), and interface targeting.
  • Refactor

    • Improved environment parsing flexibility and reduced redundant Docker Desktop detection work during onboarding.

Signed-off-by: Abhimanyu Kumar <abhimanyukumar7290@gmail.com>

@copy-pr-bot

copy-pr-bot Bot commented Jun 17, 2026


This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai

coderabbitai Bot commented Jun 17, 2026

Contributor


📝 Walkthrough

parseGatewayBindAddress gains an injectable env parameter. A new resolveGatewayBindAddress function in docker-driver-gateway-env.ts selects 0.0.0.0 on Docker Desktop WSL and 127.0.0.1 otherwise, using a memoized WSL probe with optional GatewayBindAddressDeps override. The Docker adapter import in wsl-docker-desktop-gpu.ts is made lazy. New tests cover all bind-address resolution branches.

Changes

Dynamic Gateway Bind Address for Docker Desktop WSL

| Layer / File(s) | Summary |
| --- | --- |
| parseGatewayBindAddress injectable env / src/lib/core/gateway-address.ts | Adds optional env: NodeJS.ProcessEnv parameter defaulting to process.env and switches the bind address lookup to env[envVar]. |
| Lazy Docker adapter load / src/lib/onboard/wsl-docker-desktop-gpu.ts | Replaces static dockerInfoFormat import with a defaultDockerInfoFormat() lazy wrapper using require() to avoid pulling the Docker adapter into the static import graph. |
| GatewayBindAddressDeps, WSL status cache, and imports / src/lib/onboard/docker-driver-gateway-env.ts | Adds GatewayBindAddressDeps type with optional detectStatus override, a module-level memoized WSL status cache, resolveWslDockerDesktopStatus, resetGatewayBindAddressDetectionCacheForTests, and updated imports/re-exports for gateway-address constants. |
| resolveGatewayBindAddress and dependent helpers / src/lib/onboard/docker-driver-gateway-env.ts | Implements three-priority bind resolution (explicit env → docker-desktop wildcard → default loopback), threads resolved address into getGatewayPortCheckOptions, getGatewayStartNetworkEnv, and warnIfGatewayWildcardBindAddress. |
| Docker Desktop WSL bind-address tests / src/lib/onboard/docker-driver-gateway-env.test.ts | Adds test imports and a full describe block covering wildcard/loopback resolution, env override, start network env propagation, port-check host selection, and wildcard log warning assertion. |
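
To illustrate how one resolved address could be threaded into the dependent helpers listed above, here is a small sketch. `deriveGatewayNetwork` and `GatewayNetwork` are hypothetical names invented for this example; only `OPENSHELL_BIND_ADDRESS` and the `{ host }` shape of the port-check options come from the PR text.

```typescript
// Hypothetical sketch; the real helpers are separate functions in the PR.
interface GatewayNetwork {
  startEnv: { OPENSHELL_BIND_ADDRESS: string }; // env passed to the gateway launch
  portCheck: { host: string };                  // interface the preflight check targets
  warnWildcard: boolean;                        // whether the wildcard warning should fire
}

// Derive every consumer's view from a single resolved address so they cannot disagree.
function deriveGatewayNetwork(bindAddress: string): GatewayNetwork {
  return {
    startEnv: { OPENSHELL_BIND_ADDRESS: bindAddress },
    portCheck: { host: bindAddress },
    warnWildcard: bindAddress === "0.0.0.0",
  };
}
```

Computing the address once and deriving the rest is one way to keep the launch environment, the port check, and the warning consistent; whether the PR structures it as a single function is not stated in the source.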

Sequence Diagram(s)

sequenceDiagram
    participant Onboard as nemoclaw onboard
    participant GatewayEnv as docker-driver-gateway-env
    participant WSLProbe as wsl-docker-desktop-gpu
    participant DockerAdapter as adapters/docker (lazy)

    Onboard->>GatewayEnv: resolveGatewayBindAddress(deps?)
    GatewayEnv->>GatewayEnv: check NEMOCLAW_GATEWAY_BIND_ADDRESS in env
    alt env override present
        GatewayEnv-->>Onboard: return env override address
    else no override
        GatewayEnv->>WSLProbe: resolveWslDockerDesktopStatus()
        WSLProbe->>DockerAdapter: require('../adapters/docker').dockerInfoFormat()
        DockerAdapter-->>WSLProbe: docker info output
        WSLProbe-->>GatewayEnv: "docker-desktop" | other
        alt WSL docker-desktop
            GatewayEnv-->>Onboard: return 0.0.0.0 (wildcard)
        else native Linux
            GatewayEnv-->>Onboard: return 127.0.0.1 (default)
        end
    end
    Onboard->>GatewayEnv: getGatewayStartNetworkEnv(deps?)
    GatewayEnv-->>Onboard: { OPENSHELL_BIND_ADDRESS: resolvedAddr, ... }
    Onboard->>GatewayEnv: getGatewayPortCheckOptions(deps?)
    GatewayEnv-->>Onboard: { host: resolvedAddr }

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related issues

Suggested labels

bug-fix, area: onboarding, platform: wsl, area: sandbox

Suggested reviewers

  • cv

Poem

🐇 Hopping through WSL's maze,
the gateway once hid in loopback's haze.
Now 0.0.0.0 opens wide,
Docker Desktop containers slide inside!
No more "Connection refused" dismay —
the wildcard bind saves the day. 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 22.22%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The PR title accurately and concisely summarizes the main change: fixing the gateway binding to 0.0.0.0 on Docker Desktop WSL to resolve reachability issues from sandbox containers. |
| Linked Issues check | ✅ Passed | The pull request fully addresses issue #5513 by implementing dynamic gateway bind address resolution with Docker Desktop WSL awareness, memoized detection, and comprehensive test coverage matching all stated coding requirements. |
| Out of Scope Changes check | ✅ Passed | All changes directly support the stated objective: dynamic bind address resolution for Docker Desktop WSL, lazy-loading of Docker adapter to keep tests clean, and comprehensive test coverage. No extraneous modifications detected. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |


)

On Docker Desktop + WSL2, onboard [2/8] starts the host-mode OpenShell
gateway bound to NemoClaw's default 127.0.0.1, but sandbox containers
reach it via Docker's host-gateway route, which Docker Desktop maps to
its own bridge IP rather than the WSL distro loopback. The [2/8]
sandbox-bridge reachability probe then fails with Connection refused,
100% of the time, blocking onboarding on every Docker Desktop WSL host.

Resolve the effective bind address with Docker Desktop WSL awareness:
with no explicit NEMOCLAW_GATEWAY_BIND_ADDRESS override, bind 0.0.0.0 on
Docker Desktop WSL so the host-gateway route can reach the gateway. This
mirrors the containerized compat path, which already binds 0.0.0.0 for
the same reason. An explicit override still wins, and the existing
wildcard-bind warning now fires for the auto-widened case too.

The decision is computed in the gateway env layer so it flows
consistently into the launch, the runtime drift identity/marker, the
preflight port check, and the systemd env file. To keep the detector out
of source-level unit tests, wsl-docker-desktop-gpu now lazy-loads the
Docker adapter only on an actual probe.

Signed-off-by: Abhimanyu Kumar <abhimanyukumar7290@gmail.com>
@abhi-0906

Contributor Author

Opened a small follow-up, #5536, for the secondary orphaned-gateway symptom in #5513: on a genuine sandbox-bridge probe failure, onboard now tears down the gateway it started (via the existing stopDockerDriverGatewayProcess()) instead of leaving it bound and orphaned.

The two are complementary: this PR (#5534) keeps the probe from failing on Docker Desktop WSL in the first place; #5536 handles cleanup for any genuine probe failure on any host. They touch different files (docker-driver-gateway-env.ts / wsl-docker-desktop-gpu.ts here vs gateway-sandbox-reachability.ts / onboard.ts there) and don't conflict.

abhi-0906 added a commit to abhi-0906/NemoClaw that referenced this pull request Jun 17, 2026
…VIDIA#5513)

When onboard's [2/8] sandbox-bridge reachability probe fails, NemoClaw
aborts via process.exit(1) without stopping the OpenShell gateway it
started (or reused/adopted) earlier in the same run. The gateway is left
running, bound to the loopback address, so the accompanying "restart
Docker and re-run" hint is misleading: the stale listener survives a
Docker restart and collides with the next attempt.

Add an onUnreachable hook to verifySandboxBridgeGatewayReachableOrExit
that fires only on a genuine unreachable result (not the soft
probe_unavailable skip or a successful probe), and wire it at the three
host-mode gateway paths in startDockerDriverGateway (fresh start, reuse,
adopt) to tear the gateway down via the existing
stopDockerDriverGatewayProcess(). That helper reads the pid file written
in every path and only terminates a verified gateway process, so it is a
safe no-op otherwise.

Follow-up to the bind-address fix in NVIDIA#5534: that change keeps the probe
from failing on Docker Desktop WSL in the first place, while this ensures
any genuine probe failure no longer orphans the gateway.

Signed-off-by: Abhimanyu Kumar <abhimanyukumar7290@gmail.com>
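
The hook behavior described in the commit message above can be sketched like this. It is an assumption-laden illustration: `handleProbeResult` and the `ProbeResult` union are hypothetical, and the sketch returns a boolean where the real `verifySandboxBridgeGatewayReachableOrExit` is described as exiting the process.

```typescript
// Hypothetical sketch; the result names mirror the commit message, the rest is illustrative.
type ProbeResult = "reachable" | "unreachable" | "probe_unavailable";

// Returns true when onboarding may continue. The cleanup hook fires only on a
// genuine "unreachable" result, never on a soft skip or a successful probe.
function handleProbeResult(result: ProbeResult, onUnreachable: () => void): boolean {
  if (result === "unreachable") {
    onUnreachable(); // e.g. stop the gateway this run started, reused, or adopted
    return false;
  }
  return true;
}
```

Keeping the teardown behind a callback lets each of the three gateway paths (fresh start, reuse, adopt) supply the same cleanup without the probe code knowing how the gateway was obtained.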
@cv

cv commented Jun 17, 2026

Collaborator

@abhi-0906 can you add a DCO "Signed-off-by: ..." line to the PR description, please?

@abhi-0906

Contributor Author

Thanks @cv — added the Signed-off-by line to the PR description.



Development

Successfully merging this pull request may close these issues.

[WSL2][Policy&Network] OpenShell gateway unreachable from sandbox containers on Docker Desktop WSL (binds to 127.0.0.1:8080)

2 participants