Skip to content

test/cli/logs.test.ts "continues watching old file if no new file appears" is flaky on windows-latest #49

@Rinse12

Description

@Rinse12

Symptom

test/cli/logs.test.ts > bitsocial logs -f log file rotation (synthetic) > continues watching old file if no new file appears (test/cli/logs.test.ts:426) intermittently fails only on windows-latest. ubuntu and macos consistently pass.

When it fails, the assertion error is:

AssertionError: expected '[2026-05-01T00:00:00.000Z] initial li…' to contain 'APPENDED_LINE'

- Expected
+ Received

- APPENDED_LINE
+ [2026-05-01T00:00:00.000Z] initial line
+
+

i.e. bitsocial logs -f printed the seed line correctly but never picked up the line the test appended to the file. The test waits 8 seconds (visible from 8047ms/8041ms durations across runs) before timing out the assertion.

Observed history

Both #43 (fix/logs-follow-windows-stat-stale) and #41 (fix/logs-follow-windows-polling) targeted this code path. The test is much less flaky than it used to be — but clearly still not deterministic.

What this PR is asking for

Not "fix the bug now" — we don't have enough signal from CI to know whether this is:

  • a real watcher race that still exists on Windows under load,
  • a filesystem-timestamp issue (mtime resolution differs on Windows),
  • a buffering/flush issue in the child bitsocial logs -f process,
  • a vitest fork-pool scheduling issue specific to the Windows runner,
  • or something else.

Right now the failure reproduces ~1 time in N on a hosted runner, never (or very rarely) on a developer's own Windows machine. Without more data we'd be guessing.

Concrete asks, in priority order:

  1. Have the test dump more diagnostics on failure. When the assertion fails, log:
    • the full contents of the watched log file (so we can see whether the appended line landed on disk at all);
    • the directory listing (so we can see whether file rotation kicked in unexpectedly);
    • the captured stdout+stderr of the spawned bitsocial logs -f child (currently it's discarded);
    • the wall-clock timeline of: write seed linespawn childexpect "initial line"append APPENDED_LINEexpect "APPENDED_LINE" (with timestamps), so we can tell whether the watcher saw the append at all or whether it saw it but the assertion polled too soon.
    • the file's stat result (mtime/atime/size in ms) before and after the append.
  2. Tee the spawned child's stderr/stdout to a file and upload as a CI artifact on failure — so the next Windows flake gives us its complete logs -f trace instead of just the test's expectation diff.
  3. Run the test in a loop (e.g. repeats: 20) on Windows specifically in a separate scheduled CI job, not the main PR check. A "5/20 fails" signal is much more actionable than "1/many merges fails."

Once #1 and #2 land, we can let the natural flake rate produce one good failure log and root-cause it. Until then every retry just re-rolls the dice.

Related

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions