logging: write container output synchronously so final output is not truncated #5009

Merged
ChengyuZhu6 merged 1 commit into containerd:main from AkihiroSuda:fix-5006
Jun 27, 2026

@AkihiroSuda AkihiroSuda commented Jun 25, 2026

Member

Logging tests such as TestLogsWithoutNewlineOrEOF and
TestLogsAfterRestartingContainer flake under gomodjail: a container's final
output, in particular a line emitted right before exit with no trailing
newline, is sometimes dropped from the log. Running

    nerdctl run --name c alpine printf "'Hello World!\nThere is no newline'"
    nerdctl logs -f c

would intermittently print only "'Hello World!" instead of the full output.

Two independent problems, both reproduced locally under gomodjail:

  1. getContainerWait hung for short-lived containers. The logging process is
    started by containerd while it sets up the container's IO, before the task
    is created, so the first con.Task() returns NotFound and the code retried
    forever "waiting for the task to start". For a fast container the task can
    instead have already exited and been removed before the logger ever sees
    it, so it never appeared and the logger blocked forever holding the logger
    lock. getContainerWait now concludes that the container has exited once the
    task is missing and the container has been observed producing output.

  2. The container's final chunk was lost to teardown. On exit containerd closes
    the stdio FIFOs and tears the logging process down almost immediately. The
    old path read the FIFO, copied it through an io.Pipe and a bufio splitter,
    and handed each line to the driver over a buffered channel; a trailing
    chunk with no newline was held in the splitter until EOF and then raced the
    teardown across several goroutines, so it was frequently lost. The logger
    now reads each FIFO directly and, for drivers that can write synchronously
    (json-file, via the new SyncDriver interface), writes each entry inline from
    the reading goroutine and flushes a trailing no-newline fragment as soon as
    it is read. Streaming drivers keep using the buffered channel so a slow
    driver cannot block the container.

The viewer also does a final read of the JSON log file when it receives the
stop signal, so entries flushed just before exit are not missed.

Verified locally with the gomodjail-packed binary: 250+ iterations of the
failing printf case, the restart (doubled-output) case, multi-line output and
follow-on-running-container all pass with no truncation.

Fixes #5006

Assisted-by: Claude Opus 4.8 <noreply@anthropic.com>

@AkihiroSuda AkihiroSuda added this to the v2.3.4 milestone Jun 25, 2026
@AkihiroSuda AkihiroSuda added the area/ci e.g., CI failure label Jun 25, 2026
@AkihiroSuda AkihiroSuda marked this pull request as draft June 25, 2026 15:03
@AkihiroSuda AkihiroSuda force-pushed the fix-5006 branch 4 times, most recently from 0e038aa to b6c713a Compare June 26, 2026 18:53
Assisted-by: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
@AkihiroSuda AkihiroSuda changed the title from "logging: drain remaining logs on stop signal during follow" to "logging: write container output synchronously so final output is not truncated" Jun 26, 2026
@AkihiroSuda AkihiroSuda marked this pull request as ready for review June 26, 2026 19:40
@AkihiroSuda AkihiroSuda requested a review from ChengyuZhu6 June 27, 2026 04:58

@ChengyuZhu6 ChengyuZhu6 left a comment

Member

LGTM

@ChengyuZhu6 ChengyuZhu6 merged commit 38adfda into containerd:main Jun 27, 2026
50 of 52 checks passed
Labels

area/ci (e.g., CI failure), area/logging

Development

Successfully merging this pull request may close these issues.

gomodjail CI failing (TestLogs*)

2 participants