Describe the bug
LimitedRetryTransport.RoundTrip in storage_drivers/common.go retries on io.EOF by calling its inner RoundTripper again with the same *http.Request, but never invokes req.GetBody() before the retry. In the observed failure path, the first attempt drains req.Body, so the retry sends an empty body against the original Content-Length, which transferWriter.writeBody rejects with the synthetic Go error:
http: ContentLength=N with Body length 0
The trigger is the ONTAP management HTTPS endpoint (Server: libzapid-httpd, serving /servlets/netapp.servlets.admin.XMLrequest_filer), which closes idle keep-alive connections at exactly 5 seconds of inactivity. When Trident's HTTP client picks a stale-but-still-pooled connection right at that boundary, the first ZAPI POST attempt fails with io.EOF; the wrapper retries, and in the observed traces that retry is where the ContentLength=N with Body length 0 error appears.
For ZAPI POSTs that are part of a verify-after-create sequence (e.g. qtree-list-iter after a successful qtree-create), this turns a recoverable connection-reuse race into a permanent failure. ONTAP has usually already executed the original request before the keep-alive close, so the resource exists on the array, but Trident never gets the confirmation back and logs:
level=error msg="API invocation failed.
Post .../servlets/netapp.servlets.admin.XMLrequest_filer:
http: ContentLength=628 with Body length 0"
level=error msg="Error checking for existing qtree."
level=warning msg="Backend update resulted in an orphaned volume."
The volume is left in an orphaned state (the TridentVolume CR exists, the qtree exists on ONTAP, but the controller can no longer reconcile them). Bootstrap-time calls (aggr-list-info, system-get-version) hit the same path, which can put the controller into a restart loop.
Environment
- Trident version: v25.10.0
- Trident installation flags used:
--csi --crd_persistence --https_rest --enable-concurrency=false
- Container runtime: containerd
- Kubernetes version: 1.31.x
- Kubernetes orchestrator: OpenShift
- Kubernetes enabled feature gates: stock (no custom gates)
- OS: RHEL CoreOS / RHEL 9.x worker nodes
- NetApp backend types: ONTAP 9.15.1P7, driver ontap-nas-economy (qtree-based; this driver makes the bug visible because every create is followed by a verify ZAPI, but any ZAPI POST is at risk)
- Other:
  - ONTAP management HTTPS endpoint sends Server: libzapid-httpd and closes idle keep-alive connections at 5.000 s (measured to the millisecond — see Additional context).
  - Symptoms began immediately after upgrading the controller image to v25.10.0; we have packet captures from both ends taken in this state.
To Reproduce
Steps to reproduce the behavior:
- Install Trident v25.10.0 against an ONTAP cluster, configured with the ontap-nas-economy driver.
- Drive provisioning load — e.g. create a few hundred PVCs against a StorageClass backed by ontap-nas-economy. Concurrent provisioning increases the chance that the keep-alive race fires.
- Watch the controller log:
  oc logs deploy/trident-controller -c trident-main -n trident -f \
    | grep -E 'ContentLength=.* with Body length 0|orphaned volume|Error checking for existing qtree'
- Within minutes, repeated http: ContentLength=N with Body length 0 errors appear; some PVCs end up with TridentVolume CRs in an inconsistent state, while the underlying qtrees do exist on ONTAP (volume qtree show -vserver <svm>).
The two parts of the bug can also be reproduced independently of any provisioning load:
(a) Independent confirmation of ONTAP's 5-second keep-alive close (no Trident involved):
export HOST=<ontap-mgmt-lif>
export USER=<readonly_user>
read -rs -p "Password: " PASS; echo
export PASS
python3 -c '
import os, ssl, socket, time, base64

h = os.environ["HOST"]
# Built outside an f-string expression: backslash escapes inside {...}
# are a syntax error on Python < 3.12.
auth = base64.b64encode(
    (os.environ["USER"] + ":" + os.environ["PASS"]).encode()
).decode()

def t(idle):
    s = ssl._create_unverified_context().wrap_socket(
        socket.create_connection((h, 443), 10), server_hostname=h)
    s.sendall(
        f"GET /api/cluster HTTP/1.1\r\nHost: {h}\r\n"
        f"Authorization: Basic {auth}\r\n\r\n".encode())
    s.settimeout(5)
    d = b""
    while b"\r\n\r\n" not in d:  # read until end of response headers
        d += s.recv(8192)
    time.sleep(idle)  # hold the connection idle across the suspected boundary
    try:
        s.sendall(
            f"GET /api/cluster HTTP/1.1\r\nHost: {h}\r\n"
            f"Authorization: Basic {auth}\r\nConnection: close\r\n\r\n".encode())
        d = s.recv(8192)
        first = d.split(b"\n")[0].decode().strip() if d else "PEER CLOSED"
        print(f"  idle={idle:>4.2f}s -> {first}")
    except Exception as e:
        print(f"  idle={idle:>4.2f}s -> {type(e).__name__}: {e}")
    s.close()

for i in [4.95, 5.00, 6.0]:
    t(i)
'
Output against an ONTAP cluster:
idle=4.95s -> HTTP/1.1 200 OK
idle=5.00s -> PEER CLOSED
idle=6.00s -> PEER CLOSED
(b) Standalone Go reproducer of the exact ContentLength=N with Body length 0 error string:
// Wraps http.DefaultTransport with the same retry pattern as
// LimitedRetryTransport (calls base.RoundTrip(req) twice with the
// SAME *http.Request, never resets req.Body via req.GetBody()).
// With req.GetBody removed, the second call produces the exact
// log string Trident emits.
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

type wrap struct{ base http.RoundTripper }

func (w wrap) RoundTrip(r *http.Request) (*http.Response, error) {
	a, _ := w.base.RoundTrip(r) // attempt 1: drains r.Body
	if a != nil {
		io.Copy(io.Discard, a.Body)
		a.Body.Close()
	}
	return w.base.RoundTrip(r) // attempt 2: same *Request, no GetBody rewind
}

func main() {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		b, _ := io.ReadAll(r.Body)
		fmt.Fprintf(w, "saw=%d\n", len(b))
	}))
	defer srv.Close()

	payload := strings.Repeat("X", 628)
	req, _ := http.NewRequest("POST", srv.URL, bytes.NewBuffer([]byte(payload)))
	req.GetBody = nil // simulate the missing rewind
	_, err := (&http.Client{Transport: wrap{base: http.DefaultTransport}}).Do(req)
	fmt.Println("err:", err) // -> http: ContentLength=628 with Body length 0
}
Expected behavior
Either:
(a) LimitedRetryTransport.RoundTrip calls req.GetBody() before each retry attempt to refresh req.Body, so the retry can actually re-send the request payload. This matches the contract http.NewRequest sets up when the body is *bytes.Buffer / *bytes.Reader / *strings.Reader (it auto-populates req.GetBody for exactly this purpose); or
(b) The wrapper does not retry on io.EOF for non-replayable methods (POST without Idempotency-Key), and lets the higher-level controller reconcile loop re-issue the request as a brand-new *http.Request with a fresh body.
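For concreteness, a minimal standalone sketch that combines (a) and (b); the retryGuard name and the single-retry shape are illustrative, not Trident's actual code, which would keep its existing backoff loop:

package main

import (
	"errors"
	"io"
	"net/http"
)

// retryGuard retries an io.EOF failure once, but only when the request
// payload can be replayed; non-replayable requests surface the error so
// the caller can rebuild a brand-new *http.Request.
type retryGuard struct{ base http.RoundTripper }

func (t retryGuard) RoundTrip(req *http.Request) (*http.Response, error) {
	resp, err := t.base.RoundTrip(req)
	if err == nil || !errors.Is(err, io.EOF) {
		return resp, err
	}
	if req.Body != nil {
		if req.GetBody == nil {
			return nil, err // option (b): non-replayable, do not retry
		}
		fresh, gerr := req.GetBody()
		if gerr != nil {
			return nil, gerr
		}
		req.Body = fresh // option (a): rewind so the retry re-sends the payload
	}
	return t.base.RoundTrip(req)
}

// Usage: &http.Client{Transport: retryGuard{base: http.DefaultTransport}}
func main() {}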
In addition, defense-in-depth: the *http.Transport literal in NewZapiRunner (storage_drivers/ontap/api/azgo/common.go) should set an IdleConnTimeout shorter than ONTAP's 5 s (e.g. 3 s), so stale keep-alive entries are evicted client-side before ONTAP can race them.
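A sketch of that transport change, with the caveat that the tls.Config shown here is a placeholder; the real literal in NewZapiRunner carries Trident's actual TLS settings:

package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
	"time"
)

func main() {
	transport := &http.Transport{
		// Placeholder TLS config; NewZapiRunner supplies its own.
		TLSClientConfig: &tls.Config{MinVersion: tls.VersionTLS12},
		// Evict idle connections client-side before ONTAP's 5 s close.
		IdleConnTimeout: 3 * time.Second,
		// Keep the pool of potentially-stale candidates small.
		MaxIdleConnsPerHost: 4,
	}
	client := &http.Client{Transport: transport}
	_ = client
	fmt.Println("idle timeout:", transport.IdleConnTimeout)
}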
Additional context
Where the bug is in v25.10.0
storage_drivers/common.go, LimitedRetryTransport.RoundTrip — the closure passed to backoff.Retry calls the inner RoundTripper with the same *http.Request on every attempt and never resets req.Body:
func (lrt *LimitedRetryTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	var resp *http.Response
	f := func() error {
		// ... semaphore + metrics ...
		r, err := lrt.base.RoundTrip(req) // attempt N -- same req each time
		resp = r
		if err == nil {
			return nil
		}
		if !errors.Is(err, io.EOF) {
			return backoff.Permanent(err)
		}
		return err // retry on EOF; req.Body is now drained
	}
	lrt.b.Reset()
	return resp, backoff.Retry(f, lrt.b)
}
grep -n GetBody storage_drivers/common.go returns 0 hits in v25.10.0, and req.Body is never reassigned anywhere in that file.
The body construction in storage_drivers/ontap/api/azgo/common.go itself is correct — bytes.NewBuffer(b) causes http.NewRequestWithContext to auto-populate req.GetBody. So the rewind capability exists; the wrapper simply never uses it.
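This is easy to verify in isolation. A minimal sketch (URL and payload are placeholders) showing that the standard library captures a replayable snapshot for *bytes.Buffer bodies:

package main

import (
	"bytes"
	"context"
	"fmt"
	"io"
	"net/http"
)

func main() {
	body := bytes.NewBuffer([]byte("<netapp>...</netapp>"))
	req, _ := http.NewRequestWithContext(
		context.Background(), http.MethodPost, "https://example.invalid/", body)
	fmt.Println("GetBody populated:", req.GetBody != nil) // true

	// Drain req.Body, as a first RoundTrip attempt would.
	io.Copy(io.Discard, req.Body)

	// GetBody still yields the full payload for a retry.
	fresh, _ := req.GetBody()
	n, _ := io.Copy(io.Discard, fresh)
	fmt.Println("replayable bytes:", n) // 20
}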
The *http.Transport literal sets neither IdleConnTimeout nor DisableKeepAlives, so Trident's idle pool will hold a connection indefinitely from the client's point of view, until either ONTAP sends a FIN or the next RoundTrip attempt notices it.
Why Go's standard library doesn't save us here
Go's *http.Transport.roundTrip does rewind via GetBody, but only inside its internal retry loop (shouldRetryRequest → rewindBody), and for POSTs without an Idempotency-Key header that path fires only on nothingWrittenError. When ONTAP closes mid-response (or the kernel hands Go the FIN after some bytes have already been written), the transport surfaces io.EOF without retrying internally; LimitedRetryTransport then retries with the same request object, and that second attempt runs against a drained body, producing ContentLength=N with Body length 0.
ONTAP-side measurement of the 5.000 s keep-alive close
From a packet capture taken at the cluster — idle gap from last >100 B data segment to first server-initiated FIN, multiple connections:
port 39924 idle_gap = 5.001 s
port 42823 idle_gap = 5.000 s
port 54615 idle_gap = 5.001 s
port 45085 idle_gap = 5.001 s
port 33943 idle_gap = 5.001 s
port 57890 idle_gap = 5.001 s
port 34330 idle_gap = 5.001 s
Cross-confirmed by a Trident-side packet capture of the same connections (~3 ms clock skew, identical SEQ/ACK numbers), and by the from-scratch HTTPS reproducer in To Reproduce: 4.95 s idle works, 5.00 s idle gets PEER CLOSED.
"Stale-dial" signature on the Trident side
TCP-level fingerprint of *http.Transport picking a stale entry from its idle pool, failing fast, and dialing again — server FIN immediately followed by a fresh client SYN — appears repeatedly in the Trident-side capture:
srv_FIN port 52948 -> new SYN port 35188 delta=0.017 s
srv_FIN port 35188 -> new SYN port 53197 delta=0.015 s
srv_FIN port 53197 -> new SYN port 51111 delta=0.014 s
srv_FIN port 58024 -> new SYN port 51220 delta=0.001 s
srv_FIN port 51220 -> new SYN port 33153 delta=0.000 s
... (~15 occurrences in a few minutes)
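One way to pull this signature out of a capture, as a sketch only (trident-side.pcap is a placeholder filename; the FIN->SYN deltas come from pairing the two timestamp columns):

# Server-initiated FINs from the ONTAP side (port 443)...
tshark -r trident-side.pcap \
    -Y 'tcp.srcport == 443 && tcp.flags.fin == 1' \
    -T fields -e frame.time_relative -e tcp.dstport

# ...and fresh client SYNs; sub-20 ms FIN->SYN gaps are the stale dials.
tshark -r trident-side.pcap \
    -Y 'tcp.dstport == 443 && tcp.flags.syn == 1 && tcp.flags.ack == 0' \
    -T fields -e frame.time_relative -e tcp.srcport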
Suggested fix (two small hunks)
--- a/storage_drivers/common.go
+++ b/storage_drivers/common.go
@@ func (lrt *LimitedRetryTransport) RoundTrip(req *http.Request) (*http.Response, error) {
     var resp *http.Response
     f := func() error {
+        // Rewind the body so retries don't send an empty payload
+        // against the original Content-Length.
+        if req.GetBody != nil {
+            body, err := req.GetBody()
+            if err != nil {
+                return backoff.Permanent(err)
+            }
+            req.Body = body
+        }
         waitStart := time.Now()

--- a/storage_drivers/ontap/api/azgo/common.go
+++ b/storage_drivers/ontap/api/azgo/common.go
@@ in NewZapiRunner ...
     transport = &http.Transport{
         TLSClientConfig: &tls.Config{ ... },
+        IdleConnTimeout:     3 * time.Second, // < ONTAP libzapid-httpd 5 s
+        MaxIdleConnsPerHost: 4,
     }