Skip to content

feat: exponential backoff on HTTP 429 (re-impl of #90)#108

Open
jliounis wants to merge 1 commit into
mainfrom
jliounis/retry-and-strict-validation
Open

feat: exponential backoff on HTTP 429 (re-impl of #90)#108
jliounis wants to merge 1 commit into
mainfrom
jliounis/retry-and-strict-validation

Conversation

@jliounis
Copy link
Copy Markdown
Collaborator

Summary

Re-implementation of the rate-limit retry portion of the previously-closed #90. Credit: @akshayabogoju-coder for the original proposal.

Why

Today, makeApiRequest fails immediately on HTTP 429. That makes the server flaky under bursts and propagates a hard failure to the MCP client for what is almost always a transient condition.

Change

makeApiRequest now wraps the underlying HTTP call (extracted as singleApiAttempt) in a retry loop:

  • Retries only on HTTP 429. Other 4xx/5xx fail fast — we can't assume idempotency without operator-controlled idempotency keys, and Perplexity does not signal retry-safe 5xxs distinctly.
  • Schedule: [2s, 4s, 8s] by default (3 retries, 4 attempts total).
  • Respects Retry-After when present. The actual wait is max(configured-delay, retry-after-seconds * 1000) so the server never retries faster than the API asks.
  • Overridable via PERPLEXITY_RETRY_DELAYS_MS (comma-separated milliseconds) for tests and ops tuning.

Scope notes

This PR is intentionally narrower than the original #90:

  • Does not touch HTTP transport / 0.0.0.0 binding / origin controls (separate security review).
  • Does not add a README "Professional Contributions" section (out of scope).
  • Does not remove args: any casts in tool handlers — the MCP SDK already enforces the Zod inputSchema before the handler runs, so the cast is cosmetic, not a safety hole. Worth its own cleanup PR if/when the SDK's handler generics stabilize.
  • The CWE-200 error-message sanitization that feat: implement production-grade resilience and strict zod validation #90 also touched ships separately as #107 to keep review scopes clean.

Tests

  • npm test: 82 passed / 82 (was 78 on main).
  • 4 new tests in a "HTTP 429 retry behavior" suite:
    • success after N transient 429s (3 attempts, last succeeds)
    • gives up after the configured retry count (4 total attempts)
    • does not retry non-429 errors (5xx fails fast, 1 attempt)
    • Retry-After header is honored
  • Tests use PERPLEXITY_RETRY_DELAYS_MS=0,0,0 to avoid real waits.
  • npm run build: clean.

Backward compatibility

  • Successful first attempts behave identically (no extra latency).
  • Non-429 error messages are unchanged.
  • The new env var (PERPLEXITY_RETRY_DELAYS_MS) is opt-in; defaults to the 2s/4s/8s schedule that matches the original proposal.

Re-implementation of the rate-limit retry portion of #90.
Credit: @akshayabogoju-coder for the original proposal.

Why
---
Today, makeApiRequest fails immediately on HTTP 429. That makes the
server flaky under bursts and propagates a hard failure to the MCP
client for what is almost always a transient condition.

Change
------
makeApiRequest now wraps the underlying HTTP call (extracted as
singleApiAttempt) in a retry loop that:

  - Retries only on HTTP 429 (rate limit). Other 4xx/5xx fail fast,
    because we can't assume idempotency without operator-controlled
    idempotency keys.
  - Uses the schedule [2s, 4s, 8s] by default (3 retries, 4 attempts
    total).
  - Respects the upstream Retry-After header when present. The actual
    wait is max(configured-delay, retry-after-seconds * 1000) so the
    server never retries faster than the API asks.
  - Schedule is overridable via PERPLEXITY_RETRY_DELAYS_MS (comma-
    separated milliseconds) for tests and ops tuning.

Scope notes
-----------
This PR is intentionally narrower than the original #90:

  - It does NOT touch HTTP transport / 0.0.0.0 binding / origin
    controls (those belong in a separate security review).
  - It does NOT add a README "Professional Contributions" section
    (out of scope).
  - It does NOT remove `args: any` casts in tool handlers \u2014 the MCP
    SDK already enforces the Zod inputSchema before the handler runs,
    so the cast is cosmetic, not a safety hole. Worth its own cleanup
    PR if/when the SDK's handler generics stabilize.
  - The CWE-200 sanitization that #90 also touched is being shipped
    separately in #107 to keep review scopes clean.

Tests
-----
- npm test: 82 passed / 82 (was 78 on main).
- 4 new tests in a "HTTP 429 retry behavior" suite:
    * success after N transient 429s
    * gives up after the configured retry count
    * does NOT retry non-429 errors (5xx fails fast)
    * Retry-After header is honored
- Tests run with PERPLEXITY_RETRY_DELAYS_MS=0,0,0 to avoid real waits.
- npm run build: clean.

Co-authored-by: @akshayabogoju-coder
@akshayabogoju-coder
Copy link
Copy Markdown

Thanks for refining the scope and driving the implementation, James—appreciate the credit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants