Skip to content

CLI-801 Parse Web Push params by name and support aes128gcm#13

Open
cassiostp wants to merge 2 commits into
masterfrom
cli-801-parse-webpush-params
Open

CLI-801 Parse Web Push params by name and support aes128gcm#13
cassiostp wants to merge 2 commits into
masterfrom
cli-801-parse-webpush-params

Conversation

@cassiostp

@cassiostp cassiostp commented Jun 12, 2026

Copy link
Copy Markdown

Since June 4, Google's FCM delivery layer sends a fraction of pushes (~0.8% of messages, ~36% of Superhuman desktop's push-active devices daily) with a crypto-key value of the form dh=<key>;p256ecdsa=<key>. decrypt slices fixed offsets out of crypto-key/encryption, so these envelopes feed garbage into ECDH and die with the unhandled "Public key is not valid for specified curve" — the push is lost, and because it is never acked, MCS redelivers it on every reconnect. The wire format was confirmed by instrumentation in the Superhuman desktop shell (staging capture, crypto_key_params: dh;p256ecdsa correlated with the curve error on the same message). This extracts dh/salt by name, accepting ;/, separators in any order.

Two deliberate details worth reviewing closely. First, the new missing-parameter errors name the parameters present but never echo values (which may be key material), and they intentionally embed the existing crypto-key is missing / salt is missing substrings — Client._onDataMessage's catch matches on those, so malformed envelopes now drop-and-ack instead of crash-looping, with zero changes to client.js. Second, envelopes whose content-encoding appData says aes128gcm (RFC 8291 — keys travel in the payload's binary header, crypto-key disappears) are decrypted via http_ece's aes128gcm support; the captured envelopes already carry a content-encoding field, which suggests Google is staging that migration.

test/decrypt.test.js is the repo's first hermetic suite (real http_ece round-trips, no FCM credentials — the existing notification.test.js needs live keys): old format, both new formats, aes128gcm, and the error contracts. The first commit fixes two pre-existing semicolon lint errors in src/gcm so yarn lint is green again. Version is bumped to 2.1.6 per repo convention; after merge this needs a manual npm publish by an @superhuman npm maintainer, then the desktop repo picks it up via a yarn resolution override (CLI-801 has the full investigation).

Pre-existing lint failures that kept 'yarn lint' red on master.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@linear-code

linear-code Bot commented Jun 12, 2026

Copy link
Copy Markdown

CLI-801

Since 2026-06-04 a fraction of FCM deliveries carry crypto-key values
with multiple parameters (dh=<key>;p256ecdsa=<key>). Slicing fixed
offsets feeds garbage into ECDH, so decryption throws "Public key is
not valid for specified curve" and the push is lost and redelivered on
every reconnect. Extract dh/salt by name instead, accepting ';' and ','
separators in any order.

Missing-parameter errors now name the parameters present (never their
values, which may be key material) and embed the 'crypto-key is
missing'/'salt is missing' substrings so Client drops and acks such
messages instead of throwing unhandled.

Also decrypt aes128gcm (RFC 8291) envelopes, where keys travel in the
payload's binary header — observed FCM envelopes already carry a
content-encoding field, suggesting that migration is being staged.

Bump version to 2.1.6. CLI-801

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@cassiostp cassiostp force-pushed the cli-801-parse-webpush-params branch from 83abea2 to 0007c21 Compare June 12, 2026 20:31
@zindlerb

zindlerb commented Jun 16, 2026

Copy link
Copy Markdown

Since June 4, Google's FCM delivery layer sends a fraction of pushes (~0.8% of messages, ~36% of Superhuman desktop's push-active devices daily) with a crypto-key value of the form dh=;p256ecdsa=.

Could you link me to logs where we see this? Or if there are no logs how do we know this?

EDIT:
Ah I see you are likely getting this from the error volume. The reason you see it start on june 4 is we started logging native errors to log.bugsnag then. It doesn't mean the issue started june 4.

@cassiostp

cassiostp commented Jun 16, 2026

Copy link
Copy Markdown
Author

@zindlerb By the original ticket description this happened before but stopped at some point, but bugsnag confirms it started happening again on june 4

I added some extra logging to staging, as we have a very similar pattern there and got these results

EDIT:
Honeycomb logs by distinct device

@zindlerb

Copy link
Copy Markdown

@cassiostp Makes sense thanks

@zindlerb zindlerb left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems reasonable to me. Thanks for the investigation here!
Before we merge this could you make a change to the _onDataMessage error handling?

Since we rarely touch this code I'm worried that we silence some of the errors and might not be able to see if theres a new issue introduced. I alsop Could we update it so we raise error for:

case error.message.includes(
          'Unsupported state or unable to authenticate data'
        ):
        case error.message.includes('crypto-key is missing'):
        case error.message.includes('salt is missing'):

these? Would still do this._persistentIds.push before raising.

Want to do this in a diff pr so we have baseline errors before this change. It also would be good to get visibility into those errors in general.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants