Add -sNODERAWSOCKETS backend for real TCP & UDP on Node.js #27080
guybedford wants to merge 6 commits into
sbc100 left a comment
Nice! I've not yet reviewed the meat of libsockfs.js but looks good so far.
// If 1, the POSIX sockets API is backed by Node.js's ``node:net`` module,
// giving real non-blocking outgoing TCP sockets with no WebSockets, proxy
// process or pthreads. This only works under node and is ignored elsewhere.
You should mention something about how this is similar to what NODERAWFS does for filesystem access?
Speaking of which, should we perhaps combine them? Or at least have NODERAWFS automatically enable NODENET. It's hard to imagine wanting one without the other, maybe?
Also, regarding the name of this setting, should we use the word SOCK rather than NET, since that seems to be what we have done previously? I guess -sNODERAWSOCKETS is kind of a mouthful.
Maybe putting it behind the existing NODERAWFS avoids having to name something new? :)
Renamed to -sNODERAWSOCKETS and kept orthogonal to the FS. I think there are use cases for keeping the virtual FS independently of the sockets backend? Happy to reconsider merging if you prefer, but I also quite like the two separate options.
if data:
    self.request.sendall(data)

server = socketserver.TCPServer(('127.0.0.1', 0), EchoHandler)
In test_sockets.py we preallocate specific ports, but I guess this may be even better than doing that?
I've now extended this PR with server and UDP tests, which bind on the Wasm side and report the assigned port back, still without preallocation.
sbc100 left a comment
I still need to review the details of libsockfs_node.js but the general shape here LGTM!
Adds a new NODERAWSOCKETS setting that backs the POSIX sockets API directly with Node.js's node:net module, giving real, non-blocking outgoing (client) TCP sockets without WebSockets, an external proxy process, or pthreads. This is the sockets counterpart to NODERAWFS: where NODERAWFS gives direct access to the host filesystem, this gives direct access to host sockets.

Unlike PROXY_POSIX_SOCKETS this is single-threaded and event-driven: socket readiness is delivered through the same emscripten_set_socket_*_callback hooks the default WebSocket backend uses, so it drops into existing readiness reactors unchanged.

This initial backend supports outgoing TCP only: connect, send, recv and close, plus get/setsockopt (SO_ERROR, TCP_NODELAY, SO_KEEPALIVE and the TCP keep-alive tunables). There is no bind/listen/accept (server) support and no UDP yet; those land in follow-ups.

- new node backend in src/lib/libsockfs_node.js, pulled in only under -sNODERAWSOCKETS, implementing the sock_ops contract over net.createConnection
- __syscall_setsockopt now lives in JS (routing to the backend under NODERAWSOCKETS, else reporting the option as unknown), avoiding a libstubs variation
- test/sockets/test_tcp_echo.c: a plain POSIX outgoing connect/send/recv echo client that also builds and runs natively, run under node against a loopback echo server started by the test harness
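For orientation, here is a minimal Python sketch of the round trip that test_tcp_echo.c is described as exercising: connect, set TCP_NODELAY and SO_KEEPALIVE, send, recv, check SO_ERROR, close. It is not the PR's harness; `EchoHandler` mirrors the name in the diff above, while `echo_round_trip` and the in-process server thread are assumptions made for illustration.

```python
import socket
import socketserver
import threading


class EchoHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # Echo bytes back until the peer closes its side.
        while True:
            data = self.request.recv(4096)
            if not data:
                break
            self.request.sendall(data)


def echo_round_trip(payload: bytes) -> bytes:
    """Hypothetical helper: send payload to a loopback echo server, return the echo."""
    # Port 0 asks the OS for a free ephemeral port, so nothing is preallocated.
    with socketserver.TCPServer(("127.0.0.1", 0), EchoHandler) as server:
        port = server.server_address[1]
        threading.Thread(target=server.serve_forever, daemon=True).start()
        try:
            with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
                # Two of the socket options the commit message lists.
                client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
                client.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
                client.sendall(payload)
                received = b""
                while len(received) < len(payload):
                    chunk = client.recv(4096)
                    if not chunk:
                        break
                    received += chunk
                # SO_ERROR reads 0 when no error is pending on the socket.
                assert client.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR) == 0
                return received
        finally:
            # Stop serve_forever from this (non-server) thread.
            server.shutdown()
```

Calling `echo_round_trip(b"hello")` should hand back the same bytes; the C test presumably makes the equivalent POSIX calls from Wasm.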
Builds on the outgoing-TCP backend to add bind, listen and accept, so a
program can run a real TCP server under -sNODERAWSOCKETS.
Clients stay on the public node:net API: connect() goes through
net.createConnection and never touches a private handle. Servers need a
synchronous bind() that reports the assigned ephemeral port up front (so a
bind(:0) followed by getsockname() works), which net.Server.listen cannot do
because it is async. For that we use a low-level tcp_wrap TCP handle, whose
bind/getsockname are synchronous, and hand that handle to net.Server.listen for
accept. So process.binding('tcp_wrap') only fires for bind/listen/accept, plus
the rare client that bind()s a source port before connect().
- bind/listen/accept added to the node backend, with poll reporting a listener
readable when a connection is pending
- test/sockets/test_tcp_server.c: a self-contained loopback accept+echo that
also builds and runs natively
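The property this commit depends on, that bind(:0) followed by getsockname() yields the assigned port before any connection exists, can be sketched with plain blocking sockets in Python. This is an illustration of the POSIX behaviour only, not of the tcp_wrap code path; `loopback_accept_echo` and `recv_exact` are hypothetical names.

```python
import socket


def recv_exact(sock: socket.socket, size: int) -> bytes:
    # recv() may return fewer bytes than requested, so loop until done or EOF.
    data = b""
    while len(data) < size:
        chunk = sock.recv(size - len(data))
        if not chunk:
            break
        data += chunk
    return data


def loopback_accept_echo(payload: bytes) -> bytes:
    """Hypothetical helper: bind(:0), listen, accept one client, echo once."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        listener.settimeout(5)
        listener.bind(("127.0.0.1", 0))
        # The ephemeral port is known as soon as bind() returns.
        port = listener.getsockname()[1]
        assert port != 0
        listener.listen(1)

        # The kernel completes the handshake into the backlog, so the client
        # can connect before accept() is called.
        client = socket.create_connection(("127.0.0.1", port), timeout=5)
        conn, _peer = listener.accept()
        with client, conn:
            client.sendall(payload)
            conn.sendall(recv_exact(conn, len(payload)))
            return recv_exact(client, len(payload))
    finally:
        listener.close()
```

Because the port is read back synchronously, the test never needs to reserve port numbers in advance.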
Adds connectionless UDP (SOCK_DGRAM) to the node socket backend: bind, sendto/recvfrom and a connect() that records a default peer.

node:dgram has no synchronous bind and a dgram.Socket cannot adopt an external handle, so unlike TCP we cannot split UDP into a public client path and a private server path. For now the whole UDP path goes through a low-level udp_wrap handle, which does give a synchronous bind() + getsockname() (so bind(:0) followed by getsockname() returns the assigned port immediately). Once node gains a public dgram bindSync, UDP can move fully onto node:dgram with no private API.

- UDP handle helper with onmessage receive wiring; recvStart is deferred until the handle is bound (an unbound handle rejects it), either by an explicit bind or by the auto-bind on first send
- bind/connect/sendmsg/recvmsg/poll/close branch for SOCK_DGRAM, with datagram recv returning one message and truncating to the buffer
- test/sockets/test_udp_echo.c: a self-contained loopback UDP echo that also builds and runs natively
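The datagram semantics named above (a connect() that only records a default peer, auto-bind on first send, and a recv that returns one message truncated to the buffer) look like this with native sockets. This is a sketch of the behaviour on Linux and macOS; Windows reports an error instead of silently truncating. `udp_truncation_demo` is a hypothetical name.

```python
import socket


def udp_truncation_demo():
    """Hypothetical helper: returns (truncated head, second datagram, echoed reply)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        server.settimeout(5)
        client.settimeout(5)
        server.bind(("127.0.0.1", 0))
        server_addr = server.getsockname()

        # connect() on a UDP socket only records the default peer.
        client.connect(server_addr)
        # The first send auto-binds the client to an ephemeral port.
        client.send(b"first datagram")
        client.send(b"second")

        # One recvfrom() consumes exactly one datagram; bytes that do not fit
        # in the buffer are discarded rather than left for the next call.
        head, peer = server.recvfrom(5)
        second, _ = server.recvfrom(64)

        server.sendto(second, peer)
        echoed = client.recv(64)
        return head, second, echoed
    finally:
        server.close()
        client.close()
```

The second `recvfrom` returning `b"second"` rather than the tail of the first message is the "one message per recv" rule in action.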
The node socket backend already works in threaded builds without any backend changes: like the rest of the JS filesystem, socket syscalls are proxied to the main thread, so the node:net/tcp_wrap/udp_wrap handles and their event loop always live on the main thread, and a worker calling connect/send/recv blocks on the synchronous proxy. Payloads are already copied out of wasm memory before being handed to node, so a SharedArrayBuffer heap is safe.

- run the TCP client, TCP server, UDP and connected-UDP tests in a second '-pthread -sPROXY_TO_PTHREAD' configuration to prove the proxied path
- document the threading behavior on the NODERAWSOCKETS setting and backend
The weak native stubs for __syscall_setsockopt and __syscall_shutdown were removed from emscripten_syscall_stubs.c in favor of JS implementations in libsyscall.js, but libsyscall.js is not linked under WASMFS, leaving the symbols undefined. Stub them out in WASMFS alongside the other socket syscalls.
__syscall_setsockopt and __syscall_shutdown move from native wasm exports to JS library imports in hello_dylink_all.
UNIMPLEMENTED(sendmmsg, (int sockfd, intptr_t msgvec, size_t vlen, int flags, ...))
UNIMPLEMENTED(shutdown, (int sockfd, int how, int dummy, int dummy2, int dummy3, int dummy4))
// __syscall_shutdown is provided in JS (libsyscall.js): it routes to the socket
// backend under NODERAWSOCKETS and otherwise reports the option as unsupported.
Maybe we don't need this comment?
sbc100 left a comment
Nice work on all the test cases. I can't say I've read all the test code yet, though.
Can you confirm they all run natively on Linux and/or macOS too (at least the ones where it makes sense that they could)?
// node builtins, resolved once each. getBuiltinModule works in both
// CommonJS and ESM output, with require as the fallback.
getNet() {
return nodeSockOps.netModule ||= (process.getBuiltinModule || require)('net');

// a server can bind(:0) and read back the assigned port immediately, which
// a would-blocking getsockname could not do. Only created for servers (and
// the rare bound client), never for a plain connect().
ensureHandle(sock) {
Should this be called ensureTcpHandle to match ensureUdpHandle?
throw new FS.ErrnoError({{{ cDefs.EISCONN }}});
}
if (addr === undefined || port === undefined) { addr = sock.daddr; port = sock.dport; }
if (addr === undefined || port === undefined) throw new FS.ErrnoError({{{ cDefs.EDESTADDRREQ }}});
Put this second check inside the first one?
if (addr === undefined || port === undefined) { addr = sock.daddr; port = sock.dport; }
if (addr === undefined || port === undefined) throw new FS.ErrnoError({{{ cDefs.EDESTADDRREQ }}});
var handle = nodeSockOps.ensureUdpHandle(sock);
if (ArrayBuffer.isView(buffer)) { offset += buffer.byteOffset; buffer = buffer.buffer; }
Is the incoming buffer allowed to be something other than a TypedArray?
nodeSockOps.applyUdpOptions(sock);
return 0;
}
} else if (level === {{{ cDefs.IPPROTO_TCP }}}) {
If you want, you could add more C defines to cDefs, e.g. IPPROTO_IP above?
See src/struct_info.json.
I'm not sure it always aids readability though.
// Write-side backpressure. We connect to a sink server (argv[1]) that accepts
// but never reads, then send non-blocking until the kernel + node buffers fill
// and send() reports EAGAIN. That proves writes are bounded rather than
// buffered without limit. Plain POSIX, also runs natively.
Maybe combine these big block headers with the licence comment above?
The mismatch of C and C++ block headers is a little strange on the eye I find.
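The backpressure test described in that header comment can be approximated natively as follows. This is a hedged Python sketch rather than the PR's C test: the sink runs in-process instead of being passed via argv[1], and `fill_until_eagain` and its `limit` safety cap are assumptions.

```python
import socket


def fill_until_eagain(limit: int = 1 << 30) -> int:
    """Hypothetical helper: bytes accepted before a non-blocking send() would block."""
    # Sink server: accepts the connection below but never reads from it.
    sink = socket.create_server(("127.0.0.1", 0))
    try:
        client = socket.create_connection(sink.getsockname()[:2], timeout=5)
        conn, _peer = sink.accept()
        with client, conn:
            client.setblocking(False)
            chunk = b"x" * 65536
            sent = 0
            while sent < limit:
                try:
                    sent += client.send(chunk)
                except BlockingIOError:
                    # EAGAIN/EWOULDBLOCK: send and receive buffers are full,
                    # so writes are bounded rather than queued without limit.
                    return sent
            raise RuntimeError("send() never reported EAGAIN below the cap")
    finally:
        sink.close()
```

The exact byte count depends on the kernel's socket buffer sizes, so a test should only assert that the loop terminates with a positive total.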
This adds a new -sNODERAWSOCKETS setting for supporting direct full sockets on Node.js via the node:net module for TCP and the node:dgram module for UDP, without needing ws, an external proxy process, or pthreads.

It is layered into four separate commits:
- node:net APIs
- process.binding('tcp_wrap') API to get the raw socket handle, which is also supported on other runtimes like Deno.

UDP without JSPI also has to use private APIs in Node.js, but to provide a future path where we might avoid them I've also posted nodejs/node#63838, which could lay the groundwork for a fully public-API UDP embedding, as that is the only blocker.

For comprehensive setsockopts support I also posted nodejs/node#63825, so we can close the loop on all options being supported; it is effectively used in a backwards-compatible way here despite not having landed yet.

Note: AI was used to create this PR, under my review.