Skip to content

WASIp3 Cooperative Threading#799

Draft
TartanLlama wants to merge 43 commits into
WebAssembly:mainfrom
TartanLlama:sy/coop-threading
Draft

WASIp3 Cooperative Threading#799
TartanLlama wants to merge 43 commits into
WebAssembly:mainfrom
TartanLlama:sy/coop-threading

Conversation

@TartanLlama

@TartanLlama TartanLlama commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Currently a draft while the LLVM patches merge and I get all of this ready for review

@alexcrichton alexcrichton left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An initial pass over things.

One major piece remaining in terms of synchronization is locks around the descriptor_table.h and derivative subsystems. None of that is locking-aware right now and would probably cause lots of problems if two threads tried to print to stdout for example. That's fine for an initial PR but we'll want to avoid broadcasting that threads are generally available until those issues are sorted out

if (at) {
errno = ENOSYS;
return -1;
}

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to thread at into __waitlist_wait_on and have the error here generated by that function? That way there's sort of a centralized "this isn't supported" as opposed to having to make sure to write this down in all functions with a timeout parameter

Comment on lines +5 to +8
// TODO(wasip3) timed waits
if (ts) {
return ETIMEDOUT;
}

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one might be difficult to leave unsupported into the future as AFAIK waiting with timeouts on condvars is a relatively common operation. Leaving this as a TODO for now though is fine by me

Comment on lines +28 to +29
# Make the C function do the rest of work.
call __wasm_component_model_builtin_thread_index

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be deferred to the C entrypoint to avoid needing to write details like $root in asm files? That way it should be possible to use the generated bindings to get the current thread index on the C side of things

static volatile int lock[1];
volatile int *const __at_quick_exit_lockptr = lock;
#endif
DECLARE_STRONG_LOCK(lock, static);

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it intentional this is "strong" bu locked with "weak" below?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stray file name? (unusual to see a space)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stray file?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stray file?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This I think is ok for an initial implementation, but I think we're going to want to eventually ensure that calls to __wasm_get_stack_pointer go directly to the builtin intrinsic without needing a wrapper in the middle. Otherwise there's basically no way to create a module that directly performs the call because even with LTO these won't get inlined.

return thrd_busy;
}
}
int ret = __pthread_mutex_trylock((pthread_mutex_t *)m);

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to have some sort of internal macro/header to do this cast to avoid needing to open-code it having the same representation?

Comment on lines +81 to +92
# Allocate a new stack
# Constant copied from __default_stacksize, TODO(wasip3) find a way
# to reference this constant without linking failing for shared libraries,
# i.e. a position-independent data reference.
i32.const 131072
call malloc
call __wasm_set_stack_pointer

# Allocate a new TLS area
call __allocate_aligned_tls
call __copy_tls
call __wasm_set_tls_base

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to delegate all of this to C instead of assembly? Or basically delegate as much of this function as possible to C instead of assembly here. I suspect that the initial instructions are required to be in assembly, but these look compatible (or at least a subset) to move over there

@alexcrichton

Copy link
Copy Markdown
Collaborator

Another high level thing I'm just now remembering -- we'll need to afford a transition period from today-without-coop-threads to tomorrow-with-coop-threads-stable. Basically I think we'll want some sort of CMake flag for "ok turn on threads" which is default off for wasip3. That way the default build of wasi-sdk, for example, will have a libc.a which has pthread symbols but the symbols don't actually do anything (return errors as they do today). Once the component-model-threading feature is finished/stable in Wasmtime and we've proven out the ecosystem then we can remove the option and unconditionally turn it on.

I'd like to ideally still only have one target, wasm32-wasip3, and from-source builds would have one extra option of doing pthreads-or-not. That keeps the target triples down and still easy to think about. Source-wise I think that'd involve adjusting some of the #ifdef here to use __wasm_libcall_thread_context__ plus some CMake-set variable and that's what uses all the threading intrinsics here. With the CMake-set thing disabled it'd basically be the p3-target as-is now, plus some extra stuff w.r.t. the stack pointer and TLS.

Does that sound ok to you @TartanLlama? If so I was thinking wasi/version.h.in is probably where we could invent some new #define name for "ok p3 actually has threads turned on"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants