
Dev/prebuilt flash att#175

Open
mducducd wants to merge 2 commits into main from dev/prebuilt-flash-att

Conversation

Collaborator

mducducd commented May 5, 2026

Set up installation using pre-built wheels of flash_attn 2.8.3 for x86_64 and aarch64. Installation now takes less than 10 minutes, so we can consider adding it to --extra gpu.

Wheels are taken from: https://github.com/mjun0812/flash-attention-prebuild-wheels

Tests succeeded on Planets but failed on GitHub, so I hard-coded --extra cpu in build.yml.
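For context, installing from a pre-built wheel looks roughly like the sketch below. The release tag and wheel filename here are hypothetical placeholders; the correct wheel depends on your Python, PyTorch, and CUDA versions, so check the releases page of the repository linked above.

```shell
# Hypothetical example: install a pre-built flash_attn wheel instead of
# compiling from source. Replace the release tag and filename with the
# wheel matching your Python / PyTorch / CUDA combination.
pip install "https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/download/<release-tag>/flash_attn-2.8.3-<python-tag>-linux_x86_64.whl"

# Verify the installed version afterwards.
python -c "import flash_attn; print(flash_attn.__version__)"
```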

mducducd force-pushed the dev/prebuilt-flash-att branch 2 times, most recently from 3ee7f46 to 1738abf, on May 5, 2026 15:35
mducducd requested a review from FWao May 6, 2026 08:34
Member

FWao left a comment


@mducducd Please update the main README accordingly, since a build is no longer necessary. Maybe we can keep an advanced/troubleshooting section at the bottom explaining how to switch back to self-built flash-attn if errors occur or the user wants a different version.

Otherwise looks good, I tested on a DGX Spark.
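A troubleshooting section like the one suggested above could sketch the fallback roughly as follows. This follows the build-from-source instructions in the flash-attn project's own README; the MAX_JOBS value is just an example to limit memory use during compilation.

```shell
# Fallback: remove the pre-built wheel and compile flash-attn from source.
pip uninstall -y flash-attn

# Building from source requires nvcc and a matching PyTorch install.
# MAX_JOBS caps parallel compile jobs to avoid exhausting RAM.
MAX_JOBS=4 pip install flash-attn --no-build-isolation
```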


2 participants