AVX-512 micro optimizations for non-ASCII input by hkratz · Pull Request #128 · rusticstuff/simdutf8

hkratz · 2026-06-15T12:04:35Z

This PR contains several AVX-512 micro optimizations for non-ASCII input:

Since we already require VBMI for our AVX-512 implementation, we can use VPERMB for dynamic shuffles. The advantage is that VPERMB ignores the upper two bits of each index byte and selects element 0-63 of the lookup table based solely on the lower 6 bits. With a 16-byte LUT repeated 4 times, index bits 4 and 5 only choose between identical copies, so we do not need to mask out the upper nibble before shuffling. This allows us to get rid of two VPANDQ masking operations which we would otherwise need because AVX-512 (just like AVX2 and SSE) has no 8-bit shift operations: the narrowest shr works on 16-bit lanes and pollutes the upper nibble of every other byte. In the same way, we can also get rid of one other VPANDQ masking operation.
LLVM's optimizer already fused four bit-manipulation operations into two VPTERNLOG operations in the hot loop. Doing the fusion explicitly allows us to do one more fusion, getting rid of another VPANDQ.
Because of this fusion we also need one fewer register-register move.
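To illustrate why the masking can be dropped, here is a minimal scalar sketch of the per-byte index semantics of VPSHUFB and 512-bit VPERMB. It is a model for illustration only, not the code from this PR: the helper names (`pshufb_byte`, `permb_byte`, `repeat4`, `shr4_epi16`) and the table contents are made up, and no intrinsics are used so that it runs on any CPU.

```rust
/// Scalar model of one VPSHUFB result byte: if bit 7 of the index is set the
/// result is zero, otherwise the low 4 bits select from the 16-byte table.
fn pshufb_byte(lut16: &[u8; 16], idx: u8) -> u8 {
    if idx & 0x80 != 0 { 0 } else { lut16[(idx & 0x0F) as usize] }
}

/// Scalar model of one 512-bit VPERMB result byte: only the low 6 bits of
/// the index are used, the upper two bits are ignored.
fn permb_byte(lut64: &[u8; 64], idx: u8) -> u8 {
    lut64[(idx & 0x3F) as usize]
}

/// Repeat a 16-byte table four times to fill a 64-byte table.
fn repeat4(lut16: &[u8; 16]) -> [u8; 64] {
    let mut out = [0u8; 64];
    for (i, b) in out.iter_mut().enumerate() {
        *b = lut16[i % 16];
    }
    out
}

/// Model a 16-bit logical shift right by 4 on two adjacent bytes
/// (little-endian within the 16-bit lane).
fn shr4_epi16(lo: u8, hi: u8) -> (u8, u8) {
    let [a, b] = (u16::from_le_bytes([lo, hi]) >> 4).to_le_bytes();
    (a, b)
}

fn main() {
    // Hypothetical table: entry i holds 100 + i.
    let lut16: [u8; 16] = [
        100, 101, 102, 103, 104, 105, 106, 107,
        108, 109, 110, 111, 112, 113, 114, 115,
    ];
    let lut64 = repeat4(&lut16);

    // We want the high nibble (0xE) of the first byte, but the 16-bit shift
    // drags the low nibble of the neighbouring byte into its upper nibble.
    let (polluted, _) = shr4_epi16(0xE2, 0xAC);
    assert_eq!(polluted, 0xCE);

    // VPSHUFB model: bit 7 is set, so the lookup yields zero unless we mask.
    assert_eq!(pshufb_byte(&lut16, polluted), 0);
    assert_eq!(pshufb_byte(&lut16, polluted & 0x0F), lut16[0xE]);

    // VPERMB model on the repeated table: correct without any masking.
    assert_eq!(permb_byte(&lut64, polluted), lut16[0xE]);
    println!("lookup without mask: {}", permb_byte(&lut64, polluted));
}
```

In this model the polluted index only breaks the VPSHUFB-style lookup when the stray bits set bit 7; the VPERMB-style lookup on the repeated table depends only on the low nibble for every possible index byte.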

All in all, these changes remove four VPANDQ operations and one reg-reg move at the cost of one additional VPTERNLOG.
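As background on why fusing into VPTERNLOG saves instructions, the following scalar sketch models its bitwise semantics: an 8-bit immediate acts as a truth table for any three-input boolean function, so two chained bitwise operations collapse into one instruction. The function `ternlog` and the example expression `(a | b) & c` are illustrative assumptions; the PR does not state which operations were fused.

```rust
/// Scalar model of VPTERNLOG on 64 bits: for each bit position, the bits of
/// a, b and c form a 3-bit index (a is the most significant bit) into imm8.
fn ternlog(a: u64, b: u64, c: u64, imm8: u8) -> u64 {
    let mut out = 0u64;
    for bit in 0..64u32 {
        let idx = (((a >> bit) & 1) << 2) | (((b >> bit) & 1) << 1) | ((c >> bit) & 1);
        out |= ((u64::from(imm8) >> idx) & 1) << bit;
    }
    out
}

// Truth-table columns of the three inputs. Applying a boolean expression to
// these constants yields the immediate for that expression.
const A: u8 = 0xF0;
const B: u8 = 0xCC;
const C: u8 = 0xAA;

/// Immediate for the hypothetical fused operation `(a | b) & c`.
const OR_THEN_AND: u8 = (A | B) & C;

/// Immediate for three-way XOR, `a ^ b ^ c`.
const XOR3: u8 = A ^ B ^ C;

fn main() {
    let (a, b, c) = (0x0123_4567_89AB_CDEFu64, 0xF0F0_F0F0_0F0F_0F0Fu64, 0x00FF_00FF_FF00_FF00u64);

    // One modelled VPTERNLOG replaces an OR followed by an AND.
    assert_eq!(ternlog(a, b, c, OR_THEN_AND), (a | b) & c);
    // The same mechanism covers any other three-input function.
    assert_eq!(ternlog(a, b, c, XOR3), a ^ b ^ c);

    println!("imm8 for (a | b) & c = {:#04x}", OR_THEN_AND);
}
```

Deriving the immediate from the `A`, `B`, `C` column constants keeps the intended expression readable in the source instead of leaving a bare magic number.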

Overall that improves AVX-512 performance on non-ASCII input by up to 20%.

This required some minor refactoring so that we can replace check_special_cases and check_multibyte_lengths with specialized implementations on AVX-512.

AVX-512 micro optimizations

36ecf54

hkratz merged commit 641d57f into main Jun 15, 2026
50 checks passed

hkratz merged 1 commit into main from avx512-opt
