Support building on CPUs with AVX but not AVX2 #3040

Merged: greenrazer merged 1 commit into huggingface:main from jncraton:fix-avx1 on Jul 31, 2025
@jncraton

Contributor

This change corrects an issue causing AVX2 intrinsics to be called on CPUs that do not support them. These intrinsics were conditionally compiled in behind an `avx` feature gate. This change correctly uses them only if `avx2` is available. This change should have no impact on modern CPUs, but it allows older CPUs to work properly using the unoptimized code path.

An alternative to this change would be to rework the AVX code paths to avoid AVX2 intrinsics, but that would be a larger change and likely less desirable as AVX2 seems like a very reasonable baseline these days. Dropping AVX2 intrinsics would likely be a performance regression on modern CPUs.

This change corrects an issue that I was facing related to an Illegal Instruction after building the crate, and should also address similar issues that have been encountered by others such as #1327 and #2140.
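To illustrate the kind of gating described above, here is a minimal sketch, not the actual candle diff. The module name `imp`, the function `widen_u8_to_i16`, and the constant `BACKEND` are hypothetical names chosen for this example. It assumes the AVX2 path is selected at compile time with `cfg(target_feature = "avx2")`, and that every other build (including AVX-only CPUs) falls back to scalar code.

```rust
// Hypothetical sketch: names below are illustrative, not taken from candle.

// Compiled only when AVX2 is statically enabled (for example via
// RUSTFLAGS="-C target-cpu=native" on an AVX2-capable machine).
// Gating on "avx" here instead would let the AVX2 intrinsic below
// be emitted for CPUs that cannot execute it.
#[cfg(all(target_arch = "x86_64", target_feature = "avx2"))]
mod imp {
    use std::arch::x86_64::*;

    pub const BACKEND: &str = "avx2";

    /// Zero-extends 16 unsigned bytes to 16 signed 16-bit integers.
    pub fn widen_u8_to_i16(input: &[u8; 16]) -> [i16; 16] {
        let mut out = [0i16; 16];
        // SAFETY: this module only exists when avx2 is enabled at compile
        // time, and both pointers cover exactly the bytes read or written.
        unsafe {
            let v = _mm_loadu_si128(input.as_ptr() as *const __m128i);
            let w = _mm256_cvtepu8_epi16(v); // AVX2 intrinsic
            _mm256_storeu_si256(out.as_mut_ptr() as *mut __m256i, w);
        }
        out
    }
}

// Fallback for every other target, including CPUs with AVX but not AVX2.
#[cfg(not(all(target_arch = "x86_64", target_feature = "avx2")))]
mod imp {
    pub const BACKEND: &str = "scalar";

    /// Scalar equivalent of the AVX2 path above.
    pub fn widen_u8_to_i16(input: &[u8; 16]) -> [i16; 16] {
        let mut out = [0i16; 16];
        for (o, &b) in out.iter_mut().zip(input.iter()) {
            *o = b as i16;
        }
        out
    }
}

pub use imp::{widen_u8_to_i16, BACKEND};
```

Both modules expose the same signature, so callers do not need to know which path was compiled in; only the `cfg` condition decides.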

@greenrazer

Contributor

Thanks for the contribution!

According to the Steam Hardware Survey, ~97% of users have AVX and ~95% have AVX2 support. Given this data, I agree that reworking the code to avoid AVX2 instructions isn't worth the effort just to provide optimizations for the ~2% of users on older hardware.

For anyone looking at this in the future: at least one of the AVX2 intrinsics involved is `_mm256_cvtepu8_epi16`, which according to Intel's documentation requires AVX2.
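As a side note on why the intrinsic above needs an AVX2 guard, the following is a minimal sketch of a runtime-checked alternative. This is not what the PR does (the PR relies on compile-time feature gates), and the function names `widen_avx2`, `widen_scalar`, and `widen` are hypothetical. It assumes the standard library's `is_x86_feature_detected!` macro and the `#[target_feature]` attribute.

```rust
// Hypothetical sketch: runtime dispatch, not the approach taken in this PR.

/// AVX2 path. Calling it on a CPU without AVX2 is undefined behavior and
/// typically shows up as an "Illegal instruction" crash.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn widen_avx2(input: &[u8; 16]) -> [i16; 16] {
    use std::arch::x86_64::*;
    let mut out = [0i16; 16];
    unsafe {
        let v = _mm_loadu_si128(input.as_ptr() as *const __m128i);
        let w = _mm256_cvtepu8_epi16(v); // requires AVX2, not just AVX
        _mm256_storeu_si256(out.as_mut_ptr() as *mut __m256i, w);
    }
    out
}

/// Portable fallback with the same result.
fn widen_scalar(input: &[u8; 16]) -> [i16; 16] {
    let mut out = [0i16; 16];
    for (o, &b) in out.iter_mut().zip(input.iter()) {
        *o = b as i16;
    }
    out
}

/// Picks the AVX2 path only if the running CPU reports AVX2 support.
pub fn widen(input: &[u8; 16]) -> [i16; 16] {
    #[cfg(target_arch = "x86_64")]
    {
        if std::arch::is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was just confirmed at runtime.
            return unsafe { widen_avx2(input) };
        }
    }
    widen_scalar(input)
}
```

Checking for `"avx"` instead of `"avx2"` in `widen` would reproduce the class of bug this PR fixes, since an AVX-only CPU would pass the check and then hit the AVX2 instruction.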

@greenrazer greenrazer merged commit 26a3222 into huggingface:main Jul 31, 2025
9 checks passed
john-sharratt pushed a commit to john-sharratt/candle that referenced this pull request May 7, 2026