check if llama.cpp is feasible to be used directly instead of llama-cpp-python #297

@kyteinsky

For the Docker builds, llama-cpp-python adds some limitations that prevent us from building llama.cpp with flexible support for SSE, AVX(2) and AVX512. The instruction-set variants are not loaded on demand; the build is hardcoded to require AVX2 and fails otherwise. If AVX512 were compiled in (GGML_AVX512=ON), that build would fail on every system that doesn't support AVX512, even if the system supports AVX2 and the build has AVX2 compiled in.
The llama.cpp flags for a flexible/dynamic build are: GGML_BACKEND_DL=ON GGML_CPU_ALL_VARIANTS=ON
^ This may be addressed later on, but it is not a priority.
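
To make "loaded on demand" concrete, here is a rough Python sketch of the selection a dynamic build would need at startup: inspect the CPU's feature flags and pick the most capable variant it can run. This is only a conceptual illustration, not llama.cpp's actual loader (which is C/C++). `VARIANTS`, `pick_variant` and `read_cpu_flags` are hypothetical names, and the required-flag sets are assumptions based on how Linux spells them in /proc/cpuinfo.

```python
# Conceptual sketch only: llama.cpp's real backend loader is written in C/C++.
# VARIANTS, pick_variant and read_cpu_flags are hypothetical names.

# Ordered from most to least capable. Each entry lists the CPU flags
# (as spelled in /proc/cpuinfo on Linux) assumed to be required.
VARIANTS = [
    ("avx512", {"avx512f"}),
    ("avx2", {"avx2", "fma"}),
    ("sse4.2", {"sse4_2"}),
]


def pick_variant(cpu_flags):
    """Return the first variant whose required flags are all present, else None."""
    for name, required in VARIANTS:
        if required <= cpu_flags:  # subset test: every required flag is available
            return name
    return None


def read_cpu_flags(path="/proc/cpuinfo"):
    """Read CPU feature flags on Linux; return an empty set if unavailable."""
    try:
        with open(path) as fh:
            for line in fh:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass
    return set()


if __name__ == "__main__":
    print(pick_variant(read_cpu_flags()))
```

With a single hardcoded variant there is no such fallback: an AVX512-only build simply fails on an AVX2-only machine, which is the limitation described above.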

For now, the previous requirement of AVX2 has been kept intact.

Originally posted by @kyteinsky in #295 (comment)
