Check whether llama.cpp can be used directly instead of llama-cpp-python #297

For the docker builds, llama-cpp-python adds some limitations that prevent us from building llama.cpp with flexible support for SSE, AVX(2) and AVX512. The CPU variants are not loaded on demand; the build is hardcoded to require AVX2 and fails without it. If AVX512 were compiled in (`GGML_AVX512=ON`), that build would fail on every system that doesn't support AVX512, even if the system supports AVX2 and the build has AVX2 compiled in.
The llama.cpp flags for a flexible/dynamic build are `GGML_BACKEND_DL=ON` and `GGML_CPU_ALL_VARIANTS=ON`.
^ this may be addressed later on, but it is not a priority.
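
A minimal sketch of how a build script could pass the two flags above to a CMake configure step. The helper name `cmake_configure_args` and the `llama.cpp` / `build` paths are placeholders for illustration and are not taken from this repository; the script only prints the command and does not run it.

```python
import shlex

# Flags quoted in this issue for a flexible/dynamic CPU backend build.
DYNAMIC_BUILD_FLAGS = {
    "GGML_BACKEND_DL": "ON",
    "GGML_CPU_ALL_VARIANTS": "ON",
}


def cmake_configure_args(source_dir, build_dir, flags=DYNAMIC_BUILD_FLAGS):
    """Return the argv list for a CMake configure step with the given cache flags.

    Hypothetical helper: the real Docker build may assemble its command differently.
    """
    args = ["cmake", "-S", source_dir, "-B", build_dir]
    # Each flag becomes a -DNAME=VALUE cache entry.
    args += [f"-D{name}={value}" for name, value in flags.items()]
    return args


if __name__ == "__main__":
    # Placeholder paths; print the command instead of executing it.
    print(shlex.join(cmake_configure_args("llama.cpp", "build")))
```

Returning the arguments as a list keeps the flags easy to inspect or extend before handing them to `subprocess.run`.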

For now, the previous requirement of AVX2 has been kept intact.
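
To see which instruction sets a given host reports, and therefore whether it meets the AVX2 requirement, something like the following could be used. It is only a sketch under assumptions: it reads the Linux x86 `flags` line from `/proc/cpuinfo`, the function names and the variant labels are made up for illustration, and it is not how llama.cpp itself selects a CPU backend.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # The first "flags" entry is enough; all cores normally report the same set.
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def best_variant(flags):
    """Pick the most capable variant label the flags allow (labels are hypothetical)."""
    if "avx512f" in flags:
        return "avx512"
    if "avx2" in flags:
        return "avx2"
    if "avx" in flags:
        return "avx"
    if "sse4_2" in flags:
        return "sse4.2"
    return "baseline"


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo", encoding="utf-8") as fh:
            flags = parse_cpu_flags(fh.read())
    except OSError:
        # Not Linux, or the file is unreadable: report no known flags.
        flags = set()
    print("best variant:", best_variant(flags))
    print("meets AVX2 requirement:", "avx2" in flags)
```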

_Originally posted by @kyteinsky in https://github.com/nextcloud/context_chat_backend/issues/295#issuecomment-4498917560_
            
