I'd like to add support for Linux. The idea is to use llamafile as the backend since it's portable and performs well on CPU/GPU without complex setups.
The plan is:
- Update the Rust CLI to work on Linux (currently only the Python server works on Linux; the CLI builds on Linux but its functionality is unimplemented).
- Implement the Linux backend (server/backend/linux.py) to manage the llama.cpp process using https://github.com/abetlen/llama-cpp-python.
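To make the second point concrete, here is a rough sketch of what the process management in `server/backend/linux.py` could look like. This is only a prototype outline under my own assumptions: the class name `LinuxBackend`, the constructor argument, and the start/stop semantics are placeholders, not the project's actual API, and the real implementation would launch the llama.cpp server rather than an arbitrary command.

```python
import subprocess


class LinuxBackend:
    """Hypothetical sketch: owns the lifecycle of a child inference
    process (e.g. a llama.cpp server). Names and signatures are
    illustrative, not the project's real interface."""

    def __init__(self, cmd):
        # cmd: argv list for the server process, e.g.
        # ["llamafile", "--server", "-m", "model.gguf"] (placeholder).
        self.cmd = cmd
        self.proc = None

    def start(self):
        # Spawn the process only if it is not already running.
        if self.proc is None or self.proc.poll() is not None:
            self.proc = subprocess.Popen(self.cmd)

    def is_running(self):
        return self.proc is not None and self.proc.poll() is None

    def stop(self, timeout=5):
        if self.is_running():
            self.proc.terminate()  # polite SIGTERM first
            try:
                self.proc.wait(timeout=timeout)
            except subprocess.TimeoutExpired:
                self.proc.kill()   # escalate to SIGKILL if it hangs
```

The idea is that the Python server keeps one `LinuxBackend` instance around and restarts the child process on model change or crash; whether we want SIGTERM-then-SIGKILL or something gentler is open for discussion.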
I see that we have a cpu.rs file for this. However, llamafile supports both CPU and GPU, so we might want to rename it later to cover both, no? Or is there a CPU-specific use case we are targeting?
Nonetheless, I have started to make a prototype for this.