feat: Add LlamaCpp support for local model hosting for faster inference #5
Open
MAXNORM8650 wants to merge 7 commits into context-labs:main from
Conversation
- Add LlamaCppProvider with background server management
- Support for TinyLlama, Gemma-3-4B, and SmolLM3 models
- Automatic model downloading from Hugging Face
- Silent server operation with clean command output
- Add sample configurations for different models
- Update README with LlamaCpp setup instructions
- Server runs in background until manually stopped

Supports models:
- tinyllama-1.1b (fast, basic responses)
- gemma-3-4b (balanced quality/speed)
- smollm3-3b (small, efficient)

Usage: Set LLAMA_DIR env var and use config commands to switch models
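The "background server management" and "silent server operation" described above can be sketched as follows. This is a minimal illustration, not the PR's actual provider code: it assumes the `llama-server` binary that ships with llama.cpp is on `PATH`, and the helper names are hypothetical.

```python
import subprocess

def server_argv(model_path: str, port: int = 8080) -> list[str]:
    # llama.cpp ships a `llama-server` binary; -m and --port are its
    # standard flags for model path and listening port.
    return ["llama-server", "-m", model_path, "--port", str(port)]

def start_llama_server(model_path: str, port: int = 8080) -> subprocess.Popen:
    # Silent, backgrounded server: discard the server's log output so
    # CLI command output stays clean, and start a new session so the
    # process keeps running until stopped manually.
    return subprocess.Popen(
        server_argv(model_path, port),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,
    )
```

The detached process keeps serving after the launching command exits, matching the "server runs in background until manually stopped" behavior in the description.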
Contributor
@MAXNORM8650 Can you resolve conflicts?
added 6 commits on August 18, 2025 at 00:47
Author
I have resolved all the conflicts except the ones related to the new features, e.g.
Author
Can you please look into these two conflicts, which are related to the new llama.cpp features? I am not sure how to resolve them, as they both involve the new llama.cpp code.
Contributor
Please fix the conflicts.
@samheutmaker Hey, can I look into resolving the conflicts?
Supports models through a simple config and can be extended with all SmolLM models:
Usage: Set LLAMA_DIR env var and use config commands to switch models
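A usage sketch of the environment setup described above. The `LLAMA_DIR` variable comes from the PR description; the config commands are not spelled out there, so the ones shown in comments are hypothetical placeholders for whatever the updated README documents.

```shell
# LLAMA_DIR tells the provider where local GGUF models live
# (downloaded automatically from Hugging Face on first use).
export LLAMA_DIR="$HOME/models/llama"

# Hypothetical config commands to switch models -- the actual command
# names are in the README added by this PR:
#   <cli> config set model tinyllama-1.1b   # fast, basic responses
#   <cli> config set model gemma-3-4b       # balanced quality/speed
#   <cli> config set model smollm3-3b       # small, efficient
```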