This repository was archived by the owner on Jan 21, 2026. It is now read-only.
Description
Whenever Ollama takes too long to process a request, the client gives up before the response arrives: the call fails with "context deadline exceeded" after roughly 30 seconds, and the client then retries.
Ollama
[GIN] 2025/07/18 - 04:20:18 | 200 | 29.971381125s | 127.0.0.1 | POST "/api/chat"
Client
time=2025-07-17T13:24:13.634-04:00 level=WARN source=runner.go:128 msg="truncating input prompt" limit=4096 prompt=4420 keep=4 new=4096
time=2025-07-17T13:24:43.607-04:00 level=WARN msg="Retry extract" package=golightrag function=Insert retry=3 error="failed to call LLM: error sending request: Post \"http://localhost:11434/api/chat\": context deadline exceeded"
Proposal 1: pass the timeout as a constructor argument
// The timeout is given in seconds.
NewOllama(host, model string, timeout int, params Parameters, logger *slog.Logger)
type Ollama struct {
	host    string
	model   string
	timeout time.Duration
	params  Parameters
	client  *api.Client
	logger  *slog.Logger
}
func NewOllama(host, model string, timeout int, params Parameters, logger *slog.Logger) Ollama {
	u, err := url.Parse(host)
	if err != nil {
		panic(err)
	}
	// timeout is an int, so it must be converted before multiplying by time.Second.
	d := time.Duration(timeout) * time.Second
	return Ollama{
		host:    host,
		model:   model,
		params:  params,
		client:  api.NewClient(u, &http.Client{Timeout: d}),
		timeout: d,
		logger:  logger.With(slog.String("module", "ollama")),
	}
}
// Chat sends a chat message to the Ollama API.
func (o Ollama) Chat(messages []string) (string, error) {
	msgs := make([]api.Message, len(messages))
	....
	ctx, cancel := context.WithTimeout(context.Background(), o.timeout)
	defer cancel()
	......
	return result.String(), nil
}
Proposal 2: pass the timeout through Parameters
timeout := 100 // seconds
llm.NewOllama(cfg.OpenAIUrl, cfg.OpenAIModel, llm.Parameters{
	Temperature:      &temp,
	IncludeReasoning: &reasoning,
	Timeout:          &timeout,
}, logger)
type Parameters struct {
	Temperature       *float32       `yaml:"temperature"`
	TopP              *float32       `yaml:"topP"`
	TopK              *int           `yaml:"topK"`
	FrequencyPenalty  *float32       `yaml:"frequencyPenalty"`
	PresencePenalty   *float32       `yaml:"presencePenalty"`
	RepetitionPenalty *float32       `yaml:"repetitionPenalty"`
	MinP              *float32       `yaml:"minP"`
	TopA              *float32       `yaml:"topA"`
	Seed              *int           `yaml:"seed"`
	MaxTokens         *int           `yaml:"maxTokens"`
	LogitBias         map[string]int `yaml:"logitBias"`
	Logprobs          *bool          `yaml:"logprobs"`
	TopLogprobs       *int           `yaml:"topLogprobs"`
	Stop              []string       `yaml:"stop"`
	IncludeReasoning  *bool          `yaml:"includeReasoning"`
	Timeout           *int           `yaml:"timeout"` // seconds
}