This repository was archived by the owner on Jan 21, 2026. It is now read-only.

Feature: configure the Ollama timeout. #8

Description

@hungrymonkey

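Whenever Ollama takes too long to process a request, the client hits its deadline, drops the connection, and retries, so Ollama and the client keep timing each other out. In the logs below, the request takes about 30 seconds and the client fails with "context deadline exceeded".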

Ollama

[GIN] 2025/07/18 - 04:20:18 | 200 | 29.971381125s |       127.0.0.1 | POST     "/api/chat"

Client

time=2025-07-17T13:24:13.634-04:00 level=WARN source=runner.go:128 msg="truncating input prompt" limit=4096 prompt=4420 keep=4 new=4096

time=2025-07-17T13:24:43.607-04:00 level=WARN msg="Retry extract" package=golightrag function=Insert retry=3 error="failed to call LLM: error sending request: Post \"http://localhost:11434/api/chat\": context deadline exceeded"

Proposal 1.

// The timeout is given in seconds.
NewOllama(host, model string, timeout int, params Parameters, logger *slog.Logger)

type Ollama struct {
	host  string
	model string

	timeout time.Duration
	params Parameters

	client *api.Client

	logger *slog.Logger
}


func NewOllama(host, model string, timeout int, params Parameters, logger *slog.Logger) Ollama {
	u, err := url.Parse(host)
	if err != nil {
		panic(err)
	}

	return Ollama{
		host:   host,
		model:  model,
		params: params,
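		// timeout is an int, so convert it to time.Duration before multiplying by time.Second.
		client:  api.NewClient(u, &http.Client{Timeout: time.Duration(timeout) * time.Second}),
		timeout: time.Duration(timeout) * time.Second,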
		logger: logger.With(slog.String("module", "ollama")),
	}
}

// Chat sends a chat message to the Ollama API.
func (o Ollama) Chat(messages []string) (string, error) {
	msgs := make([]api.Message, len(messages))
....
	ctx, cancel := context.WithTimeout(context.Background(), o.timeout)
	defer cancel()
......
	return result.String(), nil
}

Proposal 2

timeout := 100 // seconds
llm.NewOllama(cfg.OpenAIUrl, cfg.OpenAIModel, llm.Parameters{
		Temperature:      &temp,
		IncludeReasoning: &reasoning,
		Timeout:          &timeout,
	}, logger)

type Parameters struct {
	Temperature       *float32       `yaml:"temperature"`
	TopP              *float32       `yaml:"topP"`
	TopK              *int           `yaml:"topK"`
	FrequencyPenalty  *float32       `yaml:"frequencyPenalty"`
	PresencePenalty   *float32       `yaml:"presencePenalty"`
	RepetitionPenalty *float32       `yaml:"repetitionPenalty"`
	MinP              *float32       `yaml:"minP"`
	TopA              *float32       `yaml:"topA"`
	Seed              *int           `yaml:"seed"`
	MaxTokens         *int           `yaml:"maxTokens"`
	LogitBias         map[string]int `yaml:"logitBias"`
	Logprobs          *bool          `yaml:"logprobs"`
	TopLogprobs       *int           `yaml:"topLogprobs"`
	Stop              []string       `yaml:"stop"`
	IncludeReasoning  *bool          `yaml:"includeReasoning"`
	Timeout           *int           `yaml:"timeout"`
}
