21 changes: 20 additions & 1 deletion .env.example
@@ -40,6 +40,9 @@ FETCH_OPENRAG_DOCS_AT_STARTUP=false
LANGFLOW_SECRET_KEY=

# flow ids for chat and ingestion flows
# OpenRAG automatically switches to the built-in Astra flow IDs when
# VECTOR_BACKEND=astra. Leave these values alone unless you are replacing
# the stock OpenSearch-backed flows with your own custom JSON files.
LANGFLOW_CHAT_FLOW_ID=1098eea1-6649-4e1d-aed1-b77249fb8dd0
LANGFLOW_INGEST_FLOW_ID=5488df7c-b93f-4f87-a446-b67028bc0813
LANGFLOW_URL_INGEST_FLOW_ID=72c3d17c-2dac-4a73-b48a-6518473d7830
@@ -93,6 +96,23 @@ OPENSEARCH_USERNAME=admin
# Change this if you want to use a different index name or avoid conflicts
OPENSEARCH_INDEX_NAME=documents

# Knowledge backend selection
# - opensearch: store and search knowledge in OpenSearch (default)
# - astra: use the built-in Astra-backed Langflow flows
VECTOR_BACKEND=opensearch

# Astra DB connection for VECTOR_BACKEND=astra
ASTRA_DB_APPLICATION_TOKEN=
ASTRA_DB_API_ENDPOINT=
# Optional Astra DB keyspace/namespace override. Leave empty to use the database default.
ASTRA_DB_KEYSPACE=

# Optional: override the built-in Astra flow IDs (only needed for custom Astra flows)
# ASTRA_CHAT_FLOW_ID=6df7ff83-97be-46be-b624-ab40ff53fd7b
# ASTRA_INGEST_FLOW_ID=4ae67a0f-8f97-4563-ad54-3c18fe07ce30
# ASTRA_URL_INGEST_FLOW_ID=fbd0aee6-29b1-4a51-a802-27fb3bec9927
# ASTRA_NUDGES_FLOW_ID=05b262d8-45ba-4574-afad-797e8918defd

# IBM AMS Authentication (IBM Watsonx Data embedded mode)
# Set IBM_AUTH_ENABLED=true to authenticate via the ibm-openrag-session cookie
# instead of Google OAuth. The raw IBM JWT is also passed directly to OpenSearch.
@@ -231,7 +251,6 @@ LOG_LEVEL=
SERVICE_NAME=openrag
# Secret key for session management (auto-generated if not provided)
SESSION_SECRET=

# OPTIONAL: JWT signing key for token issuance and verification.
# If not set, OpenRAG generates an RSA key pair (private_key.pem / public_key.pem)
# under OPENRAG_KEYS_PATH at startup and uses RS256 signing.
13 changes: 12 additions & 1 deletion docker-compose.yml
@@ -70,6 +70,7 @@ services:
- LANGFLOW_CHAT_FLOW_ID=${LANGFLOW_CHAT_FLOW_ID}
- LANGFLOW_INGEST_FLOW_ID=${LANGFLOW_INGEST_FLOW_ID}
- LANGFLOW_URL_INGEST_FLOW_ID=${LANGFLOW_URL_INGEST_FLOW_ID}
- VECTOR_BACKEND=${VECTOR_BACKEND:-opensearch}
- DISABLE_INGEST_WITH_LANGFLOW=${DISABLE_INGEST_WITH_LANGFLOW:-false}
- INGEST_SAMPLE_DATA=${INGEST_SAMPLE_DATA:-true}
- NUDGES_FLOW_ID=${NUDGES_FLOW_ID}
@@ -100,6 +101,13 @@ services:
- IBM_COS_HMAC_SECRET_ACCESS_KEY=${IBM_COS_HMAC_SECRET_ACCESS_KEY}
- IBM_COS_AUTH_ENDPOINT=${IBM_COS_AUTH_ENDPOINT}
- OPENSEARCH_INDEX_NAME=${OPENSEARCH_INDEX_NAME:-documents}
- ASTRA_DB_APPLICATION_TOKEN=${ASTRA_DB_APPLICATION_TOKEN}
- ASTRA_DB_API_ENDPOINT=${ASTRA_DB_API_ENDPOINT}
- ASTRA_DB_KEYSPACE=${ASTRA_DB_KEYSPACE}
- ASTRA_CHAT_FLOW_ID=${ASTRA_CHAT_FLOW_ID}
- ASTRA_INGEST_FLOW_ID=${ASTRA_INGEST_FLOW_ID}
- ASTRA_URL_INGEST_FLOW_ID=${ASTRA_URL_INGEST_FLOW_ID}
- ASTRA_NUDGES_FLOW_ID=${ASTRA_NUDGES_FLOW_ID}
- LANGFLOW_KEY=${LANGFLOW_KEY}
- LANGFLOW_KEY_RETRIES=${LANGFLOW_KEY_RETRIES:-15}
- LANGFLOW_KEY_RETRY_DELAY=${LANGFLOW_KEY_RETRY_DELAY:-2.0}
@@ -185,12 +193,15 @@ services:
- OPENSEARCH_PORT=${LANGFLOW_OPENSEARCH_PORT:-${OPENSEARCH_PORT:-9200}}
- OPENSEARCH_URL=https://${LANGFLOW_OPENSEARCH_HOST:-${OPENSEARCH_HOST:-opensearch}}:${LANGFLOW_OPENSEARCH_PORT:-${OPENSEARCH_PORT:-9200}}
- OPENSEARCH_INDEX_NAME=${OPENSEARCH_INDEX_NAME:-documents}
- VECTOR_BACKEND=${VECTOR_BACKEND:-opensearch}
- ASTRA_DB_APPLICATION_TOKEN=${ASTRA_DB_APPLICATION_TOKEN}
- ASTRA_DB_API_ENDPOINT=${ASTRA_DB_API_ENDPOINT}
- DOCLING_SERVE_URL=${DOCLING_SERVE_URL:-http://host.docker.internal:5001}
- FILENAME=None
- MIMETYPE=None
- FILESIZE=0
- SELECTED_EMBEDDING_MODEL=${SELECTED_EMBEDDING_MODEL:-text-embedding-3-small}
- LANGFLOW_VARIABLES_TO_GET_FROM_ENVIRONMENT=JWT,OPENRAG-QUERY-FILTER,OPENSEARCH_PASSWORD,OPENSEARCH_URL,DOCLING_SERVE_URL,OWNER,OWNER_NAME,OWNER_EMAIL,CONNECTOR_TYPE,DOCUMENT_ID,SOURCE_URL,ALLOWED_USERS,ALLOWED_GROUPS,FILENAME,MIMETYPE,FILESIZE,SELECTED_EMBEDDING_MODEL,OPENAI_API_KEY,ANTHROPIC_API_KEY,WATSONX_APIKEY,WATSONX_URL,WATSONX_PROJECT_ID,OLLAMA_BASE_URL,OPENSEARCH_INDEX_NAME
- LANGFLOW_VARIABLES_TO_GET_FROM_ENVIRONMENT=JWT,OPENRAG-QUERY-FILTER,OPENSEARCH_PASSWORD,OPENSEARCH_URL,DOCLING_SERVE_URL,OWNER,OWNER_NAME,OWNER_EMAIL,CONNECTOR_TYPE,DOCUMENT_ID,SOURCE_URL,ALLOWED_USERS,ALLOWED_GROUPS,FILENAME,MIMETYPE,FILESIZE,SELECTED_EMBEDDING_MODEL,OPENAI_API_KEY,ANTHROPIC_API_KEY,WATSONX_APIKEY,WATSONX_URL,WATSONX_PROJECT_ID,OLLAMA_BASE_URL,OPENSEARCH_INDEX_NAME,ASTRA_DB_APPLICATION_TOKEN,ASTRA_DB_API_ENDPOINT,ASTRA_DB_KEYSPACE
- LANGFLOW_LOG_LEVEL=DEBUG
- LANGFLOW_WORKERS=${LANGFLOW_WORKERS:-1}
- LANGFLOW_AUTO_LOGIN=${LANGFLOW_AUTO_LOGIN}
21 changes: 20 additions & 1 deletion docs/docs/reference/configuration.mdx
@@ -80,6 +80,25 @@ For Langflow flow IDs and Langflow timeout settings, see [Langflow settings](#la
| `INGESTION_TIMEOUT` | `3600` | Document ingestion timeout limit in seconds for each file. Increase this value if you experience timeouts when ingesting very large documents. Must be greater than or equal to [`LANGFLOW_TIMEOUT`](#langflow-settings). |
| `UPLOAD_BATCH_SIZE` | `25` | When ingesting folders, set the maximum number of files to ingest per batch. Each batch is an [ingestion task](/ingestion#monitor-ingestion). Increase this value to ingest more files per batch. If this value is too high, performance issues can occur. |

## Knowledge backend settings

Select the vector backend that OpenRAG should use for Langflow ingestion and retrieval.

| Variable | Default | Description |
|----------|---------|-------------|
| `VECTOR_BACKEND` | `opensearch` | Select the active knowledge backend. Use `opensearch` for the default OpenSearch-backed flows, or `astra` to switch OpenRAG to the built-in Astra DB Langflow flows. |
| `ASTRA_DB_APPLICATION_TOKEN` | Not set | Required when `VECTOR_BACKEND=astra`. Astra DB application token used by the Astra-backed Langflow vector store component. |
| `ASTRA_DB_API_ENDPOINT` | Not set | Required when `VECTOR_BACKEND=astra`. Astra DB Data API endpoint used by the Astra-backed Langflow vector store component. |
| `ASTRA_DB_KEYSPACE` | Not set | Optional when `VECTOR_BACKEND=astra`. Astra DB keyspace/namespace to use instead of the database default. |
| `ASTRA_CHAT_FLOW_ID` | Built-in | Optional. Override the default Astra chat/retrieval flow ID. Only needed when replacing the built-in Astra flows with custom ones. |
| `ASTRA_INGEST_FLOW_ID` | Built-in | Optional. Override the default Astra ingestion flow ID. |
| `ASTRA_URL_INGEST_FLOW_ID` | Built-in | Optional. Override the default Astra URL ingestion flow ID. |
| `ASTRA_NUDGES_FLOW_ID` | Built-in | Optional. Override the default Astra nudges flow ID. |

:::warning
The Astra backend stores one vector representation per collection. If you change the embedding provider or model for an existing Astra-backed corpus, reingest the documents so all stored vectors are regenerated with the new embedding configuration.
:::
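
The backend settings above can be read as a small resolution step: pick the backend, check the variables the table marks as required, then choose the matching set of flow ID variables. The following Python sketch illustrates that reading only; it is not OpenRAG's implementation. The names `resolve_backend`, `BackendConfig`, `FLOW_VARS`, and the `astra_defaults` parameter are hypothetical, and the case-insensitive handling of `VECTOR_BACKEND` is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Mapping, Optional

# Variables the configuration table marks as required for VECTOR_BACKEND=astra.
REQUIRED_ASTRA_VARS = ("ASTRA_DB_APPLICATION_TOKEN", "ASTRA_DB_API_ENDPOINT")

# Hypothetical mapping: backend -> flow role -> environment variable name.
FLOW_VARS = {
    "opensearch": {
        "chat": "LANGFLOW_CHAT_FLOW_ID",
        "ingest": "LANGFLOW_INGEST_FLOW_ID",
        "url_ingest": "LANGFLOW_URL_INGEST_FLOW_ID",
        "nudges": "NUDGES_FLOW_ID",
    },
    "astra": {
        "chat": "ASTRA_CHAT_FLOW_ID",
        "ingest": "ASTRA_INGEST_FLOW_ID",
        "url_ingest": "ASTRA_URL_INGEST_FLOW_ID",
        "nudges": "ASTRA_NUDGES_FLOW_ID",
    },
}


@dataclass(frozen=True)
class BackendConfig:
    backend: str
    flow_ids: Dict[str, Optional[str]]  # flow role -> flow ID, or None if unknown
    keyspace: Optional[str]             # None means "use the database default"


def resolve_backend(
    env: Mapping[str, str],
    astra_defaults: Optional[Mapping[str, str]] = None,
) -> BackendConfig:
    """Resolve the active backend and its flow IDs from an environment mapping.

    `astra_defaults` stands in for the built-in Astra flow IDs (role -> ID);
    how OpenRAG actually stores those defaults is not shown in this change.
    """
    # Assumption: an unset or empty value falls back to "opensearch",
    # mirroring the ${VECTOR_BACKEND:-opensearch} default in docker-compose.yml.
    backend = (env.get("VECTOR_BACKEND") or "opensearch").strip().lower()
    if backend not in FLOW_VARS:
        raise ValueError(f"Unsupported VECTOR_BACKEND: {backend!r}")

    if backend == "astra":
        missing = [name for name in REQUIRED_ASTRA_VARS if not env.get(name)]
        if missing:
            raise ValueError(
                "VECTOR_BACKEND=astra requires: " + ", ".join(missing)
            )

    defaults = dict(astra_defaults or {}) if backend == "astra" else {}
    # An explicit environment value overrides the built-in default for that role.
    flow_ids = {
        role: env.get(var) or defaults.get(role)
        for role, var in FLOW_VARS[backend].items()
    }

    keyspace = (env.get("ASTRA_DB_KEYSPACE") or None) if backend == "astra" else None
    return BackendConfig(backend=backend, flow_ids=flow_ids, keyspace=keyspace)
```

Calling `resolve_backend(os.environ)` at startup would surface a missing Astra token or endpoint as a single error before any flow runs, which is one way to fail early on an incomplete Astra configuration.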

## Langflow settings {#langflow-settings}

Configure the OpenRAG Langflow server's authentication, contact point, and built-in flow definitions.
@@ -103,7 +122,7 @@ For better security, it is recommended to set `LANGFLOW_SUPERUSER_PASSWORD` so t
| `LANGFLOW_SECRET_KEY` | Automatically generated | Secret encryption key for Langflow internal operations. It is recommended to [generate your own Langflow secret key](https://docs.langflow.org/api-keys-and-authentication#langflow-secret-key) for this variable. If this variable isn't set, then Langflow generates a secret key automatically. |
| `LANGFLOW_SUPERUSER` | `admin` | Username for the Langflow administrator user. |
| `LANGFLOW_SUPERUSER_PASSWORD` | Not set | Langflow administrator password. If this variable isn't set, then the Langflow server starts _without_ authentication enabled. It is recommended to set `LANGFLOW_SUPERUSER_PASSWORD` so the [Langflow server starts with authentication enabled](https://docs.langflow.org/api-keys-and-authentication#start-a-langflow-server-with-authentication-enabled). |
| `LANGFLOW_CHAT_FLOW_ID`, `LANGFLOW_INGEST_FLOW_ID`, `NUDGES_FLOW_ID`, `LANGFLOW_URL_INGEST_FLOW_ID` | Built-in flow IDs | These variables are set automatically to the IDs of the chat, Docling ingestion, URL ingestion, and nudges [flows](/agents). The default values are found in [`.env.example`](https://github.com/langflow-ai/openrag/blob/main/.env.example). Only change these values if you want to replace a built-in flow with your own custom flow. The flow JSON must be present in your version of the OpenRAG codebase. For example, if you [deploy self-managed services](/docker), you can add the flow JSON to your local clone of the OpenRAG repository before deploying OpenRAG. |
| `LANGFLOW_CHAT_FLOW_ID`, `LANGFLOW_INGEST_FLOW_ID`, `NUDGES_FLOW_ID`, `LANGFLOW_URL_INGEST_FLOW_ID` | Built-in flow IDs | These variables are set automatically to the IDs of the stock OpenSearch-backed chat, Docling ingestion, URL ingestion, and nudges [flows](/agents). The default values are found in [`.env.example`](https://github.com/langflow-ai/openrag/blob/main/.env.example). Only change these values if you want to replace a built-in flow with your own custom flow. The flow JSON must be present in your version of the OpenRAG codebase. For example, if you [deploy self-managed services](/docker), you can add the flow JSON to your local clone of the OpenRAG repository before deploying OpenRAG. When `VECTOR_BACKEND=astra`, OpenRAG automatically switches to the built-in Astra flow set. |
| `LANGFUSE_SECRET_KEY` | Not set | Optional Langfuse secret key to enable the [Langflow integration with Langfuse](https://docs.langflow.org/integrations-langfuse). |
| `LANGFUSE_PUBLIC_KEY` | Not set | Optional Langfuse public key to enable the [Langflow integration with Langfuse](https://docs.langflow.org/integrations-langfuse). |
| `LANGFUSE_HOST` | Not set | Leave empty for Langfuse Cloud. Required for self-hosted Langfuse deployments if `LANGFUSE_SECRET_KEY` and `LANGFUSE_PUBLIC_KEY` are set. The address must be relative to the OpenRAG container deployment. For example, `http://localhost:3002` or `http://host.docker.internal:3000`. |