Merged
75 commits
6e4acd3
fix(security): bind Ray dashboard to localhost in docker-compose
EnjoyBacon7 Jun 24, 2026
7003e06
fix(security): run the Ray container as a non-root user (H4)
EnjoyBacon7 Jun 24, 2026
b69a812
chore(docker): run the app image as a non-root, OpenShift-compatible …
EnjoyBacon7 Jun 24, 2026
1c502f0
fix(security): don't bind-mount source / auto-reload in prod (N8)
EnjoyBacon7 Jun 24, 2026
bc9e286
fix(deploy): remove API_NUM_WORKERS footgun, force single uvicorn worker
EnjoyBacon7 Jun 24, 2026
7fd3752
docs: drop API_NUM_WORKERS from example env assets
EnjoyBacon7 Jun 24, 2026
bb5797b
chore(release): bump version to 1.1.12
EnjoyBacon7 Jun 24, 2026
0ae15c0
chore(release): bump version to 1.1.13
EnjoyBacon7 Jun 24, 2026
389072e
fix(security): harden Ansible deploy (host key checking, .env 0600, p…
EnjoyBacon7 Jun 24, 2026
fec84c2
fix(security): bind Ray dashboard to localhost by default (C3)
EnjoyBacon7 Jun 24, 2026
32e6dca
feat(api): connect to an external Ray cluster via RAY_ADDRESS
EnjoyBacon7 Jun 24, 2026
3ddc3a2
fix(security): require MinIO credentials via env, drop minioadmin def…
EnjoyBacon7 Jun 24, 2026
fb93d2f
fix(security): remove weak default DB password and AUTH_TOKEN (M2/M3)
EnjoyBacon7 Jun 24, 2026
c1cac2e
fix(security): drop seccomp:unconfined from Milvus containers (N11)
EnjoyBacon7 Jun 24, 2026
03b8618
fix(security): move Helm DB password to Secret, drop weak default (N7…
EnjoyBacon7 Jun 24, 2026
a75a156
fix(security): add default-deny NetworkPolicy to Helm chart (N12, C3 …
EnjoyBacon7 Jun 24, 2026
525d649
fix(security): restrict metrics stack exposure (N9)
EnjoyBacon7 Jun 24, 2026
81c039e
fix(security): pin image tags instead of latest (N10)
EnjoyBacon7 Jun 24, 2026
f49d77f
fix(security): require explicit ALLOW_NO_AUTH for the no-token admin …
EnjoyBacon7 Jun 24, 2026
ed1fb27
fix(security): whitelist writable fields in update_user (mass assignm…
EnjoyBacon7 Jun 24, 2026
ce43f38
fix(users): coerce empty external_user_id to NULL to avoid unique-ind…
EnjoyBacon7 Jun 24, 2026
a4456ad
fix(security): don't put session token in UI file URLs under OIDC (N13)
EnjoyBacon7 Jun 24, 2026
1e927ee
test(infra): update compose-storage test for the N8 bind-mount removal
EnjoyBacon7 Jun 24, 2026
89a3935
chore: sync uv.lock to version 1.1.13 and ruff-format user_service
EnjoyBacon7 Jun 24, 2026
d04d8f2
docs(forward-port): log the main→refactor security forward-port
EnjoyBacon7 Jun 24, 2026
1196e52
fix(security): require exp + jti and block replay on back-channel log…
EnjoyBacon7 Jun 24, 2026
eea34bb
fix(security): add clock-skew leeway and nbf check to OIDC JWT verifi…
EnjoyBacon7 Jun 24, 2026
8080944
fix(security): block cross-site logout CSRF (N3)
EnjoyBacon7 Jun 24, 2026
411fcb7
fix(security): revoke OIDC sessions on API-token regeneration (#361, …
EnjoyBacon7 Jun 24, 2026
ee68c34
docs(forward-port): record the OIDC-crypto cluster as ported/skipped
EnjoyBacon7 Jun 24, 2026
6d7f817
fix(streaming): don't attach finish_reason to content-bearing chunk
EnjoyBacon7 Jun 24, 2026
6fef508
fix(search): stop logging raw user query text at INFO (#481)
EnjoyBacon7 Jun 24, 2026
01a1311
fix(prompt): require a non-empty answer body in the RAG system prompt
EnjoyBacon7 Jun 24, 2026
145aec8
fix(security): scope surrounding-chunk lookup to the source partition…
EnjoyBacon7 Jun 24, 2026
6471f92
fix(security): stop fail-open in ensure_partition_role for missing pa…
EnjoyBacon7 Jun 24, 2026
ca1b73c
docs(forward-port): record Batch 4 progress (5 ported, 199424bf obvia…
EnjoyBacon7 Jun 24, 2026
cfd2447
fix(security): prevent Milvus filter injection escaping partition scope
EnjoyBacon7 Jun 24, 2026
2103418
fix(security): validate file_id and partition names to block Milvus f…
EnjoyBacon7 Jun 24, 2026
107cd7d
fix(security): ignore client-supplied LLM endpoint/credentials in llm…
EnjoyBacon7 Jun 24, 2026
c944620
fix(security): neutralize control tokens in RAG context (H8, #487)
EnjoyBacon7 Jun 24, 2026
972316b
fix(security): stop leaking stack traces / FS paths to clients (M7)
EnjoyBacon7 Jun 24, 2026
5925294
fix(security): enforce token limit in RAG mode and bound n/best_of (M12)
EnjoyBacon7 Jun 24, 2026
c9f8507
docs(forward-port): record Batch 4 RAG-security progress
EnjoyBacon7 Jun 24, 2026
3d0a6aa
fix(security): cap partitions a non-admin user may create (M13)
EnjoyBacon7 Jun 24, 2026
a678251
docs(forward-port): record d66cf029 (partition cap) ported
EnjoyBacon7 Jun 24, 2026
55178a4
fix(security): harden web-search content fetcher against SSRF and MITM
EnjoyBacon7 Jun 24, 2026
c3d6ca9
fix(security): make SVG rendering external-fetch guard explicit (8ea7…
EnjoyBacon7 Jun 24, 2026
79e0db0
docs(forward-port): record web/SVG SSRF progress; 63a857af obviated
EnjoyBacon7 Jun 24, 2026
843ce6d
fix(security): bound EML attachment fan-out during ingestion (M8)
EnjoyBacon7 Jun 24, 2026
ec558e4
fix(security): authorize source-file downloads by partition
EnjoyBacon7 Jun 24, 2026
6f6e748
fix(security): validate source_file_id in copy endpoint and bind Ray …
EnjoyBacon7 Jun 24, 2026
24a129c
docs(forward-port): Batch 4 (RAG/retrieval) complete
EnjoyBacon7 Jun 24, 2026
47e360e
fix(security): bump Starlette to >=0.47.2 and FastAPI to >=0.116.1 (d…
EnjoyBacon7 Jun 24, 2026
e7fa78e
fix(security): add path-tiered request rate limiting (M6)
EnjoyBacon7 Jun 24, 2026
57a0606
docs(forward-port): Batch 2 (deps) complete; only deferred infra/docs…
EnjoyBacon7 Jun 24, 2026
9a7fda0
docs(usage): document Chainlit on CHAINLIT_PORT under Ray Serve
EnjoyBacon7 Jun 24, 2026
47c655e
docs: complete OIDC/SSO quick-start move into the docs site
EnjoyBacon7 Jun 24, 2026
abf6a1c
style: ruff-format the forward-ported security changes
EnjoyBacon7 Jun 24, 2026
75e8b2d
docs(forward-port): finalize — all batches + docs complete
EnjoyBacon7 Jun 24, 2026
257c4cc
Merge remote-tracking branch 'origin/refactor/hexagonal' into forward…
hedhoud Jun 25, 2026
21bf3ec
fix(forward-port): address review hardening gaps
hedhoud Jun 25, 2026
92babff
fix(partitions): enforce quota inside create transaction
hedhoud Jun 25, 2026
ef24f8e
fix(mcp): gate no-auth dev bypass explicitly
hedhoud Jun 25, 2026
e827d64
fix(forward-port): close remaining review gaps
hedhoud Jun 25, 2026
47f07a8
fix(forward-port): close auth and search review gaps
hedhoud Jun 25, 2026
60b006d
fix(forward-port): address remaining review feedback
hedhoud Jun 25, 2026
6d27e98
fix(indexing): tolerate partition auto-create races
hedhoud Jun 25, 2026
e717a12
fix: address compose and OIDC review findings
hedhoud Jun 25, 2026
e4314d6
fix(indexing): return 403 (not 500) when a create-race loser lacks ac…
EnjoyBacon7 Jun 25, 2026
18050a2
fix(chainlit): serve source downloads from standalone Chainlit app
EnjoyBacon7 Jun 25, 2026
dc8fdff
fix(compose): build openrag with local uid
hedhoud Jun 26, 2026
ca7028f
fix(compose): auto-fix bind-mount perms via root entrypoint, then dro…
Ahmath-Gadji Jun 26, 2026
9c7096f
fix(ray-serve): boot replicas cleanly — picklable app + atomic actor …
Ahmath-Gadji Jun 26, 2026
4b5e3c3
feat(auth): gate interactive API docs behind login under OIDC
Ahmath-Gadji Jun 26, 2026
8bea2be
fix(chainlit): init standalone app's DB pool under OIDC + Ray Serve
Ahmath-Gadji Jun 26, 2026
2 changes: 1 addition & 1 deletion CLAUDE.md
@@ -454,4 +454,4 @@ New table `oidc_sessions`:
- **OIDC mode**`401 {"detail": "Unauthenticated"}` (no usable session/bearer and the path isn't a UI redirect target).
- Programmatic access: Bearer `users.token` accepted in both modes

**See Also**: Full configuration and troubleshooting guide at `docs/oidc.md`.
**See Also**: Full configuration and troubleshooting guide at `docs/content/docs/documentation/oidc.md` (quick start: `docs/content/docs/documentation/sso-quickstart.md`).
112 changes: 101 additions & 11 deletions FORWARD_PORT_LOG.md
@@ -9,19 +9,109 @@ to the cutover re-implementation queue.

---

## Forward-ported (critical)
## `main` → `refactor/hexagonal` security forward-port (2026-06-24)

Working branch: `forward-port/main-to-hexagonal` (off `origin/refactor/hexagonal`).
Porting the 68 non-merge commits that landed on `main` after the
merge-base (`c9d53cc0`, 2026-05-21) — mostly a one-shot security-hardening audit
plus loader/OpenAI fixes, deploy hardening, deps and release chores. Each port is
one commit carrying the original subject/body + a `Forward-ported from <hash>`
trailer. Order chosen by the user: package security code first, infra non-root +
docs last.

### Ported

| source (main) | subject | target location(s) |
|---|---|---|
| `c1079d7f` | Ray dashboard → localhost (compose) | `infra/compose/docker-compose.yaml` |
| `8914dfbb` | non-root Ray container (H4) | `infra/docker/ray.Dockerfile` |
| `8914dfbb`+`6860341c`+`26a70dbc`+`c0847d8f` | non-root OpenShift app image (consolidated — later commits rewrote earlier) | `infra/docker/api.Dockerfile` |
| `645128dc` | no prod bind-mount / gate --reload (N8) | `infra/compose/docker-compose.yaml`, `infra/scripts/entrypoint.sh` |
| `0e5687bc` | remove API_NUM_WORKERS footgun | `infra/scripts/entrypoint.sh`, `infra/compose/.env.example`, `infra/charts/.../values.yaml`, `docs/.../env_vars.md` |
| `319bc5cd` | drop API_NUM_WORKERS from doc env assets | `docs/assets/env_example.env`, `env_linux_gpu.env` |
| `c558dd1f` / `d69be1da` | version bump → 1.1.12 / 1.1.13 | `pyproject.toml` |
| `2af1d76f` | harden Ansible deploy (#488) | `infra/ansible/ansible.cfg`, `playbooks/openrag.yml` |
| `34dc3a2f` | Ray dashboard localhost default (C3) | `openrag/api/main.py`, `openrag/api/mcp/server.py`, `infra/cluster.yaml`, `infra/quick_start/docker-compose.yaml`, `docs/assets/compose_ollama_cpu.yaml` |
| `aa015bdd` | external Ray cluster via RAY_ADDRESS | `openrag/api/main.py`, `openrag/api/mcp/server.py`, `.env.example`, docs |
| `0515f705` | require MinIO creds (H3) | `infra/compose/milvus/milvus.yaml`, `infra/quick_start/vdb/milvus.yaml`, `.env.example` |
| `0164b829` | remove weak DB/AUTH defaults (M2/M3) | `conf/config.yaml`, compose stacks, `docs/assets/compose_ollama_cpu.yaml`, `.env.example` |
| `b8002ef0` | drop seccomp:unconfined (N11) | `infra/compose/milvus/milvus.yaml` + `.named-volumes.yaml`, `infra/quick_start/vdb/milvus.yaml` |
| `24efaa66` | Helm DB password → Secret (N7) | `infra/charts/openrag-stack/values.yaml` |
| `c8f2d47f` | default-deny NetworkPolicy (N12) | `infra/charts/.../templates/networkpolicy.yaml` (new), `values.yaml` |
| `5caec50b` | restrict metrics exposure (N9) | `infra/compose/monitoring.docker-compose.yaml` |
| `d7fc3130` | pin image tags (N10) | `infra/charts/.../values.yaml` (openrag-owned → 1.1.13), `infra/compose/monitoring.docker-compose.yaml` (third-party pins). Compose app-image pins skipped (reverted by 64c3e722). |
| `7bc46696` | require ALLOW_NO_AUTH for no-token admin bypass | `openrag/api/middleware/auth.py`, `.env.example`, `tests/unit/api/middleware/test_bypass_config.py` |
| `202433d7` | update_user mass-assignment whitelist | `openrag/core/models/user.py` (extra="ignore"), `openrag/services/orchestrators/user_service.py` (whitelist), tests |
| `edd2c7ce` | external_user_id empty→NULL (#121) | `openrag/core/models/user.py` (validator; covers update path the repo missed) |
| `229503b4` | no session token in UI file URLs (N13) | `openrag/app_front.py` |
| `97c624ef` | back-channel logout exp/jti + replay (M9) | `services/auth/oidc_client.py`, `services/orchestrators/auth_service.py` (+ replay test) |
| `2b34a0d1` | clock-skew leeway + nbf (crypto) | `services/auth/oidc_client.py` |
| `c2fde135` | logout CSRF Fetch-Metadata guard (N3) | `api/routers/auth/oidc.py` (+ tests) |
| `9a73200a` | revoke OIDC sessions on token regen (#361, #486) | `services/orchestrators/{auth_service,user_service}.py` (startup-rotation guard + revoke_by_user already present) |
| `714f2a84` | streaming finish_reason not on content chunk | `core/utils/source_filtering.py` (+ test) |
| `0bc6157e` | stop logging raw query text (#481) | `api/routers/user/search.py` (query_len) |
| `52be26f1` | non-empty RAG answer body | `openrag/prompts/templates/sys_prompt_tmpl.txt` |
| `6bc898e9` | surrounding-chunk partition scope (N6) | `services/storage/vector_store_searcher.py` (+ tests) |
| `86c9b51d` | ensure_partition_role fail-open → 404 | `api/dependencies/auth.py` (+ tests) |
| `bf4ae134` | Milvus filter scope-escape via precedence | `services/storage/milvus_store.py` `_build_filter_expr` (paren-wrap multi-part) (+ tests) |
| `f079efa5` | validate file_id / partition allowlist | `core/indexing/validators.py`, `api/dependencies/auth.py`, `partition_service.py`, `api/routers/{user/search,admin/partitions}.py` (+ tests). Defense-in-depth atop `_format_value` escaping. |
| `db92875d` | strip client llm_override endpoint/creds | `services/inference/vllm_client.py` `_resolve_overrides`, `api/schemas/user/chat.py` (+ tests) |
| `e3c7eac2` + `81bccf08` | control-token neutralizer (H8, #487) | `core/utils/text.py` `neutralize_prompt_control_tokens`, `core/prompts/chat_prompt_builder.py` (+ tests) |
| `818d5446` | stop leaking stack traces / FS paths (M7) | `api/routers/admin/indexing.py` (generic save error; admin-gated traceback) |
| `54165900` | token limit in RAG mode + bound n/best_of (M12) | `api/routers/user/chat.py`, `api/schemas/user/chat.py` (+ tests) |
| `d66cf029` | cap partitions per non-admin user (M13) | `services/orchestrators/partition_service.py`, `api/routers/admin/partitions.py`, `.env.example` (+ tests) |
| `8ecbc781` | web-search SSRF/MITM deltas (verify_ssl default True + DNS-resolution guard hook) | `services/websearch/content_fetcher.py` (+ tests). The refactor already had per-hop redirect revalidation (#383). |
| `8ea723ca` | explicit cairosvg `unsafe=False` (SSRF/XXE) | `core/indexing/parsers/image_parser.py` |
| `221f8ed8` | EML attachment fan-out cap (M8) | `core/indexing/parsers/eml_parser.py` (+ test). eml-depth already bounded by the dispatcher; docx/pptx/pdf caps N/A (docling/marker delegate). |
| `67ec4199` | authorize source-file downloads by partition | new `api/routers/user/download.py` (replaces the open `/static` mount), `api/main.py`, `chat.py`/`source_links.py` rekeyed to chunk id (+ tests) |
| `761f47a0` | validate copy-endpoint source_file_id (#477) | `api/routers/admin/indexing.py` + `docs/assets/compose_linux_gpu.yaml` |
| `74de8232` | Starlette>=0.47.2 / FastAPI>=0.116.1 (CVE-2025-54121) | `pyproject.toml` + `uv.lock` regen (starlette 0.46.2->0.47.3, fastapi->0.116.2, chainlit->2.11.1) |
| `0e6e7836` + `4d8bca01` | path-tiered rate limiting (M6) | new `api/middleware/rate_limit.py` (registered before AuthMiddleware), `limits>=3.6` dep, `.env.example`, API-test compose disable (+ tests). Refactor never had slowapi, so the 4d8bca01 swap is folded in. |
| `701fcf9e` | document Chainlit on CHAINLIT_PORT under Ray Serve | `docs/.../getting_started/usage.mdx` |
| `8849fe7d` | OIDC/SSO docs move deltas (frontmatter + links + README/CLAUDE pointers) | the file move was already done in the refactor; ported the remaining deltas |
| (final) | ruff-format the ported lines | `style:` commit, mirrors main's 355a6305 / ee86fd47 |
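
The `bf4ae134` row above describes paren-wrapping multi-part filters so that a caller-supplied `or` cannot escape the partition scope through operator precedence (`and` binds tighter than `or`). The following is a minimal sketch of that idea only, not the project's `_build_filter_expr`: the function name `build_scoped_filter` and the field name `partition` are hypothetical, and value escaping is assumed to happen elsewhere.

```python
from typing import List


def build_scoped_filter(partition: str, user_filters: List[str]) -> str:
    """Join a partition scope with caller-supplied clauses.

    Every part is wrapped in parentheses when there is more than one, so an
    `or` inside a clause cannot bind more loosely than the joining `and`.
    """
    # Hypothetical field name; the partition value is assumed to be validated/escaped already.
    scope = f'partition == "{partition}"'
    clauses = [clause.strip() for clause in user_filters if clause.strip()]
    parts = [scope] + clauses
    if len(parts) == 1:
        return parts[0]
    return " and ".join(f"({part})" for part in parts)


# Naive concatenation: the trailing `or` applies to every partition.
naive = 'partition == "team_a" and file_id == "x" or file_id != ""'

# Wrapped: the `or` stays inside its own clause, under the partition scope.
scoped = build_scoped_filter("team_a", ['file_id == "x" or file_id != ""'])
print(scoped)
```

The single-part case is returned unwrapped, which keeps simple filters unchanged while still protecting every multi-part expression.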

### Skipped (already present / superseded on refactor)

_Security fixes, data-loss bugs, production outages re-implemented against
the new architecture. Each entry pairs the dev commit with the refactor
commit so a reviewer can audit equivalence._
| source (main) | reason |
|---|---|
| `f9e0c394` | /chainlit path-boundary anchor already in `api/middleware/auth.py` |
| `9b82996c` | azp on multi-aud tokens already in `services/auth/oidc_client.py` |
| `64c3e722` | compose `:latest` tags already the refactor state |
| `f9b8a776` | de-flake `test_cancel_*` superseded by refactor `cca5f415` |
| `47b8cd32` | refactor already fail-fasts on missing CHAINLIT_AUTH_SECRET (stricter, #380/`b63a9825`, with a regression test). 47b8cd32 only re-adds an ALLOW_NO_AUTH-gated default-secret fallback — a loosening intentionally NOT ported. |
| `73acb1c9` | both halves already present: `sanitize_next_url` rejects CR/LF/NUL + the `/\` protocol-relative vector (#360 regression test), and the callback already verifies userinfo.sub == id_token.sub (OIDC Core §5.3.2). |
| `cdb3edc9` | test-only follow-up to M9 on main's `test_oidc_client.py` (no equivalent file); subsumed by the new replay test which carries exp. |
| `199424bf` | empty-stream 502 (#363): obviated by the refactor architecture — non-streaming uses `self._llm.chat()`/`.generate()` (materialized dicts, not a stream's first chunk), and the inference client already raises `InferenceError(status_code=502)` on invalid upstream responses. No `__anext__`/`StopAsyncIteration` path remains. |
| `70a2db36` | CustomDocLoader page accumulation (#376): obviated — the parser-shim removal deleted `CustomDocLoader`; `.doc` now goes through the docling/marker workers, which produce whole-document markdown and have no single-page-overwrite bug. |
| `63a857af` | image-URL SSRF on captioning (H2): obviated — the refactor captions extracted image *bytes* (`vlm.caption_image(image_bytes)`); a document's remote `![](http://…)` becomes an `ImageBlock(source_url=…)` with empty bytes that nothing fetches, and the URL is never forwarded to the VLM. No SSRF gadget. (`image_captioning_url` is a dormant unread knob.) The SVG sub-fix is ported separately as `8ea723ca`. |
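
The `73acb1c9` row above says `sanitize_next_url` rejects CR/LF/NUL and the `/\` protocol-relative vector. As a rough illustration of what such checks can look like, here is a sketch written from that description alone; it reuses the name but is not the refactor's implementation, and the `default` parameter is an assumption.

```python
from urllib.parse import urlsplit


def sanitize_next_url(next_url: str, default: str = "/") -> str:
    """Return next_url only if it is a same-site absolute path, else default."""
    # Control characters enable header injection or parser confusion.
    if any(ch in next_url for ch in ("\r", "\n", "\x00")):
        return default
    # Only site-relative absolute paths are accepted.
    if not next_url.startswith("/"):
        return default
    # "//host" is protocol-relative; "/\host" is normalised to it by some browsers.
    if next_url.startswith("//") or next_url.startswith("/\\"):
        return default
    # Belt and braces: anything that still parses with a scheme or host is rejected.
    parts = urlsplit(next_url)
    if parts.scheme or parts.netloc:
        return default
    return next_url
```

Falling back to a fixed default rather than raising keeps the login flow working even when the `next` parameter has been tampered with.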

(none yet)
### Remaining (TODO — not yet ported)

## Deferred to cutover (features)
**Batches 1 (infra core), 2 (deps), 3 (auth/OIDC), 4 (RAG/retrieval), docs, final
ruff pass: ✅ COMPLETE.** All security-relevant package code, dependencies and
docs are ported or obviated. Targeted security/infra tests, ruff and diff checks
are clean on the forward-port branch.

**Not ported — one item, by deliberate choice:**
- `4bbefd41` **compose non-root rework** (init-perms bootstrap container + per-service
`user:` mappings). This is all-or-nothing: the per-service `user:` mappings only work
with the init-perms container that pre-creates and chowns the bind-mount dirs, and those
paths must be mapped exactly onto the refactor's `infra/compose/` layout (`../../data`,
`../../logs`, the `milvus/milvus.yaml` include, `../../extern/reranker/*`, the
named-volumes variant). It is **runtime-only** (no unit-test feedback; compose isn't
exercised in CI) and is the least security-critical item — the images already *build*
non-root via the ported Dockerfiles. Committing it on faith risks a silently broken
stack, so it is left for a follow-up that can `docker compose up` to verify init-perms
ownership and non-root writes. Target files: `infra/compose/docker-compose.yaml`,
`infra/compose/milvus/milvus.yaml`, `extern/reranker/*.yaml`, `infra/quick_start`.
- `563907ad` (doc comment trim) — cosmetic; the ported comments are already condensed and
differ from main's verbose originals, so the trim doesn't map cleanly. Skipped.

---

## Forward-ported (critical)

_Non-critical changes that landed on `dev` during MODE 2. These will be
re-implemented directly in the new architecture during MODE 3 (Phases
10-12) or post-cutover. List the dev PR number / commit and the target
location in the new layout._
_(legacy template section — superseded by the dated section above.)_

(none yet)
(none recorded under the original Phase 5–9 process)
2 changes: 1 addition & 1 deletion README.md
@@ -217,7 +217,7 @@ OpenRag supports two authentication modes:

To enable OIDC, set `AUTH_MODE=oidc` and configure the required OIDC variables (see [`infra/compose/.env.example`](./infra/compose/.env.example) for the full list).

For comprehensive OIDC setup and configuration, see the [OIDC Authentication Guide](./docs/oidc.md).
For comprehensive OIDC setup and configuration, see the [OIDC Authentication Guide](./docs/content/docs/documentation/oidc.md) (or the [SSO Quick Start](./docs/content/docs/documentation/sso-quickstart.md) for a faster path).

3. `http://localhost:INDEXERUI_PORT` to access the indexer ui for easy document ingestion, indexing, and management

4 changes: 3 additions & 1 deletion conf/config.yaml
@@ -74,7 +74,9 @@ rdb:
host: rdb
port: 5432
user: root
password: "root_password"
# No default: supply via the POSTGRES_PASSWORD env var. Shipping a known
# password would be a usable default credential on any exposed deployment.
password: ""
default_file_quota: -1
# Leave unset to derive partitions_for_collection_<VDB_COLLECTION_NAME>.
database: null
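
With the default password now empty, something has to reject a missing `POSTGRES_PASSWORD` before a connection is attempted. The diff does not show how the application does this; the sketch below is one possible fail-fast check, with the function name `resolve_db_password` and the error message both hypothetical.

```python
import os
from typing import Mapping, Optional


def resolve_db_password(configured: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Prefer POSTGRES_PASSWORD from the environment; refuse an empty result."""
    env = os.environ if env is None else env
    password = env.get("POSTGRES_PASSWORD") or configured
    if not password:
        # Fail at startup instead of connecting with an empty credential.
        raise RuntimeError("POSTGRES_PASSWORD is not set and no password is configured")
    return password
```

Passing the environment as a parameter is only a testing convenience; calling `resolve_db_password("")` reads `os.environ` directly.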
2 changes: 1 addition & 1 deletion docs/assets/compose_linux_gpu.yaml
@@ -16,7 +16,7 @@ x-openrag: &openrag_template
- ./ray_mount/logs:/app/logs
ports:
- ${APP_PORT:-8080}:${APP_iPORT:-8080}
- ${RAY_DASHBOARD_PORT:-8265}:8265 # Disable when in cluster mode
- 127.0.0.1:${RAY_DASHBOARD_PORT:-8265}:8265 # Localhost only: Ray dashboard/Jobs API is unauthenticated. Disable when in cluster mode
networks:
default:
aliases:
13 changes: 8 additions & 5 deletions docs/assets/compose_ollama_cpu.yaml
@@ -7,7 +7,8 @@ x-openrag: &openrag_template
- ./ray_mount/logs:/app/logs
ports:
- 8090:8080
- 8265:8265 # Disable when in cluster mode
# Localhost only: Ray dashboard/Jobs API is unauthenticated (CVE-2023-48022). Disable when in cluster mode
- 127.0.0.1:${RAY_DASHBOARD_PORT:-8265}:8265
networks:
default:
aliases:
@@ -16,7 +17,7 @@ x-openrag: &openrag_template
- .env
environment:
- APP_PORT=8090
- AUTH_TOKEN=OpenRAG
- AUTH_TOKEN=${AUTH_TOKEN:?Set a strong AUTH_TOKEN in your .env}
- RERANKER_ENABLED=false
- MARKER_MAX_PROCESSES=1
- INDEXERUI_COMPOSE_FILE=true # Does not serve any purpose but needs to be enabled until PR is merged
@@ -38,7 +39,7 @@ services:
rdb:
image: postgres:15
environment:
- POSTGRES_PASSWORD=root
- POSTGRES_PASSWORD=${POSTGRES_PASSWORD:?Set POSTGRES_PASSWORD in your .env}
- POSTGRES_USER=root
volumes:
- ./db:/var/lib/postgresql/data
@@ -72,8 +73,8 @@ services:
minio:
image: minio/minio:RELEASE.2023-03-20T20-16-18Z
environment:
MINIO_ACCESS_KEY: minioadmin
MINIO_SECRET_KEY: minioadmin
MINIO_ACCESS_KEY: ${MINIO_ACCESS_KEY:?Set MINIO_ACCESS_KEY in your .env}
MINIO_SECRET_KEY: ${MINIO_SECRET_KEY:?Set MINIO_SECRET_KEY in your .env}
volumes:
- ./volumes/minio:/minio_data
command: minio server /minio_data --console-address ":9001"
@@ -91,6 +92,8 @@ services:
environment:
ETCD_ENDPOINTS: etcd:2379
MINIO_ADDRESS: minio:9000
MINIO_ACCESS_KEY_ID: ${MINIO_ACCESS_KEY:?Set MINIO_ACCESS_KEY in your .env}
MINIO_SECRET_ACCESS_KEY: ${MINIO_SECRET_KEY:?Set MINIO_SECRET_KEY in your .env}
volumes:
- ./volumes/milvus:/var/lib/milvus
healthcheck:
9 changes: 8 additions & 1 deletion docs/assets/env_example.env
@@ -10,7 +10,9 @@ VLM_MODEL=

## FastAPI App (no need to change it)
# APP_PORT=8080 # this is the forwarded port
# API_NUM_WORKERS=1 # Number of uvicorn workers for the FastAPI app
# The uvicorn path runs a single worker by design (Ray provides concurrency).
# To scale the HTTP layer, use Ray Serve: ENABLE_RAY_SERVE=true with
# RAY_SERVE_NUM_REPLICAS=N (see the Ray Serve configuration section).

## To enable API HTTP authentication via HTTPBearer
# AUTH_TOKEN=sk-openrag-1234
Expand Down Expand Up @@ -42,6 +44,11 @@ RAY_DEDUP_LOGS=0 # turns off ray log deduplication that appear across multiple p
RAY_ENABLE_RECORD_ACTOR_TASK_LOGGING=1 # # to enable logs at task level in ray dashboard
RAY_task_retry_delay_ms=3000
RAY_ENABLE_UV_RUN_RUNTIME_ENV=0 # critical with the newest version of UV
# Attach to an external Ray cluster instead of starting an embedded one (disables the local dashboard).
# RAY_ADDRESS=ray://X.X.X.X:10001
# Interface the embedded Ray dashboard binds to. Defaults to 127.0.0.1 (loopback) because the
# dashboard/job API is unauthenticated (CVE-2023-48022). Set 0.0.0.0 only behind a firewall/auth proxy.
# RAY_DASHBOARD_HOST=127.0.0.1
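
The two settings above interact: an external `RAY_ADDRESS` means no embedded dashboard is started, while the embedded path binds the dashboard to loopback unless `RAY_DASHBOARD_HOST` overrides it. A minimal sketch of that decision follows, assuming the values are passed to `ray.init` as its `address` and `dashboard_host` keyword arguments. The helper name `ray_init_kwargs` is hypothetical, and how `openrag/api/main.py` wires this is not shown in the diff.

```python
import os
from typing import Any, Dict, Mapping, Optional


def ray_init_kwargs(env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Build keyword arguments for ray.init() from the environment."""
    env = os.environ if env is None else env
    address = env.get("RAY_ADDRESS", "").strip()
    if address:
        # External cluster: this process does not start a local dashboard.
        return {"address": address}
    # Embedded cluster: loopback by default because the dashboard is unauthenticated.
    return {"dashboard_host": env.get("RAY_DASHBOARD_HOST", "127.0.0.1")}
```

A caller would then run `ray.init(**ray_init_kwargs())`; the helper itself does not import Ray, so it can be checked in isolation.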

# Indexer UI
## 1. replace X.X.X.X with localhost if launching local or with your server IP
9 changes: 8 additions & 1 deletion docs/assets/env_linux_gpu.env
@@ -10,7 +10,9 @@ VLM_MODEL=

## FastAPI App (no need to change it)
# APP_PORT=8080 # this is the forwarded port
# API_NUM_WORKERS=1 # Number of uvicorn workers for the FastAPI app
# The uvicorn path runs a single worker by design (Ray provides concurrency).
# To scale the HTTP layer, use Ray Serve: ENABLE_RAY_SERVE=true with
# RAY_SERVE_NUM_REPLICAS=N (see the Ray Serve configuration section).

## To enable API HTTP authentication via HTTPBearer
# AUTH_TOKEN=sk-openrag-1234
Expand Down Expand Up @@ -38,6 +40,11 @@ RAY_DEDUP_LOGS=0 # turns off ray log deduplication that appear across multiple p
RAY_ENABLE_RECORD_ACTOR_TASK_LOGGING=1 # # to enable logs at task level in ray dashboard
RAY_task_retry_delay_ms=3000
RAY_ENABLE_UV_RUN_RUNTIME_ENV=0 # critical with the newest version of UV
# Attach to an external Ray cluster instead of starting an embedded one (disables the local dashboard).
# RAY_ADDRESS=ray://X.X.X.X:10001
# Interface the embedded Ray dashboard binds to. Defaults to 127.0.0.1 (loopback) because the
# dashboard/job API is unauthenticated (CVE-2023-48022). Set 0.0.0.0 only behind a firewall/auth proxy.
# RAY_DASHBOARD_HOST=127.0.0.1

# Indexer UI
## 1. replace X.X.X.X with localhost if launching local or with your server IP
6 changes: 3 additions & 3 deletions docs/content/docs/documentation/API.mdx
@@ -7,13 +7,13 @@ The FastAPI-powered backend provides a comprehensive document-based question ans

## 🔐 Authentication

All endpoints require authentication when **enabled** (by adding a authorization token `AUTH_TOKEN` in your **`.env`**). Include your **`AUTH_TOKEN`** in the HTTP request header:
Protected endpoints require authentication by default. Set `AUTH_TOKEN` in your `.env` and include it in the HTTP request header:

```http
Authorization: Bearer YOUR_AUTH_TOKEN
```

For OpenAI-compatible endpoints, `AUTH_TOKEN` serves as the `api_key` parameter. Use a placeholder like `'sk-1234'` when authentication is disabled (necessary for when using OpenAI client).
For OpenAI-compatible endpoints, `AUTH_TOKEN` serves as the `api_key` parameter. Local no-auth development requires the explicit `ALLOW_NO_AUTH=true` opt-in; otherwise an empty token fails closed.
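
To make "fails closed" concrete, the decision can be pictured as below. This is only a sketch of the rule as the paragraph states it, not the code in `openrag/api/middleware/auth.py`; the function name `request_allowed` and its parameters are hypothetical.

```python
import hmac
from typing import Optional


def request_allowed(auth_token: str, allow_no_auth: bool, presented: Optional[str]) -> bool:
    """Decide whether a request may proceed under static-token auth."""
    if not auth_token:
        # No token configured: only the explicit opt-in disables authentication.
        return allow_no_auth
    if presented is None:
        return False
    # Constant-time comparison avoids leaking how much of the token matched.
    return hmac.compare_digest(presented.encode(), auth_token.encode())
```

Under this rule, an empty `AUTH_TOKEN` without `ALLOW_NO_AUTH=true` rejects every request instead of silently granting access.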

---

@@ -586,7 +586,7 @@ from openai import OpenAI, AsyncOpenAI
api_base_url = "http://localhost:8080" # fastapi base url of 'openrag'
base_url = f"{api_base_url}/v1"

auth_key = ... # your api authentification key AUTH_TOKEN in your .env. Is authentification is disabled, use a placeholder like 'sk-1234'
auth_key = ... # your API authentication key AUTH_TOKEN from .env
client = OpenAI(api_key=auth_key, base_url=base_url)

your_partition= 'my_partition' # name of your partition