16 changes: 14 additions & 2 deletions docs/docs.json
@@ -85,7 +85,13 @@
       "next/using-iii/engine",
       "next/using-iii/console",
       "next/using-iii/cli",
-      "next/using-iii/deployment"
+      {
+        "group": "Deployment",
+        "pages": [
+          "next/using-iii/deployment",
+          "next/using-iii/deploy-railway"
+        ]
+      }
     ]
   },
   {
@@ -188,7 +194,13 @@
       "using-iii/engine",
       "using-iii/console",
       "using-iii/cli",
-      "using-iii/deployment"
+      {
+        "group": "Deployment",
+        "pages": [
+          "using-iii/deployment",
+          "using-iii/deploy-railway"
+        ]
+      }
     ]
   },
   {
308 changes: 308 additions & 0 deletions docs/next/using-iii/deploy-railway.mdx
@@ -0,0 +1,308 @@
---
title: "Deploy to Railway"
description: "Run the iii engine and your workers on Railway with a clean, reusable base image."
owner: "devrel"
type: "how-to"
---

This guide deploys the iii engine and your workers to [Railway](https://railway.com).
The approach is the same one used everywhere else (see
[Self-hosted deployment](./deployment)): ship a **clean engine image** and let your
`config.yaml` decide what runs. The base image contains **no** workers. You add
capabilities by declaring them, and the engine provisions them. One image stays
reusable across every app, and adding an integration is another config entry.

Railway differs from a bare-host deploy in three ways that shape the rest of this
guide: its private network is IPv6-only, durable state persists on a mounted
volume, and its edge terminates TLS so you do not run your own reverse proxy.

## A base image worth reusing

Build a base that contains the engine **and** the `iii-worker` daemon (the
process that runs add-on workers), but nothing app-specific. The pre-built
distroless `iiidev/iii:latest` is engine-only and can't be extended (no shell),
so for any deployment that uses registry workers, build a small base from the
install script instead. Railway builds the image from the
[Dockerfile](https://docs.railway.com/builds/dockerfiles) in your repo:

```dockerfile
# Dockerfile: clean iii base. Contains NO workers.
FROM debian:bookworm-slim

# curl, ca-certificates, and jq drive the installer. libssl3 and libcap2 are
# engine runtime deps; libcap-ng0 provides libcap-ng.so.0, which the iii-worker
# daemon needs to launch binary registry workers (for example `database`).
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl ca-certificates jq libssl3 libcap2 libcap-ng0 \
    && rm -rf /var/lib/apt/lists/*

# Installs both `iii` (engine) and `iii-worker` (the add-on worker daemon).
RUN curl -fsSL https://install.iii.dev/iii/main/install.sh | sh
ENV PATH="/root/.local/bin:${PATH}"

WORKDIR /app
COPY config.yaml /app/config.yaml

EXPOSE 49134 3111 3112
CMD ["iii", "--config", "/app/config.yaml"]
```

<Note>
A ready-to-fork starter with this `Dockerfile`, a `config.yaml`, and a
`railway.json` (Dockerfile builder + restart policy) lives at
[iii-experimental/railway-template](https://github.com/iii-experimental/railway-template).
</Note>

## A Railway-ready config.yaml

The only opinionated file is `config.yaml`. Two things make it Railway-ready:
**bind to `[::]`** (Railway private networking is IPv6-only) and **put all
durable state under `/data`** (a Railway volume):

```yaml
workers:
  # The first manager entry is the engine WS port. Bind [::] so worker
  # services can reach it over Railway's IPv6 private network.
  - name: iii-worker-manager
    config: { host: "[::]", port: 49134 }

  - name: iii-http
    config:
      host: "[::]"
      port: 3111
      cors:
        allowed_origins: ["*"] # tighten to your domains in production
  - name: iii-stream
    config:
      host: "[::]"
      port: 3112
      adapter: { name: kv, config: { store_method: file_based, file_path: /data/stream_store } }

  # Review comment (Contributor): Can we keep YAML, for coherence?

  - name: iii-state
    config:
      adapter: { name: kv, config: { store_method: file_based, file_path: /data/state_store.db } }
  - name: iii-queue
    config:
      adapter: { name: builtin, config: { store_method: file_based, file_path: /data/queue_store } }
  - name: configuration
    config:
      adapter: { name: fs, config: { directory: /data/configuration } }
```
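
The two Railway-ready rules above (bind `[::]`, keep state under `/data`) can be checked mechanically before a deploy. The following is a minimal sketch in Python, not part of any iii tooling: it assumes the config has already been parsed into a plain dict (for example by a YAML parser), and the helper name `railway_config_problems` is hypothetical. It only looks at `host`, `file_path`, and `directory` keys, so values such as a `sqlite:` URL are not covered.

```python
from typing import Any, Iterator

# Keys that hold on-disk locations in the sample config above.
PATH_KEYS = {"file_path", "directory"}


def _walk(node: Any) -> Iterator[tuple[str, Any]]:
    """Yield every (key, value) pair found in nested dicts and lists."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield key, value
            yield from _walk(value)
    elif isinstance(node, list):
        for item in node:
            yield from _walk(item)


def railway_config_problems(config: dict, data_root: str = "/data") -> list[str]:
    """Return human-readable problems; an empty list means both rules hold."""
    problems = []
    for worker in config.get("workers", []):
        name = worker.get("name", "<unnamed>")
        for key, value in _walk(worker.get("config") or {}):
            if key == "host" and value != "[::]":
                problems.append(f"{name}: host {value!r} is not '[::]'")
            if key in PATH_KEYS and not str(value).startswith(data_root + "/"):
                problems.append(f"{name}: {key} {value!r} is outside {data_root}")
    return problems
```

Running it against the parsed sample config should return an empty list; a worker bound to `127.0.0.1` or writing to `/tmp` would be reported by name.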

## Add workers by declaring them

Each entry above is a worker. That is all there is in iii: everything is a
worker, and you run one by **declaring it** in `config.yaml`. Nothing is added
to the image itself. If a worker you declare isn't present locally, the
engine fetches it from the registry on boot. For example, add a SQLite-backed
database with nothing in the image:

```yaml
  - name: database
    config:
      databases:
        primary:
          url: sqlite:/data/iii.db
```

The engine logs `Worker 'database' not found locally, checking registry...`, then
runs the worker as a plain child process and registers `database::query`,
`database::execute`, and the rest, with no image rebuild. Add any worker the same
way: a `- name: <worker>` entry plus its config schema (see the worker's page on
[workers.iii.dev](https://workers.iii.dev)). Supply secrets (database URLs,
S3/R2 credentials) from Railway service variables, covered below.

<Note>
Auto-provisioning downloads the worker on first boot, which adds cold-start
time and needs registry access at runtime. For reproducible, fast starts in
production, **pin it into the image**: add `RUN iii worker add <name>` to your
Dockerfile. The base stays generic; the build step is your version lock.
</Note>

## Connect the services

Review comment (Contributor): Should we give readers explicit steps to create the Railway services, rather than only describing them?

A typical app is the engine plus one or more worker services that connect to it
over the private network:

| Service | What it is | Key setting |
| --- | --- | --- |
| `engine` | the base image above + your `config.yaml` | volume mounted at `/data`; `RAILWAY_RUN_UID=0` so a non-root image can write it; public domain on port `3111` |
| `api-worker` (and friends) | your Node/Python/Rust worker code | `III_URL=ws://engine.railway.internal:49134` |

Worker services stay private (no public domain) and reach the engine over
`engine.railway.internal`, Railway's internal DNS name for the engine service.
Railway [private networking](https://docs.railway.com/private-networking) is
IPv6-only, which is why the engine manager binds `[::]`. A worker that binds or
dials `127.0.0.1` will not find the engine across services; always use the
`.railway.internal` hostname.
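
A worker service can guard against the loopback mistake at startup. This is a minimal sketch, assuming the worker reads its engine address from `III_URL` as in the table above; the function name `iii_url_problems` is hypothetical, and accepting `wss` alongside `ws` is an assumption of the sketch rather than something the guide specifies.

```python
import ipaddress
from urllib.parse import urlsplit


def _is_loopback(host: str) -> bool:
    """True for literal loopback addresses such as 127.0.0.1 or ::1."""
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:  # not an IP literal, so it is a hostname
        return False


def iii_url_problems(url: str) -> list[str]:
    """Return reasons why `url` is unlikely to reach the engine across services."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    problems = []
    if parts.scheme not in ("ws", "wss"):  # assumption: WebSocket schemes only
        problems.append(f"scheme {parts.scheme!r} is not a WebSocket scheme")
    if host == "localhost" or _is_loopback(host):
        problems.append(f"{host!r} is loopback and stays inside this container")
    elif not host.endswith(".railway.internal"):
        problems.append(f"{host!r} is not a .railway.internal hostname")
    return problems
```

Calling `iii_url_problems(os.environ["III_URL"])` before connecting and logging any returned messages turns a silent connection failure into an explicit configuration error.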

<Note>
Set the engine service's
[restart policy](https://docs.railway.com/deployments/restarts) to **`ALWAYS`**. The
engine exits cleanly (exit code 0) when it reloads `config.yaml`, and Railway's default
`ON_FAILURE` policy only restarts on a non-zero exit, so a clean exit would leave the
engine stopped. The `railway.json` in the
[starter template](https://github.com/iii-experimental/railway-template) is the place to
set this.
</Note>
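
The reasoning in the note can be written as a small decision function. This is a simplified model of the behavior described above, not Railway's implementation: it covers only the two policies the guide names and ignores details such as retry limits.

```python
def should_restart(policy: str, exit_code: int) -> bool:
    """Model of the restart decision for the two policies named in this guide."""
    if policy == "ALWAYS":
        return True  # restart regardless of how the process exited
    if policy == "ON_FAILURE":
        return exit_code != 0  # a clean exit (code 0) is not treated as a failure
    raise ValueError(f"policy {policy!r} is not modeled in this sketch")


# The engine exits with code 0 after reloading config.yaml:
engine_exit_code = 0
stays_down = not should_restart("ON_FAILURE", engine_exit_code)
comes_back = should_restart("ALWAYS", engine_exit_code)
```

With exit code 0, `stays_down` and `comes_back` are both true, which is the case the note warns about.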

## Public domain, TLS, and routing

Review comment (Contributor): It's not clear!

Railway's edge is your reverse proxy. It terminates TLS and forwards one public
domain to one port on one service, so you do not run Caddy or nginx yourself
(contrast the bare-host [Hardening](./deployment#hardening) section). See Railway's
[public networking](https://docs.railway.com/public-networking) guide for the
domain and `PORT` details.

- Attach the public domain to the **engine** service and target port **3111**,
the port of the `iii-http` worker that serves your registered HTTP routes.
Railway issues and renews the certificate, so the domain serves HTTPS with no
extra config.
- A custom domain works the same way: add it on the engine service and point your
DNS `CNAME` at the target Railway shows you.
- Keep worker services private. They have no domain and are reachable only on the
internal network.
- If an external client needs the raw engine WebSocket port (`49134`), expose it
with a Railway [TCP proxy](https://docs.railway.com/networking/tcp-proxy).
Inside Railway, prefer the private network.
- Health check (optional): Railway's default check confirms the port is
listening. For an application-level check, set the service
[healthcheck](https://docs.railway.com/deployments/healthchecks) path to a
route your `iii-http` worker serves.

## Secrets and environment

Supply every credential through Railway
[service variables](https://docs.railway.com/variables), never the image or git.
Reference them in `config.yaml` with `${VAR}` placeholders
(`${VAR:-default}` for a fallback); the engine substitutes them at boot:

```yaml
  - name: database
    config:
      databases:
        primary:
          url: ${DATABASE_URL} # set DATABASE_URL on the engine service
```

Shared variables let several services read one value, which is useful when a
worker and the engine both need the same connection string. After you change a
variable, redeploy (or restart) the service so the engine picks it up.
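
To make the placeholder semantics concrete, here is a minimal Python sketch of `${VAR}` and `${VAR:-default}` expansion. It illustrates the behavior this section describes and is not the engine's code. Two details are assumptions of the sketch: the default applies only when the variable is absent from the environment, and a missing variable without a default raises an error. The function name `expand_placeholders` is hypothetical.

```python
import re
from typing import Mapping

# ${NAME} or ${NAME:-default}; group 2 is None when no default is given.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def expand_placeholders(text: str, env: Mapping[str, str]) -> str:
    """Replace every placeholder in `text` with its value from `env`."""

    def replace(match: re.Match) -> str:
        name, default = match.group(1), match.group(2)
        if name in env:
            return env[name]
        if default is not None:
            return default
        # Assumption: fail loudly rather than substitute an empty string.
        raise KeyError(f"{name} is not set and has no default")

    return _PLACEHOLDER.sub(replace, text)
```

Passing `os.environ` as `env` mirrors how a service variable set in Railway would reach the config at boot.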

## Stateful workers and object storage

Attach a Railway [volume](https://docs.railway.com/volumes) to the engine
service, mount it at `/data`, and keep every worker's persistence path under it
(`sqlite:/data/iii.db`, `/data/queue_store`, `/data/state_store.db`). Railway
mounts volumes as `root`, so a non-root image needs `RAILWAY_RUN_UID=0` to write
the volume.

For object storage, use the `storage` worker's **remote providers**, which need
no local disk:

```yaml
  - name: storage
    config:
      buckets:
        uploads: { provider: s3, bucket: my-bucket, region: us-east-1 }
        avatars: { provider: r2, bucket: avatars, account_id: ${R2_ACCOUNT_ID},
                   access_key_id: ${R2_ACCESS_KEY_ID}, secret_access_key: ${R2_SECRET_ACCESS_KEY} }
```

<Note>
The `storage` worker's `local` provider runs a `rustfs` sidecar that does not
reach a healthy state inside Railway's container (a worker-internal lifecycle
issue, not a config one). On Railway, use a remote provider (`s3`, `gcs`, `r2`)
for object storage.
</Note>

## Scaling

- **More workers**: add another service per worker (or per language runtime).
They all dial the same `engine.railway.internal:49134`.
- **External adapters**: swap the `file_based`/`builtin` adapters for Redis and
RabbitMQ when you outgrow single-instance file storage. See
[Scale out with Redis and RabbitMQ](./deployment#scale-out-with-redis-and-rabbitmq).
Add those as Railway services (or use a managed add-on) and point the
adapter config at their private hostnames.
- **Object storage**: the `storage` worker's remote providers (`s3`, `gcs`,
`r2`) are the durable, scalable path on Railway; supply credentials from
service variables.

## What cannot run on Railway

Railway containers do **not** expose `/dev/kvm`. Any worker that boots a
micro-VM therefore cannot run there:

- `iii-sandbox` and any OCI/image (managed) worker. They boot guests via
libkrun. See [Engine-managed workers (micro-VMs)](./deployment#engine-managed-workers-micro-vms).
- Bundle workers, which are dispatched through the same libkrun path. (Note that
some workers are moving to `deploy: binary`, which **does** run on Railway as a
plain process. Check the worker's current type before assuming.)

Every other worker runs there, including `deploy: binary` workers, which run as
plain processes.
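
A quick way to confirm the constraint on any host is to probe for the device node directly. This is a minimal sketch using only the Python standard library; the helper name `kvm_available` is hypothetical, and a `True` result only means the device is present and accessible, not that a micro-VM will boot.

```python
import os


def kvm_available(device: str = "/dev/kvm") -> bool:
    """True when the KVM device node exists and is readable and writable."""
    return os.path.exists(device) and os.access(device, os.R_OK | os.W_OK)
```

Inside a Railway container this is expected to return `False`, which is why micro-VM workers cannot start there.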

## Verify the deployment

Once the engine service is live, the engine log shows each declared worker
registering, including any it auto-provisioned from the registry. Then call,
through the public domain, a route that your worker registered:

```bash
curl https://<your-engine-domain>/orders -X POST \
  -H 'content-type: application/json' \
  -d '{ "sku": "abc", "qty": 1 }'
```

A response from your handler confirms the full path: Railway edge to `iii-http`
to your worker to the `database` worker on the volume.
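
The same check can be scripted, for example as a post-deploy smoke test. This is a minimal sketch using only the Python standard library; the `/orders` route and payload come from the `curl` example above, the helper names are hypothetical, and the engine domain is a placeholder you supply.

```python
import json
import urllib.request


def build_order_request(engine_domain: str) -> urllib.request.Request:
    """Build the POST request from the curl example without sending it."""
    body = json.dumps({"sku": "abc", "qty": 1}).encode("utf-8")
    return urllib.request.Request(
        f"https://{engine_domain}/orders",
        data=body,
        headers={"content-type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 10.0) -> tuple[int, str]:
    """Send the request and return (HTTP status, response body as text)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")
```

Call `send(build_order_request("<your-engine-domain>"))` once the public domain is attached; `urlopen` raises `urllib.error.HTTPError` on a 4xx or 5xx response, so an unregistered route fails loudly.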

<Note>
Redeploying the engine briefly re-registers HTTP routes, which can race a
worker that still holds the old route. Prefer `railway restart` over a full
redeploy when only restarting, and restart dependent worker services after an
engine redeploy so they reconnect and re-register.
</Note>

## Railway deployment checklist

This is the Railway-specific layer on top of the general
[Deployment checklist](./deployment#deployment-checklist).

**Image and build**

- [ ] Clean engine base built from the install script (not the distroless image) when you
use registry workers; distroless is engine-only with no shell to extend.
- [ ] `config.yaml` baked into the image; rebuild and redeploy to change it (Railway builds
from the Dockerfile, with no live file mount).
- [ ] `libcap-ng0` installed in the image if you run binary registry workers (for example
`database`); the `iii-worker` daemon needs it to launch them.
- [ ] Registry workers pinned with `RUN iii worker add <name>` for fast, reproducible
starts and no runtime registry dependency.

**Networking**

- [ ] Engine binds `[::]`; worker services dial `ws://engine.railway.internal:49134`, never
`127.0.0.1`.
- [ ] Public domain on the engine targets port `3111`; worker services have no public
domain.
- [ ] Raw `49134` exposed (via TCP proxy) only behind an RBAC listener
([RBAC](./deployment#rbac)); otherwise keep it on the private network.

**State and resiliency**

- [ ] Volume mounted at `/data`; every worker `file_path` lives under it.
- [ ] `RAILWAY_RUN_UID=0` set so a non-root image can write the root-owned volume.
- [ ] No `DO NOT USE IN_MEMORY` warnings in the boot logs.
- [ ] Single engine service while on `file_based` storage; move to Redis/RabbitMQ before
scaling to multiple replicas.
- [ ] Engine restart policy set to `ALWAYS`, so a clean exit from a config reload does not
leave the service stopped (Railway's default `ON_FAILURE` skips code-0 exits).

**Security and operations**

- [ ] CORS narrowed to real origins (the sample uses `*`).
- [ ] Secrets supplied through Railway variables, never the image or git.
- [ ] Dependent worker services restarted after an engine redeploy (HTTP-route
re-registration race).
- [ ] Health check path pointed at a real `iii-http` route; observability exporter chosen
deliberately ([Observability](./deployment#observability)).