Trace the journey of a word through history.
Interactive etymology maps powered by FastAPI, React, Wikidata, Wiktionary, and AI-assisted storytelling.
Lexoroute visualises the etymological chain of any word as an interactive geographic map: each node is a language, each edge is a borrowing or derivation, and the path curves across continents and centuries.
Public deployment: https://lexoroute-latest.onrender.com/
Want the shortest path to a running local setup?
# Terminal 1
cd backend
uv run python -m uvicorn app.main:app --reload --reload-dir app
# Terminal 2
cd frontend
npm install
npm run dev

Then open http://127.0.0.1:5173.
For environment variables, Windows-specific notes, Docker, and deployment details, see Getting started.
- Search any word and see its root languages plotted on a world map
- Great-circle arcs connect ancestor → descendant with animated flow
- Directed arrowheads show the direction of linguistic borrowing
- Era labels (BCE / CE) mark historical attestation
- Three data sources: Wikidata SPARQL, Wiktionary, or both combined
- AI source — GPT-powered etymology with vivid historical notes and a word-journey story (requires OpenAI key)
- Hover tooltips show all nodes at a stacked position (multiple languages that share the same map location)
- Stack indicator on dots when multiple nodes share coordinates
- Side panel with the full etymology tree
- Auto-flies to fit all nodes when results load
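The stacked-position behaviour in the list above comes down to grouping nodes that share coordinates. The following is a minimal sketch of that idea in Python, not the frontend's actual TypeScript code. It assumes each node carries a `coordinates` pair as in the API response; the function name `group_by_position`, the rounding precision, and the sample nodes are hypothetical.

```python
from collections import defaultdict


def group_by_position(nodes, precision=4):
    """Group nodes whose rounded coordinates match, keeping input order."""
    stacks = defaultdict(list)
    for node in nodes:
        first, second = node["coordinates"]
        key = (round(first, precision), round(second, precision))
        stacks[key].append(node)
    return dict(stacks)


# Hypothetical sample data: two languages placed on the same map position.
sample_nodes = [
    {"language": "Arabic", "coordinates": [24.0, 45.0]},
    {"language": "Latin", "coordinates": [41.9, 12.5]},
    {"language": "Medieval Latin", "coordinates": [41.9, 12.5]},
]

stacks = group_by_position(sample_nodes)
for position, members in stacks.items():
    # A stack indicator would be drawn wherever len(members) > 1.
    print(position, [m["language"] for m in members])
```

Rounding before grouping is a design choice in this sketch: it keeps tiny floating-point differences from splitting what is visually one dot.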
| Layer | Stack |
|---|---|
| Backend | Python 3.14 · FastAPI · httpx · pydantic-settings · uv |
| Frontend | TypeScript · React 19 · Vite · Tailwind CSS 4 · Zustand · TanStack Query |
| Map | deck.gl 9 · MapLibre GL · react-map-gl · CartoDB Dark Matter tiles |
- uv — Python package manager
- Node.js 20+ and npm
cd backend
uv run python -m uvicorn app.main:app --reload --reload-dir app

Note: `uv run fastapi dev` may fail on Windows with a `UnicodeEncodeError` (cp1252 / emoji issue) or `Failed to canonicalize script path` (trampoline bug in uv 0.9.x). Use the `python -m uvicorn` form above as a reliable alternative.
The API starts on http://127.0.0.1:8000.
To override defaults, create backend/.env:
CORS_ORIGINS=["http://localhost:5173"]
SPARQL_ENDPOINT=https://query.wikidata.org/sparql
SPARQL_TIMEOUT=55.0
CACHE_TTL=3600
# Optional — enables the AI source
OPENAI_API_KEY=sk-...
OPENAI_MODEL=gpt-5.5
# Optional — enables the Neo4j graph cache (AuraDB Free works)
NEO4J_URI=neo4j+s://<your-instance>.databases.neo4j.io
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=<your-password>
NEO4J_DATABASE=neo4j
# Rate limiting (slowapi, per IP)
RATE_LIMIT_API=30/minute
RATE_LIMIT_AI=5/minute

cd frontend
npm install
npm run dev

The app starts on http://127.0.0.1:5173. The Vite dev server proxies /api to the backend automatically.
cd frontend
npm run build # outputs to frontend/dist/

Build the full app as a single production image:

docker build --platform linux/amd64 -t <dockerhub-username>/lexoroute:latest .

Run it locally:

docker run --rm -p 8000:8000 --env PORT=8000 <dockerhub-username>/lexoroute:latest

Open http://localhost:8000. The FastAPI backend serves /api/*, and the built Vite app is served from the same origin.
Push to Docker Hub:
docker login
docker push <dockerhub-username>/lexoroute:latest

- In Render, create a new Web Service.
- Under Source Code, choose Existing Image.
- Use this image URL: `docker.io/<dockerhub-username>/lexoroute:latest`
- If the Docker Hub repository is private, add Docker Hub registry credentials in Render.
- Set any runtime environment variables you need, such as `OPENAI_API_KEY`, `OPENAI_MODEL`, `SPARQL_TIMEOUT`, or `CACHE_TTL`.
- Deploy the service. The container command uses Render's `PORT` automatically.
For image-backed services, Render does not automatically redeploy when the `latest` tag changes in Docker Hub. Trigger a manual deploy in Render after each push, or configure a deploy hook.
| Parameter | Type | Default | Description |
|---|---|---|---|
| `word` | string | required | Word to trace (1–100 chars) |
| `lang` | string | `en` | BCP-47 language code (en, fr, de, es, …) |
| `source` | string | `combined` | `wikidata` · `wiktionary` · `combined` · `llm` |
Response
{
"word": "alcohol",
"node_count": 5,
"root": {
"word": "الكحل",
"language": "Arabic",
"language_qid": "Q13955",
"era": -800,
"coordinates": [24.0, 45.0],
"children": [ ... ]
}
}

Errors
| Status | Meaning |
|---|---|
| 404 | No etymology found for the word |
| 408 | Wikidata SPARQL timed out |
| 422 | Word too long for AI source (max 50 chars) |
| 429 | Rate limit exceeded |
| 502 | Upstream API error |
| 503 | AI source not configured (no OPENAI_API_KEY) |
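The parameters and status codes above can be wired into a small client helper. This sketch uses only the standard library and assumes the local base URL `http://127.0.0.1:8000` and the `/api/etymology` path listed elsewhere in this README. The helper names `build_etymology_url` and `describe_status` are hypothetical, and the client-side checks only mirror the documented limits; they are not the server's validation code.

```python
from urllib.parse import urlencode

VALID_SOURCES = {"wikidata", "wiktionary", "combined", "llm"}

# Status meanings copied from the error table above.
ERROR_MEANINGS = {
    404: "No etymology found for the word",
    408: "Wikidata SPARQL timed out",
    422: "Word too long for AI source (max 50 chars)",
    429: "Rate limit exceeded",
    502: "Upstream API error",
    503: "AI source not configured (no OPENAI_API_KEY)",
}


def build_etymology_url(word, lang="en", source="combined",
                        base="http://127.0.0.1:8000"):
    """Build the request URL, rejecting inputs the API documents as invalid."""
    if not 1 <= len(word) <= 100:
        raise ValueError("word must be 1-100 characters")
    if source not in VALID_SOURCES:
        raise ValueError(f"source must be one of {sorted(VALID_SOURCES)}")
    query = urlencode({"word": word, "lang": lang, "source": source})
    return f"{base}/api/etymology?{query}"


def describe_status(status):
    """Translate an HTTP status code into the documented meaning."""
    if status == 200:
        return "OK"
    return ERROR_MEANINGS.get(status, f"Unexpected status {status}")


print(build_etymology_url("alcohol"))
print(describe_status(429))
```

The URL can then be passed to any HTTP client, for example `urllib.request.urlopen` or `httpx.get`.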
Lexoroute/
├── backend/
│ ├── app/
│ │ ├── main.py # FastAPI app, CORS, rate limiting, router registration
│ │ ├── config.py # pydantic-settings config (reads .env)
│ │ ├── limiter.py # slowapi Limiter instance (shared)
│ │ ├── models/etymology.py # Pydantic models
│ │ ├── routers/etymology.py # GET /api/etymology
│ │ ├── services/
│ │ │ ├── sparql.py # SPARQL client with in-memory TTL cache
│ │ │ ├── tree_builder.py # Converts edge list → recursive EtymologyNode
│ │ │ ├── etymology_service.py
│ │ │ └── sources/ # wikidata.py · wiktionary.py · combined.py · llm.py
│ │ └── data/language_coords.py # QID → {name, lat, lng, era}
│ ├── pyproject.toml
│ └── uv.lock
└── frontend/
├── src/
│ ├── App.tsx # Root layout
│ ├── api/etymology.ts # fetchEtymology()
│ ├── hooks/useEtymology.ts # TanStack Query wrapper
│ ├── types/etymology.ts # Shared TypeScript interfaces
│ └── components/
│ ├── EtymologyMap.tsx # deck.gl + MapLibre visualisation
│ ├── EtymologyPanel.tsx # Tree panel
│ ├── SearchBar.tsx
│ ├── SourceSelector.tsx
│ └── WordTooltip.tsx
├── package.json
└── vite.config.ts # /api proxy → :8000
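The layout above describes `tree_builder.py` as converting an edge list into a recursive node. A minimal sketch of that kind of conversion follows, using plain dictionaries rather than the project's Pydantic `EtymologyNode` model. The function names, the `(parent, child)` edge shape, and the placeholder words are assumptions made for illustration.

```python
def build_tree(edges, root):
    """Turn (parent, child) pairs into a nested {"word", "children"} dict."""
    children = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)

    def build(word, path):
        if word in path:
            # Cycle guard: stop expanding a word already on the current path.
            return {"word": word, "children": []}
        next_path = path | {word}
        return {
            "word": word,
            "children": [build(c, next_path) for c in children.get(word, [])],
        }

    return build(root, frozenset())


def count_nodes(node):
    """Count a node plus all of its descendants."""
    return 1 + sum(count_nodes(child) for child in node["children"])


# Placeholder edges, not real etymological data.
edges = [("root", "mid"), ("root", "leaf_a"), ("mid", "leaf_b")]
tree = build_tree(edges, "root")
print(count_nodes(tree))
```

A count like the one returned by `count_nodes` is the sort of value the response's `node_count` field reports, though how the backend actually computes it is not shown in this README.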
Wikidata SPARQL — queries Wikidata lexeme entities using the P5191 (derived from) property. Walks up to 10 ancestor hops and fetches sibling/child forms. Supports English, French, German, Spanish.
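The 10-hop limit can be pictured as a bounded walk up a chain of "derived from" links. The sketch below runs over an in-memory mapping instead of a live SPARQL endpoint. The `parent_of` dictionary, the function name, and the cycle guard are assumptions for illustration and do not reflect the project's actual query.

```python
def walk_ancestors(word, parent_of, max_hops=10):
    """Follow parent links from `word`, stopping after `max_hops` steps."""
    chain = [word]
    seen = {word}
    current = word
    for _ in range(max_hops):
        parent = parent_of.get(current)
        if parent is None or parent in seen:
            break  # reached the oldest known form, or hit a cycle
        chain.append(parent)
        seen.add(parent)
        current = parent
    return chain


# Hypothetical chain of 20 links; only the first 10 hops are followed.
parent_of = {f"w{i}": f"w{i + 1}" for i in range(20)}
chain = walk_ancestors("w0", parent_of)
print(len(chain), chain[-1])
```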
Wiktionary — parses etymology sections from the Wiktionary API, extracting {{inh}}, {{der}}, and {{bor}} templates. Supports 40+ language codes including historical forms (Old English, Proto-Germanic, Latin, Ancient Greek, etc.).
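As a rough illustration of template extraction, the sketch below pulls `{{inh}}`, `{{der}}`, and `{{bor}}` calls out of a wikitext string with a regular expression. It assumes the common positional form `{{kind|entry-lang|source-lang|term}}` and ignores named parameters and nested templates. The function name and sample text are illustrative, and the project's parser may work differently.

```python
import re

# kind | language of the entry | source language | term
TEMPLATE_RE = re.compile(r"\{\{(inh|der|bor)\|([^|{}]+)\|([^|{}]+)\|([^|{}]*)")


def extract_etymology_templates(wikitext):
    """Return one dict per inh/der/bor template found in the text."""
    return [
        {"kind": kind, "lang": lang, "source_lang": source_lang, "term": term}
        for kind, lang, source_lang, term in TEMPLATE_RE.findall(wikitext)
    ]


# Illustrative etymology section in Wiktionary markup.
sample = (
    "From {{inh|en|enm|water}}, from {{inh|en|ang|wæter}}, "
    "from {{inh|en|gem-pro|*watōr}}."
)
results = extract_etymology_templates(sample)
for item in results:
    print(item["kind"], item["source_lang"], item["term"])
```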
Combined — runs both sources concurrently, merges by normalising lemmas, and deduplicates nodes.
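One way to read "merges by normalising lemmas" is sketched below: lemmas are lowercased, stripped of combining marks and a leading reconstruction asterisk, then used with the language name as a deduplication key. These normalisation rules, the function names, and the sample nodes are assumptions, and stripping combining marks may be too aggressive for some scripts. The concurrent fetching step is left out.

```python
import unicodedata


def normalise_lemma(lemma):
    """Lowercase, drop combining marks, and strip a leading '*'."""
    decomposed = unicodedata.normalize("NFKD", lemma.strip().lower())
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_marks.lstrip("*")


def merge_nodes(*sources):
    """Merge node lists, keeping the first node seen for each key."""
    merged = {}
    for nodes in sources:
        for node in nodes:
            key = (node["language"], normalise_lemma(node["word"]))
            merged.setdefault(key, node)
    return list(merged.values())


# Hypothetical results from the two sources.
from_wikidata = [{"word": "Wæter", "language": "Old English"}]
from_wiktionary = [
    {"word": "wæter", "language": "Old English"},
    {"word": "*watōr", "language": "Proto-Germanic"},
]
merged = merge_nodes(from_wikidata, from_wiktionary)
print([n["word"] for n in merged])
```

Because `setdefault` keeps the first entry, the order in which sources are passed decides which spelling survives a collision.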
AI (LLM) — uses OpenAI structured output to generate an etymological tree constrained to the app's known language set. Each node includes a vivid historical note; the response includes a one-sentence word-journey story. Input is sanitised and capped at 50 characters. Requires OPENAI_API_KEY in .env. Rate-limited to 5 requests/minute per IP. AI output is clearly labelled in the UI and may be inaccurate — verify with primary sources.
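The README does not say how input is sanitised, so the following is only one plausible reading: collapse whitespace, drop non-printable characters, and reject anything over the documented 50-character cap. The function name and the exact cleaning rules are assumptions.

```python
import re

MAX_AI_WORD_LENGTH = 50  # cap documented for the AI source


def sanitise_word(raw):
    """Clean user input and enforce the AI-source length cap."""
    collapsed = re.sub(r"\s+", " ", raw)
    printable = "".join(ch for ch in collapsed if ch.isprintable())
    cleaned = printable.strip()
    if not cleaned:
        raise ValueError("word is empty after sanitising")
    if len(cleaned) > MAX_AI_WORD_LENGTH:
        raise ValueError(f"word exceeds {MAX_AI_WORD_LENGTH} characters")
    return cleaned


print(sanitise_word("  alco\x00hol \n"))
```

In the API, the over-length case corresponds to the 422 response listed in the error table.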
This project is available under the MIT License.

