Prhmma/Lexoroute

Lexoroute

Trace the journey of a word through history.
Interactive etymology maps powered by FastAPI, React, Wikidata, Wiktionary, and AI-assisted storytelling.

Python 3.14+ · FastAPI · React 19 · Vite · License: MIT

Lexoroute visualises the etymological chain of any word as an interactive geographic map: each node is a language, each edge is a borrowing or derivation, and the path curves across continents and centuries.

Demo

Public deployment: https://lexoroute-latest.onrender.com/

Quickstart

The shortest path to a running local setup uses two terminals:

# Terminal 1
cd backend
uv run python -m uvicorn app.main:app --reload --reload-dir app

# Terminal 2
cd frontend
npm install
npm run dev

Then open http://127.0.0.1:5173.

For environment variables, Windows-specific notes, Docker, and deployment details, see Getting started.

Features

  • Search any word and see its root languages plotted on a world map
  • Great-circle arcs connect ancestor → descendant with animated flow
  • Directed arrowheads show the direction of linguistic borrowing
  • Era labels (BCE / CE) mark historical attestation
  • Three data sources: Wikidata SPARQL, Wiktionary, or both combined
  • AI source — GPT-powered etymology with vivid historical notes and a word-journey story (requires OpenAI key)
  • Hover tooltips show all nodes at a stacked position (multiple languages that share the same map location)
  • Stack indicator on dots when multiple nodes share coordinates
  • Side panel with the full etymology tree
  • Auto-flies to fit all nodes when results load

Tech stack

| Layer | Stack |
| --- | --- |
| Backend | Python 3.14 · FastAPI · httpx · pydantic-settings · uv |
| Frontend | TypeScript · React 19 · Vite · Tailwind CSS 4 · Zustand · TanStack Query |
| Map | deck.gl 9 · MapLibre GL · react-map-gl · CartoDB Dark Matter tiles |

Getting started

Prerequisites

  • uv — Python package manager
  • Node.js 20+ and npm

1. Backend

cd backend
uv run python -m uvicorn app.main:app --reload --reload-dir app

Note: uv run fastapi dev may fail on Windows with a UnicodeEncodeError (cp1252 / emoji issue) or Failed to canonicalize script path (trampoline bug in uv 0.9.x). Use the python -m uvicorn form above as a reliable alternative.

The API starts on http://127.0.0.1:8000.

To override defaults, create backend/.env:

CORS_ORIGINS=["http://localhost:5173"]
SPARQL_ENDPOINT=https://query.wikidata.org/sparql
SPARQL_TIMEOUT=55.0
CACHE_TTL=3600
# Optional — enables the AI source
OPENAI_API_KEY=sk-...
OPENAI_MODEL=gpt-5.5
# Optional — enables the Neo4j graph cache (AuraDB Free works)
NEO4J_URI=neo4j+s://<your-instance>.databases.neo4j.io
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=<your-password>
NEO4J_DATABASE=neo4j
# Rate limiting (slowapi, per IP)
RATE_LIMIT_API=30/minute
RATE_LIMIT_AI=5/minute
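
Note that `CORS_ORIGINS` is written as a JSON array, while the timeout and TTL are plain numbers. pydantic-settings decodes list-typed fields from JSON, which is presumably why the value takes that form. The sketch below uses only the standard library to show how such values would be decoded; `parse_env` is a hypothetical helper for illustration and is not the project's `config.py`.

```python
import json


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


SAMPLE = """\
CORS_ORIGINS=["http://localhost:5173"]
SPARQL_TIMEOUT=55.0
CACHE_TTL=3600
# Optional settings would follow here
"""

env = parse_env(SAMPLE)
cors_origins = json.loads(env["CORS_ORIGINS"])   # JSON array -> list of str
sparql_timeout = float(env["SPARQL_TIMEOUT"])    # seconds
cache_ttl = int(env["CACHE_TTL"])                # seconds
```

A value such as `CORS_ORIGINS=http://localhost:5173` (without brackets and quotes) would fail `json.loads`, which is a common source of startup errors with list-typed settings.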

2. Frontend

cd frontend
npm install
npm run dev

The app starts on http://127.0.0.1:5173. The Vite dev server proxies /api to the backend automatically.

Production build

cd frontend
npm run build     # outputs to frontend/dist/

Docker

Build the full app as a single production image:

docker build --platform linux/amd64 -t <dockerhub-username>/lexoroute:latest .

Run it locally:

docker run --rm -p 8000:8000 --env PORT=8000 <dockerhub-username>/lexoroute:latest

Open http://localhost:8000. The FastAPI backend serves /api/*, and the built Vite app is served from the same origin.

Push to Docker Hub:

docker login
docker push <dockerhub-username>/lexoroute:latest

Deploy to Render from Docker Hub

  1. In Render, create a new Web Service.
  2. Under Source Code, choose Existing Image.
  3. Use this image URL:
docker.io/<dockerhub-username>/lexoroute:latest
  4. If the Docker Hub repository is private, add Docker Hub registry credentials in Render.
  5. Set any runtime environment variables you need, such as OPENAI_API_KEY, OPENAI_MODEL, SPARQL_TIMEOUT, or CACHE_TTL.
  6. Deploy the service. The container command uses Render's PORT automatically.

For image-backed services, Render does not automatically redeploy just because latest changes in Docker Hub. Trigger a manual deploy in Render after each push, or configure a deploy hook.


API

GET /api/etymology

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| word | string | required | Word to trace (1–100 chars) |
| lang | string | en | BCP-47 language code (en, fr, de, es, …) |
| source | string | combined | wikidata · wiktionary · combined · llm |
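
A request can be assembled from these parameters with the standard library alone. The sketch below assumes the local development address from Getting started; `etymology_url` and `fetch_etymology` are hypothetical helper names, and the client-side checks simply mirror the limits listed above.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:8000"  # local dev address; adjust for a deployment
SOURCES = {"wikidata", "wiktionary", "combined", "llm"}


def etymology_url(word, lang="en", source="combined", base_url=BASE_URL):
    """Build the GET /api/etymology URL, applying the documented limits."""
    if not 1 <= len(word) <= 100:
        raise ValueError("word must be 1-100 characters")
    if source not in SOURCES:
        raise ValueError(f"unknown source: {source!r}")
    # urlencode percent-encodes non-ASCII words as UTF-8.
    query = urlencode({"word": word, "lang": lang, "source": source})
    return f"{base_url}/api/etymology?{query}"


def fetch_etymology(word, **kwargs):
    """Fetch and decode the JSON response (requires a running backend)."""
    with urlopen(etymology_url(word, **kwargs), timeout=60) as response:
        return json.load(response)
```

For example, `etymology_url("alcohol", source="wiktionary")` produces a URL that can also be pasted into a browser or passed to `curl`.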

Response

{
  "word": "alcohol",
  "node_count": 5,
  "root": {
    "word": "الكحل",
    "language": "Arabic",
    "language_qid": "Q13955",
    "era": -800,
    "coordinates": [24.0, 45.0],
    "children": [ ... ]
  }
}
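
Because each node nests its descendants under `children`, a client typically walks the response recursively. The following is a minimal sketch: the root fields are copied from the sample above, the child nodes are placeholders invented for illustration, and the reading of a negative `era` as a BCE year is an assumption based on the BCE / CE era labels in the feature list.

```python
def iter_nodes(node, depth=0):
    """Yield (depth, node) pairs in depth-first order, root first."""
    yield depth, node
    for child in node.get("children", []):
        yield from iter_nodes(child, depth + 1)


def format_era(era):
    """Assumption: negative years are BCE, all others CE."""
    return f"{-era} BCE" if era < 0 else f"{era} CE"


# Root fields come from the sample response; the children are placeholders.
response = {
    "word": "alcohol",
    "node_count": 3,
    "root": {
        "word": "الكحل", "language": "Arabic", "language_qid": "Q13955",
        "era": -800, "coordinates": [24.0, 45.0],
        "children": [
            {"word": "child-1", "language": "Placeholder A", "children": [
                {"word": "child-2", "language": "Placeholder B", "children": []},
            ]},
        ],
    },
}

# One indented line per node, e.g. for a text rendering of the tree.
outline = [
    f"{'  ' * depth}{node['language']}: {node['word']}"
    for depth, node in iter_nodes(response["root"])
]
```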

Errors

| Status | Meaning |
| --- | --- |
| 404 | No etymology found for the word |
| 408 | Wikidata SPARQL timed out |
| 422 | Word too long for AI source (max 50 chars) |
| 429 | Rate limit exceeded |
| 502 | Upstream API error |
| 503 | AI source not configured (no OPENAI_API_KEY) |
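
A client can translate these status codes into readable messages. The sketch below is one possible approach using `urllib`; the helper names are hypothetical, and treating 408, 429, and 502 as worth retrying is an assumption of this example, not documented API behaviour.

```python
from urllib.error import HTTPError
from urllib.request import urlopen

ERROR_MEANINGS = {
    404: "No etymology found for the word",
    408: "Wikidata SPARQL timed out",
    422: "Word too long for AI source (max 50 chars)",
    429: "Rate limit exceeded",
    502: "Upstream API error",
    503: "AI source not configured (no OPENAI_API_KEY)",
}

# Assumption: these failures are transient, so a later retry may succeed.
RETRYABLE = frozenset({408, 429, 502})


def describe_error(status):
    """Map a status code to its documented meaning."""
    return ERROR_MEANINGS.get(status, f"Unexpected HTTP status {status}")


def get_or_explain(url):
    """Return the raw response body, or raise RuntimeError with a readable message."""
    try:
        with urlopen(url, timeout=60) as response:
            return response.read()
    except HTTPError as exc:
        raise RuntimeError(describe_error(exc.code)) from exc
```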

Project structure

Lexoroute/
├── backend/
│   ├── app/
│   │   ├── main.py                 # FastAPI app, CORS, rate limiting, router registration
│   │   ├── config.py               # pydantic-settings config (reads .env)
│   │   ├── limiter.py              # slowapi Limiter instance (shared)
│   │   ├── models/etymology.py     # Pydantic models
│   │   ├── routers/etymology.py    # GET /api/etymology
│   │   ├── services/
│   │   │   ├── sparql.py           # SPARQL client with in-memory TTL cache
│   │   │   ├── tree_builder.py     # Converts edge list → recursive EtymologyNode
│   │   │   ├── etymology_service.py
│   │   │   └── sources/            # wikidata.py · wiktionary.py · combined.py · llm.py
│   │   └── data/language_coords.py # QID → {name, lat, lng, era}
│   ├── pyproject.toml
│   └── uv.lock
└── frontend/
    ├── src/
    │   ├── App.tsx                 # Root layout
    │   ├── api/etymology.ts        # fetchEtymology()
    │   ├── hooks/useEtymology.ts   # TanStack Query wrapper
    │   ├── types/etymology.ts      # Shared TypeScript interfaces
    │   └── components/
    │       ├── EtymologyMap.tsx    # deck.gl + MapLibre visualisation
    │       ├── EtymologyPanel.tsx  # Tree panel
    │       ├── SearchBar.tsx
    │       ├── SourceSelector.tsx
    │       └── WordTooltip.tsx
    ├── package.json
    └── vite.config.ts              # /api proxy → :8000
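
The comment on `tree_builder.py` describes turning an edge list into a recursive node. The sketch below shows the general technique under stated assumptions: the `Node` dataclass and `build_tree` function are hypothetical stand-ins, not the project's `EtymologyNode` model, and each edge is assumed to be an (ancestor, descendant) pair of `(word, language)` keys.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    word: str
    language: str
    children: list["Node"] = field(default_factory=list)


def build_tree(edges, root_key):
    """Link (ancestor, descendant) key pairs into a tree rooted at root_key.

    Assumes the edges contain no cycles; a cyclic input would produce a
    structure that cannot be walked recursively.
    """
    nodes = {}

    def get(key):
        # Create each node once so shared keys map to the same object.
        if key not in nodes:
            nodes[key] = Node(*key)
        return nodes[key]

    seen = set()
    for parent_key, child_key in edges:
        if (parent_key, child_key) in seen:
            continue  # ignore duplicate edges
        seen.add((parent_key, child_key))
        get(parent_key).children.append(get(child_key))
    return get(root_key)


# Placeholder keys, invented for illustration.
a, b, c, d = ("a", "Lang A"), ("b", "Lang B"), ("c", "Lang C"), ("d", "Lang D")
tree = build_tree([(a, b), (a, c), (b, d), (a, b)], root_key=a)
```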

Data sources

Wikidata SPARQL — queries Wikidata lexeme entities using the P5191 (derived from) property. Walks up to 10 ancestor hops and fetches sibling/child forms. Supports English, French, German, Spanish.
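
The bounded ancestor walk can be sketched independently of SPARQL. In the example below, the `derived_from` dictionary is a hypothetical stand-in for the P5191 links returned by Wikidata; a real lexeme may have several such links, whereas this sketch follows only one per step.

```python
MAX_HOPS = 10  # matches the hop limit described above


def ancestor_chain(start, derived_from, max_hops=MAX_HOPS):
    """Follow derived-from links from start, stopping after max_hops or on a cycle."""
    chain = [start]
    seen = {start}
    current = start
    for _ in range(max_hops):
        parent = derived_from.get(current)
        if parent is None or parent in seen:
            break  # no further ancestor, or the data loops back on itself
        chain.append(parent)
        seen.add(parent)
        current = parent
    return chain
```

The hop limit keeps the number of lookups predictable, and the `seen` set guards against circular data, which can occur in community-edited sources.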

Wiktionary — parses etymology sections from the Wiktionary API, extracting {{inh}}, {{der}}, and {{bor}} templates. Supports 40+ language codes including historical forms (Old English, Proto-Germanic, Latin, Ancient Greek, etc.).
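
Template extraction of this kind can be approximated with a regular expression. The sketch below is a simplified illustration, not the project's parser: it relies on the Wiktionary convention that the first positional argument is the current language, the second is the source language, and the third is the term, and it does not handle nested templates or template aliases.

```python
import re

# Matches {{inh|...}}, {{der|...}} and {{bor|...}} without nested braces.
TEMPLATE_RE = re.compile(r"\{\{(inh|der|bor)\|([^{}]*)\}\}")


def extract_etymology_links(wikitext):
    """Return one dict per etymology template found in the wikitext."""
    links = []
    for match in TEMPLATE_RE.finditer(wikitext):
        kind = match.group(1)
        # Keep positional arguments only; named ones such as t=... are skipped.
        args = [arg.strip() for arg in match.group(2).split("|") if "=" not in arg]
        if len(args) >= 2:
            term = args[2] if len(args) > 2 else ""
            links.append({"kind": kind, "source_lang": args[1], "term": term})
    return links


sample = (
    "From {{inh|en|enm|water}}, from {{inh|en|ang|wæter}}, "
    "from {{der|en|gem-pro|*watōr|t=water}}."
)
links = extract_etymology_links(sample)
```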

Combined — runs both sources concurrently, merges by normalising lemmas, and deduplicates nodes.
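
The concurrent fetch and merge can be sketched with `asyncio.gather`. Everything below is illustrative: the two fetch functions are stubs returning placeholder data, and the normalisation rule (Unicode NFC plus case folding) is an assumption, since the exact rule the app uses is not stated here.

```python
import asyncio
import unicodedata


def normalise(lemma):
    """Assumed rule: NFC normalisation, case folding, trimmed whitespace."""
    return unicodedata.normalize("NFC", lemma).casefold().strip()


async def fetch_wikidata(word):
    # Stub standing in for the Wikidata source; returns (language, lemma) pairs.
    return [("Language A", "Lemma")]


async def fetch_wiktionary(word):
    # Stub standing in for the Wiktionary source.
    return [("Language A", "lemma"), ("Language B", "other")]


async def combined(word):
    """Query both sources at once and drop nodes that normalise to the same key."""
    wikidata, wiktionary = await asyncio.gather(
        fetch_wikidata(word), fetch_wiktionary(word)
    )
    merged = {}
    for language, lemma in [*wikidata, *wiktionary]:
        # The first occurrence wins; later duplicates are ignored.
        merged.setdefault((language, normalise(lemma)), (language, lemma))
    return list(merged.values())


result = asyncio.run(combined("example"))
```

Here "Lemma" and "lemma" collapse into a single node because they share a normalised key, while the same spelling in a different language would be kept as a separate node.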

AI (LLM) — uses OpenAI structured output to generate an etymological tree constrained to the app's known language set. Each node includes a vivid historical note; the response includes a one-sentence word-journey story. Input is sanitised and capped at 50 characters. Requires OPENAI_API_KEY in .env. Rate-limited to 5 requests/minute per IP. AI output is clearly labelled in the UI and may be inaccurate — verify with primary sources.
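
The input handling for the AI source might look roughly like the sketch below. The sanitisation steps shown (dropping control characters and collapsing whitespace) are assumptions, since the actual rules are not spelled out here; only the 50-character limit comes from the description above. `sanitise_word` is a hypothetical name.

```python
import unicodedata

MAX_AI_WORD_LENGTH = 50  # limit stated for the AI source


def sanitise_word(raw):
    """Strip control characters, collapse whitespace, and enforce the length cap."""
    # Unicode categories starting with "C" cover control and format characters.
    cleaned = "".join(ch for ch in raw if unicodedata.category(ch)[0] != "C")
    cleaned = " ".join(cleaned.split())
    if not cleaned:
        raise ValueError("word is empty after sanitisation")
    if len(cleaned) > MAX_AI_WORD_LENGTH:
        raise ValueError(f"word exceeds {MAX_AI_WORD_LENGTH} characters")
    return cleaned
```

Rejecting over-long input before it reaches the model keeps prompts small and corresponds to the 422 response listed in the API errors.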


License

This project is available under the MIT License.
