An API gateway that parses PDFs and images with LLM-based OCR models, using a Redis queue and asynchronous workers to handle high concurrency.
Prereqs: Go, PostgreSQL, Redis, and Python 3 with paddleocr and requests (for the Paddle runner).
Install Python deps (runner):
python3 -m pip install paddleocr requests
Set env and run server:
export DATABASE_URL="postgres://user:password@localhost:5432/my_database"
export REDIS_URL="redis://localhost:6379"
export PORT=3000
go run ./src
Run worker using PaddleOCR runner:
WORKER=1 OCR_PROVIDER=paddle go run ./src
Fallback (legacy HTTP OCR service):
WORKER=1 go run ./src
That's it. Enqueue with POST /textify and fetch results with GET /results/{jobId}.
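As a rough illustration of the enqueue-then-fetch flow, here is a minimal Go client sketch. Only the two paths (POST /textify and GET /results/{jobId}) come from this README. The request payload, its content type, and the shape of both responses are not documented here, so the sketch passes them through as raw bytes and makes no assumption about field names. The `GATEWAY_URL` variable and the helper names (`textifyURL`, `resultsURL`, `enqueue`, `fetchResult`) are hypothetical and exist only for this example.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"os"
	"strings"
	"time"
)

// client uses a timeout so a stuck gateway does not hang the caller.
var client = &http.Client{Timeout: 10 * time.Second}

// textifyURL builds the enqueue endpoint from the gateway base URL.
func textifyURL(base string) string {
	return strings.TrimRight(base, "/") + "/textify"
}

// resultsURL builds the result endpoint for one job, escaping the job ID.
func resultsURL(base, jobID string) string {
	return strings.TrimRight(base, "/") + "/results/" + url.PathEscape(jobID)
}

// do sends the request and returns the raw body, treating non-2xx as an error.
func do(req *http.Request) ([]byte, error) {
	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	if resp.StatusCode < 200 || resp.StatusCode > 299 {
		return nil, fmt.Errorf("%s %s: %s: %s", req.Method, req.URL, resp.Status, body)
	}
	return body, nil
}

// enqueue POSTs an arbitrary payload to /textify. The payload format is an
// assumption left to the caller; this README does not specify it.
func enqueue(base, contentType string, payload io.Reader) ([]byte, error) {
	req, err := http.NewRequest(http.MethodPost, textifyURL(base), payload)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", contentType)
	return do(req)
}

// fetchResult GETs /results/{jobId} and returns the raw response body.
func fetchResult(base, jobID string) ([]byte, error) {
	req, err := http.NewRequest(http.MethodGet, resultsURL(base, jobID), nil)
	if err != nil {
		return nil, err
	}
	return do(req)
}

func main() {
	// GATEWAY_URL is a hypothetical variable; the default matches PORT=3000 above.
	base := os.Getenv("GATEWAY_URL")
	if base == "" {
		base = "http://localhost:3000"
	}
	var out []byte
	var err error
	switch {
	case len(os.Args) == 3 && os.Args[1] == "enqueue":
		// Usage: client enqueue <content-type> < payload-file
		out, err = enqueue(base, os.Args[2], os.Stdin)
	case len(os.Args) == 3 && os.Args[1] == "result":
		// Usage: client result <jobId>
		out, err = fetchResult(base, os.Args[2])
	default:
		fmt.Println("usage: client enqueue <content-type> | client result <jobId>")
		return
	}
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		return
	}
	fmt.Println(string(out))
}
```

Because jobs are processed asynchronously by the workers, a result may not be ready immediately after enqueueing; a caller would likely need to retry the `result` call until the job completes.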