neuralbroker/lexora-ai

Lexora AI

Lexora AI is a FastAPI-based document question-answering platform. Users upload PDF, TXT, Markdown, or DOCX files; Lexora extracts and chunks their text, stores embeddings in a per-user FAISS index, and answers natural-language questions using retrieval-augmented generation (RAG) with source attribution.

What the project does

  1. A user registers and authenticates with JWT access and refresh tokens.
  2. The user uploads a supported document.
  3. The backend validates and stores the file, extracts text, splits it into chunks, embeds those chunks, and writes vectors plus metadata to FAISS.
  4. The user asks a question through the chat API.
  5. The retrieval layer embeds the query, searches the user's vector store, optionally filters by selected documents, and builds an LLM context.
  6. The LLM service generates either a normal response or a Server-Sent Events streaming response.
  7. Conversations, messages, documents, and users are persisted in the database.
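
The chunking part of step 3 can be illustrated with a minimal sketch. This is not the project's actual utility: the function name chunk_text, the character-based windowing, and the default sizes are all assumptions chosen for illustration.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper: the real chunker may split on tokens or sentences.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each window advances
    chunks: list[str] = []
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size]
        if piece.strip():  # skip windows that contain only whitespace
            chunks.append(piece)
        if start + chunk_size >= len(text):
            break  # the current window already reached the end of the text
    return chunks
```

Overlapping windows mean a sentence that straddles a chunk boundary still appears whole in at least one chunk, which tends to help retrieval quality at the cost of some duplicate embedding work.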

Tech stack

| Area | Technology |
| --- | --- |
| API | FastAPI, Uvicorn |
| Validation/config | Pydantic v2, pydantic-settings |
| Database | SQLAlchemy 2 async ORM, PostgreSQL in production, SQLite for tests/local checks |
| Auth | JWT via python-jose, password hashing via passlib/bcrypt |
| Vector search | FAISS |
| AI orchestration | LangChain, OpenAI chat and embedding models |
| Cache | Redis async client |
| Background jobs | Celery |
| Observability | Prometheus metrics, structlog JSON logging |
| Tests | pytest, pytest-asyncio, httpx ASGI transport |

Main capabilities

  • User registration, login, refresh token flow, and authenticated /me endpoint
  • Document upload and validation for pdf, txt, md, and docx
  • Text extraction and chunking utilities
  • Per-user vector store isolation with FAISS persistence
  • Retrieval-augmented chat with source metadata
  • Streaming chat responses over SSE
  • Conversation and message history
  • Redis-backed retrieval cache
  • Prometheus metrics endpoint at /metrics
  • Health endpoint at /health and readiness endpoint at /ready
  • Docker Compose stack for PostgreSQL, Redis, app, Celery worker, and Nginx

Project structure

lexoraai/
├── app/
│   ├── api/v1/              FastAPI route modules for auth, documents, and chat
│   ├── core/                Exceptions, logging, and security helpers
│   ├── models/              Pydantic request/response models
│   ├── schemas/             SQLAlchemy ORM models and DB session setup
│   ├── services/            Business logic for documents, chat, embeddings, vectors, cache, retrieval, and LLMs
│   ├── tasks/               Celery worker entry points
│   ├── utils/               Document parsing and text chunking utilities
│   ├── config.py            Environment-driven settings
│   ├── deps.py              FastAPI dependencies
│   └── main.py              Application factory and global routes
├── alembic/                 Database migration assets
├── docker/                  Dockerfile, Compose stack, and Nginx config
├── scripts/                 Operational helper scripts
├── tests/                   Unit and integration tests
├── requirements.txt         Runtime dependencies
├── requirements-dev.txt     Test/lint/dev dependencies
└── pyproject.toml           Tooling configuration

Prerequisites

  • Python 3.11 is recommended for the pinned dependency set.
  • PostgreSQL 15+ for full application use.
  • Redis 7+ for cache and Celery broker/result backend.
  • OpenAI API key for real embedding and chat generation.
  • Docker and Docker Compose if you want the packaged local stack.

Note: the pinned dependencies were validated in this workspace using the existing virtual environment. A full reinstall under Python 3.13 may require dependency upgrades, because packages such as psycopg2-binary==2.9.9 may attempt a source build.

Environment variables

Create a local .env file from the provided template and set at least these values:

| Variable | Purpose | Example |
| --- | --- | --- |
| DATABASE_URL | Async SQLAlchemy database URL | postgresql+asyncpg://postgres:postgres@localhost:5432/lexora |
| REDIS_URL | Redis cache URL | redis://localhost:6379/0 |
| SECRET_KEY | JWT signing key (use a strong 32+ character secret) | change-this-to-a-real-secret-value |
| OPENAI_API_KEY | OpenAI API key for embeddings and chat | sk-... |
| OPENAI_MODEL | Chat model | gpt-4-turbo-preview |
| OPENAI_EMBEDDING_MODEL | Embedding model | text-embedding-3-small |
| OPENAI_EMBEDDING_DIMENSIONS | Embedding vector dimension | 1536 |
| UPLOAD_DIR | Uploaded file storage path | ./uploads |
| FAISS_INDEX_PATH | FAISS index storage path | ./data/faiss |
| CELERY_BROKER_URL | Celery broker URL | redis://localhost:6379/1 |
| CELERY_RESULT_BACKEND | Celery result backend URL | redis://localhost:6379/2 |
| DOCUMENT_PROCESSING_MODE | inline for local/dev processing or background for Celery-based processing | inline |
| CORS_ORIGINS | Comma-separated allowed origins | http://localhost:3000,http://localhost:8000 |
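
To show how a few of these variables might be consumed, here is a standard-library sketch. The project itself uses pydantic-settings, so this is a stand-in rather than the real app/config.py: the Settings fields, the parse_origins helper, and the validation rule are assumptions. Only the get_settings name and its cache_clear() behaviour are mentioned elsewhere in this README.

```python
import os
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class Settings:
    """Illustrative subset of the settings; field names are assumptions."""
    database_url: str
    redis_url: str
    document_processing_mode: str
    cors_origins: tuple[str, ...]


def parse_origins(raw: str) -> tuple[str, ...]:
    """Turn 'a, b,c' into ('a', 'b', 'c'), dropping empty entries."""
    return tuple(part.strip() for part in raw.split(",") if part.strip())


@lru_cache
def get_settings() -> Settings:
    """Read settings from the environment once and cache the result."""
    mode = os.environ.get("DOCUMENT_PROCESSING_MODE", "inline")
    if mode not in {"inline", "background"}:
        raise ValueError(f"Unsupported DOCUMENT_PROCESSING_MODE: {mode!r}")
    return Settings(
        database_url=os.environ["DATABASE_URL"],  # required, no default here
        redis_url=os.environ["REDIS_URL"],        # required, no default here
        document_processing_mode=mode,
        cors_origins=parse_origins(os.environ.get("CORS_ORIGINS", "")),
    )
```

Because the result is cached, code that changes environment variables at runtime (tests, for example) needs to call get_settings.cache_clear() before reading settings again.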

Local setup

1. Create and activate a virtual environment

Linux/macOS:

python3.11 -m venv venv
source venv/bin/activate

Windows PowerShell:

py -3.11 -m venv venv
.\venv\Scripts\Activate.ps1

2. Install dependencies

pip install -r requirements.txt -r requirements-dev.txt

3. Start infrastructure

docker compose -f docker/docker-compose.yml up -d postgres redis

4. Configure .env

cp .env.example .env

Then edit .env with your database, Redis, secret, and OpenAI values.

5. Run the API

uvicorn app.main:app --reload

The API will be available at http://localhost:8000 (Uvicorn's default port), which is the base URL used in the API examples.

Docker Compose setup

To run the full containerized stack:

docker compose -f docker/docker-compose.yml up --build

The Compose stack includes:

  • postgres: PostgreSQL database
  • redis: Redis cache/broker
  • app: FastAPI application
  • celery-worker: Celery worker process
  • nginx: reverse proxy

API examples

Register a user:

curl -X POST http://localhost:8000/api/v1/auth/register \
  -H "Content-Type: application/json" \
  -d '{"email":"user@example.com","password":"password123","full_name":"Demo User"}'

Login:

curl -X POST http://localhost:8000/api/v1/auth/login \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "username=user@example.com&password=password123"

Upload a document:

curl -X POST http://localhost:8000/api/v1/documents \
  -H "Authorization: Bearer YOUR_ACCESS_TOKEN" \
  -F "file=@path/to/document.pdf"

List documents:

curl http://localhost:8000/api/v1/documents \
  -H "Authorization: Bearer YOUR_ACCESS_TOKEN"

Ask a non-streaming chat question:

curl -X POST http://localhost:8000/api/v1/chat/message \
  -H "Authorization: Bearer YOUR_ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"message":"What is this document about?"}'

Ask a streaming chat question:

curl -N -X POST http://localhost:8000/api/v1/chat/stream \
  -H "Authorization: Bearer YOUR_ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"message":"Summarize the key points."}'
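
The streaming endpoint returns Server-Sent Events, so a client has to reassemble events from individual lines. The sketch below shows the generic SSE framing rules only. The function name iter_sse_data is hypothetical, and the shape of Lexora's event payloads (plain text or JSON) is not specified here, so the payload is returned as a raw string.

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each SSE event from an iterable of text lines."""
    buffer: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if field == "data":
            # A single leading space after the colon is not part of the value.
            buffer.append(value[1:] if value.startswith(" ") else value)
    if buffer:
        # Lenient choice: flush a trailing event that lacks a closing blank
        # line. Strict SSE parsers discard it instead.
        yield "\n".join(buffer)
```

Any HTTP client that exposes the response body line by line can feed this generator; which client to use is left open.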

Create a conversation:

curl -X POST http://localhost:8000/api/v1/chat/conversations \
  -H "Authorization: Bearer YOUR_ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"title":"Research notes"}'

Testing

Run the test suite:

python -m pytest

In this workspace, the test suite currently passes:

  • 31 passed
  • Coverage: 46%

Important test notes:

  • Tests use an in-memory SQLite database through dependency overrides.
  • Tests set local environment variables in tests/conftest.py.
  • Tests do not call the real OpenAI API.

Runtime verification performed

The app was started successfully with a SQLite run-check database and test settings. Startup completed, the database initialized, Redis connected, and Uvicorn served the API on 127.0.0.1:8010 before the bounded run command was stopped.

A fresh screenshot of the running app health endpoint was captured from a headless Chromium browser:

[Image: Lexora AI running health check]

To regenerate this screenshot locally, run:

venv/Scripts/python.exe scripts/capture_readme_screenshot.py

Implementation notes and current optimizations

  • Retrieval filtering happens inside the retrieval/vector-search path instead of filtering only source metadata after context construction. This avoids sending context from documents outside a requested document filter.
  • Document processing can run inline for local development or be queued to Celery with DOCUMENT_PROCESSING_MODE=background.
  • JWTs include jti identifiers, logout revokes the current access token in Redis until token expiry, and refresh-token rotation blacklists the used refresh token.
  • Database and token timestamps use timezone-aware UTC values instead of deprecated datetime.utcnow() calls.
  • Chat history is now passed to the LLM as real user/assistant turns instead of user-only messages with blank assistant responses.
  • FAISS metadata stores embeddings alongside chunk metadata so index rebuilds after deletion avoid unnecessary re-embedding when possible.
  • A module-level get_file_type() wrapper exists for compatibility with the test suite and public utility-style imports.
  • Test settings cache clearing uses get_settings.cache_clear(), which is the actual cached settings function.
  • FAISS indexes are isolated per user under FAISS_INDEX_PATH/<user_id>/.
  • Redis retrieval cache keys include user ID, query hash, and document filter to prevent cross-user or cross-filter leakage.
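
The cache-key note above can be sketched as follows. This is an assumption-laden illustration, not the project's code: the function name, the key layout, the query normalization, and the use of SHA-256 are all hypothetical. The point is only that user ID, query hash, and document filter each contribute to the key.

```python
from __future__ import annotations

import hashlib


def retrieval_cache_key(
    user_id: str, query: str, document_ids: list[str] | None = None
) -> str:
    """Build a cache key scoped to one user, one query, and one document filter."""
    # Normalizing the query (assumption) lets trivially different spellings share an entry.
    query_hash = hashlib.sha256(query.strip().lower().encode("utf-8")).hexdigest()
    if document_ids:
        # Sort and de-duplicate so the same filter in a different order maps to one key.
        joined = ",".join(sorted(set(document_ids)))
        filter_part = hashlib.sha256(joined.encode("utf-8")).hexdigest()[:16]
    else:
        filter_part = "all"  # no document filter requested
    return f"retrieval:{user_id}:{query_hash}:{filter_part}"
```

Putting the user ID in the key as a literal segment, rather than only inside a hash, also makes it possible to find or purge one user's entries by key prefix.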

Known limitations and next improvements

  • FAISS still rebuilds the user index on deletion. Stored embeddings reduce rebuild cost, but high-churn production deployments should still consider a deletion-friendly vector database or FAISS strategy.
  • Document processing defaults to inline for safer local development. Set DOCUMENT_PROCESSING_MODE=background in production when the Celery worker is running.
  • Test coverage is improved for chat-history behavior, but document ingestion, retrieval, vector storage, cache, and full chat orchestration should still receive more unit/integration tests.
  • SECRET_KEY defaults are development-only and must be overridden in production.

Security considerations

  • Never commit real .env secrets or OpenAI API keys.
  • Use a strong production SECRET_KEY.
  • Restrict CORS_ORIGINS to trusted frontend origins.
  • Token revocation uses Redis, so production deployments should keep Redis highly available.
  • Add rate limiting enforcement before public deployment.
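
The reliance of token revocation on Redis follows from how jti-based revocation typically works: a revoked token's ID is stored with a time-to-live equal to the token's remaining lifetime. The sketch below uses only the standard library and does not sign tokens or talk to Redis. The function names and the revoked: key prefix are hypothetical.

```python
from __future__ import annotations

import uuid
from datetime import datetime, timedelta, timezone


def build_claims(subject: str, lifetime: timedelta) -> dict:
    """Create JWT claims with a unique jti and timezone-aware UTC timestamps."""
    now = datetime.now(timezone.utc)
    return {
        "sub": subject,
        "jti": uuid.uuid4().hex,  # unique token identifier
        "iat": int(now.timestamp()),
        "exp": int((now + lifetime).timestamp()),
    }


def revocation_ttl(claims: dict, now: datetime | None = None) -> int:
    """Seconds until the token expires; never negative."""
    now = now or datetime.now(timezone.utc)
    return max(0, claims["exp"] - int(now.timestamp()))


def blacklist_key(claims: dict) -> str:
    """Redis key under which a revoked token would be recorded (hypothetical prefix)."""
    return f"revoked:{claims['jti']}"


# With an async Redis client, revocation could then look like:
#     ttl = revocation_ttl(claims)
#     if ttl > 0:
#         await redis.set(blacklist_key(claims), "1", ex=ttl)
```

If Redis is unavailable, the application cannot tell whether a presented token was revoked, which is why the availability of Redis matters for security and not only for performance.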

License

MIT License
