LexiGPT is a retrieval-augmented generation (RAG) chatbot that provides intelligent, personalized book recommendations. It combines semantic search with large language models (LLMs) from providers such as OpenAI and DeepSeek to process natural language queries. Built with FastAPI and a React frontend, LexiGPT enables fast, accurate book discovery through a conversational web interface.
- LLM-Powered Responses: Utilizes OpenAI and DeepSeek to generate intelligent book recommendations.
- Semantic Search: Employs vector embeddings to identify books based on contextual relevance rather than keywords.
- Real-Time Streaming: Delivers instantaneous, conversational book recommendations.
- Multi-LLM Support: Allows configuration of various language model providers.
- REST API: Built with FastAPI to ensure efficient and high-performance search capabilities.
- Query Processing: Transforms user queries into vector embeddings using a Sentence Transformer model.
- Semantic Search: Matches query embeddings with precomputed book embeddings using cosine similarity.
- Ranking & Retrieval: Fetches the top-k most relevant books based on similarity scores.
- LLM Enhancement: Refines and contextualizes the retrieved books with a language model to produce richer recommendations.
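The retrieval steps above can be sketched in a few lines. This is a minimal illustration using toy 4-dimensional vectors in place of real Sentence Transformer embeddings, not the project's actual code:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k_books(query_emb, book_embs, k=3):
    """Rank precomputed book embeddings against the query and return top-k indices."""
    scored = sorted(
        range(len(book_embs)),
        key=lambda i: cosine_similarity(query_emb, book_embs[i]),
        reverse=True,
    )
    return scored[:k]

# Toy embeddings standing in for model.encode() output.
book_embeddings = [
    [1.0, 0.0, 0.0, 0.0],   # book 0
    [0.0, 1.0, 0.0, 0.0],   # book 1
    [0.9, 0.1, 0.0, 0.0],   # book 2
]
query_embedding = [1.0, 0.05, 0.0, 0.0]
print(top_k_books(query_embedding, book_embeddings, k=2))  # [0, 2]
```

In the real pipeline the vectors come from a Sentence Transformer model and the search runs over the full precomputed book index, but the ranking logic is the same.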
LexiGPT allows users to search for books using natural language descriptions. It integrates with the OpenLibrary Web Service to fetch book data and uses LLMs to refine and enhance search results. The goal is to deliver fast, relevant recommendations with response times of 1-3 seconds.
Finding the right book can be challenging when you don't have an exact title or author in mind. LexiGPT solves this by interpreting natural language queries, identifying key details, and retrieving relevant book recommendations in real-time.
LexiGPT follows a client-server architecture for scalability and performance.
- Tech Stack: React, Fetch API (for HTTP requests)
- Core Features:
- Conversational Search → Supports flexible, natural language queries.
- Dynamic Recommendations → Displays book titles, authors, and summaries based on user input.
- Responsive UI → Optimized for desktop and mobile.
- Tech Stack:
- FastAPI → High-performance Python backend.
- LLM APIs → Uses OpenAI, DeepSeek, and HF APIs for query processing.
- Docker → Containerized for portability.
- Core Features:
- Query Understanding → Extracts structured details (title, author, genre) from user input.
- Information Retrieval → Fetches book metadata from OpenLibrary.
- AI-Enhanced Responses → Summarizes and ranks recommendations using LLMs.
- Caching for Performance → Stores frequent queries to reduce API calls and improve speed.
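The caching feature can be illustrated with a small in-memory TTL cache keyed by the normalized query. The backend actually uses Redis, so treat this class and its names as a hypothetical stand-in for the idea rather than the project's implementation:

```python
import time

class QueryCache:
    """Tiny TTL cache standing in for the Redis caching layer (illustrative only)."""

    def __init__(self, ttl_seconds: float = 300.0):
        self.ttl = ttl_seconds
        self._store = {}  # normalized query -> (expiry timestamp, result)

    def _key(self, query: str) -> str:
        # Normalize case and whitespace so trivially different queries share an entry.
        return " ".join(query.lower().split())

    def get(self, query: str):
        entry = self._store.get(self._key(query))
        if entry is None:
            return None
        expiry, result = entry
        if time.monotonic() > expiry:  # stale entry: evict and report a miss
            del self._store[self._key(query)]
            return None
        return result

    def set(self, query: str, result) -> None:
        self._store[self._key(query)] = (time.monotonic() + self.ttl, result)

cache = QueryCache(ttl_seconds=60)
cache.set("Sci-Fi  about AI", ["Neuromancer"])
print(cache.get("sci-fi about ai"))  # hit despite case/spacing differences
```

Checking the cache before calling the LLM avoids repeated API calls for popular queries, which is where most of the latency savings come from.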
This section outlines how environment variables are configured and managed across environments for both the frontend and backend applications. Properly managing environment variables ensures security, scalability, and seamless deployments.
For local development, environment variables are stored in .env files. Each service (frontend/backend) loads the required variables from these files to simulate a real production environment.
Before running the project for the first time, copy the .env.example template to create a local .env.development file:

cp .env.example .env.development

This ensures all required environment variables are properly set.
The React application uses .env.development to configure local API endpoints and settings.
REACT_APP_API_BASE_URL=http://localhost:8000
REACT_APP_ENV=development
We can use process.env to access these variables in the application:
const API_BASE_URL = process.env.REACT_APP_API_BASE_URL;

The FastAPI backend loads environment variables using python-dotenv.
For example, REDIS_PASSWORD is loaded from .env.development as follows:
from dotenv import load_dotenv
import os
load_dotenv(".env.development")
REDIS_PASSWORD = os.getenv("REDIS_PASSWORD")
if not REDIS_PASSWORD:
raise ValueError("Missing required environment variable: REDIS_PASSWORD")

We add error handling so the application fails gracefully if critical environment variables are missing.
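The same fail-fast pattern extends naturally to several variables at once. The helper below is a hypothetical sketch (not part of the project) that reports every missing variable in a single error instead of failing on the first one:

```python
import os

# Names taken from the variables used elsewhere in this guide.
REQUIRED_VARS = ["REDIS_PASSWORD", "JWT_SECRET_KEY", "FRONTEND_ORIGIN"]

def validate_env(required=REQUIRED_VARS, env=os.environ):
    """Return the required variables as a dict, or raise listing all missing ones."""
    values = {name: env.get(name) for name in required}
    missing = [name for name, value in values.items() if not value]
    if missing:
        raise ValueError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return values
```

Calling validate_env() once at startup surfaces every misconfigured variable together, which shortens the edit-restart loop during deployment.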
In CI/CD pipelines, environment variables are not stored in .env files. Instead, they are securely stored in GitHub Secrets.
We can manage environment variables with secrets:
Secrets are managed under Repository Settings → Secrets and variables → Actions.
Ensure the following secrets are set before running any CI/CD workflows:
REDIS_PASSWORD=supersecretpassword
JWT_SECRET_KEY=supersecurejwtkey
FRONTEND_ORIGIN=https://staging.yourdomain.com
GitHub Actions automatically injects secrets into workflows:
env:
REDIS_PASSWORD: ${{ secrets.REDIS_PASSWORD }}
JWT_SECRET_KEY: ${{ secrets.JWT_SECRET_KEY }}

In production, environment variables are securely stored in AWS Secrets Manager and injected into ECS Task Definitions.
AWS allows us to store secrets as key-value pairs.
aws secretsmanager create-secret --name REDIS_PASSWORD --secret-string "productionsecret"
aws secretsmanager create-secret --name JWT_SECRET_KEY --secret-string "productionjwtsecret"

In AWS ECS, secrets are injected at runtime through task definitions.
environment:
- name: REDIS_PASSWORD
valueFrom: arn:aws:secretsmanager:us-east-1:123456789012:secret:REDIS_PASSWORD
- name: JWT_SECRET_KEY
valueFrom: arn:aws:secretsmanager:us-east-1:123456789012:secret:JWT_SECRET_KEY

Environment variables are passed from AWS Secrets Manager to running containers using task definitions:
{
"containerDefinitions": [
{
"name": "backend",
"image": "backend-image:latest",
"memory": 512,
"cpu": 256,
"essential": true,
"environment": [
{
"name": "FRONTEND_ORIGIN",
"value": "https://yourdomain.com"
}
],
"secrets": [
{
"name": "REDIS_PASSWORD",
"valueFrom": "arn:aws:secretsmanager:us-east-1:123456789012:secret:REDIS_PASSWORD"
},
{
"name": "JWT_SECRET_KEY",
"valueFrom": "arn:aws:secretsmanager:us-east-1:123456789012:secret:JWT_SECRET_KEY"
}
]
}
]
}

- AWS ECS reads secrets from AWS Secrets Manager.
- The container loads environment variables from those secrets.
- The backend can access these variables using os.getenv().
If a required environment variable is missing, the application is designed to fail gracefully with an error message.
import os
JWT_SECRET_KEY = os.getenv("JWT_SECRET_KEY")
if not JWT_SECRET_KEY:
raise ValueError("Missing required environment variable: JWT_SECRET_KEY")

This guide walks you through the steps to set up both the frontend and backend development environments. Before you begin, ensure you have the following prerequisites installed:
- Node.js
- Yarn
- Docker Desktop
This guide will walk you through the process of setting up Playwright for integration testing in your CRA TypeScript project, targeting both your frontend and API server. We'll use Yarn for dependency management.
Add playwright as a development dependency:
yarn add --dev @playwright/test

Next, download the browser binaries (Chromium, Firefox, WebKit) required for testing:

npx playwright install

Create a file named playwright.config.ts in the root of your project and add the following configuration:
import { defineConfig, devices } from "@playwright/test";
export default defineConfig({
testDir: "./tests", // Directory where your tests will live
timeout: 30000, // Global timeout for each test in milliseconds
expect: {
timeout: 5000, // Timeout for expect assertions
},
use: {
// Set a base URL for your tests (adjust as needed)
baseURL: "http://localhost:3000",
// Capture screenshots on test failure
screenshot: "only-on-failure",
},
projects: [
{
name: "Chromium",
use: { ...devices["Desktop Chrome"] },
},
{
name: "Firefox",
use: { ...devices["Desktop Firefox"] },
},
{
name: "WebKit",
use: { ...devices["Desktop Safari"] },
},
],
});

The frontend application is built with React. Follow these steps to launch the development server:
Open your terminal and change your current directory to the /frontend directory:
cd frontend

Run the following command to start the development server:
yarn run start

- This command launches the React development server.
- You should see logs indicating that the server is running, and the application will typically open in your default browser.
The backend is powered by FastAPI and uses Redis for caching/session management. Docker Compose is used to build and run the services.
Open your terminal and change your current directory to the /backend directory:
cd backend

Run the following command to build and launch all the backend services:
docker compose up --build

This command does the following:
- Builds Docker Images: It builds the Docker images for each service defined in the docker-compose.yml file using the respective Dockerfiles.
- Starts Containers: It launches the containers for:
- API Service: Runs the FastAPI application, which depends on Redis for caching and session storage. The API is bound to 127.0.0.1 on port 8000 for local access.
- Redis Service: Uses the lightweight redis:alpine image with a custom configuration file and a persistent volume (redis_data) for storing data. It also includes a health check to ensure Redis is running correctly.
We can set up our development environment by running:
docker compose up --build

This command will spin up the following services:
- Web Interface (React Application)
- REST API (Python)
- Caching Server (Redis)
Then, run automated tests in docker with the command:
docker compose -f docker-compose.test.yml up --build

This will run all our tests inside a controlled environment.
To follow the service logs, run:

docker compose logs -f

The RAG system enhances LLM responses by incorporating external knowledge through a multi-step process:
- Objective: Gather book metadata via OpenLibrary’s API.
- Method:
- Extract Subjects:
def extract_subjects(url="https://openlibrary.org/subjects") -> List[str]: # Code to parse HTML and extract subjects...
- Collect Books per Subject:
def fetch_books_by_subject(subject: str, limit=100) -> List[Dict[str, Any]]: # Code to query the API and structure metadata...
- Output: A JSON file containing subjects and book lists.
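Once a subject's raw API payload is fetched, flattening it into book records is straightforward. The payload shape below ("works" entries carrying "title", "authors", and "first_publish_year") is an assumption based on OpenLibrary's subjects API and should be verified against live responses:

```python
from typing import Any, Dict, List

def parse_subject_response(payload: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten an assumed OpenLibrary /subjects/<name>.json payload into book records."""
    books = []
    for work in payload.get("works", []):
        books.append({
            "title": work.get("title", ""),
            "author": ", ".join(a.get("name", "") for a in work.get("authors", [])),
            "year": work.get("first_publish_year"),
            "subjects": payload.get("name", ""),
        })
    return books

# A sample payload in the assumed shape, for illustration.
sample = {
    "name": "science_fiction",
    "works": [
        {"title": "Dune", "authors": [{"name": "Frank Herbert"}],
         "first_publish_year": 1965},
    ],
}
print(parse_subject_response(sample))
```

Keeping the parsing separate from the HTTP call makes it easy to unit-test against saved payloads without hitting the API.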
- Objective: Normalize and structure text for better semantic search.
- Method:
- Text Normalization: Convert text to lowercase, remove special characters, tokenize, and lemmatize.
def normalize_text(text: str) -> str: # Code to clean and lemmatize text...
- Metadata Formatting: Standardize titles, authors, and subjects.
- Output: Cleaned and structured data ready for embedding.
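A minimal normalize_text might look like the sketch below. It covers lowercasing, special-character removal, and whitespace tokenization; lemmatization is omitted here because it would require an external library such as NLTK or spaCy:

```python
import re

def normalize_text(text: str) -> str:
    """Lowercase, strip non-alphanumeric characters, and collapse whitespace."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)  # drop punctuation and symbols
    tokens = text.split()                      # simple whitespace tokenization
    return " ".join(tokens)

print(normalize_text("The Left-Hand of Darkness!"))  # "the left hand of darkness"
```

Applying the same normalization to both book metadata and user queries keeps the embedding space consistent on both sides of the similarity search.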
- Objective: Transform textual metadata into vector embeddings for similarity searches.
- Method:
- Format for Embedding:
def format_book_for_embedding(book: dict) -> str:
return f"Title: {book.get('title', '')}. Author: {book.get('author', '')}. Subjects: {book.get('subjects', '')}. Year: {book.get('year', '')}."

- Create Embeddings:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("all-MiniLM-L6-v2")
embedding = model.encode(["Example text for embedding"])

- Output: A JSON file containing vector embeddings for each book.
- Objective: Use LLMs to refine and enhance retrieved book data.
- Method:
- Query Processing: Transform user queries into embeddings.
- Retrieve and Rank: Use cosine similarity to identify top-k matching books.
- Generate Enhanced Responses: Leverage OpenAI or DeepSeek to produce detailed recommendations.
- Output: Final, enriched recommendations delivered to the user.
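The final step can be sketched as assembling the retrieved books into a grounding prompt for the LLM. The prompt wording and helper name here are illustrative, not the project's actual templates:

```python
def build_rag_prompt(query: str, books: list) -> str:
    """Compose a prompt that grounds the LLM in the retrieved book metadata."""
    context_lines = [
        f"- {b.get('title', 'Unknown')} by {b.get('author', 'Unknown')} ({b.get('year', 'n.d.')})"
        for b in books
    ]
    return (
        "You are a book recommendation assistant.\n"
        "Recommend from the following retrieved books only:\n"
        + "\n".join(context_lines)
        + f"\n\nUser request: {query}\n"
        "Explain briefly why each suggestion fits."
    )

prompt = build_rag_prompt(
    "melancholy sci-fi about memory",
    [{"title": "Solaris", "author": "Stanisław Lem", "year": 1961}],
)
print(prompt)
```

Constraining the model to the retrieved list is what makes this retrieval-augmented: the LLM adds fluency and reasoning, while the semantic search supplies the facts.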
For more detailed insight, read the full Medium article.
This section walks through deploying the backend service on AWS using IAM, ECR, and ECS. The deployment process ensures a secure, scalable, and containerized environment for the application.
Before we begin, ensure you have:
- An AWS account with administrative access.
- The AWS CLI installed and configured.
- Docker installed on your local machine.
- Your backend service containerized using Docker.
- Log in to your AWS account and navigate to the AWS IAM Console.
- In the left sidebar, click Users → Create User.
- Provide a username (e.g., backend-deploy-user).
- Select Access Key - Programmatic Access.
Attach the following policies to grant the necessary permissions:
- AmazonEC2ContainerRegistryFullAccess – Allows full access to Amazon ECR (push, pull, delete images).
- AmazonECS_FullAccess – Provides full permissions to create and manage ECS resources.
- IAMFullAccess – Enables role and permission management.
- CloudWatchLogsFullAccess – Allows access to CloudWatch for ECS logging.
- AmazonS3ReadOnlyAccess – Grants read-only access to S3, useful if storing static assets or configuration.
- SecretsManagerReadWrite – Grants read/write access to secrets in AWS Secrets Manager.
- In the IAM user settings, navigate to Security Credentials.
- Scroll to the Access Keys section and click Create Access Key.
- Copy both the Access Key ID and Secret Access Key. Store these securely.
- Install the AWS CLI (on macOS, via Homebrew) by running the following command in your terminal:

brew install awscli

- Run the following command and enter the IAM credentials when prompted:

aws configure

- Open the Amazon ECR Console.
- Click Create Repository.
- Enter a repository name (e.g., backend-service).
- Select Private Repository.
- Click Create and note the repository URI.
To push Docker images to ECR, first authenticate your local Docker client:
aws ecr get-login-password --region ${AWS_REGION} | docker login --username AWS --password-stdin ${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com

Ensure you replace the following variables:

- ${AWS_REGION} with your AWS region (e.g., us-east-2)
- ${AWS_ACCOUNT_ID} with your AWS account ID.
Once the Amazon ECR repository is set up, we need to build, tag, and push the Docker image. This ensures that the backend service is properly containerized and stored in AWS Elastic Container Registry (ECR) for deployment.
We will use docker buildx to build the image. This ensures compatibility with AWS Fargate, which runs on linux/amd64 architecture.
docker buildx build \
--platform linux/amd64 \
--provenance=false \
-t ${ECR_REPOSITORY_NAME}:latest \
--load .

- --platform linux/amd64 → Ensures compatibility with AWS Fargate.
- --provenance=false → Disables provenance metadata, reducing build time.
- -t ${ECR_REPOSITORY_NAME}:latest → Tags the image locally as latest.
- --load → Loads the built image into the local Docker daemon.
Tagging is required so the image can be correctly referenced when pushing to ECR.
docker tag ${ECR_REPOSITORY_NAME}:latest \
${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com/${ECR_REPOSITORY_NAME}:latest

- ${ECR_REPOSITORY_NAME}:latest → The locally built image.
- ${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com/${ECR_REPOSITORY_NAME}:latest → The ECR repository where the image will be pushed.
docker push ${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com/${ECR_REPOSITORY_NAME}:latest

Now that our Docker image is stored in Amazon ECR, we can deploy it to Amazon ECS (Elastic Container Service) using Fargate, AWS's serverless container orchestration service.
- Open the ECS Console.
- In the left sidebar, click Clusters → Create Cluster.
- Select Fargate (serverless).
- Configure the cluster settings as needed (default settings are fine for most cases).
- Click Create.
A task definition tells ECS how to run a container, including CPU/memory limits, networking, and container settings.
- Navigate to Task Definitions → Create New Task Definition.
- Choose Fargate as the launch type and click Next.
- Configure the container settings:
- Container Name: backend-service
- Image URI: Use the ECR repository URI from Step 2.
- Memory/CPU: Choose an appropriate size (e.g., 512 MB / 0.25 vCPU).
- Port Mappings: Set 8000 (or the port your backend listens on).
- Click Create.
A service ensures that the task (container) runs continuously and handles scaling.
- Navigate to ECS Clusters and select your cluster.
- Click Create Service.
- Configure the service:
- Launch Type: Fargate
- Task Definition: Select the one created in Step 3.2.
- Service Name: backend-service
- Number of Tasks: Set an appropriate number (1 for testing, scale up for production).
- Click Deploy.
- Navigate to your ECS Cluster → Services.
- Click on your running service and note the Public IP of the running task.
- Test the deployment by sending a request:
curl http://${PUBLIC_IP}:8000

If the service responds correctly, your backend is now successfully running on AWS ECS Fargate.
When changes are made to the backend service, the updated Docker container needs to be deployed to Amazon ECS to ensure the latest version of the application is running in the cluster. Below are the steps to build, tag, push, and deploy the updated container.
First, build the Docker image for the backend service. This step compiles the application into a container that can be deployed to ECS.
docker buildx build \
--platform linux/amd64 \
--provenance=false \
-t ${ECR_REPOSITORY_NAME}:latest \
--load .

After building the image, tag it with the full ECR repository URL. This step prepares the image to be pushed to the Amazon Elastic Container Registry (ECR).
docker tag ${ECR_REPOSITORY_NAME}:latest \
${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com/${ECR_REPOSITORY_NAME}:latest

docker push ${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com/${ECR_REPOSITORY_NAME}:latest

Finally, update the ECS service to use the newly pushed container image. This step triggers a new deployment, replacing the old containers with the updated version.
aws ecs update-service --cluster ${ECS_CLUSTER_NAME} --service ${ECS_SERVICE_NAME} --force-new-deployment --region ${AWS_REGION}

Run the following command to attach the SecretsManagerReadWrite AWS Managed Policy to your IAM user:
aws iam attach-user-policy \
--user-name ${AWS_IAM_USER} \
--policy-arn arn:aws:iam::aws:policy/SecretsManagerReadWrite

To set a single environment variable, run:
aws amplify update-app \
--app-id ${AWS_AMPLIFY_APP_ID} \
--environment-variables ${ENV_NAME}=${VALUE}

We can also load multiple variables by creating an env.json file:
{
"REDIS_PASSWORD": "your-secure-password",
"FRONTEND_ORIGIN": "https://yourdomain.com",
"API_KEY": "12345abcdef"
}

This section outlines the steps required to deploy the frontend of the Book Search Application using AWS Amplify. AWS Amplify automates the deployment process, integrating directly with GitHub for continuous deployment. By following these steps, you can ensure a smooth and efficient deployment process.
Before starting the deployment process, ensure the following steps are completed:
- AWS Account: You need an AWS account with the necessary permissions to create and manage Amplify applications.
- GitHub Repository: The frontend code should be hosted in a GitHub repository.
- Node.js and Yarn: Ensure Node.js and Yarn are installed on your local machine.
Run the linting tool to ensure the code adheres to the project's coding standards:
yarn run lint

Generate an optimized production build of the application:

yarn run build

This command compiles the React application into a set of static files that can be served by a web server.
- Go to the AWS Amplify Console.
- Sign in with your AWS credentials.
- Click the "Create New Application" button to start the process.
- Select GitHub as the source code provider.
- Choose the name of the repository from your GitHub account.
- Select the main branch for deployment.
Since the frontend application is part of a monorepo, specify the root directory:
- Set the root directory to frontend so that Amplify knows where to find the source code.
- Click the "Next" button to proceed.
- AWS Amplify will automatically build and deploy the application to the cloud.
Once the deployment is complete, the application will be accessible at the provided Amplify URL.