AI Agent

This project uses a pre-trained LLM to create an AI agent that assists with coding tasks. Users can provide command-line instructions for the program to analyze a codebase, identify issues, and apply corrections.

Project Demo:

Tech Stack

Frontend: n/a
Backend: Python, Google Gemini 2.5 Flash, JSON (for API usage logging)

Project Structure

ai-agent/
├── public/                     # Media assets
├── calculator/                 # Calculator app logic
│   ├── calculator.py           # Implements calculator class functionality
│   └── render.py               # Handles output rendering of expressions and results
├── functions/                  # Contains files with helper functions for AI agent to
├── .gitignore                  # Git ignore rules
├── .env                        # Environmental variables
├── main.py                     # Entry point for CLI agent
├── quota_tracker.py            # Tracks API usage and persists logs
├── quota_log.json              # JSON file persisting daily and minute usage logs
├── main.py                     # Entry point for CLI agent
├── tests.py                    # Test scripts for call functions
├── pyproject.toml              # Project configuration and dependencies
├── uv.lock
├── README.md

# Before running this project locally, ensure you have the following installed:
- IDE (VS Code, PyCharm, etc.)
- Install Python 3.10+ version > visit python.org/downloads/

# Install dependencies
- uv project/package manager > docs.astral.sh/uv/getting-started/installation/
- Environmental variables > uv add python-dotenv==1.1.0
- Google Gemini API > uv add google-genai==1.12.1

Quick Start

This repo will later be, if not already, saved as a subfolder. Be sure to only clone relevant files. Then, do the following:

Virtual Environment Setup

All-in-one command to create project: uv init project-name && cd project-name
Create virtual environment: uv venv
Use uv's project environment and avoid pyenv/global mismatches: uv add requests && uv run python main.py
Activate virtual environment: source .venv/bin/activate

Environmental Variables Setup

Create a .env file with the following contents:

GEMINI_API_KEY=""
AI_MODEL="gemini-2.5-flash"
MAX_CHAR_LIMIT=1000
SYSTEM_PROMPT=""
WORKING_DIR=""
MAX_ITERATIONS=5

This project used Google Gemini 2.5 Flash. At the time of this writing a free tier existed. Regardless of the AI provider you choose, you should set values for each variable shown. Since the requests per minute (RPM) was 5 for the agent chosen, the value for MAX_ITERATIONS was set to this value.

API Key Setup

Create an API Key on Google AI Studio
Store API Key inside .env file on the GEMINI_API_KEY environmental variable
Add .env file to .gitignore

Usage

User Prompts

After the virtual environment has been activated, users should use the following prompt format:

python main.py 'ENTER YOUR PROMPT HERE' [--verbose]

The program will throw an error if a prompt is not entered after the program file name. Optionally, users can use a --verbose statement in the prompt for the response to report token input and output metadata.

Safeguards

The program grants read and write privileges to a codebase. This can be dangerous! Safeguards taken throughout this project included: (1) safely storing API keys; (2) limiting read-write privileges to a single directory; (3) protecting against directory traversal; (4) using timeout limits when running subprocesses; (5) setting an iteration limit to avoid infite agent call loops; and (6) setting a character limit for output to preserve tokens. Error handling was used, but did not cover all edge cases.

Safegaurds (1), (2), (5), and (6) are implemented via environmental variables in the the .env file, which the .gitignore file then protects from public exposure. Safeguards (3) and (4) are implemented via files in the ./functions/ folder.

Lastly, at the end of each iterative call to the AI agent, the console prints out observability metrics. This includes API consumption levels and usage rate limits for requests per day (RPD), requests per minute (RPM), and tokens per minute (TPM). This is orchestrated by the quota_tracker.py file that reads and writes data to the quota_log.json file.

System Design

1. Requirements

Functional Requirements:

User submits a user_prompt in command-line interface (CLI)
For a single WORKING_DIRECTORY, the AI agent can inspect files, read contents, write changes, and execute Python files
Program ends with model response that satisfies initial prompt

Optional Requirements:

Verbose mode: if user ends initial prompt with --verbose tag for model response to include additional metadata
Low latency: use timeout limits when running subprocesses
Scalability: limiting output character length to preserve tokens (MAX_CHAR_LIMIT in .env file)
Security: (1) safely storing API keys (GEMINI_API_KEY in .env file); (2) limit read-write privileges to a single directory (WORKING_DIRECTORY in .env file); (3) protecting against directory traversal (conditional checks in ./functions/ files); (4) setting an iteration limit to avoid infite model + function execution loops (MAX_ITERATIONS in .env file)
CAP Theorem: prioritize consistency (read/writes must be correct) over availability
Observability: provide a log of API consumption and usage rate limits

2. Core Entities

UserPrompt: user_prompt (string), is_verbose (boolean)
ModelSettings: GEMINI_API_KEY, AI_MODEL, SYSTEM_PROMPT, available_functions (function schemas), conversation_history (content messages)
ModelCalls: MAX_ITERATIONS, current_iteration, model_response
FunctionCalls: WORKING_DIR, function_name, arguments, function_response
APIUsageLogs: tracked in quota_log.json as JSON arrays:
- "daily_requests": timestamps of all requests for the current day
- "minute_requests": timestamps of requests in the last 60 seconds
- "minute_tokens": tuples of (timestamp, input tokens) in the last 60 seconds

3. API (or Interface)

Command-Line Interface (CLI):

python main.py 'user_prompt' [--verbose]

Model Call:

call_model(model_settings, conversation_history) -> model_response(result | function_calls | error)

Function Calls:

call_function(function_name, arguments, WORKING_DIR) -> function_response (result | error)

In reality, several function call files were written. However, each generally follows this convention.

4. Data Flow

I created the diagram above to illustrate function calling with Gemini API. The four numbers shown correspond with the article steps outlined, which are used throughout this project for developer documentation. The overall workflow follows this order and is expanded upon in a table below:

Function Declarations: The ./functions/schemas.py file defined all external functions to be used by the model. The ./functions/call_function.py file bundled these functions into a types.Tool() object.
Call Model: The Gemini model is executed. This requires providing the model settings: GEMINI_API_KEY, AI_MODEL, SYSTEM_PROMPT, user prompt, conversation history, and function declarations.
Call Functions: Function calls are executed. This requires parsing the model response from step (2) to determine function names and arguments to call. The WORKING_DIR must be provided on each function call.
Model Response: The Gemini model is executed again within an agent loop that repeats steps (2) and (3). At the end of each cycle, the function response is parsed for content to update the conversation history. The agent loop runs at most for MAX_ITERATIONS and final output has a character limitation of MAX_CHAR_LIMIT to preserve tokens. Final output includes information related to API consumption and usage rate limits. This information is taken froma JSON file that is updated between iterations of the AI agent.

5. High Level Design

The project currently reprsents a single-node, agent-driven architecture without a frontend, backend server, or database. The project could be improved to be more robust in the following manner:

Client: Use HTML and JavaScript to create an asynchronous web application using WebSockets over a TCP connection. This will allow the UI to provide a real-time log of the agent's thought process after a user submits a prompt and waits for a response.
API Gateway: This is the single, centralized entry point and security perimeter to the backend services. Security benefits include handling authentication/authorization and rate limiting. Scalability benefits include horizontal load balancing and service routing.
Server (Agent Orchestration): This service runs the core agent logic (planningn and action loop). Can be implemented as a stateless microservice that passes user requests to an asynchronous task queue (e.g., Redis or Kafka). This would allow for horizontal scaling and improved client response times by allowing tasks to execute independently.
Database: May use two database types. Relational database may store user account information and system logs. Vector database may store embeddings of conversation history for retrieval-augmented generation (RAG).

Credits and Contributing

Boot.dev provided the project requirements and guidance to complete this project. Modifications were made to follow function calling guidance from Google. The Google Gen AI SDK for Python was used as a source of truth for development. Contributions are welcome! Feel free to report any problems.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

AI Agent

Tech Stack

Project Structure

Quick Start

Virtual Environment Setup

Environmental Variables Setup

API Key Setup

Usage

User Prompts

Safeguards

System Design

1. Requirements

2. Core Entities

3. API (or Interface)

4. Data Flow

5. High Level Design

Credits and Contributing

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 62 Commits
calculator		calculator
functions		functions
public		public
.gitignore		.gitignore
.python-version		.python-version
README.md		README.md
main.py		main.py
pyproject.toml		pyproject.toml
quota_log.json		quota_log.json
quota_tracker.py		quota_tracker.py
tests.py		tests.py
uv.lock		uv.lock

Folders and files

Latest commit

History

Repository files navigation

AI Agent

Tech Stack

Project Structure

Quick Start

Virtual Environment Setup

Environmental Variables Setup

API Key Setup

Usage

User Prompts

Safeguards

System Design

1. Requirements

2. Core Entities

3. API (or Interface)

4. Data Flow

5. High Level Design

Credits and Contributing

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages