Mokshii46/CODESAGE

🧠 CodeSage: AI Code Interpreter & Explainer



🌿 Overview

CodeSage is a Python framework that combines compiler principles, tree-walk interpretation, and AI-based summarization so that it not only runs your code but also explains it in plain English.

It reads Python source code, tokenizes it, builds an Abstract Syntax Tree, interprets it live, and produces a structured plain-English summary of what the code does. Everything is available through either a Tkinter GUI or a terminal mode.

📖 Documentation | 🐙 GitHub


✨ Key Features

  • Scanner / Lexer: Tokenizes raw source character by character; catches unrecognized symbols early
  • Recursive Descent Parser: Produces a full AST with meaningful syntax error messages
  • AST Summarizer: Converts loops, conditionals, and assignments into readable plain English
  • Tree-Walk Interpreter: Evaluates expressions and executes code live from the AST
  • NLP / GPT Integration (optional): AI-powered line-by-line explanations via GPT-4o-mini
  • GUI Mode: Tkinter interface with code editor, output console, AST summary panel, and colored AST tree
  • Terminal Mode: Scanner output, parser AST, plain English summary, and execution result in the terminal
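To make the scanner stage concrete, here is a minimal sketch of a character-by-character lexer that rejects unrecognized symbols. It is an illustration only, not the code in codesage/scanner.py: the names `Token`, `ScanError`, `scan`, and the keyword and operator sets are assumptions, and indentation handling is left out.

```python
from dataclasses import dataclass

# Hypothetical token vocabulary; the real scanner may differ.
KEYWORDS = {"if", "elif", "else", "while", "for", "in", "def", "return"}
TWO_CHAR_OPS = {"==", "!=", "<=", ">="}   # checked before single characters
ONE_CHAR_OPS = set("+-*/<>=():,[]")


@dataclass
class Token:
    kind: str   # "KEYWORD", "IDENT", "NUMBER", or "OP"
    text: str
    line: int


class ScanError(Exception):
    """Raised when the scanner meets a character it does not recognize."""


def scan(source: str) -> list[Token]:
    tokens, i, line = [], 0, 1
    while i < len(source):
        ch = source[i]
        if ch == "\n":
            line += 1
            i += 1
        elif ch in " \t\r":
            i += 1                      # whitespace is skipped in this sketch
        elif ch.isdigit():
            start = i                   # simplified: digits and dots only
            while i < len(source) and (source[i].isdigit() or source[i] == "."):
                i += 1
            tokens.append(Token("NUMBER", source[start:i], line))
        elif ch.isalpha() or ch == "_":
            start = i
            while i < len(source) and (source[i].isalnum() or source[i] == "_"):
                i += 1
            word = source[start:i]
            kind = "KEYWORD" if word in KEYWORDS else "IDENT"
            tokens.append(Token(kind, word, line))
        elif source[i:i + 2] in TWO_CHAR_OPS:
            tokens.append(Token("OP", source[i:i + 2], line))
            i += 2
        elif ch in ONE_CHAR_OPS:
            tokens.append(Token("OP", ch, line))
            i += 1
        else:
            # Failing here is what "catches unrecognized symbols early" means.
            raise ScanError(f"line {line}: unrecognized symbol {ch!r}")
    return tokens
```

Calling `scan("i = 0 $")` raises `ScanError` before any parsing starts, so the user sees the problem at the lexical stage.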

📊 At a Glance

Metric | Value
--- | ---
Pipeline stages | 6
Run modes | 2 (GUI + Terminal)
Python constructs supported | 5+

🖥️ Two Ways to Run

GUI Mode

Full Tkinter interface with code editor, interpreter output, AST summary, and colored AST tree, all in one window. Uses the local tree-walk interpreter; no API key required.

python -m codesage.gui

Terminal Mode

Type code directly in the terminal. Get scanner output, parser AST, plain English summary, and execution result.

python main.py

Uncomment the GPT block in main.py to enable AI-powered summaries.


βš™οΈ Installation

1️⃣ Clone the Repository

git clone https://github.com/Mokshii46/CODESAGE.git
cd CODESAGE

2️⃣ Set Up a Virtual Environment

python3 -m venv venv
source venv/bin/activate   # Windows: venv\Scripts\activate

3️⃣ Install Dependencies

pip install -r requirements.txt

🧬 How It Works: Pipeline

Stage | Name | Description | Tag
--- | --- | --- | ---
1 | Scanner / Lexer | Reads raw source character by character; converts to tokens (keywords, operators, literals, identifiers); catches unrecognized symbols early | lexical analysis
2 | Recursive Descent Parser | Transforms the token stream into an AST capturing the logical, hierarchical structure; generates meaningful syntax error messages | syntax analysis
3 | AST Summarizer | Traverses the AST node by node, converting constructs (loops, conditionals, assignments) into structured, readable plain English | code summarization
4 | Tree-Walk Interpreter | Recursively executes the AST (evaluates expressions, runs loops and functions, handles conditionals), producing live runtime output | execution
5 | NLP / GPT Integration (optional) | Uncomment the GPT block in main.py to enable AI-powered line-by-line explanations via GPT-4o-mini. Requires an OpenAI API key in .env | natural language
6 | GUI / IDE | Built with Tkinter: code editor, output console, AST summary panel, and colored AST tree visualization all in one window | tkinter
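Stage 2 can be illustrated with a small recursive descent parser for expressions. This is a sketch under stated assumptions, not the grammar in codesage/parser.py: the tuple-shaped nodes (`"Num"`, `"Name"`, `"BinOp"`, `"Compare"`), the `tokenize` helper, and the `Parser` class are all hypothetical, and only arithmetic and `<` / `>` comparisons are covered.

```python
import re

# One token: a number, an identifier, or a single-character operator.
TOKEN_RE = re.compile(r"\d+\.?\d*|[A-Za-z_]\w*|[-+*/()<>]")


def tokenize(text):
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        match = TOKEN_RE.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected character {text[pos]!r} at column {pos}")
        tokens.append(match.group(0))
        pos = match.end()
    return tokens


class Parser:
    """Each grammar rule is one method; lower-precedence rules call higher ones."""

    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def parse(self):
        node = self.comparison()
        if self.peek() is not None:
            raise SyntaxError(f"unexpected token {self.peek()!r}")
        return node

    def comparison(self):           # comparison := additive (("<" | ">") additive)*
        node = self.additive()
        while self.peek() in ("<", ">"):
            op = self.advance()
            node = ("Compare", op, node, self.additive())
        return node

    def additive(self):             # additive := multiplicative (("+" | "-") multiplicative)*
        node = self.multiplicative()
        while self.peek() in ("+", "-"):
            op = self.advance()
            node = ("BinOp", op, node, self.multiplicative())
        return node

    def multiplicative(self):       # multiplicative := primary (("*" | "/") primary)*
        node = self.primary()
        while self.peek() in ("*", "/"):
            op = self.advance()
            node = ("BinOp", op, node, self.primary())
        return node

    def primary(self):              # primary := NUMBER | NAME | "(" comparison ")"
        tok = self.advance()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok == "(":
            node = self.comparison()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok[0].isdigit():
            return ("Num", float(tok))
        if tok[0].isalpha() or tok[0] == "_":
            return ("Name", tok)
        raise SyntaxError(f"unexpected token {tok!r}")
```

`Parser(tokenize("i + 2 * 3 < 10")).parse()` nests the multiplication inside the addition and the addition inside the comparison, which is how operator precedence falls out of the call order. An unbalanced input such as `(1 + 2` raises `SyntaxError("expected ')'")`.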

📺 Example Output

Input code:

i = 0
while i < 5:
    print(i)
    i = i + 1

AST Summary:

→ Assigning '0' to variable 'i'
→ While loop: runs while i < 5
→ Print value of i each iteration
→ Increment i by 1

Interpreter output: 0 1 2 3 4
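A summary like the one above can be approximated by walking an AST and emitting one sentence per statement. The sketch below uses Python's built-in `ast` module (and `ast.unparse`, available from Python 3.9) instead of CodeSage's own parser, and the `summarize` function and its sentence templates are assumptions modeled on the example, so the wording will not match the real summarizer exactly.

```python
import ast


def summarize(source: str) -> list[str]:
    """Return one plain-English line per statement, visiting bodies in order."""
    lines = []

    def visit(stmts):
        for node in stmts:
            if isinstance(node, ast.Assign):
                target = ast.unparse(node.targets[0])
                value = ast.unparse(node.value)
                lines.append(f"Assigning '{value}' to variable '{target}'")
            elif isinstance(node, ast.While):
                lines.append(f"While loop: runs while {ast.unparse(node.test)}")
                visit(node.body)
            elif isinstance(node, ast.For):
                lines.append(
                    f"For loop: '{ast.unparse(node.target)}' iterates over "
                    f"{ast.unparse(node.iter)}"
                )
                visit(node.body)
            elif isinstance(node, ast.If):
                lines.append(f"If statement: checks whether {ast.unparse(node.test)}")
                visit(node.body)
                if node.orelse:          # covers both elif and else branches
                    lines.append("Otherwise:")
                    visit(node.orelse)
            elif (isinstance(node, ast.Expr)
                  and isinstance(node.value, ast.Call)
                  and isinstance(node.value.func, ast.Name)
                  and node.value.func.id == "print"):
                args = ", ".join(ast.unparse(a) for a in node.value.args)
                lines.append(f"Print value of {args}")
            else:
                # Fallback for constructs this sketch does not describe.
                lines.append(f"Statement: {ast.unparse(node)}")

    visit(ast.parse(source).body)
    return lines
```

For the while-loop example, this sketch describes `i = i + 1` as a plain assignment; recognizing it as "Increment i by 1" would need an extra pattern check on the assigned expression.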

GUI Panel Output:

── Code input ──────────────────────────
i = 0
while i < 5:
    print(i)
    i = i + 1

── Interpreter Output ──────────────────
0 · 1 · 2 · 3 · 4

── AST Summary ─────────────────────────
Assigning '0.0' to variable 'i'
While loop: repeatedly executes body while condition is true
Print statement printing the value of expression

── AST Tree ────────────────────────────
└── Expression
    └── Assign
└── While
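The AST tree panel is essentially an indented dump of node types. As a rough sketch of that idea, the following function renders a tree from Python's built-in `ast` module; `render_tree` is a hypothetical helper, and the node names it prints (`Module`, `Assign`, `Name`, ...) come from the standard library, not from CodeSage's own node classes.

```python
import ast


def render_tree(node: ast.AST, indent: str = "") -> list[str]:
    """Return one line per AST node, indented four spaces per nesting level."""
    lines = [f"{indent}└── {type(node).__name__}"]
    for child in ast.iter_child_nodes(node):
        # Load/Store context markers add noise, so this sketch skips them.
        if isinstance(child, ast.expr_context):
            continue
        lines.extend(render_tree(child, indent + "    "))
    return lines


if __name__ == "__main__":
    print("\n".join(render_tree(ast.parse("i = 0"))))
```

A GUI could feed these lines into a text widget and color each one by node type; that part is left out here.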

🧩 Supported Python Constructs

Construct | Details
--- | ---
Variables | Declarations, assignments, arithmetic & logical operations
Loops | for and while with full iteration support
Conditionals | if, elif, else branching
Functions | Return statements & built-ins like len, range
Lists | Index-based access and list operations
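To show how a tree-walk interpreter handles a few of these constructs, here is a minimal sketch built on Python's built-in `ast` module. It is not the code in codesage/interpreter.py: the `TreeWalker` class is hypothetical, and it covers only simple assignments, `while`, `if`/`elif`/`else`, arithmetic, single comparisons, and `print`.

```python
import ast
import operator

BIN_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
           ast.Mult: operator.mul, ast.Div: operator.truediv}
CMP_OPS = {ast.Lt: operator.lt, ast.Gt: operator.gt, ast.Eq: operator.eq,
           ast.LtE: operator.le, ast.GtE: operator.ge, ast.NotEq: operator.ne}


class TreeWalker:
    def __init__(self):
        self.env = {}       # variable name -> current value
        self.output = []    # captured print output, one string per call

    def run(self, source):
        self.exec_block(ast.parse(source).body)
        return self.output

    def exec_block(self, stmts):
        for stmt in stmts:
            self.exec_stmt(stmt)

    def exec_stmt(self, node):
        if isinstance(node, ast.Assign):
            # Assumes a single plain-name target such as `i = ...`.
            self.env[node.targets[0].id] = self.eval(node.value)
        elif isinstance(node, ast.While):
            while self.eval(node.test):
                self.exec_block(node.body)
        elif isinstance(node, ast.If):
            # An elif is an If nested in orelse, so this handles it too.
            self.exec_block(node.body if self.eval(node.test) else node.orelse)
        elif isinstance(node, ast.Expr):
            self.eval(node.value)
        else:
            raise NotImplementedError(type(node).__name__)

    def eval(self, node):
        if isinstance(node, ast.Constant):
            return node.value
        if isinstance(node, ast.Name):
            return self.env[node.id]
        if isinstance(node, ast.BinOp):
            return BIN_OPS[type(node.op)](self.eval(node.left), self.eval(node.right))
        if isinstance(node, ast.Compare):
            # Only the first comparison is evaluated; chains are not supported.
            left = self.eval(node.left)
            right = self.eval(node.comparators[0])
            return CMP_OPS[type(node.ops[0])](left, right)
        if isinstance(node, ast.Call) and getattr(node.func, "id", None) == "print":
            self.output.append(" ".join(str(self.eval(a)) for a in node.args))
            return None
        raise NotImplementedError(type(node).__name__)
```

Running `TreeWalker().run(...)` on the while-loop example from the Example Output section collects `["0", "1", "2", "3", "4"]`. Capturing output in a list instead of printing directly is a design choice that makes it easy to route results to either a terminal or a GUI console.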

πŸ—‚οΈ Project Structure

CODESAGE/
β”œβ”€β”€ main.py                    # Terminal entry point
β”œβ”€β”€ README.md
β”œβ”€β”€ requirements.txt           # added
β”œβ”€β”€ .gitignore                 # added
β”œβ”€β”€ .env                       # gitignored
β”œβ”€β”€ codesage/                  # core package
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ scanner.py
β”‚   β”œβ”€β”€ parser.py
β”‚   β”œβ”€β”€ interpreter.py
β”‚   β”œβ”€β”€ resolver.py
β”‚   β”œβ”€β”€ nlp.py
β”‚   └── gui.py                 # GUI entry point
β”œβ”€β”€ nlp/                       # training pipeline
β”‚   β”œβ”€β”€ train.py
β”‚   β”œβ”€β”€ train_gpt.py
β”‚   β”œβ”€β”€ decoder.py
β”‚   β”œβ”€β”€ filter.py
β”‚   β”œβ”€β”€ generate_datasets.py
β”‚   └── prepare_embeddings.py
β”œβ”€β”€ models/                    # gitignored
β”œβ”€β”€ data/                      # datasets
└── assets/                    # images

⚠️ Challenges Faced

  • No suitable NLP training dataset was available initially
  • Built a custom template-based dataset filtered to interpreter capabilities
  • NLP accuracy gaps led to a pivot toward AST-based summarization as the primary explanation method

🚀 Roadmap

  • Integrate CodeT5 / LLaMA for richer, more nuanced code explanations
  • Add support for classes, modules, and advanced Python constructs
  • Replace Tkinter with a modern web-based IDE
  • Real-time explanation as users type

🧰 Tech Stack

Area | Technology
--- | ---
Language | Python 3.9+
GUI | Tkinter
AST & Parsing | Python ast module + custom recursive descent parser
NLP / AI | OpenAI GPT-4o-mini (optional)
NLP Training | Custom template-based dataset

👥 Team

Mentors

Yadnyesh Patil: Mentor, Project X · VJTI
Rupak Gupta: Mentor, Project X · VJTI

Contributors

Mokshi Shah: Developer · VJTI

⚠️ Notes

  • No API key is required to run the core interpreter or GUI
  • GPT-4o-mini integration requires an OpenAI API key stored in .env (gitignored)
  • The .env file and models/ directory are both excluded from version control
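Because the GPT step is optional, the program needs a way to tell whether a key is present before attempting it. The sketch below shows one possible check using only the standard library. The variable name `OPENAI_API_KEY` and the helper names are assumptions; the README does not say which variable main.py reads or how it loads .env.

```python
import os


def load_env_file(path=".env"):
    """Parse simple KEY=VALUE lines; return an empty dict if the file is absent."""
    values = {}
    if not os.path.exists(path):
        return values
    with open(path, encoding="utf-8") as fh:
        for raw in fh:
            line = raw.strip()
            # Skip blanks, comments, and anything that is not KEY=VALUE.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip("'\"")
    return values


def get_api_key(path=".env"):
    """Prefer the process environment, then fall back to the .env file."""
    # OPENAI_API_KEY is an assumed name, not confirmed by the project.
    return os.environ.get("OPENAI_API_KEY") or load_env_file(path).get("OPENAI_API_KEY")


def gpt_available(path=".env"):
    """True when a key was found, so the optional GPT step can be attempted."""
    return bool(get_api_key(path))
```

With a check like this, the core interpreter and GUI can run unchanged when no key exists, and the GPT explanation step is simply skipped.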

⭐ If you find CodeSage useful, consider starring the repo!
