💹 FinBridge

AI-Powered Financial Document Interpreter

From Wall Street jargon to Main Street clarity — instantly.

Built by Kupakwashe T. Mapuranga
B.Tech Artificial Intelligence & Machine Learning
Symbiosis Institute of Technology, Pune · CA3 FinTech Project · 2026

📖 What is FinBridge?

Most people cannot understand financial documents. Annual reports run 200+ pages. Earnings calls are full of jargon. Fund fact sheets use terms that take years to learn. This creates a massive gap — people cannot make informed investment decisions because they cannot read the documents that matter.

FinBridge closes that gap.

You paste or upload any financial document — an annual report, a fund fact sheet, a news article, an earnings call transcript — and FinBridge analyses it in seconds using a trained AI pipeline. You get back a clean, plain-English breakdown that anyone can understand, regardless of their financial background.

"You don't need to understand finance to make informed decisions — FinBridge reads it for you."

✨ Features

| Feature | What it does |
| --- | --- |
| 🎯 Sentiment Analysis | Classifies the document as Positive, Neutral, or Negative using FinBERT, a financial-domain transformer trained specifically on market language. Returns confidence scores for all three classes. |
| 💬 Plain English Summary | Uses the LLaMA 3.3 70B model (served via Groq) to generate a structured, jargon-free explanation. Includes a key numbers table, a simple story of what happened, good news, concerns, and a one-line verdict. Scales with document length. |
| ⚠️ Risk Flag Detection | Scans for 30+ genuine risk phrases (going concern, material weakness, covenant breach, fraud, liquidity risk, etc.) using a context-aware engine that avoids false positives from standard legal boilerplate. |
| 📚 Jargon Glossary | Automatically identifies 70+ financial terms in the document and provides plain-English definitions with mention counts. |
| 📑 PDF Export | Generates a professionally formatted PDF report with proper tables, coloured headers, risk cards, and a clean layout, with no garbled characters. |
| 📝 DOCX Export | Generates an editable Word document with the same content, ready to share, annotate, or submit. |
| 📄 TXT Export | Plain text version with ASCII table formatting. Works everywhere and has the smallest file size. |

🧠 How It Works

FinBridge runs every document through a four-stage AI pipeline:

┌─────────────────────────────────────────────────────────────────┐
│                        USER INPUT                               │
│              Paste text  OR  Upload PDF (max 10MB)              │
└─────────────────────┬───────────────────────────────────────────┘
                      │
          ┌───────────▼───────────┐
          │   TEXT EXTRACTION     │
          │   pdfplumber reads    │
          │   PDF → clean text    │
          └───────────┬───────────┘
                      │
        ┌─────────────┼─────────────┬────────────┐
        │             │             │            │
        ▼             ▼             ▼            ▼
  ┌──────────┐  ┌──────────┐  ┌────────┐  ┌─────────┐
  │ FinBERT  │  │  Groq /  │  │  Risk  │  │ Jargon  │
  │Sentiment │  │LLaMA 3.3 │  │Scanner │  │Detector │
  │Classifier│  │  70B LLM │  │30+ pat │  │70+ terms│
  │79.0% F1  │  │Plain Eng │  │Context │  │Glossary │
  └────┬─────┘  └────┬─────┘  └───┬────┘  └────┬────┘
       │              │            │             │
       └──────────────┴────────────┴─────────────┘
                      │
          ┌───────────▼───────────┐
          │   RESULTS DASHBOARD   │
          │  Sentiment · Summary  │
          │  Risk Flags · Jargon  │
          └───────────┬───────────┘
                      │
          ┌───────────▼───────────┐
          │   EXPORT REPORTS      │
          │   PDF · DOCX · TXT    │
          └───────────────────────┘
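
The text-extraction step at the top of the pipeline might look like the following minimal sketch. It relies on pdfplumber's `open()` and `page.extract_text()` calls. The function names, the whitespace clean-up, and the way the 10 MB limit is enforced are illustrative assumptions, not the contents of `app/pdf_reader.py`.

```python
import io
import re

MAX_PDF_BYTES = 10 * 1024 * 1024  # 10 MB upload limit shown in the diagram


def check_size(data: bytes) -> None:
    """Reject uploads larger than the 10 MB limit (hypothetical helper)."""
    if len(data) > MAX_PDF_BYTES:
        raise ValueError("PDF exceeds the 10 MB upload limit")


def clean_text(raw: str) -> str:
    """Collapse runs of spaces/tabs and excess blank lines (assumed clean-up)."""
    text = re.sub(r"[ \t]+", " ", raw)
    text = re.sub(r" ?\n ?", "\n", text)
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


def extract_text(data: bytes) -> str:
    """Read every page of an in-memory PDF and return cleaned text."""
    import pdfplumber  # imported lazily so the helpers above work without it

    check_size(data)
    with pdfplumber.open(io.BytesIO(data)) as pdf:
        # extract_text() can return None for image-only pages
        pages = [page.extract_text() or "" for page in pdf.pages]
    return clean_text("\n\n".join(pages))
```

Scanned PDFs without a text layer would come back empty under this approach, so a caller may want to check for an empty result before running the later stages.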

Stage 1 — Sentiment Classification (FinBERT)

The document text is fed into ProsusAI/finbert — a BERT-based transformer pre-trained on financial corpora and fine-tuned for 3-class sentiment classification. Unlike general-purpose sentiment models, FinBERT understands financial language nuances: "revenue declined" is negative, but "declining exposure to volatile assets" is neutral.

Model: ProsusAI/finbert
Classes: Positive · Neutral · Negative
Accuracy: 79.4% on held-out Financial PhraseBank test set
Input limit: 512 tokens (truncated if longer)
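
As a rough illustration of this stage, the sketch below loads `ProsusAI/finbert` through the Hugging Face `pipeline` helper and returns a score for every class. The function names are hypothetical, and `top_k=None` (return all classes) plus the truncation arguments reflect recent versions of `transformers`; the project's own `app/sentiment.py` may be organised differently.

```python
from typing import Dict, Tuple


def normalise_scores(raw: list) -> Dict[str, float]:
    """Turn pipeline output into {label: score}.

    Depending on the transformers version, a single input may come back as
    [{"label": ..., "score": ...}, ...] or wrapped in one more list.
    """
    rows = raw[0] if raw and isinstance(raw[0], list) else raw
    return {row["label"].lower(): float(row["score"]) for row in rows}


def top_label(scores: Dict[str, float]) -> Tuple[str, float]:
    """Return the highest-scoring class and its confidence."""
    label = max(scores, key=scores.get)
    return label, scores[label]


def classify(text: str) -> Dict[str, float]:
    """Score a document with FinBERT (downloads the model on first use)."""
    from transformers import pipeline  # lazy import: heavy dependency

    clf = pipeline("text-classification", model="ProsusAI/finbert", top_k=None)
    # Inputs longer than 512 tokens are truncated, as noted above
    raw = clf(text, truncation=True, max_length=512)
    return normalise_scores(raw)
```

Because of the 512-token truncation, only the opening part of a long report influences the label in this sketch.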

Stage 2 — Plain English Summary (Groq / LLaMA 3.3 70B)

The document is sent to Groq's inference API with a carefully engineered prompt that instructs the model to explain the content at Grade 6 reading level. Every financial term gets explained in brackets on first use. Numbers are given real-world context. The output scales with document length: short documents get 2-3 paragraphs, long annual reports get a full structured breakdown with tables.
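
One way to make the output scale with document length is to choose the requested format from a word count before building the prompt. In the sketch below, the 400-word threshold, the function name, and the prompt wording are all illustrative assumptions; the project's actual prompt is not reproduced in this README.

```python
def build_prompt(document: str, short_limit_words: int = 400) -> str:
    """Build a summarisation prompt whose requested format depends on length."""
    n_words = len(document.split())
    if n_words <= short_limit_words:
        shape = "Write 2-3 short paragraphs."
    else:
        shape = ("Write a structured breakdown: a key numbers table, what "
                 "happened, good news, concerns, and a one-line verdict.")
    return (
        "Explain the financial document below at a Grade 6 reading level. "
        "Explain every financial term in brackets the first time it appears, "
        "and give numbers real-world context. "
        f"{shape}\n\nDOCUMENT:\n{document}"
    )
```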

Primary provider: Groq (LLaMA 3.3 70B) — free, very fast
Fallback 1: Google Gemini 1.5 Flash — free
Fallback 2: Anthropic Claude Haiku — paid
Fallback 3: Rule-based sentence extraction — always works offline

Stage 3 — Risk Flag Detection (Rule Engine)

A context-aware pattern matcher scans for 30+ risk phrases across six categories: solvency/survival risks, legal & regulatory risks, financial performance risks, market risks, governance risks, and restatement risks. Crucially, the engine checks surrounding context to avoid false positives — for example, "going concern basis is still appropriate" is a positive statement and is not flagged, while "going concern qualification issued" is a genuine warning and is flagged.
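
A minimal sketch of this kind of context check is shown below. The three patterns, the reassurance phrases, and the 60-character window are hypothetical stand-ins for the real 30+ pattern table in `app/risk_flags.py`.

```python
import re
from typing import Dict, List

# Hypothetical pattern table: phrase -> category
RISK_PATTERNS = {
    r"going concern": "solvency",
    r"material weakness": "governance",
    r"covenant breach": "solvency",
}

# Hypothetical reassuring phrases that cancel a match when found nearby
SAFE_CONTEXT = re.compile(
    r"(remains?|is still|continues to be) appropriate"
    r"|no (material )?(doubt|uncertainty)",
    re.IGNORECASE,
)


def scan_risks(text: str, window: int = 60) -> List[Dict[str, str]]:
    """Return one flag per risk phrase whose surrounding text is not reassuring."""
    flags = []
    for pattern, category in RISK_PATTERNS.items():
        for match in re.finditer(pattern, text, re.IGNORECASE):
            start = max(0, match.start() - window)
            context = text[start:match.end() + window]
            if SAFE_CONTEXT.search(context):
                continue  # boilerplate reassurance, not a warning
            flags.append({
                "phrase": match.group(0).lower(),
                "category": category,
                "context": context.strip(),
            })
    return flags
```

A fixed character window is a blunt instrument: a reassurance that happens to sit near a genuine warning would suppress it, so a production engine would likely scope the check to the same sentence.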

Stage 4 — Jargon Detection (Glossary Engine)

A curated dictionary of 70+ financial terms is scanned against the document. Every term found is returned with its mention count and a plain-English definition written at the same Grade 6 level as the summary.
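
The glossary scan can be sketched with a dictionary and word-boundary matching. The three entries and their definitions below are made-up examples, and `detect_jargon` is a hypothetical name; the real dictionary lives in `app/jargon.py`.

```python
import re
from typing import Dict, List

# Illustrative entries only; the project's glossary has 70+ terms
GLOSSARY = {
    "liquidity": "How quickly something can be turned into cash.",
    "yield": "The income an investment pays, shown as a percentage.",
    "monetary policy": "How a central bank manages money supply and interest rates.",
}


def detect_jargon(text: str, glossary: Dict[str, str] = GLOSSARY) -> List[dict]:
    """Return each glossary term found, with its mention count and definition."""
    found = []
    for term, definition in glossary.items():
        # \b keeps "yield" from matching inside longer words
        pattern = r"\b" + re.escape(term) + r"\b"
        count = len(re.findall(pattern, text, re.IGNORECASE))
        if count:
            found.append({
                "term": term.upper(),
                "mentions": count,
                "definition": definition,
            })
    # Most-mentioned terms first, ties broken alphabetically
    return sorted(found, key=lambda item: (-item["mentions"], item["term"]))
```

Exact word-boundary matching misses plurals and inflections such as "yields", so a fuller version would need stemming or per-term variants.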

📊 Sentiment Model — Performance Comparison

Five models were trained and evaluated on the Financial PhraseBank dataset (5,842 sentences, 3-class classification: Positive / Neutral / Negative).

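| Model | Accuracy | Precision | Recall | F1 Score |
| --- | --- | --- | --- | --- |
| Naive Bayes | 68.9% | 74.3% | 68.9% | 62.9% |
| Logistic Regression | 69.3% | 70.7% | 69.3% | 69.9% |
| SVM (Linear) | 69.5% | 71.7% | 69.5% | 70.4% |
| LSTM | 55.4% | n/a | n/a | 55.4% |
| FinBERT ✅ | 79.4% | 79.0% | 79.0% | 79.0% |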

FinBERT outperforms the best classical model (linear SVM, 70.4% F1) by 8.6 percentage points in F1 score. Its domain-specific pre-training on financial corpora makes it well suited to this task.
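
The 8.6-point margin can be checked directly from the F1 column of the table above. The snippet only restates those reported numbers; it does not retrain or re-evaluate anything.

```python
# F1 scores (%) copied from the comparison table
F1_SCORES = {
    "Naive Bayes": 62.9,
    "Logistic Regression": 69.9,
    "SVM (Linear)": 70.4,
    "LSTM": 55.4,
    "FinBERT": 79.0,
}

baselines = {name: f1 for name, f1 in F1_SCORES.items() if name != "FinBERT"}
best_name = max(baselines, key=baselines.get)
margin = round(F1_SCORES["FinBERT"] - baselines[best_name], 1)

print(f"Best baseline: {best_name} ({baselines[best_name]}% F1)")
print(f"FinBERT margin: {margin} percentage points")
# prints:
# Best baseline: SVM (Linear) (70.4% F1)
# FinBERT margin: 8.6 percentage points
```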

📁 Project Structure

finbridge/
│
├── app/                          ← Core AI application
│   ├── main.py                   ← Streamlit UI (complete dashboard)
│   ├── sentiment.py              ← FinBERT sentiment classifier
│   ├── explainer.py              ← Multi-provider LLM summariser
│   ├── risk_flags.py             ← Context-aware risk scanner (30+ patterns)
│   ├── jargon.py                 ← Financial glossary detector (70+ terms)
│   ├── pdf_reader.py             ← PDF text extraction (pdfplumber)
│   └── report_generator.py      ← PDF / DOCX / TXT export engine
│
├── config/
│   └── settings.py              ← API keys, model config, provider priority
│
├── examples/
│   └── sample_documents.txt     ← 3 test documents (positive, negative, neutral)
│
├── tests/
│   └── test_core.py             ← Unit tests for risk scanner and jargon detector
│
├── streamlit_app.py             ← HuggingFace Spaces entry point
├── requirements.txt             ← All Python dependencies
├── .env.example                 ← Template for API keys
├── .gitignore                   ← Excludes .env and large files
└── README.md                    ← This file

🚀 Quick Start

Prerequisites

Python 3.10 or higher
A free Groq API key (takes 2 minutes)

1. Clone the repository

git clone https://github.com/kupakwash/finbridge.git
cd finbridge

2. Create a virtual environment

# Windows
python -m venv venv
venv\Scripts\activate

# Mac / Linux
python -m venv venv
source venv/bin/activate

3. Install dependencies

pip install -r requirements.txt

⚠️ First install downloads PyTorch (~800MB). Allow 5–10 minutes on first run.

4. Configure API keys

# Copy the template (in Windows cmd, use "copy" instead of "cp")
cp .env.example .env

Open .env and add your Groq key:

GROQ_API_KEY=gsk_your_key_here

Get your free Groq key at console.groq.com → API Keys → Create Key.

5. Run the app

streamlit run app/main.py

Open your browser at http://localhost:8501

🔑 API Keys

FinBridge supports three LLM providers. You only need one — Groq is recommended because it is free and fast.

| Provider | Key Variable | Cost | Where to get it |
| --- | --- | --- | --- |
| Groq ⭐ | `GROQ_API_KEY` | Free | console.groq.com |
| Google Gemini | `GEMINI_API_KEY` | Free | aistudio.google.com |
| Anthropic Claude | `ANTHROPIC_API_KEY` | ~$0.01/analysis | console.anthropic.com |

The app tries providers in this order: Groq → Gemini → Claude → Smart Fallback. If no key is configured, a rule-based fallback summary is generated automatically — the app always works.
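
A small sketch of how that order could be derived from the environment is shown below. The variable names come from the table above; the function name, the provider labels, and the rule that blank values count as "not configured" are assumptions, and `config/settings.py` may handle this differently.

```python
import os
from typing import List, Mapping, Optional

# Priority order described above: Groq, then Gemini, then Claude
PROVIDER_KEYS = [
    ("groq", "GROQ_API_KEY"),
    ("gemini", "GEMINI_API_KEY"),
    ("claude", "ANTHROPIC_API_KEY"),
]


def available_providers(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """List configured providers in priority order, ending with the fallback."""
    env = os.environ if env is None else env
    names = [name for name, var in PROVIDER_KEYS if env.get(var, "").strip()]
    # The rule-based fallback needs no key, so it is always last in the list
    return names + ["fallback"]
```

In the running app the keys would first need to be loaded from `.env` into the environment, for example with python-dotenv's `load_dotenv()`.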

🧪 Testing

python tests/test_core.py

Expected output:

✅ Risk flag: going concern detected
✅ Risk flag: empty text handled
✅ Risk flag: clean text returns 0 flags
✅ Risk summary: no flags message correct
✅ Jargon: bull market detected
✅ Jargon: empty text handled
✅ Explainer: short text rejected correctly
✅ Explainer: long text summary generated

✅ All 8 tests passed!

📄 Example Output

Given this input (Old Mutual ZWG Money Market Fund Fact Sheet):

"The Fund registered a return of 2.84% in Q4 against 3.71% in Q3 of 2025, bringing its full year return to 13.60%. The RBZ reaffirmed its commitment to a tight monetary policy. Resultantly, market liquidity remained constrained, and interest rates competitive above 10% per annum..."

FinBridge produces:

Sentiment: 🟡 Neutral (94.5% confidence)

Plain English Summary:

This is a money market fund — think of it like a savings account managed by professionals. Old Mutual pools your money with other investors and lends it to banks and the government for short periods, earning interest. That interest gets paid back to you monthly...

Risk Flags: LIQUIDITY RISK · INTEREST RATE RISK (both MEDIUM — manageable)

Jargon Detected: YIELD · LIQUIDITY · INTEREST RATE · PORTFOLIO · INFLATION · MONETARY POLICY

🌍 Why This Matters

3.5 billion people globally have limited financial literacy
Only 27% of Indian adults are financially literate (vs. a 52% average across advanced economies)
Annual reports average 150–250 pages of dense financial language
Retail investors often lose money not only because markets are complex, but also because the documents are hard to read

FinBridge makes financial documents accessible to everyone — students, first-time investors, small business owners, and retail traders — regardless of their educational background or location.

Roadmap to wider impact:

🌐 Multilingual output (Shona, Swahili, Zulu, Hindi) — African & Asian markets
📱 Mobile app — analysis on the go
🔌 API access for fintech platforms and robo-advisors
🏫 Educational mode — learn finance while reading real documents

🛠️ Tech Stack

| Layer | Technology | Purpose |
| --- | --- | --- |
| UI | Streamlit 1.32 | Web dashboard |
| Sentiment AI | FinBERT (HuggingFace Transformers) | Financial sentiment classification |
| Summary AI | Groq API / LLaMA 3.3 70B | Plain-English document explanation |
| PDF Parsing | pdfplumber | Extract text from uploaded PDFs |
| PDF Export | ReportLab | Generate formatted PDF reports |
| DOCX Export | python-docx | Generate Word document reports |
| Config | python-dotenv | Secure API key management |
| Deep Learning | PyTorch | FinBERT inference backend |

📜 License

This project is licensed under the MIT License — see LICENSE for details.

👤 Author

Kupakwashe T. Mapuranga

B.Tech Artificial Intelligence & Machine Learning
Symbiosis Institute of Technology, Pune, India

📧 kupakwashemapuranga@gmail.com
🐙 github.com/kupakwash

CA3 FinTech Application Project · 2026

Built with ❤️ to make financial knowledge accessible to everyone.
