Md-Emon-Hasan · Md-Emon-Hasan · Aug 7, 2025
diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
@@ -2,37 +2,36 @@ name: Docker Image CI-CD
 
 on:
   push:
-    branches: [ "main" ]
+    branches: [ "master" ]
   pull_request:
-    branches: [ "main" ]
+    branches: [ "master" ]
 
 jobs:
   build:
-
     runs-on: ubuntu-latest
 
     steps:
-    - name: Checkout code
-      uses: actions/checkout@v3
-
-    - name: Set up Python
-      uses: actions/setup-python@v4
-      with:
-        python-version: '3.x'
-
-    - name: Set up Docker Buildx
-      uses: docker/setup-buildx-action@v2
-
-    - name: Cache Docker layers
-      uses: actions/cache@v3
-      with:
-        path: /tmp/.buildx-cache
-        key: ${{ runner.os }}-buildx-${{ github.sha }}
-        restore-keys: |
-          ${{ runner.os }}-buildx-
-
-    - name: Build Docker image
-      run: docker build -t informa-truth .
-
-    - name: Test the application (Run tests inside container)
-      run: docker run --rm informa-truth pytest tests/
+      - name: Checkout code
+        uses: actions/checkout@v3
+
+      - name: Set up Python
+        uses: actions/setup-python@v4
+        with:
+          python-version: '3.x'
+
+      - name: Set up Docker Buildx
+        uses: docker/setup-buildx-action@v2
+
+      - name: Cache Docker layers
+        uses: actions/cache@v3
+        with:
+          path: /tmp/.buildx-cache
+          key: ${{ runner.os }}-buildx-${{ github.sha }}
+          restore-keys: |
+            ${{ runner.os }}-buildx-
+
+      - name: Build Docker image
+        run: docker build -t informa-truth .
+
+      - name: Test the application (Run tests inside container with PYTHONPATH)
+        run: docker run --rm -e PYTHONPATH=/app informa-truth pytest tests/
diff --git a/Dockerfile b/Dockerfile
@@ -1,17 +1,20 @@
 # Use an official Python runtime as a parent image
 FROM python:3.11-slim
 
+# Set environment variable for PYTHONPATH
+ENV PYTHONPATH=/app
+
 # Set the working directory
 WORKDIR /app
 
 # Copy the current directory contents into the container
 COPY . /app
 
-# Install the dependencies
+# Install dependencies
 RUN pip install --no-cache-dir -r requirements.txt
 
-# Expose Streamlit default port
-EXPOSE 8501
+# Expose the port Flask runs on
+EXPOSE 5000
 
-# Correct command to run Streamlit app
-CMD ["streamlit", "run", "app.py", "--server.port=8501", "--server.address=0.0.0.0"]
+# Default command to run the Flask app
+CMD ["python", "app.py"]
diff --git a/README.md b/README.md
@@ -1,12 +1,10 @@
-# 📘 InformaTruth: AI-Driven News Veracity Analyzer
-This project addresses the challenge by fine-tuning a transformer-based classification model (RoBERTa) on the LIAR dataset to automatically determine whether a news statement is real or fake. Additionally, it employs a generative LLM (FLAN-T5) to produce natural language explanations for its predictions, increasing user trust and transparency in the system.
+# 📘 InformaTruth: AI-Driven News Authenticity Analyzer
+🧠 A multi-modal fake news detector built on a fine-tuned RoBERTa classifier, with FLAN-T5 explanation generation and support for text, URL, and PDF input. A LangGraph-powered agentic pipeline orchestrates the Planner, Retriever, Tool Router, Fallback Agent, and LLM Answerer agents, with memory and dynamic tool augmentation.
 
 [![InformaTruth](https://github.com/user-attachments/assets/42d5bc32-c739-4f5e-a8e6-9e89cc0a6e6e)](https://github.com/user-attachments/assets/42d5bc32-c739-4f5e-a8e6-9e89cc0a6e6e)
+[![InformaTruth](https://github.com/user-attachments/assets/a5c9932e-03a9-4008-9b34-f8fa80687334)](https://github.com/user-attachments/assets/a5c9932e-03a9-4008-9b34-f8fa80687334)
 
----
-
-## 🔍 Overview
-In the digital age, misinformation spreads rapidly across news outlets, social media, and online platforms. With the increasing difficulty of distinguishing between credible journalism and deceptive content, there is a growing demand for automated systems that can detect fake news efficiently and explain their reasoning. Manual fact-checking is time-consuming, prone to bias, and often fails to scale with the speed of information propagation. It also includes a user-friendly UI, Rest API, modular components and a complete Dockerized CI/CD pipeline for enterprise deployment.
+<!-- https://github.com/user-attachments/assets/a5c9932e-03a9-4008-9b34-f8fa80687334 -->
 
 ---
 
@@ -16,81 +14,181 @@ In the digital age, misinformation spreads rapidly across news outlets, social m
 
 ---
 
+## 🔍 Overview
+In the digital age, misinformation spreads rapidly across news outlets, social media, and online platforms, and distinguishing credible journalism from deceptive content is increasingly difficult. This agentic AI system detects fake news from text, PDFs, or website URLs using a fine-tuned RoBERTa model. It uses a multi-agent architecture built with LangGraph, including Planner, Retriever, Tool Router, and Explanation Agent. Once a claim is classified, the system uses FLAN-T5 to generate human-readable reasoning. If local evidence is insufficient, it falls back on Wikipedia or DuckDuckGo search. This production-grade solution supports real-world fact-checking, multi-source ingestion, tool-augmented reasoning, and modular orchestration.
+
+---
+
 ## ⚙️ Tech Stack
-| **Category**               | **Technology/Resource**                                                                 |
-|----------------------------|----------------------------------------------------------------------------------------|
-| **Core Framework**         | PyTorch, Transformers                                                                  |
-| **Classification Model**   | Fine-tuned RoBERTa-base                                                                |
-| **Explanation Model**      | FLAN-T5-base                                                                           |
-| **Training Data**          | LIAR Dataset (Political Fact-Checking)                                                 |
-| **Evaluation Metrics**     | Accuracy, Precision, Recall, F1-score                                                 |
-| **Text Extraction**       | Newspaper3k (URLs), PyMuPDF (PDFs)                                                    |
-| **Training Framework**     | HuggingFace Trainer                                                                    |
-| **Deployment**            | Streamlit (Web Interface)                                                             |
-| **Hosting**               | Render                                                |
+| **Category**                | **Technology/Resource**                                                                                |
+| --------------------------- | ------------------------------------------------------------------------------------------------------ |
+| **Core Framework**          | PyTorch, Transformers, HuggingFace                                                                     |
+| **Classification Model**    | Fine-tuned RoBERTa-base on LIAR Dataset                                                                |
+| **Explanation Model**       | FLAN-T5-base (Zero-shot Prompting)                                                                     |
+| **Training Data**           | LIAR Dataset (Political Fact-Checking)                                                                 |
+| **Evaluation Metrics**      | Accuracy, Precision, Recall, F1-score                                                                  |
+| **Training Framework**      | HuggingFace Trainer                                                                                    |
+| **LangGraph Orchestration** | LangGraph (Multi-Agent Directed Acyclic Execution Graph)                                               |
+| **Agents Used**             | PlannerAgent, InputHandlerAgent, ToolRouterAgent, ExecutorAgent, ExplanationAgent, FallbackSearchAgent |
+| **Input Modalities**        | Raw Text, Website URLs (via Newspaper3k), PDF Documents (via PyMuPDF)                                  |
+| **Tool Augmentation**       | DuckDuckGo Search API (Fallback), Wikipedia (Planned), ToolRouter Logic                                |
+| **Web Scraping**            | Newspaper3k (HTML → Clean Article)                                                                     |
+| **PDF Parsing**             | PyMuPDF                                                                                                |
+| **Explainability**          | Natural language justification generated using FLAN-T5                                                 |
+| **State Management**        | Shared State Object (LangGraph-compatible)                                                             |
+| **Deployment Interface**    | Flask (HTML, CSS, JS)                                                                                  |
+| **Hosting Platform**        | Render (Docker)                                                                  |
+| **Version Control**         | Git, GitHub                                                                                            |
+| **Logging & Debugging**     | Logs, Print Debugs, Custom Logger                                                 |
 | **Input Support**         | Text, URLs, PDF documents                                                             |
-| **Explainability**        | FLAN-T5 generated natural language explanations                                       |
 
 ---
 
-### ✅ Key Features
-- **Multi-format input**: Supports raw text, URLs, and PDF files.
-- **NLP Pipeline**: Includes summarization, classification, and LLM-based explanation.
-- **Moduler coding and logging**: Clean, modular code with logging.
-- **Streamlit UI**: Clean, responsive frontend for interaction.
-- **Rest API**: For integration with other systems.
-- **Dockerized**: Fully containerized for production deployments.
-- **CI/CD**: GitHub Actions pipeline for testing, linting, and Docker validation.
+## ✅ Key Features
+
+* **🔄 Multi-Format Input Support**
+  Accepts raw **text**, **web URLs**, and **PDF documents** with automated preprocessing for each type.
+
+* **🧠 Full NLP Pipeline**
+  Integrates summarization (optional), **fake news classification** (RoBERTa), and **natural language explanation** (FLAN-T5).
+
+* **🧱 Modular Agent-Based Architecture**
+  Built using **LangGraph** with modular agents: `Planner`, `Tool Router`, `Executor`, `Explanation Agent`, and `Fallback Agent`.
+
+* **📜 Explanation Generation**
+  Uses **FLAN-T5** to generate human-readable, zero-shot rationales for model predictions.
+
+* **🧪 Tool-Augmented & Fallback Logic**
+  Dynamically queries **DuckDuckGo** when local context is insufficient, enabling robust fallback handling.
+
+* **🧼 Clean, Modular Codebase with Logging**
+  Structured using clean architecture principles, agent separation, and informative logging.
+
+* **🌐 Flask with Web UI**
+  User-friendly, interactive, and responsive frontend for input, output, and visual explanations.
+
+* **🐳 Dockerized for Deployment**
+  Fully containerized setup with `Dockerfile` and `requirements.txt` for seamless deployment.
+
+* **⚙️ CI/CD with GitHub Actions**
+  Automated pipelines for testing, linting, and Docker build validation to ensure code quality and production-readiness.
 
 ---
 
-## 📦 Project Structure
+## 📦 Project File Structure
+
 ```bash
 InformaTruth/
-├── src/
-│   ├── config.py         # Configuration
-│   ├── data.py           # Data handling
-│   ├── inference.py      # Model inference
-│   ├── main.py           # Main script
-│   ├── model.py          # Model definition
-│   └── loggger.py        # Logging
 │
-├── fine_tuned_liar_detector/  # Fine-tuned model
+├── .github/              # GitHub Actions
+│   └── workflows/
+│       └── main.yml 
+│
+├── agents/                            # Modular agents (planner, executor, etc.)
+│   ├── executor.py
+│   ├── fallback_search.py
+│   ├── input_handler.py
+│   ├── planner.py
+│   ├── router.py
+│   └── __init__.py
+│
+├── fine_tuned_liar_detector/         # Fine-tuned RoBERTa model directory
+│   ├── config.json
+│   ├── vocab.json
+│   ├── tokenizer_config.json
+│   ├── special_tokens_map.json
+│   ├── model.safetensors
+│   └── merges.txt
+│
+├── graph/                            # LangGraph state and builder logic
+│   ├── builder.py
+│   ├── state.py
+│   └── __init__.py
+│
+├── models/                           # Classification + LLM model loader
+│   ├── classifier.py
+│   ├── loader.py
+│   └── __init__.py
 │
-├── notebook/
-│   └── experiment.ipynb  # Experimentation                                
+├── news/                             # Sample news or test input
+│   └── news.pdf
 │
-├── test/
+├── notebook/                         # Jupyter notebooks for experimentation
+│   ├── 1 Fine-Tuning.ipynb
+│   └── 2 Fine-Tuning with Multi Agent.ipynb
+│
+├── static/                           # Static files (CSS, JS)
+│   ├── css/
+│   │   └── style.css
+│   └── js/
+│       └── script.js
+│
+├── templates/                        # HTML templates for Flask UI
+│   ├── dj_base.html
+│   └── dj_index.html
+│
+├── tests/                            # Unit tests
 │   └── test_app.py
 │
-├── .github/              # GitHub Actions
-│   └── workflows/
-│       └── main.yml  
+├── train/                            # Training logic
+│   ├── config.py
+│   ├── data_loader.py
+│   ├── predictor.py
+│   ├── run.py
+│   ├── trainer.py
+│   └── __init__.py
+│
+├── utils/                            # Utilities like logging, evaluation
+│   ├── logger.py
+│   ├── results.py
+│   └── __init__.py
 │
-├── app.py                # Streamlit app
-├── app.png               # Demo
-├── demo.webm             # Demo video
-├── setup.py              # Python setup file
-├── Dockerfile            # Dockerfile
-├── flask_api.py          # Rest API
-├── requirements.txt      # Dependencies
-├── .gitignore            # Git ignore file
-├── LICENSE               # License
-└── README.md             # This file
+├── __init__.py                        
+├── app.png                           # Demo
+├── demo.webm                         # Demo video
+├── app.py                            # Flask app entry point
+├── main.py                           # Main script / orchestrator
+├── config.py                         # Configuration file
+├── setup.py                          # Project setup for pip install
+├── render.yaml                       # Render deployment configuration
+├── Dockerfile                        # Docker container spec
+├── requirements.txt                  # Python dependencies
+├── LICENSE                           # License file
+├── .gitignore                        # Git ignore rules
+├── .gitattributes                    # Git LFS rules
+└── README.md                         # Readme
 ```
 
 ---
 
 ## 🧱 System Architecture
 ```mermaid
 graph TD
-    A[Input] --> B{Input Type}
-    B -->|Text| C[Direct Processing]
-    B -->|URL| D[Newspaper3k Extraction]
-    B -->|PDF| E[PyMuPDF Extraction]
-    C & D & E --> F[Fine Tuneing RoBERTa Classification]
-    F --> G[FLAN-T5 Explanation]
-    G --> H[Streamlit UI Output]
+    A[User Input] --> B{Input Type}
+    B -->|Text| C[Direct Text Processing]
+    B -->|URL| D[Newspaper3k Parser]
+    B -->|PDF| E[PyMuPDF Parser]
+
+    C --> F[Text Cleaner]
+    D --> F
+    E --> F
+
+    F --> G[Context Validator]
+    G -->|Sufficient Context| H[RoBERTa Classifier]
+    G -->|Insufficient Context| I[Web Search Agent]
+
+    I --> J[Context Aggregator]
+    J --> H
+
+    H --> K[FLAN-T5 Explanation Generator]
+    K --> L[Output Formatter]
+
+    L --> M[Web UI using Flask, HTML, CSS, JS]
+
+    style M fill:#e3f2fd,stroke:#90caf9
+    style G fill:#fff9c4,stroke:#fbc02d
+    style I fill:#fbe9e7,stroke:#ff8a65
+    style H fill:#f1f8e9,stroke:#aed581
 ```
 
 ---
@@ -167,4 +265,4 @@ jobs:
 🔗 [Facebook](https://www.facebook.com/mdemon.hasan2001/)
 🔗 [WhatsApp](https://wa.me/8801834363533)
 
----
+---
diff --git a/__init__.py b/__init__.py
@@ -0,0 +1,7 @@
+# Package initialization
+from .models.loader import ModelLoader
+from .graph.builder import PipelineBuilder
+from .utils.logger import setup_logging
+from .utils.results import display_results
+
+__all__ = ['ModelLoader', 'PipelineBuilder', 'setup_logging', 'display_results']
diff --git a/agents/__init__.py b/agents/__init__.py
@@ -0,0 +1,7 @@
+from .input_handler import InputHandler
+from .planner import Planner
+from .fallback_search import FallbackSearch
+from .router import Router
+from .executor import Executor
+
+__all__ = ['InputHandler', 'Planner', 'FallbackSearch', 'Router', 'Executor']
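
The System Architecture diagram in the README routes input by type (text, URL, PDF) and then branches on whether there is enough context to classify. The sketch below illustrates that routing in plain Python. It is not taken from the repository: the function names, the `.pdf` suffix check, and the 20-word threshold are all assumptions chosen for illustration, and the real `InputHandler` and `Router` agents may work differently.

```python
from urllib.parse import urlparse


def detect_input_type(raw: str) -> str:
    """Classify user input as 'url', 'pdf', or 'text' (hypothetical helper)."""
    candidate = raw.strip()
    try:
        parsed = urlparse(candidate)
    except ValueError:
        # Malformed URL-like strings are treated as plain text.
        return "text"
    # Only well-formed http(s) links count as URLs.
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    # Assumption: PDFs arrive as a local file path ending in .pdf.
    if candidate.lower().endswith(".pdf"):
        return "pdf"
    return "text"


def has_sufficient_context(text: str, min_words: int = 20) -> bool:
    """Stand-in for the Context Validator node; the threshold is illustrative."""
    return len(text.split()) >= min_words


def next_step(text: str) -> str:
    """Pick the next node: classify directly or fetch more context first."""
    return "classifier" if has_sufficient_context(text) else "web_search"
```

In this sketch a URL would go to the Newspaper3k parser and a PDF path to the PyMuPDF parser before the extracted text reaches `next_step`. Keeping the type detection separate from the context check mirrors the two decision nodes in the diagram.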