Agentic Academic Paper Assistant

An agentic, multi-stage academic paper assistant built with LangGraph / LangChain that can search, read, summarize, verify, and cite research papers in a reliable, production-oriented way.

✨ Key Features

Multi-agent architecture with clear role separation
Graph-based orchestration (LangGraph) instead of fragile chains
Academic paper search (e.g. arXiv)
PDF download & parsing
Structured paper summaries (problem, method, contribution, limitations)
Citation generation (APA / BibTeX / IEEE-ready)
Long-term memory using VectorDB (Chroma)
Failure-aware fallbacks (no results, PDF unavailable, low confidence)
FastAPI service interface for deployment

🧠 System Design

The assistant is a role-based multi-agent system with centralized orchestration.

Instead of letting agents freely chat, the system enforces a structured workflow:

User Query
   ↓
Router / Planner Agent
   ↓
Paper Retrieval Agent
   ↓
PDF Reader / Summarizer Agent
   ↓
Citation & Verification Agent
   ↓
Memory (RAG) Update
   ↓
Final Response

This design improves:

reliability
debuggability
termination guarantees
production readiness
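
The fixed stage order above can be sketched in plain Python. This is only an illustration of centralized orchestration, not the project's LangGraph graph: the stage functions, the state keys, and `run_pipeline` are hypothetical names, and the stages return placeholder data instead of calling real tools. Each stage takes the shared state and returns an updated copy, and the runner visits the stages once in a fixed order, so a run always terminates.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]
Stage = Callable[[State], State]


def router(state: State) -> State:
    # Hypothetical planner: record the intent and a result limit.
    return {**state, "intent": "search", "top_k": 3}


def retrieve(state: State) -> State:
    # Placeholder retrieval; a real agent would query a paper source here.
    papers = [{"title": f"Paper about {state['query']}"}]
    return {**state, "papers": papers}


def summarize(state: State) -> State:
    summaries = [f"Summary of {p['title']}" for p in state["papers"]]
    return {**state, "summaries": summaries}


def cite(state: State) -> State:
    citations = [f"[{i + 1}] {p['title']}" for i, p in enumerate(state["papers"])]
    return {**state, "citations": citations}


def update_memory(state: State) -> State:
    return {**state, "memory_updated": True}


# The orchestrator owns the order; agents never call each other directly.
PIPELINE: List[Stage] = [router, retrieve, summarize, cite, update_memory]


def run_pipeline(query: str) -> State:
    state: State = {"query": query, "trace": []}
    for stage in PIPELINE:
        state = stage(state)
        # Keep a trace of visited stages to make runs easy to debug.
        state["trace"] = state["trace"] + [stage.__name__]
    return state
```

Because the order lives in one list, adding, removing, or reordering a stage is a one-line change, and the `trace` field shows exactly which stages ran.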

🧩 Agents & Responsibilities

1. Router / Planner Agent

Interprets user intent
Expands vague queries into academic language
Decides which agents should run
Sets constraints (top_k, year range, domain)
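
As a rough sketch of what the planner's output could look like, the snippet below uses a rule-based stand-in for the LLM-driven router. The `Plan` dataclass, the `plan_query` function, the agent names, and the "since YYYY" rule are all assumptions made for illustration; only the idea of choosing agents and setting constraints such as `top_k` and a year range comes from the list above.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Plan:
    intent: str
    agents: List[str]
    top_k: int = 5
    # (start_year, end_year); end_year is None when the range is open-ended.
    year_range: Optional[Tuple[int, Optional[int]]] = None


def plan_query(query: str) -> Plan:
    text = query.lower()
    # Follow-up citation requests skip retrieval and summarization.
    if text.startswith("cite"):
        return Plan(intent="cite", agents=["citation"])
    # Pick up an explicit "since YYYY" constraint if the user gave one.
    match = re.search(r"since (\d{4})", text)
    year_range = (int(match.group(1)), None) if match else None
    return Plan(
        intent="search",
        agents=["retrieval", "summarizer", "citation", "memory"],
        year_range=year_range,
    )
```

For example, `plan_query("Find papers on diffusion models since 2022")` yields a search plan with `year_range=(2022, None)`, while `plan_query("Cite those in APA")` routes straight to the citation agent.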

2. Paper Retrieval Agent

Searches academic sources (e.g. arXiv)
Deduplicates results
Returns structured metadata
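
Deduplication could work along these lines. This is a minimal sketch under assumed field names (`arxiv_id`, `title`), not the project's actual logic: papers are keyed by their arXiv identifier with any version suffix removed, falling back to a normalized title when no identifier is present.

```python
import re
from typing import Dict, List


def normalize_title(title: str) -> str:
    # Lowercase and collapse punctuation/whitespace so near-identical titles match.
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def paper_key(paper: Dict[str, str]) -> str:
    arxiv_id = paper.get("arxiv_id")
    if arxiv_id:
        # Treat different versions (v1, v2, ...) of one paper as the same entry.
        return re.sub(r"v\d+$", "", arxiv_id)
    return normalize_title(paper["title"])


def deduplicate(papers: List[Dict[str, str]]) -> List[Dict[str, str]]:
    seen = set()
    unique = []
    for paper in papers:
        key = paper_key(paper)
        if key not in seen:
            seen.add(key)
            unique.append(paper)  # keep the first occurrence, preserve order
    return unique
```

The version suffix is stripped only from identifiers, not from titles, so titles that legitimately end in something like "v3" are not merged by accident.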

3. PDF Reader / Summarizer Agent

Downloads PDFs when available
Extracts key sections
Produces structured summaries
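
One way to represent a structured summary is a small dataclass with one field per section. The four section names follow the feature list (problem, method, contribution, limitations); the class name, the `source` field, and the `missing_fields` helper are assumptions added for illustration.

```python
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class PaperSummary:
    title: str
    problem: str
    method: str
    contribution: str
    limitations: str
    # "full_text" when the PDF was parsed, "abstract" when only the abstract was used.
    source: str = "full_text"


def missing_fields(summary: PaperSummary) -> List[str]:
    # Report sections the summarizer left empty so they can be flagged or retried.
    return [name for name, value in asdict(summary).items() if not str(value).strip()]
```

A fixed schema like this makes incomplete summaries easy to detect: a summary with an empty `limitations` string is reported as `["limitations"]` instead of silently passing through.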

4. Citation & Verification Agent

Generates citations (BibTeX / APA)
Verifies claims against sources
Flags low-confidence outputs
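
Citation generation from structured metadata can be sketched as plain string formatting. The metadata keys (`authors`, `year`, `title`, `venue`) and function names below are hypothetical, the APA output is a simplified approximation of the style, and the BibTeX entry always uses `@article`; a real implementation would need to handle more entry types and name formats.

```python
from typing import Any, Dict


def apa_author(name: str) -> str:
    # "Jane Q. Doe" -> "Doe, J. Q."
    parts = name.split()
    initials = " ".join(f"{p[0]}." for p in parts[:-1])
    return f"{parts[-1]}, {initials}" if initials else parts[-1]


def format_apa(paper: Dict[str, Any]) -> str:
    authors = [apa_author(a) for a in paper["authors"]]
    if len(authors) > 1:
        joined = ", ".join(authors[:-1]) + ", & " + authors[-1]
    else:
        joined = authors[0]
    return f"{joined} ({paper['year']}). {paper['title']}. {paper['venue']}."


def format_bibtex(paper: Dict[str, Any]) -> str:
    # Citation key: first author's last name plus year, e.g. "doe2024".
    key = f"{paper['authors'][0].split()[-1].lower()}{paper['year']}"
    authors = " and ".join(paper["authors"])
    return (
        f"@article{{{key},\n"
        f"  title = {{{paper['title']}}},\n"
        f"  author = {{{authors}}},\n"
        f"  year = {{{paper['year']}}},\n"
        f"  journal = {{{paper['venue']}}}\n"
        f"}}"
    )
```

Generating citations from stored metadata, not from free-form LLM text, keeps author names, years, and titles tied to what the retrieval step actually returned.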

5. Memory Agent (RAG)

Stores previously seen papers
Enables long-term recall
Avoids repeated searches
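
The idea of storing seen papers and recalling them by similarity can be illustrated with a toy in-memory store. This is not the Chroma API: the `PaperMemory` class is a hypothetical stand-in, and the bag-of-words "embedding" replaces a learned embedding model purely to keep the sketch self-contained.

```python
import math
from collections import Counter
from typing import Any, Dict, List, Tuple


def embed(text: str) -> Counter:
    # Toy embedding: word counts. A real system would use a learned model.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class PaperMemory:
    def __init__(self) -> None:
        # paper_id -> (vector, metadata)
        self._entries: Dict[str, Tuple[Counter, Dict[str, Any]]] = {}

    def add(self, paper_id: str, text: str, metadata: Dict[str, Any]) -> None:
        # One paper maps to one entry; re-adding the same id overwrites it.
        self._entries[paper_id] = (embed(text), metadata)

    def query(self, text: str, top_k: int = 3) -> List[Tuple[float, Dict[str, Any]]]:
        q = embed(text)
        scored = [(cosine(q, vec), meta) for vec, meta in self._entries.values()]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:top_k]
```

Checking this store before calling the retrieval agent is what lets the system skip a repeated search when a sufficiently similar paper is already known.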

🧠 Memory Design

The system separates short-term state and long-term memory:

Short-term state (in-session): the LangGraph AgentState plus a small short-term memory layer:
- messages: recent user + assistant turns (bounded, e.g. last 20)
- last_papers, last_citations, last_intent, last_slots: last turn’s context, used for follow-ups like “cite those in APA” or “download the second one”
- Implemented in src/agent/state.py and src/agent/short_term_memory.py
Long-term memory (persistent): VectorDB (Chroma)
- One paper = one vector entry
- Metadata stored alongside embeddings

This enables:

conversational follow-ups without repeating the full query
persistent knowledge across sessions
faster responses for repeated topics
incremental system learning
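
The bounded short-term layer described above might look roughly like this. It is a sketch, not the contents of src/agent/short_term_memory.py: the class, its method names, and `resolve_reference` are hypothetical, and only the bounded `messages` buffer and the `last_papers` / `last_intent` fields come from the description.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, List, Optional


@dataclass
class ShortTermMemory:
    max_messages: int = 20
    messages: Deque[Dict[str, str]] = field(default_factory=deque)
    last_papers: List[Dict[str, str]] = field(default_factory=list)
    last_intent: Optional[str] = None

    def __post_init__(self) -> None:
        # Rebuild the deque with a maxlen so old turns are dropped automatically.
        self.messages = deque(self.messages, maxlen=self.max_messages)

    def add_message(self, role: str, content: str) -> None:
        self.messages.append({"role": role, "content": content})

    def resolve_reference(self, ordinal: int) -> Optional[Dict[str, str]]:
        # Map "the second one" (ordinal=2) onto last turn's papers, 1-based.
        if 1 <= ordinal <= len(self.last_papers):
            return self.last_papers[ordinal - 1]
        return None
```

With `last_papers` kept from the previous turn, a follow-up such as "download the second one" can be resolved with `resolve_reference(2)` without running a new search.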

🚑 Failure Handling & Fallbacks

The agent is designed to fail gracefully:

No search results → query expansion & retry
PDF unavailable → abstract-only summarization
Low citation confidence → uncertainty explicitly reported
Tool failure → controlled retry or safe exit
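
Two of these fallbacks can be sketched as small wrappers. The function names, the `PdfUnavailableError` exception, and the injected `search`, `expand`, and `fetch_pdf_text` callables are all assumptions for illustration; the retry count is bounded so the loop cannot run forever.

```python
from typing import Any, Callable, Dict, List


class PdfUnavailableError(Exception):
    """Hypothetical error raised when a PDF cannot be downloaded or parsed."""


def search_with_fallback(
    query: str,
    search: Callable[[str], List[Any]],
    expand: Callable[[str], str],
    max_retries: int = 2,
) -> Dict[str, Any]:
    attempts = [query]
    results = search(query)
    # No results: expand the query and retry, at most max_retries times.
    while not results and len(attempts) <= max_retries:
        query = expand(query)
        attempts.append(query)
        results = search(query)
    status = "ok" if results else "no_results"
    return {"results": results, "attempts": attempts, "status": status}


def summarize_with_fallback(
    paper: Dict[str, str],
    fetch_pdf_text: Callable[[Dict[str, str]], str],
) -> Dict[str, str]:
    try:
        return {"text": fetch_pdf_text(paper), "source": "full_text"}
    except PdfUnavailableError:
        # PDF unavailable: fall back to the abstract and say so explicitly.
        return {"text": paper["abstract"], "source": "abstract"}
```

Returning the `status`, `attempts`, and `source` fields alongside the data lets downstream agents report uncertainty to the user instead of hiding that a fallback was taken.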

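Graceful degradation is treated as a core design principle: every failure path ends in a retry, a reduced-quality answer that is labeled as such, or a safe exit.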

🛠️ Tech Stack

Python
LangGraph (graph-based agent orchestration)
LangChain (LLM & tool abstractions)
Chroma (vector database for memory)
FastAPI (service layer)
arXiv API (paper search)

🚀 Running the Project

# install dependencies
pip install -r requirements.txt

# run API server (stateless, one-shot requests)
uvicorn app:app --reload

CLI (multi-turn with short-term memory)

python main.py

In the CLI:

Each message runs the LangGraph workflow.
Short-term memory is carried across turns, so you can:
- First: “Find recent papers about diffusion models.”
- Then: “Cite those in APA.” (reuses last_papers instead of searching again)

Debug mode (print full agent state)

python main.py --debug

With --debug, the CLI prints the full final graph state (including intent, actions, papers, citations, memory logs) after each turn, which is useful when debugging routing or workflows.

📌 Design Philosophy

This project intentionally avoids:

free-form agent chat
uncontrolled agent loops
"emergent" coordination without rules

Instead, it follows explicit workflows, role boundaries, and verification steps, in line with current (2025) practice in multi-agent system design.

📖 Motivation

Single LLM calls are powerful but unreliable for academic work.

This project explores how agentic workflows and multi-agent coordination can make LLM systems:

more trustworthy
more controllable
more suitable for real research use

🔮 To-Do List

Multi-source retrieval (Semantic Scholar, OpenReview)
Improve the performance of the search agent
Summary node (summarize based on the PDF content)
PDF upload
Service environment setup
UI design

📄 License

MIT License

If you are interested in agentic systems, academic assistants, or production-grade LLM workflows, feel free to explore or contribute.
