Academic literature reviews are time consuming and cognitively demanding. Researchers need to
read many papers to extract insights, identify research gaps, and validate novel directions.
This project explores how an AI assistant could streamline this process while maintaining
accuracy and providing verifiable citations.
Duration: 2 Weeks
Team: 4 Members
Tools: Streamlit, Python, OpenAI GPT-4o-mini, ChromaDB, LangChain, Tavily API
My Contributions: I designed and implemented the core system architecture,
including the multi-agent coordination framework, the RAG pipeline with vector embeddings, and
the intent classification system. Building on this foundation, a team member developed the
Streamlit interface with streaming responses, integrated the Tavily API, and handled the
remaining smaller components.
System architecture illustrating the multi-agent coordination framework with RAG
pipeline.
Architecture & Design
Multi-Agent System Overview
The system is built around specialized AI agents that coordinate to provide comprehensive literature
analysis. Each agent handles a distinct role, enabling efficient processing of diverse research
tasks.
The Four Specialized Agents
Reader Agent: Extracts and summarizes key sections from research papers
including abstract, methodology, results, and contributions. It provides structured summaries
with clear sections for findings, methodology, and limitations.
QA Agent: Answers specific questions about paper content by searching through
relevant chunks and providing precise, citation-backed responses. It distinguishes between what
papers state explicitly and interpretations.
Research Advisor: Synthesizes insights across papers to identify gaps,
limitations, and open questions. It suggests promising future research directions with clear
rationale and potential impact.
Research Verifier: Validates research suggestions by conducting web searches
through Tavily API. It assesses whether ideas are novel, identifies related work, and provides
revisions to position ideas more effectively.
The Research Advisor analyzes uploaded literature to identify research gaps and suggest
novel future directions.
Intent Classification & Routing
The intent classifier uses GPT-4o-mini to analyze user messages and route them to the right agent. It
categorizes queries into three types: "summarize" (for paper overviews), "question" (for specific
content inquiries), or "research" (for gap analysis and future directions). This classification
happens before the main agent interaction, keeping the system responsive while ensuring users get
the most relevant help.
Intent classification automatically routes queries to the most appropriate specialized
agent.
RAG (Retrieval-Augmented Generation) Pipeline
The RAG architecture grounds all responses in the actual uploaded papers. PDFs are processed through
LangChain's PyPDFLoader, split into 1000-character chunks with 200-character overlap, and then
embedded using OpenAI's text-embedding-3-small model. These embeddings are stored in ChromaDB's
ephemeral client for fast semantic search. When a user asks a question, the system retrieves the 5
most relevant chunks and injects them directly into the agent's context, which helps ensure
responses are accurate and cite specific passages.
Technical Implementation
Document Processing Pipeline
The system handles PDF uploads through a multi-step pipeline. Each uploaded file is temporarily
saved, processed by LangChain's PyPDFLoader to extract its text content, and then deleted. The
RecursiveCharacterTextSplitter divides the extracted text into 1000-character chunks with
200-character overlap to maintain context at chunk boundaries. All chunks are aggregated and
embedded together, creating a single vector store that spans all uploaded papers.
Vector Embeddings & Semantic Search
OpenAI's text-embedding-3-small model converts chunks into dense vector representations. These
embeddings enable semantic searches that go beyond simple keyword matching. ChromaDB handles fast
retrieval of relevant context based on similarity to user queries.
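At question time, retrieval reduces to a similarity search against the store built at upload time. A minimal sketch, assuming a LangChain `Chroma` instance and a hypothetical `retrieve_context` helper:

```python
# Sketch: fetch the top-k chunks by embedding similarity and join them
# into a single context string for injection into the system prompt.
def retrieve_context(vector_store, query: str, k: int = 5) -> str:
    docs = vector_store.similarity_search(query, k=k)
    return "\n\n---\n\n".join(doc.page_content for doc in docs)
```

The separator between chunks is arbitrary; anything the model can recognize as a boundary works.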
Key Technical Features
Streaming Responses: All agent interactions use OpenAI's streaming API to
deliver tokens as they're generated, providing immediate visual feedback and a more natural
conversation feel.
Smart Context Injection: For each query, the system retrieves the top 5
relevant chunks via semantic search and injects them into the system prompt, keeping responses
grounded in source material.
Session State Management: Streamlit's session state maintains the chat history,
uploaded files, and vector store across interactions. The last 10 messages are included in each
request for conversational context.
Source Transparency: Users can expand a "Source Chunks Used" section to see
exactly which paper excerpts informed each response, improving trust and verifiability.
Research Verification with Web Search
When users request research ideas, the system follows a two-step workflow. First, the Research
Advisor generates suggestions based on the uploaded papers. Then, the system automatically extracts
key ideas from these suggestions and conducts targeted web searches using Tavily API (typically 3
queries with phrases like "related work arxiv"). The Research Verifier agent then analyzes these
search results to assess whether each idea appears novel, closely related to existing work, or
already well-explored. It provides revised versions of promising ideas along with evidence from the
web searches.
The Research Verifier validates suggestions by searching for existing related work on
the web.
Technology Stack
The technology stack balances functionality, performance, and maintainability:
Frontend: Streamlit for the interactive web interface
LLM: OpenAI GPT-4o-mini accessed via API
Embeddings: OpenAI text-embedding-3-small for creating semantic vector
representations
Vector Store: ChromaDB ephemeral client for in-memory storage and fast
retrieval of embeddings
Document Processing: LangChain's PyPDFLoader and RecursiveCharacterTextSplitter
for robust PDF handling
Web Search: Tavily API for comprehensive research verification
Backend: Python with OpenAI SDK for clean, maintainable code
Results & Future Work
Project Outcomes
The project successfully demonstrated how coordinated AI agents can streamline academic literature
analysis. The system effectively handles multiple PDFs, provides accurate summaries with citations,
and offers research direction suggestions. The web-based verification feature provides
evidence on whether suggested directions are genuinely novel.
Future Enhancements
Several potential directions could extend the system's capabilities:
Additional Specialized Agents
Citation Network Analyzer: An agent that maps citation relationships between
papers to identify influential works and research lineages
Methodology Extractor: Specialized agent focused on extracting and comparing
research methodologies across papers
Dataset Finder: Agent that identifies datasets mentioned in papers and suggests
relevant publicly available datasets
Technical Improvements
Multi-LLM Support: Allow users to choose between different language models
(GPT-4, Claude, Llama) based on their needs and preferences
Batch Processing: Enable processing of large document collections with progress
tracking and resumption capabilities
Export Functionality: Allow users to export chat histories, summaries, and
research suggestions in various formats (PDF, Markdown, LaTeX)
Enhanced Verification
Integration with academic databases (arXiv, Semantic Scholar) for more comprehensive checking
Trend analysis showing publication volume over time for suggested research directions
Lessons Learned
This project gave me hands-on experience building an AI integration system. Working through the
implementation reinforced how important clear system design becomes when coordinating multiple
agents: separating concerns between agents, getting the intent classification right, and
managing context effectively all proved critical for keeping responses coherent and reliable.