Smart PDF Reader

An interactive PDF reader powered by LangChain and GPT that enables users to upload PDF documents and chat with an AI assistant to extract insights, answer questions, and navigate content intelligently.

Features

PDF Upload: Upload any PDF document for interactive analysis
AI-Powered Q&A: Ask questions about your PDF content and get intelligent answers
Semantic Search: Uses vector embeddings to find relevant content accurately
Conversational Memory: Maintains chat history for context-aware follow-up questions
Answer-First PDF Display: Highlights the exact page containing the answer with a 📍 indicator, followed by context pages
Image-Based PDF Rendering: Reliable cross-platform PDF viewing that works on all deployment environments
Smart Page Context: Automatically displays surrounding pages (±2 pages) for better understanding
Conversational Interface: Natural chat experience powered by GPT-3.5/GPT-4
Simple UI: Clean, intuitive interface built with Streamlit
Modular Architecture: Well-organized codebase with separation of concerns for easy maintenance and extension
Performance Optimized: Cached PDF-to-image conversion for faster repeated access

Technologies

LangChain - Framework for LLM application development
Streamlit - Web application framework
OpenAI GPT - Large language model for answer generation
Chroma - Vector database for embeddings
HuggingFace Transformers - Embedding models
pdf2image - PDF to image conversion for reliable rendering
Poppler - PDF rendering engine

Prerequisites

Python 3.12 recommended (compatible with 3.10 – 3.13)
OpenAI API Key
HuggingFace API Token
Poppler (system dependency for PDF rendering)

Installation

Clone the repository

   git clone https://github.com/sheygs/smart-pdf-reader.git
   cd smart-pdf-reader

Create a virtual environment

   python3.12 -m venv venv
   source venv/bin/activate

Install system dependencies

For PDF rendering support, install Poppler:
- macOS:
```
brew install poppler
```
- Linux (Ubuntu/Debian):
```
sudo apt-get update
sudo apt-get install -y poppler-utils
```
- Windows: Download Poppler from the poppler releases page and add it to PATH

Install Python dependencies

```
pip install -r requirements.txt
```

Set up environment variables

Rename the .env.example file to .env in the root directory and populate the required keys:
```
OPENAI_API_KEY=your_openai_api_key_here
HUGGINGFACEHUB_API_TOKEN=your_huggingface_api_token_here
```
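
Before starting the app, it can help to confirm that both keys are actually visible to Python. The following is a minimal sketch, not part of the project: `missing_keys` is a hypothetical helper, and it only inspects the process environment, so values kept in `.env` must already be exported or loaded by whatever mechanism the app uses.

```python
import os

# The two keys named in the .env template above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "HUGGINGFACEHUB_API_TOKEN")


def missing_keys(env=None):
    """Return the required keys that are absent or blank in the given mapping."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not str(env.get(key, "")).strip()]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Passing a plain dictionary instead of relying on `os.environ` makes the check easy to exercise in isolation.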

Usage

Start the application

   streamlit run src/app.py

Upload your PDF
- Click on the file uploader in the sidebar
- Select a PDF document from your local machine

Process the PDF
- Click the "Process" button to analyze the document
- Wait for the processing to complete

Ask questions
- Type your question in the chat input
- The AI will analyze the PDF and provide relevant answers
- The answer page will be displayed first with a 📍 indicator
- Context pages (±2 pages) will be shown below for additional context

Folder Structure

smart-pdf-reader/
│
├── src/
│   ├── app.py                     # Main application entry point
│   ├── config.py                  # Configuration management
│   │
│   ├── core/                      # Core business logic
│   │   ├── conversation.py        # Conversation service (RAG chain)
│   │   ├── document_processor.py  # PDF document processing
│   │   ├── embeddings.py          # Embedding service
│   │   └── vector_store.py        # Vector database operations
│   │
│   ├── ui/                        # User interface components
│   │   ├── components.py          # Chat and PDF components
│   │   ├── html_templates.py      # HTML/CSS templates
│   │   ├── layout.py              # Application layout
│   │   └── session.py             # Session state management
│   │
│   └── utils/                     # Utility functions
│       ├── file_handlers.py       # File operations
│       └── pdf_renderer.py        # PDF rendering utilities
│
├── requirements.txt              # Python dependencies
├── .env.example                  # Environment variables template
├── README.md                     # Project documentation
└── .gitignore

How It Works

PDF Processing: Uploaded PDFs are parsed with PyPDF and split into manageable chunks
Embedding Creation: Text chunks are converted to vector embeddings using HuggingFace models (default: thenlper/gte-small)
Vector Storage: Embeddings are stored in a Chroma vector database for efficient similarity search
Conversational RAG: Uses LangChain's retrieval chain with chat history awareness
Query Processing: User questions are contextualized with chat history and matched against stored vectors
Answer Generation: Relevant chunks are passed to the GPT model along with the contextualized question
Answer-First Display:
- The page containing the answer is displayed first with a 📍 indicator
- Surrounding pages (±2 pages) are shown below for context
- PDF pages are converted to high-quality images (150 DPI) for reliable cross-platform rendering
- Caching ensures fast repeated access to the same pages
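
The chunking and similarity-search steps above can be sketched in plain Python. This is an illustration under loud assumptions, not the project's implementation: `toy_embed` is a word-count stand-in for the HuggingFace embedding model, the in-memory list stands in for Chroma, and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, chunk_size=200, overlap=50):
    """Split text into overlapping character windows."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def toy_embed(text):
    """Stand-in embedding: lowercase word counts (NOT a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_chunks(question, chunks, k=2):
    """Rank (page_number, text) pairs by similarity to the question."""
    query = toy_embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query, toy_embed(c[1])), reverse=True)
    return ranked[:k]


# One chunk per page keeps the example short; page numbers are zero-based here.
pages = [
    "Invoices are due within thirty days of receipt.",
    "Refunds are issued to the original payment method.",
    "Support is available by email on weekdays.",
]
chunks = list(enumerate(pages))
best_page, best_text = top_chunks("How are refunds issued?", chunks, k=1)[0]
print(best_page, best_text)
```

Keeping the page number next to each chunk is what lets a retrieval result be mapped back to a page to display first.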

Architecture

The project follows a modular architecture pattern with clear separation of concerns:

Core Module (src/core/): Business logic for document processing, embeddings, vector store, and conversation management
UI Module (src/ui/): Streamlit interface components, layouts, and session management
Utils Module (src/utils/): Reusable utilities for file handling and image-based PDF rendering
Config Module (src/config.py): Centralized configuration and environment validation

Key Design Decisions

Image-Based PDF Rendering: Uses pdf2image instead of iframe embedding for reliable cross-platform display
Answer-First UX: Displays the answer page prominently before showing context pages
Cached Rendering: PDF-to-image conversion is cached using @st.cache_data for better performance
Configurable Context: Context window (pages before/after answer) is configurable via src/config.py
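
The effect of cached rendering can be illustrated without Streamlit or Poppler installed. The sketch below swaps in `functools.lru_cache` from the standard library as an analogue of `@st.cache_data`, and `render_page` is a hypothetical stand-in that returns a string rather than a real image.

```python
from functools import lru_cache

render_calls = 0  # counts how often the expensive step actually runs


@lru_cache(maxsize=64)
def render_page(pdf_path, page_number, dpi=150):
    """Hypothetical stand-in for an expensive PDF-to-image conversion."""
    global render_calls
    render_calls += 1
    return f"image:{pdf_path}:p{page_number}@{dpi}dpi"


first = render_page("report.pdf", 3)
second = render_page("report.pdf", 3)  # same arguments, served from the cache
other = render_page("report.pdf", 4)   # different page, rendered again
print(render_calls)
```

The cache key is the full argument tuple, so changing the page number or the DPI triggers a fresh render while repeated requests for the same page do not.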

Configuration

You can customize the application behavior by modifying src/config.py:

from dataclasses import dataclass

@dataclass
class PDFConfig:
    context_page_before: int = 2  # Pages to show before answer page
    context_page_after: int = 2   # Pages to show after answer page
    default_page: int = 0         # Default page to display
    dpi: int = 150                # Image resolution for PDF rendering
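
To show how these settings might translate into the pages that get displayed, here is a small sketch. The `context_pages` helper is hypothetical and assumes zero-based page numbers (suggested by `default_page = 0`); the real logic in the project may differ.

```python
from dataclasses import dataclass


@dataclass
class PDFConfig:
    context_page_before: int = 2  # Pages to show before answer page
    context_page_after: int = 2   # Pages to show after answer page
    default_page: int = 0         # Default page to display
    dpi: int = 150                # Image resolution for PDF rendering


def context_pages(answer_page, total_pages, config=None):
    """Return (answer_page, surrounding_pages), clamped to the document bounds."""
    config = config or PDFConfig()
    start = max(0, answer_page - config.context_page_before)
    end = min(total_pages - 1, answer_page + config.context_page_after)
    surrounding = [page for page in range(start, end + 1) if page != answer_page]
    return answer_page, surrounding


# Answer on page 5 of a 10-page document with the default ±2 window.
print(context_pages(5, 10))
# A narrower window: one page before, none after.
print(context_pages(5, 10, PDFConfig(context_page_before=1, context_page_after=0)))
```

Clamping matters at the edges: an answer on the first or last page simply gets fewer context pages instead of an out-of-range index.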

Limitations

File Format Support: Currently only supports PDF files. Support for other document formats (Word, TXT, etc.) is planned for future releases
Internet Connection Required: Active internet connection needed for API calls to OpenAI and HuggingFace
API Costs: OpenAI API usage incurs costs based on usage. Monitor your API usage to avoid unexpected charges
PDF Size: Very large PDFs (100+ pages) may take longer to process and could impact performance
Language Support: Best performance with English text. Other languages may work but have not been extensively tested
Memory Usage: Processing large documents requires sufficient system memory. Close other applications if you experience slowdowns

Contributing

Contributions are welcome! The project follows a modular architecture to make it easy to contribute:

Core Features: Add new functionality in the src/core/ module
UI Improvements: Enhance the interface in the src/ui/ module
Utilities: Add helper functions in the src/utils/ module

Acknowledgments

OpenAI for GPT models
LangChain team for the excellent framework
Streamlit for the intuitive web framework
HuggingFace for open-source embedding models

License

MIT - see the LICENSE file for details.
