Working Demo of the Project

This repository contains Docbot, a multimodal chatbot that currently accepts PDF input and generates an answer together with a relevant image for each query. It works in three phases:

Pre-processing:-

The input PDF is parsed into a text file using the Apache PDFBox library.
The same PDF is also passed to a function that extracts its images together with the relevant text for each one (the text directly above and below the image).
The text file is passed through a sentence transformer, which generates embeddings for the text.
The embeddings are then passed to ANNOY (Approximate Nearest Neighbors Oh Yeah), which builds a tree structure relating the embeddings to one another.
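The pre-processing steps above can be sketched in plain Python. This is only an illustration under stated assumptions: `toy_embed` is a hashed bag-of-words stand-in for the sentence transformer, the `blocks` page layout is a hypothetical format, and `build_index` returns a plain list where the project would build an Annoy index. None of these names come from the repository.

```python
import math
import re
import zlib

DIM = 64  # toy embedding size; real sentence-transformer vectors are larger


def split_sentences(text):
    """Split extracted PDF text on '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def toy_embed(sentence):
    """Hashed bag-of-words vector, L2-normalised (stand-in for a sentence transformer)."""
    vec = [0.0] * DIM
    for word in re.findall(r"\w+", sentence.lower()):
        vec[zlib.crc32(word.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def pair_images_with_captions(blocks):
    """Use the text block directly above and below each image as its caption.

    `blocks` is a hypothetical layout: ("text", str) or ("image", name)
    tuples in reading order.
    """
    captions = {}
    for i, (kind, value) in enumerate(blocks):
        if kind != "image":
            continue
        neighbours = []
        if i > 0 and blocks[i - 1][0] == "text":
            neighbours.append(blocks[i - 1][1])
        if i + 1 < len(blocks) and blocks[i + 1][0] == "text":
            neighbours.append(blocks[i + 1][1])
        captions[value] = " ".join(neighbours)
    return captions


def build_index(sentences):
    """Return (sentence, vector) pairs; the project would add the vectors to Annoy instead."""
    return [(s, toy_embed(s)) for s in sentences]
```

The same embedding function would be applied to the image captions, so that text and captions can later be searched in the same way.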

Query - Processing

The query is passed through the sentence transformer to generate its embedding.
This embedding is compared with the embeddings of the PDF text, and the k sentences nearest to the query are retrieved (k is a hyperparameter).
The query and the k nearest sentences are passed to an LLM (Gemma 2B in our case), which generates the answer.
The embedding of the answer is compared with the image captions, and the most relevant image is selected (again using Annoy).
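The query-processing steps above can be sketched as follows. This is a minimal illustration, not the repository's code: it uses an exact brute-force cosine search where the project uses Annoy's approximate search, the three-dimensional vectors are made up, and the function names and prompt wording are assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def top_k(query_vec, items, k):
    """Return the labels of the k items most similar to query_vec, best first.

    `items` is a list of (label, vector) pairs.
    """
    ranked = sorted(items, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [label for label, _ in ranked[:k]]


def build_prompt(query, context_sentences):
    """Combine the query and the retrieved sentences into one LLM prompt."""
    context = "\n".join(f"- {s}" for s in context_sentences)
    return (
        "Answer the question using only this context:\n"
        f"{context}\n\nQuestion: {query}\nAnswer:"
    )


# Made-up embeddings for three PDF sentences and two image captions.
sentence_index = [
    ("The pump moves water.", [1.0, 0.0, 0.0]),
    ("The valve controls flow.", [0.0, 1.0, 0.0]),
    ("Pumps need regular maintenance.", [0.9, 0.1, 0.3]),
]
caption_index = [
    ("pump_diagram.jpg", [1.0, 0.1, 0.0]),
    ("valve_photo.jpg", [0.0, 1.0, 0.1]),
]

query_vec = [1.0, 0.0, 0.1]   # would come from embedding the query
context = top_k(query_vec, sentence_index, k=2)
prompt = build_prompt("How does the pump work?", context)

answer_vec = [0.9, 0.0, 0.2]  # would come from embedding the LLM's answer
best_image = top_k(answer_vec, caption_index, k=1)[0]
```

A larger k gives the LLM more context at the cost of a longer prompt, which is why it is treated as a tunable hyperparameter.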

Output:

The answer and the relevant image are returned as output.

Japanese is also supported for now. The pipeline is essentially the same, except that the embeddings are generated with a BERT model trained on Japanese Wikipedia.

TO-DO

Deploy the site and support more input types, e.g. docx
Better database system
Voice query support and text-to-speech (TTS) output

Details regarding the project and model: The project and the model are described in the PowerPoint presentation 'DOCBOT.pptx' in the current directory. A working video of the model, 'Working Clip.mp4', is also provided in the current directory.

Prerequisites:

Python 3.10 or later
pip package installer
PyTorch (with CUDA support) if using Windows (installation instructions on PyTorch's official website)

Installation:

#Clone the repository:

git clone https://github.com/Sukhvansh2004/DocBot-P22-B.git

#Navigate to the cloned repository directory and install the required packages:

cd DocBot-P22-B
pip install -r requirements.txt

#Running the Application:
#Ensure you have the necessary prerequisites installed. See Important Notes
#Start the Streamlit app:

streamlit run streamlit.py

Important Notes:

Remember to install PyTorch with CUDA support on your device as per the official PyTorch installation instructions.
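Before starting the app, the prerequisites can be checked with a short script. This is a hedged sketch: the helper names `python_ok` and `detect_device` are hypothetical and not part of the repository. It relies only on `sys.version_info` and PyTorch's `torch.cuda.is_available()`.

```python
import importlib.util
import sys


def python_ok(minimum=(3, 10)):
    """True if the running interpreter is at least the given version."""
    return sys.version_info >= minimum


def detect_device():
    """Return 'cuda', 'cpu', or 'torch-missing' depending on the local setup."""
    if importlib.util.find_spec("torch") is None:
        return "torch-missing"
    try:
        import torch
    except ImportError:
        # torch is present on disk but cannot be imported (e.g. broken install)
        return "torch-missing"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print("Python 3.10+:", python_ok())
    print("Device:", detect_device())
```

If the device is reported as `cpu` on a machine with an NVIDIA GPU, PyTorch was probably installed without CUDA support and should be reinstalled following the official instructions.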

Additional Considerations: For enhanced performance and scalability, consider using a GPU-accelerated environment. Explore advanced question-answering techniques to improve the chatbot's accuracy and versatility.
