Welcome to ACM-VIT, powered by VIT Vellore, presenting Study Easy - our project designed to help students and researchers revise and learn more effectively.
Study Easy combines two popular language models, one for summarization and one for generating question-answer pairs. A backend connects them and automatically converts your documents (PDF, PPTX) into text for the models to run on.
As mentioned above, it comprises two primary models:
- Summarization model - Fine-tuned Mistral-7B model
- Question-Answer model - Fine-tuned T5 model
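The document-to-text step described above could be sketched roughly as follows. This is a minimal illustration, not the repository's actual backend: it assumes the third-party `pypdf` and `python-pptx` packages, and the names `pdf_to_text`, `pptx_to_text`, `EXTRACTORS`, and `extract_text` are hypothetical.

```python
from pathlib import Path


def pdf_to_text(path):
    """Extract text from a PDF, page by page (assumes the `pypdf` package)."""
    from pypdf import PdfReader  # imported lazily so other formats work without it

    reader = PdfReader(str(path))
    # extract_text() may yield nothing for image-only pages, so fall back to ""
    return "\n".join((page.extract_text() or "") for page in reader.pages)


def pptx_to_text(path):
    """Extract text from every text frame on every slide (assumes `python-pptx`)."""
    from pptx import Presentation  # imported lazily for the same reason

    lines = []
    for slide in Presentation(str(path)).slides:
        for shape in slide.shapes:
            if shape.has_text_frame:
                lines.append(shape.text_frame.text)
    return "\n".join(lines)


# Hypothetical registry: file extension -> extractor function
EXTRACTORS = {".pdf": pdf_to_text, ".pptx": pptx_to_text}


def extract_text(path, extractors=EXTRACTORS):
    """Pick an extractor by file extension and return the document as plain text."""
    suffix = Path(path).suffix.lower()
    if suffix not in extractors:
        raise ValueError(f"Unsupported document type: {suffix or 'none'}")
    return extractors[suffix](path)
```

Keeping the extractors in a small registry means a new format can be supported by adding one entry, and the returned text can then be passed to either model.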
UNDERSTANDING THE PROCESS
If you want to know more about the models and how we went about training them, please refer to the documentation within the notebooks in the repository. All information regarding usage, fine-tuning, and training datasets is mentioned in great detail.
For the summarization model, we fine-tuned the mistralai/Mistral-7B-v0.1 checkpoint.
Mistral 7B is a 7.3-billion-parameter language model that uses Grouped-Query Attention (GQA), which allows faster inference than standard full attention. Its Sliding Window Attention (SWA) lets it handle longer text sequences at a lower cost.
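To illustrate why Sliding Window Attention is cheaper on long sequences, here is a small sketch of the attention mask it implies: each token attends only to itself and a fixed number of preceding tokens instead of the whole prefix. The function names and the tiny window size are illustrative assumptions, not values taken from the model.

```python
def sliding_window_mask(seq_len, window):
    """Return mask[i][j] == True when token i may attend to token j.

    Causal: j <= i. Windowed: only the `window` most recent tokens,
    i.e. j > i - window.
    """
    return [
        [(i - window < j <= i) for j in range(seq_len)]
        for i in range(seq_len)
    ]


def attended_pairs(mask):
    """Count the query/key pairs that are actually allowed by the mask."""
    return sum(sum(row) for row in mask)


mask = sliding_window_mask(seq_len=6, window=3)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
# x.....
# xx....
# xxx...
# .xxx..
# ..xxx.
# ...xxx
```

With a window of 3, the six tokens need 15 attention pairs rather than the 21 of full causal attention. The number of pairs grows linearly with sequence length for a fixed window, instead of quadratically.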
You can access Mistral 7B on HuggingFace, Vertex AI, Replicate, Sagemaker Jumpstart, Baseten, and via the Kaggle "Models" feature.
- Dual capability to excel in both natural language tasks and coding tasks.
- Understands a variety of verbal nuances, resulting in more complex interactions and perceptive answers.
Competition Check!
It outperforms the 13 billion parameter model Llama 2 on most benchmarks!
More parameters ≠ greater efficiency
- Can be used to summarize any form of literature, including technical subjects like Discrete Mathematics.
- Can produce varying types of summaries, depending on the document.
- Primed for use in research and college notes.
- Needs a larger training dataset for college notes summarization.
- Requires large computing power.
- Slow to load because of the model's size.
As the name suggests, this model uses the T5 (Text-To-Text Transfer Transformer) model to generate question and answer pairs based on a given context, i.e., the text from the document.
To understand how QAG models work, read here.
- Generation of flashcards - can be used as a study tool.
- Content Creation - Content generation platforms can use this model to automatically create questions and answers based on provided content.
- Educational Tools - Our model can be employed to create educational materials in quizzes & assessments.
- Context Sensitivity - It may generate answers based on superficial patterns rather than deep comprehension.
- Domain Specificity - If the model is trained on a specific domain, it may not generalize well to other domains. The model struggles with math-related problems.
- Length Limitations - Transformers have a maximum sequence length, so very long paragraphs may be truncated.
- More Training Data Required - Training large transformer models requires significant computational resources; hence, training them has been a challenge.
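One common way to work around the length limitation listed above is to split a long document into overlapping chunks and run the model on each chunk separately. The sketch below is an assumption about how this could be done, not a description of what the project does. It counts words as a rough stand-in for model tokens, and `chunk_words` and its default sizes are hypothetical.

```python
def chunk_words(text, max_words=200, overlap=20):
    """Split text into chunks of at most `max_words` words.

    Consecutive chunks share `overlap` words so that a sentence cut at a
    chunk boundary still appears with some context in the next chunk.
    """
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks


sample = " ".join(str(i) for i in range(10))
print(chunk_words(sample, max_words=4, overlap=1))
# ['0 1 2 3', '3 4 5 6', '6 7 8 9']
```

Because a tokenizer usually produces more tokens than there are words, `max_words` should be set comfortably below the model's actual sequence limit.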
No, we aren't asking for money.
Despite our best efforts, some limitations still keep this project from being fully useful and operational.
- Locally run only - Due to resource limitations, we cannot deploy this project.
- No OCR - For pictorial understanding and handwriting recognition, OCR is key. However, we faced multiple issues in that regard.
- Recurring Problems - Even when the model appears to run without problems, complex errors can still arise. We have tried our best to mitigate them, but the underlying issues may take time to resolve.
If you see any obvious problems that can be taken care of, please feel free to put in a pull request. Your contribution will be highly appreciated.
The ACM-VIT Study Easy team is 10 members strong.
No member was limited to one particular task, and everyone regularly chipped in across other domains.
- Vidit Kothari - ACM Developer Relations Lead 2024-25
- Hari R Kartha - ACM Internal Lead 2024-25
- Manav Muthanna - ACM Chairperson 2024-25
- Sarthak Gupta - ACM Projects Lead 2024-25
- Saharsh Bhansali - ACM Research Lead 2024-25
DOMAIN | PEOPLE | DESCRIPTION |
---|---|---|
Summarization Model | Yasha Pacholee, Rohit Jindal, Sunny Gogoi | Dealt with Hugging Face, LoRA, manual data collection, and attempted OCR & text extraction from documents. |
Question-Answer Model | Abhirup Das, Gudapati Nikhil, Raghav Sampath | Worked with QAG models and their types to try and resolve domain specificity. |
Frontend Design & Integration | Aditi Sridhar, Raghav Sampath | Used Python Streamlit for simple frontend-backend interaction and later switched to HTML, CSS & JS. |
SPECIAL MENTIONS
This project wouldn't have reached this stage without the help of Google, Stack Overflow, Medium, ChatGPT & YouTube.