In this project, we study the dataset provided by Quora in its Kaggle competition in order to detect duplicate questions. We fine-tune pretrained transformer models from the Hugging Face library, present results for several models (BERT, XLNet, DistilBERT, ...) and the hyperparameter combinations used with each, and finally explore what the resulting sentence embeddings capture.
The dataset is taken from the Quora Question Pairs competition on Kaggle:
https://www.kaggle.com/c/quora-question-pairs
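
As a minimal sketch of this fine-tuning setup (the model name, hyperparameters, and example pair below are illustrative assumptions, not the exact configuration used in the report), a question pair can be encoded as a single sequence and passed to a Hugging Face sequence-classification head:

```python
# Illustrative sketch only: fine-tuning a pretrained transformer on one
# question pair. Model name and hyperparameters are assumptions; see the
# report for the configurations actually evaluated.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # duplicate / not duplicate
)

# Encode both questions as one sequence (question1 [SEP] question2).
enc = tokenizer(
    "How can I learn Python quickly?",
    "What is the fastest way to learn Python?",
    truncation=True,
    padding="max_length",
    max_length=128,
    return_tensors="pt",
)

# One optimization step on this pair; in practice, batches come from train.csv.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
labels = torch.tensor([1])  # 1 = duplicate, 0 = not duplicate
loss = model(**enc, labels=labels).loss
loss.backward()
optimizer.step()
```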
Repository structure:

- plots: Folder containing the plots generated by class_visualization.ipynb
- report: Deliverables
  - imgs: Images used in POE_Final_Project_Quora_CanillasRubies.pdf
  - Hyperparameters_Study.pdf: Table of the hyperparameter experiments
  - POE_Final_Project_Quora_CanillasRubies.pdf: Deliverable report
  - POE_Initial_Plan.pdf: First deliverable
  - Presentacio-XavierDanae.pdf: Intermediate project presentation
- src: Folder containing the script files
  - data: CSV files
    - train.csv: Raw data
    - sentences.csv: Table with the questions and their tokenizations (from BERT)
  - class-consistency.ipynb: Prediction consistency study
  - class_visualization.ipynb: Generates the plots
  - data_analysis.ipynb: Data inference
  - input_net.py: Generates the model input
  - main.py: Model training and validation
  - most_similar_sentence.ipynb: Most similar sentence search (see the embedding sketch below)
  - table_generation.ipynb: Generates sentences.csv
  - utils.py: Auxiliary functions
- .gitignore: Untracked files
- README.md: Project documentation
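
As an illustration of the kind of search performed in most_similar_sentence.ipynb (a hedged sketch under assumed choices of model and pooling, not the notebook's actual code), sentence embeddings can be obtained by mean-pooling a transformer's last hidden states and compared with cosine similarity:

```python
# Sketch of a most-similar-sentence search via sentence embeddings.
# Model name and mean pooling are assumptions, not necessarily what
# most_similar_sentence.ipynb does.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def embed(sentences):
    """Mean-pool the last hidden states into one vector per sentence."""
    enc = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state       # (batch, seq_len, dim)
    mask = enc["attention_mask"].unsqueeze(-1)        # ignore padding tokens
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)

corpus = [
    "How do I learn Python quickly?",
    "What is the capital of France?",
]
query = embed(["What is the fastest way to learn Python?"])
vectors = embed(corpus)

# Cosine similarity between the query and every corpus sentence.
scores = torch.nn.functional.cosine_similarity(query, vectors)
print(corpus[scores.argmax().item()])  # -> the Python-learning question
```

The same idea extends to the full set of Quora questions by precomputing the embedding of every question and ranking them against a query embedding.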