ProjectQuora

Dànae Canillas Sánchez & Xavier Rubiés Cullell

This project studies the dataset provided by Quora for its Kaggle competition, with the goal of detecting duplicate questions. We fine-tune pretrained transformer models from the Hugging Face library and report results for different models (BERT, XLNet, DistilBERT, ...) and for the hyperparameter combinations we tried. Finally, we explore what the resulting sentence embeddings capture.

The dataset comes from the Quora Question Pairs competition on Kaggle:
https://www.kaggle.com/c/quora-question-pairs
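
As a rough illustration of how question pairs from this dataset could be prepared for a Hugging Face model, here is a minimal sketch. It is not the code in `src`. It assumes the competition's `train.csv` layout with the columns `question1`, `question2` and `is_duplicate`; the two sample rows are made up for the example and are not taken from the dataset. The helper names `read_pairs` and `encode_pairs` and the `distilbert-base-uncased` checkpoint are likewise illustrative choices.

```python
import csv
import io

# Made-up rows in the assumed train.csv layout (NOT real dataset rows).
SAMPLE_CSV = '''id,qid1,qid2,question1,question2,is_duplicate
0,1,2,"How do I learn Python?","What is the best way to learn Python?",1
1,3,4,"What is the capital of France?","How tall is the Eiffel Tower?",0
'''


def read_pairs(handle):
    """Return three parallel lists: first questions, second questions, labels."""
    first, second, labels = [], [], []
    for row in csv.DictReader(handle):
        # Skip rows where either question is missing or empty.
        if not row["question1"] or not row["question2"]:
            continue
        first.append(row["question1"])
        second.append(row["question2"])
        labels.append(int(row["is_duplicate"]))
    return first, second, labels


def encode_pairs(first, second, checkpoint="distilbert-base-uncased", max_length=128):
    """Tokenize each (question1, question2) pair as a single sequence-pair input.

    The checkpoint name is an assumption; any sequence-classification
    checkpoint could be substituted. Requires `transformers` and downloads
    the tokenizer on first use, so the import is kept local to this function.
    """
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    return tokenizer(first, second, truncation=True, padding=True, max_length=max_length)


if __name__ == "__main__":
    first, second, labels = read_pairs(io.StringIO(SAMPLE_CSV))
    print(len(first), "pairs, labels:", labels)
```

For the real data, `read_pairs` could be called on `open("train.csv", newline="", encoding="utf-8")` instead of the in-memory sample. The output of `encode_pairs` is the kind of input that `AutoModelForSequenceClassification` with `num_labels=2` accepts for fine-tuning.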

  • src: Folder containing script files