This repository provides PyTorch implementation for the paper Dynamic Semantic Matching and Aggregation Network for Few-shot Intent Detection (Findings of EMNLP'2020)
Python 3.6.2
Numpy
Pandas
Pytorch 1.0.1
Scikit-learn 0.21.1
We conduct the split on NLUE and SNIPS dataset in dataset directory. Please take a look at our paper for details of the split.
Please obtain and put the pre-trained FastText embedding in our fasttext directory of the repository (named as vectors-en.txt). Otherwise, create your own FastText embedding directory and update the argument in Configuration below.
--ckpt_dir: Saved directory for checkpoint--eps: Evaluate as nonepisodic or episodic procedure (i.e. eps or noneps)--num_eps: Number of episodes used for episodic training and/or evaluation--dataset: Choose dataset to train/evaluate (i.e. SNIPS/ NLUE)--num_run: number of runs (only for SNIPS)--num_fold: number of KFold counting from 1 to 10 (only for NLUE)--src: Source data used for training (i.e. seen)--tgt: Data used for evaluation (i.e. novel or joint)--num_samples_per_class: K in C-way K-shot--num_class: C class in C-way K-shot--num_query_per_class: num query per class (Q)--num_test_class: Number of classes used for evaluation (i.e. C for episodic, #total classes in joint/novel space)--fasttext_path: FastText pretrained embedding file location
Regularization hyperparameters
--self_attn_loss: coefficient for Self-attention Regularization--uniform_loss: coefficient for Head Uniform Regularization--same_intent_loss: coefficient for Head Distribution Regularization
FSL (episode-nonepisode)
python main.py --dataset='SNIPS' --tgt='novel' --eps='eps' --num_class=2 --num_test_class=2
python main.py --dataset='SNIPS' --tgt='novel' --eps ='noneps' --num_class=2 --num_test_class=2
GFSL (episode-nonepisode)
python main.py --dataset='SNIPS' --tgt='joint' --eps='eps' --num_class=2 --num_test_class=2
python main.py --dataset='SNIPS' --tgt='joint' --eps ='noneps' --num_class=2 --num_test_class=7
FSL (episode-nonepisode)
python main.py --dataset='NLUE' --tgt='novel' --eps='eps' --num_class=5 --num_test_class=5
python main.py --dataset='NLUE' --tgt='novel' --eps ='noneps' --num_class=5 --num_test_class=16
GFSL (episode-nonepisode)
python main.py --dataset='NLUE' --tgt='joint' --eps='eps' --num_class=5 --num_test_class=5
python main.py --dataset='NLUE' --tgt='joint' --eps ='noneps' --num_class=5 --num_test_class=64
If you use our ideas, code or dataset, please cite the following paper:
@inproceedings{nguyen-etal-2020-dynamic,
title = "Dynamic Semantic Matching and Aggregation Network for Few-shot Intent Detection",
author = "Nguyen, Hoang and
Zhang, Chenwei and
Xia, Congying and
Yu, Philip",
booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
month = nov,
year = "2020",
address = "Online",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/2020.findings-emnlp.108",
pages = "1209--1218",
}
https://github.com/ZhixiuYe/MLMAN
https://github.com/galsang/BIMPM-pytorch