jieba: https://github.com/fxsjy/jieba
paddlepaddle: https://github.com/PaddlePaddle/Paddle
Tencent AI Lab Embedding Corpora for Chinese and English Words and Phrases:
https://ai.tencent.com/ailab/nlp/en/embedding.html
The page above links to a download page for the embeddings. We used v0.2.0 with dimension 100 and vocabulary size Small (2,000,000), which is the third entry in the first table.
Download and decompress the archive, then place the entire extracted folder inside the CS577_FinalProject-main folder.
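To confirm the download is intact before running the project, the embedding file can be inspected with a short script. The sketch below assumes the file uses the plain-text word2vec layout (a header line with vocabulary size and dimension, then one entry per line); this is an assumption about the file, not something taken from main.py. The function name load_embeddings and its limit parameter are hypothetical and are not part of this project.

```python
def load_embeddings(path, limit=None):
    """Read a word2vec-style text file into a dict of word -> list of floats.

    Assumed layout: first line is "<vocab_size> <dim>", every later line is
    "<word or phrase> <v1> ... <v_dim>". `limit` stops after that many entries,
    which is handy for a quick check without loading the whole file.
    """
    vectors = {}
    with open(path, encoding="utf-8") as f:
        vocab_size, dim = map(int, f.readline().split())
        for i, line in enumerate(f):
            if limit is not None and i >= limit:
                break
            # Split from the right so a phrase containing spaces stays whole.
            parts = line.rstrip().rsplit(" ", dim)
            if len(parts) != dim + 1:
                continue  # skip lines that do not match the assumed layout
            vectors[parts[0]] = [float(x) for x in parts[1:]]
    return dim, vectors
```

Calling load_embeddings on the extracted text file with limit=1000 should return a dimension of 100 for the version described above, if the layout assumption holds.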
Once everything is in place, change to the CS577_FinalProject-main folder and run
python main.py
from the command line to start the program.
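If the program stops with an import error, it may help to check that the two libraries listed at the top are importable. The following is a minimal sketch, not part of the project; the helper name missing_packages is hypothetical. It relies on the fact that the paddlepaddle package is imported under the module name paddle.

```python
from importlib.util import find_spec

# Package name -> module name used in "import ...".
REQUIRED = {"jieba": "jieba", "paddlepaddle": "paddle"}

def missing_packages(required=REQUIRED):
    """Return the package names whose import module cannot be found."""
    return [pkg for pkg, module in required.items() if find_spec(module) is None]

if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

Any package reported as missing needs to be installed in the Python environment used to run main.py.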