Source of the pretrained word embeddings #8

Open
zjuwfz opened this issue Dec 21, 2018 · 2 comments

Comments

zjuwfz commented Dec 21, 2018

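Hello, I have recently been running BiLSTM-CRF word segmentation experiments, and using the pretrained word embeddings from your project improved my results by two points. Where do your word embeddings come from? Or did you train them yourself?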

hankcs (Owner) commented Dec 21, 2018

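Thanks for using the project; that is an encouraging result. My word embeddings (actually character embeddings) take into account the compositional structure of Chinese characters, such as radicals and other components, and were trained with fastText's General Continuous Skip-Gram (SG) Model. For the principles behind this kind of character vector, see https://arxiv.org/pdf/1712.08841.pdf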
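For readers who want to experiment with the idea, here is a minimal sketch of one possible way to feed radical information into a fastText skip-gram model. It is not the actual pipeline used for this project's embeddings, which is not described in this thread. The `RADICALS` table, the "character plus radical" token format, and the helper names are all hypothetical. The only library call used is `fasttext.train_unsupervised` from the official fastText Python module, imported lazily so that the corpus-preparation part runs without it.

```python
# Hypothetical toy lookup table: character -> radical.
# A real experiment would need a full decomposition dictionary.
RADICALS = {"河": "氵", "海": "氵", "树": "木", "林": "木"}


def to_training_line(sentence, radicals=RADICALS):
    """Turn a sentence into space-separated tokens, one per character.

    Each token is the character followed by its radical (if known), so that
    fastText's subword n-grams can share the radical across characters.
    This token format is an assumption made for illustration only.
    """
    tokens = []
    for ch in sentence:
        if ch.isspace():
            continue
        tokens.append(ch + radicals.get(ch, ""))
    return " ".join(tokens)


def train_char_vectors(corpus_path, dim=100):
    """Train skip-gram vectors on a file of lines built by to_training_line.

    Requires the `fasttext` package; minn=1 and maxn=2 make single characters
    and radicals count as subword units.
    """
    import fasttext  # imported lazily; only needed for training

    return fasttext.train_unsupervised(
        corpus_path, model="skipgram", dim=dim, minn=1, maxn=2
    )


if __name__ == "__main__":
    print(to_training_line("河海树"))
```

With this encoding, 河 and 海 share the subword 氵, so their vectors are pulled toward each other during training. Whether that helps a downstream segmenter would have to be checked empirically.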

zjuwfz (Author) commented Dec 24, 2018

Thank you very much!
