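Follow-up: I counted, and among the entries in the genre dictionary only four fail to be recognized: 动画 (animation), 恐怖 (horror), 喜剧 (comedy), and 科幻 (sci-fi). Why is that?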
Symptom
After running the following command:
rasa train -c config/config.yml --data data/training_dataset_1660793545.json data/stories.md --out models/movie --domain config/domain.yml --num-threads 5 --augmentation 100 -vv
warnings like the following appear:
C:\Users\26282\miniconda3\envs\rasa2formovieQA\lib\site-packages\rasa\shared\utils\io.py:93: UserWarning: Failed to use example '郭富城表演过哪些喜剧电影' to train MITIE entity extractor. Example will be skipped.Error: Invalid entity {'end': 10, 'entity': 'genre', 'start': 8, 'value': '喜剧'} in example '郭富城表演过哪些喜剧电影': entities must span whole tokens. Wrong entity end.
As a result, when the trained model is run later, it cannot recognize entities of type genre (喜剧 "comedy", 动画 "animation", and so on).
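To make the "entities must span whole tokens" message concrete, here is a minimal sketch of the kind of check that fails. It does not call Rasa, MITIE, or jieba; `token_spans` and `check_entity` are hypothetical helpers, and both token lists are assumed segmentations written by hand for illustration, not actual tokenizer output. The idea is that an entity is only usable if its `start` coincides with the start of some token and its `end` coincides with the end of some token.

```python
from typing import Dict, List, Tuple


def token_spans(text: str, tokens: List[str]) -> List[Tuple[int, int]]:
    """Return (start, end) character offsets for tokens found in order in text."""
    spans = []
    pos = 0
    for tok in tokens:
        start = text.index(tok, pos)  # raises ValueError if a token is missing
        end = start + len(tok)
        spans.append((start, end))
        pos = end
    return spans


def check_entity(text: str, tokens: List[str], entity: Dict) -> str:
    """Report whether the entity's offsets line up with token boundaries."""
    spans = token_spans(text, tokens)
    starts = {s for s, _ in spans}
    ends = {e for _, e in spans}
    if entity["start"] not in starts:
        return "wrong entity start"
    if entity["end"] not in ends:
        return "wrong entity end"
    return "ok"


text = "郭富城表演过哪些喜剧电影"
entity = {"start": 8, "end": 10, "entity": "genre", "value": "喜剧"}

# Assumed segmentation in which the tokenizer keeps "喜剧电影" as one token.
merged_tokens = ["郭富城", "表演", "过", "哪些", "喜剧电影"]
# Assumed segmentation in which "喜剧" and "电影" are separate tokens.
split_tokens = ["郭富城", "表演", "过", "哪些", "喜剧", "电影"]

merged_result = check_entity(text, merged_tokens, entity)
split_result = check_entity(text, split_tokens, entity)
print(merged_result)
print(split_result)
```

If the tokenizer really does merge the genre word with the following 电影, that would be consistent with the "Wrong entity end" in the warning, but this should be confirmed by inspecting the actual tokenizer output for the affected sentences.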
Training data
{"text":"方中信表演动画电影有哪些","intent":"search_person_genre_movie","entities":[{"end":3,"entity":"person","start":0,"value":"方中信"},{"end":7,"entity":"genre","start":5,"value":"动画"}]}
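As a quick sanity check, the annotation offsets in the example above can be compared against the text itself. This is a small sketch using only the standard library; `mismatched_entities` is a hypothetical helper, not part of Rasa. It returns every entity whose `value` differs from `text[start:end]`.

```python
import json
from typing import Dict, List


def mismatched_entities(example: Dict) -> List[Dict]:
    """Return entities whose value does not equal the annotated slice of the text."""
    text = example["text"]
    return [
        ent
        for ent in example.get("entities", [])
        if text[ent["start"]:ent["end"]] != ent["value"]
    ]


raw = (
    '{"text":"方中信表演动画电影有哪些","intent":"search_person_genre_movie",'
    '"entities":[{"end":3,"entity":"person","start":0,"value":"方中信"},'
    '{"end":7,"entity":"genre","start":5,"value":"动画"}]}'
)
example = json.loads(raw)
bad = mismatched_entities(example)
print(bad)
```

For this example the list is empty, so the character offsets themselves agree with the text. That suggests looking at how the sentence is split into tokens rather than at the annotation offsets.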
config.yml
A user dictionary is configured for the jieba tokenizer:
pipeline:
model: "data/total_word_feature_extractor_zh.dat"
dictionary_path: "jieba_userdict"