This is a solution to the NER task based on Google's BERT model, implemented in TensorFlow; the bilm+crf part is adapted from Guillaume Genthial's code. A detailed theoretical walkthrough is available in 右左瓜子's Zhihu column.
- Google's BERT implementation: google research bert
- Google's Chinese pre-trained model: chinese-pretrain
- 1. Download the code and files from the requirements above, and put bilm_crf.py and bert_bilm_crf.py into Google's BERT directory, at the same level as modeling.py.
- 2. Run the script:
python .\bert_bilm_crf.py --task_name=ner --do_train=true --do_eval=false --do_predict=false --data_dir=path\to\yourdata \
--vocab_file=path\to\chinese_L-12_H-768_A-12\vocab.txt --bert_config_file=path\to\chinese_L-12_H-768_A-12\bert_config.json \
--init_checkpoint=path\to\chinese_L-12_H-768_A-12\bert_model.ckpt --max_seq_length=50 --train_batch_size=32 \
--learning_rate=5e-5 --num_train_epochs=2.0 --output_dir=/tmp/ner_output/
- 3. Export the trained model by running export.py
- 4. Deploy on the server by running export.sh
- 5. Client example: clien.py
example.tsv contains two rows of sample data; format your own data the same way.
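Before training, it helps to sanity-check that your converted data is well formed. The layout below — one sentence per row, a tab, then space-separated tags, one per character — is only an assumption for illustration; check example.tsv for the actual format your data must match.

```python
import csv
import io

def read_ner_tsv(fileobj):
    """Read rows of (sentence, tag sequence) from a tab-separated file.

    ASSUMPTION: one sentence per row, a tab, then space-separated
    character-level tags. Verify against example.tsv before relying on this.
    """
    examples = []
    for row in csv.reader(fileobj, delimiter="\t"):
        if len(row) != 2:
            continue  # skip malformed rows
        text, tag_str = row
        tags = tag_str.split()
        if len(tags) != len(text):
            raise ValueError("tag count must match character count")
        examples.append((list(text), tags))
    return examples

# Hypothetical row in the assumed layout:
sample = "今天去北京\tO O O B-LOC I-LOC\n"
print(read_ner_tsv(io.StringIO(sample)))
```

A check like this catches misaligned tag sequences before they surface as shape errors inside the model.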
This is a solution to the NER task based on BERT and bilm+crf. The BERT model comes from Google's GitHub, and the bilm+crf part is inspired by Guillaume Genthial's code; visit this page for more details.
- Google's BERT model: google research bert
- Google's Chinese pre-trained model: chinese-pretrain
- 1. Download the code and files mentioned above, and put bilm_crf.py and bert_bilm_crf.py into Google's BERT directory; the two Python files go at the same level as modeling.py.
- 2. Run the script:
python .\bert_bilm_crf.py --task_name=ner --do_train=true --do_eval=false --do_predict=false --data_dir=path\to\yourdata \
--vocab_file=path\to\chinese_L-12_H-768_A-12\vocab.txt --bert_config_file=path\to\chinese_L-12_H-768_A-12\bert_config.json \
--init_checkpoint=path\to\chinese_L-12_H-768_A-12\bert_model.ckpt --max_seq_length=50 --train_batch_size=32 \
--learning_rate=5e-5 --num_train_epochs=2.0 --output_dir=/tmp/ner_output/
- 3. Export the model by running export.py
- 4. Deploy by running export.sh
- 5. Example client: clien.py
example.tsv contains sample rows that show the expected format; transform your data into this format, or modify the input_fn in bert_bilm_crf.py.
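If the exported model is served through TensorFlow Serving's REST API (an assumption here — the actual transport used by clien.py may differ), a minimal stdlib-only client could look like the sketch below. The endpoint URL, model name, and input tensor names (`input_ids`, `input_mask`) are illustrative and depend on how export.py and export.sh configure the serving signature.

```python
import json
import urllib.request

# Assumed endpoint and model name for illustration only.
SERVER_URL = "http://localhost:8501/v1/models/ner:predict"

def build_request(token_ids, mask):
    """Build a TF Serving REST predict payload (input names are assumed)."""
    body = {"instances": [{"input_ids": token_ids, "input_mask": mask}]}
    return json.dumps(body).encode("utf-8")

def predict(token_ids, mask, url=SERVER_URL):
    """POST one example to the serving endpoint and return its predictions."""
    req = urllib.request.Request(
        url,
        data=build_request(token_ids, mask),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["predictions"]

if __name__ == "__main__":
    # Dummy ids; real inputs come from the BERT tokenizer over vocab.txt.
    print(predict([101, 2769, 102], [1, 1, 1]))
```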
Update: added a sentiment-analysis method, bert_senta.py, and a prediction method, senta_pred.py. The data-loading code in senta_pred.py is commented out; add your own data-loading logic before using it.
The prediction method feeds examples through a dataset stream, so a single prediction takes about 10 ms.
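The low per-call latency typically comes from building the input stream once and keeping it alive across calls, instead of reconstructing the pipeline for every prediction (in TensorFlow this means feeding a long-lived estimator.predict from a generator-backed dataset). A framework-free sketch of that pattern, with a stand-in model function, assuming nothing about the actual senta_pred.py internals:

```python
import queue
import threading

class StreamingPredictor:
    """Keep one prediction loop alive and push single examples through it.

    The background loop stands in for a long-lived estimator.predict()
    stream: it is built once, then each call only enqueues one example
    and dequeues one result, avoiding per-call setup cost.
    """

    def __init__(self, model_fn):
        self._inputs = queue.Queue()
        self._outputs = queue.Queue()

        def loop():
            # Consume examples until a None sentinel arrives.
            for example in iter(self._inputs.get, None):
                self._outputs.put(model_fn(example))

        self._thread = threading.Thread(target=loop, daemon=True)
        self._thread.start()

    def predict(self, example):
        self._inputs.put(example)
        return self._outputs.get()

    def close(self):
        self._inputs.put(None)

# Usage with a dummy "model":
pred = StreamingPredictor(lambda x: x * 2)
print(pred.predict(21))  # -> 42
pred.close()
```

The design point is that construction cost is paid once in `__init__`; `predict` touches only the two queues, which is what keeps the steady-state per-example time small.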