T5 base Japanese NER

A named entity recognition model made by fine-tuning sonoisa/t5-base-japanese.

Input

A text file. The default text is "伊藤左千夫は1893年から知人から学んだ短歌を詠むようになったが、当初は古今和歌集の流れをくむ月並調の伝統的な短歌を詠んでいた。"

Output

A list of dictionaries, one per recognized named entity. 'span' gives the start and end positions of the entity in the original sentence, 'type' gives the category of the entity, and 'text' contains the text of the entity. The model was trained to classify entities into one of the following categories: {人名 (person), 法人名 (corporation), 政治的組織名 (political organization), その他の組織名 (other organization), 地名 (location), 施設名 (facility), 製品名 (product), イベント名 (event)}.

[{'span': [0, 5], 'type': '人名', 'text': '伊藤左千夫'}, {'span': [36, 41], 'type': '製品名', 'text': '古今和歌集'}]
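The sample output above is consistent with reading 'span' as a half-open range [start, end) of character offsets into the input text. The following minimal sketch checks that reading against the default sentence; `check_spans` is a hypothetical helper written for this illustration and is not part of the repository.

```python
# Default input sentence and sample output, both copied from this README.
text = (
    "伊藤左千夫は1893年から知人から学んだ短歌を詠むようになったが、"
    "当初は古今和歌集の流れをくむ月並調の伝統的な短歌を詠んでいた。"
)
entities = [
    {'span': [0, 5], 'type': '人名', 'text': '伊藤左千夫'},
    {'span': [36, 41], 'type': '製品名', 'text': '古今和歌集'},
]


def check_spans(text, entities):
    """Return True if every span, read as [start, end), slices out the entity text.

    Assumption: offsets count characters (not bytes) and the end is exclusive.
    """
    return all(
        text[entity['span'][0]:entity['span'][1]] == entity['text']
        for entity in entities
    )


# Print each entity's category next to the substring its span selects.
for entity in entities:
    start, end = entity['span']
    print(entity['type'], text[start:end])
```

Under this reading, `text[start:end]` recovers each entity directly, so the spans can be used to highlight or mask entities in the original sentence.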

Usage

An Internet connection is required when running the script for the first time, as the model files will be downloaded automatically.

Running the script below outputs the named entities predicted for the input text file.

Running this script in an FP16 environment results in an error because of the limited range of half-precision floating point values. Switch to the CPU if necessary by setting the -e argument to 0 when running the command below.

$ python3 t5_base_japanese_ner.py -f input.txt
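As background for the FP16 note above, half-precision floats cannot represent values much beyond 65504. The sketch below illustrates that range limit using only the Python standard library. It is an illustration of the number format, not a reproduction of the error the script raises, and `fits_in_fp16` is a hypothetical helper name.

```python
import struct


def fits_in_fp16(value):
    """Return True if value can be packed as an IEEE 754 half-precision float."""
    try:
        # The 'e' format code is the 16-bit binary16 float.
        struct.pack('<e', value)
        return True
    except OverflowError:
        # Raised when the magnitude exceeds the FP16 range.
        return False


print(fits_in_fp16(65504.0))  # largest finite FP16 value
print(fits_in_fp16(70000.0))  # beyond the FP16 range
```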

To pass the text directly on the command line instead of reading it from a file, use the -i (or --input) argument.

$ python3 t5_base_japanese_ner.py -i "2008年10月5日、アウェーでのレクレアティーボ・ウェルバ戦でプリメーラ・ディビシオンでの初得点を決めた。"

With the --savepath option (-s in the example below), the list of results is pickled and saved to the specified path.

$ python3 t5_base_japanese_ner.py -f input.txt -s result.pickle
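The saved file can then be read back with Python's standard `pickle` module. The sketch below assumes the pickle holds the same list of dictionaries shown in the Output section; `load_entities` is a hypothetical helper, not a function from the repository.

```python
import os
import pickle


def load_entities(path):
    """Load the pickled list of entity dictionaries saved with --savepath.

    Assumption: the file contains a list of dicts with the keys
    'span', 'type', and 'text', as in the Output section.
    """
    with open(path, 'rb') as f:
        return pickle.load(f)


# 'result.pickle' is the path used in the command above.
if os.path.exists('result.pickle'):
    for entity in load_entities('result.pickle'):
        print(entity['span'], entity['type'], entity['text'])
```

Because unpickling can execute arbitrary code, only load pickle files that you created yourself or that come from a trusted source.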
