AttributeError: 'NoneType' object has no attribute 'tokenize'

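* I am using the BERT MSR setup. I downloaded the dataset and confirmed that all the data is present under `/data/msr/`.
* I have also set up `config.json` in `models/bert`.
* Running the BERT MSR portion of `run.sh` produces the error below. The result is the same with torch 1.1 and torch 1.3.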
```
device: cuda n_gpu: 2, distributed training: False, 16-bits training: False
# of word in train: 88120:
# of n-gram in memory: 49121
# of trainable parameters: 140006464
***** Running training *****
  Num examples = 81190
  Batch size = 16
  Num steps = 253700
  0%|                                                                                          | 0/5075 [00:00<?, ?it/s]
Epoch:   0%|                                                                                     | 0/50 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/joshhu/workspace/WMSeg/wmseg_main.py", line 677, in <module>
    main()
  File "/home/joshhu/workspace/WMSeg/wmseg_main.py", line 667, in main
    train(args)
  File "/home/joshhu/workspace/WMSeg/wmseg_main.py", line 194, in train
    train_features = convert_examples_to_features(batch_examples)
  File "/home/joshhu/workspace/WMSeg/wmseg_model.py", line 270, in convert_examples_to_features
    token = tokenizer.tokenize(word)
AttributeError: 'NoneType' object has no attribute 'tokenize'
```
Thanks.
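
The traceback shows that `tokenizer` is already `None` when `convert_examples_to_features` calls `tokenizer.tokenize(word)`. One possible cause, which is an assumption and not confirmed in this report, is that the tokenizer could not be loaded from the model directory, for example because a vocabulary file is missing next to `config.json`. The sketch below is a generic pre-flight check, not part of WMSeg: the helper names `missing_model_files` and `require_tokenizer` are hypothetical, and the list of expected file names is an assumption to adjust for your checkpoint.

```python
from pathlib import Path

# Assumed file names for a local BERT checkpoint directory; adjust as needed.
EXPECTED_FILES = ("config.json", "vocab.txt", "pytorch_model.bin")


def missing_model_files(model_dir, expected=EXPECTED_FILES):
    """Return the names in `expected` that are not regular files in model_dir."""
    root = Path(model_dir)
    return [name for name in expected if not (root / name).is_file()]


def require_tokenizer(tokenizer, model_dir):
    """Fail early with a descriptive message if the tokenizer is None.

    Hypothetical helper: call it right after the tokenizer is created,
    before any code reaches tokenizer.tokenize().
    """
    if tokenizer is None:
        missing = missing_model_files(model_dir)
        raise RuntimeError(
            f"Tokenizer for {model_dir!r} is None; "
            f"missing files: {missing or 'none detected'}"
        )
    return tokenizer
```

Running `missing_model_files("models/bert")` before training lists which of the assumed files are absent, which may help narrow down whether the directory contents are the problem.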

AttributeError: 'NoneType' object has no attribute 'tokenize' #15
