
AttributeError: 'NoneType' object has no attribute 'tokenize' #15

@joshhu

Description

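  • I am using bert msr. I downloaded the dataset and confirmed that all of the data is present under the /data/msr/ directory.
  • I also set up config.json in models/bert.
  • Running the bert msr part of run.sh produces the output below. The result is the same with torch 1.1 and torch 1.3.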
device: cuda n_gpu: 2, distributed training: False, 16-bits training: False
# of word in train: 88120:
# of n-gram in memory: 49121
# of trainable parameters: 140006464
***** Running training *****
  Num examples = 81190
  Batch size = 16
  Num steps = 253700
  0%|                                                                                          | 0/5075 [00:00<?, ?it/s]
Epoch:   0%|                                                                                     | 0/50 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/joshhu/workspace/WMSeg/wmseg_main.py", line 677, in <module>
    main()
  File "/home/joshhu/workspace/WMSeg/wmseg_main.py", line 667, in main
    train(args)
  File "/home/joshhu/workspace/WMSeg/wmseg_main.py", line 194, in train
    train_features = convert_examples_to_features(batch_examples)
  File "/home/joshhu/workspace/WMSeg/wmseg_model.py", line 270, in convert_examples_to_features
    token = tokenizer.tokenize(word)
AttributeError: 'NoneType' object has no attribute 'tokenize'

Thanks.
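
The last frame of the traceback shows that `tokenizer` was `None` when `tokenizer.tokenize(word)` was called. The report does not say why. The sketch below is not WMSeg code. It only reproduces the same error message in isolation and shows one way to fail earlier with a clearer message. `FakeTokenizer` and `tokenize_words` are hypothetical names introduced for illustration.

```python
class FakeTokenizer:
    """Hypothetical stand-in for a real tokenizer: splits a word into characters."""

    def tokenize(self, word):
        return list(word)


def reproduce_error():
    """Call .tokenize on None and return the resulting error message."""
    tokenizer = None  # assumption: the tokenizer was never loaded
    try:
        tokenizer.tokenize("word")
    except AttributeError as exc:
        return str(exc)


def tokenize_words(words, tokenizer):
    """Tokenize each word, failing early if no tokenizer is available."""
    if tokenizer is None:
        # Raise a descriptive error before the loop starts.
        raise ValueError(
            "tokenizer is None; check that the model directory and its "
            "vocabulary/config files were loaded before training starts"
        )
    tokens = []
    for word in words:
        tokens.extend(tokenizer.tokenize(word))
    return tokens


if __name__ == "__main__":
    print(reproduce_error())
    print(tokenize_words(["ab", "c"], FakeTokenizer()))
```

A guard like this does not fix the underlying problem. It only moves the failure to the point where the missing tokenizer can be reported by name.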
