I get the same training error by every epoch  #98

@byuns9334

I'm running "th Train.lua -epochSave -learningRateAnnealing 1.1 -trainingSetLMDBPath prepare_datasets/libri_lmdb/train/ -validationSetLMDBPath prepare_datasets/libri_lmdb/test/ -LSTM -hiddenSize 500 -permuteBatch" on the LibriSpeech dataset, but I get the same training error on every epoch, while the loss keeps decreasing.

Here's what I get:

Number of parameters: 31576697
[==================== 136/136 ================>] Tot: 1m13s | Step: 646ms
Training Epoch: 1 Average Loss: nan Average Validation WER: 100.09 Average Validation CER: 62.14
Saving model..
[==================== 136/136 ================>] Tot: 1m18s | Step: 566ms
Training Epoch: 2 Average Loss: 7047724391721807312664730917666816.000000 Average Validation WER: 100.05 Average Validation CER: 61.98
Saving model..
[==================== 136/136 ================>] Tot: 1m18s | Step: 588ms
Training Epoch: 3 Average Loss: 3568794773768703940829837988462592.000000 Average Validation WER: 100.05 Average Validation CER: 62.00
Saving model..
[==================== 136/136 ================>] Tot: 1m19s | Step: 555ms
Training Epoch: 4 Average Loss: nan Average Validation WER: 100.05 Average Validation CER: 62.03
Saving model..
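
The epoch summary lines above can be checked mechanically for loss values that are NaN, infinite, or implausibly large. The following is a minimal sketch in Python, not part of the repository: the regular expression, the function names, and the `max_loss` cutoff of 1e6 are all illustrative assumptions, and the progress-bar and "Saving model.." lines are left out of the sample text.

```python
import math
import re

# Pattern for the per-epoch summary lines as they appear in the log above.
EPOCH_LINE = re.compile(
    r"Training Epoch: (\d+) Average Loss: (\S+) "
    r"Average Validation WER: (\S+) Average Validation CER: (\S+)"
)

def parse_epochs(log_text):
    """Return a list of (epoch, loss, wer, cer) tuples, one per summary line."""
    rows = []
    for match in EPOCH_LINE.finditer(log_text):
        epoch, loss, wer, cer = match.groups()
        # float("nan") parses the literal "nan" printed in the log.
        rows.append((int(epoch), float(loss), float(wer), float(cer)))
    return rows

def unstable_epochs(rows, max_loss=1e6):
    """Epochs whose loss is NaN, infinite, or above max_loss (an arbitrary cutoff)."""
    return [e for e, loss, _, _ in rows if not math.isfinite(loss) or loss > max_loss]

# Summary lines copied from the log above.
LOG = """\
Training Epoch: 1 Average Loss: nan Average Validation WER: 100.09 Average Validation CER: 62.14
Training Epoch: 2 Average Loss: 7047724391721807312664730917666816.000000 Average Validation WER: 100.05 Average Validation CER: 61.98
Training Epoch: 3 Average Loss: 3568794773768703940829837988462592.000000 Average Validation WER: 100.05 Average Validation CER: 62.00
Training Epoch: 4 Average Loss: nan Average Validation WER: 100.05 Average Validation CER: 62.03
"""

rows = parse_epochs(LOG)
print(unstable_epochs(rows))
```

With the four summary lines from this run, every epoch is flagged: epochs 1 and 4 report nan, and epochs 2 and 3 report losses on the order of 1e33.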

How should I resolve this?
