Test error calculation is not correct

I tried to use dcrnnsupervise.evaluate() to compute the test MAE and found that the result changes substantially with the test batch size. The cause is that the current implementation computes the (masked) MAE per batch and then simply averages the per-batch values. This is not correct: the masked average divides by the number of unmasked entries, which varies from batch to batch, so the mean of per-batch masked MAEs generally does not equal the masked MAE over the whole test set. The correct approach is to first collect all predictions (and the corresponding targets) and then compute the (masked) MAE over the full set.
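To illustrate the discrepancy, here is a minimal NumPy sketch. It is not the repository's code: `masked_mae`, `batch_averaged_mae`, and `accumulated_mae` are hypothetical helpers, the toy arrays are made up, and treating 0.0 as the masked (missing) value is an assumption.

```python
import numpy as np


def masked_mae(preds, labels, null_val=0.0):
    """MAE over entries whose label is not the null value (assumed 0.0)."""
    mask = labels != null_val
    return float(np.abs(preds - labels)[mask].mean())


def batch_averaged_mae(preds, labels, batch_size):
    """The problematic scheme: masked MAE per batch, then a plain mean."""
    losses = [
        masked_mae(preds[i:i + batch_size], labels[i:i + batch_size])
        for i in range(0, len(labels), batch_size)
    ]
    return float(np.mean(losses))


def accumulated_mae(preds, labels, batch_size, null_val=0.0):
    """Batch-wise alternative: accumulate error sum and valid count."""
    total_err, total_cnt = 0.0, 0
    for i in range(0, len(labels), batch_size):
        p, y = preds[i:i + batch_size], labels[i:i + batch_size]
        mask = y != null_val
        total_err += float(np.abs(p - y)[mask].sum())
        total_cnt += int(mask.sum())
    return total_err / total_cnt


# Toy data: two labels are masked (0.0), four are valid.
labels = np.array([2.0, 0.0, 0.0, 1.0, 1.0, 1.0])
preds = np.array([4.0, 9.0, 9.0, 2.0, 2.0, 2.0])

full = masked_mae(preds, labels)             # MAE over the whole set
by3 = batch_averaged_mae(preds, labels, 3)   # per-batch average, batch size 3
by2 = batch_averaged_mae(preds, labels, 2)   # per-batch average, batch size 2
acc2 = accumulated_mae(preds, labels, 2)     # accumulated, batch size 2

print(full, by3, by2, acc2)
```

In this toy case the per-batch average gives a different number for each batch size, and neither matches the full-set value, while accumulating the absolute-error sum and the valid count across batches reproduces the full-set MAE without holding all predictions in memory. A batch with no valid labels would need separate handling in `masked_mae`; the sketch does not cover that case.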

Test error calculation is not correct #16
