Skip to content
Stefan Weil edited this page May 11, 2020 · 13 revisions

Training Fraktur with Neue Zürcher Zeitung

See https://github.com/impresso/NZZ-black-letter-ground-truth.

Trainings

Training set 1

All original lines were randomly split in 90 % for training and 10 % for evaluation. 3 line images were skipped because they exceeded the width limit in current lstmtrain. Training is running for 10 epochs.

make -r MODEL_NAME=nzz-new GROUND_TRUTH_DIR=NZZ_groundtruth/gt MAX_ITERATIONS=388550 training

Current CER: 1.384 %, CPU time: 28:33 h

Training set 2

10 pages (same as in original test) were used for evaluation. The code for lstmtrain was modified to allow line images with a width of up to 4096 px. Training is running for 10 epochs with the default network specification and an alternate specification which scales to 64 px height.

make -r MODEL_NAME=nzz-ref GROUND_TRUTH_DIR=NZZ_groundtruth/gt MAX_ITERATIONS=387780 training

Current CER: 1.379 %, CPU time: 26:09 h

make -r MODEL_NAME=nzz-64 GROUND_TRUTH_DIR=NZZ_groundtruth/gt MAX_ITERATIONS=387780 NET_SPEC="[1,64,0,1 Ct3,3,16 Mp3,3 Lfys48 Lfx96 Lrx96 Lfx256 O1c148]" training

Current CER: 2.852 %, CPU time: 15:43 h

``

Clone this wiki locally