Make the exponential decay lr schedule available #147
Comments
No more things in the configuration file; there is already too much stuff in there that is only relevant to training. Maybe |
There is https://kbknapp.github.io/clap-rs/clap/struct.Arg.html#method.requires |
maybe |
Yep. It's worth checking whether they get hidden when the requirement is not given. But I guess at the very least it would also group the arguments together in the usage information? (Which would go a long way toward not making it too confusing.) |
https://kbknapp.github.io/clap-rs/clap/struct.ArgGroup.html
https://kbknapp.github.io/clap-rs/clap/struct.App.html#method.arg_group |
I looked a bit further into it and am not yet happy with it. Examples are below. For the grouping in the help to work, we need to set
Grouping:
No grouping:
Making the args mutually exclusive works via
.group(ArgGroup::with_name(SCHEDULE_GROUP).required(true))
.arg(
    Arg::with_name(PLATEAU)
        .long("plateau")
        .help("Plateau learning rate schedule")
        .group(SCHEDULE_GROUP)
        .requires(PLATEAU_GROUP),
)
.group(
    ArgGroup::with_name(PLATEAU_GROUP)
        .multiple(true)
        .conflicts_with_all(&[EXPONENTIAL, EXPONENTIAL_GROUP])
)
.arg(
    Arg::with_name(LR_PATIENCE)
        .long("lr-patience")
        .value_name("N")
        .help("Scale learning rate after N epochs without improvement")
        .group(PLATEAU_GROUP)
        .default_value_if(PLATEAU, None, "5"),
)
.arg(
    Arg::with_name(LR_SCALE)
        .long("lr-scale")
        .value_name("SCALE")
        .help("Value to scale the learning rate by")
        .group(PLATEAU_GROUP)
        .default_value_if(PLATEAU, None, "0.5"),
)
.arg(
    Arg::with_name(EXPONENTIAL)
        .long("exponential")
        .help("Exponential learning rate schedule")
        .group(SCHEDULE_GROUP)
        .requires(EXPONENTIAL_GROUP),
)
.group(
    ArgGroup::with_name(EXPONENTIAL_GROUP)
        .multiple(true)
        .conflicts_with_all(&[PLATEAU, PLATEAU_GROUP])
)
.arg(
    Arg::with_name(DECAY_RATE)
        .long("decay-rate")
        .value_name("RATE")
        .help("coefficient of the exponential decay")
        .group(EXPONENTIAL_GROUP)
        .default_value_if(EXPONENTIAL, None, "0.998"),
)
.arg(
    Arg::with_name(DECAY_STEPS)
        .long("decay-steps")
        .value_name("STEPS")
        .help("global_step / steps is the exponent of the decay_rate")
        .group(EXPONENTIAL_GROUP)
        .default_value_if(EXPONENTIAL, None, "100"),
)
The error messages we're getting are sometimes helpful:
$ ./target/release/sticker train dep.conf ger/train.conll ger/dev.conll
error: The following required arguments were not provided:
<--plateau|--exponential>
USAGE:
sticker train <CONFIG> <TRAIN_DATA> <VALIDATION_DATA> --batchsize <BATCH_SIZE> --lr <LR> --patience <N> --warmup <N> <--plateau|--exponential>
For more information try --help
Sometimes not so much:
$ ./target/release/sticker train dep.conf train.conll dev.conll --exponential --lr-patience 5 --lr-scale 0.3
error: The argument '--exponential' cannot be used with one or more of the other specified arguments
USAGE:
sticker train <CONFIG> <TRAIN_DATA> <VALIDATION_DATA> --batchsize <BATCH_SIZE> --lr <LR> --patience <N> --warmup <N> <--decay-rate <RATE>|--decay-steps <STEPS>> <--lr-patience <N>|--lr-scale <SCALE>> <--plateau|--exponential>
For more information try --help
$ ./target/release/sticker train dep.conf ger/train.conll ger/dev.conll --exponential --decay-rate 5 --lr-scale 0.3
error: The argument '--lr-scale <SCALE>' cannot be used with one or more of the other specified arguments
USAGE:
sticker train <CONFIG> <TRAIN_DATA> <VALIDATION_DATA> --batchsize <BATCH_SIZE> --lr <LR> --patience <N> --warmup <N> <--decay-rate <RATE>|--decay-steps <STEPS>> <--lr-patience <N>|--lr-scale <SCALE>> <--plateau|--exponential>
For more information try --help |
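One possible way around the unhelpful conflict messages would be to check the flag combination by hand after parsing and name the offending option directly. The following is only a sketch of that idea in plain Rust, without clap: the `ScheduleFlags` struct, the `validate` function, and the message wording are all hypothetical and not part of sticker.

```rust
/// Hypothetical summary of which schedule-related options were given on the
/// command line. Illustrative only, not one of sticker's actual types.
#[derive(Default)]
struct ScheduleFlags {
    plateau: bool,
    exponential: bool,
    lr_patience: bool,
    lr_scale: bool,
    decay_rate: bool,
    decay_steps: bool,
}

/// Check the combination of flags and return a message that names the
/// offending option together with the schedule it belongs to.
fn validate(flags: &ScheduleFlags) -> Result<(), String> {
    match (flags.plateau, flags.exponential) {
        (true, true) => {
            return Err("--plateau and --exponential are mutually exclusive".to_string())
        }
        (false, false) => {
            return Err("one of --plateau or --exponential is required".to_string())
        }
        _ => {}
    }

    // `foreign` lists the options that belong to the schedule that was NOT chosen.
    let (chosen, other, foreign) = if flags.exponential {
        (
            "--exponential",
            "--plateau",
            [("--lr-patience", flags.lr_patience), ("--lr-scale", flags.lr_scale)],
        )
    } else {
        (
            "--plateau",
            "--exponential",
            [("--decay-rate", flags.decay_rate), ("--decay-steps", flags.decay_steps)],
        )
    };

    for (name, given) in foreign.iter() {
        if *given {
            return Err(format!(
                "{} can only be used with {}, but {} was given",
                name, other, chosen
            ));
        }
    }
    Ok(())
}
```

With this check, the `--exponential --decay-rate 5 --lr-scale 0.3` invocation from above would be rejected with a message that points at `--lr-scale` and says which schedule it needs, at the cost of duplicating some of the logic that the argument groups already encode.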
Right now, the exponential decay lr schedule is not available for sticker train and sticker pretrain. Once #145 is merged, it would make sense to have it available for both subcommands. This may clutter the command line arguments a bit since we then have:
Plateau decay
Exponential decay
Maybe it would make sense to move the learning-rate schedule related things to the config file.
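For reference, the two schedules can be sketched roughly as follows. This is a minimal illustration and not sticker's implementation: the names `exponential_lr` and `Plateau` are made up, the exponential variant assumes a continuous (non-staircase) exponent of `global_step / decay_steps`, and the plateau variant assumes that a higher validation score is better and that the rate is scaled as soon as `patience` epochs have passed without improvement.

```rust
/// Exponential decay, assuming a continuous (non-staircase) exponent:
/// lr = initial_lr * decay_rate^(global_step / decay_steps)
fn exponential_lr(
    initial_lr: f64,
    decay_rate: f64,
    decay_steps: usize,
    global_step: usize,
) -> f64 {
    let exponent = global_step as f64 / decay_steps as f64;
    initial_lr * decay_rate.powf(exponent)
}

/// Hypothetical plateau schedule state: multiply the learning rate by
/// `scale` after `patience` consecutive epochs without improvement.
struct Plateau {
    lr: f64,
    scale: f64,
    patience: usize,
    best_score: f64,
    epochs_without_improvement: usize,
}

impl Plateau {
    fn new(initial_lr: f64, scale: f64, patience: usize) -> Self {
        Plateau {
            lr: initial_lr,
            scale,
            patience,
            best_score: f64::NEG_INFINITY,
            epochs_without_improvement: 0,
        }
    }

    /// Record the validation score of one epoch (assumption: higher is
    /// better) and return the learning rate to use for the next epoch.
    fn update(&mut self, score: f64) -> f64 {
        if score > self.best_score {
            self.best_score = score;
            self.epochs_without_improvement = 0;
        } else {
            self.epochs_without_improvement += 1;
            if self.epochs_without_improvement >= self.patience {
                self.lr *= self.scale;
                self.epochs_without_improvement = 0;
            }
        }
        self.lr
    }
}
```

The plateau schedule needs two parameters (patience and scale) and the exponential one needs two as well (decay rate and decay steps), which is where the four extra options per subcommand come from.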