Releases: open-mmlab/mmengine
MMEngine Release V0.8.3
v0.8.3 (31/07/2023)
Highlights
- Support enabling `efficient_conv_bn_eval` for efficient convolution and batch normalization. See save memory on gpu for more details
- Add Llama2 finetune example
- Support multi-node distributed training with MLU backend
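The memory saving behind `efficient_conv_bn_eval` appears to rest on a general fact: a batch normalization layer in eval mode is a fixed per-channel affine transform, so it can be folded into the parameters of the preceding convolution. The sketch below illustrates that folding on a single-channel scalar "convolution" in plain Python. It is not MMEngine's implementation, and the helper names (`conv`, `bn_eval`, `fold_bn`) and all numeric values are hypothetical.

```python
import math


def conv(x, weight, bias):
    """A 1x1 'convolution' on one channel: a scalar affine map."""
    return weight * x + bias


def bn_eval(y, mean, var, gamma, beta, eps=1e-5):
    """Batch norm in eval mode: normalize with fixed running statistics."""
    return gamma * (y - mean) / math.sqrt(var + eps) + beta


def fold_bn(weight, bias, mean, var, gamma, beta, eps=1e-5):
    """Fold the eval-mode BN into the conv parameters (hypothetical helper)."""
    scale = gamma / math.sqrt(var + eps)
    return weight * scale, (bias - mean) * scale + beta


# Hypothetical parameter values, for illustration only.
w, b = 0.8, 0.1
stats = dict(mean=0.3, var=2.0, gamma=1.5, beta=-0.2)

x = 1.7
two_step = bn_eval(conv(x, w, b), **stats)  # conv followed by BN
w_f, b_f = fold_bn(w, b, **stats)
one_step = conv(x, w_f, b_f)                # single folded conv
```

Both paths give the same output, but the folded path never materializes the intermediate convolution result, which is where the memory saving would come from.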
New Features & Enhancements
- Enable `efficient_conv_bn_eval` for memory saving convolution and batch normalization by @youkaichao in #1202, #1251 and #1259
- Add Llama2 example by @HAOCHENYE in #1264
- Compare the difference of two configs by @gachiemchiep in #1260
- Enable explicit error for deepspeed not installed by @Li-Qingyun in #1240
- Support skipping initialization in `BaseModule` by @HAOCHENYE in #1263
- Add parameter `save_begin` to control when to save checkpoints by @KerwinKai in #1271
- Support multi-node distributed training with MLU backend by @josh6688 in #1266
- Enhance error message thrown by Config, build function and `ConfigDict.items` by @HAOCHENYE in #1272, #1270 and #1088
- Add the `loop_stage` runtime information in `message_hub` by @zhouzaida in #1277
- Fix Visualizer that built `vis_backends` will not be used when `save_dir` is `None` by @Xinyu302 in #1275
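As a rough illustration of the `save_begin` parameter listed above, a checkpoint hook entry in a config might look like the fragment below. Only `save_begin` is taken from this release note. The surrounding keys follow the usual MMEngine `default_hooks` layout, the values are hypothetical, and the exact epoch at which saving starts is an assumption.

```python
# Hypothetical values; only the `save_begin` name comes from the release note.
default_hooks = dict(
    checkpoint=dict(
        type='CheckpointHook',
        interval=2,     # save a checkpoint every 2 epochs ...
        save_begin=10,  # ... assumed: skip saving until training reaches epoch 10
    )
)
```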
Bug fixes
- Fix scalar check in RuntimeInfoHook by @i-aki-y in #1250
- Move data preprocessor to target device in FSDPStrategy by @HAOCHENYE in #1261
Docs
- Add ecosystem in README by @zhouzaida in #1247
- Add short explanation about registry scope by @mmeendez8 in #1114
- Add the data flow of Runner in README by @zhouzaida in #1257
- Introduce how to customize distributed training settings by @zhouzaida in #1279
New Contributors
- @youkaichao made their first contribution in #1202
- @mmeendez8 made their first contribution in #1114
- @Xinyu302 made their first contribution in #1275
Full Changelog: v0.8.2...v0.8.3
MMEngine Release V0.8.2
Bug fixes
- Fix pickling the Python style config by @HAOCHENYE in #1241
- Fix the logic of setting `lazy_import` by @Li-Qingyun in #1239
New Contributors
- @Li-Qingyun made their first contribution in #1239
Full Changelog: v0.8.1...v0.8.2
MMEngine Release V0.8.1
New Features & Enhancements
- Accelerate `Config.dump` and support converting Lazyxxx to string in `ConfigDict.to_dict` by @HAOCHENYE in #1232
Bug fixes
- FSDP should call `_get_ignored_modules` by @HAOCHENYE in #1235
Docs
- Add a document to introduce how to train a large model by @zhouzaida in #1228
Full Changelog: v0.8.0...v0.8.1
MMEngine Release V0.8.0
v0.8.0 (07/03/2023)
Highlights
- Support training with FSDP and DeepSpeed. Refer to the example for more detailed usages.
- Introduce the pure Python style configuration file:
  - Support navigating to base configuration file in IDE
  - Support navigating to base variable in IDE
  - Support navigating to source code of class in IDE
  - Support inheriting two configuration files containing the same field
  - Load the configuration file without other third-party requirements

  Refer to the tutorial for more detailed usages.
New Features & Enhancements
- Support training with FSDP by @HAOCHENYE in #1213
- Add `FlexibleRunner` and `Strategies`, and support training with DeepSpeed by @zhouzaida in #1183
- Support pure Python style configuration file by @HAOCHENYE in #1071
- Learning rate in log can show the base learning rate of optimizer by @AkideLiu in #1019
- Refine the error message when auto_scale_lr is not set correctly by @alexander-soare in #1181
- WandbVisBackend supports updating config by @zgzhengSEU in #977
Bug fixes
- CheckpointHook should check whether file exists before removing it by @zhouzaida in #1198
- Fix undefined variable error in Runner by @HAOCHENYE in #1219
Docs
- Add a document to introduce how to debug with vscode by @zhouzaida in #1212
- Update English introduction by @evdcush in #1189
- Fix parameter typing error in document by @syo093c in #1201
- Fix gpu collection during evaluation by @edkair in #1208
- Fix a comment in runner tutorial by @joihn in #1210
New Contributors
- @alexander-soare made their first contribution in #1181
- @zgzhengSEU made their first contribution in #977
- @AkideLiu made their first contribution in #1019
- @syo093c made their first contribution in #1201
- @edkair made their first contribution in #1208
- @joihn made their first contribution in #1210
Full Changelog: v0.7.4...v0.8.0
MMEngine Release V0.7.4
v0.7.4 (06/03/2023)
Highlights
- Support using `ClearML` to record experiment data
- Add `Sophia` optimizers
New Features & Enhancements
- Add visualize backend for clearml by @gachiemchiep in #1091
- Support Sophia optimizers by @zhouzaida in #1170
- Refactor unittest syncbuffer by @HAOCHENYE in #813
- Allow `ann_file` and `data_root` to be `None` for `BaseDataset` by @HAOCHENYE in #850
- Enable full precision training on Ascend NPU by @Ginray in #1109
- Creating a text classification example by @TankNee in #1122
- Add option to log selected config only by @KickCellarDoor in #1159
- Add an option to control whether to show progress bar in BaseInference by @W-ZN in #1135
- Support dipu device by @CokeDong in #1127
- Let unit tests not affect each other by @zhouzaida in #1169
- Add support for full wandb's `define_metric` arguments by @i-aki-y in #1099
Bug fixes
- Fix the incorrect device of inputs in get_model_complexity_info by @CescMessi in #1130
- Correctly save `_metadata` of `state_dict` when saving checkpoints by @Bomsw in #1131
- Correctly record random seed in log by @Shiyang980713 in #1152
- Close MLflowVisBackend only if active by @zimonitrome in #1151
- Fix `ProfileHook` cannot profile ddp-training by @HAOCHENYE in #1140
- Handle the case for Multi-Instance GPUs when using `cuda_visible_devices` by @adrianjoshua-strutt in #1164
- Fix attribute error when parsing `CUDA_VISIBLE_DEVICES` in logger by @Xiangxu-0103 in #1172
Docs
- Translate `infer.md` by @Hongru-Xiao in #1121
- Fix a missing comma in `tutorials/runner.md` by @gy-7 in #1146
- Fix typo in comment by @YQisme in #1154
- Translate `data_element.md` by @xin-li-67 in #1067
- Add the usage of clearml by @zhouzaida in #1180
New Contributors
- @CescMessi made their first contribution in #1130
- @Bomsw made their first contribution in #1131
- @Hongru-Xiao made their first contribution in #1121
- @TankNee made their first contribution in #1122
- @W-ZN made their first contribution in #1135
- @gy-7 made their first contribution in #1146
- @YQisme made their first contribution in #1154
- @Shiyang980713 made their first contribution in #1152
- @KickCellarDoor made their first contribution in #1159
- @CokeDong made their first contribution in #1127
- @zimonitrome made their first contribution in #1151
- @adrianjoshua-strutt made their first contribution in #1164
- @gachiemchiep made their first contribution in #1091
- @i-aki-y made their first contribution in #1099
Full Changelog: v0.7.3...v0.7.4
MMEngine Release V0.7.3
What's Changed
New Features & Enhancements
- Add `MLflowVisBackend` by @sh0622-kim in #878
- Support customizing `worker_init_fn` in dataloader config by @shufanwu in #1038
- Make the parameters of get_model_complexity_info() friendly by @sjiang95 in #1056
- Add torch_npu optimizer by @luomaoling in #1079
- Support registering callable objects by @C1rN09 in #595
- Complement type hint of get_model_complexity_info() by @sjiang95 in #1064
- MessageHub.get_info() supports returning a default value by @enkilee in #991
- Refactor logger hook unit test by @HAOCHENYE in #797
- Support BoolTensor and LongTensor on Ascend NPU by @Ginray in #1011
- Remove useless variable declaration by @HAOCHENYE in #1052
- Enhance the support for MLU device by @josh6688 in #1075
- Support configuring synchronization directory for BaseMetric by @HAOCHENYE in #1074
- Support accepting multiple `input_shape` for `get_model_complexity_info` by @sjiang95 in #1065
- Enhance docstring and error catching in `MessageHub` by @HAOCHENYE in #1098
- Enhance the efficiency of Visualizer.show by @HAOCHENYE in #1015
- Update repo list by @HAOCHENYE in #1108
- Enhance error message during custom import by @HAOCHENYE in #1102
- Support `_load_state_dict_post_hooks` in `load_state_dict` by @mzr1996 in #1103
Bug fixes
- Fix publishing multiple checkpoints when using multiple GPUs by @JunweiZheng93 in #1070
- Fix error when `log_with_hierarchy` is `True` by @HAOCHENYE in #1085
- Call SyncBufferHook before validation in IterBasedTrainLoop by @Luo-Yihang in #982
- Fix the resuming error caused by HistoryBuffer by @HAOCHENYE in #1078
- Failed to remove the previous best checkpoints by @HAOCHENYE in #1086
- Fix using incorrect local rank by @C1rN09 in #973
- No training log when the num of iterations is smaller than the default interval by @shufanwu in #1046
- `collate_fn` could not be a function object by @zhouzaida in #1093
- Fix `optimizer.state` could be saved in cuda:0 by @HAOCHENYE in #966
- Fix building unnecessary loop during train/test/val by @HAOCHENYE in #1107
Docs
- Introduce the use of wandb and tensorboard by @zhouzaida in #912
- Translate tutorials/evaluation.md by @LEFTeyex in #1053
- Translate design/evaluation.md by @zccjjj in #1062
- Fix three typos in runner by @jsrdcht in #1068
- Translate migration/hook.md to English by @SheffieldCao in #1054
- Replace MMCls with MMPretrain in docs by @zhouzaida in #1096
New Contributors
- @sh0622-kim made their first contribution in #878
- @Ginray made their first contribution in #1011
- @shufanwu made their first contribution in #1038
- @sjiang95 made their first contribution in #1056
- @JunweiZheng93 made their first contribution in #1070
- @SheffieldCao made their first contribution in #1054
- @jsrdcht made their first contribution in #1068
- @josh6688 made their first contribution in #1075
- @Luo-Yihang made their first contribution in #982
- @zccjjj made their first contribution in #1062
Full Changelog: v0.7.2...v0.7.3
MMEngine Release V0.7.2
v0.7.2 (04/06/2023)
Bug fixes
- Align the evaluation result in log by @kitecats in #1034
- Update the logic to calculate the `repeat_factors` in `ClassBalancedDataset` by @BIGWangYuDong in #1048
- Initialize sub-modules in `DistributedDataParallel` that define `init_weights` during initialization by @HAOCHENYE in #1045
- Refactor checkpointhook unittest by @HAOCHENYE in #789
New Contributors
Full Changelog: v0.7.1...v0.7.2
MMEngine Release V0.7.1
v0.7.1 (04/03/2023)
Highlights
- Support compiling the model and enabling mixed-precision training at the same time
- Fix the bug where the logs cannot be properly saved to the log file after calling `torch.compile`
New Features & Enhancements
- Add `mmpretrain` to the `MODULE2PACKAGE` by @mzr1996 in #1002
- Support using `get_device` in the compiled model by @C1rN09 in #1004
- Make sure the FileHandler still alive after `torch.compile` by @HAOCHENYE in #1021
- Unify the use of `print_log` and `logger.info(warning)` by @LEFTeyex in #997
- Publish models after training if published_keys is set in CheckpointHook by @KerwinKai in #987
- Enhance the error catching in registry by @HAOCHENYE in #1010
- Do not print config if it is empty by @zhouzaida in #1028
Bug fixes
- Fix there is no space between `data_time` and metric in logs by @HAOCHENYE in #1025
Docs
New Contributors
- @evdcush made their first contribution in #1018
- @KerwinKai made their first contribution in #987
Full Changelog: v0.7.0...v0.7.1
MMEngine Release V0.7.0
v0.7.0 (03/16/2023)
Highlights
- Support PyTorch 2.0! Accelerate training by compiling models. See the tutorial Model Compilation for details
- Add `EarlyStoppingHook` to stop training when the metric does not improve
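The idea behind stopping "when the metric does not improve" can be sketched in a few lines of plain Python. The `EarlyStopper` class below is a hypothetical stand-in that assumes a higher-is-better metric and a patience counter. It does not reproduce the `EarlyStoppingHook` API or its defaults.

```python
class EarlyStopper:
    """Hypothetical helper: stop after `patience` evaluations without improvement."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience    # tolerated evaluations without improvement
        self.min_delta = min_delta  # minimum gain that counts as an improvement
        self.best = None
        self.bad_evals = 0

    def should_stop(self, metric):
        """Record one evaluation result (higher is better) and report whether to stop."""
        if self.best is None or metric > self.best + self.min_delta:
            self.best = metric
            self.bad_evals = 0
        else:
            self.bad_evals += 1
        return self.bad_evals >= self.patience


stopper = EarlyStopper(patience=2)
accuracies = [0.60, 0.65, 0.64, 0.65, 0.70]  # hypothetical validation results
stopped_at = None
for epoch, acc in enumerate(accuracies):
    if stopper.should_stop(acc):
        stopped_at = epoch
        break
```

With a patience of 2, the loop stops at the second consecutive evaluation that fails to beat the best value, so the final 0.70 is never reached. That is the usual trade-off when choosing the patience.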
New Features & Enhancements
- Add configurations to support `torch.compile` in Runner by @C1rN09 in #976
- Support `EarlyStoppingHook` by @nijkah in #739
- Disable duplicated warning during distributed training by @HAOCHENYE in #961
- Add `FUNCTIONS` root Registry by @HAOCHENYE in #983
- Save the "memory" field to visualization backends by @enkilee in #974
- Enable bf16 in `AmpOptimWrapper` by @C1rN09 in #960
- Support writing data to `vis_backend` with prefix by @HAOCHENYE in #972
- Support exporting logs of different ranks in debug mode by @HAOCHENYE in #968
- Silence error when `ManagerMixin` built instance with duplicate name by @HAOCHENYE in #990
Bug fixes
- Fix optim_wrapper unittest for `pytorch < 1.10.0` by @C1rN09 in #975
- Support calculating the flops of `matmul` with single dimension matrix by @HAOCHENYE in #970
- Fix repeated warning by @HAOCHENYE in #992
- Fix lint by @zhouzaida in #993
- Fix AMP in Ascend and support using NPUJITCompile environment by @luomaoling in #994
- Fix inferencer gets wrong configs path by @HAOCHENYE in #996
Docs
- Translate "Debug Tricks" to English by @enkilee in #953
- Translate "Model Analysis" document to English by @enkilee in #956
- Translate "Model Complexity Analysis" to Chinese. by @VoyagerXvoyagerx in #969
- Add a document about setting interval by @YuetianW in #964
- Translate "how to set random seed" by @xin-li-67 in #930
- Fix typo by @zhouzaida in #965
- Fix typo in hook document by @acdart in #980
- Fix changelog date by @HAOCHENYE in #986
New Contributors
- @YuetianW made their first contribution in #964
- @enkilee made their first contribution in #953
- @acdart made their first contribution in #980
- @VoyagerXvoyagerx made their first contribution in #969
Full Changelog: v0.6.0...v0.7.0
MMEngine Release V0.6.0
v0.6.0 (02/24/2023)
Highlights
- Support `Apex` with `ApexOptimWrapper`
- Support analyzing model complexity.
- Add `Lion` optimizer.
- Support using environment variable in the config file.
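For readers unfamiliar with Lion, its update rule is compact: the step direction is the sign of an interpolation between the momentum and the current gradient, and the momentum is updated afterwards with a second coefficient. The sketch below applies that rule to a single scalar parameter. The function name `lion_step` and the numeric values are hypothetical, and this is not the optimizer class shipped with MMEngine.

```python
def sign(v):
    """Return -1, 0 or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def lion_step(param, grad, momentum, lr=1e-4, beta1=0.9, beta2=0.99,
              weight_decay=0.0):
    """One Lion update for a scalar parameter; returns (new_param, new_momentum)."""
    # The direction uses only the sign of the interpolated momentum and gradient.
    update = sign(beta1 * momentum + (1 - beta1) * grad)
    # Decoupled weight decay, then a fixed-size step of length lr.
    param = param * (1 - lr * weight_decay) - lr * update
    # The momentum is refreshed after the step, with its own coefficient.
    momentum = beta2 * momentum + (1 - beta2) * grad
    return param, momentum


p, m = 1.0, 0.0
p, m = lion_step(p, grad=0.5, momentum=m, lr=0.1)
```

Because only the sign is used, every parameter moves by exactly `lr` per step regardless of the gradient magnitude.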
New Features & Enhancements
- Support model complexity computation by @tonysy in #779
- Add Lion optimizer by @zhouzaida in #952
- Support using environment variable in config file. by @jbwang1997 in #744
- Improve registry infer_scope by @zhouzaida in #334
- Support configuring `timeout` in dist configuration by @apacha in #877
- Beautify the print result of the registry by @Eiuyc in #922
- Refine the style of table by @zhouzaida in #941
- Refine the `repr` of Registry by @zhouzaida in #942
- Feature NPUProfilerHook by @luomaoling in #925
- Refactor hooks unittest by @HAOCHENYE in #946
- Temporarily fix `collect_env` raise errors and stops programs by @C1rN09 in #944
- Make sure Tensors to broadcast is contiguous by @XWHtorrentx in #948
- Clean the UT warning caused by pytest by @zhouzaida in #947
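To make the "environment variable in config file" item above more concrete, the sketch below resolves placeholders of the form `{{$NAME:default}}` in a line of config text. The placeholder syntax is an assumption about how MMEngine configs reference environment variables, and `resolve_env` is a hypothetical helper written for this illustration rather than an MMEngine function.

```python
import os
import re

# Assumed placeholder form: {{$NAME:default}} (the ":default" part is optional).
_PLACEHOLDER = re.compile(r'\{\{\s*\$(\w+)\s*(?::\s*([^}]*?))?\s*\}\}')


def resolve_env(text, environ=None):
    """Hypothetical helper: substitute environment variables into config text."""
    environ = os.environ if environ is None else environ

    def _sub(match):
        name, default = match.group(1), match.group(2)
        if name in environ:
            return environ[name]
        if default is not None:
            return default
        raise KeyError(f'environment variable {name} is not set')

    return _PLACEHOLDER.sub(_sub, text)


line = "data_root = '{{$DATASET:/data/coco/}}'"
with_default = resolve_env(line, environ={})
overridden = resolve_env(line, environ={'DATASET': '/mnt/coco/'})
```

Passing `environ` explicitly keeps the example independent of the real process environment. In practice the lookup would go through `os.environ`.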
Bug fixes
- Backend_args should not be modified by get_file_backend by @zhouzaida in #897
- Support update `np.ScalarType` data in message_hub by @HAOCHENYE in #898
- Support rendering Chinese character in `Visualizer` by @KevinNuNu in #887
- Support `Apex` with `ApexOptimWrapper` by @xcnick in #742
- Fix the bug of `DefaultOptimWrapperConstructor` when the shared parameters do not require the grad by @HIT-cwh in #903
- Support model complexity computation by @tonysy in #779
Docs
- Add the document for the transition between IterBasedTraining and EpochBasedTraining by @HAOCHENYE in #926
- Introduce how to set random seed by @zhouzaida in #914
- Count FLOPs and parameters by @zhouzaida in #939
- Enhance README by @Xiangxu-0103 in #835
- Add a document about debug tricks by @zhouzaida in #938
- Refine the format of changelog and visualization document by @zhouzaida in #906
- Move examples to a new directory by @zhouzaida in #911
- Resolve warnings in sphinx build by @C1rN09 in #915
- Fix docstring by @zhouzaida in #913
- How to set the interval parameter by @zhouzaida in #917
- Temporarily skip errors in building pdf docs at readthedocs by @C1rN09 in #928
- Add the links of twitter, discord, medium, and youtube by @vansin in #924
- Fix typo `shedule` by @Dai-Wenxun in #936
- Fix failed URL by @zhouzaida in #943
New Contributors
- @apacha made their first contribution in #877
- @KevinNuNu made their first contribution in #887
- @xcnick made their first contribution in #742
- @Eiuyc made their first contribution in #922
- @tonysy made their first contribution in #779
- @luomaoling made their first contribution in #925
- @XWHtorrentx made their first contribution in #948
Full Changelog: v0.5.0...v0.6.0