
dolikey commented Sep 27, 2022

hw01

Changed the random seed
Selected the features most correlated with the target (see the sketch after this list)
Changed the number of input features
Changed the number of hidden-layer neurons
Switched to a different optimizer
Adjusted the learning rate
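
A minimal sketch of two of these changes, feature selection by correlation and the optimizer swap; the function name, the Adam choice, and all values are illustrative placeholders, not the actual submission code:

import pandas as pd
import torch

def select_top_k_features(train_df: pd.DataFrame, target_col: str, k: int):
    # Rank columns by absolute Pearson correlation with the target and keep the top k.
    corr = train_df.corr()[target_col].abs().drop(target_col)
    return corr.sort_values(ascending=False).head(k).index.tolist()

# Swapping the optimizer and hand-tuning the learning rate (values are illustrative).
model = torch.nn.Linear(16, 1)  # stand-in for the homework model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)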


dolikey commented Oct 20, 2022

hw02

kaggle best private score: 0.76713

● (1%) Simple baseline: 0.45797 (sample code)
● (1%) Medium baseline: 0.69747 (concat n frames, add layers)
● (1%) Strong baseline: 0.75028 (concat n frames, batchnorm, dropout, add layers)
● (1%) Boss baseline: 0.82324 (sequence labeling, using RNN)

Defaults:

hidden_layers = 1
hidden_dim = 256

[005/005] Train Acc: 0.460776 Loss: 1.876422 | Val Acc: 0.457841 loss: 1.889746
saving model with acc 0.458

Experiment 2:
hidden_layers = 6
hidden_dim = 1024

[004/005] Train Acc: 0.494620 Loss: 1.715949 | Val Acc: 0.473995 loss: 1.807472
saving model with acc 0.474

Experiment 3:
hidden_layers = 2
hidden_dim = 1700

[005/005] Train Acc: 0.486311 Loss: 1.749964 | Val Acc: 0.471650 loss: 1.814920
saving model with acc 0.472

Experiment 4:
hidden_layers = 2
hidden_dim = 5000

[003/005] Train Acc: 0.489009 Loss: 1.736357 | Val Acc: 0.470320 loss: 1.820758
saving model with acc 0.470

Conclusion: adding hidden layers improves accuracy; going deeper (Experiment 2) helped more than simply widening two layers (Experiments 3 and 4).
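
A minimal sketch of how hidden_layers and hidden_dim parameterize the classifier being swept here; input_dim and the 41 phoneme classes are assumptions, not values given in this report:

import torch.nn as nn

def make_classifier(input_dim=39, output_dim=41, hidden_layers=1, hidden_dim=256):
    # One input layer, (hidden_layers - 1) additional hidden layers, one output layer.
    layers = [nn.Linear(input_dim, hidden_dim), nn.ReLU()]
    for _ in range(hidden_layers - 1):
        layers += [nn.Linear(hidden_dim, hidden_dim), nn.ReLU()]
    layers.append(nn.Linear(hidden_dim, output_dim))
    return nn.Sequential(*layers)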

Experiment 5:
batch_size = 2048
num_epoch = 30
hidden_layers = 6
hidden_dim = 1024

[022/030] Train Acc: 0.502249 Loss: 1.686878 | Val Acc: 0.472946 loss: 1.809189
saving model with acc 0.473

Conclusion: with more epochs the model might keep converging, but it converges fairly quickly even so, so num_epoch does not need to be very large; better to settle the other hyperparameters first and only then increase the number of training epochs.

Experiment 6:
concat_nframes = 13
num_epoch = 10
hidden_layers = 6
hidden_dim = 1024

[009/010] Train Acc: 0.763975 Loss: 0.731162 | Val Acc: 0.684132 loss: 1.046159
saving model with acc 0.684

Conclusion: increasing concat_nframes improves accuracy.
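
A sketch of what concat_nframes does: each frame is concatenated with its neighbours, so concat_nframes = 13 means the current frame plus 6 frames of context on each side. The 39-dimensional MFCC feature size is an assumption, and this is not necessarily the sample code's implementation:

import torch

def concat_feat(x: torch.Tensor, concat_n: int) -> torch.Tensor:
    # x: (T, feature_dim), e.g. feature_dim = 39 MFCCs; concat_n must be odd.
    assert concat_n % 2 == 1
    half = concat_n // 2
    # Repeat the first/last frame so edge positions still get full context.
    padded = torch.cat([x[:1].repeat(half, 1), x, x[-1:].repeat(half, 1)], dim=0)
    windows = [padded[i : i + x.size(0)] for i in range(concat_n)]
    return torch.cat(windows, dim=1)  # (T, concat_n * feature_dim)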

Experiment 7:
nn.Dropout(0.25)

[010/010] Train Acc: 0.690315 Loss: 0.986674 | Val Acc: 0.702654 loss: 0.946877
saving model with acc 0.703

Experiment 8:
nn.Dropout(0.5)

[010/010] Train Acc: 0.620712 Loss: 1.259031 | Val Acc: 0.666424 loss: 1.080704
saving model with acc 0.666

Presumably training for more epochs would let this continue to converge.

Conclusion: adding dropout improves accuracy, and the effect is quite noticeable (it randomly masks neurons), helping to prevent overfitting. A larger dropout rate is not necessarily better, though.

Experiment 9:
nn.BatchNorm1d(output_dim),

[010/010] Train Acc: 0.702750 Loss: 0.925790 | Val Acc: 0.715762 loss: 0.889721
saving model with acc 0.716

Conclusion: adding batchnorm improves accuracy.
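
A minimal sketch of one hidden block with the pieces added in Experiments 7-9 (dropout and batchnorm); the layer order is an assumption, not necessarily identical to the sample code:

import torch.nn as nn

def hidden_block(in_dim, out_dim, p_drop=0.25):
    # Linear -> BatchNorm1d -> ReLU -> Dropout, with the dropout rates tried above.
    return nn.Sequential(
        nn.Linear(in_dim, out_dim),
        nn.BatchNorm1d(out_dim),
        nn.ReLU(),
        nn.Dropout(p_drop),
    )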

Experiment 10:
concat_nframes = 19
num_epoch = 100
nn.Dropout(0.35)

[096/100] Train Acc: 0.767724 Loss: 0.708376 | Val Acc: 0.765344 loss: 0.743654
saving model with acc 0.765


dolikey commented Oct 21, 2022

hw02
[Kaggle score screenshot: kaggle_duguiwang]


dolikey commented Nov 5, 2022

hw03

Simple : 0.50099
Medium : 0.73207 Training Augmentation + Train Longer
Strong : 0.81872 Training Augmentation + Model Design + Train Looonger (+ Cross Validation + Ensemble)
Boss : 0.88446 Training Augmentation + Model Design + Test Time Augmentation + Train Looonger (+ Cross Validation + Ensemble)

Experiment 1:
Ran the sample code as-is.
[ Valid | 003/003 ] loss = 1.58524, acc = 0.47608 -> best

Experiment 2:
More epochs:
n_epochs = 20
Augmentation (composed as sketched after this list):
transforms.Pad(padding=10, fill=0),
transforms.Grayscale(num_output_channels=3),
transforms.RandomRotation(180),
transforms.RandomResizedCrop((128, 128), scale=(0.5, 1.0)),
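
A sketch of how these transforms would be composed for training; ToTensor() is assumed here, since only the augmentations are listed above:

import torchvision.transforms as transforms

train_tfm = transforms.Compose([
    transforms.Pad(padding=10, fill=0),
    transforms.Grayscale(num_output_channels=3),
    transforms.RandomRotation(180),
    transforms.RandomResizedCrop((128, 128), scale=(0.5, 1.0)),
    transforms.ToTensor(),
])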

[ Train | 020/020 ] loss = 1.26153, acc = 0.56024
[ Valid | 012/020 ] loss = 2.86477, acc = 0.23221 -> best

Conclusion: adding these image transformations slowed convergence, and the model ended up overfitting instead.

Experiment 3:
Removed the Grayscale transform; it is unclear what num_output_channels=3 actually achieves, since grayscale is normally single-channel.
Replaced it with a transform that jitters brightness, contrast, and saturation:
transforms.ColorJitter(brightness=0.6, contrast=0.6, saturation=0.4, hue=0.1)
Re-ran with n_epochs = 20.

[ Valid | 019/020 ] loss = 1.43384, acc = 0.52173 -> best

Conclusion: suitable image augmentations improve recognition accuracy.

Experiment 4:
Added normalization: transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
Switched to the torchvision ResNet-50 model, with pretrained weights enabled for comparison:
models.resnet50(pretrained=True)

train_tfm = tf.Compose([
    tf.PILToTensor(),
    tf.ConvertImageDtype(torch.float),
    tf.RandomHorizontalFlip(),
    tf.Resize((224, 224), interpolation=torchvision.transforms.InterpolationMode.BICUBIC),
    tf.RandomErasing(),
    tf.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)),
])

Conclusion: a model with this many parameters converges even more slowly.

Experiment 5:
Replaced the test-time transform:
test_tfm = tf.Compose([
    tf.PILToTensor(),
    tf.ConvertImageDtype(torch.float),
    tf.Resize((224, 224), interpolation=torchvision.transforms.InterpolationMode.BICUBIC),
    tf.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)),
])

Redesigned the output head and added dropout:
self.fc_add = nn.Sequential(
    nn.Linear(1000, 256),
    nn.ReLU(),
    nn.Dropout(0.4),
    nn.Linear(256, 11),
    nn.LogSoftmax(dim=1)
)
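
A plausible way to wire this head to the torchvision backbone (a sketch, not necessarily the exact submission code): the pretrained resnet50 still emits 1000 logits, and fc_add maps them to the 11 food classes.

import torch.nn as nn
import torchvision.models as models

class FoodClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        # Backbone kept as-is, so its output stays 1000-dimensional.
        self.backbone = models.resnet50(pretrained=True)
        self.fc_add = nn.Sequential(
            nn.Linear(1000, 256),
            nn.ReLU(),
            nn.Dropout(0.4),
            nn.Linear(256, 11),
            nn.LogSoftmax(dim=1),
        )

    def forward(self, x):
        return self.fc_add(self.backbone(x))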

Re-ran with n_epochs = 20.

[ Valid | 001/020 ] loss = 0.72903, acc = 0.76770 -> best
[ Valid | 008/020 ] loss = 0.54365, acc = 0.82701 -> best
Only ran 8 epochs; stopped once the conclusion was clear rather than finishing the run.

Conclusion: it seems the test-time transform also had to be changed to match; in any case this shows how strong pretraining is.

Experiment 6:
Disabled pretraining:
models.resnet50(pretrained=False)
More epochs:
n_epochs = 80
Higher learning rate:
lr=0.001

[ Valid | 032/080 ] loss = 2.28870, acc = 0.19313

Convergence is far too slow.

Experiment 7:
Used the Residual_Network from the sample code.

A bit better than the above, but convergence is still slow.

Experiment 8:
Switched to stochastic gradient descent (SGD)
Raised the initial learning rate and added a cosine-annealing learning-rate schedule (see the sketch after this list)
Still using torchvision's resnet50 model
Normalization
Data augmentation (5 transforms)
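
A minimal sketch of this optimizer/scheduler setup; the lr, momentum, weight_decay, and T_max values are illustrative, since the exact numbers are not given in this report:

import torch
import torchvision.models as models

model = models.resnet50(pretrained=False)
# Higher initial learning rate with SGD, decayed along a cosine curve.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=10)

# After each training epoch:
#     scheduler.step()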

Roughly 5 minutes 30 seconds per epoch.

[ Train | 001/010 ] loss = 2.31989, acc = 0.14399
[ Valid | 001/010 ] loss = 2.29800, acc = 0.13130 -> best
[ Train | 002/010 ] loss = 2.29757, acc = 0.16403
[ Valid | 002/010 ] loss = 2.29432, acc = 0.14655 -> best

Conclusion: convergence is a bit faster, but around 5 minutes per epoch is still too slow.

Experiment 9:
Removed one data augmentation (4 remaining)
Set n_epochs = 3000 to run overnight and see how far it gets

[ Train | 001/3000 ] loss = 2.31403, acc = 0.14575
[ Valid | 001/3000 ] loss = 2.29702, acc = 0.16478 -> best
