
val_loss #4

Open · 15793723081 opened this issue Nov 27, 2019 · 3 comments

@15793723081

I have a new question. In my training results, val_loss is very unstable, while train_loss looks fine. What could be the reason for this?

val_loss

0.121626744
0.064643869
1.067177767
1.446711094
1.281244414
0.085074004
0.033228467
0.07067455
0.057629818
0.058747965
1.917625544
0.041074573
0.15646584
1.15546266
1.469133962
0.040493928
1.874986998
1.004034303
0.048311408
0.100535154
1.64835466
0.054501295
0.055335191
0.10456412
0.276266118
0.068388779
0.715268846
0.098956134
2.583597354
0.538678182
1.748115192
0.060186028
1.108167116
0.309113808
0.101157197
0.124394018
0.066762745
0.055764281
0.460467963
2.378881313
0.066589679
0.471164338
0.078409711
0.065184287
0.086847927
0.101323177
0.069627192
1.949108887
0.56550861
1.750528685
1.873376575
1.489622217
0.994260068
0.066217561
0.065698464
0.072816557
0.073953711
0.071316366
0.421468448
0.85330223
0.076412463
1.056728009
0.092980409
0.074839294
0.63967285
1.866399809
1.322311701
0.07251732
0.076650151
0.075991224
0.592627899
0.178139341
0.874911083
0.086961034
2.873830932
0.082318357
0.086678351
0.09311628
0.110618592
0.797192226
0.08203448
0.144572049
0.083447114
0.103286757
0.089605071
0.136912736
0.149895869
0.079707029
0.091769812
0.987303428
0.093305247
0.089239203
1.865140941
1.418952992
0.174026518
0.094368358
0.095077354
0.629811309
0.090620203
1.349854706

train_loss

0.230814192
0.028987918
0.024291774
0.017069094
0.014445363
0.013398573
0.017434838
0.010788895
0.010943851
0.010186778
0.008584953
0.01008794
0.008166692
0.00812426
0.007191894
0.006891105
0.007163724
0.006250243
0.007437692
0.005759141
0.005480012
0.007539251
0.005302504
0.005195955
0.005868624
0.004928016
0.004689556
0.00476825
0.004551805
0.004410871
0.004312812
0.005318373
0.003946777
0.00389254
0.004088193
0.004916692
0.003619704
0.003698593
0.003596091
0.003551835
0.003599238
0.003506489
0.003349912
0.004076907
0.003196912
0.003090051
0.003097881
0.003223821
0.003074504
0.003073007
0.002930183
0.002931052
0.002878063
0.002818113
0.004682063
0.002902587
0.00261943
0.002597745
0.002628764
0.002631262
0.003391225
0.002469422
0.002454039
0.002499085
0.002512494
0.00248172
0.002464812
0.002398441
0.002394155
0.002346173
0.002301199
0.002282861
0.002268845
0.002215712
0.002189308
0.002250406
0.002511505
0.002021703
0.00206177
0.002084404
0.002078039
0.00203906
0.002020018
0.001995647
0.001981924
0.001943689
0.001929661
0.001958759
0.001993959
0.001822729
0.001834546
0.001916094
0.001810232
0.001773823
0.001764436
0.001750559
0.001739222
0.002303985
0.001640475
0.001577288

@cosmic-cortex (Owner)

Unfortunately, I cannot tell you the exact reason because I don't have enough information, but it seems like your model is overfitting. Do you use the same augmentation during validation as you use during training?
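
For reference, a minimal sketch of keeping the validation transform deterministic, assuming the JointTransform2D transform from this repo (the import path, crop size, and jitter values below are assumptions):

from unet.dataset import JointTransform2D  # assumed import path

crop = (256, 256)  # assumed crop size

# training: random flips plus color jitter
tf_train = JointTransform2D(crop=crop, p_flip=0.5,
                            color_jitter_params=(0.1, 0.1, 0.1, 0.1),
                            long_mask=True)

# validation: no random flips, no jitter, so val_loss stays
# comparable across epochs
tf_val = JointTransform2D(crop=crop, p_flip=0.0,
                          color_jitter_params=None, long_mask=True)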

@alejandrodelac commented Feb 7, 2021

Question to @cosmic-cortex

Dear cosmic-cortex,

First of all, many thanks for sharing this implementation of the U-Net. It is clearly a great deal of work.

Similar to 15793723081, I am also having problems with overfitting. The network does not seem to generalize well.

I downloaded the Kaggle Data Science Bowl 2018 nuclei detection challenge dataset and used your script kaggle_dsb18_preprocessing.py to prepare the images and the masks. From the 670 files of the dataset, I used 450 images (with their respective masks) to train the network and 220 images as the validation set. Please note that I did not separate the fluorescence, brightfield, and histopathological images; I used them all together for training.
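
For context, a hedged sketch of such a random split (the folder layout and file naming here are assumptions about the output of kaggle_dsb18_preprocessing.py, not its documented behavior):

import os
import random
import shutil

random.seed(42)  # reproducible split
src = 'dsb18_preprocessed'  # assumed output folder of the preprocessing script
ids = sorted(os.listdir(os.path.join(src, 'images')))
random.shuffle(ids)

train_ids, val_ids = ids[:450], ids[450:]  # 450 train / 220 validation

for split, split_ids in (('train', train_ids), ('val', val_ids)):
    for kind in ('images', 'masks'):
        os.makedirs(os.path.join(split, kind), exist_ok=True)
        for name in split_ids:  # assumes masks share filenames with images
            shutil.copy(os.path.join(src, kind, name),
                        os.path.join(split, kind, name))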

For the network I used depth=6 and width=32 and trained for 100 epochs. I think I am using augmentation, as I kept these two lines:

tf_train = JointTransform2D(crop=crop, p_flip=0.5, color_jitter_params=None, long_mask=True)  # training transform
tf_val = JointTransform2D(crop=crop, p_flip=0.5, color_jitter_params=None, long_mask=True)    # validation transform: p_flip=0.5 randomly flips validation samples too

Do you have any suggestions/comments on what to do?

Many thanks in advance.

@cosmic-cortex (Owner)

@alejandrodelac

Hi! I am going to be honest, the last time I trained a U-Net on this dataset was about three years ago, so I don't exactly remember what steps I took to make it generalize well.

If I remember correctly, these were the augmentation steps I used (sketched below):

  • splitting up the images into 512 x 512 patches in advance
  • randomly cropping 256 x 256 patches
  • color_jitter_params=(0.1, 0.1, 0.1, 0.1)
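
A small sketch of the first step, splitting large images into tiles offline (the tile size comes from the list above; the helper itself is an assumption, not code from this repo):

import numpy as np

def tile_image(img: np.ndarray, size: int = 512) -> list:
    """Split an H x W (x C) image into non-overlapping size x size tiles."""
    h, w = img.shape[:2]
    return [img[i:i + size, j:j + size]
            for i in range(0, h - size + 1, size)
            for j in range(0, w - size + 1, size)]

The random 256 x 256 cropping and the color jitter then map directly onto the JointTransform2D arguments shown earlier in this thread.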

Unfortunately, I don't remember the number of epochs at all. However, I used a learning rate scheduler and reduced the LR when the validation loss was plateauing.
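
A minimal sketch of such a scheduler, using PyTorch's built-in ReduceLROnPlateau (the optimizer, factor, and patience values are assumptions, not the author's original settings):

import torch
from torch.optim.lr_scheduler import ReduceLROnPlateau

model = torch.nn.Conv2d(3, 1, kernel_size=3, padding=1)  # stand-in for the U-Net
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = ReduceLROnPlateau(optimizer, mode='min', factor=0.5, patience=5)

for epoch in range(100):
    train_one_epoch(model, optimizer)  # hypothetical training helper
    val_loss = validate(model)         # hypothetical validation helper
    scheduler.step(val_loss)           # reduce LR once val_loss stops improving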

Hope that helps!
