Description
Hey, I tried training the AlexNet model in network.py with a learning rate of 0.01, the SGD optimizer, and cross-entropy (CE) loss. However, the training does not seem to converge: after 10 epochs, the loss still remains at about 2.30.
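For reference, a loss of about 2.30 matches what cross-entropy gives for a uniform prediction over 10 classes (ln 10 ≈ 2.3026), so the network seems to stay at chance level. Note that the 10-class setting (e.g. CIFAR-10) is my assumption here, and `cross_entropy` is just a small helper written for illustration, not code from the repository. A minimal sketch of the check in plain Python:

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy of a single sample computed from raw logits."""
    # Subtract the max logit before exponentiating for numerical stability.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


num_classes = 10  # assumption: a 10-class dataset such as CIFAR-10
uniform_logits = [0.0] * num_classes  # every class gets the same score

# With identical logits the loss is ln(num_classes) for any target label.
chance_loss = cross_entropy(uniform_logits, target=3)
print(f"{chance_loss:.4f}")  # 2.3026
```

If the loss stays at this value, the outputs are effectively independent of the input, which usually points at the learning rate, the initialization, or the architecture rather than at too few epochs.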
BTW, I noticed that, according to the caption of Figure 4 in the CryptGPU paper, the training of AlexNet starts from pre-trained parameters and the model architecture is the same as the default PyTorch AlexNet model (https://pytorch.org/hub/pytorch_vision_alexnet/). However, comparing the code in network.py with (https://pytorch.org/hub/pytorch_vision_alexnet/), these two models are quite different.
Below is the model in CryptGPU: https://github.com/jeffreysijuntan/CryptGPU/blob/master/scripts/network.py#L32-L83
I wonder whether there are some steps that I missed in carrying out the training (both plaintext and private)?