fine-tuning paradigms #4
shahpnmlab
started this conversation in Ideas
- Progressive LRs (discriminative learning rates) alone don't produce good training results. To improve outcomes, train the network on as much data as possible first and then perform fine-tuning; this type of transfer learning requires a lot of data to bridge domain gaps. Try lowering the LR layer-by-layer by a factor of 2.6, i.e. lr(n-1) = lr(n)/2.6 (https://paperswithcode.com/method/discriminative-fine-tuning). Also: is it worth re-initializing the last layer with random weights before fine-tuning?
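The lr(n-1) = lr(n)/2.6 rule above can be sketched in plain Python. This is a minimal illustration, not from the linked page: the function name `discriminative_lrs` and the four-group example are my own, and the head group is assumed to get the base LR while each earlier group is divided by 2.6 once more.

```python
def discriminative_lrs(base_lr, n_groups, factor=2.6):
    """Per-group learning rates, earliest layer group first, head last.

    The head (last group) gets base_lr; each group before it is
    smaller by one more factor of 2.6, following lr(n-1) = lr(n)/2.6.
    """
    return [base_lr / factor ** (n_groups - 1 - i) for i in range(n_groups)]

# Example: 4 layer groups with a base LR of 1e-3 for the head.
lrs = discriminative_lrs(base_lr=1e-3, n_groups=4)
```

In a framework like PyTorch, these values would typically be passed as per-parameter-group `lr` entries to the optimizer, with the earliest (most general) layers receiving the smallest rates.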
NOTE: transfer learning via knowledge distillation helps with bridging domain gaps. The way to train such a model effectively is to have a big teacher model (more parameters) and a smaller student model that is then used for the domain-specific task (https://intellabs.github.io/distiller/knowledge_distillation.html). The TODO here is for me to train a much larger model on as much data as I can get my hands on, then fine-tune a smaller student model against it and measure the performance.
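The core of the teacher/student setup is the soft-label loss: the student is trained to match the teacher's temperature-softened output distribution. A minimal sketch of that loss in plain Python (the function names and the temperature T=4.0 default are illustrative assumptions, not values from the Distiller docs):

```python
import math

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; higher T gives a softer distribution."""
    exps = [math.exp(x / T) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, T=4.0):
    """KL divergence from the student to the teacher distribution,
    both softened with temperature T, scaled by T^2 so gradients stay
    comparable across temperatures (as in standard distillation setups)."""
    p = softmax(teacher_logits, T)  # soft teacher targets
    q = softmax(student_logits, T)  # soft student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return T * T * kl
```

In practice this soft loss is usually combined with the ordinary cross-entropy on the hard labels via a weighting coefficient, so the student learns from both the teacher and the ground truth.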