
Distillation/Pruning of BERT LM for faster response times #65

Open
vdpappu opened this issue Jun 21, 2019 · 2 comments
Labels: enhancement (New feature or request)

Comments

@vdpappu (Contributor) commented Jun 21, 2019

No description provided.

vdpappu added the enhancement label Jun 21, 2019
@vdpappu (Contributor, Author) commented Jun 24, 2019

@shivamakhauri04 please post the current updates

@shivamakhauri04 commented:
I have written about 60-70 percent of the training pipeline. Since there are still some uncertainties around the network architecture, I am going forward with a BERT classifier as the parent (teacher) network, as the limited literature available on model distillation suggests the same. Hence, I am taking the parent network to be the NSP context-window classifier I developed earlier (and will see if I can do some optimization there too along the way). Once the pipeline is fully set, I will train on a general dataset to validate its sanity.
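For reference, here is a minimal sketch of the kind of distillation step described above, assuming a standard PyTorch setup where the frozen NSP classifier acts as the teacher and a smaller network is the student. The `temperature` and `alpha` hyperparameters, the `distillation_loss`/`train_step` names, and the assumption that both models take `input_ids` and return raw logits are all illustrative, not details of this repo's pipeline:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Blend a soft-target KL term against the teacher with the usual
    hard-label cross-entropy. Values of `temperature` and `alpha` here
    are placeholder assumptions, not tuned values from this project."""
    # Soften both distributions; the KL term is scaled by T^2 (as in
    # Hinton et al., 2015) to keep its gradient magnitude comparable
    # to the cross-entropy term.
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    kd = F.kl_div(soft_student, soft_teacher,
                  reduction="batchmean") * temperature ** 2
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce

def train_step(student, teacher, batch, optimizer):
    # Hypothetical training step: `teacher` is the frozen BERT NSP
    # classifier, `student` is the smaller network being distilled.
    input_ids, labels = batch
    with torch.no_grad():              # teacher stays frozen
        teacher_logits = teacher(input_ids)
    student_logits = student(input_ids)
    loss = distillation_loss(student_logits, teacher_logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Running this loop on a general dataset first, as proposed above, is a reasonable way to sanity-check the pipeline before committing to a final student architecture.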
