Notes on how to freeze the weights of part of a model's parameters in PyTorch, so that only part of the model is trained.
For a parameter to be updated, two conditions must hold:

- the parameter records gradients (i.e. `requires_grad=True`);
- an optimizer is set up to modify it (i.e. it is registered with the corresponding `Optimizer` object).

Both conditions must be satisfied at the same time; the sketch after this list illustrates each of them in isolation.
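To make the two conditions concrete, here is a minimal sketch (not from the original post; the toy `nn.Linear(4, 4)` layer and the variable names are only for illustration):

```python
import torch
from torch import nn

layer = nn.Linear(4, 4)

# Condition 1: with requires_grad=False, no gradient is recorded for the weight.
layer.weight.requires_grad = False
layer(torch.randn(2, 4)).sum().backward()
print(layer.weight.grad)  # None: the frozen weight recorded no gradient
print(layer.bias.grad)    # a tensor: the bias still records gradients

# Condition 2: only parameters registered with an optimizer are actually modified.
optim = torch.optim.SGD([layer.bias], lr=0.1)  # the weight is not registered
optim.step()                                   # only the bias is updated
```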
Freezing: after the model is initialized, simply set `requires_grad=False` on the parameters you want to freeze, i.e. stop recording gradient information for them.
```python
from collections.abc import Iterable


def set_freeze_by_names(model, layer_names, freeze=True):
    # Wrap a single name (including a bare string) into a list.
    if isinstance(layer_names, str) or not isinstance(layer_names, Iterable):
        layer_names = [layer_names]
    for name, child in model.named_children():
        if name not in layer_names:
            continue
        # Freeze (or unfreeze) every parameter of the matched child module.
        for param in child.parameters():
            param.requires_grad = not freeze
```
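If freezing and unfreezing by name happens often, thin wrappers around the helper keep call sites readable (a small convenience sketch, not part of the original snippet; the names `freeze_by_names` and `unfreeze_by_names` are ours):

```python
from functools import partial

# Hypothetical convenience aliases built on set_freeze_by_names.
freeze_by_names = partial(set_freeze_by_names, freeze=True)
unfreeze_by_names = partial(set_freeze_by_names, freeze=False)
```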
A quick experiment shows the effect:
```python
import torch
from torch import nn

model = nn.Sequential(
    nn.Linear(32, 64),
    nn.Linear(64, 128),
    nn.Linear(128, 256)
)
print(list(model.named_children()))  # children are named '0', '1', '2'

set_freeze_by_names(model, ('0', '1'), freeze=True)
print(model[0].weight.requires_grad)  # False
print(model[1].weight.requires_grad)  # False
print(model[2].weight.requires_grad)  # True

# model.requires_grad_(False)  # shortcut: would freeze every parameter of the model at once
```
Then, when building the optimizer before training, bind only the parameters that still record gradients:
```python
optim = torch.optim.Adam([p for p in model.parameters() if p.requires_grad], lr=0.0001)
```
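To double-check the setup, a single dummy training step (a hedged sketch, not from the original post; the batch shape and variable names are ours) should leave the frozen layers untouched while the last layer receives gradients:

```python
x = torch.randn(8, 32)            # arbitrary dummy batch
before = model[0].weight.clone()  # snapshot of a frozen layer

loss = model(x).sum()
loss.backward()
optim.step()

print(torch.equal(before, model[0].weight))  # True: the frozen layer did not change
print(model[2].weight.grad is not None)      # True: the trainable layer got a gradient
```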
Unfreezing: unfreezing parameters takes two steps:

- allow the parameters to record gradients again, i.e. `requires_grad=True`;
- add the unfrozen parameters to the optimizer so that they can be updated.
```python
set_freeze_by_names(model, ('0',), freeze=False)
print(model[0].weight.requires_grad)  # True
print(model[1].weight.requires_grad)  # False
print(model[2].weight.requires_grad)  # True

# model.requires_grad_(True)  # shortcut: would unfreeze every parameter of the model at once
```
Then add the newly unfrozen parameters to the optimizer:
```python
optim.add_param_group({'params': model[0].parameters()})
```
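As a sanity check (again a hedged sketch with an arbitrary dummy batch, not from the original post), the next optimizer step should now also move layer '0'; alternatively, instead of calling `add_param_group`, one could simply rebuild the optimizer from the filtered parameter list as above:

```python
optim.zero_grad()
before = model[0].weight.clone()

loss = model(torch.randn(8, 32)).sum()
loss.backward()
optim.step()

print(torch.equal(before, model[0].weight))  # False: layer '0' is being trained again
```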