Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!
import torch
import torch.nn.functional as F

optimizer.zero_grad()
output = model(data)
loss = F.nll_loss(output, target)
loss.backward()
# Clip the gradient norm in place before the optimizer step
torch.nn.utils.clip_grad_norm_(model.parameters(), args.clip)
optimizer.step()
- Use `torch.nn.utils.clip_grad_norm_` (which modifies the gradients in place, as the trailing underscore in PyTorch signifies) instead of `torch.nn.utils.clip_grad_norm`, which has been deprecated. See the sketch below.
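As a minimal runnable sketch (the two-layer model, the random data, and `max_norm=1.0` below are placeholder assumptions, not part of the original note), one training step with gradient-norm clipping looks like this:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Placeholder model and data, only to make the example self-contained.
model = nn.Sequential(
    nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 3), nn.LogSoftmax(dim=1)
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
data = torch.randn(16, 10)           # 16 samples, 10 features
target = torch.randint(0, 3, (16,))  # class indices for nll_loss

optimizer.zero_grad()
output = model(data)
loss = F.nll_loss(output, target)
loss.backward()

# Rescale gradients in place so that their total norm is at most max_norm;
# the call returns the total norm computed before clipping.
total_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)

optimizer.step()
print(f"loss={loss.item():.4f}, grad norm before clipping={float(total_norm):.4f}")
```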
References
https://discuss.pytorch.org/t/proper-way-to-do-gradient-clipping/191