The notes on this page are fragmentary, immature thoughts of the author. Please read with your own judgement!
A typical PyTorch training step with gradient clipping:

```python
import torch
import torch.nn.functional as F

optimizer.zero_grad()
output = model(data)
loss = F.nll_loss(output, target)
loss.backward()
# Rescale gradients in place so their global norm does not exceed args.clip.
torch.nn.utils.clip_grad_norm_(model.parameters(), args.clip)
optimizer.step()
```
- Use `torch.nn.utils.clip_grad_norm_` (which is in-place) instead …
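
A related detail worth noting: `clip_grad_norm_` returns the total gradient norm computed *before* clipping, which is handy for monitoring training. Below is a minimal self-contained sketch; the toy `torch.nn.Linear` model and random data are made up for illustration:

```python
import torch
import torch.nn.functional as F

# Hypothetical toy setup, just to have gradients to clip.
model = torch.nn.Linear(10, 1)
x, y = torch.randn(4, 10), torch.randn(4, 1)

loss = F.mse_loss(model(x), y)
loss.backward()

# clip_grad_norm_ modifies .grad in place and returns the pre-clipping norm.
total_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
print(f"grad norm before clipping: {total_norm.item():.4f}")
```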