The error message is shown below:
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [4, 1, 1, 155]], which is output 0 of UnsqueezeBackward0, is at version 1066; expected version 1065 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
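Following the hint at the end of the message, anomaly detection can be turned on before the failing run; a minimal sketch:

```python
import torch

# Enable anomaly detection (debug only: it slows autograd down noticeably).
# With this on, the RuntimeError above is accompanied by a traceback of the
# forward operation that produced the tensor later modified in place, which
# makes the offending line much easier to find.
torch.autograd.set_detect_anomaly(True)
```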
This error is usually caused by one of the following patterns in the code:
- self-assignment such as `a = a`
- in-place arithmetic such as `a += b`
- in-place indexed assignment inside a loop, e.g. `a[i,:,:] = ...a[i,:,:]`
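A minimal reproduction of the error, using hypothetical tensors rather than the original model: `b * b` saves `b` for backward at version 0, the in-place `b += 1` bumps the version counter, and `backward()` detects the mismatch.

```python
import torch

a = torch.randn(4, requires_grad=True)
b = a.clone().unsqueeze(0)   # non-leaf, analogous to "output 0 of UnsqueezeBackward0"
c = (b * b).sum()            # gradient of b * b needs the saved value of b
b += 1                       # in-place write after b was saved for backward

caught = ""
try:
    c.backward()
except RuntimeError as e:
    caught = str(e)
print(caught)  # mentions the version mismatch caused by the inplace op
```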
In my case, the error was triggered by this line in my code:
attention_sum += attention
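A minimal sketch of the fix, with hypothetical shapes and names: replace the in-place `attention_sum += attention` with an out-of-place add, which creates a new tensor each step instead of mutating one that autograd may have saved for backward.

```python
import torch

scores = torch.randn(4, 5, requires_grad=True)

attention_sum = torch.zeros(5)
outputs = []
for t in range(scores.shape[0]):
    attention = torch.softmax(scores[t], dim=0)
    attention_sum = attention_sum + attention   # out-of-place, autograd-safe
    outputs.append(attention * attention_sum)   # the running sum is reused later

loss = torch.stack(outputs).sum()
loss.backward()                                 # no inplace RuntimeError
print(scores.grad.shape)
```

The out-of-place form costs one extra allocation per step, but leaves every saved tensor untouched, so the backward pass sees the versions it expects.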
Here is a good summary on Zhihu: 关于 pytorch inplace operation, 需要知道的几件事 (a few things to know about PyTorch inplace operations).
