pytorch损失函数binary

binary_cross_entropy和binary_cross_entropy_with_logits都是来自torch.nn.functional的函数，首先对比官方文档对它们的区别：

区别只在于这个logits，那么这个logits是什么意思呢？以下是从网络上找到的一个答案：

有一个（类）损失函数名字中带了with_logits. 而这里的logits指的是,该损失函数已经内部自带了计算logit的操作，无需在传入给这个loss函数之前手动使用sigmoid/softmax将之前网络的输入映射到[0,1]之间

再看看官方给的示例代码：
binary_cross_entropy：

input = torch.randn((3, 2), requires_grad=True)
target = torch.rand((3, 2), requires_grad=False)
loss = F.binary_cross_entropy(F.sigmoid(input), target)
loss.backward()
# input is  tensor([[-0.5474,  0.2197],
#         [-0.1033, -1.3856],
#         [-0.2582, -0.1918]], requires_grad=True)
# target is  tensor([[0.7867, 0.5643],
#         [0.2240, 0.8263],
#         [0.3244, 0.2778]])
# loss is  tensor(0.8196, grad_fn=)

binary_cross_entropy_with_logits：

input = torch.randn(3, requires_grad=True)
target = torch.empty(3).random_(2)
loss = F.binary_cross_entropy_with_logits(input, target)
loss.backward()
# input is  tensor([ 1.3210, -0.0636,  0.8165], requires_grad=True)
# target is  tensor([0., 1., 1.])
# loss is  tensor(0.8830, grad_fn=)

的确binary_cross_entropy_with_logits不需要sigmoid函数了。

事实上，官方是推荐使用函数带有with_logits的，解释是
This loss combines a Sigmoid layer and the BCELoss in one single class. This version is more numerically stable than using a plain Sigmoid followed by a BCELoss as, by combining the operations into one layer, we take advantage of the log-sum-exp trick for numerical stability.

翻译一下就是说将sigmoid层和binaray_cross_entropy合在一起计算比分开依次计算有更好的数值稳定性，这主要是运用了log-sum-exp技巧。

那么这个log-sum-exp主要就是讲如何防止数值计算溢出的问题：

pytorch损失函数binary

Python相关栏目本月热门文章