A Summary of Matrix Multiplication in PyTorch

torch.dot
torch.mul
torch.mm
torch.bmm
torch.matmul
*, @

torch.dot(): dot product

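Multiplies the corresponding elements of a and b and sums the products. Only 1-D tensors are supported, and both must have the same number of elements.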

In [11]: torch.dot(torch.tensor([2, 3]), torch.tensor([2, 1]))
Out[11]: tensor(7)
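
As a small sketch of the rule above, the snippet below checks that torch.dot matches "multiply element by element, then sum", and that 2-D inputs are rejected. The variable names (manual, dot_2d_failed) are illustrative only.

```python
import torch

a = torch.tensor([2, 3])
b = torch.tensor([2, 1])

# Dot product: element-wise products, then a sum over all of them.
dot = torch.dot(a, b)
manual = (a * b).sum()
print(dot, manual)

# torch.dot only accepts 1-D tensors; 2-D inputs raise a RuntimeError.
try:
    torch.dot(torch.ones(2, 2), torch.ones(2, 2))
    dot_2d_failed = False
except RuntimeError:
    dot_2d_failed = True
print(dot_2d_failed)
```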

torch.mul()

Multiplies a and b element by element. The shapes of a and b need not match, but they must be broadcastable to a common shape.

In [32]: a
Out[32]:
tensor([[1, 2, 3],
        [2, 3, 4]])

In [33]: torch.mul(a, a)
Out[33]:
tensor([[ 1,  4,  9],
        [ 4,  9, 16]])

# broadcast a row vector (shape (3,)) across every row
In [34]: torch.mul(a, a[0])
Out[34]:
tensor([[ 1,  4,  9],
        [ 2,  6, 12]])

# broadcast a column vector (shape (2, 1)) across every column
In [35]: torch.mul(a, a[:, :1])
Out[35]:
tensor([[1, 2, 3],
        [4, 6, 8]])
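
To make the broadcasting requirement concrete, here is a minimal sketch that repeats the two cases above with explicit shapes and adds one shape that cannot be broadcast. The names by_row, by_col and mismatch_failed are illustrative, not part of the original session.

```python
import torch

a = torch.tensor([[1, 2, 3],
                  [2, 3, 4]])   # shape (2, 3)
row = a[0]                      # shape (3,)
col = a[:, :1]                  # shape (2, 1)

# (2, 3) * (3,): the row is reused for every row of a.
by_row = torch.mul(a, row)
# (2, 3) * (2, 1): the column is reused for every column of a.
by_col = torch.mul(a, col)
print(by_row.shape, by_col.shape)

# (2, 3) * (2,): trailing dimensions 3 and 2 do not match,
# so the shapes are not broadcastable and a RuntimeError is raised.
try:
    torch.mul(a, torch.tensor([1, 2]))
    mismatch_failed = False
except RuntimeError:
    mismatch_failed = True
print(mismatch_failed)
```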

torch.mm()

matrix-matrix product

Only computes the matrix product of 2-D tensors, with no broadcasting: an n×m tensor multiplied by an m×p tensor gives an output of shape n×p.

>>> mat1 = torch.randn(2, 3)
>>> mat2 = torch.randn(3, 3)
>>> torch.mm(mat1, mat2)
    tensor([[ 0.4851,  0.5037, -0.3633],
            [-0.0760, -3.6705,  2.4784]])
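
Because the example above uses random values, a deterministic sketch may be easier to check by hand. It compares torch.mm against the definition out[i, j] = sum over k of mat1[i, k] * mat2[k, j], and shows that a 1-D argument is rejected. The fixed input values and the helper names are chosen here for illustration.

```python
import torch

mat1 = torch.tensor([[1., 2., 3.],
                     [4., 5., 6.]])   # shape (2, 3)
mat2 = torch.tensor([[1., 0.],
                     [0., 1.],
                     [1., 1.]])       # shape (3, 2)

out = torch.mm(mat1, mat2)            # shape (2, 2)

# Recompute each entry from the definition of the matrix product.
manual = torch.zeros(2, 2)
for i in range(2):
    for j in range(2):
        manual[i, j] = (mat1[i, :] * mat2[:, j]).sum()
print(out)

# torch.mm requires both arguments to be 2-D; a 1-D vector raises a RuntimeError.
try:
    torch.mm(mat1, torch.ones(3))
    mm_1d_failed = False
except RuntimeError:
    mm_1d_failed = True
print(mm_1d_failed)
```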

torch.bmm()

a and b must both be 3-D tensors with the same first (batch) dimension; the matrices at each batch index are multiplied pairwise.

input and mat2 must be 3-D tensors each containing the same number of matrices.

If input is a (b×n×m) tensor and mat2 is a (b×m×p) tensor, out will be a (b×n×p) tensor.

>>> input = torch.randn(10, 3, 4)
>>> mat2 = torch.randn(10, 4, 5)
>>> res = torch.bmm(input, mat2)
>>> res.size()
torch.Size([10, 3, 5])
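
One way to read "each batch is multiplied pairwise" is that torch.bmm behaves like calling torch.mm on every pair of matrices and stacking the results. The sketch below checks this; the name inp (used to avoid shadowing the built-in input) and the fixed seed are choices made for this example.

```python
import torch

torch.manual_seed(0)
inp = torch.randn(10, 3, 4)
mat2 = torch.randn(10, 4, 5)

res = torch.bmm(inp, mat2)            # shape (10, 3, 5)

# Multiply the i-th matrix of inp with the i-th matrix of mat2, then stack.
per_batch = torch.stack(
    [torch.mm(inp[i], mat2[i]) for i in range(inp.size(0))]
)
same = torch.allclose(res, per_batch, atol=1e-6)
print(res.shape, same)
```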

torch.matmul()

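Matrix product of two tensors a and b. This is the most flexible of these functions; its behavior depends on the dimensionality of the inputs: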

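If a and b are both 1-D, the dot product is returned, as with torch.dot.
If a is 2-D and b is 1-D, the matrix-vector product is returned.
If a is 1-D and b is 2-D, a is first expanded with unsqueeze(dim=0), then multiplied with b as a matrix; the added dimension is removed from the result.
If a and b are both 2-D, the matrix-matrix product is returned, as with torch.mm.
If both arguments are at least 1-dimensional and at least one argument is N-dimensional (where N > 2), a batched matrix multiply is returned.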
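
The 1-D cases in the list above can be sketched as follows. The example checks that a 1-D × 2-D product equals the explicit unsqueeze, mm, squeeze sequence, and that a 1-D × 1-D product is a 0-dimensional tensor. Input values are picked here so the result can be verified by hand.

```python
import torch

v = torch.tensor([1., 2.])            # shape (2,)
m = torch.tensor([[1., 2., 3.],
                  [4., 5., 6.]])      # shape (2, 3)

# 1-D x 2-D: v is treated as a (1, 2) matrix, and the extra
# leading dimension is dropped from the result.
out = torch.matmul(v, m)              # shape (3,)
manual = torch.mm(v.unsqueeze(0), m).squeeze(0)
print(out)

# 1-D x 1-D: a dot product, returned as a 0-dimensional tensor.
vv = torch.matmul(v, v)
print(vv, vv.dim())
```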

import torch
# vector x vector
tensor1 = torch.randn(3)
tensor2 = torch.randn(3)
torch.matmul(tensor1, tensor2).size()  # torch.Size([])
# matrix x vector
tensor1 = torch.randn(3, 4)
tensor2 = torch.randn(4)
torch.matmul(tensor1, tensor2).size()  # torch.Size([3])
# batched matrix x broadcasted vector
tensor1 = torch.randn(10, 3, 4)
tensor2 = torch.randn(4)
torch.matmul(tensor1, tensor2).size()  # torch.Size([10, 3])
# batched matrix x batched matrix
tensor1 = torch.randn(10, 3, 4)
tensor2 = torch.randn(10, 4, 5)
torch.matmul(tensor1, tensor2).size()  # torch.Size([10, 3, 5])
# batched matrix x broadcasted matrix
tensor1 = torch.randn(10, 3, 4)
tensor2 = torch.randn(4, 5)
torch.matmul(tensor1, tensor2).size()  # torch.Size([10, 3, 5])
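
The list at the top also names the * and @ operators. For tensors, * performs element-wise multiplication like torch.mul, and @ performs matrix multiplication like torch.matmul. A minimal sketch with hand-checkable values (the variable names are illustrative):

```python
import torch

a = torch.tensor([[1., 2.],
                  [3., 4.]])
b = torch.tensor([[0., 1.],
                  [1., 0.]])

# * multiplies element by element, like torch.mul.
elementwise = a * b
# @ is the matrix product, like torch.matmul.
matrix = a @ b
print(elementwise)
print(matrix)
```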

Appendix:

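Scalar: a single number
Vector: a 1-D array of numbers
Matrix: a 2-D array of numbers
Tensor: an array with more than two dimensions (in PyTorch, arrays of any dimensionality are torch.Tensor objects)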
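
In PyTorch all four of these are represented by torch.Tensor and differ only in their number of dimensions, which dim() reports. A short sketch, with example values chosen here:

```python
import torch

scalar = torch.tensor(3.0)              # 0 dimensions
vector = torch.tensor([1., 2., 3.])     # 1 dimension
matrix = torch.ones(2, 3)               # 2 dimensions
tensor3d = torch.ones(2, 3, 4)          # 3 dimensions

dims = [t.dim() for t in (scalar, vector, matrix, tensor3d)]
print(dims)
```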

