PyTorch study notes
- unsqueeze
unsqueeze inserts a dimension of size 1 at the given position.
a1 = torch.arange(0, 10)
print(a1)
a2 = torch.arange(0, 10).unsqueeze(0)
print(a2)
a3 = torch.arange(0, 10).unsqueeze(1)
print(a3)
print(a1.size(), a2.size(), a3.size())
#tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
#tensor([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])
#tensor([[0],
#        [1],
#        [2],
#        [3],
#        [4],
#        [5],
#        [6],
#        [7],
#        [8],
#        [9]])
#torch.Size([10]) torch.Size([1, 10]) torch.Size([10, 1])
2. repeat tiles a tensor along the specified dimensions
repeat takes one repeat count per dimension: t.repeat(n0, n1, ...) tiles dim 0 n0 times, dim 1 n1 times, and so on.
a=torch.arange(0,10).unsqueeze(0)
print(a)
a1=torch.arange(0,10).unsqueeze(0).repeat(10,1)
print(a1)
#tensor([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])
#tensor([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])
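As a sketch of the rule above (one tile count per existing dimension), using my own toy tensors:

```python
import torch

# repeat takes one tile count per dimension of the tensor
a = torch.arange(3)           # shape (3,)
print(a.repeat(2).shape)      # torch.Size([6])
b = a.unsqueeze(0)            # shape (1, 3)
print(b.repeat(2, 3).shape)   # torch.Size([2, 9]): dim 0 tiled 2x, dim 1 tiled 3x
```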
- nn.ModuleList: unlike nn.Sequential, a ModuleList has no built-in forward, so you write the forward pass yourself
self.layers = nn.ModuleList(
    [encoder_layer(hid_dim, n_heads, pf_dim, self_attention, positionwise_feedforward, dropout, device)
     for _ in range(n_layers)]
)
for layer in self.layers:
    src = layer(src, src_mask)
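A minimal self-contained sketch of the same pattern, using a hypothetical stack of Linear layers in place of the encoder_layer above:

```python
import torch
import torch.nn as nn

# nn.ModuleList registers the submodules (so their parameters are tracked)
# but has no forward of its own; we loop over the layers manually.
class StackedMLP(nn.Module):
    def __init__(self, dim, n_layers):
        super().__init__()
        self.layers = nn.ModuleList(
            [nn.Linear(dim, dim) for _ in range(n_layers)]
        )

    def forward(self, x):
        for layer in self.layers:
            x = torch.relu(layer(x))
        return x

model = StackedMLP(8, 3)
out = model(torch.randn(4, 8))
print(out.shape)  # torch.Size([4, 8])
```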
- embedding layer
self.tok_embedding = nn.Embedding(input_dim, hid_dim)  # input_dim: vocabulary size; hid_dim: embedding dimension
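A runnable sketch with made-up sizes (a vocabulary of 100 tokens, 16-dim embeddings): an Embedding is a lookup table that maps integer token ids to vectors.

```python
import torch
import torch.nn as nn

emb = nn.Embedding(100, 16)          # vocab of 100 tokens, 16-dim embeddings
tokens = torch.tensor([[1, 5, 7],
                       [2, 4, 9]])   # (batch=2, seq_len=3) of token ids
vecs = emb(tokens)
print(vecs.shape)  # torch.Size([2, 3, 16])
```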
- dropout layer
self.dropout = dropout
self.do = nn.Dropout(dropout)
src = self.do((self.tok_embedding(src) * self.scale) + self.pos_embedding(pos))
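A quick sketch of nn.Dropout's behavior (p = 0.5 chosen arbitrarily): in training mode surviving entries are rescaled by 1/(1-p); in eval mode dropout is the identity.

```python
import torch
import torch.nn as nn

do = nn.Dropout(p=0.5)
x = torch.ones(3, 4)
do.train()
y_train = do(x)   # each entry is either 0.0 (dropped) or 2.0 (kept, rescaled by 1/(1-p))
do.eval()
y_eval = do(x)    # identical to x
print(torch.equal(y_eval, x))  # True
```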
6. LayerNorm
If LayerNorm is given an integer argument, it normalizes over the last dimension, and the integer must equal that dimension's size.
If the argument is [3, 4], it normalizes over the last two dimensions, whose shape must be [3, 4]; all 12 numbers are normalized together.
The normalization formula is y = (x - mean) / sqrt(var + eps), where eps is a small constant that keeps the denominator from blowing up when the variance is close to 0.
LayerNorm accepts eps as an optional parameter.
t=torch.FloatTensor([[1,2,3,4],[2,6,4,5],[2,5,9,7]])
norm=nn.LayerNorm(4)
t=norm(t)
print(t)
#tensor([[-1.3416, -0.4472, 0.4472, 1.3416],
# [-1.5213, 1.1832, -0.1690, 0.5071],
#        [-1.4501, -0.2900,  1.2568,  0.4834]])
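The formula can be checked by hand against nn.LayerNorm. Note that PyTorch uses the biased variance (unbiased=False) and defaults to eps = 1e-5:

```python
import torch
import torch.nn as nn

t = torch.FloatTensor([[1, 2, 3, 4], [2, 6, 4, 5], [2, 5, 9, 7]])
mean = t.mean(dim=-1, keepdim=True)
var = t.var(dim=-1, unbiased=False, keepdim=True)  # biased variance, as LayerNorm uses
manual = (t - mean) / torch.sqrt(var + 1e-5)
print(torch.allclose(manual, nn.LayerNorm(4)(t), atol=1e-4))  # True
```

This matches because a freshly constructed LayerNorm has its learnable scale initialized to 1 and bias to 0.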
7. Linear layer
li = nn.Linear(input_dim, output_dim)  # maps an input tensor [batch_size, in_features] to [batch_size, out_features]
Linear is applied along the last dimension; here it expands that dimension from 1 to 2.
w_k=nn.Linear(1,2)
w=torch.FloatTensor(2,2,1)  # allocates uninitialized memory, hence the garbage values below
print(w)
w=w_k(w)
print(w)
tensor([[[ 0.0000e+00],
[-1.5846e+29]],
[[ 0.0000e+00],
[-1.5846e+29]]])
tensor([[[-4.1201e-01, 4.7898e-01],
[ 1.0952e+29, -8.0230e+28]],
[[-4.1201e-01, 4.7898e-01],
          [ 1.0952e+29, -8.0230e+28]]], grad_fn=...)
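A sketch confirming that Linear only touches the last dimension, with arbitrary sizes; all leading dimensions pass through unchanged:

```python
import torch
import torch.nn as nn

lin = nn.Linear(5, 7)
x = torch.randn(2, 3, 5)
y = lin(x)             # applied independently at every (2, 3) position
print(y.shape)         # torch.Size([2, 3, 7])
```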
8.permute
>>> x = torch.randn(2, 3, 5)
>>> x.size()
torch.Size([2, 3, 5])
>>> x.permute(2, 0, 1).size()
torch.Size([5, 2, 3])
9. Matrix multiplication
1) Element-wise product: torch.mul(a, b)
mul supports broadcasting.
2) Matrix product: torch.mm and torch.matmul
torch.mm only handles 2-D tensors; torch.matmul also works on higher-dimensional (batched) inputs.
A.shape = (b, m, n); B.shape = (b, n, k)
matmul(A, B) has shape (b, m, k)
A.shape = (m, n); B.shape = (b, n, k); C.shape = (k, l)
matmul(A, B) has shape (b, m, k)
matmul(B, C) has shape (b, n, l)
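The shape rules above, checked with arbitrary sizes:

```python
import torch

# batched matmul: matching batch dims multiply matrix-by-matrix
a = torch.randn(4, 2, 3)
b = torch.randn(4, 3, 5)
print(torch.matmul(a, b).shape)   # torch.Size([4, 2, 5])

# a 2-D operand is broadcast over the batch dimension
c = torch.randn(2, 3)
print(torch.matmul(c, b).shape)   # torch.Size([4, 2, 5])

# mul is elementwise and broadcasts
print(torch.mul(torch.randn(4, 1, 3), torch.randn(1, 2, 3)).shape)  # torch.Size([4, 2, 3])
```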
10. masked_fill
Positions where the mask is True (1) are filled with value; positions where it is False (0) are left unchanged. Note that masked_fill returns a new tensor (masked_fill_ is the in-place variant), the mask should be a bool tensor (ByteTensor masks are deprecated), and the input must be floating-point to hold -1e9.
>>> a = torch.tensor([1., 0., 2., 3.])
>>> a.masked_fill(mask=torch.tensor([True, True, False, False]), value=-1e9)
tensor([-1.0000e+09, -1.0000e+09,  2.0000e+00,  3.0000e+00])
- softmax
The dim argument specifies the dimension along which softmax is applied.
a=torch.Tensor([[[1,3],[2,5],[6,3]],[[4,6],[5,5],[6,8]]])
print(a.size())
print(a)
a=F.softmax(a,dim=-1)
print(a)
output:
torch.Size([2, 3, 2])
tensor([[[1., 3.],
[2., 5.],
[6., 3.]],
[[4., 6.],
[5., 5.],
[6., 8.]]])
tensor([[[0.1192, 0.8808],
[0.0474, 0.9526],
[0.9526, 0.0474]],
[[0.1192, 0.8808],
[0.5000, 0.5000],
[0.1192, 0.8808]]])
- nn.Linear vs nn.Conv1d
nn.Linear acts on the last dimension of its input, which is exactly where it differs from the nn.Conv1d below.
x = torch.randn(1, 3, 100)               # a batch of one point cloud with 100 points
layer = nn.Conv1d(3, 10, kernel_size=1)  # 3 input channels, 10 output channels
y = torch.sigmoid(layer(x))              # sigmoid activation (F.sigmoid is deprecated)
print(x.size())
print(y.size())
'''
>>> torch.Size([1, 3, 100])
>>> torch.Size([1, 10, 100])
'''
The code above shows that nn.Conv1d only accepts a 3-D tensor [batch, channel, length], unlike nn.Linear, and that it acts in a different position: nn.Conv1d operates on the second dimension (channel).
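One way to see the relationship (my own check, not from the original notes): a Conv1d with kernel_size=1 computes the same per-position affine map as a Linear, once the weights are shared.

```python
import torch
import torch.nn as nn

lin = nn.Linear(3, 10)
conv = nn.Conv1d(3, 10, kernel_size=1)
with torch.no_grad():
    conv.weight.copy_(lin.weight.unsqueeze(-1))  # (10, 3) -> (10, 3, 1)
    conv.bias.copy_(lin.bias)

x = torch.randn(1, 3, 100)
y_conv = conv(x)                                # acts on dim 1 (channels)
y_lin = lin(x.transpose(1, 2)).transpose(1, 2)  # Linear acts on the last dim
print(torch.allclose(y_conv, y_lin, atol=1e-5))  # True
```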



