PyTorch study notes
- unsqueeze
unsqueeze inserts a dimension of size 1 at the given position.
a1 = torch.arange(0, 10)
print(a1)
a2 = torch.arange(0, 10).unsqueeze(0)
print(a2)
a3 = torch.arange(0, 10).unsqueeze(1)
print(a3)
print(a1.size(), a2.size(), a3.size())
#tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
#tensor([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])
#tensor([[0],
#        [1],
#        [2],
#        [3],
#        [4],
#        [5],
#        [6],
#        [7],
#        [8],
#        [9]])
#torch.Size([10]) torch.Size([1, 10]) torch.Size([10, 1])
2. repeat tiles a tensor along the specified dimensions
repeat takes one repeat count per dimension: t.repeat(n0, n1, ...) tiles dim 0 n0 times, dim 1 n1 times, and so on.
a=torch.arange(0,10).unsqueeze(0)
print(a)
a1=torch.arange(0,10).unsqueeze(0).repeat(10,1)
print(a1)
#tensor([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])
#tensor([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])
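As a sketch of the rule above (one tile count per existing dimension), using my own toy tensors:

```python
import torch

# repeat takes one tile count per dimension of the tensor
a = torch.arange(3)           # shape (3,)
print(a.repeat(2).shape)      # torch.Size([6])
b = a.unsqueeze(0)            # shape (1, 3)
print(b.repeat(2, 3).shape)   # torch.Size([2, 9]): dim 0 tiled 2x, dim 1 tiled 3x
```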
- nn.ModuleList: unlike nn.Sequential, a ModuleList has no built-in forward, so you write the forward pass yourself
self.layers = nn.ModuleList(
    [encoder_layer(hid_dim, n_heads, pf_dim, self_attention, positionwise_feedforward, dropout, device)
     for _ in range(n_layers)]
)
for layer in self.layers:
    src = layer(src, src_mask)
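A minimal self-contained sketch of the same pattern, using a hypothetical stack of Linear layers in place of the encoder_layer above:

```python
import torch
import torch.nn as nn

# nn.ModuleList registers the submodules (so their parameters are tracked)
# but has no forward of its own; we loop over the layers manually.
class StackedMLP(nn.Module):
    def __init__(self, dim, n_layers):
        super().__init__()
        self.layers = nn.ModuleList(
            [nn.Linear(dim, dim) for _ in range(n_layers)]
        )

    def forward(self, x):
        for layer in self.layers:
            x = torch.relu(layer(x))
        return x

model = StackedMLP(8, 3)
out = model(torch.randn(4, 8))
print(out.shape)  # torch.Size([4, 8])
```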
- embedding layer
self.tok_embedding = nn.Embedding(input_dim, hid_dim)  # input_dim: vocabulary size; hid_dim: embedding dimension
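A runnable sketch with made-up sizes (a vocabulary of 100 tokens, 16-dim embeddings): an Embedding is a lookup table that maps integer token ids to vectors.

```python
import torch
import torch.nn as nn

emb = nn.Embedding(100, 16)          # vocab of 100 tokens, 16-dim embeddings
tokens = torch.tensor([[1, 5, 7],
                       [2, 4, 9]])   # (batch=2, seq_len=3) of token ids
vecs = emb(tokens)
print(vecs.shape)  # torch.Size([2, 3, 16])
```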
- dropout layer
self.dropout = dropout
self.do = nn.Dropout(dropout)
src = self.do((self.tok_embedding(src) * self.scale) + self.pos_embedding(pos))
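A quick sketch of nn.Dropout's behavior (p = 0.5 chosen arbitrarily): in training mode surviving entries are rescaled by 1/(1-p); in eval mode dropout is the identity.

```python
import torch
import torch.nn as nn

do = nn.Dropout(p=0.5)
x = torch.ones(3, 4)
do.train()
y_train = do(x)   # each entry is either 0.0 (dropped) or 2.0 (kept, rescaled by 1/(1-p))
do.eval()
y_eval = do(x)    # identical to x
print(torch.equal(y_eval, x))  # True
```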
6. LayerNorm
If LayerNorm is given an integer argument, it normalizes over the last dimension, and the integer must equal that dimension's size.
If the argument is [3, 4], it normalizes over the last two dimensions, whose shape must be [3, 4]; all 12 numbers are normalized together.
The normalization formula is y = (x - mean) / sqrt(var + eps), where eps is a small constant that keeps the denominator from blowing up when the variance is close to 0.
LayerNorm accepts eps as an optional parameter.
t=torch.FloatTensor([[1,2,3,4],[2,6,4,5],[2,5,9,7]])
norm=nn.LayerNorm(4)
t=norm(t)
print(t)
#tensor([[-1.3416, -0.4472, 0.4472, 1.3416],
# [-1.5213, 1.1832, -0.1690, 0.5071],
#        [-1.4501, -0.2900,  1.2568,  0.4834]])
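The formula can be checked by hand against nn.LayerNorm. Note that PyTorch uses the biased variance (unbiased=False) and defaults to eps = 1e-5:

```python
import torch
import torch.nn as nn

t = torch.FloatTensor([[1, 2, 3, 4], [2, 6, 4, 5], [2, 5, 9, 7]])
mean = t.mean(dim=-1, keepdim=True)
var = t.var(dim=-1, unbiased=False, keepdim=True)  # biased variance, as LayerNorm uses
manual = (t - mean) / torch.sqrt(var + 1e-5)
print(torch.allclose(manual, nn.LayerNorm(4)(t), atol=1e-4))  # True
```

This matches because a freshly constructed LayerNorm has its learnable scale initialized to 1 and bias to 0.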
7. Linear layer
li = nn.Linear(input_dim, output_dim)  # maps an input tensor [batch_size, in_features] to [batch_size, out_features]
Linear is applied along the last dimension; here it expands that dimension from 1 to 2.
w_k=nn.Linear(1,2)
w=torch.FloatTensor(2,2,1)  # allocates uninitialized memory, hence the garbage values below
print(w)
w=w_k(w)
print(w)
tensor([[[ 0.0000e+00],
[-1.5846e+29]],
[[ 0.0000e+00],
[-1.5846e+29]]])
tensor([[[-4.1201e-01, 4.7898e-01],
[ 1.0952e+29, -8.0230e+28]],
[[-4.1201e-01, 4.7898e-01],
          [ 1.0952e+29, -8.0230e+28]]], grad_fn=...)
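A sketch confirming that Linear only touches the last dimension, with arbitrary sizes; all leading dimensions pass through unchanged:

```python
import torch
import torch.nn as nn

lin = nn.Linear(5, 7)
x = torch.randn(2, 3, 5)
y = lin(x)             # applied independently at every (2, 3) position
print(y.shape)         # torch.Size([2, 3, 7])
```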
8.permute
>>> x = torch.randn(2, 3, 5)
>>> x.size()
torch.Size([2, 3, 5])
>>> x.permute(2, 0, 1).size()
torch.Size([5, 2, 3])
9. Matrix multiplication
1) Element-wise product: torch.mul(a, b)
mul supports broadcasting.
2) Matrix product: torch.mm and torch.matmul
torch.mm only handles 2-D tensors; torch.matmul also works on higher-dimensional (batched) inputs.
A.shape = (b, m, n); B.shape = (b, n, k)
matmul(A, B) has shape (b, m, k)
A.shape = (m, n); B.shape = (b, n, k); C.shape = (k, l)
matmul(A, B) has shape (b, m, k)
matmul(B, C) has shape (b, n, l)
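The shape rules above, checked with arbitrary sizes:

```python
import torch

# batched matmul: matching batch dims multiply matrix-by-matrix
a = torch.randn(4, 2, 3)
b = torch.randn(4, 3, 5)
print(torch.matmul(a, b).shape)   # torch.Size([4, 2, 5])

# a 2-D operand is broadcast over the batch dimension
c = torch.randn(2, 3)
print(torch.matmul(c, b).shape)   # torch.Size([4, 2, 5])

# mul is elementwise and broadcasts
print(torch.mul(torch.randn(4, 1, 3), torch.randn(1, 2, 3)).shape)  # torch.Size([4, 2, 3])
```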
10. masked_fill
Positions where the mask is True (1) are filled with value; positions where it is False (0) are left unchanged. Note that masked_fill returns a new tensor (masked_fill_ is the in-place variant), the mask should be a bool tensor (ByteTensor masks are deprecated), and the input must be floating-point to hold -1e9.
>>> a = torch.tensor([1., 0., 2., 3.])
>>> a.masked_fill(mask=torch.tensor([True, True, False, False]), value=-1e9)
tensor([-1.0000e+09, -1.0000e+09,  2.0000e+00,  3.0000e+00])
- softmax
The dim argument specifies the dimension along which softmax is applied.
a=torch.Tensor([[[1,3],[2,5],[6,3]],[[4,6],[5,5],[6,8]]])
print(a.size())
print(a)
a=F.softmax(a,dim=-1)
print(a)
output:
torch.Size([2, 3, 2])
tensor([[[1., 3.],
[2., 5.],
[6., 3.]],
[[4., 6.],
[5., 5.],
[6., 8.]]])
tensor([[[0.1192, 0.8808],
[0.0474, 0.9526],
[0.9526, 0.0474]],
[[0.1192, 0.8808],
[0.5000, 0.5000],
[0.1192, 0.8808]]])
- nn.Linear vs nn.Conv1d
nn.Linear acts on the last dimension of its input, which is exactly where it differs from the nn.Conv1d below.
x = torch.randn(1, 3, 100)               # a batch of one point cloud with 100 points
layer = nn.Conv1d(3, 10, kernel_size=1)  # 3 input channels, 10 output channels
y = torch.sigmoid(layer(x))              # sigmoid activation (F.sigmoid is deprecated)
print(x.size())
print(y.size())
'''
>>> torch.Size([1, 3, 100])
>>> torch.Size([1, 10, 100])
'''
The code above shows that nn.Conv1d only accepts a 3-D tensor [batch, channel, length], unlike nn.Linear, and that it acts in a different position: nn.Conv1d operates on the second dimension (channel).
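One way to see the relationship (my own check, not from the original notes): a Conv1d with kernel_size=1 computes the same per-position affine map as a Linear, once the weights are shared.

```python
import torch
import torch.nn as nn

lin = nn.Linear(3, 10)
conv = nn.Conv1d(3, 10, kernel_size=1)
with torch.no_grad():
    conv.weight.copy_(lin.weight.unsqueeze(-1))  # (10, 3) -> (10, 3, 1)
    conv.bias.copy_(lin.bias)

x = torch.randn(1, 3, 100)
y_conv = conv(x)                                # acts on dim 1 (channels)
y_lin = lin(x.transpose(1, 2)).transpose(1, 2)  # Linear acts on the last dim
print(torch.allclose(y_conv, y_lin, atol=1e-5))  # True
```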



