Environment (Ascend/GPU/CPU): GPU, GeForce RTX 3090 (24 GB)
Software Environment:
– MindSpore version (source or binary): 1.7.0
– Python version (e.g., Python 3.7.5): 3.8.13
– OS platform and distribution (e.g., Linux Ubuntu 16.04): Ubuntu 16.04
– CUDA version : 11.0
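This code is part of a ConvLSTM migration from PyTorch to MindSpore. The failing part is shown below.
split = ops.Split(1, 2)
output = split(x)
1.2.2 Error message
Some personal information has been masked.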
Traceback (most recent call last):
  File "main.py", line 195, in <module>
    train()
  File "main.py", line 142, in train
    loss = train_network(data, label)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/cell.py", line 612, in __call__
    raise err
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/cell.py", line 609, in __call__
    output = self._run_construct(cast_inputs, kwargs)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/cell.py", line 429, in _run_construct
    output = self.construct(*cast_inputs, **kwargs)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/wrap/cell_wrapper.py", line 373, in construct
    loss = self.network(*inputs)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/cell.py", line 612, in __call__
    raise err
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/cell.py", line 609, in __call__
    output = self._run_construct(cast_inputs, kwargs)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/cell.py", line 429, in _run_construct
    output = self.construct(*cast_inputs, **kwargs)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/wrap/cell_wrapper.py", line 111, in construct
    out = self._backbone(data)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/cell.py", line 612, in __call__
    raise err
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/cell.py", line 609, in __call__
    output = self._run_construct(cast_inputs, kwargs)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/cell.py", line 429, in _run_construct
    output = self.construct(*cast_inputs, **kwargs)
  File "/home/xxxlab/zrj/mindspore/ConvLSTM-PyTorch/conv/model.py", line 31, in construct
    state = self.encoder(input)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/cell.py", line 612, in __call__
    raise err
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/cell.py", line 609, in __call__
    output = self._run_construct(cast_inputs, kwargs)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/cell.py", line 429, in _run_construct
    output = self.construct(*cast_inputs, **kwargs)
  File "/home/xxxlab/zrj/mindspore/ConvLSTM-PyTorch/conv/encoder.py", line 42, in construct
    inputs, state_stage = self.forward_by_stage(
  File "/home/xxxlab/zrj/mindspore/ConvLSTM-PyTorch/conv/encoder.py", line 35, in forward_by_stage
    outputs_stage, state_stage = rnn(inputs, None)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/cell.py", line 612, in __call__
    raise err
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/cell.py", line 609, in __call__
    output = self._run_construct(cast_inputs, kwargs)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/nn/cell.py", line 429, in _run_construct
    output = self.construct(*cast_inputs, **kwargs)
  File "/home/xxxlab/zrj/mindspore/ConvLSTM-PyTorch/conv/ConvRNN.py", line 61, in construct
    combined_2 = P.Concat(1)((x, r * htprev)) # h' = tanh(W*(x+r*H_t-1))
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/common/tensor.py", line 278, in __mul__
    return tensor_operator_registry.get('__mul__')(self, other)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/ops/composite/multitype_ops/_compile_utils.py", line 101, in _tensor_mul
    return F.mul(self, other)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/ops/primitive.py", line 294, in __call__
    return _run_op(self, self.name, args)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/common/api.py", line 90, in wrapper
    results = fn(*arg, **kwargs)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/ops/primitive.py", line 754, in _run_op
    output = real_run_op(obj, op_name, args)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/ops/primitive.py", line 575, in __infer__
    out[track] = fn(*(x[track] for x in args))
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/ops/operations/math_ops.py", line 78, in infer_shape
    return get_broadcast_shape(x_shape, y_shape, self.name)
  File "/home/xxxlab/anaconda2/envs/mindspore/lib/python3.8/site-packages/mindspore/ops/_utils/utils.py", line 70, in get_broadcast_shape
    raise ValueError(f"For '{prim_name}', {arg_name1}.shape and {arg_name2}.shape are supposed "
ValueError: For 'Mul', x.shape and y.shape are supposed to broadcast, where broadcast means that x.shape[i] = 1 or -1 or y.shape[i] = 1 or -1 or x.shape[i] = y.shape[i], but now x.shape and y.shape can not broadcast, got i: -3, x.shape: [16, 2, 64, 64], y.shape: [16, 64, 64, 64].
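2 Cause analysis and solution
At the time I was completely baffled: why did the split output not have the dimensions I wanted? I first suspected that the input shape was wrong and debugged from the input onward, only to find that everything upstream was fine and the problem was in the split itself.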
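To see which axis the error is complaining about, the broadcast rule quoted in the ValueError can be checked by hand. The following is a minimal pure-Python sketch of that rule, not MindSpore's own implementation: the helper name `broadcast_shape` is hypothetical, and the `-1` (dynamic shape) case mentioned in the message is ignored here.

```python
def broadcast_shape(x_shape, y_shape):
    """Return the broadcast shape of two shapes, or raise ValueError.

    Shapes are right-aligned; a pair of sizes is compatible when the two
    sizes are equal or one of them is 1. (Hypothetical helper, for
    illustration only.)
    """
    rank = max(len(x_shape), len(y_shape))
    # Pad the shorter shape with leading 1s so both have the same rank.
    x = [1] * (rank - len(x_shape)) + list(x_shape)
    y = [1] * (rank - len(y_shape)) + list(y_shape)
    result = []
    for i, (a, b) in enumerate(zip(x, y)):
        if a == b or b == 1:
            result.append(a)
        elif a == 1:
            result.append(b)
        else:
            # Report the axis as a negative index, like the MindSpore message.
            raise ValueError(f"cannot broadcast at axis {i - rank}: {a} vs {b}")
    return result


# Shapes copied from the ValueError in the traceback.
try:
    broadcast_shape([16, 2, 64, 64], [16, 64, 64, 64])
except ValueError as err:
    print(err)  # the mismatch is at axis -3, i.e. the channel axis
```

The mismatch sits on axis -3 (the channel axis, which is axis 1 of a 4-D tensor), the same axis the split operates on, which is what points to the split rather than the input.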
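At first I mapped operators using the PyTorch-to-MindSpore API mapping table. It maps torch.split to mindspore.ops.Split with no additional notes, so I naturally assumed that their parameters were the same. They are not. According to the PyTorch and MindSpore documentation, besides the tensor and dim, torch.split takes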
- split_size_or_sections (int) or (list(int)) – size of a single chunk or list of sizes for each chunk
whereas in MindSpore, besides the tensor and the axis, Split takes
- output_num (int) – the number of outputs to split into. Must be a positive integer. Default: 1.
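In other words, torch.split takes the size of each chunk, while mindspore.ops.Split takes the number of chunks. Comparatively, MindSpore's parameter is easier to use and understand, while PyTorch's requires an extra calculation on your part. So when migrating, you cannot simply copy the arguments over; you also have to check that they actually correspond.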
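The conversion between the two parameters can be sketched as follows. This is an illustration under assumptions, not the original project's code: the helper name `split_size_to_output_num` is hypothetical, the channel counts (128 input channels, chunks of 64) are assumed values chosen to be consistent with the shapes in the error message, and it assumes the MindSpore operator needs the axis length to divide evenly by `output_num`.

```python
def split_size_to_output_num(dim_size, split_size):
    """Convert torch.split's integer chunk size into a chunk count.

    Hypothetical helper. Assumes the target operator requires equal-sized
    chunks, so an uneven division is treated as an error.
    """
    if split_size <= 0:
        raise ValueError("split_size must be a positive integer")
    if dim_size % split_size != 0:
        raise ValueError(
            f"axis length {dim_size} is not divisible by chunk size {split_size}"
        )
    return dim_size // split_size


channels = 128      # assumed length of axis 1 before the split
num_features = 64   # assumed chunk size passed to torch.split

# PyTorch:   torch.split(x, num_features, dim=1)      -> chunks of 64 channels
# MindSpore: ops.Split(axis=1, output_num=output_num) -> same chunks
output_num = split_size_to_output_num(channels, num_features)
print(output_num)  # 2

# Copying the chunk size straight into output_num instead gives 64 chunks,
# each with only 128 // 64 = 2 channels, matching x.shape [16, 2, 64, 64].
wrong_chunk_channels = channels // num_features
print(wrong_chunk_channels)  # 2
```

Under these assumptions, `ops.Split(1, 2)` is the correct counterpart of a PyTorch split with chunk size 64, while passing 64 as `output_num` reproduces the 2-channel tensors seen in the error.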
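When migrating, consult the PyTorch and MindSpore API documentation frequently. Besides using MindConverter for automatic mapping, also pay attention to how unsupported operators are mapped.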



