
MLU270: layer-by-layer debugging of the CompressAI image-compression algorithm fails [Resolved]

等等2022 · 2022-08-16 16:43:58 · 26 views · Technical Q&A · Help request

Hardware: MLU270

OS: Ubuntu

Driver version: 4.8.0

AI framework: PyTorch

I am debugging the public CompressAI framework for PyTorch (https://github.com/InterDigitalInc/CompressAI/tree/f63754c32724c7ee4e7f523729fe387a5c9f86c6), using the mbt2018_mean compression model. It runs correctly on CPU and the quantized model also runs, but the layer-by-layer online run fails with the errors below. (I have tried both firstconv=False and firstconv=True; the problem occurs either way.)


[DEBUG][/pytorch/catch/torch_mlu/csrc/aten/operators/cnml_ops.cpp][line:194][conv2d_first][thread:139902670112576][process:10560]: input[shape: [1, 3, 512, 768], device: mlu:0, dtype: Float] weight[shape: [128, 3, 5, 5], device: mlu:0, dtype: Float] bias[shape: [128], device: mlu:0, dtype: Float] padding[value: 2] [value: 2] stride[value: 2] [value: 2] dilation[value: 1] [value: 1] groups[value: 1] q_scale[shape: [2], device: mlu:0, dtype: Float] q_mode[shape: [1], device: mlu:0, dtype: Int] mean[shape: [3], device: mlu:0, dtype: Float] std[shape: [3], device: mlu:0, dtype: Float]

[DEBUG][/pytorch/catch/torch_mlu/csrc/aten/operators/cnml_ops.cpp][line:398][max][thread:139902670112576][process:10560]: self[shape: [128], device: mlu:0, dtype: Float] other[shape: [1], device: mlu:0, dtype: Float]

[ERROR][/pytorch/catch/torch_mlu/csrc/aten/operators/cnml/internal/maximum_internal.cpp][line:11][cnml_maximum_internal][thread:139902670112576][process:10560]:

Shape of input should match shape of other

[WARNING][/pytorch/catch/torch_mlu/csrc/aten/operators/op_methods.cpp][line:1386][max][thread:139902670112576][process:10560]:

max Op cannot run on MLU device, start running on CPU!

[DEBUG][/pytorch/catch/torch_mlu/csrc/aten/operators/cnml_ops.cpp][line:226][copy_][thread:139902670112576][process:10560]: self[shape: [128], device: cpu, dtype: Float] src[shape: [128], device: mlu:0, dtype: Float] non_blocking[false]

[DEBUG][/pytorch/catch/torch_mlu/csrc/aten/operators/cnml_ops.cpp][line:226][copy_][thread:139902670112576][process:10560]: self[shape: [1], device: cpu, dtype: Float] src[shape: [1], device: mlu:0, dtype: Float] non_blocking[false]

[DEBUG][/pytorch/catch/torch_mlu/csrc/aten/operators/cnml_ops.cpp][line:226][copy_][thread:139902670112576][process:10560]: self[shape: [128], device: mlu:0, dtype: Float] src[shape: [128], device: cpu, dtype: Float] non_blocking[false]

[DEBUG][/pytorch/catch/torch_mlu/csrc/aten/operators/cnml_ops.cpp][line:506][pow][thread:139902670112576][process:10560]: self[shape: [128], device: mlu:0, dtype: Float] exponent[value: 2]

[DEBUG][/pytorch/catch/torch_mlu/csrc/aten/operators/cnml_ops.cpp][line:690][sub][thread:139902670112576][process:10560]: self[shape: [128], device: mlu:0, dtype: Float] other[shape: [1], device: mlu:0, dtype: Float] alpha[value: 1]

[DEBUG][/pytorch/catch/torch_mlu/csrc/aten/operators/cnml_ops.cpp][line:398][max][thread:139902670112576][process:10560]: self[shape: [128, 128], device: mlu:0, dtype: Float] other[shape: [1], device: mlu:0, dtype: Float]

[ERROR][/pytorch/catch/torch_mlu/csrc/aten/operators/cnml/internal/maximum_internal.cpp][line:11][cnml_maximum_internal][thread:139902670112576][process:10560]:

Shape of input should match shape of other

[WARNING][/pytorch/catch/torch_mlu/csrc/aten/operators/op_methods.cpp][line:1386][max][thread:139902670112576][process:10560]:

max Op cannot run on MLU device, start running on CPU!

[DEBUG][/pytorch/catch/torch_mlu/csrc/aten/operators/cnml_ops.cpp][line:226][copy_][thread:139902670112576][process:10560]: self[shape: [128, 128], device: cpu, dtype: Float] src[shape: [128, 128], device: mlu:0, dtype: Float] non_blocking[false]

[DEBUG][/pytorch/catch/torch_mlu/csrc/aten/operators/cnml_ops.cpp][line:226][copy_][thread:139902670112576][process:10560]: self[shape: [1], device: cpu, dtype: Float] src[shape: [1], device: mlu:0, dtype: Float] non_blocking[false]

[DEBUG][/pytorch/catch/torch_mlu/csrc/aten/operators/cnml_ops.cpp][line:226][copy_][thread:139902670112576][process:10560]: self[shape: [128, 128], device: mlu:0, dtype: Float] src[shape: [128, 128], device: cpu, dtype: Float] non_blocking[false]

[DEBUG][/pytorch/catch/torch_mlu/csrc/aten/operators/cnml_ops.cpp][line:506][pow][thread:139902670112576][process:10560]: self[shape: [128, 128], device: mlu:0, dtype: Float] exponent[value: 2]

[DEBUG][/pytorch/catch/torch_mlu/csrc/aten/operators/cnml_ops.cpp][line:690][sub][thread:139902670112576][process:10560]: self[shape: [128, 128], device: mlu:0, dtype: Float] other[shape: [1], device: mlu:0, dtype: Float] alpha[value: 1]

[DEBUG][/pytorch/catch/torch_mlu/csrc/aten/operators/cnml_ops.cpp][line:750][view][thread:139902670112576][process:10560]: self[shape: [128, 128], device: mlu:0, dtype: Float] size[value: [128, 128, 1, 1]]

beta.shape

torch.Size([128])

gamma.shape

torch.Size([128, 128, 1, 1])

x.shape

torch.Size([1, 128, 256, 384])

[DEBUG][/pytorch/catch/torch_mlu/csrc/aten/operators/cnml_ops.cpp][line:506][pow][thread:139902670112576][process:10560]: self[shape: [1, 128, 256, 384], device: mlu:0, dtype: Float] exponent[value: 2]

[WARNING][/pytorch/catch/torch_mlu/csrc/aten/operators/op_methods.cpp][line:3256][convolution_overrideable][thread:139902670112576][process:10560]:

convolution_overrideable Op cannot run on MLU device, start running on CPU!

Traceback (most recent call last):

  File "test_online.py", line 332, in <module>

    test_mlu_mode(args)

  File "test_online.py", line 171, in test_mlu_mode

    out_net = net.forward(img.to(ct.mlu_device())) if (args.mmode != "CPU") else net.forward(img)

  File "/torch/compressai/compressai/models/google.py", line 356, in forward

    y = self.g_a(x)

  File "/torch/venv3/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__

    result = self.forward(*input, **kwargs)

  File "/torch/venv3/pytorch/lib/python3.6/site-packages/torch/nn/modules/container.py", line 92, in forward

    input = module(input)

  File "/torch/venv3/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__

    result = self.forward(*input, **kwargs)

  File "/torch/compressai/compressai/layers/gdn.py", line 91, in forward

    norm = F.conv2d(torch.pow(x,2), gamma, beta)

RuntimeError: To do for CPU
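For reference, the failing line in gdn.py computes the GDN normalization as a 1×1 convolution over the squared input. With the shapes printed above, the CPU version of that call looks like this (a minimal sketch; the spatial dimensions are shrunk and the tensor values are random placeholders, not the real model weights):

```python
import torch
import torch.nn.functional as F

# Shapes taken from the log: gamma [128, 128, 1, 1], beta [128],
# x [1, 128, 256, 384] (spatial dims reduced here to keep it cheap).
x = torch.randn(1, 128, 8, 12)
gamma = torch.rand(128, 128, 1, 1)
beta = torch.rand(128)

# Same call as gdn.py line 91: a 1x1 conv of x^2 with gamma as the
# kernel and beta as the bias. This is the conv2d that falls back to
# CPU and then raises "RuntimeError: To do for CPU" on the MLU path.
norm = F.conv2d(torch.pow(x, 2), gamma, beta)
```

On CPU this produces a [1, 128, 8, 12] tensor; in the MLU run the input tensor is still on the device when the CPU fallback is attempted, which appears to be where the RuntimeError comes from.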


Earlier I also tried to print

net.to(ct.mlu_device())

and found that it raises an error. I am not sure whether the model simply cannot be printed because it is dynamic; when I try to print it, shape-error messages like the ones above appear in the log.
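Judging from the log, the CNML `maximum` kernel rejects the call because `self` (shape [128] or [128, 128]) and `other` (shape [1]) differ, i.e. it does not broadcast the 1-element bound the way CPU `torch.max` does, which also matches the shape errors seen when printing. One workaround worth trying (a hedged sketch, not verified on MLU hardware; `bounded_max` is a hypothetical helper, not part of CompressAI) is to expand the bound to the input's shape before calling `torch.max`:

```python
import torch

def bounded_max(x, bound):
    # torch.max broadcasts a 1-element `other` on CPU, but the CNML
    # maximum kernel in the log requires identical shapes. Expanding
    # the bound up front keeps both paths shape-compatible.
    return torch.max(x, bound.expand_as(x))

beta = torch.randn(128)          # shape [128], as in the log
bound = torch.tensor([1e-6])     # 1-element bound, shape [1]
clamped = bounded_max(beta, bound)
```

If this removes the "Shape of input should match shape of other" errors, the same expansion could be applied wherever the lower-bound `max` is taken in the GDN layer.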

I have put the model file and the online test script in the attachment (附件.zip).



Copyright © 2024 Cambricon (Cambricon.com). ICP filing/licence no.: 京ICP备17003415号-1