
MLU270: layer-by-layer debugging of the CompressAI image-compression algorithm fails [Resolved]

等等2022 · 2022-08-16 16:43:58 · 26 views · Technical Q&A · Help request

Hardware: MLU270

OS: Ubuntu

Driver version: 4.8.0

AI framework: PyTorch

I am debugging the public CompressAI framework for PyTorch (https://github.com/InterDigitalInc/CompressAI/tree/f63754c32724c7ee4e7f523729fe387a5c9f86c6), using the mbt2018_mean compression model. It runs correctly on CPU and the quantized model also runs, but the layer-by-layer online run fails with the errors below. (I have tried both firstconv=False and firstconv=True; the problem occurs either way.)


[DEBUG][/pytorch/catch/torch_mlu/csrc/aten/operators/cnml_ops.cpp][line:194][conv2d_first][thread:139902670112576][process:10560]: input[shape: [1, 3, 512, 768], device: mlu:0, dtype: Float] weight[shape: [128, 3, 5, 5], device: mlu:0, dtype: Float] bias[shape: [128], device: mlu:0, dtype: Float] padding[value: 2] [value: 2] stride[value: 2] [value: 2] dilation[value: 1] [value: 1] groups[value: 1] q_scale[shape: [2], device: mlu:0, dtype: Float] q_mode[shape: [1], device: mlu:0, dtype: Int] mean[shape: [3], device: mlu:0, dtype: Float] std[shape: [3], device: mlu:0, dtype: Float]

[DEBUG][/pytorch/catch/torch_mlu/csrc/aten/operators/cnml_ops.cpp][line:398][max][thread:139902670112576][process:10560]: self[shape: [128], device: mlu:0, dtype: Float] other[shape: [1], device: mlu:0, dtype: Float]

[ERROR][/pytorch/catch/torch_mlu/csrc/aten/operators/cnml/internal/maximum_internal.cpp][line:11][cnml_maximum_internal][thread:139902670112576][process:10560]:

Shape of input should match shape of other

[WARNING][/pytorch/catch/torch_mlu/csrc/aten/operators/op_methods.cpp][line:1386][max][thread:139902670112576][process:10560]:

max Op cannot run on MLU device, start running on CPU!

[DEBUG][/pytorch/catch/torch_mlu/csrc/aten/operators/cnml_ops.cpp][line:226][copy_][thread:139902670112576][process:10560]: self[shape: [128], device: cpu, dtype: Float] src[shape: [128], device: mlu:0, dtype: Float] non_blocking[false]

[DEBUG][/pytorch/catch/torch_mlu/csrc/aten/operators/cnml_ops.cpp][line:226][copy_][thread:139902670112576][process:10560]: self[shape: [1], device: cpu, dtype: Float] src[shape: [1], device: mlu:0, dtype: Float] non_blocking[false]

[DEBUG][/pytorch/catch/torch_mlu/csrc/aten/operators/cnml_ops.cpp][line:226][copy_][thread:139902670112576][process:10560]: self[shape: [128], device: mlu:0, dtype: Float] src[shape: [128], device: cpu, dtype: Float] non_blocking[false]

[DEBUG][/pytorch/catch/torch_mlu/csrc/aten/operators/cnml_ops.cpp][line:506][pow][thread:139902670112576][process:10560]: self[shape: [128], device: mlu:0, dtype: Float] exponent[value: 2]

[DEBUG][/pytorch/catch/torch_mlu/csrc/aten/operators/cnml_ops.cpp][line:690][sub][thread:139902670112576][process:10560]: self[shape: [128], device: mlu:0, dtype: Float] other[shape: [1], device: mlu:0, dtype: Float] alpha[value: 1]

[DEBUG][/pytorch/catch/torch_mlu/csrc/aten/operators/cnml_ops.cpp][line:398][max][thread:139902670112576][process:10560]: self[shape: [128, 128], device: mlu:0, dtype: Float] other[shape: [1], device: mlu:0, dtype: Float]

[ERROR][/pytorch/catch/torch_mlu/csrc/aten/operators/cnml/internal/maximum_internal.cpp][line:11][cnml_maximum_internal][thread:139902670112576][process:10560]:

Shape of input should match shape of other

[WARNING][/pytorch/catch/torch_mlu/csrc/aten/operators/op_methods.cpp][line:1386][max][thread:139902670112576][process:10560]:

max Op cannot run on MLU device, start running on CPU!

[DEBUG][/pytorch/catch/torch_mlu/csrc/aten/operators/cnml_ops.cpp][line:226][copy_][thread:139902670112576][process:10560]: self[shape: [128, 128], device: cpu, dtype: Float] src[shape: [128, 128], device: mlu:0, dtype: Float] non_blocking[false]

[DEBUG][/pytorch/catch/torch_mlu/csrc/aten/operators/cnml_ops.cpp][line:226][copy_][thread:139902670112576][process:10560]: self[shape: [1], device: cpu, dtype: Float] src[shape: [1], device: mlu:0, dtype: Float] non_blocking[false]

[DEBUG][/pytorch/catch/torch_mlu/csrc/aten/operators/cnml_ops.cpp][line:226][copy_][thread:139902670112576][process:10560]: self[shape: [128, 128], device: mlu:0, dtype: Float] src[shape: [128, 128], device: cpu, dtype: Float] non_blocking[false]

[DEBUG][/pytorch/catch/torch_mlu/csrc/aten/operators/cnml_ops.cpp][line:506][pow][thread:139902670112576][process:10560]: self[shape: [128, 128], device: mlu:0, dtype: Float] exponent[value: 2]

[DEBUG][/pytorch/catch/torch_mlu/csrc/aten/operators/cnml_ops.cpp][line:690][sub][thread:139902670112576][process:10560]: self[shape: [128, 128], device: mlu:0, dtype: Float] other[shape: [1], device: mlu:0, dtype: Float] alpha[value: 1]

[DEBUG][/pytorch/catch/torch_mlu/csrc/aten/operators/cnml_ops.cpp][line:750][view][thread:139902670112576][process:10560]: self[shape: [128, 128], device: mlu:0, dtype: Float] size[value: [128, 128, 1, 1]]

beta.shape

torch.Size([128])

gamma.shape

torch.Size([128, 128, 1, 1])

x.shape

torch.Size([1, 128, 256, 384])

[DEBUG][/pytorch/catch/torch_mlu/csrc/aten/operators/cnml_ops.cpp][line:506][pow][thread:139902670112576][process:10560]: self[shape: [1, 128, 256, 384], device: mlu:0, dtype: Float] exponent[value: 2]

[WARNING][/pytorch/catch/torch_mlu/csrc/aten/operators/op_methods.cpp][line:3256][convolution_overrideable][thread:139902670112576][process:10560]:

convolution_overrideable Op cannot run on MLU device, start running on CPU!

Traceback (most recent call last):

  File "test_online.py", line 332, in <module>

    test_mlu_mode(args)

  File "test_online.py", line 171, in test_mlu_mode

    out_net = net.forward(img.to(ct.mlu_device())) if (args.mmode != "CPU") else net.forward(img)

  File "/torch/compressai/compressai/models/google.py", line 356, in forward

    y = self.g_a(x)

  File "/torch/venv3/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__

    result = self.forward(*input, **kwargs)

  File "/torch/venv3/pytorch/lib/python3.6/site-packages/torch/nn/modules/container.py", line 92, in forward

    input = module(input)

  File "/torch/venv3/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__

    result = self.forward(*input, **kwargs)

  File "/torch/compressai/compressai/layers/gdn.py", line 91, in forward

    norm = F.conv2d(torch.pow(x,2), gamma, beta)

RuntimeError: To do for CPU
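For reference, the failing line in gdn.py computes the GDN normalization as a 1×1 convolution over the squared input. With the shapes printed above, the CPU version of that call looks like this (a minimal sketch; the spatial dimensions are shrunk and the tensor values are random placeholders, not the real model weights):

```python
import torch
import torch.nn.functional as F

# Shapes taken from the log: gamma [128, 128, 1, 1], beta [128],
# x [1, 128, 256, 384] (spatial dims reduced here to keep it cheap).
x = torch.randn(1, 128, 8, 12)
gamma = torch.rand(128, 128, 1, 1)
beta = torch.rand(128)

# Same call as gdn.py line 91: a 1x1 conv of x^2 with gamma as the
# kernel and beta as the bias. This is the conv2d that falls back to
# CPU and then raises "RuntimeError: To do for CPU" on the MLU path.
norm = F.conv2d(torch.pow(x, 2), gamma, beta)
```

On CPU this produces a [1, 128, 8, 12] tensor; in the MLU run the input tensor is still on the device when the CPU fallback is attempted, which appears to be where the RuntimeError comes from.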


Earlier I also tried to print

net.to(ct.mlu_device())

and found that it raises an error. I am not sure whether the model simply cannot be printed because it is dynamic; when I try to print it, shape-error messages like the ones above appear in the log.
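Judging from the log, the CNML `maximum` kernel rejects the call because `self` (shape [128] or [128, 128]) and `other` (shape [1]) differ, i.e. it does not broadcast the 1-element bound the way CPU `torch.max` does, which also matches the shape errors seen when printing. One workaround worth trying (a hedged sketch, not verified on MLU hardware; `bounded_max` is a hypothetical helper, not part of CompressAI) is to expand the bound to the input's shape before calling `torch.max`:

```python
import torch

def bounded_max(x, bound):
    # torch.max broadcasts a 1-element `other` on CPU, but the CNML
    # maximum kernel in the log requires identical shapes. Expanding
    # the bound up front keeps both paths shape-compatible.
    return torch.max(x, bound.expand_as(x))

beta = torch.randn(128)          # shape [128], as in the log
bound = torch.tensor([1e-6])     # 1-element bound, shape [1]
clamped = bounded_max(beta, bound)
```

If this removes the "Shape of input should match shape of other" errors, the same expansion could be applied wherever the lower-bound `max` is taken in the GDN layer.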

I have put the model file and the online test script in the attachment (附件.zip).



Copyright © 2024 Cambricon (Cambricon.com). ICP filing/licence no.: 京ICP备17003415号-1