Also, I recently tried the 1.9 version again and ran into a fallback problem. How can this be resolved?
[ERROR][/torch/catch/torch_mlu/csrc/aten/operators/cnnl/_s_where.cpp:21][cnnl__s_where][process:2374][thread:140484020524864]:
x or y dtype of cnnl where op not implemented for 'Long'
[WARNING][/torch/catch/torch_mlu/csrc/aten/operators/mlu_type_default.cpp:2647][_s_where][process:2374][thread:140484020524864]: MLU _s_where failed, fallback to run on CPU automatically!
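When a run prints many of these warnings, it can help to first count which operators fall back to the CPU and how often. The snippet below is only a minimal sketch: it assumes the log has been saved as text and that the WARNING lines follow exactly the format quoted above. The function name `count_fallbacks` is made up for illustration and is not part of torch_mlu.

```python
import re
from collections import Counter

# Matches WARNING lines in the format quoted in this thread, for example:
# [WARNING][/path/file.cpp:2647][_s_where][process:2374][thread:1404...]: MLU _s_where failed, fallback to run on CPU automatically!
FALLBACK_RE = re.compile(
    r"\[WARNING\]\[[^\]]*\]\[(?P<op>[^\]]+)\]\[process:\d+\]\[thread:\d+\]:\s*"
    r"MLU (?P=op) failed, fallback to run on CPU automatically!"
)


def count_fallbacks(log_text):
    """Return a Counter mapping operator name to its number of CPU fallbacks."""
    return Counter(m.group("op") for m in FALLBACK_RE.finditer(log_text))


if __name__ == "__main__":
    import sys

    # Usage (file name is hypothetical): python count_fallbacks.py run.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as f:
        for op, n in count_fallbacks(f.read()).most_common():
            print(f"{op}: {n}")
```

The operators with the highest counts are the ones worth checking first, for example by looking at the dtype named in the ERROR line that precedes each warning.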
Looks like the device is wrong. Check every place in the code that sets or changes the device, one by one.
Hello: which environment are you running in (MLU370 or MLU270), and which steps did you follow to port the network? Which script produced the error? Please provide the detailed steps and information so that we can locate the problem. Thanks!
Hardware: MLU370
Software: yellow.hub.cambricon.com/pytorch/pytorch:v1.4.0-torch1.9-ubuntu18.04
Reference documentation: https://www.cambricon.com/docs/sdk_1.7.0/cambricon_pytorch_1.6.0/porting_1.9/pytorch_3_porting/pytorch_porting.html#torch-mlu
Running an in-house ResNet50 demo; the full error is as follows:
Cannot set version_counter for inference tensor
[WARNING][/torch/catch/torch_mlu/csrc/aten/operators/mlu_type_default.cpp:935][convolution_overrideable][process:33242][thread:139957320275776]: MLU convolution_overrideable failed, fallback to run on CPU automatically!
Traceback (most recent call last):
  File "inference.py", line 535, in <module>
    main()
  File "inference.py", line 454, in main
    args, model, data_loader, device, args.show, args.show_dir, **show_kwargs
  File "inference.py", line 157, in single_gpu_test
    result = model(return_loss=False, **data)
  File "/torch/venv3/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/torch/venv3/pytorch/lib/python3.6/site-packages/mmcv/runner/fp16_utils.py", line 119, in new_func
    return old_func(*args, **kwargs)
  File "/torch/venv3/pytorch/lib/python3.6/site-packages/mmcls/models/classifiers/base.py", line 85, in forward
    return self.forward_test(img, **kwargs)
  File "/torch/venv3/pytorch/lib/python3.6/site-packages/mmcls/models/classifiers/base.py", line 67, in forward_test
    return self.simple_test(imgs[0], **kwargs)
  File "/torch/venv3/pytorch/lib/python3.6/site-packages/mmcls/models/classifiers/image.py", line 152, in simple_test
    x = self.extract_feat(img)
  File "/torch/venv3/pytorch/lib/python3.6/site-packages/mmcls/models/classifiers/image.py", line 111, in extract_feat
    x = self.backbone(img)
  File "/torch/venv3/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/torch/venv3/pytorch/lib/python3.6/site-packages/mmcls/models/backbones/resnet_cifar.py", line 72, in forward
    x = self.conv1(x)
  File "/torch/venv3/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/torch/venv3/pytorch/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 443, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/torch/venv3/pytorch/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 440, in _conv_forward
    self.padding, self.dilation, self.groups)
RuntimeError: MLU convolution_overrideable does not have fallback CPU implementation!
Could you explain: why does conv2d end up running the convolution_overrideable operator, and why does it raise an exception? Also, does Catch implement a CPU fallback for this operator?
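The device check suggested earlier in the thread could be sketched roughly as follows. This is not a confirmed diagnosis of the error above, just one way to see whether the model's weights and the input sit on different devices before `self.conv1(x)` is called. The helper names `group_by_device` and `find_device_mismatch` are hypothetical; the code only relies on `named_parameters()`, `named_buffers()` and the `.device` attribute, which `torch.nn.Module` and `torch.Tensor` provide.

```python
from collections import defaultdict


def group_by_device(model):
    """Map device string -> names of the parameters and buffers stored on it.

    `model` is expected to behave like torch.nn.Module: named_parameters()
    and named_buffers() must yield (name, tensor) pairs, and each tensor
    must expose a .device attribute.
    """
    groups = defaultdict(list)
    for name, tensor in model.named_parameters():
        groups[str(tensor.device)].append(name)
    for name, tensor in model.named_buffers():
        groups[str(tensor.device)].append(name)
    return dict(groups)


def find_device_mismatch(model, input_device):
    """Return {device: [names]} for everything NOT on the input's device."""
    wanted = str(input_device)
    return {
        dev: names
        for dev, names in group_by_device(model).items()
        if dev != wanted
    }
```

For example, calling `find_device_mismatch(model, img.device)` right before the forward pass returns an empty dict when everything is on the same device as `img`. Pass `img.device` rather than a hand-written string, so that both sides of the comparison use the same device spelling.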