[Cambricon hardware product model] Required*: MLU 290
[Operating system] Required*: Ubuntu 20.04
[Driver version] Required*: v4.20.23
[Error message] Required*:
2025-06-27 15:47:53.601765: [cnrtError] [23551] [Card : 0] Error occurred during calling 'cnQueueWaitNotifier' in CNDrv interface.
2025-06-27 15:47:53.601853: [cnrtError] [23551] [Card : 0] Return value is 100080, CN_NOTIFIER_ERROR_INVALID, means that "invalid notifier"
2025-06-27 15:47:53.601896: [cnrtError] [23551] [Card : 0] QueueWaitNotifier: Query MLU queue failed.
[ERROR][/torch/src/catch/torch_mlu/csrc/aten/device/notifier.cpp:91][wait][process:23551][thread:140369276213056]:
Traceback (most recent call last):
  File "./scGPT/scGPT-main/finatune_heart-Copy1_test.py", line 1051, in <module>
    main()
  File "./scGPT/scGPT-main/finatune_heart-Copy1_test.py", line 983, in main
    model = DDP(model, device_ids=[gpu_id])
  File "/torch/venv3/pytorch_py38/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 496, in __init__
    dist._verify_model_across_ranks(self.process_group, parameters)
RuntimeError: CNRT error: failed to call the driver-api function.
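[Checks already performed] Optional: None
[Link to reference configuration documentation] Optional
[Related logs and documents] Optional
Attach if available
[Link to the failing code] Optional:
A link to the code on GitHub or Gitee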