LV.1

胡汉三

33 points, 0 likes

4 posts, 19 replies, 1 favorite

Replies

error: 'onnx.Unsqueeze' op requires a single operand
My reply: Bump.

error: 'onnx.Unsqueeze' op requires a single operand
My reply:
    error: 'onnx.Unsqueeze' op requires a single operand
    2022-11-25 08:07:19.705468: ERROR: magicmind/cc/implement/interface_parser_impl.cc:66] Internal: Onnx mlir graph is illegal
    2022-11-25 08:07:19.705642: ERROR: /usr/local/neuware/samples/magicmind/mm_build/parser.cc:16] Internal: Onnx mlir graph is illegal
    Aborted (core dumped)

error: 'onnx.Unsqueeze' op requires a single operand
My reply: The ONNX version is as shown in the image.

Performance drop when testing dynamic shapes
My reply:
    Quoting #1, 踏雪寻梅:
        Dear developer, hello. The optimizations applied to variable inputs and fixed inputs differ slightly, but there are a few details we would like to confirm with you:
        1. Do 300 and 500 refer to hwtime?
        2. Apart from the variable input, are all other settings identical between the two modes? Could you share the configurations?
        3. Could you provide the base backbone of the model?
    Config file for the non-dynamic build:
        {
            "archs": ["mtp_372"],
            "graph_shape_mutable": false,
            "precision_config": {
                "precision_mode": "force_float16"
            },
            "opt_config": {
                "type64to32_conversion": true,
                "conv_scale_fold": true
            }
        }
    Command:
        ./onnx_build --onnx ./bert_squad/bert-base.onnx --precision force_float16 --input_dims 24,128 24,128 24,128 --calibration false --mlu_arch mtp_372 --build_config config.json
    Config file for the dynamic build:
        {
            "archs": ["mtp_372"],
            "graph_shape_mutable": true,
            "precision_config": {
                "precision_mode": "force_float16"
            },
            "opt_config": {
                "type64to32_conversion": true,
                "conv_scale_fold": true
            }
        }
    Command:
        ./onnx_build --onnx ./bert_squad/bert-base.onnx --precision force_float16 --calibration false --mlu_arch mtp_372 --build_config config.json
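The two build configurations in this reply can be compared mechanically. What follows is a minimal sketch in Python, assuming the JSON files are exactly as posted; the names `static_cfg`, `dynamic_cfg`, and `flatten` are illustrative and are not part of MagicMind. It flattens nested keys and reports only the settings that differ.

```python
import json

# Build configs copied from the reply (non-dynamic vs. dynamic).
static_cfg = json.loads("""
{
    "archs": ["mtp_372"],
    "graph_shape_mutable": false,
    "precision_config": {"precision_mode": "force_float16"},
    "opt_config": {"type64to32_conversion": true, "conv_scale_fold": true}
}
""")
dynamic_cfg = json.loads("""
{
    "archs": ["mtp_372"],
    "graph_shape_mutable": true,
    "precision_config": {"precision_mode": "force_float16"},
    "opt_config": {"type64to32_conversion": true, "conv_scale_fold": true}
}
""")


def flatten(cfg, prefix=""):
    """Flatten nested dicts into {"a.b": value} pairs (hypothetical helper)."""
    flat = {}
    for key, value in cfg.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, path + "."))
        else:
            flat[path] = value
    return flat


static_flat = flatten(static_cfg)
dynamic_flat = flatten(dynamic_cfg)

# Keep only keys whose values differ: (non-dynamic value, dynamic value).
diff = {
    key: (static_flat.get(key), dynamic_flat.get(key))
    for key in sorted(static_flat.keys() | dynamic_flat.keys())
    if static_flat.get(key) != dynamic_flat.get(key)
}
print(diff)
```

Under these assumptions the only differing key is `graph_shape_mutable`. The posted build commands also differ in one respect: only the non-dynamic one passes `--input_dims 24,128 24,128 24,128`.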

Performance drop when testing dynamic shapes
My reply:
    Quoting #2, 胡汉三:
        bert_base exported from PyTorch:
        ======= Test command, dynamic model =======
        /usr/local/neuware/bin/mm_run --magicmind_model /usr/local/neuware/samples/magicmind/mm_build/build/bert_base_dynamic.model --input_dims 24,128 24,128 24,128 --threads 1 --bind_cluster 0 --duration 30 --iterations 20 --kernel_capture 0
        ======= Test results, non-dynamic model =======
        Iterations:                   20
        Host Wall Time (s):           1.09576
        MLU Compute Time (s):         1.09463
        Throughput (qps):             438.052
        ======= Test command, non-dynamic model =======
        /usr/local/neuware/bin/mm_run --magicmind_model /usr/local/neuware/samples/magicmind/mm_build/build/bert_base_bs24_seq128.model --input_dims 24,128 24,128 24,128 --threads 1 --bind_cluster 0 --duration 30 --iterations 20 --kernel_capture 0
        ======= Test results, non-dynamic model =======
        Iterations:                   20
        Host Wall Time (s):           0.970254
        MLU Compute Time (s):         0.969134
        Throughput (qps):             494.716
    I made a typo in one place: the first results heading should read "Test results, dynamic model".

Performance drop when testing dynamic shapes
My reply:
    Quoting #1, 踏雪寻梅:
        Dear developer, hello. The optimizations applied to variable inputs and fixed inputs differ slightly, but there are a few details we would like to confirm with you:
        1. Do 300 and 500 refer to hwtime?
        2. Apart from the variable input, are all other settings identical between the two modes? Could you share the configurations?
        3. Could you provide the base backbone of the model?
    bert_base exported from PyTorch:
    ======= Test command, dynamic model =======
    /usr/local/neuware/bin/mm_run --magicmind_model /usr/local/neuware/samples/magicmind/mm_build/build/bert_base_dynamic.model --input_dims 24,128 24,128 24,128 --threads 1 --bind_cluster 0 --duration 30 --iterations 20 --kernel_capture 0
    ======= Test results, non-dynamic model =======
    Iterations:                   20
    Host Wall Time (s):           1.09576
    MLU Compute Time (s):         1.09463
    Throughput (qps):             438.052
    ======= Test command, non-dynamic model =======
    /usr/local/neuware/bin/mm_run --magicmind_model /usr/local/neuware/samples/magicmind/mm_build/build/bert_base_bs24_seq128.model --input_dims 24,128 24,128 24,128 --threads 1 --bind_cluster 0 --duration 30 --iterations 20 --kernel_capture 0
    ======= Test results, non-dynamic model =======
    Iterations:                   20
    Host Wall Time (s):           0.970254
    MLU Compute Time (s):         0.969134
    Throughput (qps):             494.716
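The reported throughput figures can be cross-checked with a little arithmetic. The sketch below assumes that the first result block belongs to the dynamic model (as the author's correction in a separate reply states), and that throughput equals iterations × batch size / host wall time, with the batch size of 24 taken from `--input_dims 24,128`. That formula is inferred from the posted numbers and is not documented `mm_run` behavior; the names `results` and `derived_qps` are illustrative.

```python
ITERATIONS = 20
BATCH_SIZE = 24  # assumption: first dimension of --input_dims 24,128

# Figures copied from the posted mm_run output.
results = {
    "dynamic": {"host_wall_s": 1.09576, "reported_qps": 438.052},
    "non_dynamic": {"host_wall_s": 0.970254, "reported_qps": 494.716},
}


def derived_qps(host_wall_s):
    """Samples per second under the assumed formula (hypothetical helper)."""
    return ITERATIONS * BATCH_SIZE / host_wall_s


for name, r in results.items():
    print(f"{name}: derived {derived_qps(r['host_wall_s']):.3f} qps, "
          f"reported {r['reported_qps']} qps")

# Relative throughput loss of the dynamic model versus the non-dynamic one.
drop = 1 - results["dynamic"]["reported_qps"] / results["non_dynamic"]["reported_qps"]
print(f"throughput drop with dynamic shapes: {drop:.1%}")
```

With the posted numbers, the derived values agree with the reported throughput, and the dynamic model comes out roughly 11% slower than the non-dynamic one.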

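How do I run a multi-input ONNX model?
My reply: Would I still be asking if I had understood it? Guess why I am not asking NV how they implemented it?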