[YOLOX Deployment] Porting the open-source YOLOX model to the Cambricon 200-series platform (Part 1)


Posted by sqhuang, 2022-09-21 16:20:27


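This post describes the steps for porting the open-source YOLOX model to the Cambricon platform (Part 1).

Reference implementation: https://github.com/Megvii-BaseDetection/YOLOX

As of this writing, commit 74b637b494ad6a968c8bc8afec5ccdd7ca6b544f is used.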

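0. Weight preparation

Before porting the open-source YOLOX model, the model weights need one extra conversion step.

The official pretrained models were saved with torch 1.6 or later, which writes checkpoints in a new zip-based file format. The torch in the MLU270 container is based on 1.3, so the weights must first be re-saved with a newer torch, passing _use_new_zipfile_serialization=False, to obtain a file in the legacy (non-zip) format.

So we first install YOLOX in an environment that has a newer torch and convert the weights there.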

git clone https://github.com/Megvii-BaseDetection/YOLOX.git
cd YOLOX
pip3 install -r requirements.txt   -i https://pypi.tuna.tsinghua.edu.cn/simple

yolox-s is used as the example here:

https://github.com/Megvii-BaseDetection/YOLOX/releases/download/0.1.1rc0/yolox_s.pth

Download it into the weights folder.

Next, modify tools/demo.py:

diff --git a/tools/demo.py b/tools/demo.py
index b16598d..6afabe2 100644
--- a/tools/demo.py
+++ b/tools/demo.py
@@ -10,7 +10,7 @@ from loguru import logger
 import cv2

 import torch
-
+import numpy as np
 from yolox.data.data_augment import ValTransform
 from yolox.data.datasets import COCO_CLASSES
 from yolox.exp import get_exp
@@ -156,6 +156,7 @@ class Predictor(object):
         with torch.no_grad():
             t0 = time.time()
             outputs = self.model(img)
+            torch.save(self.model.state_dict(), "unzip.pth", _use_new_zipfile_serialization=False)
             if self.decoder is not None:
                 outputs = self.decoder(outputs, dtype=outputs.type())
             outputs = postprocess(

Run

python setup.py develop  # install yolox

Then run

python3 tools/demo.py image -n yolox-s -c weights/yolox_s.pth --path assets/dog.jpg --conf 0.25 --nms 0.45 --tsize 640 --save_result --device cpu

The current directory now contains

unzip.pth

Rename it to yolox-s-unzip.pth for later use.
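
Before copying the file into the MLU270 container, it can be worth confirming that the re-saved checkpoint is no longer a zip archive. The following is a minimal sketch using only the standard library. The helper names is_zip_checkpoint and describe_checkpoint are made up for illustration, and the check assumes that torch 1.6+ writes zip archives by default while files saved with _use_new_zipfile_serialization=False are not zip archives.

```python
import zipfile


def is_zip_checkpoint(path):
    """Return True if the file at `path` is a zip archive.

    Assumption: torch >= 1.6 writes zip-based checkpoints by default, while
    files saved with _use_new_zipfile_serialization=False are not zip archives.
    """
    return zipfile.is_zipfile(path)


def describe_checkpoint(path):
    """Return a short, human-readable verdict for a checkpoint file."""
    if is_zip_checkpoint(path):
        return f"{path}: zip-based format, re-save it with a newer torch first"
    return f"{path}: not a zip archive, consistent with the legacy format"
```

For example, print(describe_checkpoint("yolox-s-unzip.pth")) should report the legacy format, while the downloaded weights/yolox_s.pth should be reported as zip-based.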

With the model prepared as above, the later steps can use it for inference (from here on, everything is done in the MLU270 environment).

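1. MLU270 environment setup

To make PyTorch and Catch easy to use, users can download and install the PyTorch and Catch Docker image. PyTorch and Catch are compiled into wheel packages and installed inside the image;

the required dependencies are copied into the image as well. The directory structure is as follows: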

/torch/
├── requirements.txt ⋯⋯⋯⋯⋯⋯⋯⋯ Python dependencies
├── src ⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯ pytorch/catch/vision source code
├── venv2 ⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯  python2 virtual environment
├── venv3 ⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯  python3 virtual environment
├── wheel_py2 ⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯ python2 wheel packages
├── wheel_py3 ⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯ python3 wheel packages

Here we demonstrate with the community release:

cair.cambricon.com/cambricon/cambricon_pytorch:ubuntu16.04_sdk_v1.7.0_pytorch_v0.15.0-2

After entering the container, first activate the virtual environment:

source /torch/venv3/pytorch/bin/activate

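2. Model porting

2.1 Installing dependencies

The official YOLOX repository is used here as well:

git clone https://github.com/Megvii-BaseDetection/YOLOX.git

Because the container already provides some dependencies, requirements.txt can be trimmed.

The onnx-related packages are not needed either and can be removed: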

# TODO: Update with exact module version
numpy
# torch>=1.7
opencv_python
loguru
tqdm
# torchvision
thop
ninja
tabulate

# verified versions
# pycocotools corresponds to https://github.com/ppwwyyxx/cocoapi
pycocotools>=2.0.2
# onnx==1.8.1
# onnxruntime==1.8.0
# onnx-simplifier==0.3.5

Run

pip3 install -r requirements.txt   -i https://pypi.tuna.tsinghua.edu.cn/simple

2.2 Changes to the torch package

torch 1.3 has neither SiLU nor Hardswish, so simple implementations have to be added by hand.

Modify the following two files as shown:

/torch/venv3/pytorch/lib/python3.6/site-packages/torch/nn/modules/__init__.py

diff --git a/torch/venv3/pytorch/lib/python3.6/site-packages/torch/nn/modules/__init__.py b/__init__.py
index c3552e4..3f440d0 100644
--- a/torch/venv3/pytorch/lib/python3.6/site-packages/torch/nn/modules/__init__.py
+++ b/__init__.py
@@ -4,7 +4,7 @@ from .conv import Conv1d, Conv2d, Conv3d, \
     ConvTranspose1d, ConvTranspose2d, ConvTranspose3d
 from .activation import Threshold, ReLU, Hardtanh, ReLU6, Sigmoid, Tanh, \
     Softmax, Softmax2d, LogSoftmax, ELU, SELU, CELU, Hardshrink, LeakyReLU, LogSigmoid, \
-    Softplus, Softshrink, MultiheadAttention, PReLU, Softsign, Softmin, Tanhshrink, RReLU, GLU
+    Softplus, Softshrink, MultiheadAttention, PReLU, Softsign, Softmin, Tanhshrink, RReLU, GLU, SiLU, Hardswish
 from .loss import L1Loss, NLLLoss, KLDivLoss, MSELoss, BCELoss, BCEWithLogitsLoss, NLLLoss2d, \
     CosineEmbeddingLoss, CTCLoss, HingeEmbeddingLoss, MarginRankingLoss, \
     MultiLabelMarginLoss, MultiLabelSoftMarginLoss, MultiMarginLoss, \
@@ -53,5 +53,5 @@ __all__ = [
     'ConstantPad3d', 'Bilinear', 'CosineSimilarity', 'Unfold', 'Fold',
     'AdaptiveLogSoftmaxWithLoss', 'TransformerEncoder', 'TransformerDecoder',
     'TransformerEncoderLayer', 'TransformerDecoderLayer', 'Transformer',
-    'Flatten',
+    'Flatten',"SiLU", "Hardswish",
 ]

/torch/venv3/pytorch/lib/python3.6/site-packages/torch/nn/modules/activation.py

diff --git a/torch/venv3/pytorch/lib/python3.6/site-packages/torch/nn/modules/activation.py b/activation.py
index 6896dd7..f1022d9 100644
--- a/torch/venv3/pytorch/lib/python3.6/site-packages/torch/nn/modules/activation.py
+++ b/activation.py
@@ -1050,3 +1050,14 @@ class LogSoftmax(Module):

     def forward(self, input):
         return F.log_softmax(input, self.dim, _stacklevel=5)
+
+class Hardswish(Module):
+    @staticmethod
+    def forward(x):
+        return x * F.hardtanh(x+3, 0., 6.) / 6.
+
+class SiLU(Module):
+    @staticmethod
+    def forward(x):
+        return x * torch.sigmoid(x)
+
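
To see what the two added activations compute, the expressions from the patch can be written for plain Python floats. This is only an illustrative sketch using the standard library and is not part of the patch. The function names hardtanh, hardswish and silu are chosen here for illustration.

```python
import math


def hardtanh(x, min_val, max_val):
    """Clamp x to the range [min_val, max_val], like F.hardtanh."""
    return max(min_val, min(max_val, x))


def hardswish(x):
    """x * clamp(x + 3, 0, 6) / 6, the same expression as in the patch."""
    return x * hardtanh(x + 3.0, 0.0, 6.0) / 6.0


def silu(x):
    """x * sigmoid(x), the same expression as in the patch."""
    return x * (1.0 / (1.0 + math.exp(-x)))
```

Hardswish is zero for x <= -3 and equals x for x >= 3. SiLU is zero at x = 0 and approaches x for large positive inputs.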

2.3 YOLOX model changes

To make better use of the MLU device, the YOLOX model source needs a few minor adjustments.

yolox/models/yolo_head.py

diff --git a/yolox/models/yolo_head.py b/yolox/models/yolo_head.py
index d67abd1..57c5b21 100644
--- a/yolox/models/yolo_head.py
+++ b/yolox/models/yolo_head.py
@@ -34,8 +34,7 @@ class YOLOXHead(nn.Module):

         self.n_anchors = 1
         self.num_classes = num_classes
-        self.decode_in_inference = True  # for deploy, set to False
-
+        self.decode_in_inference = False  # for deploy, set to False
         self.cls_convs = nn.ModuleList()
         self.reg_convs = nn.ModuleList()
         self.cls_preds = nn.ModuleList()

Control flow is unfriendly to MLU devices, so we follow the TensorRT approach here: set decode_in_inference to False and run the decode operations explicitly on the CPU afterwards.
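
A minimal NumPy sketch of such a CPU-side decode is given here for orientation. It assumes the raw head output has shape (batch, anchors, 5 + num_classes), that the first four channels hold grid-relative centre offsets and log-scale sizes, and that the strides are 8, 16 and 32. The function name decode_outputs_cpu is hypothetical, and the arithmetic follows the decode step in the upstream YOLOX head rather than anything specified in this post.

```python
import numpy as np


def decode_outputs_cpu(outputs, input_hw, strides=(8, 16, 32)):
    """Map raw YOLOX head outputs to pixel-space (cx, cy, w, h, ...).

    outputs:  array of shape (batch, num_anchors, 5 + num_classes)
    input_hw: (height, width) of the network input, e.g. (640, 640)
    """
    grids, expanded_strides = [], []
    for stride in strides:
        hsize, wsize = input_hw[0] // stride, input_hw[1] // stride
        # Row-major grid of (x, y) cell indices for this feature level.
        yv, xv = np.meshgrid(np.arange(hsize), np.arange(wsize), indexing="ij")
        grid = np.stack((xv, yv), axis=2).reshape(1, -1, 2)
        grids.append(grid)
        expanded_strides.append(np.full((1, grid.shape[1], 1), stride))
    grids = np.concatenate(grids, axis=1).astype(outputs.dtype)
    expanded_strides = np.concatenate(expanded_strides, axis=1).astype(outputs.dtype)

    decoded = outputs.copy()
    # Box centres: cell offset plus cell index, scaled back to pixels.
    decoded[..., :2] = (outputs[..., :2] + grids) * expanded_strides
    # Box sizes: predicted in log space, scaled by the level's stride.
    decoded[..., 2:4] = np.exp(outputs[..., 2:4]) * expanded_strides
    return decoded
```

With a 640x640 input and these strides, the three feature levels contribute 80*80 + 40*40 + 20*20 = 8400 anchors, so outputs is expected to have 8400 rows per image.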

yolox/models/network_blocks.py

Because the implementation here differs from the one in newer torch versions, first change the SiLU arguments:

- module = nn.SiLU(inplace=inplace)

+ module = nn.SiLU()

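In addition, change the Focus implementation to speed up the split/concat step. Split, concat and similar operations are unfriendly to devices like the MLU, and an added conv can take their place.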

# Note: this version needs `import numpy as np` at the top of network_blocks.py.
class Focus(nn.Module):
    """Focus width and height information into channel space."""

    def __init__(self, in_channels, out_channels, ksize=1, stride=1, act="silu"):
        super().__init__()
        self.space_to_depth_conv = nn.Conv2d(3, in_channels * 4, 2, 2, 0, groups=1, bias=False)
        # Populate the weights of the newly added convolution (the original weight file has no entry for it).
        weight = [[1.,0.,0.,0.],[0.,0.,0.,0.],[0.,0.,0.,0.],
                  [0.,0.,0.,0.],[1.,0.,0.,0.],[0.,0.,0.,0.],
                  [0.,0.,0.,0.],[0.,0.,0.,0.],[1.,0.,0.,0.],
                  [0.,0.,1.,0.],[0.,0.,0.,0.],[0.,0.,0.,0.],
                  [0.,0.,0.,0.],[0.,0.,1.,0.],[0.,0.,0.,0.],
                  [0.,0.,0.,0.],[0.,0.,0.,0.],[0.,0.,1.,0.],
                  [0.,1.,0.,0.],[0.,0.,0.,0.],[0.,0.,0.,0.],
                  [0.,0.,0.,0.],[0.,1.,0.,0.],[0.,0.,0.,0.],
                  [0.,0.,0.,0.],[0.,0.,0.,0.],[0.,1.,0.,0.],
                  [0.,0.,0.,1.],[0.,0.,0.,0.],[0.,0.,0.,0.],
                  [0.,0.,0.,0.],[0.,0.,0.,1.],[0.,0.,0.,0.],
                  [0.,0.,0.,0.],[0.,0.,0.,0.],[0.,0.,0.,1.]]
        self.space_to_depth_conv.weight.data = torch.from_numpy(np.array(weight).reshape(12,3,2,2).astype(np.float32))
        self.conv =  Conv(in_channels * 4, out_channels, ksize, stride, act=act)

    def forward(self, x):
        x = self.space_to_depth_conv(x)
        return self.conv(x)
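
The replacement can be checked numerically without an MLU. The sketch below, in plain NumPy, builds the same one-hot kernel as the table in the class and compares a 2x2, stride-2 convolution against the slice-and-concat form. It assumes the original Focus concatenates the four sub-sampled slices in the order top-left, bottom-left, top-right, bottom-right, as the upstream YOLOX code does. All function names are made up for illustration.

```python
import numpy as np


def space_to_depth_weight(channels=3):
    """Build the one-hot (4 * channels, channels, 2, 2) kernel."""
    # Pixel offsets (row, col) in the order top-left, bottom-left,
    # top-right, bottom-right, matching the assumed concat order.
    offsets = [(0, 0), (1, 0), (0, 1), (1, 1)]
    weight = np.zeros((4 * channels, channels, 2, 2), dtype=np.float32)
    for k, (dh, dw) in enumerate(offsets):
        for c in range(channels):
            weight[k * channels + c, c, dh, dw] = 1.0
    return weight


def conv2x2_stride2(x, weight):
    """2x2 convolution with stride 2 and no padding, in plain NumPy."""
    n, c, h, w = x.shape
    patches = x.reshape(n, c, h // 2, 2, w // 2, 2)  # n, c, i, dh, j, dw
    return np.einsum("ocab,nciajb->noij", weight, patches)


def focus_slices(x):
    """Slice-and-concat form of the space-to-depth step."""
    return np.concatenate(
        (x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2], x[..., 1::2, 1::2]),
        axis=1,
    )
```

For any input x of shape (N, 3, H, W) with even H and W, np.allclose(conv2x2_stride2(x, space_to_depth_weight()), focus_slices(x)) is expected to hold, which is why the fixed-weight conv can stand in for the slicing.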

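All of the steps above are there to make better use of the MLU device and get the most out of it.

With the model source modified, we can run inference using the demo that ships with the repository.

For MLU inference with YOLOX, see: [YOLOX Deployment] Porting the open-source YOLOX model to the Cambricon 200-series platform (Part 2)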
