2022-12-27 08:27:50

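These notes cover only the specific functions that appear in the code, using the PyTorch implementation of YOLOv3.

The PyTorch YOLOv3 code is available on GitHub: https://github.com/eriklindernoren/PyTorch-YOLOv3

The focus is on the functions, not on the theory.

For how YOLOv3 works, please consult other references.

Link to the original paper: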

https://pjreddie.com/media/files/papers/YOLOv3.pdf

torch.cuda

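Tensor = torch.cuda.FloatTensor if torch.cuda.is_available() else torch.FloatTensor

This statement selects the tensor type, and with it whether computation runs on the GPU or the CPU: the CUDA float tensor if CUDA is available, otherwise the CPU float tensor.

torch.cuda.is_available()

This call returns a bool indicating whether CUDA is currently available.

torch.cuda is the package that adds support for CUDA tensor types.

It implements the same functions as the CPU tensors, but runs the computation on the GPU.

It is lazily initialized, so you can always import it and call is_available() to check whether the system supports CUDA.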
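
As a complement to the type alias above, here is a minimal sketch of the same GPU-or-CPU choice expressed with torch.device. The original snippet does not use torch.device, and the names device and x are illustrative assumptions.

```python
import torch

# Pick the device once: CUDA if it is available, otherwise the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create a tensor directly on the chosen device.
x = torch.zeros(2, 3, device=device)

print(x.device.type)  # "cuda" or "cpu", depending on the machine
```

On a machine without CUDA this falls back to the CPU without any code change.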

The Chinese PyTorch documentation has more details.

Link:

https://pytorch-cn.readthedocs.io/zh/latest/package_references/torch-cuda/

torch.Tensor

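torch.Tensor is a multi-dimensional matrix whose elements all share a single data type. It shows up in practically every PyTorch program.

Specifically, Torch defines seven CPU tensor types and eight GPU tensor types:

Data type	CPU tensor	GPU tensor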
32-bit floating point	torch.FloatTensor	torch.cuda.FloatTensor
64-bit floating point	torch.DoubleTensor	torch.cuda.DoubleTensor
16-bit floating point	N/A	torch.cuda.HalfTensor
8-bit integer (unsigned)	torch.ByteTensor	torch.cuda.ByteTensor
8-bit integer (signed)	torch.CharTensor	torch.cuda.CharTensor
16-bit integer (signed)	torch.ShortTensor	torch.cuda.ShortTensor
32-bit integer (signed)	torch.IntTensor	torch.cuda.IntTensor
64-bit integer (signed)	torch.LongTensor	torch.cuda.LongTensor

If no tensor type is specified, the default torch.FloatTensor type is used.

torch.Tensor is an alias for the default tensor type (torch.FloatTensor).
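
The default type can be checked with a short sketch. It assumes the default dtype has not been changed with torch.set_default_dtype; the names a and b are illustrative.

```python
import torch

a = torch.Tensor([1, 2, 3])  # legacy constructor: uses the default float type
b = torch.tensor([1, 2, 3])  # factory function: infers the dtype from the data

print(a.dtype)   # torch.float32
print(a.type())  # torch.FloatTensor
print(b.dtype)   # torch.int64
```

The lowercase torch.tensor function is shown only for contrast, since it infers an integer type from integer input.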

The Chinese PyTorch documentation has more details on Tensor.

Link:

https://pytorch-cn.readthedocs.io/zh/latest/package_references/Tensor/

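Variable()

input_imgs = Variable(input_imgs.type(Tensor))

Variable belongs to PyTorch's autograd package. It wraps a tensor so that operations on it are recorded in the computation graph. Since PyTorch 0.4, Variable has been merged into Tensor, so the wrapper is deprecated and simply returns a tensor.

A Variable is a wrapper around a Tensor and supports the same operations, but every Variable has three attributes:

.data: the underlying Tensor of the Variable
.grad: the gradient of that Tensor
.grad_fn: the function that produced this Variable

The Variable API is almost identical to the Tensor API (except for a few in-place methods that would overwrite inputs needed for gradient computation).

In most cases, code keeps working when Tensor is replaced with Variable.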
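
The three attributes can be seen on a plain tensor in current PyTorch. This is a minimal sketch that uses requires_grad=True instead of the Variable wrapper; the names x and y are illustrative.

```python
import torch

# A tensor that autograd should track.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# y = x1^2 + x2^2 + x3^2, so dy/dx = 2 * x.
y = (x ** 2).sum()
y.backward()

print(x.data)     # the underlying values
print(x.grad)     # tensor([2., 4., 6.])
print(y.grad_fn)  # the operation that produced y (a sum)
```

A leaf tensor such as x has no grad_fn of its own, because the user created it directly.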

The Chinese PyTorch documentation has more details on Variable.

Link:

https://pytorch-cn.readthedocs.io/zh/latest/package_references/torch-autograd/#variable

torchvision.datasets

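d = ImageFolder(opt.image_folder, img_size=opt.img_size)
print(d[0])

dataloader = DataLoader(
    ImageFolder(opt.image_folder, img_size=opt.img_size),
    batch_size=opt.batch_size,
    shuffle=False,
    num_workers=opt.n_cpu,
)

DataLoader is the batch-loading class from torch.utils.data, not part of torchvision. The ImageFolder used here accepts an img_size argument, so it appears to be the repository's own dataset class rather than torchvision.datasets.ImageFolder.

The code reads the images under the given path through the dataset and yields them in batches; the batch size, shuffling and number of worker processes are set by the arguments.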

torchvision.datasets

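torchvision.datasets includes the following datasets:

MNIST
COCO (Captioning and Detection)
LSUN Classification
ImageFolder
Imagenet-12
CIFAR10 and CIFAR100
STL10

Because all of these datasets are subclasses of torch.utils.data.Dataset, they can be loaded in parallel through torch.utils.data.DataLoader, which uses Python multiprocessing workers.

For example: torch.utils.data.DataLoader(coco_cap, batch_size=args.batchSize, shuffle=True, num_workers=args.nThreads)

The benefits of parallel loading are not covered in detail here. In short, worker processes prepare the next batches while the GPU computes, which narrows the gap between data-loading speed and GPU throughput and raises GPU utilization.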
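
The relationship between Dataset and DataLoader can be sketched with a toy dataset. SquaresDataset is a hypothetical class made up for this illustration, and num_workers=0 keeps loading in the main process so the sketch runs anywhere.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class SquaresDataset(Dataset):
    """Hypothetical toy dataset: item i is the pair (i, i * i)."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, idx):
        return torch.tensor(float(idx)), torch.tensor(float(idx * idx))


loader = DataLoader(SquaresDataset(10), batch_size=4, shuffle=False, num_workers=0)

# Ten items in batches of four give batches of size 4, 4 and 2.
batch_sizes = [len(inputs) for inputs, targets in loader]
print(batch_sizes)  # [4, 4, 2]
```

Raising num_workers above 0 would move the __getitem__ calls into worker processes, which is the parallel loading described above.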

The Chinese PyTorch documentation has more details on the datasets.

Link:

https://pytorch-cn.readthedocs.io/zh/latest/torchvision/torchvision-datasets/

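nn.Module

modules() returns an iterator over all modules in the current model.

with torch.no_grad():
    detections = model(input_imgs)
    detections = non_max_suppression(detections, opt.conf_thres, opt.nms_thres)

This code runs the YOLOv3 model on the input images.

It then applies non-maximum suppression (NMS, non_max_suppression) as a second processing step.

The NMS algorithm and its implementation are not covered here.

In the YOLOv3 code, model is an instance of the Darknet class.

Darknet is a relatively niche deep learning framework. It has no large community and is maintained mainly by its authors, so it is not widely adopted.

In PyTorch, nn.Module is the base class of all network layers.

To implement a layer of your own, subclass it.

For example: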

import torch.nn as nn
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.add_module("conv", nn.Conv2d(10, 20, 4))
        self.add_module("conv1", nn.Conv2d(20 ,10, 4))
model = Model()

for module in model.modules():
    print(module)

Output:

Model(
  (conv): Conv2d(10, 20, kernel_size=(4, 4), stride=(1, 1))
  (conv1): Conv2d(20, 10, kernel_size=(4, 4), stride=(1, 1))
)
Conv2d(10, 20, kernel_size=(4, 4), stride=(1, 1))
Conv2d(20, 10, kernel_size=(4, 4), stride=(1, 1))

As the output shows, modules() returns not only the submodules we added but also the Model itself. This is how it differs from children().
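
The difference between modules() and children() can be checked directly. This is a minimal sketch that reuses the two-layer model from the example, written with attribute assignment instead of add_module; module_names and child_names are illustrative.

```python
import torch.nn as nn


class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(10, 20, 4)
        self.conv1 = nn.Conv2d(20, 10, 4)


model = Model()

# modules() yields the model itself first, then every submodule.
module_names = [type(m).__name__ for m in model.modules()]
# children() yields only the direct submodules.
child_names = [type(m).__name__ for m in model.children()]

print(module_names)  # ['Model', 'Conv2d', 'Conv2d']
print(child_names)   # ['Conv2d', 'Conv2d']
```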

The Chinese PyTorch documentation has more details on modules.

Link:

https://pytorch-cn.readthedocs.io/zh/latest/package_references/torch-nn/#modules

eval()

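model.eval()  # Set in evaluation mode

eval() is a method of torch.nn.Module. The torch.nn package provides the building blocks for constructing and running PyTorch models.

This function puts the model into evaluation mode, which changes the behavior of layers such as Dropout and BatchNorm.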
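
The effect of evaluation mode can be seen on a single Dropout layer. This is a minimal sketch, not taken from the YOLOv3 code; the names drop, x and y are illustrative.

```python
import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)
x = torch.ones(8)

drop.eval()  # evaluation mode: dropout becomes the identity
with torch.no_grad():  # no gradient tracking during inference
    y = drop(x)

print(drop.training)      # False
print(torch.equal(y, x))  # True

drop.train()  # back to training mode
print(drop.training)      # True
```

eval() and torch.no_grad() are independent: the first changes layer behavior, the second switches off gradient bookkeeping.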

The Chinese PyTorch documentation has more details on eval.

Link:

https://pytorch-cn.readthedocs.io/zh/latest/package_references/torch-nn/#eval

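interpolate()

x = F.interpolate(x, scale_factor=self.scale_factor, mode=self.mode)

interpolate is a function in torch.nn.functional. In the YOLOv3 code it resamples feature maps inside the network.

It down/up samples the input to either the given size or the given scale_factor.

The input dimensions are interpreted in the form: mini-batch x channels x [optional depth] x [optional height] x width.

Temporal, spatial and volumetric sampling are currently supported, i.e. the expected inputs are 3-D, 4-D or 5-D in shape.

The modes available for resizing are: nearest, linear (3D-only), bilinear, bicubic (4D-only), trilinear (5D-only), area.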
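
A small sketch of upsampling with scale_factor, assuming a 4-D input (batch x channels x height x width) and nearest-neighbor mode; the tensor values are made up for illustration.

```python
import torch
import torch.nn.functional as F

# One image, one channel, 2 x 2 pixels.
x = torch.arange(4.0).reshape(1, 1, 2, 2)

# Double the height and width; each pixel is repeated in a 2 x 2 block.
up = F.interpolate(x, scale_factor=2, mode="nearest")

print(up.shape)  # torch.Size([1, 1, 4, 4])
print(up[0, 0])
# tensor([[0., 0., 1., 1.],
#         [0., 0., 1., 1.],
#         [2., 2., 3., 3.],
#         [2., 2., 3., 3.]])
```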

The official PyTorch documentation has more details on interpolate.

Link:

https://pytorch.org/docs/stable/nn.functional.html?highlight=interpolate

enumerate(dataloader)

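for batch_i, (img_paths, input_imgs) in enumerate(dataloader):

enumerate() is a Python built-in function.

Here it iterates over the batches produced by dataloader while keeping track of the batch index.

enumerate() takes an iterable object (such as a list, tuple or string) and yields pairs of an index and the corresponding item.

It is generally used in a for loop.

It returns an enumerate object.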
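
The same unpacking pattern can be tried without a dataloader. In this sketch the list batches stands in for the dataloader, and its contents are made-up placeholders.

```python
# Each item mimics one (img_paths, input_imgs) pair from the dataloader.
batches = [("a.jpg", "tensor_a"), ("b.jpg", "tensor_b")]

seen = []
for batch_i, (img_path, input_img) in enumerate(batches):
    # batch_i is the running index, the tuple is unpacked in the same step.
    seen.append((batch_i, img_path))

print(seen)  # [(0, 'a.jpg'), (1, 'b.jpg')]
```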

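plt methods

get_cmap()

cmap = plt.get_cmap("tab20b")

get_cmap() returns a colormap, a mapping from numbers to colors.

In the YOLOv3 code it sets the colors of the bounding boxes drawn later.

figure()

plt.figure()

From the figure documentation: The Figure instance returned will also be passed to new_figure_manager in the backends, which allows to hook custom Figure classes into the pylab interface. Additional kwargs will be passed to the figure init function

A rough reading: the returned Figure instance is also handed to new_figure_manager in the backend, which lets custom Figure classes hook into the pylab interface, and any extra keyword arguments are forwarded to the Figure initializer.

In practice it can be understood as returning a figure instance, which YOLOv3 uses to display and save the image.

cmap = plt.get_cmap("tab20b")

colors = [cmap(i) for i in np.linspace(0, 1, 20)]

This gets the colormap named "tab20b".

It then samples 20 colors from cmap.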
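
What the list of colors contains can be inspected with a short sketch. It assumes matplotlib and numpy are installed; selecting the Agg backend is an addition made here so that no display is needed.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

cmap = plt.get_cmap("tab20b")

# Calling the colormap with a number in [0, 1] returns an RGBA tuple.
colors = [cmap(i) for i in np.linspace(0, 1, 20)]

print(len(colors))     # 20
print(len(colors[0]))  # 4 (red, green, blue, alpha)
```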

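def get_cmap(name=None, lut=None):
    """
    Get a colormap instance, defaulting to rc values if *name* is None.

    Colormaps added with :func:`register_cmap` take precedence over
    built-in colormaps.

    If *name* is a :class:`matplotlib.colors.Colormap` instance, it will be
    returned.

    If *lut* is not None it must be an integer giving the number of
    entries desired in the lookup table, and *name* must be a standard
    mpl colormap name.
    """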

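bbox_colors = random.sample(colors, n_cls_preds)

random.sample returns a new list of the requested length whose elements are picked at random from the input list, without repeats and without changing the order of the input list itself.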
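
A minimal sketch of random.sample on a plain list. The color names and the seed are illustrative, and the exact selection depends on the random state.

```python
import random

colors = ["red", "green", "blue", "orange", "purple"]
original = list(colors)

random.seed(0)  # fixed seed only to make repeated runs comparable
picked = random.sample(colors, 3)

print(len(picked))         # 3
print(colors == original)  # True: the input list is left untouched
```

Asking for more elements than the list holds raises ValueError, so n_cls_preds must not exceed the number of colors.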

Reference: https://www.cnblogs.com/fish-101/p/11339909.html

numpy.argsort()

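image_pred = image_pred[(-score).argsort()]

numpy.argsort() returns the indices that would sort the array values in ascending order. Negating score here means the rows of image_pred end up ordered from the highest score to the lowest.

argsort() is one of NumPy's sorting and conditional filtering functions. NumPy provides several sorting algorithms.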
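
The descending-order trick can be traced on three made-up scores. This sketch uses NumPy arrays; the values are illustrative.

```python
import numpy as np

score = np.array([0.2, 0.9, 0.5])

# Ascending order of the negated scores is descending order of the scores.
order = (-score).argsort()
print(order)         # [1 2 0]
print(score[order])  # [0.9 0.5 0.2]
```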

timedelta()

inference_time = datetime.timedelta(seconds=current_time - prev_time)

timedelta is a class in the datetime module; a timedelta object represents the difference between two times.

Constructor: datetime.timedelta(days=0, seconds=0, microseconds=0, milliseconds=0, minutes=0, hours=0, weeks=0)

All parameters are optional and default to 0.
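
The timing pattern from the snippet can be reproduced with the standard library alone. In this sketch the summation is a stand-in for the model's forward pass.

```python
import datetime
import time

prev_time = time.time()
total = sum(range(100_000))  # placeholder for the work being timed
current_time = time.time()

# Wrap the elapsed seconds in a timedelta for readable printing.
inference_time = datetime.timedelta(seconds=current_time - prev_time)
print(inference_time)

# Keyword arguments are normalized, so these two are equal.
ninety = datetime.timedelta(seconds=90)
print(ninety)  # 0:01:30
print(ninety == datetime.timedelta(minutes=1, seconds=30))  # True
```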

where()

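color = bbox_colors[int(np.where(unique_labels == int(cls_pred))[0])]

This statement finds the position of the predicted class cls_pred within unique_labels and uses that index to pick the box color for that class from bbox_colors.

where() returns elements chosen from x or y depending on condition.

What where() returns depends on which arguments are given.

Parameters:	condition: array_like, bool
	Where True, yield x, otherwise yield y.
	x, y: array_like, optional
	Values from which to choose. x, y and condition need to be broadcastable to some shape.
Returns:	out: ndarray or tuple of ndarrays
	If both x and y are specified, the output array contains elements of x where condition is True and elements of y elsewhere. If only condition is given, the tuple condition.nonzero() is returned, the indices where condition is True.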
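
Both call forms can be tried on a small array. This sketch uses NumPy arrays with made-up labels; the names idx, index and chosen are illustrative.

```python
import numpy as np

unique_labels = np.array([0, 2, 5])
cls_pred = 2.0

# Condition only: a tuple of index arrays, one per dimension.
idx = np.where(unique_labels == int(cls_pred))[0]
index = int(idx[0])
print(index)  # 1

# Condition plus x and y: pick from x where True, from y elsewhere.
chosen = np.where(unique_labels > 1, unique_labels, -1)
print(chosen)  # [-1  2  5]
```

Here index is the position of class 2 in unique_labels, which is the value used to look up the color.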

Other methods

boxes[:, 0] = ((boxes[:, 0] - pad_x // 2) / unpad_w) * orig_w

The // operator is floor division in Python.
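
A quick sketch of how // differs from /, using made-up numbers:

```python
pad_x = 7

half = pad_x // 2     # floor division: the result is rounded down
true_half = pad_x / 2  # true division: the result is a float

print(half)       # 3
print(true_half)  # 3.5
print(-7 // 2)    # -4, because the result is rounded toward negative infinity
```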

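unique_labels = detections[:, -1].cpu().unique()

This is a chain of tensor operations, for which the IDE shows these stubs:

def unique(self):  # real signature unknown; restored from __doc__
    """unique(self: torch._C.Value) -> int"""
    return 0

def cpu(self) -> Tensor: ...

unique_labels = value[:, -1].cpu().unique()

.cpu() returns a copy of the tensor in CPU memory; it is needed because this code places its tensors on the GPU when one is available.

unique() returns the distinct values in the tensor with duplicates removed, sorted in ascending order by default.

value[:, -1] takes the last element of every row, i.e. the last column.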
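
The whole chain can be traced on a small tensor. In this sketch the detections values are made up, with the last column playing the role of the class label.

```python
import torch

# Three detections; the last column holds the class label.
detections = torch.tensor([
    [0.1, 0.9, 2.0],
    [0.4, 0.8, 0.0],
    [0.3, 0.7, 2.0],
])

labels = detections[:, -1]            # last column of every row
unique_labels = labels.cpu().unique()  # move to CPU, then drop duplicates

print(labels)         # tensor([2., 0., 2.])
print(unique_labels)  # tensor([0., 2.])
```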

Analysis of selected code statements in the PyTorch version of YOLOv3
