

7. Neural Networks

7.1 Basic Skeleton of a Neural Network: Using nn.Module

Official documentation

The code is shown below. Note that the forward method must be overridden.

Python
import torch
from torch import nn

class Howjul(nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, input):
        output = input + 1
        return output

howjul = Howjul()
x = torch.tensor(1.0)
output = howjul(x)
print(output)

7-1-nn_Module.py
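Note that calling howjul(x) runs forward through nn.Module's __call__ machinery, so forward is never invoked directly; here the printed result is tensor(2.).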

7.2 Convolution Operations

Official documentation

The difference between torch.nn and torch.nn.functional:

  • torch.nn: sufficient for everyday use, since the operations are already wrapped up in modules;
  • torch.nn.functional: finer-grained operations, useful for understanding what happens inside the network functions. They correspond one-to-one with the classes in torch.nn. For the sake of explanation, this section uses torch.nn.functional (see the sketch below).
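A minimal sketch (with made-up tensor sizes) of the relationship between the two: an nn.Conv2d module owns a learnable weight and bias and calls the stateless F.conv2d under the hood.

Python
import torch
from torch import nn
import torch.nn.functional as F

x = torch.randn(1, 3, 8, 8)            # (N, C, H, W)
conv = nn.Conv2d(3, 6, kernel_size=3)  # the module holds weight/bias as Parameters
out_module = conv(x)
# the functional form takes the weight and bias explicitly
out_functional = F.conv2d(x, conv.weight, conv.bias, stride=1)
print(torch.allclose(out_module, out_functional))  # True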

Here we take torch.nn.functional.conv2d as an example. Its parameters are as follows:


  • input: the input, which must be a four-dimensional tensor (N, C, H, W)
  • weight: the convolution kernel, which is also a four-dimensional tensor
  • stride: the step size, i.e., how far the kernel moves each time; a tuple (1, 2) means a stride of 1 along the height and 2 along the width
  • padding: padding added to the edges of the input
Python
import torch
import torch.nn.functional as F


input = torch.tensor([[1, 2, 0, 3, 1],
                      [0, 1, 2, 3, 1],
                      [1, 2, 1, 0, 0],
                      [5, 2, 3, 1, 1],
                      [2, 1, 0, 1, 1]])

kernel = torch.tensor([[1, 2, 1],
                       [0, 1, 0],
                       [2, 1, 0]])

input = torch.reshape(input, (1, 1, 5, 5))    # (N, C, H, W)
kernel = torch.reshape(kernel, (1, 1, 3, 3))  # (out_channels, in_channels, kH, kW)

output = F.conv2d(input, kernel, stride=1)
print(output)

output2 = F.conv2d(input, kernel, stride=2)
print(output2)

output3 = F.conv2d(input, kernel, stride=1, padding=1)
print(output3)
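For reference, the first two results are:

Text Only
tensor([[[[10, 12, 12],
          [18, 16, 16],
          [13,  9,  3]]]])
tensor([[[[10, 12],
          [13,  3]]]])

With padding=1, the third output keeps the 5×5 spatial size of the input.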

7-2-Conv_operation.py

7.3 Convolutional Layers

Take torch.nn.Conv2d as an example; it is the most commonly used layer for processing two-dimensional images. Its parameters are:

  • in_channels (int) – Number of channels in the input image
  • out_channels (int) – Number of channels produced by the convolution
  • If out_channels is greater than 1 while in_channels is 1, multiple kernels are created, and each one is convolved with the input to produce one output channel (see the check after this list)
  • kernel_size (int or tuple) – Size of the convolving kernel
  • stride (int or tuple, optional) – Stride of the convolution. Default: 1
  • padding (int, tuple or str, optional) – Padding added to all four sides of the input. Default: 0
  • padding_mode (str, optional) – 'zeros', 'reflect', 'replicate' or 'circular'. Default: 'zeros'
  • dilation (int or tuple, optional) – Spacing between kernel elements. Default: 1
  • groups (int, optional) – Number of blocked connections from input channels to output channels. Default: 1
  • bias (bool, optional) – If True, adds a learnable bias to the output. Default: True
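A quick check (with made-up sizes) of the point above: the layer's weight has shape (out_channels, in_channels, kH, kW), i.e., one kernel per output channel.

Python
import torch
from torch import nn

conv = nn.Conv2d(in_channels=1, out_channels=2, kernel_size=3)
print(conv.weight.shape)  # torch.Size([2, 1, 3, 3])

Now the full example on the CIFAR10 dataset: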
Python
import torch
import torchvision
from torch import nn
from torch.nn import Conv2d
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter

dataset = torchvision.datasets.CIFAR10("./Dataset",
                                       train=False,
                                       transform=torchvision.transforms.ToTensor(),
                                       download=True)
dataloader = DataLoader(dataset, batch_size=64)


class Howjul(nn.Module):
    def __init__(self):
        super(Howjul, self).__init__()
        self.conv1 = Conv2d(in_channels=3, out_channels=6, kernel_size=3, stride=1, padding=0)

    def forward(self, x):
        x = self.conv1(x)
        return x


howjul = Howjul()
print(howjul)

writer = SummaryWriter("logs")
step = 0
for data in dataloader:
    imgs, targets = data
    output = howjul(imgs)
    print(imgs.shape)
    print(output.shape)
    # torch.Size([64, 3, 32, 32])
<div markdown="1" style="margin-top: -30px; font-size: 0.75em; opacity: 0.7;">
:material-circle-edit-outline:  1435 个字 :fontawesome-solid-code: 406 行代码 :material-clock-time-two-outline: 预计阅读时间 10 分钟
</div>
    writer.add_images("input", imgs, step)
    # torch.Size([64, 6, 30, 30]) -> [xxx, 3, 30, 30]
    # TensorBoard cannot display 6-channel images, so reshape first
    # the first value of the target shape can be -1; it is then computed automatically
    output = torch.reshape(output, (-1, 3, 30, 30))
    writer.add_images("output", output, step)
    step = step + 1

writer.close()

7-3-Conv.py

7.4 Using Max Pooling

Take torch.nn.MaxPool2d as an example. Its parameters are:

  • kernel_size (Union[int, Tuple[int, int]]) – the size of the window to take a max over
  • stride (Union[int, Tuple[int, int]]) – the stride of the window. Default value is kernel_size
  • padding (Union[int, Tuple[int, int]]) – Implicit negative infinity padding to be added on both sides
  • dilation (Union[int, Tuple[int, int]]) – a parameter that controls the stride of elements in the window
  • return_indices (bool) – if True, will return the max indices along with the outputs. Useful for torch.nn.MaxUnpool2d later
  • ceil_mode (bool) – when True, will use ceil instead of floor to compute the output shape
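The output size follows the same formula as for convolution (from the official docs); when ceil_mode is True, the floor is replaced by a ceil:

\[H_{out} = \left\lfloor \frac{H_{in} + 2 \times \text{padding} - \text{dilation} \times (\text{kernel\_size} - 1) - 1}{\text{stride}} + 1 \right\rfloor\]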

Max pooling works as in the following example: within each 3×3 window of the original image, only the largest pixel is kept. If ceil_mode is False, leftover regions smaller than 3×3 are discarded; if True, they are pooled as partial windows.


Python
import torch
from torch import nn
from torch.nn import MaxPool2d

input = torch.tensor([[1, 2, 0, 3, 1],
                      [0, 1, 2, 3, 1],
                      [1, 2, 1, 0, 0],
                      [5, 2, 3, 1, 1],
                      [2, 1, 0, 1, 1]], dtype=torch.float32)

input = torch.reshape(input, (-1, 1, 5, 5))
print(input.shape)


class Howjul(nn.Module):
    def __init__(self):
        super(Howjul, self).__init__()
        self.maxpool1 = MaxPool2d(kernel_size=3, ceil_mode=True)

    def forward(self, input):
        output = self.maxpool1(input)
        return output


howjul = Howjul()
output = howjul(input)
print(output)
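The expected output is:

Text Only
torch.Size([1, 1, 5, 5])
tensor([[[[2., 3.],
          [5., 1.]]]])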

The point of max pooling: reduce the amount of data while preserving its salient features.

Combining this with the earlier pieces, we can process a whole dataset:

Python
import torch
import torchvision
from torch import nn
from torch.nn import MaxPool2d
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter

dataset = torchvision.datasets.CIFAR10("./Dataset",
                                       train=False,
                                       transform=torchvision.transforms.ToTensor(),
                                       download=True)
dataloader = DataLoader(dataset, batch_size=64)


class Howjul(nn.Module):
    def __init__(self):
        super(Howjul, self).__init__()
        self.maxpool1 = MaxPool2d(kernel_size=3, ceil_mode=True)

    def forward(self, input):
        output = self.maxpool1(input)
        return output


howjul = Howjul()

writer = SummaryWriter("logs_maxpool")
step = 0

for data in dataloader:
    imgs, labels = data
    writer.add_images("Input", imgs, step)
    output = howjul(imgs)
    writer.add_images("Output", output, step)
    step = step + 1

writer.close()

Viewed in TensorBoard, the pooling layer produces a mosaic-like effect on the images.

7-4-nn_Pooling.py

7-4-nn_Pooling_dataset.py

7.5 Non-linear Activations

The most common ones are torch.nn.ReLU and torch.nn.Sigmoid; the code demo uses ReLU as an example.
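Both are simple element-wise functions:

\[\mathrm{ReLU}(x) = \max(0, x), \qquad \mathrm{Sigmoid}(x) = \frac{1}{1 + e^{-x}}\]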

Python
import torch
from torch import nn
from torch.nn import ReLU

input = torch.tensor([[1, -0.5],
                      [-1, 3]])

output = torch.reshape(input, (-1, 1, 2, 2))  # only to demonstrate the 4-D shape; ReLU itself works element-wise on any shape
print(output.shape)


class Howjul(nn.Module):
    def __init__(self):
        super(Howjul, self).__init__()
        self.relu1 = ReLU(inplace=False) # do not replace the input in place

    def forward(self, x):
        output = self.relu1(x)
        return output


howjul = Howjul()
output = howjul(input)
print(output)

The output is:

Text Only
torch.Size([1, 1, 2, 2])
tensor([[1., 0.],
        [0., 3.]])

Next, we use the Sigmoid function to transform the dataset:

Python
import torch
import torchvision
from torch import nn
from torch.nn import ReLU, Sigmoid
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter

dataset = torchvision.datasets.CIFAR10("./Dataset",
                                       train=False,
                                       transform=torchvision.transforms.ToTensor(),
                                       download=True)
dataloader = DataLoader(dataset, batch_size=64)


class Howjul(nn.Module):
    def __init__(self):
        super(Howjul, self).__init__()
        self.relu1 = ReLU(inplace=False)  # do not replace the input in place
        self.sigmoid1 = Sigmoid()

    def forward(self, x):
        output = self.sigmoid1(x)
        return output


howjul = Howjul()

writer = SummaryWriter("logs_relu")
step = 0
for data in dataloader:
    imgs, labels = data
    writer.add_images("Input", imgs, step)
    output = howjul(imgs)
    writer.add_images("Output", output, step)
    step = step + 1
writer.close()

The sigmoid-transformed images can then be compared against the inputs in TensorBoard.

7-5-nn_Relu.py

7-5-nn_Relu_dataset.py

7.6 Linear Layers and Other Layers

Other layers, as well as layers for specific network structures, are listed in the official documentation.


Linear Layers

Python
torch.nn.Linear(in_features, out_features, bias=True, device=None, dtype=None)
\[g_1 = k_1 \times x_1 + k_2 \times x_2 + \dots + b_1\]

Here \(b\) is the bias from the signature above (one per output), \(g\) is the output, and \(x\) is the input: each output is a weighted sum of all the inputs plus a bias.
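A minimal check (with made-up sizes) that nn.Linear computes exactly \(y = x W^T + b\):

Python
import torch
from torch import nn

lin = nn.Linear(3, 2)  # in_features=3, out_features=2
x = torch.randn(4, 3)  # a batch of 4 samples
y_module = lin(x)
y_manual = x @ lin.weight.T + lin.bias
print(torch.allclose(y_module, y_manual))  # True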

(figure: diagram of a fully connected network)

The weight \(k\) and bias \(b\) for each input \(x\) are initialized by sampling from \(\mathcal{U}(-\sqrt{1/\text{in\_features}}, \sqrt{1/\text{in\_features}})\) (per the official documentation) and are then learned during training.

Take vgg16 as an example: at the very end, the 4096 features are mapped down to 1000, which is exactly a linear-layer transformation.

(figure: the VGG-16 architecture)

Python
import torch
import torchvision
from torch import nn
from torch.nn import Linear
from torch.utils.data import DataLoader

dataset = torchvision.datasets.CIFAR10("./Dataset", train=False, transform=torchvision.transforms.ToTensor(), download=True)
dataloader = DataLoader(dataset, batch_size=64, drop_last=True)  # drop the smaller final batch so the flattened size is always 196608


class Howjul(nn.Module):
    def __init__(self):
        super(Howjul,self).__init__()
        self.linear1 = Linear(196608, 10)

    def forward(self, x):
        x = self.linear1(x)
        return x


howjul = Howjul()

for imgs, labels in dataloader:
    print(imgs.shape)
    # output = torch.reshape(imgs, (1, 1, 1, -1))
    output = torch.flatten(imgs)  # flatten the whole batch into one row: 64*3*32*32 = 196608 values
    print(output.shape)
    output = howjul(output)
    print(output.shape)

7-6-nn_Linear.py

7.7 Building a Small Network and Using Sequential

torch.nn.Sequential is a container class that is simpler to use than subclassing torch.nn.Module by hand. The following official examples show the idea:

Python
# Using Sequential to create a small model
# (assumes `from torch import nn` and `from collections import OrderedDict`)
model = nn.Sequential(
          nn.Conv2d(1,20,5),
          nn.ReLU(),
          nn.Conv2d(20,64,5),
          nn.ReLU()
        )

# Using Sequential with OrderedDict.
model = nn.Sequential(OrderedDict([
          ('conv1', nn.Conv2d(1,20,5)),
          ('relu1', nn.ReLU()),
          ('conv2', nn.Conv2d(20,64,5)),
          ('relu2', nn.ReLU())
        ]))

Next, we implement the network structure shown below (the CIFAR10-quick model).

(figure: structure of the CIFAR10-quick model)

For the first convolutional layer, the input has 3 channels, the output has 32 channels, and the kernel size is 5. To keep the spatial size at 32x32, we solve the output-shape formula below. Since we are not using dilated convolution, dilation is 1; the unknowns are padding and stride, and if we set stride to 1, then padding must be 2.

\[H_{out} = \left\lfloor \frac{H_{in} + 2 \times \text{padding} - \text{dilation} \times (\text{kernel\_size} - 1) - 1}{\text{stride}} + 1 \right\rfloor\]

Substituting \(H_{out} = H_{in} = 32\), kernel_size = 5, dilation = 1, and stride = 1 gives \(32 = 32 + 2 \times \text{padding} - 4 - 1 + 1\), hence padding = 2.

The final network definition is as follows:

Python
class Howjul(nn.Module):
    def __init__(self):
        super(Howjul, self).__init__()
        self.conv1 = Conv2d(3, 32, 5, padding=2)
        self.maxpool1 = MaxPool2d(2)
        self.conv2 = Conv2d(32, 32, 5, padding=2)
        self.maxpool2 = MaxPool2d(2)
        self.conv3 = Conv2d(32, 64, 5, padding=2)
        self.maxpool3 = MaxPool2d(2)
        self.flatten = Flatten()
        self.linear1 = Linear(1024, 64)
        self.linear2 = Linear(64, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = self.maxpool1(x)
        x = self.conv2(x)
        x = self.maxpool2(x)
        x = self.conv3(x)
        x = self.maxpool3(x)
        x = self.flatten(x)
        x = self.linear1(x)
        x = self.linear2(x)
        return x
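Note that the 1024 in Linear(1024, 64) comes from the shape after the third pooling: the 32x32 image is halved three times (32 → 16 → 8 → 4), leaving 64 channels × 4 × 4 = 1024 features after flattening.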

You can also use Sequential, which makes the code more concise:

Python
class Howjul(nn.Module):
    def __init__(self):
        super(Howjul, self).__init__()
        self.model1 = Sequential(
            Conv2d(3, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 64, 5, padding=2),
            MaxPool2d(2),
            Flatten(),
            Linear(1024, 64),
            Linear(64, 10)
        )

    def forward(self, x):
        x = self.model1(x)
        return x

Then TensorBoard can be used to visualize the model:

Python
howjul = Howjul()
input = torch.ones((64, 3, 32, 32))
output = howjul(input)
print(output.shape)

writer = SummaryWriter("logs_seq")
writer.add_graph(howjul, input)
writer.close()

The resulting computation graph can then be inspected in TensorBoard.

7-7-nn_Sequential.py

7.8 Loss Functions and Backpropagation

A loss function serves two purposes: it measures the gap between the actual output and the target, and it provides the basis for updating the weights (backpropagation), e.g., by computing gradients.

See the Loss Functions section of the official documentation. When using a loss function, just pay attention to your needs and to the expected input and output shapes.

L1Loss: sums or averages the element-wise absolute differences.

Python
inputs = torch.tensor([1, 2, 3], dtype=torch.float32)
targets = torch.tensor([1, 2, 5], dtype=torch.float32)

inputs = torch.reshape(inputs, (1, 1, 1, 3))
targets = torch.reshape(targets, (1, 1, 1, 3))

loss = L1Loss(reduction='mean')
result = loss(inputs, targets)    # result: 0.6667

loss2 = L1Loss(reduction='sum')
result2 = loss2(inputs, targets)  # result: 2.0

MSELoss: sums or averages the squared element-wise differences.

Python
loss3 = MSELoss(reduction='mean')
result = loss3(inputs, targets) # result: 1.3333

CrossEntropyLoss: useful when training a classification problem. It is computed as:

\[\text{loss}(x, \text{class}) = -x[\text{class}] + \log\left(\sum_j \exp(x[j])\right)\]

Python
x = torch.tensor([0.1, 0.2, 0.3])
y = torch.tensor([1])
x = torch.reshape(x, (1, 3))
loss4 = CrossEntropyLoss()
result = loss4(x, y) # result: 1.1019
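Checking by hand: \(-0.2 + \log(e^{0.1} + e^{0.2} + e^{0.3}) \approx -0.2 + \log(3.6764) \approx 1.1019\), which matches.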

The following is a concrete example: read in the CIFAR10 dataset, output a score for each class, and compare against the actual label.

Python
import torchvision
from torch import nn
from torch.nn import Sequential, Conv2d, MaxPool2d, Flatten, Linear
from torch.utils.data import DataLoader

dataset = torchvision.datasets.CIFAR10("./Dataset", train=False, transform=torchvision.transforms.ToTensor(), download=True)
dataloader = DataLoader(dataset, batch_size=1)


class Howjul(nn.Module):
    def __init__(self):
        super(Howjul, self).__init__()
        self.model1 = Sequential(
            Conv2d(3, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 64, 5, padding=2),
            MaxPool2d(2),
            Flatten(),
            Linear(1024, 64),
            Linear(64, 10)
        )

    def forward(self, x):
        x = self.model1(x)
        return x


loss = nn.CrossEntropyLoss()
howjul = Howjul()
for data in dataloader:
    imgs, labels = data
    output = howjul(imgs)
    result_loss = loss(output, labels)
    print(result_loss)
    result_loss.backward()  # computes the gradients

Each iteration prints the cross-entropy loss of one image as a scalar tensor.

7-8-LossFunc.py

7-8-LossFunc_network.py

7.9 Optimizers

PyTorch's optimizers live in torch.optim; see the official documentation.

An optimizer is used as follows: first zero the gradients, then pass the input through the model, compute the loss between the output and the target, call backward to compute the gradients, and finally call the optimizer's step function to update the parameters.

Python
for input, target in dataset:
    optimizer.zero_grad()
    output = model(input)
    loss = loss_fn(output, target)
    loss.backward()
    optimizer.step()

Each concrete optimization algorithm has its own details to study, but in general you give it the list of parameters to optimize and a learning rate.
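For example (from the official docs, where var1 and var2 stand for parameter tensors):

Python
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
optimizer = torch.optim.Adam([var1, var2], lr=0.0001)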

Here is a code sample:

Python
import torch
import torchvision
from torch import nn
from torch.nn import Sequential, Conv2d, MaxPool2d, Flatten, Linear
from torch.utils.data import DataLoader

dataset = torchvision.datasets.CIFAR10("./Dataset", train=False, transform=torchvision.transforms.ToTensor(), download=True)
dataloader = DataLoader(dataset, batch_size=1)


class Howjul(nn.Module):
    def __init__(self):
        super(Howjul, self).__init__()
        self.model1 = Sequential(
            Conv2d(3, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 64, 5, padding=2),
            MaxPool2d(2),
            Flatten(),
            Linear(1024, 64),
            Linear(64, 10)
        )

    def forward(self, x):
        x = self.model1(x)
        return x


loss = nn.CrossEntropyLoss()
howjul = Howjul()
optim = torch.optim.SGD(howjul.parameters(), lr=0.01)
for epoch in range(20):
    running_loss = 0.0
    for data in dataloader:
        imgs, labels = data
        output = howjul(imgs)
        result_loss = loss(output, labels)
        optim.zero_grad()       # clear the old gradient information
        result_loss.backward()  # compute the gradients
        optim.step()            # optimize each parameter
        running_loss += result_loss.item()  # .item() extracts the scalar so the graph is not kept alive
    print("Epoch: {}/{} Loss: {}".format(epoch+1, 20, running_loss))

The results are as follows; the total loss keeps decreasing:

Text Only
Files already downloaded and verified
Epoch: 1/20 Loss: 18687.427734375
Epoch: 2/20 Loss: 16134.6591796875
Epoch: 3/20 Loss: 15481.9697265625

7-9-optim.py

Reference: Enl_Z, 小土堆 tutorial
