This post records notes on unsupervised learning.

I. Supervised Learning

  Think of the earlier classification and regression problems: for classification, a human has to write the label in advance, telling the model that a given image shows the digit 9. Such labeling can be extremely time-consuming and labor-intensive, while the real world contains vast amounts of unlabeled data; putting that data to use is what unsupervised learning has always aimed to do.

II. Auto-encoder

  The goal of unsupervised learning here is to reconstruct the input itself: the input doubles as the training target, so no labels are needed.

III. Training in Unsupervised Learning

  The loss function is the same as in supervised learning: cross-entropy, mean squared error, and so on all work, and the training procedure is identical; the only difference is that the target is the input itself rather than a label.
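
  A minimal sketch of one such training step (the tiny model and the names here are illustrative, not from the original code):

import torch
from torch import nn

# toy encoder/decoder: 784 -> 20 -> 784
model = nn.Sequential(nn.Linear(784, 20), nn.ReLU(), nn.Linear(20, 784))
criteon = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.rand(32, 784)   # a batch of flattened images
x_hat = model(x)          # reconstruction
loss = criteon(x_hat, x)  # the target is the input itself, not a label

optimizer.zero_grad()
loss.backward()
optimizer.step()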

IV. Autoencoder in Practice

1. The AutoEncoder class

  As before it inherits from nn.Module and contains two main parts: an encoder and a decoder.

import torch
from torch import nn


class AutoEncoder(nn.Module):
    def __init__(self):
        super(AutoEncoder, self).__init__()

        # [b, 784] => [b, 20]
        self.encoder = nn.Sequential(
            nn.Linear(784, 256),
            nn.ReLU(),
            nn.Linear(256, 64),
            nn.ReLU(),
            nn.Linear(64, 20),
            nn.ReLU()
        )
        # [b, 20] => [b, 784]
        self.decoder = nn.Sequential(
            nn.Linear(20, 64),
            nn.ReLU(),
            nn.Linear(64, 256),
            nn.ReLU(),
            nn.Linear(256, 784),
            # squash pixel values into the range 0~1
            nn.Sigmoid()
        )

    def forward(self, x):
        """
        Args:
            x: [b, 1, 28, 28]
        Returns:
            x: [b, 1, 28, 28], the reconstruction of the input
        """
        batchsz = x.size(0)
        # flatten
        x = x.view(batchsz, 784)
        # encoder
        x = self.encoder(x)
        # decoder
        x = self.decoder(x)
        # reshape
        x = x.view(batchsz, 1, 28, 28)
        return x
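
  A quick shape sanity check for the class above (illustrative, not part of the original file):

model = AutoEncoder()
tmp = torch.randn(4, 1, 28, 28)
out = model(tmp)
print(out.shape)  # expected: torch.Size([4, 1, 28, 28])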

2. The main function

  Nothing really different from before: each epoch we load the data, feed it to the model, compute the loss, backpropagate to get the gradients and update, and finally visualize the results.

import torch
from torch.utils.data import DataLoader
from torchvision import transforms
from torchvision import datasets
from my_ae import AutoEncoder
from torch import nn
from torch import optim
import visdom

# Reconstruct the MNIST dataset with an autoencoder

def main():
    minist_train = datasets.MNIST('minist', train=True, transform=transforms.Compose([
        transforms.ToTensor()
    ]), download=True)
    minist_train = DataLoader(minist_train, batch_size=32, shuffle=True)

    minist_test = datasets.MNIST('minist', train=False, transform=transforms.Compose([
        transforms.ToTensor()
    ]), download=True)
    minist_test = DataLoader(minist_test, batch_size=32, shuffle=True)

    x, _ = next(iter(minist_train))
    print('x: ', x.shape)

    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    model = AutoEncoder().to(device)
    criteon = nn.MSELoss()
    optimizer = optim.Adam(model.parameters(), lr=1e-3)
    print(model)

    viz = visdom.Visdom()
    for epoch in range(1000):

        for batchidx, (x, _) in enumerate(minist_train):
            # [b, 1, 28, 28]
            x = x.to(device)

            x_hat = model(x)
            loss = criteon(x_hat, x)

            # backprop
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

        print(epoch, 'loss:', loss.item())
        x, _ = next(iter(minist_test))
        x = x.to(device)
        with torch.no_grad():
            x_hat = model(x)
        viz.images(x, nrow=8, win='x', opts=dict(title='x'))
        viz.images(x_hat, nrow=8, win='x_hat', opts=dict(title='x_hat'))

if __name__ == '__main__':
    main()
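
  Note: visdom requires its server to be running before this script starts (launch it with python -m visdom.server); otherwise the visdom.Visdom() call cannot connect. Also, since the decoder ends in a Sigmoid, nn.BCELoss() is a reasonable alternative reconstruction loss to nn.MSELoss().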

V. Variational Autoencoder (VAE) in Practice

1. The VAE class

  Unlike the plain autoencoder, a few extra operations sit between the encoder and the decoder.
  In code, this part looks as follows:

    def forward(self, x):
        """
        Args:
            x: [b, 1, 28, 28]
        Returns:
            x_hat: [b, 1, 28, 28] reconstruction, and kld, a scalar KL term
        """
        batchsz = x.size(0)
        # flatten
        x = x.view(batchsz, 784)
        # encoder
        # [b, 20], containing both mean and sigma
        h_ = self.encoder(x)
        # [b, 20] => [b, 10], [b, 10]
        mean, sigma = h_.chunk(2, dim=1)
        # reparameterization trick:
        # makes the sampling operation differentiable so gradients can flow through it
        h = mean + sigma * torch.randn_like(sigma)
        # print('h.shape', h.shape)
        # h.shape torch.Size([32, 10])
        # decoder
        x_hat = self.decoder(h)
        # reshape
        x_hat = x_hat.view(batchsz, 1, 28, 28)

        # KL divergence between N(mean, sigma^2) and the standard normal N(0, 1)
        kld = 0.5 * torch.sum(
            torch.pow(mean, 2) +
            torch.pow(sigma, 2) -
            torch.log(1e-8 + torch.pow(sigma, 2)) - 1
        ) / (batchsz * 28 * 28)
        return x_hat, kld
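
  The kld term above implements KL(N(mu, sigma^2) || N(0, 1)) = 0.5 * sum(mu^2 + sigma^2 - log(sigma^2) - 1), averaged over every pixel in the batch. The forward method assumes an encoder whose output width is twice the latent size (20 = 2 x 10) and a decoder that consumes the 10-dimensional code. Below is a minimal sketch of a matching __init__, with layer sizes mirroring the plain autoencoder above (an assumption, not the original file):

import torch
from torch import nn

class VAE(nn.Module):
    def __init__(self):
        super(VAE, self).__init__()
        # [b, 784] => [b, 20]; first 10 dims are mean, last 10 are sigma
        self.encoder = nn.Sequential(
            nn.Linear(784, 256),
            nn.ReLU(),
            nn.Linear(256, 64),
            nn.ReLU(),
            nn.Linear(64, 20)  # no final activation, so mean may be negative
        )
        # [b, 10] => [b, 784]
        self.decoder = nn.Sequential(
            nn.Linear(10, 64),
            nn.ReLU(),
            nn.Linear(64, 256),
            nn.ReLU(),
            nn.Linear(256, 784),
            nn.Sigmoid()  # squash pixel values into 0~1
        )

    # forward(self, x) is exactly the method shown above

  Training then only changes the loss: the KL term is added to the reconstruction loss, e.g.

x_hat, kld = model(x)
loss = criteon(x_hat, x) + 1.0 * kld  # the weight on kld is a tunable hyperparameter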