PyTorch learning notes. This post covers an introduction to the framework and a demo of a simple regression problem.

### Introduction to the PyTorch Framework

#### I. Key Differences from TensorFlow

1. Dynamic graphs vs. static graphs
   PyTorch builds dynamic graphs, while TensorFlow (1.x) builds static graphs: graph creation and execution are separated, the entire computation must be defined in advance, and it cannot be intervened in while it runs, which makes inspecting intermediate results during debugging relatively inconvenient. (See the sketch after this list.)
2. What PyTorch offers
   - GPU acceleration
   - Automatic differentiation, for example:
   ```python
   import torch
   from torch import autograd

   x = torch.tensor(1.)
   a = torch.tensor(1., requires_grad=True)
   b = torch.tensor(2., requires_grad=True)
   c = torch.tensor(3., requires_grad=True)

   y = a**2 * x + b * x + c
   print('before:', a.grad, b.grad, c.grad)   # None, None, None: nothing computed yet
   grads = autograd.grad(y, [a, b, c])        # dy/da = 2ax, dy/db = x, dy/dc = 1
   print('after:', grads[0], grads[1], grads[2])
   ```
   - Common network layers (a minimal sketch follows this list)
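As a minimal, illustrative sketch of both points above, the snippet below wires a couple of common layers together and shows the dynamic graph being built as the code runs. The model shape and sizes are made up for demonstration, not part of the original post:

```python
import torch
from torch import nn

# A tiny model assembled from common layers; the sizes are arbitrary.
model = nn.Sequential(
    nn.Linear(4, 8),
    nn.ReLU(),
    nn.Linear(8, 1),
)

x = torch.randn(2, 4)   # batch of 2 samples with 4 features each
out = model(x)          # the graph is constructed on the fly as this line executes
print(out.shape)        # torch.Size([2, 1])

# Because the graph is dynamic, ordinary Python control flow and print()
# work mid-computation, which is what makes debugging convenient.
h = model[0](x)
if h.abs().max() > 10:
    print('large intermediate activation:', h.abs().max().item())
```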
      
### Simple Regression Problem
#### I. Gradient Descent Algorithm
1. Concept
   $\text{loss}(x) = x^2 \sin(x)$
   $y = \text{loss}(x)$
   $y' = 2x\sin(x) + x^2\cos(x)$
   Core update rule: $x \leftarrow x - lr \cdot y'$ (lr is the learning rate; if it is too large, the search for the optimum oscillates heavily). A small numeric sketch follows this list.
   ![Gradient descent](https://e.im5i.com/2021/11/12/UschaQ.png)
2. Application (example: fitting a line $y = wx + b$ with two unknown parameters)
   $y = wx + b$
   $y = wx + b + \epsilon$ ($\epsilon$ is Gaussian noise)
   $\epsilon \sim N(0.01, 1)$
   $\text{loss} = (wx + b - y)^2$
   ![Parameter fitting](https://e.im5i.com/2021/11/12/Us8KAT.png), ![Parameter fitting](https://e.im5i.com/2021/11/12/Us8Q0A.png)
   Feeding the (x, y) pairs from the plots into the loss function and solving for the optimal (minimal) loss yields w and b, which gives the fitted line; this is implemented in the hands-on section below. The following image shows the fit, and the solution process is clearly visible.
   ![Parameter fitting](https://e.im5i.com/2021/11/12/Us8R2S.png)
3. Linear Regression and related problems (a short sigmoid/softmax sketch follows this list)
   - Linear Regression: y ranges over the entire set of real numbers.
   - Logistic Regression: y is squashed into the interval [0, 1], which conveniently maps to a probability and handles binary classification.
   - Classification: e.g. digit recognition, where the predicted probabilities over all classes sum to 1.
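To make the update rule in item 1 concrete, here is a minimal numeric sketch (the starting point, learning rate, and step count are arbitrary choices, not from the original post):

```python
import math

def f(x):   # loss(x) = x^2 * sin(x)
    return x**2 * math.sin(x)

def df(x):  # y' = 2x*sin(x) + x^2*cos(x)
    return 2 * x * math.sin(x) + x**2 * math.cos(x)

x, lr = 4.0, 0.01          # arbitrary starting point and learning rate
for step in range(200):
    x = x - lr * df(x)     # core update: x = x - lr * y'
print(x, f(x))             # x settles near a local minimum of f
```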
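And for item 3, a quick sketch of how sigmoid and softmax treat the same scores (the logit values are made up):

```python
import torch

logits = torch.tensor([2.0, -1.0, 0.5])   # made-up raw scores

# Logistic regression: sigmoid squashes each score into [0, 1] independently,
# so a single output can be read as the probability of the positive class.
print(torch.sigmoid(logits))

# Classification: softmax turns the scores into a distribution over classes.
probs = torch.softmax(logits, dim=0)
print(probs, probs.sum())                 # the probabilities sum to 1
```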
### Regression in Practice
1. Code
   ```python
   import numpy as np

   # y = wx + b
   # Compute the loss: points is an array of (x, y) pairs; this returns the
   # mean squared error between each observed y and the line w*x + b.
   def compute_error_for_line_given_points(b, w, points):
       totalError = 0
       for i in range(0, len(points)):
           x = points[i, 0]
           y = points[i, 1]
           totalError += (y - (w * x + b)) ** 2
       return totalError / float(len(points))
   ```
   Gradient formula: $w' = w - lr \cdot \frac{\partial\,\text{loss}}{\partial w}$, where $\text{loss} = (wx + b - y)^2$
   ```python
   # Compute the gradient of the loss with respect to w and b
   def step_gradient(b_current, w_current, points, learningRate):
       b_gradient = 0
       w_gradient = 0
       N = float(len(points))
       for i in range(0, len(points)):
           x = points[i, 0]
           y = points[i, 1]
           # Partial derivatives of the loss in the gradient formula above
           # (i.e. the loss differentiated with respect to b and w)
           b_gradient += -(2/N) * (y - ((w_current * x) + b_current))
           w_gradient += -(2/N) * x * (y - ((w_current * x) + b_current))
       new_b = b_current - (learningRate * b_gradient)
       new_w = w_current - (learningRate * w_gradient)
       return [new_b, new_w]

   # Repeat the gradient step num_iterations times from the initial guess
   def gradient_descent_runner(points, starting_b, starting_m, learning_rate, num_iterations):
       b = starting_b
       m = starting_m
       for i in range(num_iterations):
           b, m = step_gradient(b, m, np.array(points), learning_rate)
       return [b, m]

   def run():
       points = np.genfromtxt("data.csv", delimiter=",")
       learning_rate = 0.0001
       initial_b = 0  # initial y-intercept guess
       initial_m = 0  # initial slope guess
       num_iterations = 1000
       print("Starting gradient descent at b = {0}, m = {1}, error = {2}"
             .format(initial_b, initial_m,
                     compute_error_for_line_given_points(initial_b, initial_m, points)))
       print("Running...")
       [b, m] = gradient_descent_runner(points, initial_b, initial_m, learning_rate, num_iterations)
       print("After {0} iterations b = {1}, m = {2}, error = {3}"
             .format(num_iterations, b, m,
                     compute_error_for_line_given_points(b, m, points)))

   if __name__ == '__main__':
       run()
   ```
   Below is the fitting result plotted with TensorBoard.
   (Figure: fitted result)
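   The figure itself is not reproduced here. As a sketch of how such a curve could be logged (the log directory and tag name are illustrative, and the snippet assumes `data.csv` plus the functions defined in the code above), one might record the loss at each iteration with `torch.utils.tensorboard.SummaryWriter`:

   ```python
   import numpy as np
   from torch.utils.tensorboard import SummaryWriter

   # Illustrative only: rerun the descent while logging the loss at every step.
   # Assumes data.csv and the step_gradient / compute_error_for_line_given_points
   # functions defined above.
   writer = SummaryWriter(log_dir="runs/linear_regression")  # hypothetical log dir

   points = np.genfromtxt("data.csv", delimiter=",")
   b, m, lr = 0, 0, 0.0001
   for i in range(1000):
       b, m = step_gradient(b, m, points, lr)
       loss = compute_error_for_line_given_points(b, m, points)
       writer.add_scalar("loss", loss, i)   # appears under the "loss" tag
   writer.close()
   # View with: tensorboard --logdir runs
   ```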