PyTorch Study Notes: Regression Problems
A record of my PyTorch studies. This post mainly covers an introduction to the framework and a demo of a simple regression problem.
### Introduction to the PyTorch Framework
#### I. Main Differences from TensorFlow
- Dynamic graphs vs. static graphs
  PyTorch builds the graph dynamically, while TensorFlow uses a static graph: graph construction and execution are separated, the computation is defined up front, and the running process cannot be intervened in, which makes it relatively inconvenient to inspect intermediate steps when debugging (a minimal sketch of the dynamic style appears after this list).
- What PyTorch can do
  - GPU acceleration
  - Automatic differentiation, for example:
```python
import torch
from torch import autograd

# Compute dy/da, dy/db, dy/dc for y = a^2 * x + b*x + c at x = 1
x = torch.tensor(1.)
a = torch.tensor(1., requires_grad=True)
b = torch.tensor(2., requires_grad=True)
c = torch.tensor(3., requires_grad=True)

y = a**2 * x + b * x + c

print('before:', a.grad, b.grad, c.grad)   # None, None, None: nothing computed yet
grads = autograd.grad(y, [a, b, c])        # dy/da = 2ax = 2, dy/db = x = 1, dy/dc = 1
print('after:', grads[0], grads[1], grads[2])
```
  - Common network layers
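As a minimal sketch of the dynamic-graph point above (the tensor shape and loop bounds are arbitrary choices for illustration), ordinary Python control flow and printing can be used while the computation runs:

```python
import torch

# With a dynamic graph, the computation is built as the code executes, so
# plain Python branches and print statements can inspect intermediate values
x = torch.randn(3)
for step in range(3):
    y = x * 2
    if y.norm() < 10:        # data-dependent branch, decided at run time
        y = y * 2
    print(step, y)           # intermediate results are visible immediately
    x = y
```

In a static graph, by contrast, this kind of run-time inspection requires dedicated session/debugging machinery, since the graph is fixed before any data flows through it.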
### A Simple Regression Problem

#### I. Gradient Descent Algorithm

1. Concept

Take $loss(x) = x^2 \cdot \sin(x)$ as an example:

$y = loss(x)$

$y' = 2x \cdot \sin(x) + x^2 \cdot \cos(x)$

Core update rule: $x' = x - lr \cdot y'$ ($lr$ is the learning rate; if it is too large, the search for the optimum oscillates too much), as sketched in the code below.
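Here is a minimal sketch of this update rule applied to $loss(x) = x^2 \cdot \sin(x)$; the starting point and learning rate are arbitrary choices for illustration:

```python
import math

# Derivative of loss(x) = x^2 * sin(x), as derived above
def loss_grad(x):
    return 2 * x * math.sin(x) + x ** 2 * math.cos(x)

x = 5.0        # arbitrary starting point (assumption for this sketch)
lr = 0.01      # small learning rate to avoid oscillation
for step in range(1000):
    x = x - lr * loss_grad(x)   # the core update rule x' = x - lr * y'

# x has been driven toward a nearby stationary point of the loss
print(x, x ** 2 * math.sin(x))
```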
2. Application (fitting the linear model $y = wx + b$ as an example)

The observed data is generated as $y = w \cdot x + b + e$ ($e$ is Gaussian noise)

$e \sim N(0.01, 1)$

$loss = (wx + b - y)^2$
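As an illustration, data matching this setup could be generated as follows; the ground-truth values $w = 1.477$ and $b = 0.089$ are arbitrary assumptions for this sketch, and $N(0.01, 1)$ is read as mean 0.01 and standard deviation 1:

```python
import numpy as np

# Generate synthetic points y = w*x + b + e, with e ~ N(0.01, 1)
np.random.seed(0)
w_true, b_true = 1.477, 0.089            # assumed ground truth, not from the post
x = np.random.uniform(0, 10, size=100)
e = np.random.normal(0.01, 1, size=100)  # Gaussian noise
y = w_true * x + b_true + e

# Save as two comma-separated columns, the format read by the code below
np.savetxt("data.csv", np.column_stack([x, y]), delimiter=",")
```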
Feeding each $(x, y)$ point from the data into the loss function and solving for the optimal (minimal) loss yields $w$ and $b$, which gives the fitted line. Below is an image of the fit, where the solving process can be seen clearly:

*(figure: the fitting process)*

3. Linear Regression

- Linear Regression: the output $y$ ranges over the entire set of real numbers.
- Logistic Regression: $y$ is squashed into the interval $[0, 1]$; the benefit is that it can be tied to a probability, so it can handle binary classification.
- Classification: e.g. digit recognition, where all the class probabilities sum to 1.
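To make the last two bullets concrete, here is a small sketch of the squashing functions behind them (the function names and sample inputs are my own choices, not from the original post):

```python
import numpy as np

# Logistic regression squashes a real-valued output into [0, 1] via sigmoid
def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Classification (e.g. digit recognition) uses softmax so probabilities sum to 1
def softmax(logits):
    exp = np.exp(logits - np.max(logits))  # subtract max for numerical stability
    return exp / exp.sum()

print(sigmoid(0.0))                        # 0.5
probs = softmax(np.array([2.0, 1.0, 0.1]))
print(probs, probs.sum())                  # probabilities summing to 1.0
```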
### Regression in Practice

#### I. Gradient Descent Algorithm

1. Concept
The gradient formula: $w' = w - lr \cdot \frac{\partial\, loss}{\partial w}$, where $loss = (wx + b - y)^2$

2. Code

```python
import numpy as np

# y = wx + b
# Compute the loss: points is a series of (x, y) pairs; average the squared
# error between each observed y and the line defined by w and b
def compute_error_for_line_given_points(b, w, points):
    totalError = 0
    for i in range(0, len(points)):
        x = points[i, 0]
        y = points[i, 1]
        totalError += (y - (w * x + b)) ** 2
    return totalError / float(len(points))

# Compute the gradient of the loss with respect to b and w, and take one step
def step_gradient(b_current, w_current, points, learningRate):
    b_gradient = 0
    w_gradient = 0
    N = float(len(points))
    for i in range(0, len(points)):
        x = points[i, 0]
        y = points[i, 1]
        # The two lines below differentiate the loss in the "gradient formula"
        # above (i.e. the partial derivatives of the loss w.r.t. b and w)
        b_gradient += -(2/N) * (y - ((w_current * x) + b_current))
        w_gradient += -(2/N) * x * (y - ((w_current * x) + b_current))
    new_b = b_current - (learningRate * b_gradient)
    new_w = w_current - (learningRate * w_gradient)
    return [new_b, new_w]

# Run gradient descent for num_iterations steps from the given starting point
def gradient_descent_runner(points, starting_b, starting_m, learning_rate, num_iterations):
    b = starting_b
    m = starting_m
    for i in range(num_iterations):
        b, m = step_gradient(b, m, np.array(points), learning_rate)
    return [b, m]

def run():
    points = np.genfromtxt("data.csv", delimiter=",")
    learning_rate = 0.0001
    initial_b = 0  # initial y-intercept guess
    initial_m = 0  # initial slope guess
    num_iterations = 1000
    print("Starting gradient descent at b = {0}, m = {1}, error = {2}"
          .format(initial_b, initial_m,
                  compute_error_for_line_given_points(initial_b, initial_m, points)))
    print("Running...")
    [b, m] = gradient_descent_runner(points, initial_b, initial_m, learning_rate, num_iterations)
    print("After {0} iterations b = {1}, m = {2}, error = {3}"
          .format(num_iterations, b, m,
                  compute_error_for_line_given_points(b, m, points)))

if __name__ == '__main__':
    run()
```

Below is the fitting result drawn with TensorBoard:

*(figure: fitting result plotted with TensorBoard)*
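For comparison, here is a hedged sketch of the same fit written with PyTorch's autograd and SGD optimizer instead of hand-derived gradients; it assumes the same `data.csv` and hyperparameters as above:

```python
import numpy as np
import torch

# The same linear fit, letting autograd compute the gradients
points = np.genfromtxt("data.csv", delimiter=",")
x = torch.tensor(points[:, 0], dtype=torch.float32)
y = torch.tensor(points[:, 1], dtype=torch.float32)

w = torch.zeros(1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)
optimizer = torch.optim.SGD([w, b], lr=0.0001)

for step in range(1000):
    optimizer.zero_grad()
    loss = ((w * x + b - y) ** 2).mean()   # mean squared error, as above
    loss.backward()                        # autograd fills in w.grad and b.grad
    optimizer.step()                       # w <- w - lr * grad, b <- b - lr * grad

print(w.item(), b.item(), loss.item())
```

The logic is identical to the NumPy version; the only design difference is that `loss.backward()` replaces the manually derived partial derivatives in `step_gradient`.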