TensorFlow Linear Regression

Linear regression is one of the simplest modeling techniques for supervised learning problems.

Given a set of data points as the training set, the goal of linear regression is to find the linear function that fits those data best.

y(x1, x2, x3, ..., xk) = w1*x1 + w2*x2 + w3*x3 + ... + wk*xk + b

y is the value to be predicted

x1 ... xk are the independent predictor variables

w1 ... wk are the weights (learnable)

b is the bias (learnable)

Expressed in code, the formula above looks like this (here with k = 2, since the example below uses two features, weight and age):

w = tf.Variable(tf.zeros([2, 1]), name="weights")

b = tf.Variable(0.0, name="bias")

def inference(x):
    return tf.matmul(x, w) + b

Next we need to define how the loss is computed.

For a simple model like this we will use the total squared error, i.e. the sum over all training samples of the squared difference between the predicted value and the expected value.
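
Written out for n training samples, the quantity to minimize is

loss = (y1 - y'1)^2 + (y2 - y'2)^2 + ... + (yn - y'n)^2

where y'i = inference(xi) is the model's prediction for sample i.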

The complete example looks like this:

# -*- coding: utf-8 -*-
import tensorflow as tf

w = tf.Variable(tf.zeros([2, 1]), name="weights")
b = tf.Variable(0.0, name="bias")


def inference(x):
    # Compute the inference model's output on the data x and return it.
    return tf.matmul(x, w) + b

def loss(x, y):
    # Compute the loss from the training data x and the expected output y.
    # Note: y has shape [N] while y_predicted has shape [N, 1], so
    # squared_difference broadcasts to [N, N] before the sum; reshaping y
    # to [N, 1] would give the per-sample squared error described above.
    y_predicted = inference(x)
    return tf.reduce_sum(tf.squared_difference(y, y_predicted))

def inputs():
    # Read or generate the training data x and its expected output y.
    weight_age = [[84,46],[73,20],[65,52],[70,30],[76,57],[69,25]]
    blood_fat_content = [354,190,405,263,451,302]
    return tf.to_float(weight_age), tf.to_float(blood_fat_content)

def train(total_loss):
    # Train / adjust the model parameters according to the computed total loss.
    learning_rate = 0.00000001
    return tf.train.GradientDescentOptimizer(learning_rate).minimize(total_loss)

def evaluate(sess, x, y):
    # Evaluate the trained model on a few unseen inputs.
    print sess.run(inference([[80., 25.]]))
    print sess.run(inference([[65., 25.]]))
    print sess.run(inference([[88., 33.]]))

saver = tf.train.Saver()

# Launch the dataflow graph in a session and wire up the training flow.

with tf.Session() as sess:

    tf.initialize_all_variables().run()

    x, y = inputs()

    total_loss = loss(x, y)
    train_op = train(total_loss)

    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)

    # Actual number of training iterations.
    training_steps = 100000

    for step in range(training_steps):
        sess.run([train_op])
        # Print the loss periodically so we can watch it decrease.
        if step % 10 == 0:
            print "loss:", sess.run([total_loss])

        if step % 1000 == 0:
            saver.save(sess, 'my-checkpoint', global_step=step)

    evaluate(sess, x, y)

    coord.request_stop()
    coord.join(threads)

    saver.save(sess, 'my-checkpoint', global_step=training_steps)
    sess.close()  # redundant: the with-block already closes the session
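
A note on versions: this code targets the pre-1.0 TensorFlow API (and the print statements are Python 2 syntax). If you are on TensorFlow 1.x, several calls used here and in the summary-enabled version below have been renamed; a rough mapping, from my own notes rather than the original article:

tf.initialize_all_variables()  ->  tf.global_variables_initializer()
tf.train.SummaryWriter(...)    ->  tf.summary.FileWriter(...)
tf.scalar_summary(...)         ->  tf.summary.scalar(...)
tf.merge_all_summaries()       ->  tf.summary.merge_all()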

When this example runs you can see the loss dropping steadily, although it eventually flattens out around 300392 (presumably because, at such a tiny learning rate, the updates become too small to register in the printed float32 value). At the end we ask for three predictions; even though the training data is very scarce, the 33-year-old weighing 88 kg still gets the highest predicted blood-fat content:

loss: [300929.16]
loss: [300928.09]
loss: [300927.0]
loss: [300925.97]
loss: [300924.88]
loss: [300923.78]
loss: [300922.69]
loss: [300921.62]
loss: [300920.56]
loss: [300919.5]
...

...

loss: [300392.31]
loss: [300392.34]
loss: [300392.31]
loss: [300392.31]
loss: [300392.34]
loss: [300392.31]
loss: [300392.31]
loss: [300392.31]
loss: [300392.31]
loss: [300392.31]
loss: [300392.31]
loss: [300392.31]
loss: [300392.31]
loss: [300392.28]
loss: [300392.31]
loss: [300392.31]
loss: [300392.28]
loss: [300392.31]
loss: [300392.28]
loss: [300392.28]
loss: [300392.31]
loss: [300392.31]
loss: [300392.31]
loss: [300392.31]
loss: [300392.25]
loss: [300392.28]
loss: [300392.31]
loss: [300392.28]
loss: [300392.28]
loss: [300392.25]
loss: [300392.25]
loss: [300392.28]
loss: [300392.25]
loss: [300392.25]
loss: [300392.25]
loss: [300392.25]
loss: [300392.25]
loss: [300392.25]
loss: [300392.25]
loss: [300392.25]
loss: [300392.25]
loss: [300392.22]
loss: [300392.25]
loss: [300392.22]
loss: [300392.25]
loss: [300392.25]
loss: [300392.25]
loss: [300392.22]
loss: [300392.22]
loss: [300392.19]
loss: [300392.19]
loss: [300392.19]
loss: [300392.19]
loss: [300392.22]
loss: [300392.19]
loss: [300392.19]
loss: [300392.19]
loss: [300392.19]
loss: [300392.19]
loss: [300392.19]
loss: [300392.19]
loss: [300392.19]
loss: [300392.19]
loss: [300392.19]
loss: [300392.19]
loss: [300392.19]
loss: [300392.16]
loss: [300392.16]
loss: [300392.16]
loss: [300392.19]
loss: [300392.16]
loss: [300392.19]
loss: [300392.16]
[[ 356.84619141]]
[[ 290.0894165]]
[[ 392.63818359]]
[root@localhost sample]#

To see things more clearly, let's tweak the code a little, adding some summaries and more training data:

# -*- coding: utf-8 -*-
import tensorflow as tf

w = tf.Variable(tf.zeros([2, 1]), name="weights")
b = tf.Variable(0.0, name="bias")


def inference(x):
    # Compute the inference model's output on the data x and return it.
    return tf.matmul(x, w) + b

def loss(x, y):
    # Compute the loss from the training data x and the expected output y.
    y_predicted = inference(x)
    return tf.reduce_sum(tf.squared_difference(y, y_predicted))

def inputs():
    # Read or generate the training data x and its expected output y.
    weight_age = [[84,46],[73,20],[65,52],[70,30],[76,57],\
              [69,25],[63,28],[72,36],[79,57],[75,44],\
              [27,24],[89,31],[65,52],[57,23],[59,60],\
              [69,48],[60,34],[79,51],[75,50],[82,34],\
              [59,46],[67,23],[85,37],[55,40],[63,30]
             ]
    blood_fat_content = [354,190,405,263,451,302,288,385,402,365,209,290,346,254,395,434,220,374,308,220,311,181,274,303,244]
    return tf.to_float(weight_age), tf.to_float(blood_fat_content)

def train(total_loss):
    # Train / adjust the model parameters according to the computed total loss.
    learning_rate = 0.00000001
    return tf.train.GradientDescentOptimizer(learning_rate).minimize(total_loss)

def evaluate(sess, x, y):
    # Evaluate the trained model on a few unseen inputs.
    print sess.run(inference([[80., 25.]]))
    print sess.run(inference([[65., 25.]]))
    print sess.run(inference([[88., 33.]]))

saver = tf.train.Saver()

# Launch the dataflow graph in a session and wire up the training flow.

with tf.Session() as sess:

    tf.initialize_all_variables().run()

    x, y = inputs()

    total_loss = loss(x, y)
    train_op = train(total_loss)

    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)

    # Actual number of training iterations.
    training_steps = 100000000

    summary_writer = tf.train.SummaryWriter('./my_graph', sess.graph)
    tf.scalar_summary('total_loss', total_loss)
    sum_ops = tf.merge_all_summaries()

    for step in range(training_steps):
        sess.run([train_op])
        s_val = sess.run(sum_ops)
        summary_writer.add_summary(s_val, global_step=step)

        # Print the loss periodically so we can watch it decrease.
        if step % 1000 == 0:
            print "loss:", sess.run([total_loss]), " step:", step

        if step == (training_steps - 1):
            print "w:", sess.run(w)
            print "b:", sess.run(b)

        #if step % 1000 == 0:
            #saver.save(sess, 'my-checkpoint', global_step=step)

    evaluate(sess, x, y)

    coord.request_stop()
    coord.join(threads)

    saver.save(sess, 'my-checkpoint', global_step=training_steps)
    sess.close()  # redundant: the with-block already closes the session

You can also tune the hyperparameters, for example

learning_rate = 0.0000002

training_steps = 1000000

to improve the accuracy.

Since it is total_loss that we are recording here, you can watch the loss shrink steadily as step increases.
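
To actually view the total_loss curve that the SummaryWriter records, point TensorBoard at the log directory (standard TensorBoard usage; this command is my addition, not part of the original run):

tensorboard --logdir=./my_graph

Then open http://localhost:6006 in a browser and look for the total_loss scalar.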

Summary:

To be honest, at first I had serious doubts about the following code:

def inference(x):
    return tf.matmul(x, w) + b

First of all, x here is a matrix and w is also a matrix, so the product of the two should again be a matrix.

But b is a scalar. In principle, to add a scalar to a matrix you must first expand the scalar into a matrix of the same shape, and the result should again be a matrix. So I was baffled.

All I could do was consult the documentation:

tf.matmul(a, b, transpose_a=False, transpose_b=False, a_is_sparse=False, b_is_sparse=False, name=None)

Multiplies matrix a by matrix b, producing a * b.

The inputs must be two-dimensional matrices, with matching inner dimensions, possibly after transposition.

Both matrices must be of the same type. The supported types are: float, double, int32, complex64.

Either matrix can be transposed on the fly by setting the corresponding flag to True. This is False by default.

If one or both of the matrices contain a lot of zeros, a more efficient multiplication algorithm can be used by setting the corresponding a_is_sparse or b_is_sparse flag to True. These are False by default.

For example:

# 2-D tensor `a`
a = tf.constant([1, 2, 3, 4, 5, 6], shape=[2, 3]) => [[1. 2. 3.]
                                                      [4. 5. 6.]]
# 2-D tensor `b`
b = tf.constant([7, 8, 9, 10, 11, 12], shape=[3, 2]) => [[7. 8.]
                                                         [9. 10.]
                                                         [11. 12.]]
c = tf.matmul(a, b) => [[58 64]
                        [139 154]]

In other words:

1*7 + 2*9 + 3*11 = 58     1*8 + 2*10 + 3*12 = 64
4*7 + 5*9 + 6*11 = 139    4*8 + 5*10 + 6*12 = 154
Finally, I recommend playing with these functions yourself to see what they output:

# -*- coding: utf-8 -*-
import tensorflow as tf

a = [[1., 2.], [3., 4.]]
b = [[1.], [2.]]
c = tf.Variable(2.1, name="bias")

with tf.Session() as sess:
    tf.initialize_all_variables().run()
    print(sess.run(tf.to_float(a)))                    # a as a float32 tensor
    print(sess.run(tf.to_float(b)))                    # b as a float32 tensor
    print(sess.run(tf.matmul(a, b)))                   # matrix product: [[5.], [11.]]
    print(sess.run(tf.matmul(a, b) + tf.to_float(c)))  # scalar c broadcast onto the matrix
    print(sess.run(tf.reduce_sum(tf.squared_difference(tf.matmul(a, b),
                                                       tf.matmul(a, b) + tf.to_float(c)))))
    # each element differs by 2.1, so the sum of squares is 2 * 2.1**2 = 8.82

The result is:

[root@localhost sample]# python sample2.py 
[[ 1.  2.]
 [ 3.  4.]]
[[ 1.]
 [ 2.]]
[[  5.]
 [ 11.]]
[[  7.0999999 ]
 [ 13.10000038]]
8.82

Judging from the behavior, adding a constant to a matrix works much like it does in MATLAB: the constant is broadcast, i.e. added to every element of the matrix.
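
For what it's worth, NumPy follows the same broadcasting rule, so this is a general convention of the Python numerics ecosystem rather than a TensorFlow quirk (a minimal side-by-side of my own, not from the original):

import numpy as np

m = np.array([[5.], [11.]])  # the same matrix tf.matmul(a, b) produced above
print(m + 2.1)               # [[ 7.1] [13.1]] -- the scalar is added to every element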