Class03 Answer:

Calculate RMSE for that line.

RMSE is an acronym for Root Mean Square Error.

https://www.google.com/search?q=Statistics+Root+Mean+Square+Error&tbm=isch
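
As a minimal sketch of the definition: RMSE squares each error, averages the squares, and takes the square root of that average. The helper name rmse and the sample numbers below are made up for illustration; they are not from the class data.

```python
import numpy as np

def rmse(actual, predicted):
    """Root Mean Square Error between two equal-length sequences."""
    actual    = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    errors    = actual - predicted            # one error per point
    return float(np.sqrt(np.mean(errors ** 2)))

# Made-up numbers, only to exercise the function:
prices = [10.0, 12.0, 11.0, 13.0]
line   = [10.0, 11.0, 12.0, 13.0]
print(rmse(prices, line))
```

The errors here are 0, 1, -1, 0, so the mean of the squares is 0.5 and the RMSE is the square root of 0.5.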


"""
class03p12.py

This script should calculate the RMSE between the 2016 closing prices
and a straight line drawn from the first price of 2016 toward the last.
"""

import pandas as pd
import numpy  as np

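csvfile   = 'http://spy611.herokuapp.com/csv/allpredictions.csv'
cp_df     = pd.read_csv(csvfile).sort_values(['cdate'])
# Boolean mask for 2016 rows (assumes cdate is a YYYY-MM-DD string):
cp2016_sr = (cp_df.cdate > '2016') & (cp_df.cdate < '2017')
# .copy() avoids a SettingWithCopyWarning when I add a column later:
cp2016_df = cp_df.loc[cp2016_sr, ['cdate','cp']].copy()
# Slope is rise over run; the run here is the number of rows in 2016:
x1x0_i    = len(cp2016_df)
y1y0_f    = cp2016_df.iloc[-1].cp-cp2016_df.iloc[0].cp
m_f       = y1y0_f / x1x0_i
# Intercept is the first price of 2016:
b_f       = cp2016_df.iloc[0].cp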
# My equation for straight line:
def yval(x_in):
    return m_f * x_in + b_f
# I should collect points to plot straight line:
yvals_l = [ yval(x_i) for x_i in range(len(cp2016_df))]
# Add the points to the DataFrame:
cp2016_df['sl'] = yvals_l

# Goog: In Pandas how to combine columns?
# I should square each error (closing price minus straight line):
sqdiffe = (cp2016_df.cp - cp2016_df.sl)**2

# I should find mean and then sqrt:
print('RMSE between straight line and closing price:')
rmse_f = np.sqrt(np.mean(sqdiffe))
print(rmse_f)

'bye'

I ran the above script and saw this:

dan@h79:~/ml4/public/class03demos $ python class03p12.py
RMSE between straight line and closing price:
60.4232371144
dan@h79:~/ml4/public/class03demos $
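
The script above divides the price change by len(cp2016_df), so at the last row its line stops one step short of the last price. If the line should touch both the first and the last price, one option is np.linspace, sketched below. The DataFrame demo_df and its prices are made up so the sketch runs without the CSV; it is not the script that produced the number above.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the 2016 rows; these prices are made up.
demo_df = pd.DataFrame({
    'cdate': ['2016-01-04', '2016-01-05', '2016-01-06', '2016-01-07', '2016-01-08'],
    'cp':    [100.0, 103.0, 101.0, 106.0, 108.0],
})

# np.linspace gives len(demo_df) evenly spaced values that start at the
# first price and end exactly at the last price.
demo_df['sl'] = np.linspace(demo_df.cp.iloc[0], demo_df.cp.iloc[-1], len(demo_df))

# Square each error, take the mean, then the square root:
demo_rmse_f = np.sqrt(((demo_df.cp - demo_df.sl) ** 2).mean())
print(demo_rmse_f)
```

With these made-up prices the line is 100, 102, 104, 106, 108, so the errors are 0, 1, -3, 0, 0 and the RMSE is the square root of 2.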

Class03 Lab

