Class08 Answer:

Transform some data into features

I worked through this lab; it required some knowledge of shell commands and pandas.

I typed some shell commands:

mkdir -p ~fx1/Downloads/
cd       ~fx1/Downloads/
unzip EURUSD-2016-09.zip
mkdir fx3
mv EURUSD-2016-09.csv fx3/

Next, I wrote a python script:


"""
f11.py

This script should read and aggregate some forex data.
Demo:
~/anaconda3/bin/python f11.py
"""

import pandas as pd

f0_df = pd.read_csv('fx3/EURUSD-2016-09.csv', names=['pair','ts0','bid','ask'])

f1_df       =  f0_df.copy()[['pair','ts0']]
f1_df['cp'] = (f0_df.bid+f0_df.ask)/2
ts1_l       = [ts[:14] for ts in f1_df.ts0]
ts2_sr      = pd.to_datetime(ts1_l, utc=True)
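# Floor each timestamp to its 5-minute bucket, as UTC epoch seconds.
# ts.timestamp() is portable; strftime("%s") is a glibc extension that uses the local timezone.
f1_df['ts'] = [5*60*(int(ts.timestamp())//(5*60)) for ts in ts2_sr]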
f2_df       = f1_df.copy()[['ts','cp']]
print(f2_df.tail())
f3_df = f2_df.groupby(['ts']).cp.mean()
print(f3_df.head())
print(f3_df.tail())

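# header=False keeps the file headerless on any pandas version; f12.py supplies its own column names:
f3_df.to_csv('fx3/eur.csv', float_format='%4.6f', header=False)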

'bye'
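
The bucketing in f11.py can be checked on a few rows. Below is a minimal sketch, assuming tick timestamps look like 20160901 00:03:59.123 (that layout is an assumption, inferred from the ts[:14] slice, which keeps the text through the minute). The three ticks are made up, and the sketch uses Timestamp.timestamp() to get UTC epoch seconds.

```python
import pandas as pd

# Hypothetical ticks; the timestamp layout is assumed, not copied from the real file.
ticks_df = pd.DataFrame({
    'ts0': ['20160901 00:03:59.123', '20160901 00:04:10.000', '20160901 00:05:00.500'],
    'bid': [1.1000, 1.1002, 1.1010],
    'ask': [1.1002, 1.1004, 1.1012],
})

# Mid price between bid and ask, as in f11.py.
ticks_df['cp'] = (ticks_df.bid + ticks_df.ask) / 2

# Keep 'YYYYMMDD HH:MM', parse as UTC, then floor epoch seconds to a multiple of 300.
minute_l = [ts[:14] for ts in ticks_df.ts0]
stamp_idx = pd.to_datetime(minute_l, format='%Y%m%d %H:%M', utc=True)
ticks_df['ts'] = [300 * (int(ts.timestamp()) // 300) for ts in stamp_idx]

# One mean price per 5-minute bucket.
bars_sr = ticks_df.groupby('ts').cp.mean()
print(bars_sr)
```

The first two ticks fall in the bucket that starts at 00:00 UTC and the third falls in the bucket that starts at 00:05, so the result has two rows.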

Then, I installed Anaconda into the fx1 account:

wget https://repo.continuum.io/archive/Anaconda3-5.3.1-Linux-x86_64.sh
bash Anaconda3-5.3.1-Linux-x86_64.sh -b
echo 'export PATH="${HOME}/anaconda3/bin:$PATH"' >> ~fx1/.bashrc
mv ~/anaconda3/bin/curl ~/anaconda3/bin/curl_ana
bash

Next, I ran the python script:

python f11.py

The above script needed 4 minutes to run on my laptop.
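
f11.py builds the bucket column with per-row Python loops, which is probably part of why it is slow on a month of ticks. A possible vectorized alternative is sketched below. The helper name five_minute_epoch is hypothetical, the timestamp layout is assumed as before, and I have not timed it on the real file.

```python
import pandas as pd

def five_minute_epoch(ts0_sr):
    """Map tick timestamp strings to the UTC epoch second of their 5-minute bucket."""
    # Assumed layout: 'YYYYMMDD HH:MM:SS.fff'; the first 14 characters reach the minute.
    stamp_sr = pd.to_datetime(ts0_sr.str[:14], format='%Y%m%d %H:%M', utc=True)
    floored_sr = stamp_sr.dt.floor('5min')
    epoch = pd.Timestamp('1970-01-01', tz='UTC')
    # Whole seconds elapsed since the epoch.
    return (floored_sr - epoch) // pd.Timedelta(seconds=1)

demo_sr = pd.Series(['20160901 00:03:59.123', '20160901 00:07:10.000'])
print(five_minute_epoch(demo_sr).tolist())  # [1472688000, 1472688300]
```

In f11.py this would be used as f1_df['ts'] = five_minute_epoch(f1_df.ts0), replacing the two list comprehensions.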

And, I ran another python script:

python f12.py

which is listed below:


"""
f12.py

This script should generate features from csv data.
Demo:
~/anaconda3/bin/python f12.py
"""

import pandas as pd

f10_df = pd.read_csv('fx3/eur.csv', names=['ts','cp'])

slopes_a = [2,3,4,5,6,7,8,9]

# I should compute the dependent variable, piplead (next-bar price change, in 1/10000ths of cp):
f10_df['piplead'] = (10000.0*(f10_df.cp.shift(-1) - f10_df.cp) / f10_df.cp).fillna(0)
print(f10_df.head())

# I should compute mvgavg-slope for each slope_i

# ref:
# https://ml4.herokuapp.com/cclasses/class03pd41
# http://pandas.pydata.org/pandas-docs/stable/computation.html#rolling-windows
# http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.rolling.html#pandas.DataFrame.rolling

for slope_i in slopes_a:
  col_s         = 'slope'+str(slope_i)
  # Moving average of cp only, computed once per window size:
  mvgavg_sr     = f10_df.cp.rolling(window=slope_i).mean()
  slope_sr      = 10000.0 * (mvgavg_sr - mvgavg_sr.shift(1))/mvgavg_sr
  f10_df[col_s] = slope_sr
print(f10_df.tail())

# I should write to CSV file to be used later:
f10_df.to_csv('fx3/feat.csv', float_format='%4.4f', index=False)

'bye'

The above script finished in less than two seconds on my laptop.
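
To see what the two formulas in f12.py produce, here is a small worked example on four made-up prices with a single window size of 2. It is only an illustration of the arithmetic, not output from the real data.

```python
import pandas as pd

# Four made-up prices.
demo_df = pd.DataFrame({'cp': [1.0, 1.01, 1.03, 1.06]})

# piplead: next bar's price change relative to the current price, scaled by 10000.
demo_df['piplead'] = (10000.0 * (demo_df.cp.shift(-1) - demo_df.cp) / demo_df.cp).fillna(0)

# slope2: change of the 2-bar moving average from the previous bar, same scaling.
mvgavg_sr = demo_df.cp.rolling(window=2).mean()
demo_df['slope2'] = 10000.0 * (mvgavg_sr - mvgavg_sr.shift(1)) / mvgavg_sr

print(demo_df.round(2))
```

Two things stand out. The last piplead is 0 only because fillna(0) fills it in; there is no next bar to compare with. The first two slope2 values are NaN because a window of 2 needs two prices for the first average and one more bar for the difference. In general, slope_i is NaN for the first slope_i rows.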

At this point I had data which Logistic Regression could learn from.
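
One way the features might be prepared for a classifier is sketched below. It rests on assumptions: the in-memory CSV is a made-up stand-in for fx3/feat.csv with only two slope columns, and the label rule (1 when piplead is positive, else 0) is my assumption, not something the lab specifies.

```python
import io
import pandas as pd

# Made-up stand-in for fx3/feat.csv; empty fields are the NaN warm-up rows.
feat_csv = io.StringIO(
    "ts,cp,piplead,slope2,slope3\n"
    "1472688000,1.1000,1.8182,,\n"
    "1472688300,1.1002,-0.9089,,\n"
    "1472688600,1.1001,2.7270,0.4545,\n"
    "1472688900,1.1004,-0.9088,1.8178,1.2120\n"
    "1472689200,1.1003,2.7265,0.9088,1.5148\n"
)
feat_df = pd.read_csv(feat_csv)

# Drop warm-up rows where a moving-average slope is not yet defined.
clean_df = feat_df.dropna()

# Features: every slope column. Assumed label: 1 when the next bar moved up, else 0.
x_df = clean_df[[c for c in clean_df.columns if c.startswith('slope')]]
y_sr = (clean_df.piplead > 0).astype(int)
print(x_df.shape, y_sr.tolist())
```

The dropna() call matters because the first rows of feat.csv have empty slope fields, and most estimators reject missing values.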


