Class03 Answer:

Write/Run a script which combines:

If I have Postgres installed, this is an easy lab.

If I don't have Postgres installed, I install it with the help from the instructions at the URL listed below:

https://github.com/danbikle/madlib_demos/blob/master/madlib_installation_steps.txt

Another way to get a copy of Postgres running on my laptop is to use this VirtualBox instance:

madlib2016a.ova

After I installed Postgres, using the above instructions, I wrote this script:


# pdpost10.py

# This script should copy gspc prices from web into a DataFrame.
# Next it copies the prices from DataFrame into Postgres table.
# Finally, copies prices from the table into 2nd DataFrame.
# Script assumes Postgres has role named madlib with password named madlib.
# and database named madlib.

import pandas as pd

prices_df = pd.read_csv('http://tkrprice.herokuapp.com/static/gspc.csv')
prices_df.columns = ['cdate','openp','highp','lowp','closep','adjp','volume']

# Ref:
# http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_sql.html
# http://docs.sqlalchemy.org/en/latest/core/engines.html#postgresql

# I should use sqlalchemy to connect to madlib role in Postgres:
from sqlalchemy import create_engine
engine = create_engine('postgresql://madlib:madlib@127.0.0.1:5432/madlib')
# I assume madlib role has password 'madlib' and database named madlib.

# I should write to a table named prices:
prices_df.to_sql('prices',engine, index=False, if_exists='replace')

# I should read from a table named prices:
sql_s      = 'select * from prices order by cdate;'
prices2_df = pd.read_sql(sql_s,engine,parse_dates=['cdate'])
print(prices2_df.tail())

'bye'

When I ran the above script I saw this:


dan@pavlap:~/ml4us/public $ 
dan@pavlap:~/ml4us/public $ python pdpost10.py 
           cdate        openp        highp         lowp       closep  \
16772 2016-08-29  2170.189941  2183.479980  2170.189941  2180.379883   
16773 2016-08-30  2179.449951  2182.270020  2170.409912  2176.120117   
16774 2016-08-31  2173.560059  2173.790039  2161.350098  2170.949951   
16775 2016-09-01  2171.330078  2173.560059  2157.090088  2170.860107   
16776 2016-09-02  2177.489990  2184.870117  2173.590088  2179.979980   

           volume         adjp  
16772  2654780000  2180.379883  
16773  3006800000  2176.120117  
16774  3766390000  2170.949951  
16775  3392120000  2170.860107  
16776  3091120000  2179.979980  
dan@pavlap:~/ml4us/public $ 
dan@pavlap:~/ml4us/public $ 
dan@pavlap:~/ml4us/public $ 

The above script does these tasks:

Class03 Lab


ml4.us About Blog Contact Class01 Class02 Class03 Class04 Class05 Class06 Class07 Class08 Class09 Class10 dan101 Forum Google Hangout Vboxen