Class03 Answer:

Write Pandas syntax which mimics this SQL:

SELECT * FROM prices WHERE cdate = (SELECT MAX(cdate) FROM prices);
SELECT cdate,closep FROM prices WHERE cdate = (SELECT MAX(cdate)-1 FROM prices);
SELECT cdate,closep FROM prices WHERE cdate > (SELECT MAX(cdate)-10 FROM prices);

I ran the above SQL and saw this:


ann@ub16feb:~/ml4us/public/class03demos$ psql.bash
psql (9.5.7)
Type "help" for help.

ann=# SELECT * FROM prices WHERE cdate = (SELECT MAX(cdate) FROM prices);
   cdate    |   openp    |   highp    |    lowp    |   closep   |    adjp    | volume  
------------+------------+------------+------------+------------+------------+---------
 2017-06-15 | 153.289993 | 154.690002 | 153.289993 | 154.220001 | 154.220001 | 4651700
(1 row)

ann=# SELECT cdate,closep FROM prices WHERE cdate = (SELECT MAX(cdate)-1 FROM prices);
   cdate    |   closep   
------------+------------
 2017-06-14 | 153.809998
(1 row)

ann=# 
ann=# SELECT cdate,closep FROM prices WHERE cdate > (SELECT MAX(cdate)-10 FROM prices);
   cdate    |   closep   
------------+------------
 2017-06-06 | 152.369995
 2017-06-07 | 150.979996
 2017-06-08 | 152.100006
 2017-06-09 | 154.100006
 2017-06-12 | 155.179993
 2017-06-13 | 154.250000
 2017-06-14 | 153.809998
 2017-06-15 | 154.220001
(8 rows)

ann=# 
ann=#

I did this lab by writing the script listed below:


"""
class03pd13.py

This script should mimic the SQL listed below:

SELECT * FROM prices WHERE cdate = (SELECT MAX(cdate) FROM prices);              
SELECT cdate,closep FROM prices WHERE cdate = (SELECT MAX(cdate)-1 FROM prices);
SELECT cdate,closep FROM prices WHERE cdate > (SELECT MAX(cdate)-10 FROM prices);
"""

import pandas as pd
import numpy  as np
from datetime import datetime,timedelta

# SQL lacks the ability to get data from the web.
# Pandas can get data from the web:

prices_df         = pd.read_csv('http://ml4.us/csv/ibm.csv')
prices_df.columns = ['cdate_s', 'openp', 'highp', 'lowp', 'closep', 'adjp', 'volume']

# Mimic
# SELECT * FROM prices WHERE cdate = (SELECT MAX(cdate) FROM prices);

prices_df['cdate'] = pd.to_datetime(prices_df.cdate_s)
max_sr = (prices_df.cdate == prices_df.cdate.max())
print(prices_df[max_sr])

# Mimic
# SELECT cdate,closep FROM prices WHERE cdate = (SELECT MAX(cdate)-1 FROM prices);

max_1_sr = (prices_df.cdate == (prices_df.cdate.max() - timedelta(days=1)))
myrow    = prices_df[['cdate','closep']][max_1_sr]
print(myrow)

# Mimic
# SELECT cdate,closep FROM prices WHERE cdate > (SELECT MAX(cdate)-10 FROM prices);

max_10_sr = (prices_df.cdate > (prices_df.cdate.max() - timedelta(days=10)))
myrows    = prices_df[['cdate','closep']][max_10_sr]
print(myrows)

'bye'

I saw this:


dan@h79:~/ml4/public/class03demos $ python class03pd13.py 
          cdate_s       openp       highp        lowp      closep        adjp  \
13959  2017-06-15  153.289993  154.690002  153.289993  154.220001  154.220001   

        volume      cdate  
13959  4651700 2017-06-15  
           cdate      closep
13958 2017-06-14  153.809998
           cdate      closep
13952 2017-06-06  152.369995
13953 2017-06-07  150.979996
13954 2017-06-08  152.100006
13955 2017-06-09  154.100006
13956 2017-06-12  155.179993
13957 2017-06-13  154.250000
13958 2017-06-14  153.809998
13959 2017-06-15  154.220001
dan@h79:~/ml4/public/class03demos $ 
dan@h79:~/ml4/public/class03demos $

Class03 Lab


ml4.us About Blog Contact Class01 Class02 Class03 Class04 Class05 Class06 Class07 Class08 Class09 Class10 dan101 Forum Google Hangout Vboxen