0% found this document useful (0 votes)
11 views

S02-Accessing A SQLite Database With Pandas

Uploaded by

AB
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
11 views

S02-Accessing A SQLite Database With Pandas

Uploaded by

AB
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 4

7/25/24, 4:14 PM s02-Accessing a SQLite Database with Pandas.

ipynb - Colab

keyboard_arrow_down Accessing a SQLite Database with Pandas


Pandas, a great python pakage, provides us a cool funtion of read_sql_query to access a SQLite database, directly.

This notebook is extracted from another tutorial of pySQLiteSWAT. If you are intested in more python-based tools such as Pandas, GeoPandas,
PySAL, folium, etc., you can have a look.

keyboard_arrow_down 1. Load all needed libraries


import pandas as pd
import sqlite3
import matplotlib.pyplot as plt
import matplotlib
%matplotlib inline

keyboard_arrow_down 2. Use pandas to read SQLite database


keyboard_arrow_down 2.1 Create a db connection

con = sqlite3.connect('data\\demo.db3')
cursor = con.cursor()

keyboard_arrow_down 2.2 Check table names

Here we take the RCH table as an example

cursor.execute("SELECT name FROM sqlite_master WHERE type='table';")


print(cursor.fetchall())

[(u'rch',), (u'hru',), (u'sub',), (u'sed',), (u'watershed_daily',), (u'watershed_monthly',), (u'watershed_yearly',), (u'channel_dimensio

keyboard_arrow_down 2.3 Check columns of the RCH table

def table_columns(db, table_name):


curs = db.cursor()
sql = "select * from %s where 1=0;" % table_name
curs.execute(sql)
return [d[0] for d in curs.description]

table_columns(con, 'rch')

['RCH',
'YR',
'MO',
'FLOW_INcms',
'FLOW_OUTcms',
'EVAPcms',
'TLOSScms',
'SED_INtons',
'SED_OUTtons',
'SEDCONCmg_kg',
'ORGN_INkg',
'ORGN_OUTkg',
'ORGP_INkg',
'ORGP_OUTkg',
'NO3_INkg',
'NO3_OUTkg',
'NH4_INkg',
'NH4_OUTkg',
'NO2_INkg',
'NO2_OUTkg',
'MINP_INkg',
'MINP_OUTkg',
'CHLA_INkg',

https://colab.research.google.com/drive/1fdQTJC7_Fm-otRsl2LC_Z9WC1Sk2YirY#printMode=true 1/4
7/25/24, 4:14 PM s02-Accessing a SQLite Database with Pandas.ipynb - Colab
'CHLA_OUTkg',
'CBOD_INkg',
'CBOD_OUTkg',
'DISOX_INkg',
'DISOX_OUTkg',
'SOLPST_INmg',
'SOLPST_OUTmg',
'SORPST_INmg',
'SORPST_OUTmg',
'REACTPSTmg',
'VOLPSTmg',
'SETTLPSTmg',
'RESUSP_PSTmg',
'DIFFUSEPSTmg',
'REACBEDPSTmg',
'BURYPSTmg',
'BED_PSTmg',
'BACTP_OUTct',
'BACTLP_OUTct',
'CMETAL_1kg',
'CMETAL_2kg',
'CMETAL_3kg',
'TOT_Nkg',
'TOT_Pkg',
'NO3ConcMg_l',
'WTMPdegc']

keyboard_arrow_down 2.4 Read specified columns of RCH table inot a pandas DataFrame

e.g., Reach ID, Year, Month and Flow-out

df = pd.read_sql_query("SELECT RCH, YR, MO, FLOW_OUTcms from rch", con)


df = df.set_index(['MO'])
print(df.head(10))

RCH YR FLOW_OUTcms
MO
1 1 1981 146.252487
1 2 1981 96.182854
1 3 1981 11.861368
1 4 1981 49.406513
1 5 1981 272.106018
1 6 1981 486.318573
1 7 1981 23.289026
1 8 1981 214.281433
1 9 1981 193.491211
1 10 1981 53.856945

keyboard_arrow_down 2.5 Finally, do not forget to close the db connection

con.close()

keyboard_arrow_down 3. Further process flow data with pandas


keyboard_arrow_down 3.1 Custom quarters definition

quarters = {1: 'DJF', 2: 'DJF', 3: 'MAM', 4: 'MAM', 5: 'MAM', 6: 'JJA',


7: 'JJA', 8: 'JJA', 9: 'SON', 10: 'SON', 11: 'SON', 12: 'DJF'}

keyboard_arrow_down 3.2 Carry out seasonal statistics for each reach

ssndf = df.groupby(['RCH',quarters])['FLOW_OUTcms'].mean()
ssndf.head(5)

RCH
1 DJF 99.204991
JJA 1405.262298
MAM 559.746932
SON 454.737985
2 DJF 56.328539
Name: FLOW_OUTcms, dtype: float64

https://colab.research.google.com/drive/1fdQTJC7_Fm-otRsl2LC_Z9WC1Sk2YirY#printMode=true 2/4
7/25/24, 4:14 PM s02-Accessing a SQLite Database with Pandas.ipynb - Colab

ssndf = ssndf.reset_index()
ssndf.set_index(['RCH'])
ssndf.head(5)

RCH level_1 FLOW_OUTcms

0 1 DJF 99.204991

1 1 JJA 1405.262298

2 1 MAM 559.746932

3 1 SON 454.737985

4 2 DJF 56.328539

ssndf = ssndf.rename(index=str, columns={"level_1":"SSN"})


ssndf.head(5)

RCH SSN FLOW_OUTcms

0 1 DJF 99.204991

1 1 JJA 1405.262298

2 1 MAM 559.746932

3 1 SON 454.737985

4 2 DJF 56.328539

pivoted = ssndf.pivot(index='RCH', columns='SSN', values='FLOW_OUTcms')


pivoted.head()

SSN DJF JJA MAM SON

RCH

1 99.204991 1405.262298 559.746932 454.737985

2 56.328539 773.664280 203.242828 297.934326

3 32.838328 166.310382 379.771358 27.425488

4 28.384519 619.184517 278.996801 155.766145

5 336.824441 1635.518864 1215.423962 499.265579

keyboard_arrow_down 4. Visualize
Set some parameters to make figure pretty

# Plot size to 15" x 7"


matplotlib.rc('figure', figsize = (15, 7))
# Font size to 14
matplotlib.rc('font', size = 14)
# Display top and right frame lines
matplotlib.rc('axes.spines', top = True, right = True)
# Remove grid lines
matplotlib.rc('axes', grid = False)
# Set backgound color to white
matplotlib.rc('axes', facecolor = 'white')

pivoted.plot(kind='bar', title='Seasonal Mean Discharge from 1981 to 2010 ($m^3/s$)')

https://colab.research.google.com/drive/1fdQTJC7_Fm-otRsl2LC_Z9WC1Sk2YirY#printMode=true 3/4
7/25/24, 4:14 PM s02-Accessing a SQLite Database with Pandas.ipynb - Colab

<matplotlib.axes._subplots.AxesSubplot at 0xbed9128>

keyboard_arrow_down References
Fernando Pérez and Brian E. Granger. IPython: A System for Interactive Scientific Computing, Computing in Science & Engineering, 9, 21-29
(2007), DOI:10.1109/MCSE.2007.53

John D. Hunter. Matplotlib: A 2D Graphics Environment, Computing in Science & Engineering, 9, 90-95 (2007), DOI:10.1109/MCSE.2007.55

Wes McKinney. Data Structures for Statistical Computing in Python, Proceedings of the 9th Python in Science Conference, 51-56 (2010)

https://www.sqlite.org/lang.html

Start coding or generate with AI.

https://colab.research.google.com/drive/1fdQTJC7_Fm-otRsl2LC_Z9WC1Sk2YirY#printMode=true 4/4

You might also like