QL-Assignment-3-Plaksha2023.ipynb - Colaboratory
Submission details:
Please submit this as a Jupyter Notebook and a PDF of your results (both should show output). Also push your solutions to GitHub.
For the submission, create a local database with sqlite3 or SQLAlchemy in a Jupyter notebook and run the queries either with a cursor
object (and then print the results) or with pandas pd.read_sql_query().
When completing this homework you can experiment with SQL commands by utilizing this great online editor:
https://www.w3schools.com/sql/trysql.asp?filename=trysql_select_all
If you want, you can drop the tables you create by running DROP TABLE [table-name]; (or just keep them).
Exercises:
First create a table called students. It has the columns 'student_id', 'name', 'major', 'gpa' and 'enrollment_date'. We will use a new form of CREATE
TABLE expression to produce this table.
Note that you can improve on this and are welcome to do so, e.g. by specifying a PRIMARY KEY and a FOREIGN KEY in Q2 :)
CREATE TABLE students AS
SELECT 1 AS student_id, "John" AS name, "Computer Science" AS major, 3.5 AS gpa, "01-01-2022" AS enrollment_date UNION
SELECT 2, "Jane", "Physics", 3.8, "01-02-2022" UNION
SELECT 3, "Bob", "Engineering", 3.0, "01-03-2022" UNION
SELECT 4, "Samantha", "Physics", 3.9, "01-04-2022" UNION
SELECT 5, "James", "Engineering", 3.7, "01-05-2022" UNION
SELECT 6, "Emily", "Computer Science", 3.6, "01-06-2022" UNION
SELECT 7, "Michael", "Computer Science", 3.2, "01-07-2022" UNION
SELECT 8, "Jessica", "Engineering", 3.8, "01-08-2022" UNION
SELECT 9, "Jacob", "Physics", 3.4, "01-09-2022" UNION
SELECT 10, "Ashley", "Physics", 3.9, "01-10-2022";
Q2 Joins
Create a new table called courses, which indicates the courses taken by the students.
CREATE TABLE courses AS
SELECT 1 AS course_id, "Python programming" AS course_name, 1 AS student_id, "A" AS grade UNION
SELECT 2, "Data Structures", 2, "B" UNION
SELECT 3, "Database Systems", 3, "B" UNION
SELECT 1, "Python programming", 4, "A" UNION
SELECT 4, "Quantum Mechanics", 5, "C" UNION
https://colab.research.google.com/drive/1d6BE0pvJ9BUwOq5a-QvJeipaBPO_eapJ#scrollTo=C8obi6ZgjllM&printMode=true 1/5
04/02/2023, 01:35 PragyaWasan_SQL-Assignment-3-Plaksha2023.ipynb - Colaboratory
SELECT 1, "Python programming", 6, "F" UNION
SELECT 2, "Data Structures", 7, "C" UNION
SELECT 3, "Database Systems", 8, "A" UNION
SELECT 4, "Quantum Mechanics", 9, "A" UNION
SELECT 2, "Data Structures", 10, "F";
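The note about PRIMARY KEY and FOREIGN KEY can be sketched as follows. This is only one possible schema, not part of the assignment text: the column types, the composite key on (course_id, student_id), and the in-memory database are assumptions made for illustration. SQLite enforces foreign keys only after PRAGMA foreign_keys is switched on for the connection.

```python
import sqlite3

# In-memory database so the sketch does not touch any file on disk.
conn = sqlite3.connect(":memory:")
# SQLite ignores FOREIGN KEY constraints unless this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON;")

# Assumed schema: types and key choices are illustrative only.
conn.executescript("""
CREATE TABLE students (
    student_id      INTEGER PRIMARY KEY,
    name            TEXT NOT NULL,
    major           TEXT NOT NULL,
    gpa             REAL,
    enrollment_date TEXT
);
CREATE TABLE courses (
    course_id   INTEGER NOT NULL,
    course_name TEXT NOT NULL,
    student_id  INTEGER NOT NULL REFERENCES students(student_id),
    grade       TEXT,
    PRIMARY KEY (course_id, student_id)
);
""")

conn.execute("INSERT INTO students VALUES (1, 'John', 'Computer Science', 3.5, '01-01-2022');")
conn.execute("INSERT INTO courses VALUES (1, 'Python programming', 1, 'A');")

# A course row that points at a non-existent student is rejected.
try:
    conn.execute("INSERT INTO courses VALUES (2, 'Data Structures', 99, 'B');")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

print(fk_rejected)  # True
```

With explicit tables like these, the rows would be added with INSERT statements instead of the CREATE TABLE ... AS SELECT form used in the assignment.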
Your solution
import sqlite3
# open a connection to a db file stored locally on disk
# if the file doesn't exist it is created
connection = sqlite3.connect('company.db')
# In order to run SQL commands with
# sqlite3 we must create a cursor object
# that traverses the database
cursor = connection.cursor()
# to run SQL commands, execute them with the cursor
# Drop the tables from any earlier run so that we start with an empty db
cursor.execute("DROP TABLE IF EXISTS students;")
cursor.execute("DROP TABLE IF EXISTS courses;")
<sqlite3.Cursor at 0x7fb0e3e0c730>
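To actually check that the database is empty, one option is to list the tables recorded in SQLite's built-in sqlite_master catalogue. The helper name list_tables and the in-memory connection below are assumptions for illustration; in the notebook the same query could be run on the existing cursor.

```python
import sqlite3

# In-memory stand-in for the db file used in the notebook.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

def list_tables(cursor):
    """Return the names of all tables in the connected database."""
    cursor.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name;"
    )
    return [name for (name,) in cursor.fetchall()]

tables_before = list_tables(cur)
cur.execute("CREATE TABLE students (student_id INTEGER, name TEXT);")
tables_after = list_tables(cur)

print(tables_before)  # []
print(tables_after)   # ['students']
```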
Question 1
# We can define long SQL commands within three quotes
sql_command = """
CREATE TABLE students AS
SELECT 1 AS student_id, "John" AS name, "Computer Science" AS major, 3.5 AS gpa, "01-01-2022" AS enrollment_date UNION
SELECT 2, "Jane", "Physics", 3.8, "01-02-2022" UNION
SELECT 3, "Bob", "Engineering", 3.0, "01-03-2022" UNION
SELECT 4, "Samantha", "Physics", 3.9, "01-04-2022" UNION
SELECT 5, "James", "Engineering", 3.7, "01-05-2022" UNION
SELECT 6, "Emily", "Computer Science", 3.6, "01-06-2022" UNION
SELECT 7, "Michael", "Computer Science", 3.2, "01-07-2022" UNION
SELECT 8, "Jessica", "Engineering", 3.8, "01-08-2022" UNION
SELECT 9, "Jacob", "Physics", 3.4, "01-09-2022" UNION
SELECT 10, "Ashley", "Physics", 3.9, "01-10-2022";
"""
# In order to run SQL commands on the database file
# we have to execute them with the cursor
cursor.execute(sql_command)
<sqlite3.Cursor at 0x7fb0e3e0c730>
a = cursor.execute('SELECT * FROM students;')
# fetch the results; fetchall() returns a list of row tuples
for row in a.fetchall():
    print(row)
b = cursor.execute('SELECT * FROM students WHERE major= "Computer Science";')
for row in b.fetchall():
    print(row)
3. SELECT all unique majors (use SELECT DISTINCT) and order them by name, descending order (i.e. Physics first).
c = cursor.execute('SELECT DISTINCT major FROM students ORDER BY major DESC;')
for row in c.fetchall():
    print(row)
('Physics',)
('Engineering',)
('Computer Science',)
4. SELECT all students that have an 'e' in their name and order them by gpa in ascending order.
d = cursor.execute('SELECT name FROM students WHERE name LIKE "%e%" ORDER BY gpa;')
for row in d.fetchall():
    print(row)
('Michael',)
('Emily',)
('James',)
('Jane',)
('Jessica',)
('Ashley',)
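Two details of this query are worth a small sketch. 'Emily' appears in the output only because SQLite's LIKE is case-insensitive for ASCII letters by default (her name has an uppercase 'E' but no lowercase 'e'), and students with equal gpa (Jane and Jessica) come back in no guaranteed order unless a tie-breaker is added. The reduced table and the extra ORDER BY column below are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (student_id INTEGER, name TEXT, gpa REAL);")
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?);",
    [(2, "Jane", 3.8), (3, "Bob", 3.0), (6, "Emily", 3.6), (8, "Jessica", 3.8)],
)

# LIKE matches 'e' and 'E'; the second sort key makes ties deterministic.
like_names = [n for (n,) in conn.execute(
    "SELECT name FROM students WHERE name LIKE '%e%' ORDER BY gpa ASC, name ASC;"
)]

# GLOB is case-sensitive, so 'Emily' no longer matches.
glob_names = [n for (n,) in conn.execute(
    "SELECT name FROM students WHERE name GLOB '*e*' ORDER BY gpa ASC, name ASC;"
)]

print(like_names)  # ['Emily', 'Jane', 'Jessica']
print(glob_names)  # ['Jane', 'Jessica']
```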
Question 2
sql_command2 = """
CREATE TABLE courses AS
SELECT 1 AS course_id, "Python programming" AS course_name, 1 AS student_id, "A" AS grade UNION
SELECT 2, "Data Structures", 2, "B" UNION
SELECT 3, "Database Systems", 3, "B" UNION
SELECT 1, "Python programming", 4, "A" UNION
SELECT 4, "Quantum Mechanics", 5, "C" UNION
SELECT 1, "Python programming", 6, "F" UNION
SELECT 2, "Data Structures", 7, "C" UNION
SELECT 3, "Database Systems", 8, "A" UNION
SELECT 4, "Quantum Mechanics", 9, "A" UNION
SELECT 2, "Data Structures", 10, "F";
"""
cursor.execute(sql_command2)
<sqlite3.Cursor at 0x7fb0e3e0c730>
e = cursor.execute('SELECT COUNT(DISTINCT course_name) FROM courses;')
for row in e.fetchall():
    print(row)
(4,)
2. JOIN the tables students and courses and COUNT the number of students with the major Computer Science taking the
course Python programming.
f = cursor.execute('SELECT COUNT(DISTINCT name) FROM students AS s INNER JOIN courses AS c ON s.student_id=c.student_id WHERE
for row in f.fetchall():
    print(row)
(2,)
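The query line above is cut off after WHERE in this printout, so the original filter is not visible. As a hedged sketch only, one filter that answers the question as stated would restrict on both the major and the course name. The reduced tables below and the choice to count distinct student_id values (instead of names, which could repeat) are assumptions, not the author's code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (student_id INTEGER, name TEXT, major TEXT);")
conn.execute("CREATE TABLE courses (course_id INTEGER, course_name TEXT, student_id INTEGER);")
conn.executemany("INSERT INTO students VALUES (?, ?, ?);", [
    (1, "John", "Computer Science"),
    (4, "Samantha", "Physics"),
    (6, "Emily", "Computer Science"),
    (7, "Michael", "Computer Science"),
])
conn.executemany("INSERT INTO courses VALUES (?, ?, ?);", [
    (1, "Python programming", 1),
    (1, "Python programming", 4),
    (1, "Python programming", 6),
    (2, "Data Structures", 7),
])

# Assumed WHERE clause: filter on the major and on the course name.
(cs_python_count,) = conn.execute("""
    SELECT COUNT(DISTINCT s.student_id)
    FROM students AS s
    INNER JOIN courses AS c ON s.student_id = c.student_id
    WHERE s.major = 'Computer Science'
      AND c.course_name = 'Python programming';
""").fetchone()

print(cs_python_count)  # 2
```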
3. JOIN the tables students and courses and select the students who have grades higher than "C", only show their name, major,
gpa, course_name and grade.
g = cursor.execute('SELECT s.name, s.major, s.gpa, c.course_name, c.grade FROM students AS s INNER JOIN courses AS c ON s.stud
for row in g.fetchall():
    print(row)
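This query line is also truncated in the printout, so the join condition and filter are not fully visible. A possible version, under the assumption that "higher than C" means the letter grades A and B, could look like the sketch below. The reduced tables and the ORDER BY are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (student_id INTEGER, name TEXT, major TEXT, gpa REAL);")
conn.execute("CREATE TABLE courses (course_id INTEGER, course_name TEXT, student_id INTEGER, grade TEXT);")
conn.executemany("INSERT INTO students VALUES (?, ?, ?, ?);", [
    (1, "John", "Computer Science", 3.5),
    (2, "Jane", "Physics", 3.8),
    (5, "James", "Engineering", 3.7),
    (6, "Emily", "Computer Science", 3.6),
])
conn.executemany("INSERT INTO courses VALUES (?, ?, ?, ?);", [
    (1, "Python programming", 1, "A"),
    (2, "Data Structures", 2, "B"),
    (4, "Quantum Mechanics", 5, "C"),
    (1, "Python programming", 6, "F"),
])

# Assumed filter: grades better than C are exactly 'A' and 'B'.
better_than_c = conn.execute("""
    SELECT s.name, s.major, s.gpa, c.course_name, c.grade
    FROM students AS s
    INNER JOIN courses AS c ON s.student_id = c.student_id
    WHERE c.grade IN ('A', 'B')
    ORDER BY s.student_id;
""").fetchall()

for row in better_than_c:
    print(row)
# ('John', 'Computer Science', 3.5, 'Python programming', 'A')
# ('Jane', 'Physics', 3.8, 'Data Structures', 'B')
```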
Question 3
h = cursor.execute('SELECT AVG(gpa) FROM students;')
for row in h.fetchall():
    print(row)
(3.5800000000000005,)
2. SELECT the student with the maximum gpa, display only their student_id, major and gpa
i = cursor.execute('SELECT student_id, major, MAX(gpa) FROM students;')
for row in i.fetchall():
    print(row)
3. SELECT the student with the minimum gpa, display only their student_id, major and gpa
j = cursor.execute('SELECT student_id, major, MIN(gpa) FROM students;')
for row in j.fetchall():
    print(row)
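Selecting plain columns next to MAX() or MIN() without GROUP BY relies on SQLite-specific behaviour and returns a single row, so when two students share the top gpa (here Samantha and Ashley, both 3.9) only one of them shows up. A more portable alternative, sketched here on a reduced three-column copy of the table, compares against a subquery and returns every tied row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (student_id INTEGER, major TEXT, gpa REAL);")
conn.executemany("INSERT INTO students VALUES (?, ?, ?);", [
    (1, "Computer Science", 3.5), (2, "Physics", 3.8),
    (3, "Engineering", 3.0), (4, "Physics", 3.9),
    (5, "Engineering", 3.7), (6, "Computer Science", 3.6),
    (7, "Computer Science", 3.2), (8, "Engineering", 3.8),
    (9, "Physics", 3.4), (10, "Physics", 3.9),
])

# Every student whose gpa equals the overall maximum.
top_students = conn.execute("""
    SELECT student_id, major, gpa FROM students
    WHERE gpa = (SELECT MAX(gpa) FROM students)
    ORDER BY student_id;
""").fetchall()

# Every student whose gpa equals the overall minimum.
bottom_students = conn.execute("""
    SELECT student_id, major, gpa FROM students
    WHERE gpa = (SELECT MIN(gpa) FROM students)
    ORDER BY student_id;
""").fetchall()

print(top_students)     # [(4, 'Physics', 3.9), (10, 'Physics', 3.9)]
print(bottom_students)  # [(3, 'Engineering', 3.0)]
```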
4. SELECT the students with a gpa greater than 3.6 in the majors of "Physics" and "Engineering", display only their student_id,
major and gpa
k = cursor.execute('SELECT student_id, major, gpa FROM students WHERE (gpa>3.6) AND (major="Physics" OR major="Engineering");')
for row in k.fetchall():
    print(row)
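The same filter can also be written with IN and with ? placeholders, so that the threshold and the majors are passed as parameters instead of being typed into the SQL string. This is only an alternative sketch on a reduced copy of the table; the variable names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (student_id INTEGER, major TEXT, gpa REAL);")
conn.executemany("INSERT INTO students VALUES (?, ?, ?);", [
    (1, "Computer Science", 3.5), (2, "Physics", 3.8),
    (3, "Engineering", 3.0), (4, "Physics", 3.9),
    (5, "Engineering", 3.7), (6, "Computer Science", 3.6),
    (7, "Computer Science", 3.2), (8, "Engineering", 3.8),
    (9, "Physics", 3.4), (10, "Physics", 3.9),
])

min_gpa = 3.6
majors = ("Physics", "Engineering")

# Values are bound to the ? placeholders in the order given.
high_gpa = conn.execute(
    "SELECT student_id, major, gpa FROM students "
    "WHERE gpa > ? AND major IN (?, ?) ORDER BY student_id;",
    (min_gpa, *majors),
).fetchall()

for row in high_gpa:
    print(row)
# (2, 'Physics', 3.8)
# (4, 'Physics', 3.9)
# (5, 'Engineering', 3.7)
# (8, 'Engineering', 3.8)
# (10, 'Physics', 3.9)
```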
5. Group the students by their major and retrieve the average grade of each major.
l = cursor.execute('SELECT major, AVG(gpa) FROM students GROUP BY major;')
for row in l.fetchall():
    print(row)
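Because AVG() works on floating point values, the raw averages can print with long tails (as with 3.5800000000000005 earlier). A small variation, sketched below on a reduced copy of the table, rounds the average and also reports how many students are in each group. The column aliases are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (student_id INTEGER, major TEXT, gpa REAL);")
conn.executemany("INSERT INTO students VALUES (?, ?, ?);", [
    (1, "Computer Science", 3.5), (2, "Physics", 3.8),
    (3, "Engineering", 3.0), (4, "Physics", 3.9),
    (5, "Engineering", 3.7), (6, "Computer Science", 3.6),
    (7, "Computer Science", 3.2), (8, "Engineering", 3.8),
    (9, "Physics", 3.4), (10, "Physics", 3.9),
])

# One output row per major: group size and average gpa rounded to 2 decimals.
per_major = conn.execute("""
    SELECT major, COUNT(*) AS n_students, ROUND(AVG(gpa), 2) AS avg_gpa
    FROM students
    GROUP BY major
    ORDER BY major;
""").fetchall()

for row in per_major:
    print(row)
```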
6. SELECT the top 2 students with the highest GPA in each major and order the results by major in ascending order, then by GPA
in descending order
m = cursor.execute('''
WITH subquery AS (SELECT name, major, gpa, row_number() OVER (PARTITION BY major ORDER BY gpa DESC) AS rank FROM students)
SELECT name, major, gpa FROM subquery WHERE rank <= 2 ORDER BY major, gpa DESC;
''')
for row in m.fetchall():
    print(row)
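How ties are handled depends on the window function. ROW_NUMBER() always gives exactly two rows per major, while DENSE_RANK() keeps everyone who shares one of the two highest gpa values. The sketch below compares both on the Physics students only; the aliases rn and dr and the student_id tie-breaker are assumptions, and window functions need SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (student_id INTEGER, name TEXT, major TEXT, gpa REAL);")
conn.executemany("INSERT INTO students VALUES (?, ?, ?, ?);", [
    (2, "Jane", "Physics", 3.8),
    (4, "Samantha", "Physics", 3.9),
    (9, "Jacob", "Physics", 3.4),
    (10, "Ashley", "Physics", 3.9),
])

ranked_sql = """
WITH ranked AS (
    SELECT student_id, name, major, gpa,
           ROW_NUMBER() OVER (PARTITION BY major ORDER BY gpa DESC, student_id) AS rn,
           DENSE_RANK() OVER (PARTITION BY major ORDER BY gpa DESC) AS dr
    FROM students
)
SELECT name FROM ranked WHERE {column} <= 2 ORDER BY major, gpa DESC, student_id;
"""

# Exactly two rows per major, ties broken by student_id.
top2_row_number = [n for (n,) in conn.execute(ranked_sql.format(column="rn"))]
# Everyone holding one of the two highest distinct gpa values.
top2_dense_rank = [n for (n,) in conn.execute(ranked_sql.format(column="dr"))]

print(top2_row_number)  # ['Samantha', 'Ashley']
print(top2_dense_rank)  # ['Samantha', 'Ashley', 'Jane']
```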