The Case Aggregate Functions Defining Groups HAVING Clause

Aggregate Queries in SQL

Stéphane Bressan

Requirements

We want to develop an application for managing the data of our online app store. We
would like to store several items of information about our customers such as their first
name, last name, date of birth, e-mail, date and country of registration to our online
sales service and the customer identifier that they have chosen. We also want to
manage the list of our products (games): their name, their version and their price. The
price is fixed for each version of each game. Finally, our customers buy and download
games. So we must remember which version of which game each customer has
downloaded. It is not important to keep the download date for this application.

Entity-relationship Diagram

[Entity-relationship diagram: the entity customers (first_name, last_name, email, dob, since, customerid, country) is linked to the entity games (name, version, price) by the relationship downloads.]

SQLite

We can use the SQLite interactive terminal sqlite3.exe.

sqlite> .open myfile.db
sqlite> .mode column
sqlite> .headers on
sqlite> PRAGMA foreign_keys = ON;
sqlite> .read AppStoreSchema.sql
sqlite> .read AppStoreCustomers.sql
sqlite> .read AppStoreGames.sql
sqlite> .read AppStoreDownloads.sql
...
sqlite> .quit

PostgreSQL

We can use the PostgreSQL interactive terminal psql.

% psql -h localhost -U postgres
Password for user postgres:
psql (9.6.3, server 9.6.4)
Type "help" for help.
postgres=# CREATE DATABASE dedomenology;
CREATE DATABASE
postgres=# \c dedomenology;
psql (9.6.3, server 9.6.4)
You are now connected to database "dedomenology" as user "postgres".
postgres=# \i AppStoreSchema.sql
CREATE TABLE
CREATE TABLE
CREATE TABLE
postgres=# \i AppStoreCustomers.sql
INSERT 0 1
...
postgres=# \i AppStoreGames.sql
INSERT 0 1
...
postgres=# \i AppStoreDownloads.sql
INSERT 0 1
...
postgres=# \q

Complete Schema for the Case

This is the complete schema for our example.

CREATE TABLE IF NOT EXISTS customers (
    first_name VARCHAR(64) NOT NULL,
    last_name VARCHAR(64) NOT NULL,
    email VARCHAR(64) UNIQUE NOT NULL,
    dob DATE NOT NULL,
    since DATE NOT NULL,
    customerid VARCHAR(16) PRIMARY KEY,
    country VARCHAR(16) NOT NULL);

CREATE TABLE IF NOT EXISTS games (
    name VARCHAR(32),
    version CHAR(3),
    price NUMERIC NOT NULL,
    PRIMARY KEY (name, version));

CREATE TABLE downloads (
    customerid VARCHAR(16) REFERENCES customers (customerid)
        ON UPDATE CASCADE ON DELETE CASCADE
        DEFERRABLE INITIALLY DEFERRED,
    name VARCHAR(32),
    version CHAR(3),
    PRIMARY KEY (customerid, name, version),
    FOREIGN KEY (name, version) REFERENCES games (name, version)
        ON UPDATE CASCADE ON DELETE CASCADE
        DEFERRABLE INITIALLY DEFERRED);
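To see what the foreign key clauses of this schema do, we can load it into an in-memory SQLite database from Python. The following is only a sketch: the sample rows (customer 'ann90', game 'Skyline') are made up for illustration, and the column names assume the underscores shown above. It checks that deleting a customer also deletes the corresponding rows of downloads.

```python
import sqlite3

SCHEMA = """
CREATE TABLE customers (
    first_name VARCHAR(64) NOT NULL,
    last_name VARCHAR(64) NOT NULL,
    email VARCHAR(64) UNIQUE NOT NULL,
    dob DATE NOT NULL,
    since DATE NOT NULL,
    customerid VARCHAR(16) PRIMARY KEY,
    country VARCHAR(16) NOT NULL);
CREATE TABLE games (
    name VARCHAR(32),
    version CHAR(3),
    price NUMERIC NOT NULL,
    PRIMARY KEY (name, version));
CREATE TABLE downloads (
    customerid VARCHAR(16) REFERENCES customers (customerid)
        ON UPDATE CASCADE ON DELETE CASCADE
        DEFERRABLE INITIALLY DEFERRED,
    name VARCHAR(32),
    version CHAR(3),
    PRIMARY KEY (customerid, name, version),
    FOREIGN KEY (name, version) REFERENCES games (name, version)
        ON UPDATE CASCADE ON DELETE CASCADE
        DEFERRABLE INITIALLY DEFERRED);
"""

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(SCHEMA)

# Made-up sample rows, for illustration only.
conn.execute(
    "INSERT INTO customers VALUES "
    "('Ann', 'Tan', 'ann@example.com', '1990-05-01', '2016-01-15', 'ann90', 'Singapore')"
)
conn.execute("INSERT INTO games VALUES ('Skyline', '1.0', 2.99)")
conn.execute("INSERT INTO downloads VALUES ('ann90', 'Skyline', '1.0')")
conn.commit()

downloads_before = conn.execute("SELECT COUNT(*) FROM downloads").fetchone()[0]

# ON DELETE CASCADE: removing the customer removes the downloads that reference it.
conn.execute("DELETE FROM customers WHERE customerid = 'ann90'")
conn.commit()
downloads_after = conn.execute("SELECT COUNT(*) FROM downloads").fetchone()[0]

print(downloads_before, downloads_after)
```

Without the PRAGMA line, SQLite would accept the same statements but leave the row of downloads in place.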

Aggregate Functions

The values of a column can be aggregated using aggregate functions such as COUNT(),
SUM(), MAX(), MIN(), AVG(), STDDEV(), etc.
PostgreSQL also allows user-defined aggregate functions.

SELECT COUNT(*)
FROM customers;

count
1000

The above query prints the number of rows in the table customers.

Aggregate Functions

SELECT COUNT(c.customerid)
FROM customers c;

count
1000

COUNT() applied to a column counts the rows in which that column is not NULL. The above query prints the number of rows in the table customers because customerid is the primary key and is therefore never NULL.

SELECT COUNT(c.country)
FROM customers c;

count
1000

The above query also prints the number of rows in the table customers because country is declared NOT NULL.
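The difference between COUNT(*) and COUNT() on a column only shows when the column can be NULL. The sketch below uses a hypothetical table visitors (not part of our schema) whose column country is nullable, and runs both counts in an in-memory SQLite database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: unlike customers.country, this country column may be NULL.
conn.execute("CREATE TABLE visitors (visitorid TEXT PRIMARY KEY, country TEXT)")
conn.executemany(
    "INSERT INTO visitors VALUES (?, ?)",
    [("v1", "Singapore"), ("v2", None), ("v3", "Vietnam"), ("v4", None)],
)

# COUNT(*) counts rows; COUNT(v.country) skips the rows where country is NULL.
count_rows, count_country = conn.execute(
    "SELECT COUNT(*), COUNT(v.country) FROM visitors v"
).fetchone()

print(count_rows, count_country)
```

Here COUNT(*) is 4 and COUNT(v.country) is 2, since two of the four made-up rows have no country.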

Aggregate Functions

SELECT COUNT(c.country)
FROM customers c;

SELECT COUNT(ALL c.country)
FROM customers c;

count
1000

The two queries above are the same.
The keyword ALL is generally omitted as it is the default.

Aggregate Functions

SELECT COUNT(DISTINCT c.country)
FROM customers c;

count
5

We need to add the keyword DISTINCT inside the COUNT() aggregate function if we
want to count the number of different countries in the column country of the table
customers.
DISTINCT can be used in other aggregate functions similarly.
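As an illustration of DISTINCT inside other aggregate functions, the sketch below compares COUNT() and SUM() with and without DISTINCT. The four games and their integer prices are made up so that the arithmetic is easy to check by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE games (name TEXT, version TEXT, price NUMERIC NOT NULL, "
    "PRIMARY KEY (name, version))"
)
# Made-up rows: two of the four versions share the price 2.
conn.executemany(
    "INSERT INTO games VALUES (?, ?, ?)",
    [("Skyline", "1.0", 2), ("Skyline", "1.1", 2), ("Maze", "1.0", 5), ("Maze", "2.0", 12)],
)

# DISTINCT removes duplicate prices before the aggregate is computed.
count_all, count_distinct, sum_all, sum_distinct = conn.execute("""
    SELECT COUNT(g.price), COUNT(DISTINCT g.price),
           SUM(g.price), SUM(DISTINCT g.price)
    FROM games g
""").fetchone()

print(count_all, count_distinct, sum_all, sum_distinct)
```

The duplicate price 2 is counted and summed once with DISTINCT, so the counts are 4 and 3 and the sums are 21 and 19.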

Aggregate Functions

The following query finds the maximum, minimum, average and standard deviation of the
prices of our games.
It uses the arithmetic function TRUNC() to truncate the average and the standard
deviation to two decimal places.

SELECT MAX(g.price),
       MIN(g.price),
       TRUNC(AVG(g.price), 2) AS ave,
       TRUNC(STDDEV(g.price), 2) AS std
FROM games g;

max   min    ave    std
12    1.99   6.97   3.96
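Core SQLite has no built-in STDDEV(), so the query above cannot be run there as is. One possible workaround, sketched below, is to register a user-defined aggregate through the Python sqlite3 module. The class SampleStddev is our own illustration and computes the sample standard deviation, which is what STDDEV() denotes in PostgreSQL. Truncation is done in Python because TRUNC() is not available in every SQLite build. The prices are made up.

```python
import math
import sqlite3


class SampleStddev:
    """User-defined aggregate: sample standard deviation of the non-NULL values."""

    def __init__(self):
        self.values = []

    def step(self, value):
        # Called once per row of the group; NULLs are ignored.
        if value is not None:
            self.values.append(value)

    def finalize(self):
        # Called once at the end of the group.
        n = len(self.values)
        if n < 2:
            return None
        mean = sum(self.values) / n
        return math.sqrt(sum((v - mean) ** 2 for v in self.values) / (n - 1))


def trunc2(x):
    """Truncate (not round) to two decimal places."""
    return math.trunc(x * 100) / 100


conn = sqlite3.connect(":memory:")
conn.create_aggregate("STDDEV", 1, SampleStddev)
conn.execute(
    "CREATE TABLE games (name TEXT, version TEXT, price NUMERIC NOT NULL, "
    "PRIMARY KEY (name, version))"
)
# Made-up prices: 2, 4, 4, 4, 5, 5, 7, 9 (mean 5).
conn.executemany(
    "INSERT INTO games VALUES (?, ?, ?)",
    [("Skyline", "1.0", 2), ("Skyline", "1.1", 4), ("Maze", "1.0", 4),
     ("Maze", "2.0", 4), ("Maze", "3.0", 5), ("Orbit", "1.0", 5),
     ("Orbit", "1.1", 7), ("Orbit", "2.0", 9)],
)

max_price, min_price, avg_price, std_price = conn.execute(
    "SELECT MAX(g.price), MIN(g.price), AVG(g.price), STDDEV(g.price) FROM games g"
).fetchone()
ave, std = trunc2(avg_price), trunc2(std_price)

print(max_price, min_price, ave, std)
```

Keeping every value in a list is the simplest design. A running sum and sum of squares would use less memory on large groups.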

Defining Groups

The GROUP BY clause creates groups of records that have the same values for the
specified fields before computing the aggregate functions.

first_name  last_name  email               ···  country
"Deborah"   "Ruiz"     "[email protected]"  ···  "Singapore"
"Tammy"     "Lee"      "[email protected]"  ···  "Singapore"
···
"Raymond"   "Tan"      "[email protected]"  ···  "Thailand"
"Jean"      "Ling"     "[email protected]"  ···  "Thailand"
···
"Russel"    "Hakim"    "[email protected]"  ···  "Vietnam"
"Gerald"    "Ford"     "[email protected]"  ···  "Vietnam"
···
"Marie"     "Flores"   "[email protected]"  ···  "Indonesia"
"Cheryl"    "Reyes"    "[email protected]"  ···  "Indonesia"
···
"Cynthia"   "Pierce"   "[email protected]"  ···  "Malaysia"
"Nicole"    "Lee"      "[email protected]"  ···  "Malaysia"
···

GROUP BY c.country;

Defining Groups

The aggregate functions are calculated for each group.

SELECT c.country, COUNT(*)
FROM customers c
GROUP BY c.country;

country      count
"Vietnam"    98
"Singapore"  391
"Thailand"   100
"Indonesia"  243
"Malaysia"   168

SELECT COUNT(*)
FROM customers c;

If no GROUP BY clause is specified but an aggregate function is used, all the rows
form a single group.
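The two situations above can be reproduced on a small scale. The sketch below uses six made-up customers in an in-memory SQLite database. An ORDER BY clause is added only to make the order of the output predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customerid TEXT PRIMARY KEY, country TEXT NOT NULL)")
# Made-up rows: 3 customers in Singapore, 2 in Malaysia, 1 in Vietnam.
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("c1", "Singapore"), ("c2", "Singapore"), ("c3", "Singapore"),
     ("c4", "Vietnam"), ("c5", "Malaysia"), ("c6", "Malaysia")],
)

# One group per country: one result row per group.
per_country = conn.execute("""
    SELECT c.country, COUNT(*)
    FROM customers c
    GROUP BY c.country
    ORDER BY c.country
""").fetchall()

# No GROUP BY: the whole table is a single group, hence a single result row.
overall = conn.execute("SELECT COUNT(*) FROM customers c").fetchall()

print(per_country)
print(overall)
```

The counts per country add up to the single count computed without GROUP BY.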

Defining Groups

Groups are formed after the rows have been filtered by the WHERE clause.

SELECT c.country, COUNT(*)
FROM customers c
WHERE c.dob >= '2000-01-01'
GROUP BY c.country;

country      count
"Vietnam"    4
"Singapore"  25
"Thailand"   5
"Indonesia"  15
"Malaysia"   12
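One consequence of filtering before grouping is that a country whose customers are all removed by the WHERE clause forms no group at all, rather than a group with count 0. The sketch below shows this with five made-up customers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (customerid TEXT PRIMARY KEY, dob DATE NOT NULL, "
    "country TEXT NOT NULL)"
)
# Made-up rows: the only Malaysian customer was born before 2000.
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [("c1", "2001-03-04", "Singapore"), ("c2", "1995-08-17", "Singapore"),
     ("c3", "2000-01-01", "Vietnam"), ("c4", "1999-12-31", "Vietnam"),
     ("c5", "1980-06-30", "Malaysia")],
)

# Rows are filtered first, then the surviving rows are grouped.
young_per_country = conn.execute("""
    SELECT c.country, COUNT(*)
    FROM customers c
    WHERE c.dob >= '2000-01-01'
    GROUP BY c.country
    ORDER BY c.country
""").fetchall()

print(young_per_country)
```

Malaysia does not appear in the result because none of its rows survives the filter.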

Defining Groups

The following query finds the total spending for each customer.

SELECT c.customerid, c.first_name, c.last_name, SUM(g.price)
FROM customers c, downloads d, games g
WHERE c.customerid = d.customerid
AND d.name = g.name AND d.version = g.version
GROUP BY c.customerid, c.first_name, c.last_name;

first_name  last_name   sum
"Adam"      "Howell"    28.98
"Jimmy"     "Gibson"    20.99
"Margaret"  "Mitchell"  13.99
···

Note that we include the columns first_name and last_name in the GROUP BY
clause because we want to print them.
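The sketch below runs the same query on a tiny made-up instance (three customers, three game versions with integer prices). It also shows that a customer without any download does not appear in the result, because the join produces no row for that customer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customerid TEXT PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE TABLE games (name TEXT, version TEXT, price NUMERIC NOT NULL,
                        PRIMARY KEY (name, version));
    CREATE TABLE downloads (customerid TEXT, name TEXT, version TEXT,
                            PRIMARY KEY (customerid, name, version));
""")
# Made-up rows, for illustration only; 'cat' has downloaded nothing.
conn.executemany("INSERT INTO customers VALUES (?, ?, ?)",
                 [("ann", "Ann", "Tan"), ("ben", "Ben", "Lim"), ("cat", "Cat", "Ng")])
conn.executemany("INSERT INTO games VALUES (?, ?, ?)",
                 [("Skyline", "1.0", 3), ("Skyline", "2.0", 5), ("Maze", "1.0", 7)])
conn.executemany("INSERT INTO downloads VALUES (?, ?, ?)",
                 [("ann", "Skyline", "1.0"), ("ann", "Maze", "1.0"), ("ben", "Skyline", "2.0")])

spending = conn.execute("""
    SELECT c.customerid, c.first_name, c.last_name, SUM(g.price)
    FROM customers c, downloads d, games g
    WHERE c.customerid = d.customerid
      AND d.name = g.name AND d.version = g.version
    GROUP BY c.customerid, c.first_name, c.last_name
    ORDER BY c.customerid
""").fetchall()

print(spending)
```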

Defining Groups

It is recommended (and required according to the SQL-92 standard) to include in the
GROUP BY clause every attribute projected in the SELECT clause that is not inside an
aggregate function.

SELECT c.first_name, c.last_name, SUM(g.price)
FROM customers c, downloads d, games g
WHERE c.customerid = d.customerid
AND d.name = g.name AND d.version = g.version
GROUP BY c.customerid;

The above query works only because the first and last names are guaranteed to be
unique for a given customer identifier (which is the primary key of the table
customers).

For the sake of readability and portability, do not write such queries.

Defining Groups

We should write the query as follows, making sure that every column mentioned in the
SELECT clause is mentioned in the GROUP BY unless it is used in an aggregate
function.

SELECT c.first_name, c.last_name, SUM(g.price)
FROM customers c, downloads d, games g
WHERE c.customerid = d.customerid
AND d.name = g.name AND d.version = g.version
GROUP BY c.customerid, c.first_name, c.last_name;

Defining Groups

The query below does not work in PostgreSQL (it does work in SQLite but such
queries could give wrong results).

SELECT c.customerid, c.first_name, c.last_name, SUM(g.price)
FROM customers c, downloads d, games g
WHERE c.customerid = d.customerid
AND d.name = g.name AND d.version = g.version
GROUP BY c.first_name, c.last_name;

ERROR: column "c.customerid" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT c.customerid, c.first_name, c.last_name, SUM(g.price)
               ^
SQL state: 42803
Character: 8
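To see how such a query can give wrong results in SQLite, consider two different customers who share the same first and last name. In the sketch below (made-up rows), grouping by name alone merges their spending into one row, and the customerid shown for that row is whichever one SQLite happens to pick.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customerid TEXT PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE TABLE games (name TEXT, version TEXT, price NUMERIC NOT NULL,
                        PRIMARY KEY (name, version));
    CREATE TABLE downloads (customerid TEXT, name TEXT, version TEXT,
                            PRIMARY KEY (customerid, name, version));
""")
# Made-up rows: two distinct customers with the same first and last name.
conn.executemany("INSERT INTO customers VALUES (?, ?, ?)",
                 [("lee1", "Nicole", "Lee"), ("lee2", "Nicole", "Lee")])
conn.executemany("INSERT INTO games VALUES (?, ?, ?)",
                 [("Maze", "1.0", 7), ("Skyline", "1.0", 3)])
conn.executemany("INSERT INTO downloads VALUES (?, ?, ?)",
                 [("lee1", "Maze", "1.0"), ("lee2", "Skyline", "1.0")])

QUERY = """
    SELECT c.customerid, c.first_name, c.last_name, SUM(g.price)
    FROM customers c, downloads d, games g
    WHERE c.customerid = d.customerid
      AND d.name = g.name AND d.version = g.version
    GROUP BY {columns}
    ORDER BY c.customerid
"""

# Accepted by SQLite, but the two customers collapse into a single group.
by_name = conn.execute(QUERY.format(columns="c.first_name, c.last_name")).fetchall()
# Grouping by the key as well keeps the two customers apart.
by_id = conn.execute(
    QUERY.format(columns="c.customerid, c.first_name, c.last_name")
).fetchall()

print(by_name)
print(by_id)
```

The first result has one row with a total of 10, while the second has two rows with totals 7 and 3.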

Defining Groups

The following query displays the number of downloads by country and year of
registration of the customers. EXTRACT() is a PostgreSQL function. STRFTIME() is a
SQLite function.

SELECT c.country, EXTRACT(YEAR FROM c.since) AS regyear, COUNT(*) AS total
FROM customers c, downloads d
WHERE c.customerid = d.customerid
GROUP BY c.country, regyear
ORDER BY regyear, c.country;

SELECT c.country, STRFTIME('%Y', c.since) AS regyear, COUNT(*) AS total
FROM customers c, downloads d
WHERE c.customerid = d.customerid
GROUP BY c.country, regyear
ORDER BY regyear, c.country;

country      regyear  total
···
"Thailand"   "2015"   6
"Vietnam"    "2015"   3
"Indonesia"  "2016"   998
"Malaysia"   "2016"   713
···
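The SQLite variant can be tried on a small made-up instance, as sketched below. Note that STRFTIME() returns the year as text, which is why the years appear in quotes in the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (customerid TEXT PRIMARY KEY, country TEXT NOT NULL, "
    "since DATE NOT NULL)"
)
conn.execute("CREATE TABLE downloads (customerid TEXT, name TEXT, version TEXT)")
# Made-up rows, for illustration only.
conn.executemany("INSERT INTO customers VALUES (?, ?, ?)",
                 [("c1", "Singapore", "2015-03-01"),
                  ("c2", "Singapore", "2016-07-09"),
                  ("c3", "Vietnam", "2015-11-20")])
conn.executemany("INSERT INTO downloads VALUES (?, ?, ?)",
                 [("c1", "Maze", "1.0"), ("c1", "Skyline", "1.0"),
                  ("c2", "Maze", "1.0"),
                  ("c3", "Maze", "1.0"), ("c3", "Skyline", "1.0"), ("c3", "Orbit", "1.0")])

# Groups are defined by the pair (country, year of registration).
downloads_per_group = conn.execute("""
    SELECT c.country, STRFTIME('%Y', c.since) AS regyear, COUNT(*) AS total
    FROM customers c, downloads d
    WHERE c.customerid = d.customerid
    GROUP BY c.country, regyear
    ORDER BY regyear, c.country
""").fetchall()

print(downloads_per_group)
```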

Defining Groups

The order of columns in the GROUP BY clause does not change the meaning of the
query.

SELECT c.country, EXTRACT(YEAR FROM c.since) AS regyear, COUNT(*) AS total
FROM customers c, downloads d
WHERE c.customerid = d.customerid
GROUP BY regyear, c.country
ORDER BY regyear, c.country;

SELECT c.country, STRFTIME('%Y', c.since) AS regyear, COUNT(*) AS total
FROM customers c, downloads d
WHERE c.customerid = d.customerid
GROUP BY regyear, c.country
ORDER BY regyear, c.country;

country      regyear  total
···
"Thailand"   "2015"   6
"Vietnam"    "2015"   3
"Indonesia"  "2016"   998
"Malaysia"   "2016"   713
···
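A quick way to convince ourselves is to run the query with both column orders and compare the answers. The sketch below does so in SQLite on made-up rows. The placeholder {columns} is just a Python formatting device and not SQL syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customerid TEXT PRIMARY KEY, country TEXT NOT NULL,
                            since DATE NOT NULL);
    CREATE TABLE downloads (customerid TEXT, name TEXT, version TEXT);
    INSERT INTO customers VALUES ('c1', 'Singapore', '2015-03-01'),
                                 ('c2', 'Vietnam', '2016-07-09');
    INSERT INTO downloads VALUES ('c1', 'Maze', '1.0'), ('c1', 'Skyline', '1.0'),
                                 ('c2', 'Maze', '1.0');
""")

QUERY = """
    SELECT c.country, STRFTIME('%Y', c.since) AS regyear, COUNT(*) AS total
    FROM customers c, downloads d
    WHERE c.customerid = d.customerid
    GROUP BY {columns}
    ORDER BY regyear, c.country
"""

# Same groups either way: a group is a set of rows agreeing on both columns.
country_first = conn.execute(QUERY.format(columns="c.country, regyear")).fetchall()
year_first = conn.execute(QUERY.format(columns="regyear, c.country")).fetchall()

print(country_first == year_first)
```

The order of the result rows is fixed by the ORDER BY clause, not by the GROUP BY clause.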

HAVING Clause

We may want to use aggregate functions in conditions.

SELECT c.country
FROM customers c
WHERE COUNT(*) >= 100
GROUP BY c.country;

ERROR: aggregate functions are not allowed in WHERE
LINE 4: WHERE COUNT(*) >= 100
        ^
SQL state: 42803
Character: 42

However, aggregate functions are not allowed in the WHERE clause. This is because
they can only be evaluated after the groups are formed, and the groups are formed
after the rows are filtered by the WHERE clause.
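SQLite rejects this query as well. The sketch below submits it through the Python sqlite3 module and records whether an error is raised. The exact wording of the SQLite message may differ between versions, so we only print it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customerid TEXT PRIMARY KEY, country TEXT NOT NULL)")

rejected = False
message = ""
try:
    # The aggregate is used in WHERE, before any group exists.
    conn.execute("""
        SELECT c.country
        FROM customers c
        WHERE COUNT(*) >= 100
        GROUP BY c.country
    """)
except sqlite3.OperationalError as error:
    rejected = True
    message = str(error)

print(rejected, message)
```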

HAVING Clause

Instead, we use a new clause, the HAVING clause, to add conditions that are checked
after the groups have been formed by the GROUP BY clause.

The HAVING clause can only involve aggregate functions, columns listed in the
GROUP BY clause and subqueries.

HAVING Clause

The following query finds the countries in which there are at least 100 customers.

SELECT c.country
FROM customers c
GROUP BY c.country
HAVING COUNT(*) >= 100;

country
"Singapore"
"Thailand"
"Indonesia"
"Malaysia"
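The same pattern can be tried on a small scale. The sketch below uses six made-up customers and a threshold of 2 instead of 100. The helper countries_having() is our own illustration. It shows that a country with exactly the threshold number of customers is kept by >= but not by >.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customerid TEXT PRIMARY KEY, country TEXT NOT NULL)")
# Made-up rows: 3 customers in Singapore, 2 in Thailand, 1 in Vietnam.
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("c1", "Singapore"), ("c2", "Singapore"), ("c3", "Singapore"),
     ("c4", "Thailand"), ("c5", "Thailand"),
     ("c6", "Vietnam")],
)


def countries_having(condition, threshold):
    """Return the countries whose group satisfies a condition such as 'COUNT(*) >= ?'."""
    rows = conn.execute(
        "SELECT c.country FROM customers c "
        f"GROUP BY c.country HAVING {condition} ORDER BY c.country",
        (threshold,),
    ).fetchall()
    return [country for (country,) in rows]


# The condition is checked once per group, after the groups are formed.
at_least_two = countries_having("COUNT(*) >= ?", 2)
more_than_two = countries_having("COUNT(*) > ?", 2)

print(at_least_two)
print(more_than_two)
```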
Copyright 2022 Stéphane Bressan. All rights reserved.
