SQL - Final
for Analytics
SATYAJIT PATTNAIK
SQL Timeline
Joins
Normalization
Ordering Regex
Database
Database
Architecture
File server:
✔ The client requests a file from the disk or a file server.
✔ The software opens the file, stores it in memory, makes changes in memory, and then saves it back to the disk drive (or file server).
✔ What happens if someone else opens the same file after you do, then you save your changes, and then they save their changes?
○ Potential data loss
○ Solution is file locking
○ File locking is very inconvenient, however, as it means that nobody else can edit the file if someone else has it open first. (Logically good for small companies)
Database:
✔ Desktop software requests data from the database.
✔ Data is held in memory, but when a change is made, a request (SQL statement) is generated.
✔ When you save the data, these requests are passed to the database in the order they were generated.
✔ So, multiple people can work in the same table at the same time with no worries.
○ Risk of data loss is low
○ Still possible for someone else to change data after you change it.
○ Versioning is an option to record the changes (DB Admins are the owners of these things in production)
SQL Command Categories
Integer Character
Different Data Types
1 Numeric
2 Character
3 Date & Time
Introduction
What is SQL?
✔ SQL (pronounced "ess-que-el") stands for Structured Query Language
✔ SQL is used to communicate with a database.
✔ It is the standard language for relational database management systems
✔ SQL statements are used to perform tasks such as update data on a database, or retrieve data from a
database. Some common relational database management systems that use SQL are: Oracle, Sybase,
Microsoft SQL Server, Access, Ingres, etc.
✔ The standard SQL commands such as "Select", "Insert", "Update", "Delete", "Create", and "Drop" can be
used to accomplish almost everything that one needs to do with a database.
✔ SQL programming can be used to perform multiple actions on data such as :
○ Querying
○ Inserting
○ Updating
○ Deleting
○ Extracting etc.
Primary & Foreign Key
A Primary key is used to ensure data in the specific column is Unique and Not Null
A foreign key is a column or group of columns in a relational database table that provides a link
between data in two tables. It is a column (or columns) that references a column (most often the
primary key) of another table
Primary & Foreign Key - Contd..
Primary key:
○ no null values
○ unique identification
Foreign key:
○ correspond to the values of the primary key in another table
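The two key types can be tried out with a small sketch. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL, and the departments and employees tables, their columns, and the sample rows are all hypothetical. A duplicate primary key and a foreign key that points to a missing parent row should both be rejected.

```python
import sqlite3

# Assumption: SQLite (in memory) stands in for MySQL; all names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute(
    "CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, dept_name TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE employees ("
    " emp_id INTEGER PRIMARY KEY,"
    " emp_name TEXT NOT NULL,"
    " dept_id INTEGER NOT NULL REFERENCES departments(dept_id))"
)

conn.execute("INSERT INTO departments VALUES (1, 'Analytics')")
conn.execute("INSERT INTO employees VALUES (100, 'Asha', 1)")


def try_insert(sql):
    """Run an INSERT and report whether the database accepted it."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


# Same emp_id as an existing row: violates the primary key.
duplicate_pk = try_insert("INSERT INTO employees VALUES (100, 'Ben', 1)")
# dept_id 99 does not exist in departments: violates the foreign key.
missing_parent = try_insert("INSERT INTO employees VALUES (101, 'Cara', 99)")
employee_count = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]

print(duplicate_pk)
print(missing_parent)
```

Only the first employee row survives, because both later inserts break a key rule.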
SQL – First Step
Create a Database
Create a Table
Insert Table Data
Update Table Data
Alter Database
Alter Table
Alter Column
Duplicate a Database
Drop a Database
Database Backup
Restore a Database
Rename Database
TABLE BASICS
A relational database system contains one or more objects called tables. The data or information for the
database are stored in these tables. Tables are uniquely identified by their names and are comprised of
columns and rows. Columns contain the column name, data type, and any other attributes for the column.
Rows contain the records or data for the columns. Here is a sample table called "employee".
The LIKE pattern matching operator can also be used in the conditional selection of the where clause. Like is a
very powerful operator that allows you to select only rows that are "like" what you specify. The percent sign
"%" can be used as a wildcard to match any possible character that might appear before or after the
characters specified.
For example:
select first, last, city
from empinfo
where first LIKE 'Er%';
This SQL statement will match any first names that start with 'Er'. Strings must be in single quotes.
Or you can specify:
select * from empinfo
where first = 'Eric';
This will only select rows where the first name equals 'Eric' exactly.
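The difference between LIKE with a wildcard and an exact match can be checked with a short sketch. It runs the two queries against an in-memory SQLite database (a stand-in for MySQL, used here only so the example runs without a server); the three sample rows are made up for illustration.

```python
import sqlite3

# Assumption: SQLite in memory replaces MySQL; the sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE empinfo (first TEXT, last TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO empinfo VALUES (?, ?, ?)",
    [
        ("Eric", "Edwards", "San Diego"),
        ("Erica", "Williams", "Show Low"),
        ("Mary", "Howell", "Cottonwood"),
    ],
)

# 'Er%' matches any first name that begins with 'Er'.
starts_with_er = conn.execute(
    "SELECT first FROM empinfo WHERE first LIKE 'Er%' ORDER BY first"
).fetchall()

# '=' matches only the exact value.
exactly_eric = conn.execute(
    "SELECT first FROM empinfo WHERE first = 'Eric'"
).fetchall()

print(starts_with_er)  # [('Eric',), ('Erica',)]
print(exactly_eric)    # [('Eric',)]
```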
SELECTING THE DATA
Example:
It is important to make sure you use an open parenthesis before the beginning table, and a closing
parenthesis after the end of the last column definition.
Make sure you separate each column definition with a comma. All SQL statements should end with a ";".
The table and column names;
Do not use any SQL reserved keywords as names for tables or column names (such as "select", "create", "insert", etc).
Data types specify what the type of data can be for that particular column. If a column called "Last_Name", is to be used to
hold names, then that particular column should have a "varchar" (variable-length character) data type.
Here are the most common Data types:
number(size, d): Number value with a maximum number of digits of "size" total, with a maximum number of "d" digits to the right of the decimal.
What are constraints?
When tables are created, it is common for one or more columns to have constraints associated with them. A constraint is
basically a rule associated with a column that the data entered into that column must follow.
- For example, a "unique" constraint specifies that no two records can have the same value in a particular column. They
must all be unique.
- The other two most popular constraints are "not null" which specifies that a column can't be left blank, and "primary
key". A "primary key" constraint defines a unique identification of each record (or row) in a table.
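A minimal sketch of these three constraints follows, again using an in-memory SQLite database as a stand-in for MySQL. The staff table and its columns are hypothetical. Each rejected insert raises an integrity error instead of storing the row.

```python
import sqlite3

# Assumption: SQLite in memory replaces MySQL; the staff table is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staff ("
    " id INTEGER PRIMARY KEY,"        # unique identification of each row
    " email VARCHAR(50) UNIQUE,"      # no two rows may share an email
    " last_name VARCHAR(30) NOT NULL)"  # may not be left blank
)
conn.execute(
    "INSERT INTO staff (id, email, last_name) VALUES (1, 'a@example.com', 'Weber')"
)


def attempt(sql):
    """Return 'ok' if the statement succeeds, else the name of the error raised."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return type(exc).__name__


duplicate_email = attempt(
    "INSERT INTO staff (id, email, last_name) VALUES (2, 'a@example.com', 'Smith')"
)
missing_last_name = attempt(
    "INSERT INTO staff (id, email, last_name) VALUES (3, 'c@example.com', NULL)"
)
valid_row = attempt(
    "INSERT INTO staff (id, email, last_name) VALUES (4, 'd@example.com', 'Jones')"
)

print(duplicate_email, missing_last_name, valid_row)
```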
All of these and more will be covered in the future Advanced release of this Tutorial. Constraints can be entered in this SQL
interpreter, however, they are not supported in this Intro to SQL tutorial & interpreter. They will be covered and supported in the
future release of the Advanced SQL tutorial - that is, if "response" is good.
It's now time for you to design and create your own table. You will use this table throughout the rest of the tutorial. If you decide
to change or redesign the table, you can either drop it and recreate it or you can create a completely different one. The SQL
statement drop will be covered later.
Create Table Exercise
You have just started a new company. It is time to hire some employees. You will need to create a
table that will contain the following information about your new employees:
After you create the table, you should receive a small form on the screen with the appropriate column
names. If you are missing any columns, you need to double check your SQL statement and recreate
the table. Once it's created successfully, go to the "Insert" lesson.
IMPORTANT: When selecting a table name, it is important to select a unique name that no one else
will use or guess. Your table names should have an underscore followed by your initials and the digits
of your birth day and month.
For example, Tom Smith, who was born on November 2nd, would name his table
myemployees_ts0211 Use this convention for all of the tables you create. Your tables will remain on a
shared database until you drop them, or they will be cleaned up if they aren't accessed in 4-5 days
INSERT
The insert statement is used to insert or add a row of data into the table.
To insert records into a table, enter the key words insert into followed by the table name, followed by an open parenthesis, followed by a list of column names separated by commas, followed by a closing parenthesis, followed by the keyword values, followed by the list of values enclosed in parentheses. The values that you enter will be held in the rows and they will match up with the column names that you specify.
Strings should be enclosed in single quotes, and numbers should not.
insert into "tablename"
(first_column,...last_column)
values (first_value,...last_value);
Example:
insert into empinfo
(first, last, id, age, city, state)
values ('Luke', 'Duke', 45454,
22, 'Hazard Co', 'Georgia');
Note: All strings should be enclosed between single quotes: 'string'
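The insert syntax can be exercised end to end with the sketch below. It uses an in-memory SQLite database in place of MySQL; the empinfo column types are assumptions, and the second row ('Daisy') is invented to show the parameterized form that Python database drivers offer.

```python
import sqlite3

# Assumption: SQLite in memory replaces MySQL; the column types are guesses.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE empinfo "
    "(first TEXT, last TEXT, id INTEGER, age INTEGER, city TEXT, state TEXT)"
)

# Literal form: strings in single quotes, numbers without quotes.
conn.execute(
    "INSERT INTO empinfo (first, last, id, age, city, state) "
    "VALUES ('Luke', 'Duke', 45454, 22, 'Hazard Co', 'Georgia')"
)

# Parameterized form: the driver substitutes each ? with a value from the tuple.
conn.execute(
    "INSERT INTO empinfo (first, last, id, age, city, state) VALUES (?, ?, ?, ?, ?, ?)",
    ("Daisy", "Duke", 45455, 21, "Hazard Co", "Georgia"),  # hypothetical row
)

rows = conn.execute("SELECT first, last, age FROM empinfo ORDER BY id").fetchall()
print(rows)  # [('Luke', 'Duke', 22), ('Daisy', 'Duke', 21)]
```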
Insert statement exercise
It is time to insert data into your new employee table.
Your first three employees are the following:
Jonie Weber, Secretary, 28, 19500.00
Potsy Weber, Programmer, 32, 45300.00
Dirk Smith, Programmer II, 45, 75020.00
Enter these employees into your table first, and then insert at least 5 more of your own list of employees in the table.
After they're inserted into the table, enter select statements to:
Select all columns for everyone in your employee table.
Select all columns for everyone with a salary over 30000.
Select first and last names for everyone that's under 30 years old.
Select first name, last name, and salary for anyone with "Programmer" in their title.
Select all columns for everyone whose last name contains "ebe".
Select the first name for everyone whose first name equals "Potsy".
Select all columns for everyone over 80 years old.
Select all columns for everyone whose last name ends in "ith".
Create at least 5 of your own select statements based on specific information that you'd like to retrieve.
UPDATE
The update statement is used to update or change records that match a
specified criteria. This is accomplished by carefully constructing a where
clause.
update "tablename"
set "columnname" = "newvalue"
[,"nextcolumn" = "newvalue2"...]
where "columnname" OPERATOR "value"
[and|or "column" OPERATOR "value"];
[] = optional
Examples:
update phone_book
set area_code = 623
where prefix = 979;
update phone_book
set last_name = 'Smith', prefix=555, suffix=9292
where last_name = 'Jones';
update employee
set age = age+1
where first_name='Mary' and last_name='Williams';
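How much the where clause matters can be seen in a small sketch: only the row that matches both conditions changes. The example uses an in-memory SQLite database as a stand-in for MySQL, and the three employee rows are made up.

```python
import sqlite3

# Assumption: SQLite in memory replaces MySQL; the sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (first_name TEXT, last_name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("Mary", "Williams", 30), ("Mary", "Smith", 41), ("John", "Williams", 25)],
)

# Both conditions must hold, so only Mary Williams is updated.
cursor = conn.execute(
    "UPDATE employee SET age = age + 1 "
    "WHERE first_name = 'Mary' AND last_name = 'Williams'"
)
rows_changed = cursor.rowcount

# Issue a select afterwards to verify the change.
ages = {
    (first, last): age
    for first, last, age in conn.execute(
        "SELECT first_name, last_name, age FROM employee"
    )
}
print(rows_changed, ages)
```

Leaving out the where clause would apply the change to every row in the table.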
Update statement exercises
After each update, issue a select statement to verify your changes.
1. Jonie Weber just got married to Bob Williams. She has requested that her last name be updated to
Weber-Williams.
2. Dirk Smith's birthday is today, add 1 to his age.
3. All secretaries are now called "Administrative Assistant". Update all titles accordingly.
4. Everyone who is making under 30000 is to receive a raise of 3500 a year.
5. Everyone who is making over 33500 is to receive a raise of 4500 a year.
6. All "Programmer II" titles are now promoted to "Programmer III".
7. All "Programmer" titles are now promoted to "Programmer II".
ALTER TABLE
Add a column:
ALTER TABLE tablename
ADD columnname datatype;
Example:
ALTER TABLE CUST_DETAILS
ADD AGE INT;
Drop a column:
ALTER TABLE tablename
DROP COLUMN columnname;
Example:
ALTER TABLE CUST_DETAILS
DROP COLUMN AGE;
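The effect of both statements on the table's column list can be inspected with the sketch below. It uses an in-memory SQLite database rather than MySQL; the starting columns CUST_ID and CUST_NAME are hypothetical, and PRAGMA table_info is SQLite's way of listing columns (in MySQL, DESCRIBE plays that role). SQLite supports DROP COLUMN only from version 3.35, so that step is guarded.

```python
import sqlite3

# Assumption: SQLite in memory replaces MySQL; CUST_ID and CUST_NAME are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUST_DETAILS (CUST_ID INTEGER, CUST_NAME TEXT)")


def column_names():
    """List the current column names of CUST_DETAILS (SQLite-specific PRAGMA)."""
    return [row[1] for row in conn.execute("PRAGMA table_info(CUST_DETAILS)")]


before = column_names()

conn.execute("ALTER TABLE CUST_DETAILS ADD COLUMN AGE INT")
after_add = column_names()

# DROP COLUMN needs SQLite 3.35 or newer; older builds skip this step.
if sqlite3.sqlite_version_info >= (3, 35, 0):
    conn.execute("ALTER TABLE CUST_DETAILS DROP COLUMN AGE")
after_drop = column_names()

print(before, after_add, after_drop)
```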
Data Importing
1. Manual Importing
2. Importing through command line
● Open MySQL Workbench, Create a new database to store the tables you'll import (eg- FacilitySerivces) → Then create the table using CREATE query
● Copy the MySQL bin directory path: C:\Program Files\MySQL\MySQL Server 8.0\bin
● Go to the folder in command line by using: cd path
● Connect to MySQL database: mysql -u root -p (root is basically your username)
● If you are logged in successfully, then set the global variables by using below command so that the data can be imported from local computer folder.
○ mysql> SET GLOBAL local_infile = 1;
○ Query OK, 0 rows affected (0.00 sec)
○ (you've just instructed MySQL server to allow local file upload from your computer)
● Quit current server connection (mysql> quit)
● Load the data from the CSV file into the MySQL database. To do this, reconnect to the MySQL server with the local-infile option enabled (this means you want to upload data from a file on your local machine), then run the following commands:
○ mysql --local-infile=1 -u root -p (give password)
○ Show Databases; (It'll show all the databases in MySQL server.)
○ mysql> USE dbase; (makes the database that you had created in step 1 as default schema to use for the next sql scripts)
● Note: VERY IMP - In the file path, replace each single backslash (\) with a double backslash (\\).
Data Exporting
Server → Data Export
Aggregate Functions
✔ Sometimes we examine & analyse data of varying magnitudes, so we need to group similar types of values together & look at them as one bunch.
✔ min(), max()→ Finding the minimum & maximum values for a particular column
Suppose your manager asks you to count all the employees whose salaries are more than the average
salary in that particular department.
Now, intuitively, you know that two aggregate functions would be used here, namely, count() and avg(). You
decide to apply the 'where' condition on the average salary of the department, but to your surprise, the query
fails. This is exactly what the having clause is for.
The 'having' clause is typically used when you have to apply a filter condition on an 'aggregated value'. This
is because the 'where' clause is applied before aggregation takes place, and thus, it is not useful when you
want to apply a filter on an aggregated value.
The HAVING clause was added to SQL because the WHERE keyword cannot be used with aggregate
functions.
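The where-versus-having distinction can be reproduced in a few lines. The sketch below uses an in-memory SQLite database as a stand-in for MySQL, with a hypothetical emp table and made-up salaries. Filtering on AVG() in the where clause is rejected, while the same condition in the having clause works because it is applied after grouping.

```python
import sqlite3

# Assumption: SQLite in memory replaces MySQL; table, columns and rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [
        ("Asha", "IT", 600), ("Ben", "IT", 400),
        ("Cara", "Sales", 100), ("Dev", "Sales", 200), ("Eli", "Sales", 300),
        ("Fay", "HR", 50),
    ],
)

# WHERE runs before aggregation, so an aggregate inside it is an error.
try:
    conn.execute("SELECT dept FROM emp WHERE AVG(salary) > 150 GROUP BY dept")
    where_error = None
except sqlite3.Error as exc:
    where_error = type(exc).__name__

# HAVING runs after GROUP BY, so it can filter on the aggregated value.
high_avg_depts = conn.execute(
    "SELECT dept, AVG(salary) FROM emp "
    "GROUP BY dept HAVING AVG(salary) > 150 ORDER BY dept"
).fetchall()

# min() and max() over a whole column.
lowest, highest = conn.execute("SELECT MIN(salary), MAX(salary) FROM emp").fetchone()

print(where_error)
print(high_avg_depts)  # [('IT', 500.0), ('Sales', 200.0)]
print(lowest, highest)  # 50 600
```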
Exercise
Keywords in SQL
1. group by
2. order by
3. select
4. where
5. from
6. limit
7. having
Answer is:
STRING FUNCTIONS
Used to manipulate the string data and make it more understandable for analysis.
For example: amitabhbachchan or Amitabh Bachchan, which one of them is more readable? Obviously the latter one.
https://www.w3schools.com/sql/sql_ref_mysql.asp
DATE & TIME FUNCTIONS
Used to manipulate the date & time columns
For example: If you want to change the date format, or just want to see the exact day, or so on.
datediff → Return the number of days between two date values
SELECT DATEDIFF(sysdate(), order_date) from transaction_details;
date_format → Format a date value
SELECT DATE_FORMAT("2017-06-15", "%Y");
day → Return the day of the month for a date.
SELECT DAY("2017-06-15");
quarter → Return the quarter of the year for a date.
SELECT QUARTER("2017-06-15");
adddate → Add a time interval (here, days) to a date value
SELECT ADDDATE("2017-06-15", INTERVAL 10 DAY);
and so on...
https://www.w3schools.com/sql/func_mysql_adddate.asp
REGEX
You already know wildcards used with the “like” operator, but wildcards can fall short in some advanced use cases; that is where regular expressions come into the picture.
Regex, or regular expression, is a sequence of characters that defines a pattern, used to search for and locate text that matches that pattern.
Example 3: Find the customers whose email address contains any character from ‘x’ to ‘z’.
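A character range such as [x-z] can be tried with the sketch below. MySQL has a built-in REGEXP operator, so there the condition would simply read WHERE email REGEXP '[x-z]'. SQLite, used here as a stand-in, only provides the operator once a regexp function is registered, which the example does with Python's re module. The customers table and its rows are hypothetical.

```python
import re
import sqlite3

# Assumption: SQLite in memory replaces MySQL; customers and its rows are invented.
conn = sqlite3.connect(":memory:")


def regexp(pattern, value):
    """Back SQLite's REGEXP operator: 1 if the pattern is found in value, else 0."""
    if value is None:
        return 0
    return 1 if re.search(pattern, value) else 0


# SQLite evaluates "value REGEXP pattern" by calling regexp(pattern, value).
conn.create_function("REGEXP", 2, regexp)

conn.execute("CREATE TABLE customers (name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [
        ("Ann", "ann@mail.com"),
        ("Max", "max@mail.com"),
        ("Zoe", "zoe@mail.com"),
        ("Bob", "bob@shop.org"),
    ],
)

# [x-z] matches any single character between x and z anywhere in the email.
matches = conn.execute(
    "SELECT name FROM customers WHERE email REGEXP '[x-z]' ORDER BY name"
).fetchall()
print(matches)  # [('Max',), ('Zoe',)]
```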
TRIGGERS
A trigger is a stored procedure in a database that is invoked automatically whenever a special event in the database occurs. For example, a trigger can be invoked when a row is inserted into a specified table or when certain table columns are being updated.
create trigger [trigger_name]
[before | after]
{insert | update | delete}
on [table_name]
[for each row]
[trigger_body]
● DDL Trigger
● DML Trigger
● Logon Trigger
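A row-level trigger can be tried with the sketch below, which writes a log entry each time a row is inserted. It uses an in-memory SQLite database in place of MySQL; the orders and order_log tables and the trigger name are hypothetical, and SQLite wraps the trigger body in BEGIN ... END.

```python
import sqlite3

# Assumption: SQLite in memory replaces MySQL; all object names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL)")
conn.execute("CREATE TABLE order_log (order_id INTEGER, note TEXT)")

# The trigger fires once per inserted row; NEW refers to the row being inserted.
conn.execute(
    """
    CREATE TRIGGER log_new_order
    AFTER INSERT ON orders
    FOR EACH ROW
    BEGIN
        INSERT INTO order_log (order_id, note)
        VALUES (NEW.order_id, 'order created');
    END
    """
)

# No code touches order_log directly; the trigger fills it.
conn.execute("INSERT INTO orders VALUES (1, 250.0)")
conn.execute("INSERT INTO orders VALUES (2, 99.5)")

log_rows = conn.execute(
    "SELECT order_id, note FROM order_log ORDER BY order_id"
).fetchall()
print(log_rows)  # [(1, 'order created'), (2, 'order created')]
```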
VIEWS
Views are virtual tables that do not store any data of their own but display data stored in other tables.
create view newEmp as SELECT
lastName, firstName
FROM employees
WHERE
office_Code IN (SELECT office_Code
FROM offices
WHERE country = 'USA');
Advantages:
● Hide the complexity of data.
● Act as aggregated tables.
● If you are doing user-level access control, you can give a user access to a view without giving access to the tables behind it.
● It can allow for massive performance improvements.
Disadvantages:
● If done wrong, it can result in performance issues.
● You may not be able to update the view, forcing you back to the original tables.
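The newEmp view can be tried out with the sketch below, which also shows that a view stores no data of its own: a row added to the base table afterwards appears in the view immediately. An in-memory SQLite database stands in for MySQL, and the rows in employees and offices are made up.

```python
import sqlite3

# Assumption: SQLite in memory replaces MySQL; the sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE offices (office_Code INTEGER, country TEXT)")
conn.execute(
    "CREATE TABLE employees (lastName TEXT, firstName TEXT, office_Code INTEGER)"
)
conn.executemany("INSERT INTO offices VALUES (?, ?)", [(1, "USA"), (2, "France")])
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Murphy", "Diane", 1), ("Bondur", "Loui", 2)],
)

# The view definition from the slide: employees who work in a USA office.
conn.execute(
    "CREATE VIEW newEmp AS "
    "SELECT lastName, firstName FROM employees "
    "WHERE office_Code IN (SELECT office_Code FROM offices WHERE country = 'USA')"
)

view_before = conn.execute("SELECT * FROM newEmp ORDER BY lastName").fetchall()

# Insert into the base table only; the view reflects it on the next query.
conn.execute("INSERT INTO employees VALUES ('Firrelli', 'Jeff', 1)")
view_after = conn.execute("SELECT * FROM newEmp ORDER BY lastName").fetchall()

print(view_before)  # [('Murphy', 'Diane')]
print(view_after)   # [('Firrelli', 'Jeff'), ('Murphy', 'Diane')]
```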
WINDOW FUNCTION
Window functions apply aggregate and ranking functions over a particular window (set of rows). The OVER clause is used with window functions to define that window. The OVER clause does two things:
● Partitions rows to form sets of rows. (PARTITION BY clause is used)
● Orders rows within those partitions into a particular order. (ORDER BY clause is used)
SELECT column_name1,
window_function(column_name2)
OVER([PARTITION BY column_name1] [ORDER BY column_name3]) AS new_column
FROM table_name;
window_function= any aggregate or ranking function
column_name1= column to be selected (here also the column on whose basis rows are partitioned)
column_name2= column on which window function is to be applied
column_name3= column on whose basis rows are ordered within each partition
new_column= Name of new column
table_name= Name of table
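Both roles of the OVER clause can be seen in the sketch below: RANK() uses PARTITION BY and ORDER BY, while AVG() uses PARTITION BY alone. It runs on an in-memory SQLite database as a stand-in for MySQL (window functions need SQLite 3.25 or newer; MySQL supports them from 8.0), and the emp table and its rows are hypothetical.

```python
import sqlite3

# Assumption: SQLite >= 3.25 in memory replaces MySQL; emp and its rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [
        ("Asha", "IT", 600), ("Ben", "IT", 400),
        ("Cara", "Sales", 300), ("Dev", "Sales", 300), ("Eli", "Sales", 150),
    ],
)

# Each row keeps its identity; the window columns are computed per department.
ranked = conn.execute(
    "SELECT name, dept, salary, "
    " RANK() OVER (PARTITION BY dept ORDER BY salary DESC) AS salary_rank, "
    " AVG(salary) OVER (PARTITION BY dept) AS dept_avg "
    "FROM emp ORDER BY dept, salary_rank, name"
).fetchall()

for row in ranked:
    print(row)
```

Unlike GROUP BY, the query returns one output row per input row; tied salaries share a rank and the next rank is skipped.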
● pymysql → This package contains a pure-Python MySQL client library, based on PEP 249
→ pip install pymysql
Exercise
Write a query to retrieve the names of all employees who have an age greater than what Gus Gray has.
Answer is:
Exercise
Write a query to retrieve the names of all employees who have an age greater than what Gus
Gray has, and store the output in a view.
Answer is:
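One possible approach to the Gus Gray exercises, offered as a sketch rather than the official answer, is a subquery that looks up his age, wrapped in a view. The example assumes a table named employee with name and age columns, invented sample rows, and a view name (older_than_gus) chosen for illustration; it runs on an in-memory SQLite database as a stand-in for MySQL.

```python
import sqlite3

# Assumption: SQLite in memory replaces MySQL; table, columns, rows and view name are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Gus Gray", 35), ("Ann Lee", 41), ("Raj Rao", 29), ("Mia Fox", 52)],
)

# The inner query returns Gus Gray's age; the outer query compares every row to it.
conn.execute(
    "CREATE VIEW older_than_gus AS "
    "SELECT name FROM employee "
    "WHERE age > (SELECT age FROM employee WHERE name = 'Gus Gray')"
)

older = conn.execute("SELECT name FROM older_than_gus ORDER BY name").fetchall()
print(older)  # [('Ann Lee',), ('Mia Fox',)]
```

Running the SELECT part on its own, without CREATE VIEW, covers the version of the exercise that does not ask for a view.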