0% found this document useful (0 votes)
2 views39 pages

comp101-lect05

COMP101 Otago Uni
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
2 views39 pages

comp101-lect05

COMP101 Otago Uni
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 39

COMP101:

Foundations
of Information
Systems

Lecture 5: Grant Dick


The Relational Model and Department of
Information Science
SQL (I)
Creating databases
Before we start…

• Lab this week — build SQL database

• Lab assessment via SQL submission

1
Today

• Brief introduction to the relational model


of databases (expanded in later lectures)

• Basics of Structured Query Language (SQL)


• Creating a table
• Data types

2
Logical data models

• Abstract away details of physical data storage


• i.e., bits, bytes, records, files, …
• This is known as (physical) data independence

• Normally have three parts:


• Structure defines how data are organised, e.g., files,
relations, objects
• Integrity rules define how data are kept correct and
consistent
• Operators define how data are manipulated
(“Data models in database management”, ACM SIGMOD Record 11(2), E. F. Codd, 1981)

3
Relational model of data
The inspiration behind SQL databases

4
The Relational Model of
Data E.F. Codd, 1923–2003

• Devised by Edgar Codd in early


1970s, and further developed by Chris
Date
C.J. Date, 1941-

• Predominant data model for databases

• Simple, logical structure based upon relations


(tables), tuples (rows) and attributes (columns)

• High correlation to ERDs (but “relation” ≠ “relationship”!)


5
The Information Principle

In a relational database, all information is


represented by relations (specifically, values
within attributes within tuples within relations)
(“A relational model of data for large shared data banks”, CACM 13(6), E. F. Codd, 1970)

• Everything identified by value or name (a special


kind of value)
• All collections are sets of unordered, distinct values
• No unnecessary duplication of data

6
Relational Model Components

• Attributes are named “value containers”:


• Each attribute contains a single value from a known
domain, describing aspects of a larger object

• Tuples are unordered sets of attributes that


group together to describe an object

• Relations are unordered sets of tuples


(describing a collection of like objects)

7
Properties of Relational Model

• All the tuples of a relation must have the same


attributes

• Tuples are uniquely identified by a special sets


of attributes (called keys)

• A relation has one primary key

• Tuples in one relation are linked through foreign


keys that reference a primary key of another
relation

8
Properties of Relational Model

• Manipulation of data through set


operations (e.g., union, intersection, set
difference)

• Because of the uniqueness property in the


model, manipulating data can be rigorously
defined (e.g., relational algebra, relational
calculus)

9
Mapping from an ERD to relational
(more detail shortly)

Lectures 3-4 Today Lectures 5-8


ERDs Relational Model SQL
Entity type Relation Table
Entity instance Tuple Row
Attribute Attribute Column
Unique identifier Primary key Primary key
Relationship Foreign key† Foreign key†

† 1:1 and 1:M relationships only (i.e., must transform M:N)

10
Database Management
Systems (DBMS)
Implementation and management of logical database models

11
Every database has a schema

• Definition of the database’s structure


• In some language, e.g., SQL

• Processed to create instance of database


• By a database management system (DBMS)

12
Database management system

• Software to create and manage databases

• Implements a logical data model (e.g., relational)

• Provides infrastructure for security, concurrent


access, and failure recovery

• Often provide facilities to integrate data validation,


integrity rules, and triggered/automated data
operations.
13
COMP 101 uses PostgreSQL DBMS

• “The World's Most Advanced Open Source


Relational Database”

• High standards compliance

• Available on Windows, Linux, and MacOS

• Stand-alone server implementation (we


provide one in COMP 101) (demo)

14
SQL
Structured Query Language

15
Structured Query Language

• Special-purpose language for


• creating (DDL — Data Definition Language),
• managing (DCL — Data Control Language), and
• manipulating and querying (DML — Data Manipulation Language)
databases

• Not (strictly speaking) pure relational:


• Different terminology (tables, columns, rows)
• Breaks some aspects (e.g., allows duplicate rows)
• ”Good enough” for most purposes

• Designed as a stop-gap prototype (1974), became de facto


standard (1986)

16
Some notes on SQL
syntax
• Generally not case-sensitive:
• cReAtE tAbLe somEtaBLE (a INTeGer); is valid
• String comparison is case-sensitive ('A' ≠ 'a')

• Single quotes for strings ('abc' not "abc")


• Watch out for ‘ ’ (invalid) vs. ' ' (valid)

• Whitespace isn’t significant (cf. Python)

• Statements end with semicolon (;)

• -- for line comments

• List syntax (a, b, c) is common

17
Creating databases in SQL
SQL Data Definition Language (DDL)

18
The essential DDL statements

CREATE <schema

object> ALTER

<schema object> DROP

<schema object>

• e.g., CREATE TABLE, ALTER TABLE, DROP


19
TABLE

20
Types of schema “object” in SQL
(Note: not “objects” as in Java!)

• In general, each has its own CREATE, ALTER,


and
DROP statement

Logical (schema) Administrative Physical


TABLE USER INDEX
Focus in
COMP 101 VIEW ROLE CLUSTER
DOMAIN SCHEMA TABLESP
ACE
… … …

21
Creating tables

CREATE TABLE <name> (


<column-name> <data-type>
[DEFAULT
<expression>]
[<inline-constraint>],
<column-name> <data-type>
[<default>]
[<inline-constraint>],
...,
[<out-of-line-constraint>, ...]
);
Note
1. Things inside angle brackets should be replaced appropriately
(e.g.
<name> with Student, Employee etc.).
2. Anything inside square brackets [] is optional. 21
COMP 101 SQL conventions

• Use clear singular nouns


• e.g., Employee instead of Employees

• Use proper nouns (i.e., capitalise!)


• e.g., Student, Customer, Patient, …

• e.g., Student ID ⇒ Student_ID


• Replaces spaces between nouns with underscores

• SQL may be case insensitive, but you aren’t!


• Use ALL CAPS for SQL keywords

22
SQL demonstration
Creating a table

23
A simple student table (Oracle)
-- Create a table called Student
CREATE TABLE Student (
Student_ID
VARCHAR(10),
First_Name
VARCHAR(50),
Last_Name
VARCHAR(50)
);

-- Insert a row into the table


INSERT INTO Student (Student_ID, First_Name, Last_Name)
VALUES ('1121978', 'Grant', 'Dick');

-- Retrieve and display the data stored in the Student table


SELECT *
FROM Student;

-- Remove the table entirely

24
DROP TABLE Student;

25
Dealing with different types of
data: SQL data types
A whirlwind tour — this doesn’t really become well-understood until
you start using them in table definitions (essentially, this is reference
material for the labs)

26
Data types

• SQL standard defines many data types

• Most DBMSs don’t implement all of the


standard types

• Most DBMSs add their own proprietary, non-


standard types

27
Text data
types
CHAR(<n>)
• Fixed-length of <n> bytes, padded with blanks
• Use this if value length never changes
• e.g., to store Paper_Code at Otago (e.g. COMP101), use CHAR(7)

VARCHAR(<n>)
• Variable-length text, not blank-padded
• Must specify maximum length (of <n> bytes) — DBMS typically limits to a few
thousand
• e.g., to store First_Name of a customer, use VARCHAR(50)

TEXT
• Character Large Object (CLOB)
• “long” variable-length text (multiple thousands of bytes)
• e.g., to store a blog article or email, use TEXT

28
Numbers: Exact integer

SMALLINT (smaller than) INTEGER (smaller than) BIGINT


NUMERIC(<p>), DECIMAL(<p>)
• <p> = precision = number of significant digits
• e.g., NUMERIC(4) can store values –9999 to 9999

Only store values as a numeric data type if you


need to do calculations with them (otherwise
CHAR/VARCHAR)

29
Numbers: Exact decimal

NUMERIC(<p>,<s>), DECIMAL(<p>,<s>)
• NUMERIC may be slightly preferable (but often identical)
• <p> = precision = number of significant digits
• <s> = scale = number of digits after decimal point
(effectively)

• e.g., NUMERIC(5,2) can store values –999.99 to


999.99
• Important to get scale & precision correct
(especially for money!)

30
Numbers: Approximate decimal

REAL (also DOUBLE PRECISION)

CAUTION: Floating point values are always an


approximation (Lectures 17-19) and shouldn’t
be used for anything requiring precise
calculations (e.g., money)

31
Boolean (True/False) data types

BOOLEAN

Two possible states (TRUE, FALSE), represented


by:
• TRUE: TRUE, 'yes', 't', 'y', 1, 'on'
• FALSE: FALSE, 'no', 'f', 'n', 0, 'off'

32
Dates and times

DATE
TIME [WITH TIME ZONE]
TIMESTAMP [WITH TIME ZONE]

Notes:
• DATE has precision of one day
• Range 1/1/4713 BC → 31/12/294276 AD (no year 0)
• Date/time literal values expressed as strings (e.g., '12-MAY-16', '2016-05-12’)
• CURRENT_DATE returns current date
• Date/Time manipulation by arithmetic operators (+, -)

33
See “The Problem with Time & Timezones” in
the Computerphile channel on YouTube for
the full insanity of handling dates and times

https://www.youtube.com/watch?v=-5wpm-gesOY

Dates and times are


extremely easy to get
wrong — be warned!
Time is heavily dependent on context!

34
What next in SQL? (in COMP 101)

• Add constraints to tables (e.g., primary keys,


foreign keys, “check” clauses)

• CRUD (create, update, retrieve, delete)

• Views to create permanent queries

• Not covered: user management, security,


performance management
35
Summary

• Databases store data in a more organised and consistent


form than ad hoc solutions using an underlying data model

• SQL databases are heavily inspired by the relational model

• Creating tables in SQL requires careful thought about


appropriate data types

• We’ve still only covered structural aspects — more SQL to


come in labs and lectures (constraints and querying)

36
Thanks!
Questions?

You might also like