AL ICT - Database
Competency 8
Designs and develops database systems to manage data efficiently and effectively.
8.1 The basics of information and data, and the need for databases.
Data are raw, unorganized facts that must be processed to become meaningful, whereas
information is data that has been processed in a meaningful way according to a given
requirement.
Structured data is highly-organized and formatted so that it's easily searchable in relational
databases. Unstructured data has no predefined format or organization, making it much
more difficult to collect, process, and analyze.
Database
Database models
A flat file database is a collection of data in which the tables and records have no
relationships with any other tables. It may consist of just a single table.
A hierarchical database model is a data model in which the data are organized into a
tree-like structure. The data are stored as records which are connected to one
another through links. A record is a collection of fields, with each field containing
only one value.
N.K Waruna Devaka(BIT – UCSC)
An object-relational database model combines features of object-oriented models with
support for data types, tabular structures, etc., as in the relational data model.
Relations / Tables
A table is a collection of logically related information treated as a unit. Tables are organized
by rows and columns.
Record or Tuple: Each row of a table is known as a record. It is also known as a tuple.
Attributes or Columns: Each column in a table is an attribute. Attributes are the properties
that define a relation, e.g., Student_Rollno, Name, Address.
Degree: The total number of attributes in the relation is called the degree of the relation.
Cardinality: The total number of rows present in the table.
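Degree and cardinality can be read directly off a table. A minimal sketch, assuming Python's built-in sqlite3 module and a made-up Student table with two invented rows:

```python
import sqlite3

# Hypothetical Student table, built in memory purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (Student_Rollno INTEGER, Name TEXT, Address TEXT)")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    [(1, "Amal", "Colombo"), (2, "Nimal", "Kandy")],
)

cursor = conn.execute("SELECT * FROM Student")
degree = len(cursor.description)      # number of attributes (columns)
cardinality = len(cursor.fetchall())  # number of tuples (rows)
print(degree, cardinality)
```

Adding a row changes the cardinality only; adding a column changes the degree only.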
Types of Constraints
Constraints are a set of rules that ensure that when an authorized user modifies the
database, data consistency is not disturbed.
A NOT NULL Constraint : Ensures that a given column of a table is never assigned the
null value (empty value).
A Unique Constraint : Ensures that all values in a column are different.
A Primary Key Constraint : The primary key constraint uniquely identifies each
record in a table. Primary keys must contain unique values, and cannot contain null
values.
A Foreign Key Constraint : The foreign key constraint is used to prevent actions that
would destroy links between tables. A foreign key is a field (or collection of fields) in
one table, that refers to the primary key in another table.
A (Table) Check Constraint : The check constraint is used to limit the value range
that can be placed in a column.
Some DBMS examples include MySQL, PostgreSQL, Microsoft Access, SQL Server, FileMaker,
Oracle, dBASE, Clipper, and FoxPro.
SQL
Structured Query Language (SQL) is a computer language for storing, manipulating and
retrieving data stored in a relational database. SQL is the standard language for relational
database systems. SQL statements are divided into two major categories: data definition
language (DDL) and data manipulation language (DML).
1) CREATE Statement
The SQL CREATE DATABASE Statement : The CREATE DATABASE statement is used to create
a new SQL database.
The SQL CREATE TABLE Statement : The CREATE TABLE statement is used to create a new
table in a database.
Syntax Example
o VARCHAR(size) : A VARIABLE length string (can contain letters, numbers, and special
characters).
o CHAR(size) : A FIXED length string (can contain letters, numbers, and special
characters).
o INT(size): A medium integer.
o DATE: A date. Format: YYYY-MM-DD.
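A CREATE TABLE statement using these data types might look like the sketch below. It assumes Python's built-in sqlite3 module only so that the SQL can be run; the Persons table and its columns are hypothetical. SQLite has no CREATE DATABASE statement (opening a connection creates the database), whereas a server DBMS such as MySQL would run CREATE DATABASE first.

```python
import sqlite3

# Opening the connection plays the role of CREATE DATABASE in SQLite.
conn = sqlite3.connect(":memory:")

conn.execute("""
    CREATE TABLE Persons (
        PersonID  INT,
        LastName  VARCHAR(255),
        FirstName CHAR(30),
        BirthDate DATE
    )
""")

# List the column names of the new table.
columns = [row[1] for row in conn.execute("PRAGMA table_info(Persons)")]
print(columns)
```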
2) ALTER Statement
The ALTER TABLE statement is used to add, delete, or modify columns in an existing
table.
Syntax Example
The ALTER TABLE statement is also used to add and drop various constraints on an
existing table.
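Adding a column with ALTER TABLE can be sketched as follows, assuming Python's built-in sqlite3 module and a hypothetical Persons table. SQLite's ALTER TABLE cannot add or drop constraints, so that part is not shown; a server DBMS such as MySQL accepts ALTER TABLE ... ADD CONSTRAINT for it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Persons (PersonID INT, LastName VARCHAR(255))")

# Add a new column to the existing table.
conn.execute("ALTER TABLE Persons ADD COLUMN Email VARCHAR(255)")

columns = [row[1] for row in conn.execute("PRAGMA table_info(Persons)")]
print(columns)
```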
3) DROP Statement
The DROP DATABASE : The DROP DATABASE statement is used to drop an existing SQL
database.
The DROP TABLE : The DROP TABLE statement is used to drop an existing table in a
database.
Note: Be careful before dropping a database. Dropping a database results in the loss of
all information stored in the database!
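Dropping a table can be sketched as below, assuming Python's built-in sqlite3 module and a hypothetical Shippers table. DROP DATABASE is not shown because SQLite does not have that statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Shippers (ShipperID INT, ShipperName VARCHAR(255))")

# Removes the table definition together with all of its rows.
conn.execute("DROP TABLE Shippers")

remaining = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
print(remaining)
```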
A data manipulation language (DML) is used for adding (inserting), deleting, and modifying
(updating) data in a database.
1) INSERT Statement
The SQL INSERT INTO Statement : It is possible to write the INSERT INTO statement in two
ways:
1. Specify both the column names and the values to be inserted.
2. If you are adding values for all the columns of the table, you do not need to specify
the column names in the SQL query.
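The two forms of INSERT INTO can be sketched as follows. The sketch assumes Python's built-in sqlite3 module, and the Customers table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers (CustomerID INT, CustomerName VARCHAR(255), City VARCHAR(255))"
)

# Way 1: name the columns, then give one value per named column.
conn.execute("INSERT INTO Customers (CustomerID, CustomerName) VALUES (1, 'Cardinal')")

# Way 2: omit the column names and give a value for every column, in table order.
conn.execute("INSERT INTO Customers VALUES (2, 'Alfreds Futterkiste', 'Berlin')")

rows = conn.execute("SELECT * FROM Customers ORDER BY CustomerID").fetchall()
print(rows)
```

In the first form, the column that was not named (City) is left as NULL.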
2) UPDATE Statement
The UPDATE statement is used to modify the existing records in a table. Be careful when
updating records. If you omit the WHERE clause, ALL records will be updated!
Syntax Example
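An UPDATE with a WHERE clause might look like this sketch, which assumes Python's built-in sqlite3 module and an invented Customers table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers (CustomerID INT, CustomerName VARCHAR(255), City VARCHAR(255))"
)
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?)",
    [(1, "Alfreds Futterkiste", "Berlin"), (2, "Cardinal", "Stavanger")],
)

# Only the row matched by WHERE is changed; without WHERE every row would be.
conn.execute("UPDATE Customers SET City = 'Frankfurt' WHERE CustomerID = 1")

cities = conn.execute("SELECT City FROM Customers ORDER BY CustomerID").fetchall()
print(cities)
```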
3) DELETE Statement
The DELETE statement : The DELETE statement is used to delete existing records in a table.
Note: Be careful when deleting records in a table! Notice the WHERE clause in the DELETE
statement. The WHERE clause specifies which record(s) should be deleted. If you omit the
WHERE clause, all records in the table will be deleted!
Ex : - The following SQL statement deletes the customer "Alfreds Futterkiste" from the
"Customers" table:
It is possible to delete all rows in a table without deleting the table. This means that the
table structure, attributes, and indexes will be intact:
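Both kinds of DELETE can be sketched as follows, assuming Python's built-in sqlite3 module and a made-up Customers table with two rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerName VARCHAR(255), City VARCHAR(255))")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?)",
    [("Alfreds Futterkiste", "Berlin"), ("Cardinal", "Stavanger")],
)

# Delete one customer: WHERE picks the record(s) to remove.
conn.execute("DELETE FROM Customers WHERE CustomerName = 'Alfreds Futterkiste'")
after_one = conn.execute("SELECT COUNT(*) FROM Customers").fetchone()[0]

# Delete all rows: no WHERE clause, but the table itself stays.
conn.execute("DELETE FROM Customers")
after_all = conn.execute("SELECT COUNT(*) FROM Customers").fetchone()[0]
table_still_exists = conn.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE name = 'Customers'"
).fetchone()[0] == 1
print(after_one, after_all, table_still_exists)
```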
4) SELECT Statement
The SQL SELECT Statement : The SELECT statement is used to select data from a database.
Syntax:
Here, column1, column2, ... are the field names of the table you want to select data from. If
you want to select all the fields available in the table, use the following syntax:
The following SQL statement selects the "CustomerName" and "City" columns from the
"Customers" table:
The following SQL statement selects all the columns from the "Customers" table:
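These two SELECT forms can be sketched as below. The sketch assumes Python's built-in sqlite3 module, and the Customers table and its single row are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers (CustomerName VARCHAR(255), City VARCHAR(255), Country VARCHAR(255))"
)
conn.execute("INSERT INTO Customers VALUES ('Alfreds Futterkiste', 'Berlin', 'Germany')")

# Select two named columns.
name_city = conn.execute("SELECT CustomerName, City FROM Customers").fetchall()

# Select every column with *.
everything = conn.execute("SELECT * FROM Customers").fetchall()
print(name_city, everything)
```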
The SQL WHERE Clause : The WHERE clause is used to filter records. It is used to extract
only those records that fulfill a specified condition.
Note: The WHERE clause is not only used in SELECT statements; it is also used in UPDATE,
DELETE, etc.
The following SQL statement selects all the customers from the country "Mexico", in the
"Customers" table:
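Filtering customers by country with WHERE might look like this sketch, assuming Python's built-in sqlite3 module and invented sample rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerName VARCHAR(255), Country VARCHAR(255))")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?)",
    [("Ana Trujillo", "Mexico"), ("Alfreds Futterkiste", "Germany")],
)

# Only rows that fulfil the condition are returned.
mexico = conn.execute(
    "SELECT CustomerName FROM Customers WHERE Country = 'Mexico'"
).fetchall()
print(mexico)
```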
The following SQL statement selects all products with a price between 10 and 20:
The following SQL statement selects all customers with a CustomerName starting with "a":
The following SQL statement selects all customers that are located in "Germany", "France"
or "UK":
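The BETWEEN, LIKE and IN conditions described above can be sketched as follows. The sketch assumes Python's built-in sqlite3 module; the Products and Customers rows are made up, and names() is a small helper introduced only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Products (ProductName VARCHAR(255), Price INT);
INSERT INTO Products VALUES ('Chais', 18), ('Aniseed Syrup', 10), ('Ikura', 31);
CREATE TABLE Customers (CustomerName VARCHAR(255), Country VARCHAR(255));
INSERT INTO Customers VALUES
    ('Alfreds Futterkiste', 'Germany'), ('Ana Trujillo', 'Mexico'),
    ('Blondel', 'France'), ('Around the Horn', 'UK');
""")

def names(sql):
    """Run a one-column query and return its values as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))

# BETWEEN includes both end values (10 and 20).
between = names("SELECT ProductName FROM Products WHERE Price BETWEEN 10 AND 20")
# 'a%' matches names starting with "a"; SQLite's LIKE ignores ASCII case.
starts_a = names("SELECT CustomerName FROM Customers WHERE CustomerName LIKE 'a%'")
# IN tests membership in a list of values.
in_list = names(
    "SELECT CustomerName FROM Customers WHERE Country IN ('Germany', 'France', 'UK')"
)
print(between, starts_a, in_list)
```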
The WHERE clause can be combined with AND, OR, and NOT operators.
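A minimal sketch of the three operators, assuming Python's built-in sqlite3 module and a hypothetical Customers table (names() is a helper introduced only for this example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (CustomerName VARCHAR(255), City VARCHAR(255), Country VARCHAR(255));
INSERT INTO Customers VALUES
    ('Alfreds Futterkiste', 'Berlin', 'Germany'),
    ('Drachenblut Delikatessen', 'Aachen', 'Germany'),
    ('Bolido Comidas', 'Madrid', 'Spain'),
    ('Ana Trujillo', 'Mexico D.F.', 'Mexico');
""")

def names(condition):
    """Return the sorted customer names that satisfy a WHERE condition."""
    sql = "SELECT CustomerName FROM Customers WHERE " + condition
    return sorted(row[0] for row in conn.execute(sql))

not_germany = names("NOT Country = 'Germany'")
germany_and_berlin = names("Country = 'Germany' AND City = 'Berlin'")
germany_or_spain = names("Country = 'Germany' OR Country = 'Spain'")
print(not_germany, germany_and_berlin, germany_or_spain)
```

Note that OR needs a full condition on each side: Country = 'Germany' OR Country = 'Spain'.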
The following SQL statement selects all fields from "Customers" where country is NOT
"Germany":
The following SQL statement selects all fields from "Customers" where country is
"Germany" AND city is "Berlin":
The following SQL statement selects all fields from "Customers" where country is
"Germany" OR "Spain":
The following SQL statement selects all customers from the "Customers" table, sorted
ASCENDING by the "Country" column:
The following SQL statement selects all customers from the "Customers" table, sorted
DESCENDING by the "Country" column:
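Sorting with ORDER BY can be sketched as below, assuming Python's built-in sqlite3 module and three invented customers. ASC is the default direction, so it may be omitted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerName VARCHAR(255), Country VARCHAR(255))")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?)",
    [("Bolido Comidas", "Spain"), ("Alfreds Futterkiste", "Germany"), ("Ana Trujillo", "Mexico")],
)

ascending = [row[1] for row in conn.execute("SELECT * FROM Customers ORDER BY Country ASC")]
descending = [row[1] for row in conn.execute("SELECT * FROM Customers ORDER BY Country DESC")]
print(ascending, descending)
```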
The MAX() function returns the largest value of the selected column.
The COUNT() function returns the number of rows that match a specified criterion.
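Both functions can be tried on a small table. A minimal sketch, assuming Python's built-in sqlite3 module and a made-up Products table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ProductName VARCHAR(255), Price INT)")
conn.executemany(
    "INSERT INTO Products VALUES (?, ?)",
    [("Chais", 18), ("Chang", 19), ("Aniseed Syrup", 10)],
)

# Largest value in the Price column.
highest = conn.execute("SELECT MAX(Price) FROM Products").fetchone()[0]

# Number of rows whose price is above 15.
expensive = conn.execute("SELECT COUNT(*) FROM Products WHERE Price > 15").fetchone()[0]
print(highest, expensive)
```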
"Products" table
The INNER JOIN keyword selects records that have matching values in both tables.
Syntax :-
SELECT column_name(s)
FROM table1
INNER JOIN table2
ON table1.column_name = table2.column_name;
The following SQL statement selects all orders with customer information:
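Such a join might look like the sketch below, which assumes Python's built-in sqlite3 module and hypothetical Orders and Customers tables. Order 10310 refers to a customer that does not exist, so the INNER JOIN leaves it out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (CustomerID INT, CustomerName VARCHAR(255));
CREATE TABLE Orders (OrderID INT, CustomerID INT);
INSERT INTO Customers VALUES (1, 'Alfreds Futterkiste'), (2, 'Ana Trujillo');
INSERT INTO Orders VALUES (10308, 2), (10309, 1), (10310, 77);
""")

# Only orders whose CustomerID matches a row in Customers are returned.
orders = conn.execute("""
    SELECT Orders.OrderID, Customers.CustomerName
    FROM Orders
    INNER JOIN Customers
    ON Orders.CustomerID = Customers.CustomerID
    ORDER BY Orders.OrderID
""").fetchall()
print(orders)
```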
SQL Constraints
SQL constraints are used to specify rules for data in a table. Constraints are used to limit the
type of data that can go into a table. This ensures the accuracy and reliability of the data in
the table. If there is any violation between the constraint and the data action, the action is
aborted.
Syntax :-
The following SQL ensures that the "ID", "LastName", and "FirstName" columns will NOT
accept NULL values when the "Persons" table is created:
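A minimal sketch of such a table, assuming Python's built-in sqlite3 module; the Age column and the sample values are hypothetical additions for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Persons (
        ID        INT          NOT NULL,
        LastName  VARCHAR(255) NOT NULL,
        FirstName VARCHAR(255) NOT NULL,
        Age       INT
    )
""")

# Age has no NOT NULL constraint, so NULL is accepted there.
conn.execute("INSERT INTO Persons VALUES (1, 'Hansen', 'Ola', NULL)")

# A NULL LastName violates the constraint, so the action is aborted.
try:
    conn.execute("INSERT INTO Persons VALUES (2, NULL, 'Kari', 30)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Persons").fetchone()[0]
print(rejected, row_count)
```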
The UNIQUE constraint ensures that all values in a column are different. Both the UNIQUE
and PRIMARY KEY constraints provide a guarantee for uniqueness for a column or set of
columns. A PRIMARY KEY constraint automatically has a UNIQUE constraint.
The PRIMARY KEY constraint uniquely identifies each record in a table. Primary keys must
contain UNIQUE values, and cannot contain NULL values. A table can have only ONE primary
key; and in the table, this primary key can consist of single or multiple columns (fields).
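How both constraints reject duplicates can be sketched as follows, assuming Python's built-in sqlite3 module. The Persons table, the Email column and the try_insert() helper are hypothetical names used only here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Persons (
        ID    INTEGER PRIMARY KEY,
        Email VARCHAR(255) UNIQUE
    )
""")

def try_insert(sql):
    """Return True if the INSERT is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

first = try_insert("INSERT INTO Persons VALUES (1, 'ola@example.com')")
duplicate_id = try_insert("INSERT INTO Persons VALUES (1, 'kari@example.com')")
duplicate_email = try_insert("INSERT INTO Persons VALUES (2, 'ola@example.com')")
print(first, duplicate_id, duplicate_email)
```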
The FOREIGN KEY constraint is used to prevent actions that would destroy links between
tables. A FOREIGN KEY is a field (or collection of fields) in one table, that refers to the
PRIMARY KEY in another table. The table with the foreign key is called the child table, and
the table with the primary key is called the referenced or parent table.
Notice that the "PersonID" column in the "Orders" table points to the "PersonID"
column in the "Persons" table.
The "PersonID" column in the "Persons" table is the PRIMARY KEY in the "Persons"
table.
The "PersonID" column in the "Orders" table is a FOREIGN KEY in the "Orders" table.
The FOREIGN KEY constraint prevents invalid data from being inserted into the
foreign key column, because it has to be one of the values contained in the parent
table.
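The Persons and Orders relationship can be sketched as below, assuming Python's built-in sqlite3 module. The OrderNumber column, the sample values and the allowed() helper are hypothetical. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON; most server DBMSs enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # switch on foreign key checks in SQLite
conn.execute("CREATE TABLE Persons (PersonID INTEGER PRIMARY KEY, LastName VARCHAR(255))")
conn.execute("""
    CREATE TABLE Orders (
        OrderID     INTEGER PRIMARY KEY,
        OrderNumber INT,
        PersonID    INT,
        FOREIGN KEY (PersonID) REFERENCES Persons(PersonID)
    )
""")
conn.execute("INSERT INTO Persons VALUES (1, 'Hansen')")

def allowed(sql):
    """Return True if the statement succeeds, False if a constraint blocks it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

valid_child = allowed("INSERT INTO Orders VALUES (1, 77895, 1)")     # parent 1 exists
orphan_child = allowed("INSERT INTO Orders VALUES (2, 44678, 99)")   # no parent 99
delete_parent = allowed("DELETE FROM Persons WHERE PersonID = 1")    # would break the link
print(valid_child, orphan_child, delete_parent)
```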
The CHECK constraint is used to limit the value range that can be placed in a column. If you
define a CHECK constraint on a column it will allow only certain values for this column. If you
define a CHECK constraint on a table it can limit the values in certain columns based on
values in other columns in the row.
The following SQL creates a CHECK constraint on the "Age" column when the "Persons"
table is created. The CHECK constraint ensures that the age of a person must be 18 or older:
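A minimal sketch of such a table, assuming Python's built-in sqlite3 module; the columns other than Age and the sample rows are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Persons (
        ID       INT NOT NULL,
        LastName VARCHAR(255) NOT NULL,
        Age      INT,
        CHECK (Age >= 18)
    )
""")

conn.execute("INSERT INTO Persons VALUES (1, 'Hansen', 20)")  # passes the check

# Age 15 fails the CHECK condition, so the insert is aborted.
try:
    conn.execute("INSERT INTO Persons VALUES (2, 'Svendson', 15)")
    underage_rejected = False
except sqlite3.IntegrityError:
    underage_rejected = True

ages = [row[0] for row in conn.execute("SELECT Age FROM Persons")]
print(underage_rejected, ages)
```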
ER Diagrams contain different symbols that use rectangles to represent entities, ovals to
define attributes and diamond shapes to represent relationships.
Entities
Attributes
Relationships
ER Diagram Examples
For example, in a University database, we might have entities for Students, Courses, and
Lecturers. The Students entity can have attributes like Rollno, Name, and DeptID. It might
have relationships with Courses and Lecturers.
Entity
A real-world thing, either living or non-living, whether or not it is easily recognizable.
It is anything in the enterprise that is to be represented in our database. It may be a physical
thing or simply a fact about the enterprise or an event that happens in the real world.
An entity can be a place, person, object, event or concept about which data is stored in the
database. An entity must have attributes and a unique key. Every entity is made up of some
‘attributes’ which represent that entity.
Examples of entities:
Weak Entities
A weak entity is an entity that depends on the existence of another entity. In more technical
terms it can be defined as an entity that cannot be identified by its own attributes. It uses a
foreign key combined with its attributes to form the primary key.
Strong entity set compared with weak entity set:
A strong entity set contains a primary key, represented by an underline. A weak entity set
contains a partial key, represented by a dashed underline.
The member of a strong entity set is called a dominant entity set. The member of a weak
entity set is called a subordinate entity set.
In the ER diagram, the relationship between two strong entity sets is shown by a diamond
symbol. The relationship between a strong and a weak entity set is shown by a double
diamond symbol.
The line connecting a strong entity set with the relationship is single. The line connecting
the weak entity set to the identifying relationship is double.
Attributes
An attribute is a single-valued property of either an entity type or a relationship type. For
example, a lecture might have attributes: time, date, duration, place, etc. In an ER diagram,
an attribute is represented by an ellipse.
Relationship
A relationship describes how entities interact. For example, the entity “Carpenter” may be
related to the entity “table” by the relationship “builds” or “makes”. Relationships are
represented by diamond shapes and are labeled using verbs.
Recursive Relationship
If the same entity participates more than once in a relationship it is known as a recursive
relationship. In the below example an employee can be a supervisor and be supervised, so
there is a recursive relationship.
Cardinality
Cardinality defines the numerical attributes of the relationship between two entities or
entity sets.
1. One-to-one (1:1):
One entity from entity set X can be associated with at most one entity of entity set Y and
vice versa. Example: one student is assigned one desk, and each desk is assigned to one
student.
2. One-to-many (1:M):
One entity from entity set X can be associated with multiple entities of entity set Y, but an
entity from entity set Y can be associated with at most one entity of entity set X. For
example, one class consists of multiple students.
3. Many-to-one (M:1):
More than one entity from entity set X can be associated with at most one entity of entity
set Y. However, an entity from entity set Y may or may not be associated with more than
one entity from entity set X. For example, many students belong to the same class.
4. Many-to-many (M:N):
One entity from X can be associated with more than one entity from Y and vice versa. For
example, students as a group are associated with multiple faculty members, and faculty
members can be associated with multiple students.
Participation Constraints
In addition to the same concepts that ordinary ER diagrams encompass, EERDs include:
For example, the super class Shape has the sub groups Triangle, Square and Circle. Sub
classes are groups of entities with some unique attributes. A sub class inherits the
properties and attributes of its super class.
1) Generalization
In the above example, Tiger, Lion, Elephant can all be generalized as Animals.
2) Specialization
Specialization is a process that divides a group of entities into sub groups based on
their characteristics.
It is a top-down approach, in which one higher-level entity can be broken down into
lower-level entities.
It maximizes the difference between the members of an entity by identifying the
unique characteristics or attributes of each member.
It defines one or more sub classes for the super class and also forms the
superclass/subclass relationship.
In the above example, Employee can be specialized as Developer or Tester, based on what
role they play in an Organization.
3) Category or Union
Category represents a single superclass/subclass relationship with more than one
super class.
It can be a total or partial participation.
For example, in car booking, the car owner can be a person, a bank (which holds a
possession on a car) or a company. Category (sub class) → Owner is a subset of the union
of the three super classes → Company, Bank, and Person. A Category member must exist in
at least one of its super classes.
4) Aggregation
In the above example, the relation between College and Course is acting as an Entity in
Relation with Student.
Disjointness Constraint
Specifies that all subclasses of a specialization must be disjoint. In other words, if an entity
belongs to one subclass of a specialization, then it cannot belong to another subclass of the
same specialization.
Example :
Overlapping Subclasses
Sometimes an entity can belong to more than one subclass of the same specialization. This
means that the subclasses overlap with each other.
Example :
In a bakery, cake can be made with nuts or with fruits. However, a cake can also be made
with nuts and fruits at the same time.
Completeness Constraint
1. Total Specialization
2. Partial Specialization
1) Total Specialization:
– Every entity that belongs to a superclass must belong to at least one subclass of the
specialization.
In EER, a total specialization is represented by a double line going out of the
superclass.
2) Partial Specialization
– Some entities might not belong to any of the subclasses of the specialization. In EER, it is
represented as one line going out of the superclass.
Example: disjoint, partial specialization
Disjointness and Completeness constraints are independent, so you might have the
following combinations of specializations:
Disjoint, total
Disjoint, partial
Overlapping, total
Overlapping, partial
Database Schema
A database schema is the skeleton structure that represents the logical view of the entire
database. It defines how the data is organized and how the relations among them are
associated. It formulates all the constraints that are to be applied on the data.
A database schema defines its entities and the relationship among them. It contains a
descriptive detail of the database, which can be depicted by means of schema diagrams. It’s
the database designers who design the schema to help programmers understand the
database and make it useful.
Physical Database Schema − This schema pertains to the actual storage of data and its form
of storage like files, indices, etc. It defines how the data will be stored in a secondary
storage.
Logical Database Schema − This schema defines all the logical constraints that need to be
applied on the data stored. It defines tables, views, and integrity constraints.
View schema can be defined as the design of the database at the view level, which generally
describes end-user interaction with database systems.
Relational schema
A relation schema describes the column heads for the table. The schema specifies the
relation’s name, the name of each field (or column, or attribute), and the domain of each
field. A domain is referred to in a relation schema by the domain name and has a set of
associated values. For example, a field named sid may have a domain named string. Every
student record must satisfy the domain constraints.
Relation Instance
An instance of a relation is a set of tuples, also called records, in which each tuple has the
same number of fields as the relation schema. A relation instance can be thought of as a
table in which each tuple is a row, and all rows have the same number of fields. (The term
relation instance is often abbreviated to just relation.)
Candidate Key: A minimal set of attributes that can uniquely identify a tuple is known as a
candidate key. For example, STUD_NO in the STUDENT relation.
Primary Key: There can be more than one candidate key in a relation, out of which one is
chosen as the primary key. For example, STUD_NO and STUD_PHONE are both candidate
keys for relation STUDENT, but STUD_NO can be chosen as the primary key (only one out of
many candidate keys).
Composite Key: A primary key having two or more attributes is called a composite key. It is
a combination of two or more columns.
Alternate Key: A candidate key other than the primary key is called an alternate key. For
example, STUD_NO and STUD_PHONE are both candidate keys for relation STUDENT; if
STUD_NO is chosen as the primary key, STUD_PHONE is an alternate key.
Foreign Key is a column or group of columns in a relational database table that provides a
link between data in two tables. It acts as a cross-reference between tables because it
references the primary key of another table, thereby establishing a link between them.
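The idea of a candidate key as a minimal unique set of attributes can be sketched in plain Python. The rows, the STUD_NAME attribute and both function names are made up for illustration. The check looks at one instance only, whereas a real key must hold for every possible instance, so treat the result as a hint rather than a proof.

```python
from itertools import combinations

def is_unique(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)

def candidate_keys(rows, attributes):
    """Minimal attribute sets that are unique in this particular instance."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            # Skip any set that already contains a smaller key (not minimal).
            minimal = not any(set(k) <= set(attrs) for k in keys)
            if minimal and is_unique(rows, attrs):
                keys.append(attrs)
    return keys

student = [
    {"STUD_NO": 1, "STUD_NAME": "Ram", "STUD_PHONE": "9716271721"},
    {"STUD_NO": 2, "STUD_NAME": "Ram", "STUD_PHONE": "9898291281"},
    {"STUD_NO": 3, "STUD_NAME": "Sujit", "STUD_PHONE": "7898291981"},
]
keys = candidate_keys(student, ["STUD_NO", "STUD_NAME", "STUD_PHONE"])
print(keys)
```

Choosing STUD_NO from the result as the primary key leaves STUD_PHONE as the alternate key.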
Domain constraints
Domain constraints can be defined as the definition of a valid set of values for an attribute.
The data type of domain includes string, character, integer, time, date, currency, etc. The
value of the attribute must be available in the corresponding domain.
The ER diagram represents the conceptual level of database design, whereas the
relational schema is the logical level of database design.
An entity type within an ER diagram is turned into a table. Each attribute turns into a
column (attribute) in the table. The key attribute of the entity is the primary key of the
table, which is usually underlined. It can be composite if required but can never be null.
[Figure: entity with attributes including Email and Lastname]
2. Multi-Valued Attributes
If you have a multi-valued attribute, take the attribute and turn it into a new entity or table
of its own.
[Figure: multi-valued attribute Phone]
Phone(Personid, Phone)
3. Weak Entity
[Figure: weak entity example with attribute Tel]
4. Composite Attributes
5. 1:1 Relationship
[Figure: Student (St_Id, St_Name) has Desk (Desk_Id); cardinality 1:1]
6. 1:M Relationship
[Figure: Customer (Cus_Id, Name) places Order (OrdNo); cardinality 1:M]
7. M:N Relationship
[Figure: Student (StId, StName) obtains marks for Subject (SubID, SubName); relationship
attribute Marks; cardinality M:N]
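One common way to map an M:N relationship is to give the relationship its own table, whose primary key combines the keys of both entities. The sketch below assumes that approach and Python's built-in sqlite3 module; the table name Obtain and the sample rows are hypothetical, while the column names follow the Student and Subject example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (StId INTEGER PRIMARY KEY, StName VARCHAR(100));
CREATE TABLE Subject (SubID INTEGER PRIMARY KEY, SubName VARCHAR(100));
-- The relationship becomes a table; Marks is the relationship's own attribute.
CREATE TABLE Obtain (
    StId  INT,
    SubID INT,
    Marks INT,
    PRIMARY KEY (StId, SubID),
    FOREIGN KEY (StId) REFERENCES Student(StId),
    FOREIGN KEY (SubID) REFERENCES Subject(SubID)
);
INSERT INTO Student VALUES (1, 'Amal'), (2, 'Nimal');
INSERT INTO Subject VALUES (10, 'ICT'), (20, 'Physics');
INSERT INTO Obtain VALUES (1, 10, 75), (1, 20, 60), (2, 10, 82);
""")

# Marks of every student for subject 10.
ict_marks = conn.execute("""
    SELECT Student.StName, Obtain.Marks
    FROM Obtain
    INNER JOIN Student ON Obtain.StId = Student.StId
    WHERE Obtain.SubID = 10
    ORDER BY Student.StName
""").fetchall()
print(ict_marks)
```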
9. EER diagrams
Normalization is the process of organizing data in a database. This includes creating tables
and establishing relationships between those tables according to rules designed both to
protect the data and to make the database more flexible by eliminating redundancy and
anomalies.
Anomalies
1- Update Anomaly: Let’s say we have 10 columns in a table, two of which are called
employee Name and employee address. If an employee changes location, we have to
update the table. But if the table is not normalized, one employee can have multiple
entries, and while updating all of those entries one of them might get missed.
2- Insertion Anomaly: Let’s say we have a table that has 4 columns: Student ID, Student
Name, Student Address and Student Grades. When a new student enrolls in school, the
first three attributes can be filled, but the 4th attribute will have a NULL value because
the student doesn't have any marks yet.
Database normalization eliminates duplicate data which can lead to data anomalies.
Data normalization can be divided into different types of normal forms. The most popular
ones are 1NF, 2NF, 3NF, and BCNF.
ZERO NORMAL FORM (0 NF) : Zero normal form (ZNF) is simply a set of data that has not
gone through the process of normalization.
The first normal form (1NF) requires that the values in each column of a table are atomic. By
atomic we mean that there are no sets of values within a column. Repeating groups should
be removed.
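Making a column atomic can be sketched in plain Python. The student rows and the Phones column are made up; the non-atomic column holds several phone numbers in one value, and the 1NF version keeps exactly one value per field.

```python
# Not in 1NF: the Phones column holds a set of values in one field.
unnormalized = [
    {"StudentID": 1, "Name": "Amal", "Phones": "0711111111, 0722222222"},
    {"StudentID": 2, "Name": "Nimal", "Phones": "0733333333"},
]

# 1NF: one row per phone number, so every field holds a single value.
first_normal_form = [
    {"StudentID": row["StudentID"], "Name": row["Name"], "Phone": phone.strip()}
    for row in unnormalized
    for phone in row["Phones"].split(",")
]

for row in first_normal_form:
    print(row)
```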
Where the First Normal Form deals with atomicity of data, the Second Normal Form (or
2NF) deals with relationships between composite key columns and non-key columns. In the
second normal form, every non-key column must depend on the entire primary key.
In the case of a composite primary key, this means that a non-key column cannot depend on
only part of the composite key (a partial dependency).
Partial Dependency - an attribute in a table depends on only a part of the primary key
and not on the whole key.
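A partial dependency can be spotted by checking whether part of the key alone determines a non-key column. A minimal sketch in plain Python, assuming a made-up marks table with the composite key (StudentID, SubjectID); depends_on() is a hypothetical helper, and it tests only the rows given.

```python
def depends_on(rows, determinant, dependent):
    """True if equal determinant values always come with equal dependent values."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        if seen.setdefault(key, row[dependent]) != row[dependent]:
            return False
    return True

# Hypothetical table with composite primary key (StudentID, SubjectID).
marks = [
    {"StudentID": 1, "SubjectID": 10, "SubjectName": "ICT", "Marks": 75},
    {"StudentID": 1, "SubjectID": 20, "SubjectName": "Physics", "Marks": 60},
    {"StudentID": 2, "SubjectID": 10, "SubjectName": "ICT", "Marks": 82},
]

# SubjectName depends on SubjectID alone: a partial dependency, so not 2NF.
partial = depends_on(marks, ["SubjectID"], "SubjectName")

# Marks needs the whole key: neither part of the key determines it alone.
needs_whole_key = (
    not depends_on(marks, ["SubjectID"], "Marks")
    and not depends_on(marks, ["StudentID"], "Marks")
)
print(partial, needs_whole_key)
```

To reach 2NF here, SubjectName would move to a separate table keyed by SubjectID.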
Third Normal Form (3NF) requires that all non key columns depend directly on the primary
key. Tables violate the Third Normal Form when one column depends on another column,
which in turn depends on the primary key (a transitive dependency).
Example - we have unitCode as our primary key, and a courseName that depends on
courseCode rather than directly on unitCode.
So, following the steps, move courseName together with a copy of courseCode to another
relation and make courseCode the primary key of the new relation. In the original table,
mark courseCode as a foreign key.
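The resulting pair of relations can be sketched as follows, assuming Python's built-in sqlite3 module. The table names Unit and Course, the unitName column and the sample rows are hypothetical; unitCode, courseCode and courseName follow the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- New relation: courseName now depends directly on its own primary key.
CREATE TABLE Course (
    courseCode VARCHAR(10) PRIMARY KEY,
    courseName VARCHAR(100)
);
-- Original relation: courseCode stays behind as a foreign key.
CREATE TABLE Unit (
    unitCode   VARCHAR(10) PRIMARY KEY,
    unitName   VARCHAR(100),
    courseCode VARCHAR(10),
    FOREIGN KEY (courseCode) REFERENCES Course(courseCode)
);
INSERT INTO Course VALUES ('C1', 'Information Technology');
INSERT INTO Unit VALUES ('U1', 'Databases', 'C1'), ('U2', 'Networks', 'C1');
""")

# The course name is stored once, so renaming it touches a single row.
conn.execute("UPDATE Course SET courseName = 'ICT' WHERE courseCode = 'C1'")

units = conn.execute("""
    SELECT Unit.unitCode, Course.courseName
    FROM Unit
    INNER JOIN Course ON Unit.courseCode = Course.courseCode
    ORDER BY Unit.unitCode
""").fetchall()
print(units)
```

Because the name lives in one place, the update anomaly described earlier cannot occur for courseName.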