Data Base
SQL provides the normal set operations Union, Intersection, and Difference to combine the
results of two or more queries into a single result table.
• Union of two tables, A and B
The resulting table contains all the rows that are in A or B or both.
• Intersection
The resulting table contains all the rows that are common to A and B.
• Difference
The resulting table contains all the rows that are in A but not in B.
Use of UNION
(SELECT city
FROM Branch
WHERE city IS NOT NULL)
UNION
(SELECT city
FROM PropertyForRent
WHERE city IS NOT NULL);
or
(SELECT *
FROM Branch
WHERE city IS NOT NULL)
UNION CORRESPONDING BY city
(SELECT *
FROM PropertyForRent
WHERE city IS NOT NULL);
Produces a result table from each query and merges the two tables into one, removing duplicate rows.
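Use of INTERSECT
List all cities where there is both a branch office and a property.
(SELECT city FROM Branch)
INTERSECT
(SELECT city FROM PropertyForRent);
or
SELECT DISTINCT b.city
FROM Branch b, PropertyForRent p
WHERE b.city = p.city;
or
SELECT DISTINCT city FROM Branch b
WHERE EXISTS
(SELECT * FROM PropertyForRent p
WHERE p.city = b.city);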
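Use of EXCEPT
List all cities where there is a branch office but no properties.
(SELECT city FROM Branch)
EXCEPT
(SELECT city FROM PropertyForRent);
or
(SELECT * FROM Branch)
EXCEPT CORRESPONDING BY city
(SELECT * FROM PropertyForRent);
or
SELECT DISTINCT city FROM Branch b
WHERE NOT EXISTS
(SELECT * FROM PropertyForRent p
WHERE p.city = b.city);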
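The three set operations can be tried out with a minimal sketch using Python's built-in sqlite3 module. The rows, the propertyNo column, and the helper name cities are assumptions made for illustration only. SQLite does not accept parentheses around the operands of a compound SELECT and has no CORRESPONDING BY, so the plain form is used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Branch (branchNo TEXT PRIMARY KEY, city TEXT);
    CREATE TABLE PropertyForRent (propertyNo TEXT PRIMARY KEY, city TEXT);
    -- Hypothetical sample rows, chosen so that each operation gives a different result.
    INSERT INTO Branch VALUES ('B003', 'Glasgow'), ('B005', 'London'), ('B004', 'Bristol');
    INSERT INTO PropertyForRent VALUES ('P1', 'London'), ('P2', 'Aberdeen'),
                                       ('P3', 'Glasgow'), ('P4', 'London');
""")

def cities(operator):
    """Combine the city columns of both tables with UNION, INTERSECT, or EXCEPT."""
    sql = (
        "SELECT city FROM Branch WHERE city IS NOT NULL "
        f"{operator} "
        "SELECT city FROM PropertyForRent WHERE city IS NOT NULL "
        "ORDER BY city"
    )
    return [row[0] for row in conn.execute(sql)]

print(cities("UNION"))      # cities with a branch, a property, or both
print(cities("INTERSECT"))  # cities with both a branch and a property
print(cities("EXCEPT"))     # cities with a branch but no property
```

Note that London appears only once in the UNION and INTERSECT results even though two properties are located there, because the set operations remove duplicate rows.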
SQL - Data Manipulation (Multi Table Queries)
Subqueries
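Example:
SELECT staffNo, fname, lname, position
FROM Staff
WHERE branchNo =
(SELECT branchNo
FROM Branch
WHERE street = '163 Main St');
Inner SELECT finds the branch number for the branch at '163 Main St' ('B003').
Outer SELECT then retrieves details of all staff who work at this branch. The outer SELECT then becomes:
SELECT staffNo, fname, lname, position
FROM Staff
WHERE branchNo = 'B003';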
Subquery with Aggregate
List all staff whose salary is greater than the average salary, and show by how much.
SELECT staffNo, fname, lname, position,
salary - (SELECT AVG(salary) FROM Staff) AS salDiff
FROM Staff
WHERE salary >
(SELECT AVG(salary)
FROM Staff);
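Both kinds of subquery can be run with a minimal sketch using Python's sqlite3 module. The salary figures, the second branch address, staff number SG5, and the alias salDiff are assumptions made for illustration; only '163 Main St' and 'B003' come from the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Branch (branchNo TEXT PRIMARY KEY, street TEXT);
    CREATE TABLE Staff (staffNo TEXT PRIMARY KEY, branchNo TEXT, salary INTEGER);
    -- Hypothetical sample rows.
    INSERT INTO Branch VALUES ('B003', '163 Main St'), ('B007', '16 Argyll St');
    INSERT INTO Staff VALUES ('SG37', 'B003', 12000), ('SG14', 'B003', 18000),
                             ('SG5',  'B003', 24000), ('SA9',  'B007', 9000);
""")

# Subquery returning a single value: the inner SELECT yields 'B003'.
staff_at_main_st = [row[0] for row in conn.execute("""
    SELECT staffNo
    FROM Staff
    WHERE branchNo = (SELECT branchNo FROM Branch WHERE street = '163 Main St')
    ORDER BY staffNo
""")]

# Subquery with an aggregate: staff earning more than the average, and by how much.
above_average = conn.execute("""
    SELECT staffNo, salary - (SELECT AVG(salary) FROM Staff) AS salDiff
    FROM Staff
    WHERE salary > (SELECT AVG(salary) FROM Staff)
    ORDER BY staffNo
""").fetchall()

print(staff_at_main_st)
print(above_average)
```

With these sample rows the average salary is 15750, so only the two staff members earning 18000 and 24000 are returned by the second query.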
Subquery Rules
• An ORDER BY clause may not be used in a subquery (although it may be used in the
outermost SELECT).
• The subquery SELECT list must consist of a single column name or expression, except for
subqueries that use EXISTS.
• By default, column names in a subquery refer to the table named in the FROM clause of the subquery.
A table in the FROM clause of an outer query can be referred to by qualifying the column name with an alias.
• When a subquery is an operand in a comparison, the subquery must appear on the right-hand
side.
• A subquery may not be used as an operand in an expression.
Example:
SELECT *
FROM PropertyForRent
WHERE staffNo IN
(SELECT staffNo
FROM Staff
WHERE branchNo =
(SELECT branchNo
FROM Branch
WHERE street = '163 Main St'));
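EXISTS and NOT EXISTS
EXISTS is true if and only if the subquery returns at least one row; NOT EXISTS is the opposite.
Because (NOT) EXISTS simply checks for the presence or absence of rows in the subquery's result table, the
subquery can contain any number of columns. It is common for such subqueries to take the form:
(SELECT * ...)
Example:
Find all the staff who work in a London branch.
SELECT staffNo, fname, lname, position
FROM Staff s
WHERE EXISTS
(SELECT *
FROM Branch b
WHERE s.branchNo = b.branchNo AND
city = 'London');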
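Query using EXISTS
Note that the search condition s.branchNo = b.branchNo is needed so that the correct branch record is considered for
each member of staff. Without it, the subquery would return a row whenever any London branch exists, so every member of staff would be listed.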
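The effect of the join condition inside the EXISTS subquery can be seen in a small sqlite3 sketch. The branch B005, the staff number SL21, and the city Glasgow are assumed sample data, not taken from the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Branch (branchNo TEXT PRIMARY KEY, city TEXT);
    CREATE TABLE Staff (staffNo TEXT PRIMARY KEY, branchNo TEXT);
    -- Hypothetical sample rows: one London branch, one Glasgow branch.
    INSERT INTO Branch VALUES ('B005', 'London'), ('B003', 'Glasgow');
    INSERT INTO Staff VALUES ('SL21', 'B005'), ('SG37', 'B003');
""")

# Correlated subquery: the branch row must belong to the staff member being tested.
correlated = [row[0] for row in conn.execute("""
    SELECT staffNo FROM Staff s
    WHERE EXISTS (SELECT * FROM Branch b
                  WHERE s.branchNo = b.branchNo AND b.city = 'London')
    ORDER BY staffNo
""")]

# Without the join condition the subquery is true for every staff row.
uncorrelated = [row[0] for row in conn.execute("""
    SELECT staffNo FROM Staff s
    WHERE EXISTS (SELECT * FROM Branch b WHERE b.city = 'London')
    ORDER BY staffNo
""")]

print(correlated)    # only the London staff member
print(uncorrelated)  # every staff member
```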
ANY / SOME and ALL
ANY and ALL can be used with subqueries that produce a single column of numbers.
With ALL, the condition is true only if it is satisfied by all values produced by the subquery.
With ANY, the condition is true if it is satisfied by any (one or more) of the values produced by the
subquery.
If the subquery is empty, ALL returns true and ANY returns false.
SOME can be used in place of ANY.
Use of ALL
Example:
SELECT staffNo, fname, lname, position, salary
FROM Staff
WHERE salary > ALL
(SELECT salary
FROM Staff
WHERE branchNo = 'B003');
The inner query produces the set {12000, 18000, 24000}. With > ALL, the outer query selects those staff whose salaries
are greater than every value in this set (that is, greater than 24000); with > ANY (or > SOME) it would select those
whose salaries are greater than at least one value in the set (that is, greater than 12000).
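SQLite does not implement the ANY and ALL operators, so the sketch below models their truth rules in plain Python rather than in SQL. The function names are made up for illustration, and NULL handling is deliberately ignored. The salary set is the one from the example.

```python
def greater_than_all(value, subquery_values):
    """Model of 'value > ALL (subquery)': true if the test holds for every value."""
    return all(value > v for v in subquery_values)

def greater_than_any(value, subquery_values):
    """Model of 'value > ANY (subquery)': true if the test holds for at least one value."""
    return any(value > v for v in subquery_values)

salaries = [12000, 18000, 24000]

print(greater_than_all(30000, salaries))  # above every value in the set
print(greater_than_all(20000, salaries))  # not above 24000
print(greater_than_any(15000, salaries))  # above 12000
print(greater_than_any(9000, salaries))   # above none of the values

# Empty subquery result: ALL is true, ANY is false.
print(greater_than_all(1, []), greater_than_any(1, []))
```

Python's built-in all() and any() happen to follow the same convention for an empty input as the SQL rule stated above.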
Normalization
Normalization is a technique for producing a set of relations with desirable properties,
given the data requirements of an enterprise. In other words, normalization is a technique
for organizing data into tables to meet the needs of users in an organization.
The purpose of normalization is to identify a suitable set of relations that supports the
data requirements of the enterprise. The characteristics of a suitable set of relations include:
• The minimal number of attributes necessary to support the data requirements of the
enterprise.
• Attributes with a close logical relationship (described as functional dependency) are found
in the same relation.
• Minimal redundancy, with each attribute represented only once, with the
important exception of attributes that form all or part of foreign keys,
which are essential for joining related relations.
Figure 7.1 How normalization can be used to support database design
Data Redundancy and Update Anomalies
The main objective of relational database design is to group attributes into relations so as to minimize
data redundancy.
Relations that contain redundant data may suffer from problems called update anomalies, which are
classified as insertion, deletion, or modification anomalies.
Insertion Anomalies
An insertion anomaly is an error that occurs as a result of inserting
a tuple / record into a relation.
(For example, inserting data into one table may require related data in another table to be
inserted as well, even when that data is not yet available.)
Example: a new course (CS-600) is to be taught, but the course
cannot be inserted into the Lecture relation above until a
student takes the course.
Deletion Anomalies
A deletion anomaly is an error that occurs as a result of deleting
a tuple / record from a relation.
Example: the student with NIM 92 425 (in the Lecture relation above) decides to drop
course CS-400. Because he is the only participant in that course, deleting the record / tuple
also loses the information that course CS-400 costs 150.
Modification Anomalies
A modification (update) anomaly is an error that occurs as a result of updating
a tuple / record of a relation.
Example: the fee for course CS-200 (in the Lecture relation above) is to be increased
from 75 to 100. The change must be applied to the record or tuple of every
student taking CS-200 so that the data remains consistent.
Based on the theory of normalization (to be discussed later), the Lecture relation above
should be split into two separate relations as follows.
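The three anomalies, and the way the split removes them, can be reproduced with a minimal sqlite3 sketch. The table and column names (Lecture, Course, Enrollment, nim, course, fee), the two extra students, and the CS-600 fee of 200 are assumptions for illustration; the CS-200 and CS-400 fees and the NIM 92425 follow the examples above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Single-relation design: the course fee is repeated for every enrolled student.
conn.execute("""
    CREATE TABLE Lecture (
        nim    TEXT NOT NULL,
        course TEXT NOT NULL,
        fee    INTEGER NOT NULL,
        PRIMARY KEY (nim, course)
    )""")
conn.executemany("INSERT INTO Lecture VALUES (?, ?, ?)", [
    ("92425", "CS-400", 150),
    ("92001", "CS-200", 75),   # hypothetical student
    ("92002", "CS-200", 75),   # hypothetical student
])

# Insertion anomaly: CS-600 has no students yet, so there is no nim to store.
try:
    conn.execute("INSERT INTO Lecture VALUES (NULL, 'CS-600', 200)")
    insert_rejected = False
except sqlite3.IntegrityError:
    insert_rejected = True

# Modification anomaly: the fee change must touch every CS-200 row.
rows_updated = conn.execute(
    "UPDATE Lecture SET fee = 100 WHERE course = 'CS-200'").rowcount

# Deletion anomaly: removing the only CS-400 student also removes the CS-400 fee.
conn.execute("DELETE FROM Lecture WHERE nim = '92425'")
cs400_rows_left = conn.execute(
    "SELECT COUNT(*) FROM Lecture WHERE course = 'CS-400'").fetchone()[0]

# Split design: course facts live in one relation, enrollments in another.
conn.executescript("""
    CREATE TABLE Course (course TEXT PRIMARY KEY, fee INTEGER NOT NULL);
    CREATE TABLE Enrollment (
        nim    TEXT NOT NULL,
        course TEXT NOT NULL REFERENCES Course(course),
        PRIMARY KEY (nim, course)
    );
    INSERT INTO Course VALUES ('CS-200', 75), ('CS-400', 150), ('CS-600', 200);
    INSERT INTO Enrollment VALUES ('92425', 'CS-400');
    DELETE FROM Enrollment WHERE nim = '92425';
""")
cs400_fee = conn.execute(
    "SELECT fee FROM Course WHERE course = 'CS-400'").fetchone()[0]

print(insert_rejected, rows_updated, cs400_rows_left, cs400_fee)
```

In the split design CS-600 can be stored before anyone enrolls, a fee change is a single-row update in Course, and dropping the last enrollment leaves the course fee intact.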
Functional Dependencies
One of the main concepts associated with normalization is functional dependency.
A functional dependency describes the relationship between attributes: if A and B are attributes of a relation,
B is functionally dependent on A (written A → B) if each value of A is associated with exactly one value of B.
Determinant
The determinant is the attribute or group of attributes on the left-hand side of the arrow of a functional
dependency. For example, in A → B, A is the determinant of B.
Transitive Dependency
A condition in which A, B, and C are attributes of a relation such that if A → B and B → C, then C is
transitively dependent on A via B (provided that A is not functionally dependent on B or C).
Transitive dependencies:
No-estab-city codes
No-estab City
Full Functional Dependency
Full functional dependency indicates that if A and B are attributes of a relation, B is fully functionally
dependent on A if B is functionally dependent on A, but not on any proper subset of A.
Functional dependency:
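The difference between a functional dependency and a full functional dependency can be checked mechanically on sample data. The sketch below is an illustration only: the function names, attribute names, and rows are assumptions, and a check on sample rows can refute a dependency but cannot prove that it holds for every possible state of the relation.

```python
from itertools import combinations

def holds(rows, lhs, rhs):
    """Return True if each value of lhs is associated with exactly one value of rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True

def is_full(rows, lhs, rhs):
    """Return True if lhs -> rhs holds and no proper subset of lhs determines rhs."""
    if not holds(rows, lhs, rhs):
        return False
    for size in range(1, len(lhs)):
        for subset in combinations(lhs, size):
            if holds(rows, subset, rhs):
                return False
    return True

# Hypothetical enrollment rows: the fee depends on the course alone,
# the grade depends on the student and the course together.
rows = [
    {"nim": "92425", "course": "CS-400", "fee": 150, "grade": "A"},
    {"nim": "92001", "course": "CS-200", "fee": 75,  "grade": "B"},
    {"nim": "92001", "course": "CS-400", "fee": 150, "grade": "C"},
    {"nim": "92002", "course": "CS-200", "fee": 75,  "grade": "A"},
]

print(is_full(rows, ("nim", "course"), ("grade",)))  # full dependency
print(holds(rows, ("nim", "course"), ("fee",)))      # dependency holds...
print(is_full(rows, ("nim", "course"), ("fee",)))    # ...but only partially: course -> fee
```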
Normalization Process
UNF
In the unnormalized form (UNF) we list all of the fields or attributes that exist in the form to
be normalized.
1NF
A relation is in 1NF if it contains no repeating attributes, calculated fields have been
eliminated, and it has a primary key.
2NF
A relation is in 2NF if it is in 1NF and every non-key attribute is fully functionally
dependent on the primary key. So in 2NF we eliminate partial dependencies, that is,
the dependence of certain fields on only part of a composite key.
Example: Table Students (Nim, Name, Address). Name and Address depend on Nim, meaning that
given the Nim we can determine the name or address, whereas the name / address does not determine the Nim.
Because the key is a single attribute, Name and Address are fully dependent on Nim.
3NF
A relation is in 3NF if it is in 1NF and 2NF and no non-key attribute is functionally dependent on
another non-key attribute (transitive dependency).
Example: Table Employees (NoPegawai, honor, KdProyek, Date). KdProyek and Date are non-key
attributes, but Date depends on KdProyek. The solution is to split the table into two relations: Projects
(KdProyek, Date) and PegProyek (NoPegawai, honor, KdProyek).
EXAMPLE
SalesInvoice (SalesInvoiceId, MemberID, memberName, Membership,
{ItemCode, ItemName, Category, UnitPrice, Qty, Subtotal},
Total, PaymentType, StaffName, StaffPosition)
1NF
- Separate the repeating data
- A header table and a detail table emerge
- Determine the PK and FK => the FK is always in the detail table
- Eliminate calculated data
2NF
- Eliminate partial dependencies
- Split the detail (second) table of SalesInvoice
- Split off the attributes that depend on only part of the PK
3NF
- Eliminate transitive dependencies
- Split the header, producing new tables
- Eliminate duplicated information
SalesInvoiceDetail (SalesInvoiceId, ItemCode, Qty)
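One possible 3NF result for the SalesInvoice example can be sketched as a sqlite3 schema. This is an assumed decomposition, not the only valid answer: StaffId is a hypothetical key introduced because the unnormalized form has no staff identifier, and all sample rows are made up. Subtotal and Total are not stored; they are calculated when needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Member (MemberID TEXT PRIMARY KEY, memberName TEXT, Membership TEXT);
    -- StaffId is a hypothetical key; the unnormalized form lists only name and position.
    CREATE TABLE Staff (StaffId TEXT PRIMARY KEY, StaffName TEXT, StaffPosition TEXT);
    CREATE TABLE Item (ItemCode TEXT PRIMARY KEY, ItemName TEXT,
                       Category TEXT, UnitPrice INTEGER);
    CREATE TABLE SalesInvoice (
        SalesInvoiceId TEXT PRIMARY KEY,
        MemberID       TEXT REFERENCES Member(MemberID),
        StaffId        TEXT REFERENCES Staff(StaffId),
        PaymentType    TEXT
    );
    CREATE TABLE SalesInvoiceDetail (
        SalesInvoiceId TEXT REFERENCES SalesInvoice(SalesInvoiceId),
        ItemCode       TEXT REFERENCES Item(ItemCode),
        Qty            INTEGER,
        PRIMARY KEY (SalesInvoiceId, ItemCode)
    );
    -- Hypothetical sample rows.
    INSERT INTO Member VALUES ('M001', 'Ani', 'Gold');
    INSERT INTO Staff VALUES ('ST01', 'Budi', 'Cashier');
    INSERT INTO Item VALUES ('I01', 'Notebook', 'Stationery', 5000),
                            ('I02', 'Pen', 'Stationery', 2000);
    INSERT INTO SalesInvoice VALUES ('SI001', 'M001', 'ST01', 'Cash');
    INSERT INTO SalesInvoiceDetail VALUES ('SI001', 'I01', 2), ('SI001', 'I02', 3);
""")

# The calculated fields removed in 1NF are derived by a join instead of being stored.
subtotals = conn.execute("""
    SELECT d.ItemCode, d.Qty * i.UnitPrice AS Subtotal
    FROM SalesInvoiceDetail d
    JOIN Item i ON i.ItemCode = d.ItemCode
    WHERE d.SalesInvoiceId = 'SI001'
    ORDER BY d.ItemCode
""").fetchall()
total = sum(subtotal for _, subtotal in subtotals)

print(subtotals, total)
```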
Entity Relationship Modeling (ER)
An entity type is a group of objects with the same properties. For example: Staff, Property, Customer,
Supplier, Product, and Sales Record (all of these are physical objects except Sales Record, which is conceptual).
A relationship type is a set of meaningful associations between entity types. Each relationship type
is given a name that describes the relationship between the entities involved. This name is usually a verb
and should be unique within each ER model.
Fig 2. Representation of the relationship type named Has between the Branch entity and the
Staff entity
[Figure: individual occurrences r1, r2, and r3 of the Has relationship, linking Branch (branchNo) values
B003 and B007 to Staff (staffNo) values SG37, SG14, and SA9.]
The degree of a relationship type is the number of entity types that participate in the relationship.
If the degree is greater than two (binary), a diamond shape is used to represent the relationship.
Recursive relationship: a relationship in which an entity type refers to itself because it plays more than one
role. Example: a staff member acting as supervisor both supervises and is supervised.
An attribute is a property of an entity or relationship type. Examples: staffNo, name, position, and
salary. An attribute domain is the set of values allowed for one or more attributes.
A single-valued attribute is an attribute that holds a single value for each occurrence of an
entity. The majority of attributes are single-valued. Ex: email (although a person may have many email
addresses, only one is recorded).
A multi-valued attribute is an attribute that holds many values for each occurrence of an entity.
Ex: employee (has a name, place and date of birth, position, title)
Keys:
A candidate key is a minimal set of attributes that uniquely identifies each occurrence of an
entity. For example: branchNo. Each value of branchNo is different for each Branch entity
occurrence, and its value must not be NULL.
The primary key is the candidate key that is selected. When there are two or more candidate keys,
the selection is based on the length of the attributes.
A composite key is a key that consists of more than one attribute.
Ex: detailPenyewaan (consisting of sewaID and movieID)
A foreign key is a linking key that refers to another table. The foreign key is held in the child table. The
purpose of a foreign key is to connect the child table to its parent table in the database.
A super key is an attribute or set of attributes that uniquely identifies tuples in a relation.
Ex: nim and name combined
An alternate key is a candidate key that was not chosen as the primary key.
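The roles of the composite primary key and the foreign keys in detailPenyewaan can be seen in a minimal sqlite3 sketch. The parent tables Penyewaan and Movie, the helper name rejected, and all rows are assumptions for illustration. SQLite only enforces foreign keys after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # foreign keys are not enforced by default in SQLite
conn.executescript("""
    -- Hypothetical parent tables for the two parts of the composite key.
    CREATE TABLE Penyewaan (sewaID TEXT PRIMARY KEY);
    CREATE TABLE Movie (movieID TEXT PRIMARY KEY);
    CREATE TABLE detailPenyewaan (
        sewaID  TEXT NOT NULL REFERENCES Penyewaan(sewaID),
        movieID TEXT NOT NULL REFERENCES Movie(movieID),
        PRIMARY KEY (sewaID, movieID)   -- composite key
    );
    INSERT INTO Penyewaan VALUES ('S1');
    INSERT INTO Movie VALUES ('M1'), ('M2');
    INSERT INTO detailPenyewaan VALUES ('S1', 'M1'), ('S1', 'M2');
""")

def rejected(sql):
    """Return True if the statement violates a key constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# Same sewaID with a different movieID is fine; repeating the whole pair is not.
duplicate_pair = rejected("INSERT INTO detailPenyewaan VALUES ('S1', 'M1')")
# A child row must refer to an existing parent row.
missing_parent = rejected("INSERT INTO detailPenyewaan VALUES ('S9', 'M1')")

print(duplicate_pair, missing_parent)
```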
Entity Type:
From the diagram above, a newspaper can advertise more than one property for rent, and one
property for rent can be advertised by more than one newspaper.
ER Diagram:
Example: 0..1
0 = participation
1 = cardinality
Problems with ER models are divided into two kinds, namely:
Fan trap
Where the model represents a relationship between entity types, but the pathway between certain
entity occurrences is ambiguous.
A fan trap may exist where two or more 1:* relationships fan out from the same entity.
Chasm trap
Where the model suggests the existence of a relationship between entity types, but the
pathway does not exist between certain entity occurrences.
A chasm trap may occur where there are one or more relationships with a minimum multiplicity of
zero (that is, optional participation) forming part of the pathway between related
entities.
The figure represents the fact that a single branch has one or more staff who oversee zero or more
properties for rent.
Note: not all staff oversee property, and not all properties are overseen by a member of staff.
Figure 11.21(b) examines some occurrences of the Has and Oversees relationships using values for
the primary key attributes of the Branch, Staff, and PropertyForRent entity types.
Figure 11.22(a) represents the true association between the entities.
This model ensures that the properties associated with each branch are known at all times,
including properties that have not yet been allocated to a member of staff. Figure 11.22(b) examines
occurrences of the Has, Oversees, and Offers relationship types.
Example Problem ER:
Fruitsta is a store that sells fruit in bulk and has its own orchard for growing the fruit
it sells. Customers who want to buy fruit from Fruitsta are required to register first in
order to become permanent members. To begin the process, the customer provides personal
data to the receptionist, who then registers the customer as a member. After
becoming a member, the member can order fruit, and the sales staff record the order on an order
form; before filling in the form, however, the sales staff check the stock of the fruit. If the stock of the
ordered fruit is not sufficient, the member can choose to reduce the quantity ordered or
cancel the order. The order form is then taken to the warehouse to pick up the member's order. When
the member makes a payment, the cashier issues proof of payment to the member.
Answer:
Exercises
Create ER Diagram
BRAC is a company providing transportation services in Jakarta. It offers safe and comfortable vehicles and a
location within easy reach. Customers who want to rent can do so by telephone or by coming to the location.
The Customer Service (CS) section asks for the customer's name, identity, choice of car type,
number of rental days, and whether a driver is required. The CS section records and stores the identity of
new customers as part of the rental transaction. The CS section records the rental transaction for the car and
driver (if any) and prints 3 copies (copy 1 for the archive, copy 2 for the cashier, copy 3 for the customer).
Customers must make a payment based on the car hire selected. Payment is made by
ATM transfer or directly at the cashier. For payments via ATM, customers should fax the bank receipt to the
cashier. After receiving payment, the cashier prints a payment receipt as proof in 3 copies
(copy 1 for the archive, copy 2 for Operations, copy 3 for the customer). After receiving proof of payment,
the Operations section hands the car over to the customer and creates a vehicle release form (2 copies),
which the driver takes when the car is released and hands to the customer.
When the customer has finished using the car, the customer returns it directly to BRAC or leaves it at
the return point specified at the time of rental. When the car is returned, the Operations section
creates a vehicle receipt form (2 copies): copy 1 for the archive, copy 2 for the customer.
Each month, the CS section produces a rental report and the cashier section produces a cash receipts report,
both submitted to the Finance Manager.
Enhanced Entity Relationship Modeling (EER)
Specialization / generalization
Generalization: the process of minimizing the differences between entities by identifying their
common characteristics. Ex: cars and motorcycles are generalized into vehicle (bottom-up, from the
particular to the general).
For example: Student has the attributes nim and name. There are several kinds of students,
such as S1 and S2 students. Because S1 and S2 students share the same attributes,
nim and name, these attributes do not need to be specified again for each kind.
Participation constraint:
4 categories of constraints on specialization and generalization:
Type of Relationship:
[Figure: class model showing Mobil (car) with Roda (wheel) and Spion (mirror).]
3.2 Example problems
1. In a company there are many workers, whose roles are manager,
customer service, or sales. Every worker must have a role. Describe the classes and
their relationships!
2. In a company there are many workers, whose roles are manager,
customer service, or sales. Each worker may have no role or at most one role.
Describe the classes and their relationships!
3. Make the relational schema of the following case! (Classes, attributes, multiplicity)
DVD Rental Information System (SI) "Spazio"
Customers who would like to become members must register at the CS section by paying a
registration fee of Rp 30,000. After registering, customers are given a Member Card.
Members choose the DVDs they want to rent and bring them
to the Payment section.
The Payment section records the DVDs rented and calculates the total payment as well as the rental
time limit, which is 2 days for each DVD. After that, the member
pays and the Payment section prints proof of payment. At the time of return,
the Payment section checks whether the rental time limit has been exceeded.
If the return is later than the prescribed time limit, a fine of Rp
1,000 per DVD is charged.
Exercises
Make the relational schema of the following case! (Classes, attributes, multiplicity)
PT. Bendi Car is a car rental company. All transactions are still done
manually. The following are the activities carried out by officers in handling
transactions at the car rental company.
Renters who want to rent a car can look at the list of car rental rates. Renters may use
the services of a driver or not, according to their own needs. Each type of
vehicle has a different rental price, and the price of driver services also differs between the Jabodetabek area
and outside it. Renters then fill in the Rental Form (FS) and attach a copy of their
identity card. The completed form, along with full payment, is submitted to the
cashier, who issues a receipt as proof of payment.
On return of the vehicle, the clerk checks whether the vehicle is damaged
and fills in a return form. If there is damage (e.g., a broken rearview mirror, body dents, scratched paint),
the cost is calculated and charged to the renter, and a damage form is made. If the
renter is late in returning the vehicle, the late charges for the car and driver are charged to the
renter. After the damage and late charges are paid, the officer issues a receipt as proof of payment of the fines.
At the end of each month, the rental officer prepares a report on the damage and late penalties
incurred and a vehicle report. The reports are handed over to the owner of Bendi Car rental.
Concept Data Warehouse (DWH)
A data warehouse is a collection of data that is subject-oriented, integrated, time-variant,
and non-volatile, in support of the decision-making process.
A data mart is a part of a data warehouse that supports decision making for a particular department or
business unit of a company.
Data mining is the process of extracting information from very large collections of data
so that it can be used as a tool in decision making.
Subject-Oriented
The data warehouse is subject-oriented: it is designed to analyze data based on specific
subjects within the company rather than on specific processes or application functions. This is because
a data warehouse stores data that supports decisions; in other words, the data
stored is oriented to subjects and not to processes. A data warehouse is organized around the
subjects relevant to the company, such as customers, claims, and product shipments. This
contrasts with most Online Transaction Processing (OLTP) systems, which are more process-oriented.
Integrated
A data warehouse can store data coming from separate sources in a consistent format,
integrated with one another. The data cannot be broken apart because the data is a single entity that
supports the entire data warehouse concept itself.
The requirement for integrating data sources can be met by being consistent in variable naming,
variable measures, and the physical attributes of the data.
Time-Variant
Data stored in the data warehouse carries a time dimension, so it can be used as a business
record for any particular point in time; the data warehouse stores history (historical data).
Compare this with operational systems, which almost always need only up-to-date data.
Time is a critical type of data in the data warehouse.
The data warehouse often stores several times, such as when a transaction occurred /
was changed / was canceled, when it became effective, when it was entered into the computer, and when it was
loaded into the data warehouse. Versions are also almost always kept; for example, if the
definition of a zip code changes, both the old and the new definitions exist in the data warehouse. Again, a
good data warehouse is one that stores history.
Non-Volatile
A data warehouse is not updated continuously, unlike OLTP systems, which are updated in real
time. Data is loaded into the warehouse periodically (e.g., every
morning or at the end of every month).
Data Warehouse Architecture
ETL Process
ETL is the process of collecting data from operational sources and preparing it for the data warehouse. This
process consists of extracting, transforming, and loading, and some further processing carried out before the
data is published to the data warehouse. In short, ETL (extract, transform, load) is the phase that processes data
from the data sources into the data warehouse. Its purpose is to collect, filter, process, and combine
the relevant data from different sources for storage in the data warehouse.
ETL can also be used to integrate with existing systems. The ETL process produces data
that meets the criteria of a data warehouse, such as being historical, integrated, summarized, static, and
structured for the purposes of analysis. The ETL process consists of
three stages:
Extract
The first step of the ETL process is retrieving data from one or more operational
systems as data sources (the data can come from an OLTP system, but also from sources
outside the database system). Most data warehouse projects combine data from different
sources. In essence, extraction is the process of parsing and cleaning the extracted
data to obtain the desired pattern or structure of data.
Transform
The process of cleaning the data obtained in the extract stage so that it
conforms to the structure of the data warehouse or data mart. Things that can be done in the
transformation stage:
7. The difficulty in the transformation process is that data to be combined from multiple
separate systems must be cleaned so that it is consistent and must be aggregated to speed up
analysis.
Load
The load phase enters the data into the final target, that is, the data warehouse. The timing
and scope of replacing or adding data depend on the design of the data warehouse made when
the information requirements were analyzed. The load phase interacts with a database; constraints defined in the
database schema, as well as triggers activated when the data is loaded (for example:
uniqueness, referential integrity, mandatory fields), also contribute to the overall
data quality of the ETL process.
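The three stages can be sketched in a few lines of Python. Everything here is an assumption for illustration: the two source lists stand in for separate operational systems with inconsistent formats, and the table name sales_summary and the function names are made up. The target is an in-memory SQLite database whose PRIMARY KEY and NOT NULL constraints play the role of the load-time checks described above.

```python
import sqlite3

# Extract: rows as they might arrive from two hypothetical operational sources.
source_a = [{"cust": " Alice ", "amount": "100"}, {"cust": "bob", "amount": "50"}]
source_b = [{"customer_name": "ALICE", "total": 25.0}]

def transform(rows_a, rows_b):
    """Clean names into one consistent format and aggregate the amounts per customer."""
    totals = {}
    for row in rows_a:
        name = row["cust"].strip().title()
        totals[name] = totals.get(name, 0.0) + float(row["amount"])
    for row in rows_b:
        name = row["customer_name"].strip().title()
        totals[name] = totals.get(name, 0.0) + float(row["total"])
    return sorted(totals.items())

def load(conn, rows):
    """Insert the transformed rows; schema constraints reject duplicates and missing values."""
    conn.execute("""
        CREATE TABLE IF NOT EXISTS sales_summary (
            customer TEXT PRIMARY KEY,
            total    REAL NOT NULL
        )""")
    conn.executemany("INSERT INTO sales_summary VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(conn, transform(source_a, source_b))
result = conn.execute(
    "SELECT customer, total FROM sales_summary ORDER BY customer").fetchall()
print(result)
```

The three spellings of the same customer name are merged into one row during the transform stage, which is the kind of cleaning and aggregation the notes describe.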
CASE
ERD + Normalization