
Data Base

Multi Table Query allows combining results from multiple queries using union, intersect, and except operators. Union returns all rows from both queries, intersect returns rows common to both queries, and except returns rows in the first query that are not in the second. Subqueries can be used in where and having clauses, and with operators like exists, any, some, and all. Normalization is a technique for organizing data into tables to minimize redundancy and support relationships using functional dependencies.

Uploaded by

annisa

Multi Table Query

Union, Intersect, and Difference (Except)

The normal set operations Union, Intersection, and Difference can be used to combine the
results of two or more queries into a single result table.
• Union of two tables, A and B
A table containing all the rows that are in A, in B, or in both.
• Intersection
A table containing all the rows that are common to A and B.
• Difference
A table containing all the rows that are in A but not in B.

The two tables must be union-compatible: they must have the same number of columns, and
corresponding columns must have compatible data types.

The format of the set operator clause in each case is:
op [ALL] [CORRESPONDING [BY {column1 [, ...]}]]

If CORRESPONDING BY is specified, the set operation is performed on the named column(s).
If CORRESPONDING is specified without the BY clause, the operation is performed on the
common columns.
If ALL is specified, the result can include duplicate rows.

Use of UNION

List all cities where there is either a branch office or a property.

(SELECT city
FROM Branch
WHERE city IS NOT NULL)
UNION
(SELECT city
FROM PropertyForRent
WHERE city IS NOT NULL);

or

(SELECT *
FROM Branch
WHERE city IS NOT NULL)
UNION CORRESPONDING BY city
(SELECT *
FROM PropertyForRent
WHERE city IS NOT NULL);

Each query produces a result table; UNION then merges the two tables and removes duplicate rows.
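
The effect of UNION, and of the ALL option, can be tried with a minimal sketch such as the one below. It assumes Python's built-in sqlite3 module and a few made-up sample rows; only the table and column names come from the queries above. SQLite does not accept parentheses around the operand SELECTs and has no CORRESPONDING BY, so the first form of the query is used without parentheses.

```python
import sqlite3

# Hypothetical sample rows; only the table and column names come from the text.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Branch (branchNo TEXT, city TEXT);
    CREATE TABLE PropertyForRent (propertyNo TEXT, city TEXT);
    INSERT INTO Branch VALUES ('B003', 'Glasgow'), ('B005', 'London'), ('B007', 'Aberdeen');
    INSERT INTO PropertyForRent VALUES ('PA14', 'Aberdeen'), ('PL94', 'London'), ('PG4', NULL);
""")

union_sql = """
    SELECT city FROM Branch WHERE city IS NOT NULL
    UNION
    SELECT city FROM PropertyForRent WHERE city IS NOT NULL
"""
# UNION removes duplicates: each city appears once.
union_cities = sorted(row[0] for row in conn.execute(union_sql))

# UNION ALL keeps duplicates, so cities present in both tables appear twice.
union_all_sql = union_sql.replace("UNION", "UNION ALL")
union_all_cities = sorted(row[0] for row in conn.execute(union_all_sql))

print(union_cities)
print(union_all_cities)
```
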

Use of INTERSECT
List all cities where there is both a branch office and a property.

(SELECT city FROM Branch)

INTERSECT

(SELECT city FROM PropertyForRent);

This query could be rewritten without the INTERSECT operator:

SELECT DISTINCT b.city
FROM Branch b, PropertyForRent p
WHERE b.city = p.city;

or

SELECT DISTINCT city FROM Branch b

WHERE EXISTS

(SELECT * FROM PropertyForRent p

WHERE p.city = b.city);
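
The three formulations can be compared with a small sketch, assuming Python's built-in sqlite3 module and made-up sample rows. It also shows why the join form needs DISTINCT: INTERSECT removes duplicates, whereas a plain join returns one row per matching branch and property pair.

```python
import sqlite3

# Hypothetical sample rows: two London branches and two London properties.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Branch (branchNo TEXT, city TEXT);
    CREATE TABLE PropertyForRent (propertyNo TEXT, city TEXT);
    INSERT INTO Branch VALUES ('B003', 'Glasgow'), ('B005', 'London'), ('B002', 'London');
    INSERT INTO PropertyForRent VALUES ('PL94', 'London'), ('PL21', 'London'), ('PA14', 'Aberdeen');
""")

def cities(sql):
    """Run a query and return its first column as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))

with_intersect = cities("SELECT city FROM Branch INTERSECT SELECT city FROM PropertyForRent")
with_join = cities(
    "SELECT DISTINCT b.city FROM Branch b, PropertyForRent p WHERE b.city = p.city")
with_exists = cities("""
    SELECT DISTINCT city FROM Branch b
    WHERE EXISTS (SELECT * FROM PropertyForRent p WHERE p.city = b.city)""")

# Without DISTINCT the join returns one row per matching (branch, property) pair.
join_no_distinct = cities(
    "SELECT b.city FROM Branch b, PropertyForRent p WHERE b.city = p.city")

print(with_intersect, with_join, with_exists, len(join_no_distinct))
```
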

Use of EXCEPT
List all cities where there is a branch office but no properties.

(SELECT city FROM Branch)

EXCEPT

(SELECT city FROM PropertyForRent);

or

(SELECT * FROM Branch)

EXCEPT CORRESPONDING BY city

(SELECT * FROM PropertyForRent);

This query could be rewritten without EXCEPT:

SELECT DISTINCT city FROM Branch

WHERE city NOT IN

(SELECT city FROM PropertyForRent);

or
SELECT DISTINCT city FROM Branch b

WHERE NOT EXISTS

(SELECT * FROM PropertyForRent p

WHERE p.city = b.city);
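
A short sketch, assuming Python's built-in sqlite3 module and made-up sample rows, can be used to check that the three forms agree. It also illustrates one caveat that is not stated above: if the subquery returns a NULL city, the NOT IN form returns no rows at all, while EXCEPT and NOT EXISTS are unaffected.

```python
import sqlite3

# Hypothetical sample rows; only the table and column names come from the text.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Branch (branchNo TEXT, city TEXT);
    CREATE TABLE PropertyForRent (propertyNo TEXT, city TEXT);
    INSERT INTO Branch VALUES ('B003', 'Glasgow'), ('B005', 'London'), ('B004', 'Bristol');
    INSERT INTO PropertyForRent VALUES ('PL94', 'London'), ('PG4', 'Glasgow');
""")

QUERIES = {
    "except": "SELECT city FROM Branch EXCEPT SELECT city FROM PropertyForRent",
    "not_in": """SELECT DISTINCT city FROM Branch
                 WHERE city NOT IN (SELECT city FROM PropertyForRent)""",
    "not_exists": """SELECT DISTINCT city FROM Branch b
                     WHERE NOT EXISTS (SELECT * FROM PropertyForRent p
                                       WHERE p.city = b.city)""",
}

def run_all():
    """Run every formulation and return {name: sorted list of cities}."""
    return {name: sorted(r[0] for r in conn.execute(sql)) for name, sql in QUERIES.items()}

before = run_all()  # all three forms agree while no city is NULL

# A NULL city in the subquery makes every NOT IN test unknown, so NOT IN returns nothing.
conn.execute("INSERT INTO PropertyForRent VALUES ('PX1', NULL)")
after = run_all()

print(before)
print(after)
```
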

Subqueries
SQL - Data Manipulation (Multi Table Queries)

Some SQL statements can have a SELECT embedded within them.

A subselect can be used in the WHERE and HAVING clauses of an outer SELECT, where it
is called a subquery or nested query.
Subselects may also appear in INSERT, UPDATE, and DELETE statements.

Example:

Subquery with Equality

List staff who work in branch at '163 Main St'

SELECT staffNo, fname, lname, position

FROM Staff

WHERE branchNo =

(SELECT branchNo

FROM Branch

WHERE street = '163 Main St');

The inner SELECT finds the branch number for the branch at '163 Main St' ('B003').

The outer SELECT then retrieves details of all staff who work at this branch.

The outer SELECT then becomes:

SELECT staffNo, fname, lname, position

FROM Staff

WHERE branchNo = 'B003';

Subquery with Aggregate

List all staff whose salary is greater than the average salary, and show by how much.

SELECT staffNo, fname, lname, position,
salary - (SELECT AVG(salary) FROM Staff) AS salDiff
FROM Staff
WHERE salary >
(SELECT AVG(salary) FROM Staff);

Cannot write 'WHERE salary > AVG(salary)'.
Instead, use a subquery to find the average salary (17000), and then use the outer SELECT to
find staff with a salary greater than this:

SELECT staffNo, fname, lname, position,
salary - 17000 AS salDiff
FROM Staff
WHERE salary > 17000;
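
As a minimal sketch of the aggregate subquery, assuming Python's built-in sqlite3 module: the three staff rows below are made up and chosen so that the average salary is 17000, as in the text. The last part checks that an aggregate placed directly in WHERE is rejected by the engine.

```python
import sqlite3

# Hypothetical staff rows, chosen so that AVG(salary) is 17000 as in the text.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Staff (staffNo TEXT, lname TEXT, salary REAL);
    INSERT INTO Staff VALUES ('SL21', 'White', 30000), ('SG37', 'Beech', 12000),
                             ('SA9', 'Howe', 9000);
""")

avg_salary = conn.execute("SELECT AVG(salary) FROM Staff").fetchone()[0]

# The subquery is evaluated once and its single value is used in both places.
above_average = conn.execute("""
    SELECT staffNo, salary - (SELECT AVG(salary) FROM Staff) AS salDiff
    FROM Staff
    WHERE salary > (SELECT AVG(salary) FROM Staff)
""").fetchall()

# An aggregate written directly in WHERE is rejected.
try:
    conn.execute("SELECT staffNo FROM Staff WHERE salary > AVG(salary)")
    aggregate_in_where_ok = True
except sqlite3.Error:
    aggregate_in_where_ok = False

print(avg_salary, above_average, aggregate_in_where_ok)
```
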

Subquery Rules

• The ORDER BY clause may not be used in a subquery (although it may be used in the
outermost SELECT).
• The subquery SELECT list must consist of a single column name or expression, except for
subqueries that use EXISTS.
• By default, column names in a subquery refer to the table named in the FROM clause of the
subquery. A table in a FROM clause can be referred to using an alias.
• When a subquery is an operand in a comparison, the subquery must appear on the right-hand
side.
• A subquery may not be used as an operand in an expression.

Example:

List properties handled by staff at '163 Main St'.

SELECT propertyNo, street, city, postcode, type, rooms, rent

FROM PropertyForRent

WHERE staffNo IN

(SELECT staffNo

FROM Staff

WHERE branchNo =

(SELECT branchNo

FROM Branch

WHERE street = '163 Main St'));

EXISTS and NOT EXISTS

EXISTS and NOT EXISTS are for use only with subqueries.

They produce a simple true / false result.
EXISTS is true if and only if there exists at least one row in the result table returned by the subquery.
It is false if the subquery returns an empty result table.
NOT EXISTS is the opposite of EXISTS.

As (NOT) EXISTS checks only for the presence or absence of rows in the subquery result table, the
subquery can contain any number of columns.

It is common for subqueries following (NOT) EXISTS to be of the form:

(SELECT * ...)

Example:
Find all the staff who work in a London branch.
SELECT staffNo, fname, lname, position

FROM Staff s WHERE EXISTS

(SELECT *

FROM Branch b

WHERE s.branchNo = b.branchNo AND

city = 'London');

Query using EXISTS

Note that the search condition s.branchNo = b.branchNo is necessary to consider the correct branch
record for each member of staff.

If it were omitted, all staff records would be listed, because the subquery:

SELECT * FROM Branch WHERE city = 'London'

would always be true and the query would be:

SELECT staffNo, fname, lname, position FROM Staff
WHERE true;

This query could also be written using a join construct:

SELECT staffNo, fname, lname, position
FROM Staff s, Branch b
WHERE s.branchNo = b.branchNo AND
city = 'London';
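
The role of the correlation condition can be seen in a small sketch, assuming Python's built-in sqlite3 module and made-up sample rows. The correlated EXISTS and the join return only London staff, while the version without s.branchNo = b.branchNo returns every member of staff.

```python
import sqlite3

# Hypothetical sample rows: two staff in London (B005), one in Glasgow (B003).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Branch (branchNo TEXT, city TEXT);
    CREATE TABLE Staff (staffNo TEXT, branchNo TEXT);
    INSERT INTO Branch VALUES ('B005', 'London'), ('B003', 'Glasgow');
    INSERT INTO Staff VALUES ('SL21', 'B005'), ('SL41', 'B005'), ('SG37', 'B003');
""")

def staff_numbers(sql):
    """Run a query and return the staff numbers as a sorted list."""
    return sorted(r[0] for r in conn.execute(sql))

correlated = staff_numbers("""
    SELECT staffNo FROM Staff s
    WHERE EXISTS (SELECT * FROM Branch b
                  WHERE s.branchNo = b.branchNo AND city = 'London')""")

# Without the correlation condition the subquery is non-empty for every staff row.
uncorrelated = staff_numbers("""
    SELECT staffNo FROM Staff s
    WHERE EXISTS (SELECT * FROM Branch b WHERE city = 'London')""")

joined = staff_numbers("""
    SELECT staffNo FROM Staff s, Branch b
    WHERE s.branchNo = b.branchNo AND city = 'London'""")

print(correlated, uncorrelated, joined)
```
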

ANY / SOME and ALL

ANY and ALL may be used with subqueries that produce a single column of numbers.
With ALL, the condition is true only if it is satisfied by all values produced by the subquery.
With ANY, the condition is true if it is satisfied by any (one or more) of the values produced by the
subquery.
If the subquery is empty, ALL returns true and ANY returns false.
SOME may be used in place of ANY.

Use of ANY / SOME

Example:

SELECT staffNo, fname, lname, position, salary
FROM Staff
WHERE salary > SOME
(SELECT salary
FROM Staff
WHERE branchNo = 'B003');

The inner query produces the set {12000, 18000, 24000}, and the outer query selects those staff whose
salaries are greater than any of the values in this set (that is, greater than the minimum, 12000).

Use of ALL

SELECT staffNo, fname, lname, position, salary
FROM Staff
WHERE salary > ALL
(SELECT salary
FROM Staff
WHERE branchNo = 'B003');
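
The semantics of SOME / ANY and ALL can be sketched in plain Python with any() and all(); SQLite does not implement these operators, so no database is used here. The set of B003 salaries is the one from the text, the staff salaries are made up, and NULL handling (three-valued logic) is deliberately ignored.

```python
# Salaries produced by the inner query in the text (branch B003).
b003_salaries = [12000, 18000, 24000]

# Hypothetical staff salaries to compare against that set.
staff = {"SL21": 30000, "SG14": 18000, "SA9": 9000}

def greater_than_some(salary, values):
    """salary > SOME (values): true if at least one comparison holds."""
    return any(salary > v for v in values)

def greater_than_all(salary, values):
    """salary > ALL (values): true only if every comparison holds."""
    return all(salary > v for v in values)

some_matches = sorted(no for no, sal in staff.items()
                      if greater_than_some(sal, b003_salaries))
all_matches = sorted(no for no, sal in staff.items()
                     if greater_than_all(sal, b003_salaries))

# With an empty subquery result, ALL is true and SOME / ANY is false.
empty_all = greater_than_all(9000, [])
empty_some = greater_than_some(9000, [])

print(some_matches, all_matches, empty_all, empty_some)
```
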

Normalization
Normalization is a technique for producing a set of relations with desirable properties, given the
data requirements of an enterprise. In other words, normalization is a technique for organizing
data into tables to meet the needs of users in an organization.

The purpose of normalization is to identify a suitable set of relations that support the data
requirements of the enterprise. The characteristics of a suitable set of relations include:

• The minimal number of attributes necessary to support the data requirements of the
enterprise.
• Attributes with a close logical relationship (described as a functional dependency) are found
in the same relation.
• Minimal redundancy, with each attribute represented only once, with the important
exception of attributes that form all or part of foreign keys, which are essential for joining
related relations.

Figure 7.1 How normalization can be used to support database design

Data Redundancy and Update Anomalies
The main objective of relational database design is to group attributes into relations so as to
minimize data redundancy.

Relations that have redundant data may suffer from problems called update anomalies, which are
classified as insertion, deletion, or modification anomalies.

Insertion Anomalies
An insertion anomaly is an error that occurs as a result of inserting a tuple / record into a
relation.
(That is, when something is inserted into table 1, the related data in table 2 is not inserted
along with it.)

Example: a new course (CS-600) is to be taught, but the course cannot be inserted into the
Lecture relation above until there are students who take it.

Deletion Anomalies
A deletion anomaly is an error that occurs as a result of deleting a tuple / record from a
relation.

Example: the student with NIM 92 425 (in the Lecture relation above) decides to drop the course
CS-400. Because he is the only participant in the course, deleting the record / tuple results
in the loss of the information that the course CS-400 costs 150.

Modification Anomalies
A modification (update) anomaly is an error that occurs as a result of changing (updating) a
tuple / record of a relation.

Example: the tuition fee for the course CS-200 (in the Lecture relation above) is to be increased
from 75 to 100. The modification must then be made several times, once for each record or tuple
of a student taking CS-200, in order to keep the data consistent.

Based on the theory of normalization (discussed later), the Lecture relation above
should be split into two separate relations.
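
The three anomalies, and how a two-relation split avoids them, can be illustrated with a rough sketch. It assumes Python's built-in sqlite3 module; the column names (nim, course, fee), the names Course and Enrollment, and the student numbers are assumptions for illustration, since the original Lecture relation and its decomposition are not reproduced in these notes. Only the course codes and fees come from the examples above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Single-relation design (assumed columns): the fee repeats for every student.
    CREATE TABLE Lecture (nim TEXT, course TEXT, fee INTEGER);
    INSERT INTO Lecture VALUES ('S1', 'CS-200', 75), ('S2', 'CS-200', 75),
                               ('S3', 'CS-400', 150);

    -- Assumed decomposition: each fee is stored exactly once.
    CREATE TABLE Course (course TEXT PRIMARY KEY, fee INTEGER);
    CREATE TABLE Enrollment (nim TEXT, course TEXT REFERENCES Course(course));
    INSERT INTO Course VALUES ('CS-200', 75), ('CS-400', 150);
    INSERT INTO Enrollment VALUES ('S1', 'CS-200'), ('S2', 'CS-200'), ('S3', 'CS-400');
""")

# Modification: the fee change touches every CS-200 row in Lecture, one row in Course.
rows_changed_lecture = conn.execute(
    "UPDATE Lecture SET fee = 100 WHERE course = 'CS-200'").rowcount
rows_changed_course = conn.execute(
    "UPDATE Course SET fee = 100 WHERE course = 'CS-200'").rowcount

# Deletion: removing the only CS-400 student loses the fee in Lecture, not in Course.
conn.execute("DELETE FROM Lecture WHERE nim = 'S3'")
conn.execute("DELETE FROM Enrollment WHERE nim = 'S3'")
fee_in_lecture = conn.execute("SELECT fee FROM Lecture WHERE course = 'CS-400'").fetchone()
fee_in_course = conn.execute("SELECT fee FROM Course WHERE course = 'CS-400'").fetchone()

# Insertion: a course with no students yet can be stored in Course
# (its fee is not given in the text, so it is left NULL here).
conn.execute("INSERT INTO Course VALUES ('CS-600', NULL)")
course_count = conn.execute("SELECT COUNT(*) FROM Course").fetchone()[0]

print(rows_changed_lecture, rows_changed_course, fee_in_lecture, fee_in_course, course_count)
```
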

Functional Dependencies
One of the main concepts related to normalization is functional dependency. A functional
dependency describes the relationship between attributes of a relation.

For example, if A and B are attributes of relation R, B is said to be functionally dependent on A
(denoted A → B) if each value of A in relation R is associated with exactly one value of B in
relation R. The dependence between attributes A and B can be seen in the picture below.

Determinant

The determinant refers to the attribute or group of attributes on the left-hand side of the arrow
of a functional dependency. In the example above, A is the determinant of B.

Transitive Dependency
A condition in which A, B, and C are attributes of a relation such that if A → B and B → C, then C is
transitively dependent on A through B (provided that A is not functionally dependent on B or C).

Figure 7.1 Transitive Dependency

Transitive dependencies:

No-estab → City code

City code → City, then

No-estab → City

Full Functional Dependency
Full functional dependency indicates that if A and B are attributes of a relation, B is fully
functionally dependent on A if B is functionally dependent on A but not on any proper subset of A.

Functional dependency:

No-estab → Na-estab

No-bar, No-ent → Number (fully dependent on its key)
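
One way to make these definitions concrete is a small checker that tests whether a dependency holds in a given set of rows. This is only a sketch in plain Python: the function name holds and the order-line sample data are hypothetical, and a dependency that holds in one sample is not thereby proven to hold for the relation in general.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows is a list of dicts; lhs and rhs are tuples of attribute names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The same determinant value must always map to the same dependent value.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical order lines with composite key (order, item).
order_lines = [
    {"order": 1, "item": "A", "qty": 2, "price": 10},
    {"order": 1, "item": "B", "qty": 1, "price": 25},
    {"order": 2, "item": "A", "qty": 5, "price": 10},
]

# qty depends on the whole key but not on item alone: a full functional dependency.
qty_on_full_key = holds(order_lines, ("order", "item"), ("qty",))
qty_on_item_only = holds(order_lines, ("item",), ("qty",))

# price is determined by item alone: a partial dependency on the composite key.
price_on_item_only = holds(order_lines, ("item",), ("price",))

print(qty_on_full_key, qty_on_item_only, price_on_item_only)
```
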

Normalization Process

UNF
In the unnormalized form (UNF) stage of normalization, we list all of the fields or attributes that
exist on the form to be normalized.

1NF
A relation is in 1NF if it contains no repeating attributes, fields holding calculated results have
been eliminated, and it has a primary key.

2NF
A relation is in 2NF if it is in 1NF and every non-key attribute is fully functionally
dependent on the primary key. So in 2NF we eliminate partial dependencies, that is, fields
that depend on only one part of a composite key.

Example: Table Students (Nim, Name, Address). Name and Address depend on Nim, meaning that
from Nim we can determine the name or address, whereas the name / address does not determine Nim.
Name and Address are therefore functionally dependent on Nim (and fully so, since Nim is a
single attribute).

3NF
A relation is in 3NF if it is in 1NF and 2NF and no non-key attribute is functionally dependent
on another non-key attribute (no transitive dependencies).

Example: Table Employees (NoPegawai, honor, KdProyek, Date). KdProyek and Date are non-key
attributes, but Date depends on KdProyek. The solution is to split the table into two relations:
Projects (KdProyek, Date) and PegProyek (NoPegawai, honor, KdProyek).

EXAMPLE

UNF (Unnormalized Form)

- Record all data, including repeating data
- Use {} for repeating data -> usually the detail lines

SalesInvoice (SalesInvoiceId, MemberId, MemberName, MemberType,
{ItemCode, ItemName, Category, UnitPrice, Qty, Subtotal},
Total, PaymentType, StaffName, StaffPosition)

1NF
- Separate the repeating data
- This gives a header table and a detail table
- Determine PK and FK => the FK is always in the detail table
- Eliminate calculated data

SalesInvoice (SalesInvoiceId, MemberId, MemberName, MemberType, PaymentType, StaffName, StaffPosition)
SalesInvoiceDetail (SalesInvoiceId, ItemCode, ItemName, Category, UnitPrice, Qty)

2NF
- Eliminate partial dependencies
- Split the second table (the detail table)
- Split off the attributes that depend on only part of the PK

SalesInvoice (SalesInvoiceId, MemberId, MemberName, MemberType, PaymentType, StaffName, StaffPosition)

SalesInvoiceDetail (SalesInvoiceId, ItemCode, Qty)

Ms_Item (ItemCode, ItemName, Category, UnitPrice)

3NF
- Eliminate transitive dependencies
- Split the header table, giving new tables
- Eliminate duplicated information

SalesInvoice (SalesInvoiceId, MemberId, StaffId, PaymentType)

SalesInvoiceDetail (SalesInvoiceId, ItemCode, Qty)

Ms_Item (ItemCode, ItemName, Category, UnitPrice)

Ms_Member (MemberId, MemberName, MemberType)

Ms_Staff (StaffId, StaffName, StaffPosition, Gender)
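
The 3NF result can be written out as table definitions. The sketch below assumes Python's built-in sqlite3 module; the column types, the key choices, and the sample rows are assumptions, since the notes give only relation and attribute names. It also shows that Subtotal and Total, removed as calculated data, can be recomputed with a join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Ms_Member (MemberId TEXT PRIMARY KEY, MemberName TEXT, MemberType TEXT);
    CREATE TABLE Ms_Staff (StaffId TEXT PRIMARY KEY, StaffName TEXT,
                           StaffPosition TEXT, Gender TEXT);
    CREATE TABLE Ms_Item (ItemCode TEXT PRIMARY KEY, ItemName TEXT,
                          Category TEXT, UnitPrice INTEGER);
    CREATE TABLE SalesInvoice (
        SalesInvoiceId TEXT PRIMARY KEY,
        MemberId TEXT REFERENCES Ms_Member(MemberId),
        StaffId TEXT REFERENCES Ms_Staff(StaffId),
        PaymentType TEXT);
    CREATE TABLE SalesInvoiceDetail (
        SalesInvoiceId TEXT REFERENCES SalesInvoice(SalesInvoiceId),
        ItemCode TEXT REFERENCES Ms_Item(ItemCode),
        Qty INTEGER,
        PRIMARY KEY (SalesInvoiceId, ItemCode));

    -- Hypothetical sample rows.
    INSERT INTO Ms_Member VALUES ('M1', 'Ani', 'Gold');
    INSERT INTO Ms_Staff VALUES ('S1', 'Budi', 'Cashier', 'M');
    INSERT INTO Ms_Item VALUES ('I1', 'Pen', 'Stationery', 5), ('I2', 'Book', 'Stationery', 20);
    INSERT INTO SalesInvoice VALUES ('INV1', 'M1', 'S1', 'Cash');
    INSERT INTO SalesInvoiceDetail VALUES ('INV1', 'I1', 4), ('INV1', 'I2', 1);
""")

# Subtotal and Total were dropped as calculated data; a join recomputes them on demand.
subtotals = conn.execute("""
    SELECT d.ItemCode, d.Qty * i.UnitPrice AS Subtotal
    FROM SalesInvoiceDetail d JOIN Ms_Item i ON i.ItemCode = d.ItemCode
    WHERE d.SalesInvoiceId = 'INV1'
    ORDER BY d.ItemCode
""").fetchall()
total = sum(subtotal for _, subtotal in subtotals)

print(subtotals, total)
```
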

Entity Relationship Modeling (ER)

An entity type is a set of objects that have the same properties.

For example: Staff, Property, Customer, Supplier, Product, Sale record (all of these are physical
objects except Sale record, which is conceptual).

Figure 1. Representation of the entity types Branch and Staff

A relationship type is a set of meaningful associations between entity types. Each relationship type
is given a name that describes the relationship between the entities involved. This name is usually
a verb and should be unique within each ER model.

Fig 2. Representation of the relationship type named Has between the entity types Branch and Staff

A relationship occurrence is one member of the set of associations that makes up a relationship type,
namely an association between entity occurrences (instances of entities). Relationship
occurrences can be represented by a semantic net, as illustrated below.

Fig 3. Semantic net showing relationship occurrences of the relationship type Has
(Figure: Branch occurrences B003 and B007 (branchNo) are linked to Staff occurrences SG37, SG14,
and SA9 (staffNo) through the relationship occurrences r1, r2, and r3.)

The degree of a relationship type is the number of entity types that participate in the relationship.

• binary relationship = 2 entities (drawn with boxes only)
• ternary relationship = 3 entities
• quaternary relationship = 4 entities

If the degree is greater than two (binary), a diamond shape is used to depict the relationship.

Fig 4. Staff registers a Client at a Branch

Recursive relationship: a relationship in which the same entity type participates more than once in
different roles. Example: Staff Supervises Staff, where a member of staff both supervises and is
supervised.

An attribute is a property of an entity or relationship type. Examples: staffNo, name, position, and
salary. An attribute domain is the set of values allowed for one or more attributes.

Types of attributes:

A simple attribute is an attribute composed of a single component with an independent
existence. A simple attribute cannot be broken down into smaller components. Simple
attributes are often called atomic attributes. Ex: customerName, price

A composite attribute is an attribute composed of multiple components, each with an independent
existence. Composite attributes can be divided into smaller components. Ex: address (number,
street, city, etc.)

A single-valued attribute is an attribute that holds a single value for each occurrence of an
entity. The majority of attributes are single-valued. Ex: email (although we may have many email
addresses, only one is recorded)

A multi-valued attribute is an attribute that holds multiple values for each occurrence of an entity.
Ex: employee (has a name, place and date of birth, position, title)

A derived attribute is an attribute whose value is derived from the value of a related attribute or
set of attributes. Ex: age (derived from date of birth)

Keys:

A candidate key is an attribute (or minimal set of attributes) that uniquely identifies each
occurrence of an entity. For example: branchNo. The value of branchNo is different for each
branch entity and must not be NULL.

The primary key is the candidate key that is selected. When there are two or more candidate keys,
the primary key is chosen based on attribute length.

A composite key is a key that consists of more than one attribute, so that one table has a key made
of two attributes. Ex: detailPenyewaan (containing sewaID, movieID)

A foreign key is a key that links to another table. The foreign key is in the child table. Its
purpose is to connect the child table to the parent table in the database.

A super key is an attribute or set of attributes that uniquely identifies tuples in a relation.
Ex: nim and name combined

An alternate key is a candidate key that was not chosen as the primary key.

Entity types:

Strong = can stand alone (master table)
Weak = depends on other entities (transaction table)

Structural Constraints (Multiplicity)

One-to-one (1:1) relationships

One-to-many (1:*) relationships

Many-to-many (*:*) relationships

From the diagram above, a newspaper can advertise more than one property for rent, and one
property for rent can be advertised by more than one newspaper.

Such a relationship is called many-to-many.

ER Diagram:

Cardinality and participation

Cardinality describes the maximum number of possible relationship occurrences for an entity
participating in a relationship.
Participation determines whether all or only some entity occurrences take part in a
relationship.

Example: 0..1

0 = participation

1 = cardinality

Problems with ER models are divided into two kinds:

Fan Trap (when both relationships have "many" cardinality)

A fan trap is where the model represents a relationship between entity types, but the pathway
between certain entity occurrences is ambiguous.

A fan trap may exist where two or more 1:* relationships fan out from the same entity.

Chasm Trap (entities that are not always actually related; a link is missing)

A chasm trap is where the model suggests the existence of a relationship between entity types, but
the pathway does not exist between certain entity occurrences.

A chasm trap may occur where there are one or more relationships with a minimum multiplicity of
zero (that is, optional participation) forming part of the pathway between related entities.

The figure represents the facts that a single branch has one or more staff, who oversee zero or more
properties for rent.

Note: not all staff oversee property, and not all properties are overseen by a member of staff.
Figure 11.21(b) examines some occurrences of the Has and Oversees relationships using values for
the primary key attributes of the Branch, Staff, and PropertyForRent entity types. Figure 11.22(a)
represents the true association between the entities.

This model ensures that the properties associated with each branch are known at all times,
including properties that have not yet been allocated to a member of staff. Figure 11.22(b)
examines occurrences of the Has, Oversees, and Offers relationship types.

Example ER Problem:

Fruitsta is a store that sells fruit wholesale and has its own orchard for growing the fruit it
sells. Customers who want to buy fruit from Fruitsta must first register as permanent members.
At the start of the process, the customer gives their personal data to the receptionist, who then
registers the customer as a member. After becoming a member, the member can order fruit, and the
sales staff record the order on an order form; before doing so, the sales staff check the stock of
the fruit. If the stock of the ordered fruit is insufficient, the member can choose to reduce the
order quantity or cancel the order. The order form is taken to the warehouse to pick up the
member's order. When the member makes a payment, the cashier issues a proof of payment to the member.

• Draw the relational schema!
• Determine the PKs, FKs, and their attributes!
• Define the relationships and their multiplicities!

Answer:

Exercises
Create an ER diagram

BRAC is a company that provides transportation services in Jakarta. It offers safe, comfortable
vehicles at an easily reached location. Customers who want to rent can do so by telephone or by
coming to the location. The Customer Service (CS) section asks for the customer's name and identity,
the choice of car type, the number of rental days, and whether a driver is required. CS records and
stores the identity of new customers in the rental transaction. CS records the car and driver (if any)
rental transaction and prints it in 3 copies (copy 1 for the archive, copy 2 for the cashier, copy 3
for the customer).

The customer must pay according to the car rental selected. Payment is made by ATM transfer or
directly at the cashier. For payments via ATM, the customer must fax the bank receipt to the
cashier. After receiving payment, the cashier prints a payment receipt as proof, in 3 copies
(copy 1 for the archive, copy 2 for Operations, copy 3 for the customer). After receiving the proof
of payment, the Operations section hands the car over to the customer and creates a vehicle release
form (2 copies), which the driver takes when taking out the car and which is handed to the customer.

When the customer has finished using the car, the customer returns it directly to BRAC or leaves it
at the return point specified at rental time. When the car is returned, the Operations section
creates a car receipt form (2 copies): copy 1 for the archive, copy 2 for the customer.

Each month, the CS section produces a rental report and the cashier section produces a cash receipts
report, both submitted to the Finance Manager.

• Draw the relational schema!
• Determine the PKs, FKs, and their attributes!
• Define the relationships and their multiplicities!

Enhanced Entity Relationship Modeling (EER)

Specialization / generalization

Generalization: the process of minimizing the differences between entities by identifying their
common characteristics. Ex: cars and motorcycles are generalized into vehicle (bottom-up / from
the particular to the general).

For example: Student has the attributes nim and name. There are several kinds of student, such as
S1 and S2 students, but the differences are minimized because S1 and S2 students share the same
attributes, just nim and name. So they need not be specified separately.

Specialization: the process of maximizing the differences between members of an entity by
identifying the characteristics that distinguish them. Ex: a bank account is specialized into
checking account and savings account (top-down / from the general to the specific) (uses normalization).

Superclass: the parent class, a group of entities that have something in common.
Subclass: a child class, which inherits attributes and methods from the parent.

Two constraints may apply to a specialization / generalization: the disjoint constraint and the
participation constraint.

Disjoint constraint:

- nondisjoint -> "and", a member may belong to more than one subclass

- disjoint -> "or", a member may belong to only one subclass

Participation constraint:

- mandatory -> every member must belong to a subclass

- optional -> a member need not belong to any subclass

There are 4 categories of constraints on specialization and generalization:

1. mandatory and disjoint (mandatory, or)
2. optional and disjoint (optional, or)
3. mandatory and nondisjoint (mandatory, and)
4. optional and nondisjoint (optional, and)

Types of relationship:

Composition (strong): ownership (the part must exist with the whole)
Aggregation (weak): part of

(Figure: two class models of Mobil (car), with the parts Roda (wheel) and Spion (mirror))

Examples of composition are detail tables.

3.2 Example problems

1. In a company there are many workers, whose roles are manager, customer service, and
sales. Every worker must have a role. Draw the classes and their relationships!

2. In a company there are many workers, whose roles are manager, customer service, and
sales. A worker may have no role or at most one role. Draw the classes and their
relationships!

3. Make the relational schema for the following case! (Classes, attributes, multiplicity)
SI DVD Rental "Spazio"
Customers who would like to become members must register at the CS section by paying a
registration fee of Rp 30,000. After registering, customers are given a Member Card.
Members can choose the DVDs they want to rent and bring the chosen DVDs to the Payment
section.
The Payment section records the DVDs rented and calculates the total payment as well as the
rental time limit, where the time limit for each DVD rental is 2 days. After that, the member
pays and the Payment section prints a proof of payment. At the time of return, the Payment
section checks the rental time limit.
If the return is past the time limit, a fine of Rp 1,000 per DVD is charged.

Exercises

Make the relational schema for the following case! (Classes, attributes, multiplicity)

PT. Bendi Car is a company engaged in car rental. All transactions are still done manually. The
following are the activities carried out by officers in handling car rental transactions.

Customers who want to rent a car can look at the list of car rental rates. Tenants may use the
services of a driver or not, according to their needs. Each type of vehicle has a different rental
price, and the price of driver services also differs between the Jabodetabek area and outside it.
The tenant then fills in a Rental Form (FS) and attaches a copy of their identity. The completed
form, together with full payment, is submitted to the cashier, who issues a receipt as proof of
payment.

On return of the vehicle, the clerk checks whether the vehicle is damaged and makes a return form.
If there is damage (e.g. broken rearview mirror, body dents, scratched paint, etc.), the replacement
cost is calculated and charged to the tenant, and a damage form is made. If the tenant is late in
returning the vehicle, a late fee for the car and driver is charged to the tenant. After the tenant
pays for damage and delays, the officer makes a receipt as proof of payment of the fines.

At the end of each month, the rental officer makes a report on damage and late penalties and a
vehicle report. The reports are handed to the owner of Bendi Car rental.

• Make an EER diagram! Use your own assumptions if necessary.
• Complete it with attributes.
• Create the relationships with multiplicities.

Data Warehouse (DWH) Concepts

A data warehouse is a subject-oriented, integrated, time-variant, and non-volatile collection of
data in support of the decision-making process.
A data mart is a subset of a data warehouse that supports decision-making for a particular part of a company.
Data mining is the process of extracting information from very large collections of data to be used
as an aid in decision making.

4 Characteristics of a Data Warehouse

Subject-Oriented

Operational Data | Data Warehouse
Designed around specific applications and functions | Designed around certain subjects
Focus on database design and process design | Focus on data modeling and data design
Contains detailed data | Contains historical data used for analysis

A data warehouse is subject-oriented: it is designed to analyze data based on particular subjects
within the company, not on particular processes or application functions. This is because a data
warehouse stores data that supports decisions; in other words, the stored data is oriented to
subjects, not processes. A data warehouse is organized around the subjects that matter to the
company, such as customers, claims, and product shipments. This contrasts with most Online
Transaction Processing (OLTP) systems, which are more process-oriented.

Integrated

A data warehouse can store data coming from separate sources in a consistent format, integrated
with one another. The data cannot be split apart, because it forms a single entity that supports the
whole data warehouse concept.

The requirement for integrating data sources can be met by being consistent in variable naming,
variable sizes, and the physical attributes of the data.

Time-Variant

Data stored in the data warehouse covers a span of time and can serve as a business
record for each particular time; the data warehouse stores history (historical data).
Compare this with an operational system, where almost everything needs to be up to date.
Time is a critical type of data in the data warehouse.

A data warehouse often stores several kinds of time, such as when a transaction occurred /
was changed / was canceled, when it became effective, when it was entered into the computer,
and when it was entered into the data warehouse. Versions are almost always stored as well:
for example, if the definition of a zip code changes, both the old and the new are kept in the
data warehouse. Again, a good data warehouse is one that stores history.

Non-Volatile

A data warehouse is not updated continuously, unlike OLTP systems, which are updated in real
time. Data is loaded into the warehouse periodically (e.g. every morning or at every end of
month).

Benefits of a Data Warehouse

• High potential return on investment (ROI)
• Competitive advantage
• Increased productivity of decision-makers

End-User Access Tools of a Data Warehouse
• Traditional reporting and query
• Online Analytical Processing (OLAP)
• Data mining

Problems of a Data Warehouse
• High maintenance
• Long project duration
• Complexity of integration
• High demand for resources
• Data ownership

Data Warehouse Architecture

Picture 2 Data Warehouse Architecture

Components of a Data Warehouse

A data warehouse consists of the following components:

1. Operational Data: the data sources, taken from an RDBMS
2. Load Manager: loads data into the data warehouse
3. Warehouse Manager: performs operations on the data warehouse
4. Query Manager: the back-end component that manages queries
5. End-User Access Tools: provide information to the user

ETL Process
ETL is the process of collecting data from operational sources and preparing it for the data
warehouse. The process consists of extracting, transforming, and loading, along with other steps
carried out before the data is published to the data warehouse. So ETL, or extract, transform, load,
is the phase that processes data from the data sources into the data warehouse. The purpose of ETL
is to collect, filter, process, and combine relevant data from different sources for storage in the
data warehouse.

ETL can also be used to integrate with existing systems. The result of the ETL process is data
that meets data warehouse criteria, such as being historical, integrated, summarized, static, and
structured for analysis. The ETL process consists of three stages:

Extract

The first step of the ETL process is retrieving data from one or more operational systems as data
sources (these can be OLTP systems, but can also be sources outside the database system). Most data
warehouse projects combine data from different sources. In essence, extraction is the process of
parsing and cleaning the extracted data to obtain the desired pattern or structure.

Transform

The process of cleaning the data retrieved in the extract stage so that it conforms to the
structure of the data warehouse or data mart. Things that can be done in the transformation stage:

1. Selecting only specific columns to be loaded into the data warehouse.
2. Translating coded values (e.g., the source database stores 1 for men and 2 for women, but the
data warehouse stores M for male and F for female). This automated process is called data
cleansing; no manual cleaning is done during the ETL process.
3. Encoding free-form values (e.g., mapping "male", "1", and "Mr." to "M").
4. Calculating new values (e.g., sale_amount = qty * unit_price).
5. Combining data from multiple sources.
6. Summarizing sets of data rows (e.g., total sales for each section).

The difficulty in the transformation stage is that data must be combined from multiple separate
systems, cleaned so that it is consistent, and aggregated to speed up analysis.
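
Several of the transformation steps above can be sketched in a few lines of plain Python. This is an illustration only: the input rows, the transform function, and all field names other than qty, unit_price, and sale_amount are hypothetical.

```python
from collections import defaultdict

# Hypothetical extracted rows; "note" stands for a column the warehouse does not need.
extracted = [
    {"section": "A", "gender": "1", "qty": 2, "unit_price": 10, "note": "x"},
    {"section": "A", "gender": "Mr.", "qty": 1, "unit_price": 30, "note": "y"},
    {"section": "B", "gender": "2", "qty": 3, "unit_price": 5, "note": "z"},
]

# Coded and free-form source values mapped to the warehouse codes M and F.
GENDER_CODES = {"1": "M", "male": "M", "mr.": "M", "2": "F", "female": "F"}

def transform(row):
    """Apply steps 1 to 4 of the transformation stage to one extracted row."""
    return {
        # Step 1: keep only the selected columns ("note" is dropped).
        "section": row["section"],
        # Steps 2 and 3: translate codes and free-form values; unknown values become None.
        "gender": GENDER_CODES.get(row["gender"].lower()),
        # Step 4: calculate a new value.
        "sale_amount": row["qty"] * row["unit_price"],
    }

transformed = [transform(r) for r in extracted]

# Step 6: summarize rows, here total sales per section.
totals = defaultdict(int)
for r in transformed:
    totals[r["section"]] += r["sale_amount"]
sales_by_section = dict(totals)

print(transformed)
print(sales_by_section)
```
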

Load

The load phase enters data into the final target, that is, the data warehouse. The timing and scope
of replacing or adding data depend on the design of the data warehouse, decided when the
information requirements were analyzed. The load phase interacts with a database; constraints
defined in the database schema, as well as triggers activated when the data is loaded (for example:
uniqueness, referential integrity, mandatory fields), also contribute to the overall quality of the
data from the ETL process.
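
How schema constraints act as a quality gate during loading can be sketched as follows, assuming Python's built-in sqlite3 module. The table names dim_item and fact_sales and the sample batch are hypothetical; the three constraints correspond to the uniqueness, referential integrity, and mandatory field examples above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE dim_item (item_code TEXT PRIMARY KEY);
    CREATE TABLE fact_sales (
        sale_id INTEGER PRIMARY KEY,                             -- uniqueness
        item_code TEXT NOT NULL REFERENCES dim_item(item_code),  -- referential integrity
        sale_amount INTEGER NOT NULL);                           -- mandatory field
    INSERT INTO dim_item VALUES ('I1');
""")

# Hypothetical batch coming out of the transform stage.
batch = [
    (1, "I1", 20),    # valid
    (1, "I1", 30),    # duplicate sale_id
    (2, "I9", 15),    # unknown item_code
    (3, "I1", None),  # missing mandatory field
]

loaded, rejected = [], []
for row in batch:
    try:
        conn.execute("INSERT INTO fact_sales VALUES (?, ?, ?)", row)
        loaded.append(row)
    except sqlite3.IntegrityError:
        # The constraint stops the bad row from reaching the warehouse table.
        rejected.append(row)

print(loaded)
print(rejected)
```
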

CASE

ERD + Normalization
