Week001-Module (1) Merged

This document provides an introduction to Data Mining, covering its techniques, processes, and applications in various fields. It discusses the Knowledge Discovery in Databases (KDD) process, machine learning algorithms, and the CRISP-DM methodology for data mining projects. Additionally, it introduces Weka as a data mining tool and includes a laboratory exercise focused on query optimization in Oracle's Human Resources database.

Advanced Database Management System

Data Mining

Introduction to Data Mining

This module provides an introduction to data mining, a powerful new
technology that extracts hidden information from large databases. Data
mining discovers hidden patterns and predicts future trends and behavior,
which helps businesses make knowledge-driven decisions.
After studying this lesson, you should be able to:
1. Understand the use of data-mining techniques in determining hidden
patterns in data.
2. Identify the steps in data mining procedures.
3. Use WEKA as a data mining tool for analyzing data.

What is Knowledge Discovery and Data Mining?

Knowledge Discovery in Databases (KDD) is a field of computer science
whose main focus is finding useful patterns in large, digitized data
sets.
Data mining is one of the steps in the KDD process. It focuses on
discovering patterns that are useful for prediction. Data mining has been
applied in many disciplines, including medicine, agriculture, business,
and education.
Data mining is an interdisciplinary subfield of computer science. It is the
computational process of discovering patterns in huge data sets using
methods drawn from machine learning (Christopher, 2010).
Data-mining approaches can be separated into two categories.
1. Supervised learning
o for prediction and classification.
2. Unsupervised learning
o to detect patterns and relationships in the data.

What is Machine Learning Algorithm?

The term machine learning refers to the automated detection of meaningful
patterns in data. In the past couple of decades it has become a common tool
in almost any task that requires information extraction from large data sets.
Machine learning algorithms build models that are used to make data-driven
decisions and predictions.

Course Module
Knowledge Discovery Process Models
Sequential Structure of Knowledge Discovery of Databases

Figure 1 Sequential structure of knowledge discovery of databases

1. Input Data
• These are the sets of data or attributes that need to be analyzed.
2. Steps/Procedures
• A series of steps is applied to the data, including data
preparation, cleaning, transformation, and data mining
algorithms.
3. Knowledge
• This refers to the models generated by the specified machine
learning algorithms.

Cross-Industry Standard Process for Data Mining (CRISP-DM)

CRISP-DM is the most widely used methodology for developing DM projects
(KdNuggets.com, 2002; KdNuggets.com, 2004; KdNuggets.com, 2007a).

Figure 2 Cross industry standard process for data mining (Chapman et al., 2000)

The CRISP-DM KDP model (see Figure 2) consists of six steps, which are
summarized below:
1. Business understanding. This step focuses on the understanding of
objectives and requirements from a business perspective. It also
converts these into a DM problem definition, and designs a
preliminary project plan to achieve the objectives.

2. Data understanding. This step starts with initial data collection and
familiarization with the data. Specific aims include identification of
data quality problems, initial insights into the data, and detection of
interesting data subsets.
3. Data preparation. This step covers all activities needed to construct
the final dataset, which constitutes the data that will be fed into DM
tool(s) in the next step. It includes table, record, and attribute
selection; data cleaning; construction of new attributes; and
transformation of data.
4. Modeling. At this point, various modeling techniques are selected and
applied. Modeling usually involves the use of several methods for the
same DM problem type and the calibration of their parameters to
optimal values.
5. Evaluation. After one or more models have been built that have high
quality from a data analysis perspective, the model is evaluated from a
business objective perspective. A review of the steps executed to
construct the model is also performed. A key objective is to determine
whether any important business issues have not been sufficiently
considered. At the end of this phase, a decision about the use of the
DM results should be reached.
6. Deployment. Now the discovered knowledge must be organized and
presented in a way that the customer can use. Depending on the
requirements, this step can be as simple as generating a report or as
complex as implementing a repeatable KDP.

Supervised Learning
The goal of a supervised learning technique is to develop a model that
predicts a value for a continuous outcome or classifies a categorical outcome.
Supervised learning is the machine learning task of inferring a function
from labeled training data (Mohri et al., 2012). The training data consist of a
set of training examples. In supervised learning, each example is
a pair consisting of an input object and a target variable.
The figure below shows the supervised learning process.

Figure 3 Supervised learning process

The figure shows that the data sets are already classified and labeled. The
data sets are processed using machine learning classification algorithms to
derive data models. The models are then assessed using accuracy and error
measures.
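
The workflow described above (labeled examples in, a model out, accuracy measured on held-out examples) can be sketched in a few lines of Python. This is only an illustration: the toy data, the choice of a 1-nearest-neighbour classifier, and all function names are assumptions and do not come from the module.

```python
# Minimal sketch of the supervised learning workflow on assumed toy data.
# Each example is a pair: (input features, target label).
train = [((1.0, 1.2), "no"), ((0.8, 1.0), "no"),
         ((3.0, 3.1), "yes"), ((3.2, 2.9), "yes")]
test = [((0.9, 1.1), "no"), ((3.1, 3.0), "yes"), ((1.1, 0.9), "no")]


def squared_distance(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(features, training_set):
    """1-nearest-neighbour: return the label of the closest training example."""
    _, label = min(training_set,
                   key=lambda pair: squared_distance(pair[0], features))
    return label


def accuracy(test_set, training_set):
    """Fraction of test examples whose predicted label matches the true label."""
    correct = sum(predict(x, training_set) == y for x, y in test_set)
    return correct / len(test_set)


print(accuracy(test, train))
```

The error measure mentioned in the text is simply one minus this accuracy.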

Classification Algorithms
Classification is one form of machine learning that can be used to extract
models describing important classes or to predict future data trends.
Classification models predict categorical class labels, while
prediction models predict continuous-valued functions.

Figure 4 Classification example

Predictors: Outlook, Temp, Humidity, Windy

Target variable: Play
The figure shows a set of observations described by the predictors. From
these, a model can be derived that captures the patterns for Play = No and
Play = Yes.

Classification Trees
• Partition a data set of observations into increasingly smaller and more
homogeneous subsets.
• At each iteration, a subset of observations is split into two new
subsets based on the values of a single variable.
• Series of questions that successively narrow down observations into
smaller and smaller groups of decreasing impurity.
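
As a rough illustration of one such split, the sketch below searches a single numeric variable for the threshold that yields the two least impure subsets. Gini impurity is used here as the impurity measure, which is an assumption (the module does not name one), and the humidity readings are made-up values.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 0.0 for a pure subset, larger for mixed subsets."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


def best_split(values, labels):
    """Try each threshold on one numeric variable and return
    (threshold, weighted impurity) of the best binary split."""
    best = (None, gini(labels))
    for threshold in sorted(set(values))[:-1]:
        left = [lab for v, lab in zip(values, labels) if v <= threshold]
        right = [lab for v, lab in zip(values, labels) if v > threshold]
        weighted = (len(left) * gini(left)
                    + len(right) * gini(right)) / len(labels)
        if weighted < best[1]:
            best = (threshold, weighted)
    return best


# Hypothetical humidity readings with a yes/no target.
humidity = [65, 70, 75, 80, 90, 95]
play = ["yes", "yes", "yes", "no", "no", "no"]
print(best_split(humidity, play))
```

A full tree would apply the same search recursively to each resulting subset until the subsets are pure or too small to split.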

Figure 5 Decision tree model

A decision tree algorithm determines which attribute in a given set of
training feature vectors is most useful for discriminating between the classes
to be learned.
Information gain tells us how important a given attribute of the feature
vectors is.
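
Information gain can be computed as the entropy of the target minus the weighted entropy that remains after splitting on an attribute. The sketch below assumes that standard entropy-based definition; the four rows are hypothetical and only borrow the attribute names Outlook, Windy, and Play from the classification example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of the target minus the weighted entropy after the split."""
    base = entropy([row[target] for row in rows])
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


# Hypothetical observations, not the data from the figure.
rows = [
    {"outlook": "sunny", "windy": "false", "play": "no"},
    {"outlook": "sunny", "windy": "true", "play": "no"},
    {"outlook": "overcast", "windy": "false", "play": "yes"},
    {"outlook": "overcast", "windy": "true", "play": "yes"},
]
print(information_gain(rows, "outlook", "play"))
print(information_gain(rows, "windy", "play"))
```

In this toy data, outlook separates the classes completely while windy tells us nothing, so a decision tree would choose outlook for its first split.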

Unsupervised Learning
Unsupervised learning is a type of machine learning used to
draw inferences from datasets consisting of input data without labeled
responses. The goal is to use the variable values to identify relationships
between observations.

Clustering Algorithms
The goal of the clustering algorithms is to segment observations into similar
groups based on the observed variables.

K-Means Algorithm

Figure 6 K-Means results

Clustering analysis has been widely used in many fields, for example, for
microarray gene expression data (Thalamuthu et al., 2006), mainly for
exploratory data analysis or class novelty discovery.
In the absence of a class label, clustering analysis is also called unsupervised
learning, as opposed to supervised learning that includes classification and
regression.
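
To make the idea of segmenting observations into similar groups concrete, here is a minimal K-Means sketch in plain Python. It assumes the usual two-step loop (assign each point to its nearest centroid, then move each centroid to the mean of its points). Seeding the centroids with the first k points is a simplification chosen to keep the run repeatable, and the six points are invented.

```python
def kmeans(points, k, iterations=10):
    """K-Means on 2-D points; the first k points seed the centroids."""
    centroids = list(points[:k])
    assignment = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2
                + (p[1] - centroids[c][1]) ** 2)
            for p in points
        ]
        # Update step: move each centroid to the mean of its points.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = (sum(p[0] for p in members) / len(members),
                                sum(p[1] for p in members) / len(members))
    return centroids, assignment


# Hypothetical observations forming two loose groups.
points = [(1.0, 1.0), (5.0, 5.0), (1.0, 2.0),
          (5.0, 6.0), (2.0, 1.0), (6.0, 5.0)]
centroids, assignment = kmeans(points, k=2)
print(centroids)
print(assignment)
```

No class labels are used anywhere in the loop, which is what makes this an unsupervised method.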

Classification Accuracy
By counting the classification errors on a sufficiently large validation set
and/or test set that is representative of the population, we will generate an
accurate measure of the model’s classification performance.
Classification confusion matrix: Displays a model’s correct and incorrect
classifications.
Table 1 Classification table

The true positives (TP) and true negatives (TN) are correct classifications. A
false positive (FP) occurs when the outcome is incorrectly predicted as yes (or
positive) when it is actually no (negative). A false negative (FN) occurs when
the outcome is incorrectly predicted as negative when it is actually positive.
The true positive rate is TP divided by the total number of positives, which is
TP + FN; the false positive rate is FP divided by the total number of negatives,
FP + TN. The overall success rate is the number of correct classifications
divided by the total number of classifications.
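
The rates defined above follow directly from the four counts. The sketch below counts TP, FP, TN, and FN from two lists of labels and then applies the three formulas; the labels themselves are hypothetical.

```python
def confusion_counts(actual, predicted, positive="yes"):
    """Count TP, FP, TN and FN for a two-class problem."""
    tp = fp = tn = fn = 0
    for truth, guess in zip(actual, predicted):
        if guess == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def rates(tp, fp, tn, fn):
    """True positive rate, false positive rate and overall success rate."""
    return {
        "true_positive_rate": tp / (tp + fn),
        "false_positive_rate": fp / (fp + tn),
        "success_rate": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical validation-set labels.
actual = ["yes", "yes", "yes", "no", "no", "no", "no", "yes"]
predicted = ["yes", "no", "yes", "no", "yes", "no", "no", "yes"]
counts = confusion_counts(actual, predicted)
print(counts)
print(rates(*counts))
```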

Introduction to Weka
Weka is a collection of machine learning algorithms for data mining tasks.
The algorithms can either be applied directly to a dataset or called from your
own Java code. Weka contains tools for data pre-processing, classification,
regression, clustering, association rules, and visualization. It is also well-
suited for developing new machine learning schemes.

Figure 7 The Interface of Weka

Data Format - Weka can work with a wide variety of data files, including its
own “.arff” format and CSV files.
Classifiers in WEKA are models for predicting nominal or numeric quantities.

Implemented learning schemes include decision trees and lists, instance-based
classifiers, support vector machines, multi-layer perceptrons, logistic
regression, and Bayes’ nets.

Weka Tutorial
Kindly refer to the link below for the Weka tutorial:
https://weka.waikato.ac.nz/dataminingwithweka/preview

References
Chapman, P., et al. (2000). CRISP-DM 1.0: Step-by-step data mining guide.
Germany: CRISP-DM Consortium.
Cios, K.J. (2007). Data Mining: A Knowledge Discovery Approach.
Clifton, Christopher (2016). "Encyclopædia Britannica: Definition of Data
Mining". Retrieved 2016-12-09.
Mehryar Mohri, Afshin Rostamizadeh, Ameet Talwalkar (2012). Foundations
of Machine Learning. The MIT Press. ISBN 9780262018258.
Thalamuthu, I. Mukhopadhyay, X. Zheng and G.C. Tseng (2006). Evaluation
and comparison of gene clustering methods in microarray analysis.
Bioinformatics, 22:2405-2412.
Shai Shalev-Shwartz and Shai Ben-David (2014). Understanding Machine
Learning: From Theory to Algorithms. Cambridge University Press.
https://weka.waikato.ac.nz/dataminingwithweka/preview
http://www.springer.com/978-0-387-33333-5
Course Code: MIT412
Description: Advanced Database Systems
College / Department: School of Graduate Studies, Online Education
Lab Exer No. 3

LABORATORY EXERCISE

Query Optimizer in Oracle

1. The Human Resources (HR) Database

In this laboratory exercise, the Human Resources (HR) database is used as the test database. Download and
unzip sql.rar and you will see 4 SQL scripts. First run hr_drop.sql to remove any existing tables, then run
hr_main.sql to create the 7 tables. The scripts utlxplan.sql and utlxpls.sql will be used later in this exercise.

In the human resource records, each employee has an identification number, email address, job
identification code, salary, and manager. Some employees earn a commission in addition to their salary.

The company also tracks information about jobs within the organization. Each job has an identification
code, job title, and a minimum and maximum salary range for the job. Some employees have been with the
company for a long time and have held different positions within the company. When an employee switches
jobs, the company records the start date and end date of the former job, the job identification number, and the
department.

The sample company is regionally diverse, so it tracks the locations of not only its warehouses but also
of its departments. Each company employee is assigned to a department. Each department is identified by a
unique department number and a short name. Each department is associated with one location. Each location
has a full address that includes the street address, postal code, city, state or province, and country code.

For each location where it has facilities, the company records the country name, currency symbol,
currency name, and the region where the country is located geographically.

2. Analyze

a) Run SQL*Plus for Command Line Interface.

b) Then connect to Oracle Database.

SQL> CONNECT [your username]/[your password]

c) Check the Oracle data dictionary views that hold the statistics used by the query optimizer, i.e., USER_TABLES,
USER_INDEXES, USER_TAB_COLUMNS, USER_TAB_HISTOGRAMS.

**Note: For every statement that you execute from step c onwards, provide a screenshot of
the output.

i. See the statistics of the EMPLOYEES table.

SQL> DESC user_tables


SQL> SELECT table_name, num_rows, blocks, avg_row_len
     FROM user_tables
     WHERE table_name = 'EMPLOYEES';

Note: The table name must be written in uppercase because the data dictionary stores unquoted object names in
uppercase and the string comparison is case-sensitive.

ii. See the statistics of EMP_EMP_ID_PK index.

SQL> DESC user_indexes

SQL> SELECT index_name, blevel, leaf_blocks, distinct_keys, num_rows
     FROM user_indexes
     WHERE index_name = 'EMP_EMP_ID_PK';

d) Delete and compute the statistics of the EMPLOYEES table, and check the statistics using user_tables.

i. Delete the statistics of EMPLOYEES using the ANALYZE command, and then check the statistics in
user_tables.

SQL> ANALYZE TABLE EMPLOYEES DELETE STATISTICS;

SQL> SELECT table_name, num_rows, blocks, avg_row_len
     FROM user_tables
     WHERE table_name = 'EMPLOYEES';

ii. Create the statistics of EMPLOYEES using the ANALYZE command, and then check the created statistics
in user_tables.

SQL> ANALYZE TABLE EMPLOYEES COMPUTE STATISTICS;

SQL> SELECT table_name, num_rows, blocks, avg_row_len
     FROM user_tables
     WHERE table_name = 'EMPLOYEES';

3. DBMS_STATS

In this part, you practice using the DBMS_STATS package with a new table, NEW_EMPLOYEES.

e) Create NEW_EMPLOYEES table.

i. Create NEW_EMPLOYEES table, and compute its statistics using DBMS_STATS.

SQL> CREATE TABLE new_employees
     AS SELECT *
     FROM employees;

**YOUR_USERNAME should be replaced by your own Oracle account.

SQL> EXECUTE DBMS_STATS.GATHER_TABLE_STATS('[YOUR_USERNAME]','NEW_EMPLOYEES');

SQL> SELECT table_name, num_rows
     FROM user_tables
     WHERE table_name = 'NEW_EMPLOYEES';

ii. Make NEW_EMPLOYEES table larger, and then check the statistics.

SQL> INSERT INTO new_employees
     SELECT * FROM new_employees;

SQL> SELECT table_name, num_rows
     FROM user_tables
     WHERE table_name = 'NEW_EMPLOYEES';

f) Using DBMS_STATS, update all statistics for your Oracle account. The created statistics should be saved
in the STATS table.

SQL> EXECUTE DBMS_STATS.CREATE_STAT_TABLE('[ORACLE_USER]','STATS');

SQL> EXECUTE DBMS_STATS.GATHER_SCHEMA_STATS('[ORACLE_USER]');

SQL> EXECUTE DBMS_STATS.EXPORT_SCHEMA_STATS('[ORACLE_USER]','STATS');

SQL> SELECT table_name, num_rows
     FROM user_tables
     WHERE table_name = 'NEW_EMPLOYEES';

g) Using DBMS_STATS, delete all statistics for your Oracle account. Then, recover it from STATS table.

SQL> EXECUTE DBMS_STATS.DELETE_SCHEMA_STATS('[ORACLE_USER]');

SQL> SELECT table_name, num_rows
     FROM user_tables
     WHERE table_name = 'NEW_EMPLOYEES';

SQL> EXECUTE DBMS_STATS.IMPORT_SCHEMA_STATS('[ORACLE_USER]','STATS');

SQL> SELECT table_name, num_rows
     FROM user_tables
     WHERE table_name = 'NEW_EMPLOYEES';

4. View Execution Plan 1: EXPLAIN PLAN command

Before you can use EXPLAIN PLAN you must have a suitable table in which the plan results are stored.
Oracle supplies a script to create this table called 'utlxplan.sql'. You can access this file in the SQL folder that you
have downloaded.

h) Create PLAN_TABLE (utlxplan.sql).

SQL> @[LOCATION]\utlxplan
Example: @C:\sql\utlxplan

i) Change your session to the Rule-Based Optimizer (RBO) using ALTER SESSION.

SQL> ALTER SESSION SET OPTIMIZER_MODE = RULE;

j) Create Explain Plan for the following query.

SQL> EXPLAIN PLAN FOR
     SELECT *
     FROM employees
     WHERE employee_id < '215';

k) Query PLAN_TABLE to see the generated Explain Plan (utlxpls.sql).

SQL> @[LOCATION]\utlxpls
Example: @C:\sql\utlxpls


l) Alternatively, you can see the explain plan using the DBMS_XPLAN package.

SQL> SELECT *
     FROM TABLE(DBMS_XPLAN.DISPLAY);

m) Truncate PLAN_TABLE (deleting its rows), and change your session to the Cost-Based Optimizer using ALTER
SESSION. Then create the Explain Plan for the same query.

SQL> TRUNCATE TABLE plan_table;

SQL> ALTER SESSION SET OPTIMIZER_MODE = ALL_ROWS;

SQL> EXPLAIN PLAN FOR
     SELECT *
     FROM employees
     WHERE employee_id < '215';

n) Query PLAN_TABLE to see the generated Explain Plan (utlxpls.sql).

SQL> @[LOCATION]\utlxpls
Example: @C:\sql\utlxpls

5. View Execution Plan 2: AUTOTRACE

o) Drop plan_table, and set AUTOTRACE option in SQL*Plus.

SQL> DROP TABLE plan_table;

SQL> SET AUTOTRACE ON

p) Execute query to see Execution plan.

SQL> SELECT d.department_name, SUM(e.salary)
     FROM employees e, departments d
     WHERE e.department_id = d.department_id
     AND e.salary >= 3000
     GROUP BY d.department_name;

q) Change the option of AUTOTRACE as TRACEONLY EXPLAIN.

SQL> SET AUTOTRACE TRACEONLY EXPLAIN

6. ANSWER THE FOLLOWING:

r) What do EXPLAIN PLAN and AUTOTRACE do in the query optimizer?

s) Discuss your observations when you execute the statements step by step in the Oracle Query Optimizer.

t) Try to execute two queries that generate the same results: one that uses a JOIN operation and another
that uses a subquery. Write the queries together with the result. Compare them using EXPLAIN PLAN
and AUTOTRACE, then provide a screenshot of the comparison. Do they have the same result in the
plan and trace? Which query statement is better? Explain.

u) Give your conclusion regarding this laboratory exercise.

MODULE OF INSTRUCTION

Subprograms and Triggers

The previous module revealed the real power of the relational model by
examining multiple-table queries in some detail. This module provides
an understanding of the importance of SQL in developing an
application.

Subprograms are named PL/SQL blocks that aid application
development by isolating operations. They are used in creating
modularized code and maintainable applications. Like subprograms,
triggers are also stored in the database; they are fired automatically
when a specified event occurs.

The syntax for creating subprograms differs from one vendor to
another. Thus, in this module, we will be using Oracle’s PL/SQL
syntax.

Learning Objectives
After studying this lesson, you should be able to:

• Concisely define key terms

• Understand the use of SQL in procedural languages
• Understand common uses of subprograms and database triggers
• Differentiate between a procedure and a function
• Identify the available parameter-passing modes
• Create, call and remove subprograms
• Create and remove database triggers

Subprograms
Subprograms are named blocks that can be compiled and stored in
the database. They promote modularized programs that are extensible,
reusable, and maintainable. A subprogram can be either a
procedure or a function.

Advanced Database Systems 1

2.0 Subprograms and Triggers

Table 2.1 Difference between Procedures and Functions

Procedures

A procedure is used to perform an action and can be called with a set
of parameters (input, output, and input-output parameters).

Syntax

The REPLACE option indicates that if the procedure exists, it is
dropped and replaced with the new version created by the statement.

Table 2.2 Emp


Example 1

The procedure update_empsalary increases an employee’s salary
by 5% if the salary is less than 10,000. Otherwise, it applies an increase
of 100 to each employee with a salary of 10,000 or higher.
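
The rule that update_empsalary applies can be tried out with the Python sketch below, which runs an equivalent UPDATE against an in-memory SQLite table. This is not the module's PL/SQL listing: the use of SQLite, the column layout of the emp table, and the two sample rows are all assumptions made for illustration.

```python
import sqlite3


def update_empsalary(conn):
    """Raise salaries below 10,000 by 5%; add a flat 100 to all others."""
    conn.execute(
        """
        UPDATE emp
        SET salary = CASE
            WHEN salary < 10000 THEN salary * 1.05
            ELSE salary + 100
        END
        """
    )
    conn.commit()


# Hypothetical table and rows standing in for the Emp table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (employee_id INTEGER PRIMARY KEY, salary REAL)")
conn.executemany("INSERT INTO emp VALUES (?, ?)", [(100, 24000), (200, 4400)])

update_empsalary(conn)
salaries = dict(conn.execute("SELECT employee_id, salary FROM emp"))
print(salaries)
```

Wrapping the statement in a named function mirrors what the stored procedure does: the operation is isolated behind a name and can be called repeatedly.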

Invoking the update_empsalary procedure using an anonymous block
will give you the following results:

Parameters are used to transfer data values to and from the calling
environment. They are treated the same way as local variables. The
mode option defines whether a parameter is IN (default), OUT, or
IN OUT.

• IN parameter provides values for a subprogram to process
• OUT parameter returns a value to the calling environment
• IN OUT parameter supplies an input value, which may be
returned (output) as a modified value


Table 2.3 Comparing the Parameter Modes

Example 2

Example 2 shows the modified version of Example 1, which passes a
parameter value (IN mode) to the procedure update_empsalary.

Note: Specify the parameter data type without any precision.

The actual parameter 10,000 is passed to the formal parameter
p_sal of the update_empsalary procedure. The value is then used in the
WHERE condition of the update statement. The results are the same.

Example 3


The getsalary procedure has two parameters: p_id, in IN mode, and
p_sal, in OUT mode. The anonymous block in the succeeding statement
invokes the getsalary procedure to process the parameters.

The value 100 of v_id is passed to p_id of the getsalary procedure. The
SELECT statement uses this value to retrieve the salary of the
employee, stores it in the OUT parameter p_sal, and returns the value to
v_sal. The value of v_sal is then used to update the salary of the
employees.

Example 4

Example 4 simply creates a procedure named format_name that
changes the letter case of the employee name from the original one
(first letter capitalized) to uppercase letters. The parameter p_name is
processed both as an input and as an output parameter.

When we execute the preceding anonymous block, the SELECT
statement retrieves the name of the employee with ID 100 (that is,
Steven King) into v_name. The parameter p_name accepts the value
and returns the modified value to v_name. In this case, the employee
name is now STEVEN KING. SERVEROUTPUT must be set to
ON whenever you use the DBMS_OUTPUT package to display
results.

Example 5

The DROP command is used to remove the procedure from your
schema.

Functions

A function is similar to a procedure except that it returns a value. It has
only input parameters, which are used to compute and return a value. A
function may be called as part of a SQL expression or as part of a PL/SQL
expression.

Syntax

The RETURN data type must not include a size specification. There
must be at least one RETURN expression statement.

Example 6

The function get_salgrade retrieves the salary grade of each employee.
It is used as part of the SQL SELECT statement as shown below.

Each row of the salary column is passed to the function. If the salary is
higher than 20,000, the function returns salary grade ‘A’. If
the salary is lower than 20,001 but higher than 10,000, the function
returns ‘B’. Otherwise, it returns salary grade ‘C’.
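
The grading rule just described, and the idea of calling a function from inside a SELECT statement, can be sketched in Python with SQLite's user-defined functions. This is an analogue, not the Oracle listing from Example 6: the emp table, its columns, and the three sample rows are assumptions.

```python
import sqlite3


def get_salgrade(salary):
    """Return 'A' above 20,000, 'B' above 10,000, otherwise 'C'."""
    if salary > 20000:
        return "A"
    if salary > 10000:
        return "B"
    return "C"


conn = sqlite3.connect(":memory:")
# Register the Python function so that SQL statements can call it by name.
conn.create_function("get_salgrade", 1, get_salgrade)

# Hypothetical rows for illustration.
conn.execute("CREATE TABLE emp (last_name TEXT, salary REAL)")
conn.executemany("INSERT INTO emp VALUES (?, ?)",
                 [("King", 24000), ("Hunold", 9000), ("Greenberg", 12008)])

# Each row's salary is passed to the function, one call per row.
grades = conn.execute(
    "SELECT last_name, get_salgrade(salary) FROM emp ORDER BY last_name"
).fetchall()
print(grades)
```

As in the PL/SQL version, the function takes only input and hands back a single value, which is why it can sit inside an SQL expression.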

Result

Example 7

To remove a function, use the DROP command as shown in Example 7.

Triggers
Triggers are similar to stored procedures. However, they differ in the
way that they are invoked. A procedure is explicitly run by a user,
an application, or a trigger, whereas triggers are implicitly
(automatically) fired by the database when a triggering event occurs.


Syntax

Timing indicates when the trigger fires in relation to the triggering
event. Values are:

• BEFORE fires the trigger before the triggering event
on a table
• AFTER fires the trigger after the triggering event on a table
• INSTEAD OF is used for views that are not otherwise
modifiable.

Trigger event types determine which DML statement causes the
trigger to fire. The possible values are INSERT, UPDATE [OF
column], and DELETE. FOR EACH ROW designates that the trigger
is a row trigger. A WHEN clause applies a conditional predicate,
evaluated for each row, to determine whether or not to execute the
trigger body.
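
The pieces named above (timing, event, FOR EACH ROW, and a WHEN condition) can be seen working together in the sketch below. It uses SQLite through Python rather than Oracle, so the trigger syntax is SQLite's and differs in detail from PL/SQL. The audit_salary trigger, the salary_audit table, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, salary REAL);
    CREATE TABLE salary_audit (employee_id INTEGER,
                               old_salary REAL,
                               new_salary REAL);

    -- Timing: AFTER. Event: UPDATE OF salary. Row trigger with a WHEN test.
    CREATE TRIGGER audit_salary
    AFTER UPDATE OF salary ON employees
    FOR EACH ROW
    WHEN NEW.salary <> OLD.salary
    BEGIN
        INSERT INTO salary_audit
        VALUES (OLD.employee_id, OLD.salary, NEW.salary);
    END;

    INSERT INTO employees VALUES (100, 24000), (101, 17000);
    """
)

# This update changes the salary, so the trigger body runs.
conn.execute("UPDATE employees SET salary = 25000 WHERE employee_id = 100")
# This update leaves the value unchanged, so the WHEN clause skips the body.
conn.execute("UPDATE employees SET salary = 17000 WHERE employee_id = 101")

audit_rows = conn.execute("SELECT * FROM salary_audit").fetchall()
print(audit_rows)
```

Nothing in the two UPDATE statements mentions the audit table; the database fires the trigger on its own, which is the implicit invocation described earlier.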

Common uses of triggers are for:

• Security
• Auditing
• Data integrity
• Referential integrity
• Table replication
• Computing derived data automatically
• Event logging


Example 8

In the example, the secure_emp trigger prevents DML operations from
succeeding if they are attempted outside business hours (8:00 am to
6:00 pm, Monday through Friday). If a user attempts to make changes to the
EMPLOYEES table on Sunday, then the user sees an error message.

RAISE_APPLICATION_ERROR is a predefined procedure that
returns an error message to the user.

Glossary
Function
- A type of subprogram that returns one value and has only input
parameters.

Procedure
- A type of subprogram that performs an action and promotes
reusability and maintainability.

Subprograms
- Are named PL/SQL statements that can be called with a set of
parameters and are stored in the database.

Trigger
- A named set of SQL statements that is executed when a
triggering DML operation occurs.


References
Hoffer, J., Ramesh, V., Topi, H. (2013). Modern Database
Management, 11th Ed. New Jersey: Pearson Education, Inc.

Price, J. (2008). Oracle Database 11g SQL: Master SQL and PL/SQL in
the Oracle Database. New York: McGraw-Hill.

Advanced Database Systems
Object-Oriented and Object-Relational Databases

Object-Oriented and Object-Relational Databases

This module introduces object-based databases as a solution to the
obstacles faced by programmers using the relational data model. This
technology allows programmers to deal with complex data types as required
by complex application domains.
Two approaches are used in practice, both of which are discussed in this
module: object-oriented and object-relational database systems.
After studying this lesson, you should be able to:
1. Concisely define key terms.
2. Describe the main features of Object-Oriented DBMS.
3. Model a real-world domain by using a Unified Modeling Language (UML)
class diagram.
4. List the features of Object-Relational DBMS.
5. Compare the object-oriented and object-relational DBMS.

Object-Oriented DBMS (OODBMS)

In the mid-1980s, object-oriented databases emerged in response to the
inadequacy of relational databases for certain classes of applications. The
object-oriented model is becoming increasingly popular because of its ability
to thoroughly represent complex relationships, as well as to represent data
and system behavior in a consistent, integrated notation. The object-oriented
approach offers even more expressive power than the relational model.
Techopedia (2016) defines an OODBMS as a database that subscribes to a
model with information represented by objects. Object-oriented databases
are a niche offering in the relational database management system (RDBMS)
field and are not as successful or well-known as mainstream database
engines.
Techopedia also explains that, as the name implies, the main feature of
object-oriented databases is allowing the definition of objects, which are
different from normal database objects. Objects, in an object-oriented
database, reference the ability to develop a product, then define and name it.
The object can then be referenced, or called later, as a unit without having to
go into its complexities. This is very similar to objects used in object-oriented
programming.
Relations are not the central concept; classes and objects are.
Object-oriented DBMSs are DBMSs based on an object-oriented data
model inspired by object-oriented programming languages. OODBMSs are
capable of storing complex objects, i.e., objects that are composed of other
objects and/or multivalued attributes.

Main Features:
• Powerful Type System
o Primitive types: integer, string, date, Boolean, float, etc.
o Structure type: attribute can be a record with a schema
o Collection type: attribute can be a Set, Bag, List, Array of other
types
o Reference type: attribute can be a Pointer to another object
• Classes
o A class takes the place of a relation
• Object Identity
o OID is a unique identity of each object regardless of its content
o Easier for references
• Inheritance
o A class can be defined in terms of another one.
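
The features listed above have close counterparts in any object-oriented language. The Python sketch below is only an in-memory analogy, not an OODBMS: the class names and sample values are invented, and Python's `is` test stands in for comparing object identifiers (OIDs).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Address:                  # structure type: a record with its own schema
    street: str
    city: str


@dataclass
class Person:                   # a class takes the place of a relation
    name: str                   # primitive type
    address: Address            # structure-typed attribute
    phones: List[str] = field(default_factory=list)   # collection type


@dataclass
class Student(Person):          # inheritance: defined in terms of Person
    advisor: Optional[Person] = None    # reference to another object


# Hypothetical objects for illustration.
advisor = Person("A. Cruz", Address("1 Main St", "Manila"))
student = Student("M. Jones", Address("2 Oak Ave", "Cebu"),
                  ["555-0100"], advisor)

# Object identity: same content does not mean same object.
twin = Person("A. Cruz", Address("1 Main St", "Manila"))
print(twin == advisor)   # compares content
print(twin is advisor)   # compares identity
```

The advisor attribute holds a reference rather than a copy, so following it leads to the very same advisor object, which is the pointer-style navigation discussed under Weaknesses below.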
Strengths
The advantages of the object database approach are that applications which
define a rich type system for their data structures can carry this over to the
database when these data structures are made persistent. The impedance
mismatch is eliminated as the database becomes a “persistence extension” to
the language rather than an external service that one has to talk to through a
narrow and limited interface.

Weaknesses
Object systems introduce pointers linking objects to each other. This in turn
promotes a procedural style of navigation among the data items that can be
seen as a step backwards from the higher-level declarative approach
introduced by relational systems.

Unified Modeling Language (UML)

UML is a set of graphical notations backed by a common metamodel that is
widely used both for business modeling and for specifying, designing, and
implementing software system artifacts. To represent a complex system
effectively, the model you develop must consist of a set of independent views
or perspectives. UML allows you to represent multiple perspectives of a
system by providing different types of graphical diagrams, such as the use-
case diagram, class diagram, state diagram, sequence diagram, component
diagram, and deployment diagram. If these diagrams are used correctly
together in the context of a well-defined modeling process, UML allows you
to analyze, design, and implement a system based on one consistent
conceptual model.

Figure 1 Example of UML Classes

Object-Oriented Data Modeling

A class is an entity type that has a well-defined role in the application
domain about which the organization wishes to maintain state, behavior, and
identity. A class is a concept, an abstraction, or a thing that makes sense and
matters in an application context (Blaha and Rumbaugh, 2005). A class could
represent a tangible or visible entity type (e.g., a person, place, or thing); it
could be a concept or an event (e.g., Department, Performance, Marriage,
Registration, etc.); or it could be an artifact of the design process (e.g., User
Interface, Controller, Scheduler). An object is an instance of a class (e.g., a
particular person, place, or thing) that encapsulates the data and behavior we
need to maintain about that object. A class of objects shares a common set of
attributes and behaviors.
The state of an object encompasses its properties (attributes and
relationships) and the values those properties have, and its behavior
represents how an object acts and reacts (Booch, 1994). Thus, an object’s
state is determined by its attribute values and links to other objects. An
object’s behavior depends on its state and the operation being performed. An
operation is simply an action that one object performs in order to give a
response to a request. You can think of an operation as a service provided by
an object (supplier) to its clients. A client sends a message to a supplier,
which delivers the desired service by executing the corresponding
operation.
Consider an example of the Student class and a particular object in this class,
Mary Jones. The state of this object is characterized by its attributes, say,
name, date of birth, year, address, and phone, and the values these attributes
currently have. For example, name is “Mary Jones,” year is “junior,” and so on.
The object’s behavior is expressed through operations such as calcGpa, which
is used to calculate a student’s current grade point average. The Mary Jones
object, therefore, packages its state and its behavior together. Every object
has a persistent identity; that is, no two objects are the same, and an object
maintains its own identity over its life. For example, if Mary Jones gets
married and the values of the attributes name, address, and phone change for
her, the same object will still represent her.
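The Mary Jones example can be sketched in a few lines of Python. This is only an illustration: the attribute values (birth date, addresses, phone numbers, grade points) and the `grade_points` field are assumptions made for the sketch, not part of the original example.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # eq=False: objects are compared by identity, not by attribute values
class Student:
    """Hypothetical Student class; attribute names follow the text."""
    name: str
    date_of_birth: str
    year: str
    address: str
    phone: str
    grade_points: list = field(default_factory=list)  # assumed storage for grades

    def calc_gpa(self) -> float:
        """Behavior: compute the grade point average from the object's state."""
        if not self.grade_points:
            return 0.0
        return sum(self.grade_points) / len(self.grade_points)


# All values below are made up for illustration.
mary = Student("Mary Jones", "1995-04-12", "junior", "12 Elm St", "555-0100", [3.0, 4.0])
identity_before = id(mary)

# State changes (new name, address, phone) do not create a new object.
mary.name = "Mary Smith"
mary.address = "34 Oak Ave"
mary.phone = "555-0199"

same_object = id(mary) == identity_before
gpa = mary.calc_gpa()
```

The object's state changes, but `same_object` stays true: the identity of the object is independent of its attribute values.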
You can depict the classes graphically in a class diagram as in Figure 2. A
class diagram shows the static structure of an object-oriented model: the
object classes, their internal structure, and the relationships in which they
participate. The figure shows two classes, Student and Course, along with
their attributes and operations. All students have in common the properties
of name, dateOfBirth, year, address, and phone. They also exhibit common
behavior by sharing the calcAge, calcGpa, and registerFor (course)
operations.

Figure 2 Class Diagram

An operation, such as calcGpa in Student (see Figure 2), is a function or a
service that is provided by all the instances of a class. Typically, other objects
can access or manipulate the information stored in an object only through
such operations. The operations, therefore, provide an external interface to a
class; the interface presents the outside view of the class without showing its
internal structure or how its operations are implemented. This technique of
hiding the internal implementation details of an object from its external view
is known as encapsulation, or information hiding. So although we provide
the abstraction of the behavior common to all instances of a class in its
interface, we hide within the class its structure and the secrets of the desired
behavior.
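A minimal sketch of encapsulation, assuming a simplified Student class: the `_grades` dictionary and the `record_grade` operation are hypothetical names introduced only for this illustration. Clients use the operations and never touch the internal structure directly.

```python
class Student:
    """Hypothetical class: internal structure is hidden behind operations."""

    def __init__(self, name: str) -> None:
        self._name = name
        self._grades = {}  # internal detail (assumed layout): course -> grade points

    def register_for(self, course: str) -> None:
        """Operation from the text: enroll the student in a course."""
        self._grades.setdefault(course, None)

    def record_grade(self, course: str, points: float) -> None:
        """Hypothetical operation: store a grade for a registered course."""
        if course not in self._grades:
            raise ValueError(f"{self._name} is not registered for {course}")
        self._grades[course] = points

    def calc_gpa(self) -> float:
        """Operation from the text: average over the courses graded so far."""
        graded = [p for p in self._grades.values() if p is not None]
        return sum(graded) / len(graded) if graded else 0.0


# Client code works only through the external interface.
s = Student("Mary Jones")
s.register_for("Databases")
s.register_for("Statistics")
s.record_grade("Databases", 4.0)
gpa = s.calc_gpa()
```

Because callers depend only on the operations, the internal dictionary could later be replaced by another structure without changing any client code.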

Object-Relational Database Management System (ORDBMS)

The object-oriented model tries to bring the main concepts of the relational
model to the OO domain (OO concepts with some extensions), while the
object-relational model tries to bring the main concepts of the OO domain to
the relational model (the relational model with some extensions through
user-defined types).
Techopedia (2016) defines an ORDBMS as a database management system
that is similar to a relational database, except that it has an object-oriented
database model. This system supports objects, classes, and inheritance in
database schemas and query language.
It also explains that object-relational database management systems provide
a middle ground between relational and object-oriented databases. In an
ORDBMS, data is manipulated using queries in a query language. These
systems bridge the gap between conceptual data modeling techniques such
as entity-relationship diagrams and object-relational mapping using classes
and inheritance. ORDBMSs also support data model extensions with custom
data types and methods. This allows developers to raise the abstraction
levels at which problem domains are viewed.
The relational model is extended with the following features:
• Type system with primitive and structured types (UDT)
• Methods
• Identifiers for tuples
• References
Several major software vendors, including IBM, Informix, Microsoft, Oracle,
and Sybase, support the object-relational model in their DBMSs. SQL-99, or
SQL3, is the extended SQL standard for the object-relational model.
User-defined types (UDTs) take the place of classes. A UDT is declared in the
following form:
Create type <name> as (attributes and method declarations)

Figure 2 Example of UDT

Once types are created, we can create relations:

Create Table Moviestar OF StarType;
Now, let’s query an object-relational database using SQL-99 (SQL3):

Figure 3 Querying an object-relational database

Comparison of Object-Oriented and Object-Relational Databases

When to consider OODBMS or ORDBMS:
• Complex Relationships
   o A lot of many-to-many relationships, tree structures, or
     network (graph) structures.
• Complex Data
   o Multi-dimensional arrays, nested structures, or binary data,
     images, multimedia, etc.
• Distributed Databases
   o Need for free objects without the rigid table structure.
• Repetitive use of Large Working Sets of Objects
   o To make use of inheritance and reusability.
• Expensive Mapping Layer
   o Expensive decomposition of objects (normalization) and
     recomposition at query time.

Key benefits of ODBMS:

• Persistence & versioning
   o Created objects are maintained across different database runs
     (persistent)
   o Different evolving copies of the same object can be created
     over time (versioning)
• Sharing in highly distributed environment
   o Easier to share and distribute objects than tables
• Better memory usage and less paging
   o Bringing only objects of interest

Object-oriented vs. Object-relational DBMS

1. Object-relational databases are object-oriented databases built on top of
the relational model.
The declarative nature and limited power of the SQL language
provides good protection of data from programming errors, and makes the
high-level optimization, such as reducing I/O, relatively easy.
Object-relational systems aim at making data modeling and querying
easier by using complex data types. Typical applications include storage and
querying of complex data, including multimedia data.
2. Persistent programming language-based OODBs target applications
that have high performance requirements: no data translation is
needed and access to persistent data has low overhead, but they are more
susceptible to data corruption from programming errors and usually do not
have a powerful querying capability. Typical applications include CAD databases.
This contrasts with a declarative language, which imposes a significant
performance penalty for certain kinds of applications that run primarily in
main memory and that perform a large number of accesses to the database.
3. Summary:
1. relational systems: simple data types, powerful query languages,
high protection.
2. persistent programming language-based OODBs: complex data
types, integration with programming languages, high performance.
3. object-relational systems: complex data types, powerful query
languages, high protection.
Some systems blur the boundary; e.g., some object-oriented database
systems built around a persistent programming language are implemented
on top of a relational database system.

Criteria | RDBMS | ORDBMS | ODBMS
Defining standard | SQL2 (ANSI X3H2) | SQL3/4 (in process) | ODMG-V2.0
Support for OO programming | Poor: programmers spend 25% coding time mapping the program object to the database | Limited mostly to new data types | Direct and extensive
Simplicity of use | Table structures easy to understand; many end-user tools available | Same as RDBMS, with some confusing extensions | OK for programmers; some SQL access for end users
Simplicity of development | Provides independence of data from application, good for simple relationships | Provides independence of data from application, good for simple relationships | Objects are a natural way to model; can accommodate a wide variety of types and relationships
Extensibility and content | None | Limited mostly to new data types | Can handle arbitrary complexity; users can write methods and on any structure
Complex data relationships | Difficult to model | Difficult to model | Can handle arbitrary complexity; users can write methods and on any structure

Activities and Exercises

Answer the following questions:
1. What are ODL and OQL? Provide a comprehensive example for each.
2. Perform some research on the Internet on OODBMS products. Compare
various OODBMSs currently on the market in terms of features, capacity,
and scalability. How do they compare with RDBMS products?

References

Hoffer, J. A., Ramesh, V., Topi, H. (2013). Modern Database Management 11th
Ed., New Jersey: Prentice Hall.
Silberschatz, A., Korth, H. F., Sudarshan, S. (2011). Database System Concepts
6th Ed., New York: McGraw-Hill.
http://wps.prenhall.com/wps/media/objects/3310/3390076/hoffer_ch15.pdf
https://www.techopedia.com/definition/8639/object-oriented-database
https://www.techopedia.com/definition/8715/object-relational-database-management-system-ordbms
http://web.cs.wpi.edu/~cs561/s12/Lectures/2-3/OO.pdf
http://argouml.tigris.org/

MODULE OF INSTRUCTION

Query Optimization

This module describes the most important concepts relating to the
query optimizer. It discusses SQL processing, optimization methods,
and how the query optimizer chooses a specific plan to execute SQL.

The examples in this module use the HR database schema, which you
can download from this module. It consists of seven tables, namely
COUNTRIES, DEPARTMENTS, EMPLOYEES, JOB_HISTORY,
JOBS, LOCATIONS, and REGIONS.

It is important that you understand how the query optimizer processes
a SQL statement and chooses the most efficient way to execute it.

Learning Objectives
After studying this lesson, you should be able to:

• Concisely define key terms

• Understand how queries are processed
• Understand how the query optimizer chooses a specific plan to
execute SQL statement

Query Optimizer
The query optimizer (called simply the optimizer) is built-in
database software that determines the most efficient method for a SQL
statement to access requested data.

Purpose of the Query Optimizer

The optimizer attempts to generate the best execution plan for a SQL
statement. The best execution plan is defined as the plan with the
lowest cost among all considered candidate plans. The cost
computation accounts for factors of query execution such as I/O, CPU,
and communication.

Advanced Database Systems 1

3.0 Query Optimization

The best method of execution depends on myriad conditions including

how the query is written, the size of the data set, the layout of the data,
and which access structures exist. The optimizer determines the best
plan for a SQL statement by examining multiple access methods, such
as full table scan or index scans, and different join methods such as
nested loops and hash joins.

Because the database has many internal statistics and tools at its
disposal, the optimizer is usually in a better position than the user to
determine the best method of statement execution. For this reason, all
SQL statements use the optimizer.

Consider a user who queries records for employees who are managers.
If the database statistics indicate that 80% of employees are managers,
then the optimizer may decide that a full table scan is most efficient.
However, if statistics indicate that few employees are managers, then
reading an index followed by a table access by rowid may be more
efficient than a full table scan.

Cost-Based Optimization

Query optimization is the overall process of choosing the most

efficient means of executing a SQL statement. SQL is a nonprocedural
language, so the optimizer is free to merge, reorganize, and process in
any order.

The database optimizes each SQL statement based on statistics

collected about the accessed data. When generating execution plans,
the optimizer considers different access paths and join methods.
Factors considered by the optimizer include:

• System resources, which include I/O, CPU, and memory
• Number of rows returned
• Size of the initial data sets

The cost is a number that represents the estimated resource usage for
an execution plan. The optimizer’s cost model accounts for the I/O,
CPU, and network resources that the database requires to execute the
query. The optimizer assigns a cost to each possible plan, and then
chooses the plan with the lowest cost. For this reason, the optimizer is
sometimes called the cost-based optimizer (CBO) to contrast it with
the legacy rule-based optimizer (RBO).


Execution Plans

An execution plan describes a recommended method of execution for
a SQL statement. The plan shows the combination of the steps Oracle
Database uses to execute a SQL statement. Each step either retrieves
rows of data physically from the database or prepares them for the user
issuing the statement.

An execution plan displays the cost of the entire plan, indicated on line
0, and each separate operation. The cost is an internal unit that the
execution plan only displays to allow for plan comparisons. Thus, you
cannot tune or change the cost value.

In Figure 3-1, the optimizer generates two possible execution plans for
an input SQL statement, uses statistics to estimate their costs,
compares their costs, and then chooses the plan with the lowest cost.

Figure 3-1 Execution Plans

Query Blocks

As shown in Figure 3-1, the input to the optimizer is a parsed

representation of a SQL statement. Each SELECT block in the original
SQL statement is represented internally by a query block. A query
block can be a top-level statement, subquery, or unmerged view.


Example 3-1 Query Blocks

The following SQL statement consists of two query blocks. The

subquery in parentheses is the inner query block. The outer query
block, which is the rest of the SQL statement, retrieves names of
employees in the departments whose IDs were supplied by the
subquery. The query form determines how query blocks are
interrelated.
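The statement for Example 3-1 is not reproduced in this text. As a rough illustration only, the sketch below assumes a statement of the shape described (an outer block that filters on the IDs returned by a parenthesized subquery) and runs it with Python's `sqlite3` module. The table contents and the `location_id = 1800` filter are made up, and SQLite is used purely as a convenient stand-in, not as a model of Oracle's optimizer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (department_id INTEGER PRIMARY KEY, location_id INTEGER);
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, last_name TEXT,
                            department_id INTEGER);
    INSERT INTO departments VALUES (10, 1700), (20, 1800), (30, 1700);
    INSERT INTO employees VALUES (1, 'King', 10), (2, 'Hartstein', 20),
                                 (3, 'Fay', 20), (4, 'Raphaely', 30);
""")

# Inner query block: the subquery that will appear in parentheses.
inner_block = "SELECT department_id FROM departments WHERE location_id = 1800"

# Outer query block: the rest of the statement; it uses the IDs from the inner block.
statement = f"""
    SELECT last_name
    FROM employees
    WHERE department_id IN ({inner_block})
    ORDER BY last_name
"""

inner_ids = [row[0] for row in conn.execute(inner_block)]
names = [row[0] for row in conn.execute(statement)]
```

Running the inner block on its own shows which department IDs it supplies; the full statement then returns the employees in those departments.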

Query Subplans

For each query block, the optimizer generates a query subplan. The
database optimizes query blocks separately from the bottom up. Thus,
the database optimizes the innermost query block first and generates a
subplan for it, and then generates the outer query block representing
the entire query.

The number of possible plans for a query block grows with the number of
objects in the FROM clause, and it rises exponentially as objects are added.
For example, a join of five tables has significantly more possible plans than
a join of two tables.
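To get a feel for this growth, the sketch below counts only the possible join orders, under the simplifying assumption that every permutation of the tables is a distinct candidate (a left-deep join tree). Real optimizers also vary access paths and join methods, so the true number of candidate plans is larger still.

```python
from math import factorial


def join_order_count(table_count: int) -> int:
    """Number of left-deep join orders: one per permutation of the tables."""
    return factorial(table_count)


counts = {n: join_order_count(n) for n in (2, 3, 5, 8)}
```

Under this assumption, two tables give 2 join orders, while five tables already give 120.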

Analogy for the Optimizer

One analogy for the optimizer is an online trip advisor. A cyclist wants
to know the most efficient bicycle route from point A to point B. A
query is like the directive "I need the most efficient route from point A
to point B" or "I need the most efficient route from point A to point B
by way of point C." The trip advisor uses an internal algorithm, which
relies on factors such as speed and difficulty, to determine the most
efficient route. The cyclist can influence the trip advisor's decision by
using directives such as "I want to arrive as fast as possible" or "I want
the easiest ride possible."

In this analogy, an execution plan is a possible route generated by the
trip advisor. Internally, the advisor may divide the overall route into
several subroutes (subplans), and calculate the efficiency for each
subroute separately. For example, the trip advisor may estimate one
subroute at 15 minutes with medium difficulty, an alternative subroute
at 22 minutes with minimal difficulty, and so on.

The advisor picks the most efficient (lowest cost) overall route based
on user-specified goals and the available statistics about roads and
traffic conditions. The more accurate the statistics, the better the
advice. For example, if the advisor is not frequently notified of traffic
jams, road closures, and poor road conditions, then the recommended
route may turn out to be inefficient (high cost).

Optimizer Components
The optimizer contains three components, which are shown in
Figure 3-2.

Figure 3-2 Optimizer Components


A set of query blocks represents a parsed query, which is the input to
the optimizer. The optimizer performs the following operations:

1. Query transformer

The optimizer determines whether it is helpful to change the
form of the query so that the optimizer can generate a better
execution plan.

2. Estimator

The optimizer estimates the cost of each plan based on
statistics in the data dictionary.

3. Plan Generator

The optimizer compares the costs of plans and chooses the
lowest-cost plan, known as the execution plan, to pass to the
row source generator.

Query Transformer

For some statements, the query transformer determines whether it is

advantageous to rewrite the original SQL statement into a semantically
equivalent SQL statement with a lower cost. When a viable alternative
exists, the database calculates the cost of the alternatives separately
and chooses the lowest-cost alternative. Query Transformations
describes the different types of optimizer transformations.

Figure 3-3 shows the query transformer rewriting an input query that
uses OR into an output query that uses UNION ALL.


Figure 3-3 Query Transformer
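Since the figure is not reproduced here, the following sketch shows what such a rewrite can look like. It is an assumed example, not the query from the figure: the `sales` table, its columns, and its rows are hypothetical, and the queries are run with Python's `sqlite3` module only to confirm that both forms return the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, promo_id INTEGER, prod_id INTEGER);
    INSERT INTO sales VALUES (1, 33, 136), (2, 33, 10), (3, 7, 136),
                             (4, 7, 10), (5, NULL, 136);
""")

# Input form: a single query block with OR.
with_or = """
    SELECT sale_id FROM sales
    WHERE promo_id = 33 OR prod_id = 136
"""

# Output form: two branches combined with UNION ALL. The second branch excludes
# rows already returned by the first, so no duplicates appear.
# "IS NOT" is SQLite's NULL-safe inequality operator.
with_union_all = """
    SELECT sale_id FROM sales WHERE prod_id = 136
    UNION ALL
    SELECT sale_id FROM sales WHERE promo_id = 33 AND prod_id IS NOT 136
"""

rows_or = sorted(r[0] for r in conn.execute(with_or))
rows_union = sorted(r[0] for r in conn.execute(with_union_all))
```

The two statements are semantically equivalent, which is the precondition for the transformer to consider the rewrite; whether the rewritten form is actually cheaper depends on the available indexes and statistics.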

Estimator

The estimator is the component of the optimizer that determines the

overall cost of a given execution plan.

The estimator uses three different measures to determine cost:

• Selectivity

The fraction of rows in the row set that the query selects,
with 0 meaning no rows and 1 meaning all rows. Selectivity is
tied to a query predicate, such as WHERE last_name LIKE
'A%', or a combination of predicates. A predicate becomes
more selective as the selectivity value approaches 0 and less
selective (or more unselective) as the value approaches 1.
Selectivity is an internal calculation that is not visible in
the execution plans.

• Cardinality

The cardinality is the number of rows returned by each
operation in an execution plan. This input, which is crucial to
obtaining an optimal plan, is common to all cost functions. The
estimator can derive cardinality from the table statistics
collected by DBMS_STATS, or derive it after accounting for
effects from predicates (filter, join, and so on), DISTINCT or
GROUP BY operations, and so on. The Rows column in an
execution plan shows the estimated cardinality.

• Cost

This measure represents units of work or resource used. The
query optimizer uses disk I/O, CPU usage, and memory usage
as units of work.

As shown in Figure 3-4, if statistics are available, then the estimator

uses them to compute the measures. The statistics improve the degree
of accuracy of the measures.

Figure 3-4 Estimator

For the query shown in Example 3-1, the estimator uses selectivity,
estimated cardinality (a total return of 10 rows), and cost measures to
produce its total cost estimate of 3:


Selectivity

The selectivity represents a fraction of rows from a row set. The row
set can be a base table, a view, or the result of a join. The selectivity is
tied to a query predicate, such as last_name = 'Smith', or a combination
of predicates, such as last_name = 'Smith' AND job_id ='SH_CLERK'.

A predicate filters a specific number of rows from a row set. Thus, the
selectivity of a predicate indicates how many rows pass the predicate
test. Selectivity ranges from 0.0 to 1.0. A selectivity of 0.0 means that
no rows are selected from a row set, whereas a selectivity of 1.0 means
that all rows are selected. A predicate becomes more selective as the
value approaches 0.0 and less selective (or more unselective) as the
value approaches 1.0.

The optimizer estimates selectivity depending on whether statistics are
available:

• Statistics not available

Depending on the value of the OPTIMIZER_DYNAMIC_SAMPLING
initialization parameter, the optimizer either uses dynamic
statistics or an internal default value. The database uses
different internal defaults depending on the predicate type. For
example, the internal default for an equality predicate
(last_name = 'Smith') is lower than for a range predicate
(last_name > 'Smith') because an equality predicate is expected
to return a smaller fraction of rows.

• Statistics available

When statistics are available, the estimator uses them to
estimate selectivity. Assume there are 150 distinct employee
last names. For an equality predicate last_name = 'Smith',
selectivity is the reciprocal of the number n of distinct values
of last_name, which in this example is 1/150, or about .0067,
because the query selects rows that contain 1 out of 150
distinct values.

If a histogram exists on the last_name column, then the
estimator uses the histogram instead of the number of distinct
values. The histogram captures the distribution of different
values in a column, so it yields better selectivity estimates,
especially for columns that have data skew.
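The two estimates described above can be sketched as follows. This is a simplified illustration, not Oracle's implementation: the frequency table stands in for a histogram, and its counts are invented to show the effect of skew.

```python
def equality_selectivity(distinct_values: int) -> float:
    """Selectivity of col = constant, assuming values are uniformly distributed."""
    if distinct_values <= 0:
        raise ValueError("distinct_values must be positive")
    return 1.0 / distinct_values


def histogram_selectivity(frequencies: dict, value: str) -> float:
    """Selectivity of col = value from per-value row counts (a toy histogram)."""
    total = sum(frequencies.values())
    return frequencies.get(value, 0) / total


# 150 distinct last names, as in the text.
uniform_estimate = equality_selectivity(150)

# Hypothetical skewed column: most rows share one value.
skewed_counts = {"Smith": 40, "King": 5, "Fay": 5}
smith_estimate = histogram_selectivity(skewed_counts, "Smith")
uniform_for_skewed = equality_selectivity(len(skewed_counts))
```

For the skewed column, the uniform assumption would estimate one third of the rows for any name, while the per-value counts show that 'Smith' actually matches 80% of them. This gap is why histograms help on skewed data.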

Cardinality

The cardinality is the number of rows returned by each operation in

an execution plan. For example, if the optimizer estimate for the
number of rows returned by a full table scan is 100, then the
cardinality estimate for this operation is 100. The cardinality estimate
appears in the Rows column of the execution plan.

The optimizer determines the cardinality for each operation based on a

complex set of formulas that use both table and column level statistics,
or dynamic statistics, as input. The optimizer uses one of the simplest
formulas when a single equality predicate appears in a single-table
query, with no histogram. In this case, the optimizer assumes a
uniform distribution and calculates the cardinality for the query by
dividing the total number of rows in the table by the number of distinct
values in the column used in the WHERE clause predicate.

Example 3-2

For example, user hr queries the employees table as follows:

The employees table contains 107 rows. The current database statistics
indicate that the number of distinct values in the salary column is 58.
Thus, the optimizer calculates the cardinality of the result set as 2,
using the formula 107/58=1.84.
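The arithmetic in Example 3-2 can be checked in a few lines. The sketch assumes the simple uniform-distribution formula described above and uses Python's built-in rounding to turn the estimate into a whole number of rows; the exact rounding rule the database applies is not stated in this text.

```python
def estimated_cardinality(total_rows: int, distinct_values: int) -> float:
    """Rows expected for col = constant: total rows times selectivity (1 / NDV)."""
    return total_rows * (1.0 / distinct_values)


raw_estimate = estimated_cardinality(107, 58)  # employees rows, distinct salaries
rows_column = round(raw_estimate)              # whole-row estimate shown in a plan
```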

Cardinality estimates must be as accurate as possible because they
influence all aspects of the execution plan. Cardinality is important
when the optimizer determines the cost of a join. For example, in a
nested loops join of the employees and departments tables, the number
of rows in employees determines how often the database must probe
the departments table. Cardinality is also important for determining the
cost of sorts.

Cost

The optimizer cost model accounts for the I/O, CPU, and network
resources that a query is predicted to use. The cost is an internal
numeric measure that represents the estimated resource usage for a
plan. The lower the cost, the more efficient the plan.

The execution plan displays the cost of the entire plan, which is
indicated on line 0, and each individual operation.

Example 3-3

For example, the following plan shows a cost of 7.

The cost is an internal unit that you can use for plan comparisons. You
cannot tune or change it.

The access path determines the number of units of work required to get
data from a base table. The access path can be a table scan, a fast full
index scan, or an index scan.

• Table scan or fast full index scan

During a table scan or fast full index scan, the database reads
multiple blocks from disk in a single I/O. Therefore, the cost of
the scan depends on the number of blocks to be scanned and
the multiblock read count value.

• Index scan

The cost of an index scan depends on the levels in the B-tree,
the number of index leaf blocks to be scanned, and the number
of rows to be fetched using the rowid in the index keys. The
cost of fetching rows using rowids depends on the index
clustering factor.

The join cost represents the combination of the individual access costs
of the two row sets being joined, plus the cost of the join operation.
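The dependencies listed above can be turned into a toy cost model. The formulas below are a commonly used textbook-style simplification and are an assumption of this sketch, not the optimizer's actual cost functions; all statistics (block counts, B-tree levels, clustering factor) are invented.

```python
from math import ceil


def full_scan_cost(table_blocks: int, multiblock_read_count: int) -> int:
    """Toy cost: number of multiblock reads needed to scan every block."""
    return ceil(table_blocks / multiblock_read_count)


def index_scan_cost(blevel: int, leaf_blocks: int, clustering_factor: int,
                    selectivity: float) -> int:
    """Toy cost: walk the B-tree, scan matching leaf blocks, fetch rows by rowid."""
    return (blevel
            + ceil(leaf_blocks * selectivity)
            + ceil(clustering_factor * selectivity))


def join_cost(outer_access: int, inner_access: int, join_operation: int) -> int:
    """Access costs of both row sets plus the cost of the join operation itself."""
    return outer_access + inner_access + join_operation


def cheaper_access_path(selectivity: float) -> str:
    # Hypothetical statistics for a table of 1,000 blocks.
    full = full_scan_cost(table_blocks=1000, multiblock_read_count=8)
    index = index_scan_cost(blevel=2, leaf_blocks=200, clustering_factor=5000,
                            selectivity=selectivity)
    return "index scan" if index < full else "full table scan"


few_match = cheaper_access_path(0.01)   # e.g. few employees are managers
most_match = cheaper_access_path(0.80)  # e.g. 80% of employees are managers
```

With these made-up numbers, a predicate that matches 1% of the rows favors the index, while one that matches 80% favors the full table scan, which mirrors the managers example earlier in this module.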

Plan Generator

The plan generator explores various plans for a query block by trying
out different access paths, join methods, and join orders. Many plans
are possible because of the various combinations that the database can
use to produce the same result. The optimizer picks the plan with the
lowest cost.
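A minimal sketch of this search for a two-table join is shown below. It enumerates every join order and join method exhaustively and keeps the cheapest. The cost formulas and the departments row count are invented for illustration, and a real plan generator prunes the search rather than trying everything.

```python
from itertools import permutations, product

TABLE_ROWS = {"departments": 27, "employees": 107}  # row counts assumed for the sketch
JOIN_METHODS = ("NL", "SM", "HA")  # nested loops, sort merge, hash join


def toy_join_cost(outer: str, inner: str, method: str) -> float:
    """Made-up cost formulas, only to give each candidate plan a number."""
    o, i = TABLE_ROWS[outer], TABLE_ROWS[inner]
    if method == "NL":    # probe the inner table once per outer row
        return o + o * i * 0.1
    if method == "SM":    # sort both inputs, then merge
        return 2.0 * (o + i)
    return 1.5 * o + i    # "HA": build a hash table on the outer input, probe with the inner


def best_plan():
    """Try every join order and join method; return the cheapest plan and the count."""
    candidates = []
    for (outer, inner), method in product(permutations(TABLE_ROWS, 2), JOIN_METHODS):
        candidates.append((toy_join_cost(outer, inner, method), outer, inner, method))
    return min(candidates), len(candidates)


(best_cost, best_outer, best_inner, best_method), plan_count = best_plan()
```

Even this tiny example yields six candidates (two join orders times three join methods); with these toy formulas the cheapest is a hash join with departments as the outer table.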

Figure 3-5 shows the optimizer testing different plans for an input
query.

Figure 3-5 Plan Generator


The following snippet from an optimizer trace file shows some

computations that the optimizer performs:

The trace file shows the optimizer first trying the departments table as
the outer table in the join. The optimizer calculates the cost for three
different join methods: nested loops join (NL), sort merge (SM), and
hash join (HA). The optimizer picks the hash join as the most efficient
method:


The optimizer then tries a different join order, using employees as the
outer table. This join order costs more than the previous join order, so
it is abandoned.

The optimizer uses an internal cutoff to reduce the number of plans it

tries when finding the lowest-cost plan. The cutoff is based on the cost
of the current best plan. If the current best cost is large, then the
optimizer explores alternative plans to find a lower cost plan. If the
current best cost is small, then the optimizer ends the search swiftly
because further cost improvement is not significant.

Glossary
Estimator

- Is the component of the optimizer that determines the overall

cost of a given execution plan.

Execution Plan

- Describes a recommended method of execution for a SQL

statement.

Plan Generator

- Explores various plans for a query block by trying out different

access paths, join methods, and join orders.

Query Optimization

- Is the overall process of choosing the most efficient means of

executing a SQL statement.

Query Optimizer

- Attempts to generate the best execution plan for a SQL

statement.

Query Transformer

- Determines whether it is advantageous to rewrite the original
SQL statement into a semantically equivalent SQL statement
that can be processed more efficiently.


References
Price, J. (2008). Oracle Database 11g SQL: Master SQL and PL/SQL in
the Oracle Database. New York: McGraw-Hill.

https://docs.oracle.com/database/121/TGSQL/tgsql_optcncpt.htm#TGSQL196

http://docs.oracle.com/cd/E25178_01/server.1111/e16638/optimops.htm


MODULE OF INSTRUCTION

Database System Architecture

This module describes the key concepts and components of
client/server systems, parallel databases, and distributed databases. It
emphasizes the impact of these database system architectures
and how to best take advantage of the resources and choices available.

It is important that you have a comprehensive view of the possibilities

of client/server, parallel, and distributed computing and the advantages
and disadvantages of these architectural structures.

Learning Objectives
After studying this lesson, you should be able to:

• Concisely define key terms

• Describe the components of a client/server system, parallel
databases, and distributed databases
• Compare and contrast the different database system architectures

Client/Server Architectures
Client/server systems operate in networked environments, splitting
the processing of an application between a front-end client and a back-
end processor. Generally, the client process requires some resource,
which the server provides to the client. Clients and servers can reside
in the same computer, or they can be on different computers that are
networked together. Both clients and servers are intelligent and
programmable, so the computing power of both can be used to devise
effective and efficient applications.

Figure 4-1 Client/Server Architecture

Advanced Database Systems

4.0 Database System Architecture

Client/server architectures can be distinguished by how application

logic components are distributed across clients and servers. There are
three components of application logic (see Figure 4-2). The first is the
input/output (I/O), or presentation logic, component. This component
is responsible for formatting and presenting data on the user’s screen
or other output device and for managing user input from a keyboard or
other input device. Presentation logic often resides on the client and is
the mechanism with which the user interacts with the system.

Figure 4-2 Application Logic Components

The second component is the processing logic. This handles data

processing logic, business rules logic, and data management logic.
Data processing logic includes such activities as data validation and
identification of processing errors. Business rules that have not been
coded at the DBMS level may be coded in the processing component.
Data management logic identifies the data necessary for processing the
transaction or query. Processing logic resides on both the client and
servers. The third component is storage, the component responsible
for data storage and retrieval from the physical storage devices
associated with the application. Storage logic usually resides on the
database server, close to the physical location of the data. Activities of
a DBMS occur in the storage logic component.
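The three components can be illustrated with one function per component. This is a single-process sketch under simplifying assumptions: an in-memory SQLite database stands in for the database server, the table and its rows are made up, and in a real client/server system the functions would run on different machines.

```python
import sqlite3


def storage_logic(conn, min_salary):
    """Storage logic: retrieval runs next to the data (here, an in-memory database)."""
    cur = conn.execute(
        "SELECT last_name, salary FROM employees WHERE salary >= ? ORDER BY last_name",
        (min_salary,),
    )
    return cur.fetchall()


def processing_logic(conn, raw_input: str):
    """Processing logic: validate the input and apply rules before touching storage."""
    try:
        min_salary = float(raw_input)
    except ValueError:
        raise ValueError("salary filter must be a number") from None
    if min_salary < 0:
        raise ValueError("salary filter must not be negative")
    return storage_logic(conn, min_salary)


def presentation_logic(rows) -> str:
    """Presentation logic: format the result for the user's screen."""
    return "\n".join(f"{name:<10}{salary:>10.2f}" for name, salary in rows)


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (last_name TEXT, salary REAL);
    INSERT INTO employees VALUES ('King', 24000), ('Fay', 6000), ('Abel', 11000);
""")
rows = processing_logic(conn, "10000")
screen = presentation_logic(rows)
```

Keeping the three concerns in separate functions is what makes it possible to place them on different tiers later without rewriting them.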

Client/server architectures are normally categorized into three types:
two-, three-, or n-tier architectures, depending on the placement of the
three types of application logic. There is no one optimal client/server
architecture that is the best solution for all business problems. Rather,
the flexibility inherent in client/server architectures offers
organizations the possibility of tailoring their configurations to fit their
particular processing needs.

Application partitioning helps in this tailoring. It gives developers

the opportunity to write application code that they can later place
either on a client workstation or on a server, depending on which
location will give the best performance. It is not necessary to include
the code that will place the process being partitioned or to write the
code that will establish the connections to the process. Those activities
are handled by application partitioning tools.

Databases in Two-tier Architecture

In a two-tier architecture, a client workstation is responsible for

managing the user interface, including presentation logic, data
processing logic, and business rules logic, and a database server is
responsible for database storage, access, and processing. Figure 4-3
shows a typical database server architecture. With the DBMS placed
on the database server, LAN traffic is reduced because only those
records that match the requested criteria are transmitted to the client
station, rather than entire data files. Some people refer to the central
DBMS functions as the back-end functions, whereas they call the
application programs on the client PCs front-end programs.

Figure 4-3 Two-tier Architecture


With this architecture, only the database server requires processing

power adequate to handle the database, and the database is stored on
the server, not on the clients. Therefore, the database server can be
tuned to optimize database-processing performance. Because fewer
data are sent across the LAN, the communication load is reduced.
User authorization, integrity checking, data dictionary maintenance,
and query and update processing are all performed at one location, on
the database server.
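
To illustrate why placing the DBMS on the database server reduces LAN traffic, the minimal sketch below uses Python's built-in sqlite3 module as a stand-in for the database server. The employees table, its columns, and the sample rows are hypothetical. It contrasts a client that pulls every row and filters locally with a two-tier client that sends the selection criteria to the DBMS and receives only the matching records.

```python
import sqlite3

# In-memory database standing in for the database server (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO employees (name, dept) VALUES (?, ?)",
    [("Ana", "HR"), ("Ben", "IT"), ("Cara", "IT"), ("Dan", "Sales")],
)

# Entire-file style: every row crosses the network and the client filters.
all_rows = conn.execute("SELECT id, name, dept FROM employees ORDER BY id").fetchall()
filtered_on_client = [row for row in all_rows if row[2] == "IT"]

# Two-tier style: the DBMS applies the criteria and returns only the matches.
filtered_on_server = conn.execute(
    "SELECT id, name, dept FROM employees WHERE dept = ? ORDER BY id", ("IT",)
).fetchall()

rows_sent_entire_file = len(all_rows)
rows_sent_two_tier = len(filtered_on_server)
```

Both approaches produce the same answer, but in this toy data set the two-tier query transmits two rows instead of four.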

This architecture reduces network traffic, reduces the power required

for each client, and centralizes user authorization, integrity checking,
data dictionary maintenance, and query and update processing on the
database server. As companies have sought to gain expected benefits
from client/server projects, such as scalability, flexibility, and lowered
costs, they have had to develop new approaches to client/server
architectures.

Most two-tier applications are written in a programming language such

as Java, VB.NET, or C#.

Databases in Three-tier Architecture

In general, a three-tier architecture includes another server layer in

addition to the client and database server layers. Such configurations
are also referred to as n-tier, multitier, or enhanced client/server
architectures. The additional server in a three-tier architecture may be
used for different purposes. Often, application programs reside and are
run on the additional server, in which case it is referred to as an
application server. Or the additional server may hold a local database
while another server holds the enterprise database. Each of these
configurations is likely to be referred to as a three-tier architecture, but
the functionality of each differs, and each is appropriate for a different
situation. Advantages of the three-tier compared with the two-tier
architecture, such as increased scalability, flexibility, performance, and
reusability, have made three-layer architectures a popular choice for
Internet applications and net-centric information systems.


Figure 4-4 Three-tier Architecture

Advantages of the three-tier architecture can include scalability,

technology flexibility, lower long-term costs, better matching of
systems to business needs, improved customer service, competitive
advantage, and reduced risk. But higher short-term costs, advanced
tools and training, shortages of experienced personnel, incompatible
standards, and lack of end-user tools are some of the challenges related
to using three-tier or n-tier architectures.

The Internet Database Environment

Figure 4-5 depicts the basic environment needed to set up both intranet
and Internet database-enabled connectivity. In the box on the right-hand
side of the diagram is a depiction of an intranet. The client/server
nature of the architecture is evident from the labeling. The network
that connects the client workstations, Web server, and database server
follows TCP/IP protocols. Multitier intranet structures are also used,
but in the basic flow a request from a client browser is sent through the
network to the Web server, which stores pages scripted in HTML to be
returned and displayed through the client browser. If the request
requires that data be obtained from the database, the Web server
constructs a query and sends it to the database server, which runs the
query against the database and returns the result set. Similarly, data
entered at the client station can be passed through and stored in the
database by sending it to the Web server, which passes it on to the
database server, which commits the data to the database.
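
The request flow described above can be sketched in a few lines. This is a minimal illustration, assuming an in-memory sqlite3 database in place of the database server and a plain function in place of the Web server; the products table, the request dictionary format, and the function names are all hypothetical.

```python
import html
import sqlite3


def make_database_server():
    """Create an in-memory database standing in for the database server."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO products (name) VALUES ('Widget')")
    conn.commit()
    return conn


def web_server_handle(db, request):
    """Hypothetical Web-server logic: build a query, send it on, return HTML."""
    if request["action"] == "lookup":
        # The Web server constructs a parameterized query for the database server.
        rows = db.execute(
            "SELECT name FROM products WHERE id = ?", (request["id"],)
        ).fetchall()
        items = "".join(f"<li>{html.escape(name)}</li>" for (name,) in rows)
        return f"<ul>{items}</ul>"
    if request["action"] == "add":
        # Data entered at the client is passed on and committed to the database.
        db.execute("INSERT INTO products (name) VALUES (?)", (request["name"],))
        db.commit()
        return "<p>Saved</p>"
    return "<p>Unknown request</p>"


db = make_database_server()
page = web_server_handle(db, {"action": "lookup", "id": 1})
saved = web_server_handle(db, {"action": "add", "name": "Gadget"})
page_after_add = web_server_handle(db, {"action": "lookup", "id": 2})
```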

Figure 4-5 A Database-enabled Intranet/Internet Environment

Web Application Components

Four key components must be used together to create a Web
application site:
1. A database server
This server hosts the storage logic for the application and hosts
the DBMS. The DBMS may reside either on a separate
machine or on the same machine as the Web server. Examples
are Oracle, Microsoft SQL Server, MySQL, Sybase, DB2, and
Informix.
2. A Web server
The Web server provides the basic functionality needed to
receive and respond to requests from browser clients. These
requests use HTTP or HTTPS as a protocol. Examples are
Apache and Microsoft Internet Information Server (IIS).
3. An application server
This software provides the building blocks for creating
dynamic Web sites and Web-based applications. Examples
include the .NET Framework from Microsoft; Java Platform,
Enterprise Edition (Java EE); and ColdFusion.
4. A Web browser
Microsoft’s Internet Explorer, Mozilla’s Firefox, Apple’s
Safari, Google’s Chrome, and Opera are examples.


Parallel Databases
A parallel database is a DBMS running across multiple processors and
disks, designed to execute operations in parallel whenever possible to
improve performance. Parallel DBMSs link multiple smaller
machines to achieve the same throughput as a single larger machine, with
greater scalability and reliability.

Two main performance measures of parallel database systems:

• Throughput – is the number of tasks that can be completed in a
given time interval.
• Response time – is the amount of time it takes to complete a
single task from the time it is submitted.
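
As a small worked example of the two measures, the sketch below computes them from a list of task timings. The timings (in seconds) and the function names are made up for illustration.

```python
def throughput(tasks_completed, interval_seconds):
    """Tasks completed per second over the observed interval."""
    return tasks_completed / interval_seconds


def response_time(submitted_at, completed_at):
    """Elapsed time from submission of a single task to its completion."""
    return completed_at - submitted_at


# Hypothetical (submitted_at, completed_at) pairs observed over a 4-second window.
tasks = [(0.0, 2.0), (1.0, 2.5), (1.5, 4.0), (3.0, 4.0)]
interval = 4.0

tasks_per_second = throughput(len(tasks), interval)
response_times = [response_time(start, end) for start, end in tasks]
average_response_time = sum(response_times) / len(response_times)
```

Here four tasks finish in four seconds, so throughput is one task per second, while individual response times range from 1.0 to 2.5 seconds. A system can improve one measure without improving the other.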

Architecture of Parallel Databases

1. Shared Memory

Figure 4-6 Shared Memory

- Processors and disks have access to a common

memory, typically via a bus or through an
interconnection network.
- Extremely efficient communication between processors
— data in shared memory can be accessed by any
processor without having to move it using software.
- Downside – architecture is not scalable beyond 32 or 64
processors since the bus or the interconnection network
becomes a bottleneck
- Widely used for lower degrees of parallelism

2. Shared Disk

- All processors can directly access all disks via an

interconnection network, but the processors have private
memories.


• The memory bus is not a bottleneck

• Architecture provides a degree of fault-
tolerance — if a processor fails, the other
processors can take over its tasks since the
database is resident on disks that are accessible
from all processors.

Figure 4-7 Shared Disk

- Downside: bottleneck now occurs at

interconnection to the disk subsystem.
- Shared-disk systems can scale to a somewhat
larger number of processors, but communication
between processors is slower.

3. Shared Nothing

Figure 4-8 Shared Nothing

- Each node consists of a processor, memory, and one or more disks.
Processors at one node communicate with another processor at
another node using an interconnection network. A node
functions as the server for the data on the disk or disks the node
owns.
- Data accessed from local disks (and local memory accesses)
do not pass through interconnection network, thereby
minimizing the interference of resource sharing.
- Shared-nothing multiprocessors can be scaled up to thousands
of processors without interference.
- Main drawback: cost of communication and non-local disk
access; sending data involves software interaction at both ends.


4. Hierarchical

Figure 4-9 Hierarchical

- Combines characteristics of shared-memory, shared-disk, and

shared-nothing architectures.
- Top level is a shared-nothing architecture – nodes connected
by an interconnection network, and do not share disks or
memory with each other.
- Each node of the system could be a shared-memory system
with a few processors.
- Alternatively, each node could be a shared-disk system, and
each of the systems sharing a set of disks could be a shared-
memory system.
- The complexity of programming such systems can be reduced by
distributed virtual-memory architectures
• Also called non-uniform memory architecture (NUMA)

Distributed Databases
A distributed database is a single logical database that is spread
physically across computers in multiple locations that are connected
by a data communications network. A distributed database is truly a
database, not a loose collection of files. The distributed database is still
centrally administered as a corporate resource while providing local
flexibility and customization. The network must allow the users to
share the data; thus a user (or program) at location A must be able to
access (and perhaps update) data at location B. The sites of a
distributed system may be spread over a large area (e.g., the United
States or the world) or over a small area (e.g., a building or campus).
The computers may range from PCs to large-scale servers or even
supercomputers.


Various business conditions encourage the use of distributed

databases:
• distribution and autonomy of business units
• data sharing
• data communications costs and reliability
• environments with multiple applications and vendors
• database recovery
• satisfying both transaction and analytical processing

Distributed Database Environments

1. Homogeneous
• Data are distributed across all nodes
• The same DBMS is used at each location
• All data are managed by the distributed DBMS
• All users access the database through one global
schema or database definition
• The global schema is simply the union of all local
database schemas
• Goal: provide a view of a single database, hiding details
of distribution

Figure 4-10 Homogeneous Distributed Database Environment

2. Heterogeneous
• Data are distributed across all nodes
• Different DBMSs may be used at each node
• Some users require only local access to databases,
which can be accomplished by using only the local
DBMS and schema
• A global schema exists, which allows local users to
access remote data
• Goal: integrate existing databases to provide useful
functionality

Figure 4-11 Heterogeneous Distributed Database Environment

There are numerous advantages to distributed databases:

• increased reliability and availability of data
• local control by users over their data
• modular (or incremental) growth
• reduced communications costs
• faster response to requests for data

There are also several costs and disadvantages of distributed

databases:
• software is more costly and complex
• processing overhead often increases
• maintaining data integrity is often more difficult
• if data are not distributed properly, response to requests for
data may be very slow.

There are several options for distributing data in a network:

• Data replication – a separate copy of the database (or part of
the database) is stored at each of two or more sites


• Horizontal partitioning – some of the rows of a relation are

placed at one site, and other rows are placed in a relation at
another site (or several sites)
• Vertical partitioning – distributes the columns of a relation
among different sites
• Combinations of these approaches
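
The distribution options above can be sketched with a relation held as a list of dictionaries. This is only an illustration: the customers relation, its columns, and the site names are hypothetical, and a real distributed DBMS would perform these steps internally.

```python
# Hypothetical relation: one dictionary per row.
customers = [
    {"id": 1, "name": "Ana", "region": "North", "balance": 120.0},
    {"id": 2, "name": "Ben", "region": "South", "balance": 75.5},
    {"id": 3, "name": "Cara", "region": "North", "balance": 10.0},
]


def replicate(rows, sites):
    """Data replication: store a separate copy of the rows at each site."""
    return {site: [dict(row) for row in rows] for site in sites}


def horizontal_partition(rows, predicate):
    """Horizontal partitioning: keep only the rows that belong at one site."""
    return [dict(row) for row in rows if predicate(row)]


def vertical_partition(rows, key, columns):
    """Vertical partitioning: keep the key plus a subset of the columns."""
    return [{key: row[key], **{col: row[col] for col in columns}} for row in rows]


copies = replicate(customers, ["site_a", "site_b"])

north_site = horizontal_partition(customers, lambda row: row["region"] == "North")
south_site = horizontal_partition(customers, lambda row: row["region"] == "South")

profile_site = vertical_partition(customers, "id", ["name", "region"])
billing_site = vertical_partition(customers, "id", ["balance"])

# Joining the vertical fragments on the shared key rebuilds the original relation.
billing_by_id = {row["id"]: row for row in billing_site}
rebuilt = [{**profile, **billing_by_id[profile["id"]]} for profile in profile_site]
```

Note that each vertical fragment repeats the key column, which is what makes it possible to join the fragments back together.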

Distributed DBMS

Figure 4-12 shows one popular architecture for a computer system

with a distributed DBMS capability. Each site has a local DBMS that
manages the database stored at that site. Also, each site has a copy of
the distributed DBMS and the associated distributed data
dictionary/directory (DD/D). The distributed DD/D contains the
location of all data in the network, as well as data definitions. Requests
for data by users or application programs are first processed by the
distributed DBMS, which determines whether the transaction is local
or global. A local transaction is one in which the required data are
stored entirely at the local site. A global transaction requires
reference to data at one or more nonlocal sites to satisfy the request.
For local transactions, the distributed DBMS passes the request to the
local DBMS; for global transactions, the distributed DBMS routes the
request to other sites as necessary. The distributed DBMSs at the
participating sites exchange messages as needed to coordinate the
processing of the transaction until it is completed (or aborted, if
necessary).
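
The local-versus-global decision can be sketched as a lookup in the distributed data dictionary/directory. The directory contents, table names, and site names below are hypothetical, and the sketch covers only the routing decision, not the message exchange that follows it.

```python
# Hypothetical distributed DD/D: maps each table to the site that stores it.
DD_D = {"customers": "site_a", "orders": "site_a", "inventory": "site_b"}


def classify_transaction(local_site, tables_needed, directory=DD_D):
    """Return ("local" or "global", sorted list of remote sites involved)."""
    sites = {directory[table] for table in tables_needed}
    # Local: every required table is stored entirely at the local site.
    kind = "local" if sites <= {local_site} else "global"
    remote_sites = sorted(sites - {local_site})
    return kind, remote_sites


local_case = classify_transaction("site_a", ["customers", "orders"])
global_case = classify_transaction("site_a", ["orders", "inventory"])
```

In the first call the request would be passed to the local DBMS. In the second, the distributed DBMS would also route part of the request to site_b.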

Figure 4-12 Distributed DBMS Architecture


Four key objectives of a distributed DBMS:

1. Location transparency
- A design goal for a distributed database, which says that a
user (user program) using data need not know the location
of the data.
2. Replication transparency
- Also called fragmentation transparency
- A design goal for a distributed database, which says that
although a given data item may be replicated at several
nodes in a network, a developer or user may treat the data
item as if it were a single item at a single node.
3. Failure transparency
- A design goal for a distributed database, which guarantees
that either all the actions of each transaction are committed
or else none of them is committed.
4. Concurrency transparency
- A design goal for a distributed database, with the property
that although a distributed system runs many transactions,
it appears that a given transaction is the only activity in
the system. Thus, when several transactions are processed
concurrently, the results must be the same as if each
transaction were processed in serial order.
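
The all-or-nothing guarantee of failure transparency can be illustrated with a simplified prepare-then-commit exchange between sites. The text above does not name a specific protocol, so treat this as an assumption: the sketch is loosely modeled on two-phase commit, and the Site class and its methods are hypothetical. It ignores crashes, timeouts, and logging.

```python
class Site:
    """Hypothetical participant site holding committed changes."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.pending = None
        self.committed = []

    def prepare(self, change):
        """Phase 1: stage the change and vote on whether it can be committed."""
        if self.can_commit:
            self.pending = change
        return self.can_commit

    def commit(self):
        """Phase 2 (success): make the staged change permanent."""
        self.committed.append(self.pending)
        self.pending = None

    def abort(self):
        """Phase 2 (failure): discard the staged change."""
        self.pending = None


def run_global_transaction(sites, change):
    """Commit the change at every site, or at none of them."""
    votes = [site.prepare(change) for site in sites]
    if all(votes):
        for site in sites:
            site.commit()
        return "committed"
    for site in sites:
        site.abort()
    return "aborted"


healthy = [Site("site_a"), Site("site_b")]
outcome_ok = run_global_transaction(healthy, "debit 100")

mixed = [Site("site_a"), Site("site_b", can_commit=False)]
outcome_failed = run_global_transaction(mixed, "debit 100")
```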

Glossary
Application Partitioning

- The process of assigning portions of application code to client

or server partitions after it is written to achieve better
performance and interoperability.

Client/Server System

- A networked computing model that distributes processes

between clients and servers, which supply the requested
services.

Database Server

- A computer that is responsible for database storage, access, and

processing in a client/server environment.


Distributed Database

- A single logical database that is spread physically across

computers in multiple locations that are connected by a data
communication link.

Parallel Database Systems

- Consist of multiple processors and multiple disks connected by

a fast interconnection network.

Three-tier Architecture

- A client/server configuration that includes three layers: a client

layer and two server layers.

References
Hoffer, J. A., Ramesh, V., Topi, H. (2013). Modern Database
Management 11th Ed., New Jersey: Prentice Hall.

Silberschatz, A., Korth, H. F., Sudarshan, S. (2011). Database System
Concepts 6th Ed., New York: McGraw-Hill.

Advanced Database Systems
1
Data Warehousing

Data Warehousing

This module provides an understanding of the basic concepts of data

warehousing. A data warehouse is constructed by integrating data from
multiple heterogeneous sources. It supports analytical reporting, structured
and/or ad hoc queries and decision making.
You should have an understanding of basic database concepts such as
schema, ER model, structured query language and the like before you
proceed with this lesson.
After studying this lesson, you should be able to:
1. Concisely define key terms.
2. Differentiate between data warehouse (OLAP) and operational database
(OLTP).
3. List the different types of data warehouse.
4. Explain the functions of data warehouse tools and utilities.
5. Discuss the data warehouse processes.

Data Warehouse Overview

Bill Inmon first coined the term “Data Warehouse” in 1990. According to
Inmon, a data warehouse is a subject-oriented, integrated, time-variant, and
non-volatile collection of data. This data helps analysts make informed
decisions in an organization.
Data warehousing is the process of constructing and using a data warehouse.
A data warehouse is constructed by integrating data from multiple
heterogeneous sources, and it supports analytical reporting, structured and/or
ad hoc queries, and decision making. Data warehousing involves data
cleaning, data integration, and data consolidation.
An operational database undergoes frequent changes on a daily basis on
account of the transactions that take place. Suppose a business executive
wants to analyze earlier feedback on a product, a supplier, or a consumer.
If only the operational database is available, the executive will have no
data to analyze, because the earlier data has been overwritten by later
transactions.
A data warehouse provides generalized and consolidated data in a
multidimensional view. Along with this generalized and consolidated view of
data, a data warehouse also provides Online Analytical Processing
(OLAP) tools. These tools support interactive and effective analysis of data
in a multidimensional space. This analysis results in data generalization and
data mining.

Data mining functions such as association, clustering, classification, and
prediction can be integrated with OLAP operations to enhance the interactive
mining of knowledge at multiple levels of abstraction. That is why the data
warehouse has become an important platform for data analysis and
online analytical processing.

Understanding a Data Warehouse:

• A data warehouse is a database that is kept separate from the
organization's operational database.
• There is no frequent updating done in a data warehouse.
• It possesses consolidated historical data, which helps the organization
to analyze its business.
• A data warehouse helps executives to organize, understand, and use
their data to make strategic decisions.
• Data warehouse systems help integrate a diversity of
application systems.
• A data warehouse system helps in consolidated historical data
analysis.

Why Data Warehouse is Separated from Operational Databases

A data warehouse is kept separate from operational databases for the
following reasons:
• An operational database is constructed for well-known tasks and
workloads such as searching particular records, indexing, etc. In
contrast, data warehouse queries are often complex and they present
a general form of data.
• Operational databases support concurrent processing of multiple
transactions. Concurrency control and recovery mechanisms are
required for operational databases to ensure robustness and
consistency of the database.
• An operational database query allows reading and modifying
operations, while an OLAP query needs only read-only access to
stored data.
• An operational database maintains current data. On the other hand, a
data warehouse maintains historical data.

Data Warehouse Features

The key features of a data warehouse are discussed below:
• Subject Oriented - A data warehouse is subject oriented because it
provides information around a subject rather than the organization's
ongoing operations. These subjects can be products, customers,
suppliers, sales, revenue, etc. A data warehouse does not focus on
ongoing operations; rather, it focuses on modeling and analysis of data
for decision making.
• Integrated - A data warehouse is constructed by integrating data
from heterogeneous sources such as relational databases, flat files,
etc. This integration enhances the effective analysis of data.
• Time Variant - The data collected in a data warehouse is identified
with a particular time period. The data in a data warehouse provides
information from the historical point of view.
• Non-volatile - Non-volatile means the previous data is not erased
when new data is added to it. A data warehouse is kept separate from
the operational database, and therefore frequent changes in the
operational database are not reflected in the data warehouse.
Note: A data warehouse does not require transaction processing, recovery,
and concurrency controls, because it is stored physically separate from
the operational database.

Data Warehouse Applications

As discussed before, a data warehouse helps business executives to organize,
analyze, and use their data for decision making. A data warehouse serves as an
integral part of a plan-execute-assess "closed-loop" feedback system for
enterprise management. Data warehouses are widely used in the following
fields:
• Financial services
• Banking services
• Consumer goods
• Retail sectors
• Controlled manufacturing

Types of Data Warehouse

Information processing, analytical processing, and data mining are the three
types of data warehouse applications that are discussed below:
• Information Processing - A data warehouse allows the data stored in
it to be processed. The data can be processed by means of querying,
basic statistical analysis, and reporting using crosstabs, tables, charts, or
graphs.
• Analytical Processing - A data warehouse supports analytical
processing of the information stored in it. The data can be analyzed by
means of basic OLAP operations, including slice-and-dice, drill down,
drill up, and pivoting.
• Data Mining - Data mining supports knowledge discovery by finding
hidden patterns and associations, constructing analytical models, and
performing classification and prediction. These mining results can be
presented using visualization tools.
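
The basic OLAP operations named above can be sketched on a tiny fact table. This is a minimal illustration in plain Python rather than an OLAP tool; the sales rows, the dimension names, and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical fact rows: three dimensions (year, quarter, product), one measure.
sales = [
    {"year": 2023, "quarter": "Q1", "product": "A", "amount": 100},
    {"year": 2023, "quarter": "Q2", "product": "A", "amount": 150},
    {"year": 2023, "quarter": "Q1", "product": "B", "amount": 80},
    {"year": 2024, "quarter": "Q1", "product": "A", "amount": 120},
]


def slice_facts(facts, dimension, value):
    """Slice: fix one dimension to a single value."""
    return [fact for fact in facts if fact[dimension] == value]


def aggregate(facts, dimensions):
    """Total the measure at the level of detail given by `dimensions`."""
    totals = defaultdict(int)
    for fact in facts:
        totals[tuple(fact[dim] for dim in dimensions)] += fact["amount"]
    return dict(totals)


def pivot(facts, row_dim, col_dim):
    """Pivot: rotate the data into a row-by-column cross-tabulation."""
    table = defaultdict(dict)
    for fact in facts:
        row = table[fact[row_dim]]
        row[fact[col_dim]] = row.get(fact[col_dim], 0) + fact["amount"]
    return dict(table)


only_2023 = slice_facts(sales, "year", 2023)
by_quarter = aggregate(sales, ["year", "quarter"])  # drill down: finer detail
by_year = aggregate(sales, ["year"])                # drill up: coarser detail
product_by_year = pivot(sales, "product", "year")
```

Drill down and drill up are the same aggregation applied at different levels of detail: adding the quarter dimension drills down, and removing it drills up.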

Figure 2 Comparison Between OLAP and OLTP

There are decision support technologies that help utilize the data available in
a data warehouse. These technologies help executives to use the warehouse
quickly and effectively. They can gather data, analyze it, and make decisions
based on the information present in the warehouse. The information
gathered in a warehouse can be used in any of the following domains:
• Tuning Production Strategies - Product strategies can be tuned by
repositioning products and managing product portfolios based on
quarterly or yearly sales comparisons.
• Customer Analysis - Customer analysis is done by analyzing the
customer's buying preferences, buying time, budget cycles, etc.
• Operations Analysis - Data warehousing also helps in customer
relationship management and in making environmental corrections.
The information also allows us to analyze business operations.

Functions of Data Warehouse Tools and Utilities

The following are the functions of data warehouse tools and utilities:
• Data Extraction - Involves gathering data from multiple
heterogeneous sources.
• Data Cleaning - Involves finding and correcting the errors in data.
• Data Transformation - Involves converting the data from legacy
format to warehouse format.
• Data Loading - Involves sorting, summarizing, consolidating,
checking integrity, and building indices and partitions.
• Refreshing - Involves propagating updates from the data sources to
the warehouse.
Note: Data cleaning and data transformation are important steps in
improving the quality of data and data mining results.
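
The functions listed above can be sketched as a small pipeline. This is a simplified illustration under stated assumptions: the two source lists, the field names, and the cleaning rule (drop rows with a blank name or missing amount) are hypothetical, and real warehouse tools do far more at each step.

```python
def extract(sources):
    """Data extraction: gather rows from several heterogeneous sources."""
    return [dict(row) for source in sources for row in source]


def clean(rows):
    """Data cleaning: drop rows with a blank name or a missing amount."""
    cleaned = []
    for row in rows:
        name = (row.get("name") or "").strip()
        if not name or row.get("amount") is None:
            continue
        cleaned.append({**row, "name": name})
    return cleaned


def transform(rows):
    """Data transformation: convert source rows to one warehouse format."""
    return [
        {"customer": row["name"].title(), "amount": float(row["amount"])}
        for row in rows
    ]


def load(rows, warehouse):
    """Data loading: sort, store the detail rows, and maintain summaries."""
    for row in sorted(rows, key=lambda r: r["customer"]):
        warehouse["facts"].append(row)
        totals = warehouse["totals"]
        totals[row["customer"]] = totals.get(row["customer"], 0.0) + row["amount"]
    return warehouse


# Two hypothetical sources with inconsistent formatting and some bad rows.
crm = [{"name": " ana ", "amount": "10.5"}, {"name": "", "amount": "3"}]
billing = [
    {"name": "BEN", "amount": 20},
    {"name": "ana", "amount": None},
    {"name": "Ana", "amount": 4.5},
]

warehouse = {"facts": [], "totals": {}}
load(transform(clean(extract([crm, billing]))), warehouse)
```

Refreshing would amount to running the same pipeline again on newly arrived source rows and loading them into the existing warehouse structure.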

Metadata
Metadata is simply defined as data about data. Data that is used to
represent other data is known as metadata. For example, the index of a book
serves as metadata for the contents of the book. In other words, we can say
that metadata is the summarized data that leads us to the detailed data.
In terms of a data warehouse, we can define metadata as follows:
• Metadata is a road map to the data warehouse.
• Metadata in a data warehouse defines the warehouse objects.
• Metadata acts as a directory. This directory helps the decision support
system to locate the contents of a data warehouse.

Data Cube
A data cube helps us represent data in multiple dimensions. It is defined by
dimensions and facts. The dimensions are the entities with respect to which
an enterprise preserves the records.
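
One way to picture a data cube is as the set of totals for every combination of dimensions. The sketch below is a minimal illustration, not a production technique; the item and location dimensions, the units measure, and the function name are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def data_cube(facts, dimensions, measure):
    """Total the measure for every subset of the dimensions."""
    cube = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            totals = defaultdict(float)
            for fact in facts:
                totals[tuple(fact[dim] for dim in dims)] += fact[measure]
            cube[dims] = dict(totals)
    return cube


# Hypothetical facts with two dimensions and one numeric measure.
facts = [
    {"item": "pen", "location": "north", "units": 5},
    {"item": "pen", "location": "south", "units": 3},
    {"item": "ink", "location": "north", "units": 2},
]

cube = data_cube(facts, ("item", "location"), "units")
grand_total = cube[()][()]
units_by_item = cube[("item",)]
```

With two dimensions the cube holds four groupings: the grand total, totals by item, totals by location, and totals by item and location together.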

Data Mart
Data marts contain a subset of organization-wide data that is valuable to
specific groups of people in an organization. In other words, a data mart
contains only the data that is specific to a particular group. For example,
the marketing data mart may contain only data related to items, customers,
and sales. Data marts are confined to subjects.

Virtual Warehouse
A view over an operational data warehouse is known as a virtual warehouse.
A virtual warehouse is easy to build, but building one requires
excess capacity on operational database servers.

Data Warehouse Processes

There are four major processes that contribute to a data warehouse:
• Extracting and loading the data.
• Cleaning and transforming the data.
• Backing up and archiving the data.
• Managing queries and directing them to the appropriate data sources.

Figure 2 Process flow in data warehouse

Extract and Load Process

Data extraction takes data from the source systems. Data load takes the
extracted data and loads it into the data warehouse.
Note: Before loading the data into the data warehouse, the information
extracted from the external sources must be reconstructed.
Controlling the Process
Controlling the process involves determining when to start data extraction
and when to run the consistency checks on the data. The controlling process
ensures that the tools, the logic modules, and the programs are executed in
the correct sequence and at the correct time.
When to Initiate Extract
Data needs to be in a consistent state when it is extracted, i.e., the data
warehouse should represent a single, consistent version of the information to
the user.
For example, in a customer profiling data warehouse in telecommunication
sector, it is illogical to merge the list of customers at 8 pm on Wednesday
from a customer database with the customer subscription events up to 8 pm
on Tuesday. This would mean that we are finding the customers for whom
there are no associated subscriptions.
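
A controlling process could guard against this kind of mismatch by comparing the cutoff time of each extract before merging. The sketch below is a minimal illustration; the extract dictionaries, the as_of field, and the source names are hypothetical, and the dates are chosen only to mirror the Wednesday and Tuesday example.

```python
from datetime import datetime


def consistent_cutoff(extracts):
    """Return True only if every extract reflects the same point in time."""
    return len({extract["as_of"] for extract in extracts}) == 1


# Customer list as of Wednesday 8 pm, subscription events as of Tuesday 8 pm.
customers_extract = {"source": "customer_db", "as_of": datetime(2024, 1, 10, 20, 0)}
events_extract = {"source": "subscription_events", "as_of": datetime(2024, 1, 9, 20, 0)}

safe_to_merge = consistent_cutoff([customers_extract, events_extract])

# Re-extracting the events up to Wednesday 8 pm makes the two sources consistent.
events_refreshed = {**events_extract, "as_of": datetime(2024, 1, 10, 20, 0)}
safe_after_refresh = consistent_cutoff([customers_extract, events_refreshed])
```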

Loading the Data

After extracting the data, it is loaded into a temporary data store where it is
cleaned up and made consistent.
Note: Consistency checks are executed only when all the data sources have
been loaded into the temporary data store.
Clean and Transform Process
Once the data is extracted and loaded into the temporary data store, it is time
to perform cleaning and transforming. Here is the list of steps involved in
cleaning and transforming:
• Clean and transform the loaded data into a structure
• Partition the data
• Aggregation

Clean and Transform the Loaded Data into a Structure

Cleaning and transforming the loaded data helps speed up the queries. It can
be done by making the data consistent:
• within itself.
• with other data within the same data source.
• with the data in other source systems.
• with the existing data present in the warehouse.
Transforming involves converting the source data into a structure.
Structuring the data increases the query performance and decreases the
operational cost. The data contained in a data warehouse must be
transformed to support performance requirements and control the ongoing
operational costs.
Partition the Data
Partitioning optimizes hardware performance and simplifies the management
of the data warehouse. Here we partition each fact table into multiple separate
partitions.
Aggregation
Aggregation is required to speed up common queries. Aggregation relies on
the fact that most common queries will analyze a subset or an aggregation of
the detailed data.
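
Partitioning and aggregation can be sketched together on a small fact table. This is a simplified illustration: the fact_sales rows, the choice of month as the partitioning key, and the function names are hypothetical.

```python
from collections import defaultdict


def partition_by_month(fact_rows):
    """Split the fact table into one partition per YYYY-MM month."""
    partitions = defaultdict(list)
    for row in fact_rows:
        partitions[row["date"][:7]].append(row)
    return dict(partitions)


def build_aggregate(partitions):
    """Precompute the monthly totals that common queries ask for."""
    return {
        month: sum(row["amount"] for row in rows)
        for month, rows in partitions.items()
    }


# Hypothetical detailed fact rows with ISO-formatted dates.
fact_sales = [
    {"date": "2024-01-05", "amount": 40},
    {"date": "2024-01-20", "amount": 60},
    {"date": "2024-02-02", "amount": 25},
]

partitions = partition_by_month(fact_sales)
monthly_totals = build_aggregate(partitions)

# A common query reads the small aggregate instead of scanning the detail rows.
january_total = monthly_totals["2024-01"]
```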

Backup and Archive the Data

In order to recover the data in the event of data loss, software failure, or
hardware failure, it is necessary to keep regular backups. Archiving involves
removing the old data from the system in a format that allows it to be quickly
restored whenever required.
For example, in a retail sales analysis data warehouse, it may be required to
keep data for 3 years with the latest 6 months of data being kept online. In such
a scenario, there is often a requirement to be able to do month-on-month
comparisons for this year and last year. In this case, we require some data to
be restored from the archive.

Query Management Process

This process performs the following functions:
• manages the queries.
• helps speed up the execution time of queries.
• directs the queries to their most effective data sources.
• ensures that all the system sources are used in the most effective way.
• monitors actual query profiles.
The information generated in this process is used by the warehouse
management process to determine which aggregations to generate. This
process does not generally operate during the regular load of information
into the data warehouse.

Activities and Exercises

Answer the following questions:
1. Research on the different Data Warehousing Software (at least 5).
Compare and contrast them in terms of:
a. Vendor
b. Functionality
c. ETL process
d. Platform
e. Best feature
f. Support
2. What is big data? What is Hadoop? Why is it important? What are the
challenges of using Hadoop?

References
Hoffer, J. A., Ramesh, V., Topi, H. (2013). Modern Database Management 11th
Ed., New Jersey: Prentice Hall.
Inmon, W. H. (2005). Building the Data Warehouse 4th Ed., Indianapolis:
Wiley Publishing Inc.
Kimball, R., Ross, M. (2002). The Data Warehouse Toolkit: The Complete
Guide to Dimensional Modeling 2nd Ed., New York: John Wiley & Sons.
Silberschatz, A., Korth, H. F., Sudarshan, S. (2011). Database System Concepts
6th Ed., New York: McGraw-Hill.
https://www.tutorialspoint.com/dwh/index.htm
http://docs.oracle.com/cd/B10500_01/server.920/a96520/concept.htm
Chapter 1

Plan It Like You’re Patton: Determine

Your Battle Plan—Map Out Exactly
What You Want to Do and How You
Will Do It

I did not intend to inflict upon you Ben Franklin’s bromide about how
those who fail to plan, plan to fail, but I can’t avoid it.

So there.

Having dispensed with that obligation, let me stress that planning a
presentation does not have to be an intricate or tedious process. Follow the
steps in this chapter, and you can draw up a plan in just a few minutes.
Your blueprint can incorporate strategies that have proven effective for
centuries (I mean that literally) and can save you stress, tedium, false
starts, and embarrassment.

Almost any plan is better than no plan. Plans don’t have to be complex.

Your plan can be as simple as “beginning, middle, and end.” That was
good enough for Aristotle, and he did okay in the presentation
department, I’ve been told. The most famous orator in history, Cicero,
advocated a grand total of six parts to a presentation: an introduction, a
summary statement of the case you are trying to make, major points in
your case, refuting opposing points, crowing about how you have just
refuted those points, and a conclusion in which you show how you have
made your case.

There are many templates that have proven effective for master
presenters, and this chapter will show you how to pick one and adapt it to
your needs.

Some of the following ten steps involve structure, and others involve
planning for the physical presentation. Employ whichever tactics work
for you. Plan as far in advance as feasible, because having a general
idea of the structure and content puts your brain on autopilot during
the time leading up to the actual presentation. You’ll hear or read things
that will be perfect for your performance and robotically vacuum them
up; moments of inspiration will come to you while standing in line at the
bank or sitting in traffic.

Don’t get lost in the details, and don’t obsess about planning to the point
where you come down with a case of perfection paralysis. Just do it, and
start now.

To paraphrase Gen. George S. Patton, a good plan executed right now is

better than a perfect plan executed next week.

1. DETERMINE YOUR MAIN TAKEAWAY AND WRITE IT IN ONE

SENTENCE; IF YOU CAN’T, NO ONE WILL GET YOUR POINT

Here is my main takeaway for this chapter:

This chapter will show you how planning your presentation around some proven
steps and strategies can help you re-create the success of others and give your
presentation a compelling structure.

You now know the main point, the primary benefit I promise to impart
in this chapter. You know what’s coming, I know where I’m going, and we
are both presumably happy about it.
To continue my militaristic riff on the importance of planning, note that
when Dwight Eisenhower (who had once worked as a speechwriter himself
and was later schooled in the importance of clear and evocative communi-
cation as supreme Allied commander in the Mediterranean and European
theaters) assumed the presidency, he demanded that his speechwriters
shape his presentations around one bottom line—one message the listen-
ers would take home with them. According to James C. Humes, an Eisen-
hower scholar as well as a master presidential speechwriter, Eisenhower
Plan It Like You’re Patton 3

told writers that if they could not put the bottom-line message on the back
of a matchbook before they sat down, they were essentially wasting their
time.1
So save yourself false starts, and draft your takeaway before you begin
anything else. Then make sure your entire presentation somehow relates to
the takeaway and reinforces it.
Your takeaway can be about 20 words:

• This is a marvelous product, and we will not only make a big profit but also
change the industry.
• I am the best candidate for this office because I am beholden to no one and
have only your best interests at heart.
• We must fight as hard as we can, because the stakes are so high and the con-
sequences so dire.
• If we set aside our differences, everyone can move forward and accomplish
great things.

2. REMEMBER THAT A PRESENTATION IS A JOURNEY: PLAN WHERE
YOU WANT TO START, WHERE YOU WANT TO GO, AND WHERE YOU
WANT TO END

Every compelling story has movement within its structure. You probably
can recognize these common structures from films and mythology:

• Boy meets girl, boy loses girl, boy retrieves girl.

• Mythic hero has a call to adventure, is convinced to step up to the plate by
a mentor, begins a quest, overcomes enemies, endures an ordeal, narrowly
escapes death, and returns to ordinary life a victor and a wiser person.

And every compelling presentation has a structure too. Here are a couple
of examples:

• Introduce your main argument, state your case, outline your main points,
prove your case, counter conflicting arguments, and conclude by showing
how you have made your case.
• Start with a story illustrating a problem, leave the audience in suspense as to
the resolution of the problem, describe possible solutions, refute objections,
funnel your audience toward agreeing with your proposed solution, and close
with the opening story—how the person you are using for illustration over-
came the problem.

None of these structures is complicated or new. The first example
above (Introduce your main argument, state your case, outline your main
points, etc.) was devised more than two thousand years ago by the Roman
orator Cicero and became the basis for Western classical rhetoric—the art
of verbal persuasion and motivation.
We’ll look at structures in detail later in this chapter and throughout
the book, but for now just keep in mind that everything has a structure:
pop music, TV sitcoms, symphonies, and speeches. You may not be able to
perceive the structure because it seems so natural, but that’s the nature of a
useful structure—it seems natural and doesn’t call attention to itself.
Your favorite song on the radio doesn’t sound to you like intro/first
verse/chorus/second verse/second chorus/eight bars of variation on the
melody/third chorus/closing chorus, does it? But that is a very common
underlying structure—so common that people in the music industry will
abbreviate it and say, “Here’s a song that’s a plain old ABABCBB.” (If you
don’t believe me, listen to a few songs on the car radio on the way home;
you’ll see how many songs fall into this pattern.)
Again, one presentation structure is not inherently better than another,
although it might be better suited to a particular application. The impor-
tant factor is to have some sort of structure designed to carry you through
from beginning to middle to end with some sort of perceptible motion and
closure.
Note that this advice applies to any sort of communication that you can
conceivably classify as a presentation: speech, training session, on-camera
response to a media inquiry, podcast, and so forth.

3. INVENTORY THE KNOWLEDGE, NEEDS, AND INTEREST LEVEL
OF YOUR AUDIENCE

First figure out where to start in terms of the complexity of the information
you’ll present. You can’t start above the heads of your audience, but you cer-
tainly don’t want to tell them what they already know. To complicate matters,
audiences will often have mixed levels of expertise and experience, so you’ll
need to acknowledge that: “Some of you certainly know this already, but
some won’t, so let me briefly review.” It’s probably best to pitch the content a
little above the heads of some while giving occasional catch-up explanations.
Next, consider what the audience needs to get out of it, and gear your
approach accordingly. Do they need an understanding of a subject that will
allow them to pass a certification test? An understanding of how the test
works, so that they can apply existing knowledge and pass it? Or do they
need to be fired up? Or entertained? Or both?
How much does the audience care about your subject? If the answer is
“not a lot,” is it worth pursuing? If it’s worth it, how can you convince them
to care?
Gathering this information isn’t that hard. Just talk with the organizers
or members of the group. Ask these questions:

• Why will the audience be here? Are they forced to come? Voluntarily coming
because they are interested?
• What do they need to know?
• Why do they need to know it? Curiosity? To make more money? Not to be at
a disadvantage when competing with others?
• How much do they know now?
• What’s the variation in the audience’s level of understanding? In other words,
will they all be novices? Experts? Half experts and half novices?

4. BE FOCUSED ON THE AUDIENCE’S NEEDS: WHAT’S IN IT
FOR THEM?

Your listeners will always be appreciative if you give them something
they want. To be fair, what they want might not always be obvious, and the
audience members might not be sure of it themselves. You need to clarify
the benefit to them, deliver, and make it clear that they will gain something
from what you offer.
For example, one of the best performances I ever saw was a presentation
on how to give a presentation. The salient point was to capture attention
with a good opening and not clutter it up by droning meaningless thank-
yous and chitchat at the beginning. The presenter was greeted with polite
applause as he took the lectern, but he cut it off and said, “I haven’t earned
that yet. But I will.”
And he went right into his speech—a crackling cascade of useful infor-
mation, beginning with the lesson to start the speech cold and not to thank
people. Save the thanks for later.
The point is that I had a general idea of what I needed to know—more
on how to polish a presentation—but not the specifics. The presenter
had a fair idea that most of the audience members were relatively expe-
rienced communicators who didn’t require elementary instruction but
needed some quick techniques to up their games. So he provided us with
a technique that works well in the hands of a reasonably competent pre-
senter, he did it right off the bat, and by doing so he captured our atten-
tion at the beginning so he could effectively lead us through the middle
and the end.
Businesses and organizations routinely do what they call “needs assess-
ments” to figure out what their customers want. It becomes a jargon-heavy
process, with “needs” sometimes defined as essential things for “well-being”
or things that will create a state of “deficiency or deprivation” if denied.2
You can assess the needs of your audience by doing surveys or, using a low-
tech method that always appealed to me, just talking to people.
Advertising agencies typically convene focus groups to talk to people
and learn about the needs of their audience. Let me tell you one story that
perfectly illustrates what you’re trying to do in a needs assessment. Back in
the 1990s, something called the California Milk Processor Advisory Board
hired a veteran advertising executive to help stanch the loss of custom-
ers for milk. The problem was that choices of beverages were multiplying
exponentially with the introduction of new soda flavors and invented bev-
erages such as sports drinks. But there is not a lot you can do with milk.
You can flavor it with chocolate or strawberry, but beyond that, it would
just get weird. Nobody’s going to drink carbonated milk, for example. Milk
isn’t much of a sports drink, because it doesn’t keep forever; it is therefore
not really portable.
So, we know that most milk consumption takes place in the home—but
what spurs people to buy it? What are their needs? A San Francisco adver-
tising agency set up focus groups (supervised conversations with groups of
typical customers) to find out.
The groups didn’t produce much that was useful until something hap-
pened by accident: It was getting late, and one participant noted that if he
didn’t leave in time to make it to the store to buy milk, there would be hell
to pay the next morning.
And there you have one of the Eureka Moments of needs assessments.
What motivates many people to want to buy milk, the conveners of the
focus group learned, is the fear of what happens if they run out. (I don’t
know about you, but when my children were little and I didn’t have milk,
they appeared ready to call a social service agency and report me.)
You know the rest of the story, because you certainly remember the
“Got Milk” commercials: entertaining mini-dramas in which a man
can’t win a telephone quiz contest because when the phone rings, he
is chewing on a peanut butter sandwich and has run out of milk with
which to wash it down. Another commercial features a snarling busi-
ness executive who is run over by a bus and winds up in what appears
to be a big kitchen in the sky, fully stocked with milk cartons and giant
chocolate-chip cookies.
“Heaven,” he muses, stuffing himself. And then, naturally, he reaches for
the milk carton. But it’s empty. And so are all the rest.
The realization is horrifying. “Where am I?” he asks, as flames begin to
engulf the “Got Milk” logo.
And there you have the central lesson in needs assessment. Deduce what
scares people, what they feel deprived without.
Figure out what causes pain—and offer a solution.
Maybe you’ll have to explain to your audience precisely why they should
feel deprived, or why they should worry, and that’s okay. But without that
hook, without that perceived need you are filling, the audience will have no
real interest in listening to you.
So, gauge as best you can what your audience needs. Talk with the event
organizer or with people who will be at the meeting. What bothers them?
What solution are they seeking, and what is the problem they want to solve?
Maybe it is something general and nonspecific: They are tense and upset,
and they need a laugh. Or it could be a precise fear: They are worried about
meeting sales quotas and need some advice on how to head off failure. Or
they are concerned about a new policy, and they need assurance that it will
be in their best interests. It could be the case that they are worried about
loss of productivity in their office, and they need the type of product you
just happen to sell. Or possibly they are concerned about giving a presenta-
tion, in which case you can advise them to buy this book.

5. RUTHLESSLY NARROW YOUR FOCUS AND THE AMOUNT
OF MATERIAL

Here are three immutable facts of life. Let’s call them Hausman’s laws:

Law 1: Everyone worries about having too little material for a presentation and
running short.
Law 2: Nobody ever has too little material and runs short.
But because people relentlessly continue to believe in Law 1, we inescap-
ably come to . . .

Law 3: A lot of people giving presentations therefore have way too much they
want to cover, they try to cram in and regurgitate too much information, and
thus they turn the occasion into a desultory data dump. And they talk too
quickly as a result of their panic to wedge everything in.

It’s reasonable that you will start with too much material to cram into a
presentation. In fact, that’s the way it should be. I won’t invent any more
laws, but if I did, the next one would be something like, “Whether in an
article, book, or presentation, you know the content is going to be good
when you have too much good information and it becomes difficult to pare
it down.” So feel free to collect a huge pile of facts and figures for your pre-
sentation, but chop down the pile relentlessly until every piece of material
sticks to the spine of your main takeaway, until they all fit into your orga-
nizational structure and are relevant to your audience’s knowledge level
and needs.

6. DECIDE ON THE FORMAT THAT WORKS FOR YOU: READING FROM
A SCRIPT, BULLET POINTS, CUE CARDS, MEMORIZATION, OR
AD-LIBBING

This decision is based partly on the purpose and venue of your presen-
tation and partly on personal preference.
Some decisions regarding format are obvious. You can’t show up for a
TV interview reading from a script, for example. Some are trickier. If you
are giving a presentation about a controversial topic, you are well advised
to read from a prepared script so that you won’t accidentally say some-
thing provocative, and if you are misquoted, you can go back to the text to
prove what you said. This isn’t timidity—it’s good judgment, and you are in
good company. Edward R. Murrow followed this route when opening his
famous 1954 television takedown of Sen. Joseph McCarthy:

Good evening. Tonight See It Now devotes its entire half hour to a report on
Senator Joseph R. McCarthy told mainly in his own words and pictures . . .
Because a report on Senator McCarthy is by definition controversial, we
want to say exactly what we mean to say, and I request your permission to
read from the script whatever remarks Murrow and Friendly may make. If
the Senator feels that we have done violence to his words or pictures and so
desires to speak, to answer himself, an opportunity will be afforded him on
this program.3

Should you be giving a presentation that will largely be judged on your
verbal artistry, you would be well advised to memorize the whole thing.
You are probably familiar with TED Talks. TED is an organization that
originally held conferences on technology, entertainment, and design
(hence the acronym) and now features standout speakers on innovative
topics. (There is a TED Talk reprinted in Chapter 11, along with some more
information about the organization.) TED Talks are popular on Internet
video and on portable audio.
TED organizers generally insist that presenters memorize their speeches
and rehearse them many times, word for word. This makes sense because
TED talks are more like classical music than improvised jazz. Viewers will
be listening for the structure and swell of the presentation, and they expect
presenters to hit a bull’s-eye every time.
But your presentation actually might be more like a jazz session,
with room for improvisation, side trips, and your personal angle on
evolving events during the session, such as audience participation. If
you’re a good ad-libber, consider a series of bullet points to play to your
strengths.
The physical form of your notes or script must also be considered. You
have many options:

• If you use PowerPoint, Prezi, Keynote, Google Slides, or similar software,
viewing your notes from the part of the window only visible to the presenter
• An outline on standard 8 1/2 × 11 paper
• Bullet points
• 3×5 index cards
• Cue cards written on poster board or other large pieces of stiff paper, held up
in the audience by a compatriot
• A prompting device, such as a TelePrompTer (TelePrompTer is a trade name
referring to a specific brand of prompting device), that electronically scrolls
through the script and usually presents it on a panel that is visible to you but
not the audience
• A script written in a font large enough for you to comfortably read at arm’s
length on a lectern. (Note that throughout this book, I’ll use “lectern” to indi-
cate the point from where you are speaking, even though it might not always
be a lectern. I tend to avoid the words “podium” and “dais” because techni-
cally they are structures you stand on.)

As with everything else in life, there are pluses and minuses to each
approach. The presenter view in slide software is convenient, in that it
comes packaged with the page you are displaying. However, this requires
you to be in a position to comfortably view a computer screen—and if
there is a problem with the slides, you have a problem with your notes, too.
The outline approach tethers you to several pieces of paper but does not do
all your thinking for you . . . you still have to ad-lib. Having said that, it does
provide you with a clear structure, making it easy to keep your place and
visualize what has been covered and what remains.
Numbers, Roman numerals, and letters can be used in an outline, as can
symbols and bullet points. I favor a very simple outlining process, with
plain text as the main thought, solid bullets for second-level thoughts, and
hollow bullets for third-level ones. This is the default modus operandi of
Microsoft Word, saving me the trouble of setting up any special formatting.
I recommend my outline method as a good general-purpose default. It
gives me clear prompts and is compact enough so that I can get a sense of
the organization of the presentation. For example, here is the opening of a
university lecture I give on freedom of the press:

Journalism’s existence based on First Amendment

What does it say? - Six clauses:

• Establishment
• Practice
• Speech
• Press
• Assembly
• Petition

However . . .

• Does it mean what it says? How can it?

• Why the exceptions? How did it get to where it is today?

Start with the Backstory . . .

• Begins with tussle between freedom and oppression, and also technology and
oppression . . .
  ◦ People not born with rights
  ◦ Magna Carta 1215
  ◦ Petition of Right, 1628, cited Magna Carta as precedent

Occasionally I have used bullet points on 3 × 5 index cards. Index cards
fit easily into a pocket and thus are inconspicuous, and the points can easily
be mixed and matched and shuffled around to customize the presentation.
Unfortunately, it’s easy for me to inadvertently shuffle the cards—generally
by dropping them—so I avoid this method.
Cue cards written on large pieces of cardboard are visible to an audience
and can appear awkward and even comical, but they remain a good low-
tech method for referring to notes when giving a presentation involving
video cameras (in other words, you are concerned about what the camera
sees, not what the audience, if any, sees). The trick is to be able to read from
them while not perceptibly diverting your eyes from the lens. To enable this,
keep the cameras and the cards as far from you, the speaker, as practicable.
The angle of the eyes is much less visible from a distance as compared to a
close-up. Another trick to maintaining the appearance of eye contact with a
camera is to mount (or have a compatriot hold) the cards above or below the
lens. Upward movement of the eyes is less perceptible than a sideways glance.
Prompting devices cure the problem of eye contact by projecting the
script onto a surface that is reflective to the reader but invisible to the audi-
ence. They are not readily available, although computer technology using
tablets and smartphones hooked up to a small mirror is making prompters
more accessible to the general public. The devices also take some practice
to use. As a general rule, I steer clear of them unless there is no reasonable
alternative. There are too many weak links in the prompting chain, includ-
ing errors on the part of the operator and technical glitches involving pro-
jection of the script and the speed at which it moves.
Typing out the script and reading it provides the presenter with a crutch
but at a price: For most people, it is difficult to read a script smoothly and
naturally. Having said that, though, there are occasions when you will
want the wording to be exact, and printing out a copy in a large enough
font to comfortably read is simple. This is a major advantage of the com-
puter; when I first began inflicting myself on audiences, the only option
for a speaker who needed large type was a special “Orator” font on a ball
that fit into an IBM Selectric typewriter. Obviously, that is no longer the
case, but less obvious is the fact that the myriad fonts available
in a word-processing program vary widely in readability.
Experiment. Find the size and font that’s most effective for you, should
you go the route of reading off a script or from bullet points on a printed
page. Resist the temptation to use a huge font visible from an orbiting sat-
ellite, because you will have to flip pages too often, and you may lose the
sense of continuity provided by being able to scan many words on a page.
Personally, I like 18-point Times Roman. Times Roman is a serif font,
meaning that the letters have little feet. I find that this aids readability,
and as I have reasonably good distance vision, I don’t need the text to be
much bigger, and I like the idea that I can see a fairly large amount of text
and notes. Your mileage may vary, so do a lot of experimentation.

7. DECIDE WHETHER TO USE MEDIA, WHAT KIND, AND WHY YOU
WANT TO USE IT

Slides and video are terrific. Sometimes. And sometimes they are just
awful—even laughable. Google comedian “Don McMillan” and “Death by
PowerPoint,” and you’ll see what I mean.
He cuts pretty close to reality in the opening of his routine when he
shows his first PowerPoint slide and reads it word for word. The slide looks
like this:

Most Common PowerPoint Mistakes

(1) People tend to put every word they are going to say on their PowerPoint
slides. Although this eliminates the need to memorize your talk, ultimately
this makes your slides crowded, wordy, and boring. You will lose your audi-
ence’s attention before you even reach the bottom of your . . .4

Don’t be that guy. Use media sparingly and only when it adds something
unique or serves some purpose better than simply talking. We’ll look at
presentation media in more detail in later chapters, but at this point, here
is what you need to know.

Use slides . . .

• For titles
• For main points, especially definition of technical terms
• For highlighting a theme or a question that will recur throughout the
presentation
• For visuals, when those visuals convey the image better than you can by
words alone

Do not use slides . . .

• To read from
• As a crutch
• As incessant decoration
• As the primary conveyor of information (If it’s essential that people know
something that is presented in a written format, give them a handout or a
Web link.)
• To echo what you are saying

Use video . . .

• When you have video of something to which you are referring (For example,
in Step 6 above, I referred to Edward R. Murrow. Had I been giving a
presentation, it certainly would have been a good spot to insert a video clip.)
• When video can provide a brief punctuation (Had I been giving a presenta-
tion on the misuse of PowerPoint, I could have effectively used a clip from
Don McMillan.)

Do not use video . . .

• As a halfway measure between a video presentation and a live presentation
(If you want to show a video, fine—show it. If you want to give a presentation
with a little bit of video serving as a condiment, that’s also fine. But half talk,
half video leaves the audience confused.)
• When there is any question as to its relevance

8. REHEARSE LIKE A PROFESSIONAL: MAKE YOURSELF BETTER,
AND DON’T PRACTICE YOUR MISTAKES

Rehearsal helps—a lot, much more than you would expect. For one
thing, you don’t want to present a rough draft to an audience, and you don’t
want to have a lapse in concentration derail you during an ill-prepared
presentation.
Some people reject the idea of rehearsing by insisting that by rehearsing,
they will wind up with a rote, canned presentation, and they would
rather wing it and give the audience a unique experience. Now, there is
something to that argument, but within very limited strictures. I have to
admit that some of the greatest performers in history shunned rehearsal.
A noted example is Jackie Gleason, who terrified cast members by his last-
minute brinksmanship. But his brilliance emanated from walking that
same tightrope—the risk gave it an edge.
But remember that Gleason had “rehearsed” his “unrehearsed” perfor-
mance thousands of times before. He spent more time on stage than most
people spend in the office, so he was not exactly inventing each perfor-
mance anew. Aside from that, he was a genius.
So until you are firmly established as a genius, careful rehearsal is
advised. Rehearse your ad-libs, too. (Don’t be naïve: Of course experienced
performers rehearse clever off-the-cuff remarks. That’s how they sound
witty and off-the-cuff. See Chapter 9, Step 5 for more.)
Some people become discouraged with the results of rehearsal because
they do it all wrong. They rush through, delivering their lines in a whis-
per directed at their chest, ignore the coordination of technology they will
be using and what they will say, and get no feedback from an observer.
Remember: Without feedback you don’t get corrected, and without correc-
tion you practice your mistakes and become better at doing it wrong.
With those points in mind, here are guidelines for rehearsing a presen-
tation so that you’ll actually get better, not worse:

• Video yourself. In the era of the smartphone and the video camera on almost
every laptop sold, there is no reason not to take advantage of this option. Yes,
it can be painful to watch, but there is simply no better way to fix weak points.
From personal experience I can tell you that even the most masterful broad-
cast announcers weren’t born with their talent. Speaking and reading with
accuracy and eloquence is an acquired skill; you acquire that skill by practic-
ing, evaluating the results, and making changes and adjustments, just like hit-
ting balls at a driving range, seeing how far they go, and adjusting your stance.
• Get feedback. Continuing the golf analogy, anyone who has taken a golf les-
son knows the value of feedback. Sometimes you literally can’t tell if your own
arm is straight, or if you are rocking too much. You may have the same blind
spot for speaking too quickly or glossing over the main points. An informed
observer can help. If possible, get someone to watch your presentation and
provide feedback. (You have nothing to lose: If the person is a jerk, at least
you’ll have practice handling a heckler.)
• Rehearse in sections. In one of my many failed careers, I was an orchestral
musician. I attended a master class with a renowned clarinetist who would
listen to the student play and then offer advice and play the selection himself.
Some of the music was very challenging, even beyond the capacity of a first-
chair musician to sight-read. So he learned it on the spot, in front of a large
group of students, and here’s how he did it: Instead of charging through the
passage, which was about a minute long, he slowly played the first phrase,
then speeded it up when he played it once more. He repeated the process
with the second phrase (maybe ten seconds of music), and so on. Then he
grafted the parts together, playing the first part and the second part, then
the first part and the second part and the third part, then the whole thing.
The point is that he polished the sections first, slowly and carefully. Then he
put them together, building up speed and expression. That’s what you want
to do when you rehearse a speech. Don’t try to run through the whole thing,
at least not at first. Start slowly, getting the details right so you
don’t practice your mistakes.
• Practice more than the words. Rehearse movements, pauses, inflections,
and integration of the technology.

9. WRITE YOUR DRAFT FOR THE EAR, NOT FOR THE EYE

Words that work well on paper do not necessarily translate gracefully
to being spoken aloud. If you are writing out your remarks word for word
for a formal speech or for a piece you are recording on audio or video, you
need to craft words that can comfortably be spoken and listened to.
This isn’t complex if you follow these rules:

• Keep sentences short. An occasional long sentence is all right (and in fact
tends to break up monotonous patterns), but err on the short side.
• Write in a conversational tone. Make sure that whatever you write sounds
like a normal speech pattern and not an academic journal or a newspaper
editorial. The easiest way to assess whether you pass this test is to read things
aloud as you write them. If a sentence sounds stilted, start over.
• Be clear. Don’t write, “As we can see from the latter example,” because the
listener has no way to replay what you have said to discern the former from
the latter.
• Watch pronouns. A listener can’t go back and figure out who the “he” refers
to if you have just mentioned several men. When in doubt, repeat a name
rather than use a pronoun.
• Use contractions. Forget the admonitions from your seventh-grade English
teacher. Contractions are natural in spoken communication. Having said
that, be careful of easily misheard words such as “can” and “can’t,” which
sound an awful lot alike when spoken. If it’s possible that a listener will be
confused, rephrase.
• Put attribution first. For example, “Our researchers say this will be the best
quarter in the past decade” is easier to say and listen to than “This will be the
best quarter in the past decade, our researchers say.”
• Make very short sentences carry the weight. This is a potent technique and
is dirt simple: When you want to make an impact and get across a key point,
make it a very short sentence. Compare this uninspired wording . . .

I disagree with our critics who say we won’t be able to adapt to the changing
technologies because our business is rooted in face-to-face sales. In fact, I would
endeavor to say that we may surpass their expectations and turn in a perfor-
mance better than anything we have done before. By integrating the newest
database and contact management technology into our . . .

with this:

Some of our critics say we won’t be able to adapt to the changing technologies
because our business is rooted in face-to-face sales.
They are wrong.
We will succeed beyond anyone’s expectations except mine.
By integrating the newest database and contact management technology into
our . . .

10. GEAR EVERYTHING TO YOUR DESIRED OUTCOME—AND
REMEMBER THAT J. P. MORGAN HIT IT ON THE HEAD WHEN HE SAID
A PERSON HAS TWO REASONS FOR DOING SOMETHING: ONE THAT
SOUNDS GOOD, AND THE REAL REASON

Remember, you need everything in your presentation to serve your purpose.
If you are giving a campaign speech, for example, you want all of the
presentation’s elements to impel people to vote for you. If you are making a
presentation to a potential customer, you want to sell your product.
At the same time, remember that there are usually many desires and
motivations at play in anyone’s decision-making process. This really isn’t a
mysterious progression, and you can probably unravel many of the strings
you can pull to subtly influence your audience. Voters, for example, might
be looking for a candidate who not only shares their fundamental views
(the ostensible reason) but also personifies the deep anger they have at a
system that many feel has not been working (the real reason). A client
naturally wants to choose a product that is of high quality (the ostensi-
ble reason) but also wants to deal with a salesperson who seems reliable,
approachable, and likely to untangle the inevitable problems that come
with the supply chain (the real reason).
Think past the immediate task. Start with your desired result, and
then keeping that in mind, reverse-engineer the process of planning your
presentation.
Chapter 3

Magnify Your Message with Credibility, Approachability, and
Listenability

We’ve covered the basics of planning, constructing, and scripting the
presentation (though there is more advanced information, including the
fine points of openings and closings, to come in later chapters).

This chapter will focus on the fundamentals of delivering a presentation
before a live audience. Using presentation technology and appearing on
media will be covered in later chapters.

Here are the steps to follow to maximize your delivery.

1. BEGIN WITH THE THREE RULES OF A COHERENT AND ENGAGING
TALK: SLOW DOWN, SLOW DOWN, AND SLOW DOWN SOME MORE

Speed kills credibility and audience engagement. Slow down!

Remember that in addition to causing incomprehensibility, “fast-talking”
is associated with dishonesty, so rattling through your presentation calls
your sincerity into question.
Too rapid a pace sometimes stems from fear that you’ll lose an audi-
ence if you don’t regurgitate information quickly enough. Realistically, that
won’t happen, and you stand a better chance of losing them if they can’t
28 Present Like a Pro

follow you. In other cases, machine-gun deliveries are symptomatic of a
nervous meltdown—an indicator that the speaker is losing control. It can
convey the image that you believe that what you have to say is unimportant
and you want to get it over with.
There is a bona fide speech disorder called “cluttering,” in which groups
of words are mashed together. If you suspect that is the case, you might
approach a speech-language pathologist (a specialist in voice and speech
with a master’s degree and a national certification).
But for most people, the cure for excessive speed is simply a matter of
reminding yourself—and forcing yourself—to slow down.
If you want a trick or device to slow you down, try these:

• Write a reminder directly into your script or notes. A visual reminder pop-
ping up from time to time is helpful. Even the best of us need a cue; President
Eisenhower was notorious for rambling on, so his handlers had a special plate
attached to his lectern that would light up and demand, “GET OFF NOW.” In
comparison, writing “SLOW DOWN” on page 3 of your script is not such an
imposition.
• Breathe more deeply and more often. Breath control is essential to magnify-
ing voice power (among the topics covered in Chapter 7), but taking regular,
deep breaths calms you and effortlessly slows your pace, because you can’t
talk and breathe at the same time.
• Insert pauses, as recommended in Chapter 2, Step 6. Pauses offer the dual
benefit of adding drama and slowing down speech.
• Time yourself reading from a script, and observe your words-per-minute
rate so you will have a consistent measure to shoot for. I like to keep my
rate for audiobooks, online narration, and most public speaking at about 150
words per minute. Your mileage may vary depending on the circumstances,
but 150 is a good starting point. Hitting it is easier than it sounds: When you
rehearse, simply set your smartphone timer for a minute, count 150 words
into your presentation, mark that spot, and read. If you finish before hitting
the marker, slow down. If you drag on much longer than the marker, chug a
couple cups of coffee and rev up the speed. Keep practicing to hit the target,
and after a few sessions, that rhythm will be imprinted in your neurons and
you will be able to summon it naturally.
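The words-per-minute check in the last bullet is simple arithmetic, and a rough Python sketch of it follows. Only the 150-words-per-minute target comes from the text; the function names and the 10-word tolerance band are illustrative assumptions.

```python
TARGET_WPM = 150  # starting-point rate suggested in the text


def words_per_minute(word_count, seconds):
    """Return the speaking rate for a timed read-through."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return word_count * 60 / seconds


def planned_minutes(word_count, wpm=TARGET_WPM):
    """Return how long a script should take at the given rate."""
    return word_count / wpm


def pace_advice(word_count, seconds, target=TARGET_WPM, tolerance=10):
    """Compare a rehearsal against the target rate.

    The tolerance band is an assumption for illustration only.
    """
    rate = words_per_minute(word_count, seconds)
    if rate > target + tolerance:
        return "slow down"
    if rate < target - tolerance:
        return "speed up"
    return "on pace"
```

For instance, reading the 150 marked words in 50 seconds works out to 180 words per minute, so `pace_advice(150, 50)` returns "slow down".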

2. ADOPT THE POWER POSE THAT RESEARCH SHOWS TO BE MOST
EFFECTIVE

You are probably familiar with the work of Amy Cuddy, who became
famous for her TED Talk on body language. If you haven’t seen it, just
google her name and the phrase “power poses.” It’s well worth the 20-minute
time investment.
In essence, Cuddy maintains that her research demonstrated that erect
poses with arms away from the body communicate a sense of power, so
much so that subjects who practiced these poses actually felt more pow-
erful. One of Cuddy’s favorites was the hands-on-hips “Wonder Woman”
pose (which, if I understand her research correctly, is of more use for
psyching yourself up than as an actual presentation posture).
In actuality, I believe that for the purposes of presentation, any posture
that involves keeping the head up, the shoulders squared, the feet firmly
planted, and the back straight will not only communicate an aura of power
but also help you maintain energy and project your voice.
It’s really about conveying that you are confident and in control. Any pos-
ture that sends that message will serve its purpose, and you don’t have to
stand like the Statue of Liberty to achieve it. You may remember William F.
Buckley, one of the great speakers of all time; he would slouch when stand-
ing, and sometimes when hosting his television show, he would recline
sideways in his chair to the point where he appeared ready to slide off. But
he was deliberate in these postures, not fidgeting or trying to find a place
to put his hands or hunching up his shoulders.
So concentrate on posture that communicates ease and command. Gen-
erally, head up, shoulders back, and back straight is as good as anything
else. Remember not to make continual fidgety movements, as that conveys
a lack of confidence and is visually distracting. Above all, do not lean into
the microphone in what I call the “nibbling bird” posture. Adjust the stand
so you can stand erect and talk comfortably. With most microphones, you
do not have to have your mouth extremely close to have it work properly.
And if proximity to the microphone is essential, you are better off holding
the microphone in your hand if you can.

3. MOVE WITH A PURPOSE: USE POWERFUL GESTURES AND WORK
THE ROOM GRACEFULLY

Appropriate and expressive gestures enhance engagement, but repetitive
gestures are distracting, fidgety gestures indicate a lack of self-confidence,
and exaggerated gestures make the speaker appear out of control.
Here are five guidelines for using gestures to your advantage:

• Remember to keep scale appropriate. If you are giving a speech in a hall
with two balconies, grand gestures are okay. But in a small room or—worse—
on television, expansive movement appears manic. Have a colleague stand
where you will be stationed, while you scope out the venue. You’ll get an
idea how much movement is needed. If you’re performing on video, find out
what the standard shots will be and adjust your movements accordingly. In
general, less—much less—is more when it comes to televised gestures.
• Be careful of mannerisms. I used to have a professor who would punctuate
each and every sentence with a palsied karate chop. We would imitate him in
class, in the hall, and at frat parties. Don’t be that guy. A distinctive gesture is
fine—in fact, you might want to develop one as a trademark—but when it is
overused, the gesture becomes distracting at best and fodder for ridicule at
worst.
• Along the same lines, make sure gestures have purposes. If you want to
stab a finger at the audience to make an appropriate point (“and you have
been cheated time and time again by this insane policy . . .”) that’s fine. But
if the gesture habitually persists through a joke or a tribute to a fallen friend,
you are sending mixed and confusing signals. This might actually be a bigger
problem than you suspect, because sometimes people subconsciously read
things into inappropriate gestures, and they may not even be aware of their
reaction; all they know is that something is wrong. I once knew someone
who kept his fists clenched during even the lightest parts of his presentations.
I don’t know if the mismatched gesture was symptomatic of anything, and it
took me a while to figure out what the dissonance in his appearance was, but
I believe his habit made people uneasy even if they could not put their fingers
on what was bothering them.
• Maintain an arsenal of functional and appropriate gestures. Two that work
well are (1) hands in front of body and spread, palms out, when address-
ing the audience and (2) palms turned in when talking about yourself. I
like listing points from time to time and using fingers to count them out.
(Just don’t use this technique for 11 or more points.) Some speaking coaches
advise against using clenched fists or pointed fingers, but if those gestures are
used with sincerity and not in a hostile way, they can be very effective. An
arm outstretched to the audience in a gesture that looks like you are inviting
someone to dance is an excellent device to implore listeners to join you in a
belief or idea.
• Full-body gestures, when appropriate for the venue, work well. For exam-
ple, if you want to make a final appeal to the audience, get out from behind
the lectern, move to the edge of the stage, and use the come-and-dance ges-
ture with one arm while holding the microphone in the other hand. If you are
wearing a microphone and are not tethered to one location, moving around
the stage or platform is an excellent option, as long as you don’t pace mechan-
ically. If you are comfortable doing so, opt for a location that does not plant
you behind a lectern; it’s just one more barrier between you and the audi-
ence. If you have notes but no formal lectern, you can put them on a music
stand and refer to them occasionally. I believe that one of the most physically
inviting setups for a speaker is a simple stool, a music stand, and a handheld
mic. It communicates to the audience that you are not afraid of them and not
desirous of a barrier, and it allows you the freedom to sit and then stand when
you want to punctuate an idea.

4. MAINTAIN EYE CONTACT THROUGH PLANNED FOCUS POINTS

If an audience is large, obviously you cannot maintain eye contact with
every member. However, you can give the impression of engaging in
individual eye contact with these mechanisms:

• Pick several locations where you will routinely focus on an audience
member—direct front, left rear, right front, right rear center, etc. The point
is that you don’t want to forget about a section of the audience, something
that’s easy to do in the heat of battle. Practice your sectors in advance until
it becomes routine but not too routine. You don’t want your movements to
become predictable (a pattern that presentation coach Olivia Mitchell char-
acterizes as acting like a “tennis umpire” or a “lighthouse”)1. In sum, don’t
overcomplicate this; just make sure you regularly maintain eye contact with
people in different sections of the audience.
• Pick out one person in each sector with whom to make eye contact dur-
ing your talk. Hold eye contact until it is reciprocated. You might even get a
nod. Then move on. Be sure not to break eye contact in mid-sentence. You’d be
surprised how effectively this works. A few years ago a group of friends and I
attended the Broadway play Barrymore, starring Christopher Plummer. The play
is a monologue with much of the dialogue addressed to the audience. I noted
afterward how Plummer had, I believed, held eye contact with me for several
seconds. My friends, who sat in different sections of the audience, said the same
thing. Everybody thought Plummer was, at one point, looking right at them.
• Having said the above, remember that in smaller groups, some people
may be uncomfortable with eye contact. You can tell if that’s the case. Just
focus on another audience member.
• Also remember that the point of eye contact is to establish a relationship
between the presenter and the audience. We are obviously talking about
a generalized relationship here, and the audience’s view of that relationship
is more an overall impression than a tally of how many times you looked at
individuals. You can maintain an overall atmosphere of contact by looking at
the audience instead of at your slides, and by looking up from your notes as
much as possible.
• In relation to the latter suggestion about notes, be sure to finish a sentence
while you are looking up. Only then should you glance downward to your
notes.

5. USE PROVEN STRATEGIES TO INVITE AUDIENCE INVOLVEMENT

Audience involvement enhances listenability and your appeal as a
speaker, but it is a double-edged sword. If you encourage involvement, you
often wind up with a more engaged and entertained audience. However—
and this is a big, mighty scary “however”—you run the risk of encouraging
the subspecies of audience members who are attention junkies and want to
take over the presentation. (I’ll show you how to deal with them in Chapter
4, Step 6.)
Having served up that disclaimer, let me note that experience, research,
and common sense demonstrate that audiences retain more and pay atten-
tion when they are involved in some fashion.
The most basic tool for encouraging participation is simply asking
questions. There are several ways to ask a question, and all carry specific
benefits and risks.
You can ask a question of the entire group and hope someone responds.
The upside is that if you get an answer, it is likely to be responsive rather
than reflexive. The downside is that if no one responds, you look a little
silly, and if a boorish attention junkie responds (sometimes repeatedly),
you may have to deal with disruption. One way around this is to ask for a
show of hands (“How many think this approach might work? Please raise
your hand”) and then call on one of the hand-raisers who looks like he or
she might have a lively and intelligent addition to the conversation.
Alternately, you can single out an individual. This can backfire if the
target is unresponsive or takes the question as an affront. However, if you
are in a position of authority over the group—say, delivering a mandatory
training—this technique can be a powerful motivation for audience mem-
bers to pay attention, because they know they could be next on the hot seat.
I can’t prove this, but I feel that subconsciously, many people like being put
on the spot in a competitive environment and take some satisfaction in
being held to task. So if you want to channel your inner Professor Kings-
field from The Paper Chase, give it a try if you believe your personality and
the situation lend themselves to the approach.
My favored no-risk mechanism is to frame the inquiry as a rhetorical
question and then call on people who respond or look as though they are
going to respond. You can fake this if you want:

Me: “But the question is, how do we make this approach work?”
(Pause. If there’s no response, just leave it as a rhetorical question and continue
with your presentation: “One method that consistently . . .”)
Or try this approach:
Me: “But the question is, how do we make this approach work?”
(Scan the room for signs of people who look like they can be made to offer a
contribution.)
Me: “Wow, I see a lot of people who look like they have ideas to offer.”
(The technical name we use in the business for this technique is “a lie,” but remem-
ber that you can’t get caught, because most members of the audience, if they are
seated facing you, can’t see the other members.)
Me: “And I think I saw Bob in the last row ready to contribute.”
(Pick the person you think looked as though he or she had something to add. This
technique allows you to read the room and move on if the audience is dead, or to
select a responder in a nonthreatening way.)

There is one situation where you don’t want to get people talking, at least
right away: when you are trying to persuade them and possibly change
their opinions. There is more on this in Step 9 below, but note here that
people become much more intransigent once they have publicly stated an
opinion. In other words, if you allow or force them to oppose you publicly
in the beginning, you will never be able to change their views by the end.
If you do want to gauge the attitude of an audience, I have one partici-
pation tactic that usually works very well: Conduct an anonymous poll at
the beginning of the presentation. Paper handouts with one or two ques-
tions work well; 3 × 5 index cards work better. If you have an audience of
50, it will only take a helper five minutes or so to tabulate the questions
and maybe another five minutes to write down some of the more provoc-
ative responses. That translates to 10 minutes of your presentation, during
which the audience is in some suspense waiting for the results while your
helper tallies the numbers.
If you don’t have a stake in the outcome of who favors what view, or
even if you do and feel confident you will change some hearts and minds,
conduct a poll of attitudes at the beginning and end of the presentation. You
now have two suspense points—and I guarantee that the audience will be
curious about whether attitudes changed during the presentation.

6. GET QUESTIONS ROLLING WITH A PLANT IN THE AUDIENCE

The typical peak time for audience involvement is the end of the presen-
tation, often reserved for question-and-answer. Presenters have differing
opinions on the best placement of questions; some like to have audience
members ask during the session, while others prefer to have all questions
posed at the end. In my experience, end-of-session questions work best
with very large audiences, while smaller groups lend themselves to more
interaction. The subject matter and your general level of comfort play into
the optimal arrangement as well. There’s no reason why you can’t do both:
“We’ll have questions and answers at the end, but if there’s something you
think is important to clarify during the presentation, please feel free to ask.”
But nothing short of being tarred and feathered is so demoralizing
as concluding with, “And now, I’ll take your questions” and not getting
any. I see nothing morally wrong with having a confederate armed with a
question and getting the Q and A rolling. I’ve actually approached a total
stranger in the audience a few minutes before the beginning and asked if
they would help move the session along at the end by asking the first ques-
tion. I might even have proposed a question or two. This isn’t falsification;
you are, after all, planning a legitimate addendum to the talk, and if your
plant lets out the secret, so what? You plead guilty to the crime of careful
planning and move on.

7. ELIMINATE RHETORICAL FLOURISHES

You will obliterate your credibility by sounding like a caricature of a bad
speaker. Don’t use hackneyed, clichéd, or flowery language. Don’t say these
things:

• “Last but not least . . .”
• “It gives me great pleasure to . . .”
• “We are gathered here today . . .” (unless you are performing a marriage
ceremony)
• “It is with a heavy heart . . .”
• “I’d like to thank the many people . . .”
• “I’d like to begin . . .” (This phrase is particularly hated by legendary public
speaking teacher Reid Buckley, who advised, “Begin, damn it. Don’t hem and
haw.”2)

8. PROJECT LUCIDITY AND ORGANIZATION BY MONITORING TIME
AND MILESTONES ALONG THE WAY TO MAINTAIN INTEREST; AVOID
RUNNING LONG OR CUTTING MATERIAL SHORT

Some presentations are delivered with strict time limits. The length of
others is governed only by common sense. Bearing in mind the maxim
“No one ever complained that a talk was too short,” keep within your limits.
Remember that if you don’t keep the components of your presentation
within a time frame with milestones along the way, ending on time will
be like pulling the plug on a TV show that is only two-thirds complete. If
you panic when you realize you have only five minutes left and try to cram
everything in, you will look exactly like someone who is panicking and is
trying to cram everything in.
So keep tabs on expended time by sections. If your presentation is
divided up into beginning, middle, and end, take note of when each seg-
ment should conclude and adjust accordingly. If it’s easier, pick three
or four event milestones—perhaps an anecdote, the demonstration of a
device, a particular slide projected on the screen, and the beginning of
question-and-answer.
This can be difficult if there is no clock visible. It’s best not to look at your
watch while presenting (although there are worse things, from an audi-
ence’s standpoint, than a presenter concerned with time), so you might use
your phone or other device placed on the lectern. If you need only a couple
of time reminders, you can set your phone to vibrate and schedule two
alarms. With a little practice, you can simply reach in your pocket and end
the buzzing. If you have a cohort in the audience, he or she can give you a
planned, inconspicuous series of cues.
Remember, a speaker transparently cramming in material at the end
loses credibility. Virtually any technique you use to keep on time-track is
better than none at all. Even admitting you have a tendency to run long and
assigning a member of the audience to give you reminders (“Bill, would
you be sure to remind me to start questions at 10:30?”) is better than a
tailspin at the end.

9. WHEN TRYING TO PERSUADE, FRAME STATISTICS WITHIN
STORIES—BUT DON’T MISLEAD OR OVERREACH

We dealt with the concept of using stories instead of data dumps earlier
(Chapter 2, Step 2), but statistics need additional attention because of their
role in enhancing—or destroying—credibility. When we think the facts are
on our side, we have a tendency to believe that the numbers will some-
how make the truth self-evident. That’s not the case. In fact, bombarding
your listeners with data may have the opposite effect. In addition to being
boring and incomprehensible, you can appear downright deceptive. Most
audiences know full well that statistics can be tortured into saying whatever
the user wants, and when a speaker appears to abandon person-to-person
common sense in favor of a numerical fusillade, listeners become skeptical.
So when you use statistics, frame them and give them meaning. Here is
an example:

“About 45,000 people die in auto wrecks each year, and unfortunately that kind of
tragedy is so common that we tend to tune out. But think of it this way: That’s the
equivalent of a fully loaded passenger jet crashing, with no survivors, every day
for a year.”3
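Before framing a number this way, it is worth checking the arithmetic behind the comparison. The short Python sketch below divides the annual figure from the quote by 365. The function name is a hypothetical helper, and whether the daily result matches a "fully loaded passenger jet" depends on the aircraft you have in mind.

```python
def per_day(annual_total, days_in_year=365):
    """Convert a yearly count into an average daily count."""
    return annual_total / days_in_year


daily = per_day(45_000)
print(f"About {daily:.0f} deaths per day")
```

Running the numbers yourself before you speak helps ensure the story you wrap around a statistic does not overreach.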

Be careful not to misinterpret statistics or stretch their meaning beyond
what can reasonably be interpreted. A correlation, for example, does not
necessarily imply cause and effect. Don’t do this:

“A new freeway will be good for the economy of our town. I know a lot of people
here are worried about the impact of this proposed development, but let me tell
you about my friends Bill and Ellie. They live in a town the same size as ours in
Monroe County, about 50 miles from here, and they opposed a similar highway
there. But after the highway was built, their property values went up 25 percent.”

The statistic may be true, but just because one thing happened and then
something else happened afterward does not mean that the first event caused the
second. Property values could have gone up for many reasons—in fact,
building the road may have been a reaction to population increases in the
town, and property values may have escalated simply because the locale
was becoming more popular.
Be careful with statistics. To help you sort out the fine points of using
stats, I have provided a 10-point guide in the Appendix.

10. ASK FOR WHAT YOU WANT

Audiences feel cheated if, at the conclusion of a presentation, they don’t
know exactly what it was about or, specifically, what you want from them.
They might not agree with you, and they might not give you what you
want, but if you don’t include a specific call to action, your listeners will . . .

a. be confused and frustrated, and worse, from your perspective,

b. not give you what you want, because they don’t know what it is.

Do you want their vote? Give them good reasons to vote for you, and
then ask them to do it. Do you want them to join you in supporting a cause?
Tell them why they should, and ask them to do it.
Do you want their help in achieving a goal? Professional speaker Brian
Tracy uses this example of a call to action at the end of a speech, and
explains how to introduce the call to action.

“We have great challenges and great opportunities, and with your help, we will
meet them and make this next year the best year in our history!”

Imagine an exclamation point at the end of your call to action, pick up your tempo
and energy as you approach it, and drive the final point home. “Regardless of
whether the audience participants agree with you or are willing to do what you
ask,” Tracy writes, “it should be perfectly clear to them what you are requesting.”4
Chapter 5

Perfect and Polish Powerful Opens and Closes

Research shows that audiences make snap decisions about people and
tend to remember the opening of a presentation best. In other words, the
first impression is the most important element.

In addition to providing memorability, the opening sets the tone for the
whole opus, in much the same way that theme music sets the stage for
a newscast or an opening vignette gives you the flavor of the upcoming
situation comedy.

Research also indicates that statements featured near the end of a pre-
sentation are among the most memorable. The conclusion of the speech
is where you load your “takeaway”—the overall message and impression
you want to leave.

Academic types who have beards and stroke them regularly while dis-
cussing heady matters call this the “primacy/recency” effect.
Just remember that you need to have a strong start and a satisfying con-
clusion. It’s not that difficult. Here are 10 techniques:

1. WHENEVER POSSIBLE, HAVE SOMEONE INTRODUCE YOU (WITH
A SCRIPT YOU PROVIDE)

An introduction adds a bit of gravitas to your presentation. In addition,
as we discussed in Chapter 2, Step 10, it relieves you of some of the duties
of educating your audience about you, the topic, and its importance. Plus
your presence won’t be diminished if you’re not the one begging for quiet
or reciting housekeeping details (“and after this presentation, there will be
lunch in the . . .”).
But whatever you do, don’t let the person introducing you wing it. That is a
recipe for disaster—the introducer might make a hash of the intro or even
forget your name. Write out bullet points for the person introducing you.
It’s doubtful that anyone to whom you provide a prewritten introduction
will complain, and in all likelihood they will be happy to have some of their
burden lifted. Have the introduction illuminate the audience as to why the
topic is important and why you are qualified to discuss it. It also helps to
give some personal background so the audience can begin to relate to you.
But don’t overdo it; long, rambling, or fulsome (look it up—it doesn’t mean
what you think) introductions will send you into a ditch before you even
get out of the driveway.

2. DON’T START TALKING TOO SOON

Bolting out of the starting gate and babbling makes you appear nervous
and unsure of yourself, and because you don’t have the audience’s full atten-
tion during the opening few seconds, they may not get the first few words.
So take your time. Set yourself at the lectern or whatever position you
are speaking from, and take a few seconds to survey the audience. This is
a very powerful technique that shows you are in control, and it also builds
expectation and suspense.

3. GO EASY ON THE THANK-YOUS AND OTHER TRITE INGREDIENTS
OF A TEPID OPENING

Try not to thank people at the beginning, or at least don’t start giving
extended thanks. Doing so is trite and uncreative. If there are people you
must thank, do it a few minutes into the presentation. It’s actually more
appropriate to do so, and your gratitude will appear more sincere that way,
because you won’t give the impression that you are performing a rote task.
Or, if the occasion seems to merit opening with a thank-you, do it briefly,
get right to the lapel-grabber, and work in more elaborate thanks later.
Saying “hello” or “good morning” can also be an energy-sucker, because
typically you will receive a mumbled chorus of unsure and unenthusias-
tic replies. If saying hello is your style, go for it—but it’s best not to leave
a space during which people wonder if they are supposed to return the
greeting.
Avoid saying anything stilted and overly formal, such as, “It is my dis-
tinct pleasure to be here.” It’s awkward, nobody cares, and it sounds like a
ritualistic statement you feel compelled to recite. Instead, do this . . .

4. GRAB THE AUDIENCE BY THE LAPELS

Sadly, in the next 18 minutes when I do our chat, four Americans that are alive
will be dead from the food that they eat.

That’s the opening used by Jamie Oliver in his prize-winning TED
Talk, during which he raised alarms about the quality and quantity of
food Americans eat. A little later in the opening, he followed with “We,
the adults of the last four generations, have blessed our children with the
destiny of a shorter lifespan than their own parents. Your child will live
a life ten years younger than you because of the landscape of food that
we’ve built around them. Two-thirds of this room, today, in America, are
statistically overweight or obese. You lot, you’re all right, but we’ll get you
eventually, don’t worry.”1
Do you see how much more powerful an attention-grabbing cold open-
ing is than a variation of thank-you-very-much-and-I’m-glad-to-be-here?
Oliver cleverly galvanized the audience and set a perfect tone for the talk,
which you can read in Chapter 12 and can see online
(www.ted.com/talks/jamie_oliver).
The point? A provocative statement at the very top is one way to grab
the audience. Here are some other techniques:

• Humor. An entire chapter of this book is dedicated to humor, but let me
address it here from the standpoint of whether you should use it in an open.
The short answer is that yes, humor (or an amusing anecdote) is an effective
opening if (a) you have the skill to pull it off and (b) it is relevant to the pre-
sentation; ideally it will be self-deprecating, so as not to put anyone on the
defensive immediately. If you are not comfortable opening with humor, as
you might not be on a very formal or solemn occasion, you can always work
it in a minute or so later, as Oliver did in his food warning:

Two-thirds of this room, today, in America, are statistically overweight or obese.
You lot, you’re all right, but we’ll get you eventually, don’t worry.

Bill Clinton used the same delayed-humor tactic in a 1993 speech
before a conference of bishops, during which, for some reason, he was
introduced as “Bishop Clinton.” A minute or so into his talk, Clinton said,

In the last ten months, I’ve been called a lot of things. Nobody’s called me a
bishop yet. When I was about nine years old, my beloved and now deceased
grandmother, who was a very wise woman, looked at me and she said, “You
know, I believe you could be a preacher if you were just a little better boy.”

• Tease. Give the listener a foretaste of what’s coming, and make it intrigu-
ing. Steve Jobs opened his 2005 commencement address at Stanford with
a glancing reference to his honor at being selected and then invoked the
“what the . . . ?” principle with this puzzler:

I am honored to be with you today at your commencement from one of the fin-
est universities in the world. Truth be told, I never graduated from college, and
this is the closest I’ve ever gotten to a college graduation. Today, I want to tell
you three stories from my life. That’s it, no big deal—just three stories. The first
story is about connecting the dots. I dropped out of Reed College after the first
six months, but then stayed around as a drop-in for another eighteen months or
so before I really quit. So why’d I drop out? It started before I was born.2

• Quote or Statistic. This is a versatile technique, because you can almost
always readily locate something of interest to your audience. Brian Tracy uses
this example of quoting from a research report:

According to a story in a recent issue of Businessweek, there were almost
10,000,000 millionaires in America in 2013, most of them self-made.3

5. WHEN YOU END YOUR PRESENTATION, MAKE SURE THERE IS A
REAL ENDING

“That’s all, folks” is not an ending, except in a cartoon. You want to con-
clude with something with a little impact, drama, wit, or motivation. While
this admonition appears obvious—at least, it should appear obvious—think
how many droning and indifferent presentations you’ve seen that end with
“Well, time’s up,” “Looks like we’re done,” or the infamous “I don’t have
anything more.”
If your presentation involves questions from the audience, try the tech-
nique explained in the next point.

6. IF APPROPRIATE, INSERT THE QUESTION-AND-ANSWER SESSION
BEFORE THE END

Many business presentations, either in a meeting format or presented
to a large group, must by their nature include Q and A. The problem with
Perfect and Polish Powerful Opens and Closes 49

this arrangement is that the event just sort of sputters out—it’s over after it
becomes obvious that nobody wants it to continue.
Try saying something to this effect: “Before I wrap up, let’s take some
questions for about 10 minutes.” This has the double benefit of letting lis-
teners know how much time is remaining and allowing you to end your
presentation with a nice, satisfying thump.

7. ENSURE THAT THERE IS A CALL TO ACTION AT OR NEAR THE END
OF YOUR PRESENTATION

It’s frustrating for an audience to invest time listening to you and then be
left with the thought “So what’s next?” (See also the “ask for what you want”
advice in Chapter 3, Step 10.) If you are trying to sell a product or an idea,
ask for the sale. There’s always something you are trying to sell:

• If you are accepting a sports award, you might be selling the audience on the
benefit of teamwork.
• If you are conducting a training session on new software, you will be selling
the idea of being more productive.
• If you are giving a sales presentation, you are selling the benefits of the
product, not the product itself.
• If you are emceeing a retirement party, you are selling the audience on
what a great guy old Bill is and how we all hope he enjoys the next chapter of
life.

So don’t leave the listeners hanging. Let them know what they should do
to enhance the spirit of teamwork, to move forward in their jobs, to use a
new product to enhance the lives of others, or to congratulate Bill for his
four decades on the sales floor.
Maybe the thing you are selling is inspiration. If so, then you need to
make the next step clear in your call for action. What, specifically, can and
should your listeners do?
For some calls to action, it’s appropriate to have follow-up handouts
available, or even products for sale on the spot. (I’ve given talks that men-
tioned a particular book I’ve written and have seen a few people actually
disgruntled that I didn’t have copies for sale on the spot. Seriously. Now I
always keep a few copies for what we call “back-of-the-room sales” so that
my audiences can be fully and completely gruntled.)
In short, the specific invitation will be different depending on circum-
stance, but always ask the audience to do something with the information
they have just absorbed.

8. SOMEWHERE NEAR THE END, USE THE JIGSAW PUZZLE
TECHNIQUE TO SHOW YOUR AUDIENCE HOW EVERYTHING FITS
TOGETHER—AND VALIDATE THE EXPENDITURE OF THEIR TIME
LISTENING TO YOU

You invariably will want to review the information you have presented
near the end, but try to avoid referring to it as a “review.” That sounds
too much like cramming for a history test. Instead, hit on the main points
you’ve made, and show how it all fits together, like pieces of a puzzle, and
how the audience can therefore discover a new viewpoint or understanding.
It’s easy. You can even say, “So, how does all this fit together?” Sum-
marize the main points you’ve touched and show how they add up to a
satisfying whole.
I’ve revived myself several times after cratering during some disastrously
disjointed talks and lectures by using the “how does this fit together” tech-
nique. After realizing I’d rambled semicoherently, I recalled the main points
and invented some reason why they all added up to something that justified
an hour of my listeners’ attention. You don’t want to make this a habit, but I
make this confession to illustrate what an effective technique it is.
In sum, people want to justify their investment of time spent listening to
you. Make it easy by doing the thinking for them.

9. SOMEWHERE NEAR THE END, CIRCLE BACK BRIEFLY TO THE
BEGINNING TO DEMONSTRATE THAT YOUR PRESENTATION WAS
A COMPLETE, SPHERICAL GEM

This is a wonderful technique and is so simple to execute that there is
never any reason not to deploy it.
Simply close your presentation, at or near the end, with a clear reference
to something you addressed at the beginning. Doing so communicates the
notion that the entire presentation is complete and crafted from a solid
block of marble. Referring back to the beginning implies that you have
“proved” your case with a complete and logical progression of ideas.

• If you started with an anecdote, let the audience know how the story ended,
or provide some follow-up to the story.
• If you started with a quote, refer back to it or give another quote from the
same person.
• If nothing else, just say you are referring back to the beginning, with
something like, “I started off this discussion by promising that this new prod-
uct could improve productivity, so let me tell you about one company that
brought itself back from near-bankruptcy . . .”

10. YOUR LAST IMPRESSION COUNTS—SO MAKE IT A CLIMAX AND

ENSURE THAT YOU GET THE APPLAUSE YOU DESERVE

You have to nail the last 15 to 30 seconds. Your situation is akin to that
of a gymnast doing a floor routine. If you don’t stick the ending, the entire
performance is tarnished.
Remember the Steve Jobs commencement address, the opening of which
was described earlier in the chapter? (And do you notice how I am referring
to the opening of the chapter in this concluding part of the chapter
so I can convince you that you are getting a well-thought-out, spherical
view of openings and closings?) He ended it with an emotional, provoca-
tive, and funny phrase:

Stay hungry. Stay foolish.

The concluding lines essentially communicated the idea that you should
persevere even if people around you think that you are being unrealistic.
This echoed other parts of the speech and pointedly reviewed the main
points he’d made. How does this fit together with the rest of the advice in
this chapter? Remember the jigsaw puzzle technique introduced in Step 8?
Remember how I advised to make it clear how the pieces fit together, as I
am doing right now while wrapping up this chapter and being excruciat-
ingly clever?
Don’t leave any doubt as to when your presentation is concluded. Saying
“Thank you, and have a good evening” is clear but unimaginative. If you
want to thank the audience, you can put the thank-you right before the
closing thumper, like this:

Thank you very much for inviting me and for your kind attention, and let me
conclude with one more thought about our future. I’d urge us all to remember the
advice from former GE chairman Jack Welch, who advised, “Change.” [Pause.]
“Before you have to.”

The perfect conclusion will be the final thump line followed immediately
by spontaneous applause. If I were you, I’d opt for “planned spontaneity.” If
you have a friend in the audience, just let him or her know the last line and
tactfully ask for (literally) a helping hand.
Chapter 4

Maintain an Arsenal of 10 Techniques
to Deflect Skepticism, Hostility,
and Inattention

Presenters often have a morbid fear of hostile or indifferent audiences,
and that’s understandable. (Note that defusing generalized fear and
anxiety relating to public speaking is a discrete topic and is addressed
in Chapter 8.) But remember that it’s very unlikely that they will do
you any physical damage unless they have come armed with pitchforks,
so keep your fears in perspective. Remember, too, that there are simple
techniques that you can employ to actually turn the situation to your
advantage.

Audiences in general are not your enemy. Even under the most try-
ing of circumstances, not all members of an audience will be hostile,
inattentive, biased, or dismissively skeptical. But the neutral members
will look toward how you deal with the malcontents when framing
their own opinion of you, your presentation, and whether they want
to be persuaded to your point of view. If you handle the challenge
well, you can enhance your credibility, believability, likeability, and
persuasiveness.

Here are 10 techniques you can use when confronting difficult audience
members.

1. BRIDGE AGENDAS: FIND WHAT YOUR AGENDA AND YOUR
OPPONENT’S AGENDA HAVE IN COMMON, AND PURSUE THAT ROUTE

You’d be surprised at how often a response such as, “Yes, I understand,”
or even the hoary “I feel your pain” can drain off negative energy.
Granted, you can overdo this, and you don’t want to be viewed as cloy-
ingly insincere—as a “handler”—but if you can convince a hostile audience
member that both of you are pursuing the same goal, you can often retain
the sympathy of the rest of the audience. You do this by first acknowledging
that the audience member’s concern is legitimate, articulating the common
ground that you share, and then calling for moving forward by bridging and
combining your agenda and the complainer’s agenda.
Here’s an example:

Question from Audience: “If this restructuring goes through, a lot of us are going
to have to move! How am I supposed to tell my kids?”
Response: “I know how difficult that can be, and I’ve been there myself.” (Acknowl-
edge.) “And it’s not something we would want to impose on people if the situation
weren’t so difficult. You and I and everybody here are worried about keeping our
jobs, period.” (Common ground.) “No solution is going to be perfect, but if we
work together on this, we have a good chance of saving everybody’s job in the long
run, and that’s what I think we all want.” (Call for moving forward with a shared
agenda. And don’t be baited into debating what to tell the kids.)

2. USE THE RICOCHET QUESTION TO DIVERT A TROUBLEMAKER’S
QUESTION TO SOMEONE ELSE IN THE AUDIENCE—A TECHNIQUE
THAT DEFUSES HOSTILITY AND BUYS YOU TIME TO THINK

A ricochet question (where you take someone’s question and refer it to
others) works best in a venue where you know some of the other audi-
ence members. It differs from a bounceback question, which is addressed
in Step 3.
You can’t always use the ricochet question, but when it’s appropriate, the
technique not only takes the focus off you momentarily but also increases
audience involvement, and in the process it may actually produce some
good discussion.
It works like this:

Skeptical Question: Our numbers each quarter keep going down. Aren’t we
headed for disaster?
Answer: It’s true that sales are a challenge in this economy, but some departments
are holding their own or actually improving. Alyssa’s department had two good
years in a row, and she’s been active in training throughout the company. What do
you think, Alyssa? What are the options you can identify?

Be careful, because you don’t want to anger people by putting them on the
spot, and you don’t want to appear to be ducking questions. But executed cor-
rectly, this technique can appease the audience and sometimes the questioner.
You can ask an audience-wide ricochet question, too.

Before I answer that, does anybody in the audience have any [thoughts, direct
experience with the issue, etc.]?

3. USE THE BOUNCEBACK QUESTION TO PUT TROUBLEMAKERS ON
THE DEFENSIVE

If you sense that hostile questioners or interrupters are simply intent on
disruption, you can sometimes put them back on their heels by asking their
names. (“Sorry, your name is . . . ?” or “Sorry, I missed your name.”) Many
hecklers are like Internet trolls and are courageous only when anonymous.
You can also simply ask them what they would do in the situation. Most
won’t have an answer, or if they do, it is likely to be ill-reasoned. And even
if your heckler does provide a semicoherent response, you have, after all,
steered the conversation back to a landscape of facts, where you presum-
ably have an advantage.

4. KEEP AN EMERGENCY STORE OF ADDITIONAL INFORMATION TO
OVERWHELM A TROUBLEMAKER

You should know in advance what areas of your presentation will be
controversial, so do some more homework in those areas and hold the
information in reserve. Be prepared to parry a hostile question with some-
thing like this:

I’m glad you asked, because I just recently came across a new study from the federal
government showing that this type of program has achieved remarkable success . . .

I call these nuggets “clinchers,” and if you don’t have occasion to use
them, you can always drop them in somehow, even if it’s at the end of the
question-and-answer session:

Oh, and that reminds me—before we wrap up, I should mention that on a related
topic, a new study from the federal government . . .

5. REFRAME AND REPEAT A HOSTILE QUESTION TO YOUR ADVANTAGE

Often, a hostile question is 90 percent rant and 10 percent vague
interrogative. When you confront a situation in which a questioner is slamming
you with a rambling semi-question, repeat the question (your version of
it) and give your answer. In a large group, where the audience might not
be able to hear a questioner, repeating the question is a good technique
regardless of the intent of the person asking.
Here’s an example:

(After listening to the rant) “So, the question, as I understand it, is: How do we
make sure this policy is applied fairly? As I mentioned in the opening, what we
consistently try to do is to . . .”

There are three tricks embedded in the technique above. First, you have
subtly characterized the question as a disorganized rant (“as I understand
it”) and shown that you are diligently trying to unpack it; second, you note
that you have already at least partially answered the question and therefore
reinforce your main themes; third, you reframe the issue so that you can
give the answer you want to give. Don’t go too far afield, or it will appear as
though you are being evasive, but you do have a right to break down what
you believe the relevant issues are.
If you think it worthwhile, break down a hostile rant into two or three
separate questions. This clarifies the issue and also puts the questioner and
the audience on notice that you have exhaustively attempted to cover the
specified ground, which gives you more justification for moving on.
Then, to move on with the presentation, say something like this:

Well, we have quite a few other people who have questions, and we’ve dealt with
three of yours, so it’s only fair we move on. I’ll be glad to talk with you after . . .

6. REFOCUS AUDIENCE ATTENTION TO DEFLATE A SCENE-STEALER

Attention junkies in the audience derive perverse pleasure from elbowing
into someone else’s presentation, either by heckling, providing unnecessarily
long commentary, or asking an unreasonable number of questions.
The first thing to realize about scene-stealers is that they crave positive
vibes; they are usually seeking approval from the audience and will back off
if the audience turns against them.
With this in mind, the first line of defense is to let the scene-stealer
blather on longer than necessary. Just bite your tongue, and wait it out.
Then thank them for their input and move on; this will often end the con-
frontation. But because these types have a defective off-switch, some will
blindly go forth until they sense that everyone around them is becoming
really uncomfortable. That’s when you answer, but be sure to address every-
one, not just the malcontent. If you talk only to the malcontent, you will
encourage him or her to keep blathering.
Presentation expert Olivia Mitchell, who has an excellent video about
handling hecklers, recommends that as a last resort, if nothing stops the
malcontent, you ask the audience if they would rather listen to you than
the heckler. It is to be hoped that they choose you, in which case your mal-
content is facing disapproval—exactly the opposite of what most attention
junkies crave.1

7. BE THE GROWNUP IN THE ROOM

Your posture and demeanor can go a long way toward defusing a trou-
blemaker or at least luring the audience to your side of the dispute. You
generally don’t want to get into a shouting match with a heckler, because
then you are stepping into their territory and doing battle on their terms. (I
say “generally” because you may have a successful tough-guy or tough-girl
style that serves you well when you give a heckler what-for. If you’re confi-
dent and experienced in that approach, it might win the audience over. But
it also carries the risk of making a bad situation worse.)
In most cases, then, when things get ugly, you can use one or more of
these techniques:

• Avoid a confrontational pose. Hands on hips or finger-pointing will likely
inflame the situation.
• Kill with kindness, and remain positive. Smiling and actually lowering the
volume of your voice makes you look more like an adult and makes the heck-
ler appear as more of a whackjob.
• If the venue allows it, move toward the malcontent in a nonthreatening
way. This allows you to demonstrate that you are directly engaging the trou-
blemaker and also builds audience sympathy.

8. DEFUSE PASSIVE-AGGRESSIVE HECKLERS BY
EMBARRASSING THEM

Eye-rollers, smirkers, and whisperers usually don’t have the nerve to dis-
rupt openly, so they act in a subterranean manner. Such passive-aggressive
behavior is, in my mind, more malignant than direct heckling, because it
implies not only hostility but also disrespect.
One way to shut them down is to stop talking and stare at them. Do
this for longer than you feel comfortable doing it, and the audience will
become uncomfortable too—and direct its collective displeasure toward
the passive-aggressive disrupter. You can crank this approach up a notch by
directly calling on the malcontent, pretending (with as much sincerity as
you can fake) that you thought they had asked a question but couldn’t hear,
or that you judged by their expression that there was something wrong—
perhaps, you say with concern, they might be ill.

9. WHEN DEALING WITH A REPORTER OR SOMEONE ASSUMING THAT
ROLE, FRAME YOUR ANSWERS CAREFULLY TO AVOID BEING TAKEN
OUT OF CONTEXT

You have certainly been in positions where it was obvious that a person
asking you questions was fishing for something that could be used against
you. It might have been a journalist with a hostile agenda or a coworker
looking to torpedo you later. There is no way to keep someone who wants
to mischaracterize your communication from taking something out of
context, but you can make it more difficult for it to occur.
Use these approaches:

• Be wary of yes-or-no answers, especially if someone is trying to set you
up to give a blanket yes-or-no answer to a silly scenario. So, if you’re asked,
“Do you think everyone who disagrees with you is stupid?” don’t be lured
into the trap. Give a reasoned, three-sentence answer saying exactly what you
want to say.
• If you sense that you are dealing with someone on a fishing expedition
for something damaging, be careful about using humor and sarcasm. If
your quote is relayed in the cold entombment of print, none of the nuance of
humor will be evident.
• Should you sense that the person asking questions is seizing on something
you’ve just said (perhaps scribbling frantically in a notebook), be sure to
continue on for a few more sentences. Clarify your points, and leave as little
room for misinterpretation as possible.

10. BREAK UP THE ROUTINE TO OVERCOME PASSIVE RESISTANCE

Bored, disengaged audiences can become sullen and passively hostile
if not jolted back into their happy place. Maintain an array of techniques
to recapture their attention and revitalize the presentation. Here are some
ways to recover when you are losing your audience:

• Take a long pause. Silences command attention; we are conditioned to want
them filled.
• Tease them. “In about a minute I’m going to show you something that is just
amazing . . .”
• Inject an unplanned question-and-answer session. “We’ve covered a lot of
ground, so maybe we should take a minute to deal with questions I’m sure
have occurred to you . . .”
• Ask the audience a question. “What would you do?” We are conditioned to
focus attention when asked a question.
• Use my favorite refocusing technique, mentioned in Chapter 2, Step 9. Say
“in conclusion,” “in summary,” or “the most important point I want to leave
you with.”
MODULE OF INSTRUCTION

Advanced SQL

Welcome to the first module of this course on Advanced Database
Systems! An understanding of SQL and its capabilities for querying a
single table is a prerequisite.

In this module, you will learn how to retrieve data from multiple tables
using one SQL statement. You will see how tables can be joined
together and how similar results can be obtained using different
approaches, including joins and subqueries.

It is important that you understand how to query multiple tables in
order to generate appropriate reports when building an information
system.

Learning Objectives
After studying this lesson, you should be able to:

• Concisely define key terms.
• Write single and multiple table queries using SQL commands.
• Define types of join commands and use SQL to write these
commands.
• Write single-row and multiple-row subqueries.
• Write correlated subqueries and know when to write them.

Joining More Than One Table

You already learned from your previous database course how to
generate reports from a single table. This time we will explore the
capabilities of SQL in querying multiple tables by joining these tables.

A join combines two or more tables by finding rows whose values
match in the common columns. These common columns are usually
the primary key of the parent table and the foreign key of the child
table in a one-to-many relationship.

Advanced Database Systems 1

1.0 Advanced SQL

A join can be done either implicitly or explicitly. An implicit join
lists the tables in the FROM clause and states the matching
condition on the common columns in the WHERE clause. An explicit
join, on the other hand, uses the JOIN…ON keywords in the FROM
clause. The various types of join are described in the
following sections.

Equi-join

An equi-join is a type of join that selects rows from two tables
that have equal values in the common columns. The common
columns appear redundantly in the result table.

For example, if you want to know the names of the students
who have enrolled in courses, that information is kept in two
tables, Student (Table 1.1) and Enrollment (Table 1.2). It is
necessary to match the student names with their courses in
order to answer the question.

Table 1.1 Student

Table 1.2 Enrollment


Query: What are the id numbers and last names of all students, along
with the course codes of all the courses in which they have enrolled?

Implicit:

Explicit:

Use table prefixes to qualify column names that are in multiple tables.

Result:

Use column aliases to distinguish columns that have identical names but
reside in different tables.
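
A minimal sketch of the implicit and explicit forms of this query follows. The column names (student_id, last_name, course_code) and the sample rows are assumptions made for illustration, since the contents of Tables 1.1 and 1.2 are not reproduced here. Python's built-in sqlite3 module is used only as a convenient way to run the statements; the SELECT statements themselves are standard SQL.

```python
import sqlite3

# Hypothetical schema and rows standing in for the Student and Enrollment tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, last_name TEXT);
    CREATE TABLE enrollment (student_id INTEGER, course_code TEXT);
    INSERT INTO student VALUES (1, 'Cruz'), (2, 'Reyes'), (3, 'Santos');
    INSERT INTO enrollment VALUES (1, 'MIT101'), (1, 'MIT102'), (2, 'MIT101');
""")

# Implicit equi-join: the matching condition sits in the WHERE clause.
implicit = conn.execute("""
    SELECT s.student_id, s.last_name, e.student_id AS enrolled_id, e.course_code
    FROM student s, enrollment e
    WHERE s.student_id = e.student_id
    ORDER BY s.student_id, e.course_code
""").fetchall()

# Explicit equi-join: the same condition moves into JOIN ... ON.
explicit = conn.execute("""
    SELECT s.student_id, s.last_name, e.student_id AS enrolled_id, e.course_code
    FROM student s JOIN enrollment e ON s.student_id = e.student_id
    ORDER BY s.student_id, e.course_code
""").fetchall()

for row in explicit:
    print(row)
```

Both forms return the same rows. The common column appears twice in each row (once from each table), which is why the second copy is given the alias enrolled_id. The student with no enrollment (id 3 in this sample data) does not appear.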

Natural Join

A natural join is similar to an equi-join; however, it is performed over
all columns with matching names, and one of each pair of duplicate
columns is eliminated in the result table.

Query: What are the id numbers and last names of all students, along
with the course codes of all the courses in which they have enrolled?

This join is performed on every column that has the same name in both
tables, and those columns must also have the same data type. Otherwise,
an error is returned.

The result table is the same as the equi-join.
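
As a rough sketch, the natural join version of the query can be written as below. The table layout and rows are the same hypothetical ones assumed for illustration (student_id, last_name, course_code), and sqlite3 is used only as a convenient engine for running the statement.

```python
import sqlite3

# Hypothetical schema and rows standing in for the Student and Enrollment tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, last_name TEXT);
    CREATE TABLE enrollment (student_id INTEGER, course_code TEXT);
    INSERT INTO student VALUES (1, 'Cruz'), (2, 'Reyes'), (3, 'Santos');
    INSERT INTO enrollment VALUES (1, 'MIT101'), (1, 'MIT102'), (2, 'MIT101');
""")

# NATURAL JOIN matches on the column the two tables share by name (student_id).
cursor = conn.execute("""
    SELECT *
    FROM student NATURAL JOIN enrollment
    ORDER BY student_id, course_code
""")
columns = [d[0] for d in cursor.description]
rows = cursor.fetchall()

print(columns)
for row in rows:
    print(row)
```

Note that no join condition is written and that student_id appears only once in the column list, in contrast with the equi-join.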


Outer Join

In SQL:1999, the joined table that returns only matched rows is
called an INNER join. Both the equi-join and the natural join perform
an inner join of two tables.

Sometimes you also want to retrieve rows that do not
have matching values in the common columns. This join is
known as an OUTER join. An outer join can be a left, right, or full
outer join. A join between two tables that returns the matched
rows plus the unmatched rows from the left (or right) table is
called a left (or right) outer join. Likewise, a join that returns
the results of the inner join as well as the unmatched rows from
both the left and right tables is a full outer join.

Query: List the student id, name, and course code for all the
students listed in the Student table. Include also the students
who did not take any course.

Result:

Notice that the query retrieves all the rows in the Student table,
which is the left table, even if there is no match in the
Enrollment table. Consequently, a null value in a column of
the Enrollment table indicates that no match was found.
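
A left outer join for this query might look like the sketch below. The schema and rows are hypothetical (student 3 deliberately has no enrollment), and sqlite3 simply provides a way to run the statement; Python shows a SQL null as None.

```python
import sqlite3

# Hypothetical schema and rows; student 3 has no row in enrollment.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, last_name TEXT);
    CREATE TABLE enrollment (student_id INTEGER, course_code TEXT);
    INSERT INTO student VALUES (1, 'Cruz'), (2, 'Reyes'), (3, 'Santos');
    INSERT INTO enrollment VALUES (1, 'MIT101'), (1, 'MIT102'), (2, 'MIT101');
""")

# Every row of the left table (student) is kept, matched or not.
rows = conn.execute("""
    SELECT s.student_id, s.last_name, e.course_code
    FROM student s LEFT OUTER JOIN enrollment e
         ON s.student_id = e.student_id
    ORDER BY s.student_id, e.course_code
""").fetchall()

for row in rows:
    print(row)
```

The last row printed is (3, 'Santos', None): the student is retained and the course code is null because no enrollment row matched.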


Table 1.3 Course

Query: List the student id and course code for all the courses listed
in the Course table (Table 1.3). Include the course code even if there is
no student enrolled in it.

Result:

The query retrieves all the rows in the Course table, which is the right
table, even if there is no match in the Enrollment table. Notice also
that the outer join indicates null as a value for the student id column
where no match is found.

Query: List the student id, name, course code, and description for all
students and courses in the Student and Course tables. Include the rows
even if there is no data available.


Result:

Notice the null values in the columns where no matching rows exist in
the related tables.

Self Join

Sometimes it is necessary to join a table to itself. This
operation is known as a self join or recursive join.

It is common in a unary relationship, such as the Manages
relationship shown in Figure 1.1.

Figure 1.1 Unary relationship

Table 1.4 Employee


In Table 1.4, the employee_id is the primary key and manager_id is
the foreign key.

Query: Find the names of the employees, along with their managers in
the Employee table.

It is necessary to use a table alias when joining a table to itself to avoid
ambiguity. Otherwise, an error is returned since you are using the
same table twice in the query.

Use table aliases in performing a self join.
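
One way to write this self join is sketched below. Only employee_id and manager_id come from the text; the name column and the sample rows are assumptions made for illustration, and sqlite3 is used just to run the statement.

```python
import sqlite3

# Hypothetical Employee rows; manager_id refers back to employee_id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT,
        manager_id  INTEGER REFERENCES employee (employee_id)
    );
    INSERT INTO employee VALUES
        (100, 'King', NULL), (101, 'Kochhar', 100),
        (102, 'De Haan', 100), (103, 'Hunold', 102);
""")

# The aliases e (employee) and m (manager) make the two roles of the
# same table distinguishable.
rows = conn.execute("""
    SELECT e.name AS employee, m.name AS manager
    FROM employee e JOIN employee m ON e.manager_id = m.employee_id
    ORDER BY e.employee_id
""").fetchall()

for row in rows:
    print(row)
```

Because this is an inner join, an employee without a manager (King in this sample data) is not listed; a left outer join would keep that row with a null manager.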

Result:

Subqueries
The previous SQL examples illustrate one of the approaches for
processing multiple tables. SQL also provides a powerful technique, the
subquery, for conditions that depend on a value not known in advance.
A subquery is a SELECT statement (inner query) that is
embedded in the WHERE or HAVING clause of another SELECT
statement (outer query). The inner query executes first and returns a
value that is used by the outer query.

Two classes of comparison operators are used in subqueries:
single-row operators and multiple-row operators.


Single-row Subquery

In single-row subqueries, only one row is returned by the inner
query. This type of subquery uses a single-row operator.

Table 1.5 Single-row operators

Use single-row operators if the inner query returns only one row.

Table 1.6 Emp

Table 1.6 shows the records of the Emp table, which will be
used in the subsequent examples.

Query: Display the employees whose salary is less than
Hunold’s salary, along with their salaries.
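
A sketch of this single-row subquery is given below. The Emp columns (last_name, job_id, salary) and the rows are assumed for illustration, since Table 1.6 is not reproduced here; sqlite3 is used only to execute the statement.

```python
import sqlite3

# Hypothetical Emp rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (last_name TEXT, job_id TEXT, salary INTEGER);
    INSERT INTO emp VALUES
        ('King', 'AD_PRES', 24000), ('Hunold', 'IT_PROG', 9000),
        ('Ernst', 'IT_PROG', 6000), ('Lorentz', 'IT_PROG', 4200),
        ('Abel', 'SA_REP', 11000);
""")

# The inner query returns exactly one row (Hunold's salary), so the
# single-row operator < can compare against it.
rows = conn.execute("""
    SELECT last_name, salary
    FROM emp
    WHERE salary < (SELECT salary FROM emp WHERE last_name = 'Hunold')
    ORDER BY salary
""").fetchall()

for row in rows:
    print(row)
```

If more than one employee were named Hunold, the inner query would return several rows and a single-row operator would no longer be appropriate.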


Result:

Multiple-row Subquery

Multiple-row subqueries return more than one row and use a multiple-row
operator, which expects one or more values from the inner query.

Table 1.7 Multiple-row operators

Use multiple-row operators if the inner query returns more than one row.

Query: Find the employees who earn the same salary as the total salary
of all employees in each job.
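
The query above can be sketched with the IN operator, as shown below. The Emp columns and rows are hypothetical, and the use of IN with a grouped SUM is one reasonable reading of the query rather than a reproduction of the module's original statement.

```python
import sqlite3

# Hypothetical Emp rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (last_name TEXT, job_id TEXT, salary INTEGER);
    INSERT INTO emp VALUES
        ('King', 'AD_PRES', 24000), ('Hunold', 'IT_PROG', 9000),
        ('Ernst', 'IT_PROG', 6000), ('Lorentz', 'IT_PROG', 4200),
        ('Abel', 'SA_REP', 11000);
""")

# The inner query returns one total per job (several rows), so the
# multiple-row operator IN is needed instead of =.
rows = conn.execute("""
    SELECT last_name, job_id, salary
    FROM emp
    WHERE salary IN (SELECT SUM(salary) FROM emp GROUP BY job_id)
    ORDER BY last_name
""").fetchall()

for row in rows:
    print(row)
```

With this sample data only the employees who are alone in their job (Abel and King) match, because their salary equals the total for that job.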

Result:


Query: Find the employees whose salary is the same as the
salary of any employee with job id of IT_PROG and whose
job is not IT_PROG.

Result:

Query: Find the employees whose salary is greater than the
salary of all employees with job id of IT_PROG and whose job
is not IT_PROG.
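
In Oracle, the two preceding queries would typically use the conditions salary = ANY (subquery) and salary > ALL (subquery). The sketch below runs on SQLite, which does not provide the ANY and ALL operators, so it substitutes equivalent forms: IN for = ANY, and a comparison with MAX for > ALL (equivalent as long as the inner query returns at least one non-null salary). The Emp columns and rows are assumptions made for illustration.

```python
import sqlite3

# Hypothetical Emp rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (last_name TEXT, job_id TEXT, salary INTEGER);
    INSERT INTO emp VALUES
        ('King', 'AD_PRES', 24000), ('Hunold', 'IT_PROG', 9000),
        ('Ernst', 'IT_PROG', 6000), ('Lorentz', 'IT_PROG', 4200),
        ('Abel', 'SA_REP', 11000), ('Fay', 'MK_REP', 6000);
""")

# Stand-in for: salary = ANY (SELECT salary FROM emp WHERE job_id = 'IT_PROG')
any_rows = conn.execute("""
    SELECT last_name, salary
    FROM emp
    WHERE salary IN (SELECT salary FROM emp WHERE job_id = 'IT_PROG')
      AND job_id <> 'IT_PROG'
    ORDER BY last_name
""").fetchall()

# Stand-in for: salary > ALL (SELECT salary FROM emp WHERE job_id = 'IT_PROG')
all_rows = conn.execute("""
    SELECT last_name, salary
    FROM emp
    WHERE salary > (SELECT MAX(salary) FROM emp WHERE job_id = 'IT_PROG')
      AND job_id <> 'IT_PROG'
    ORDER BY last_name
""").fetchall()

print(any_rows)
print(all_rows)
```

With this sample data, Fay matches the first query because her salary equals that of one IT_PROG employee, while Abel and King match the second because they earn more than every IT_PROG employee.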

Result:

Correlated Subquery

In the preceding subquery examples, the inner query is evaluated
first, and its result is then used by the outer query.
In contrast, a correlated subquery uses values from the
outer query in processing the inner query, so conceptually the
inner query is evaluated once for each row of the outer query. You can
use the EXISTS operator to process a correlated subquery.


Query: Find the student id and name of the students who enrolled in
the course MIT101.

Result:

You can also process this multiple table query using the IN operator or
the join SQL statement. You will get the same results.
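
The three approaches can be compared side by side in the sketch below. The schema and rows are hypothetical (student_id, last_name, course_code), and sqlite3 is used only to run the statements.

```python
import sqlite3

# Hypothetical schema and rows standing in for the Student and Enrollment tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, last_name TEXT);
    CREATE TABLE enrollment (student_id INTEGER, course_code TEXT);
    INSERT INTO student VALUES (1, 'Cruz'), (2, 'Reyes'), (3, 'Santos');
    INSERT INTO enrollment VALUES (1, 'MIT101'), (1, 'MIT102'), (2, 'MIT101');
""")

# Correlated subquery: the inner query refers to s.student_id from the outer query.
with_exists = conn.execute("""
    SELECT s.student_id, s.last_name
    FROM student s
    WHERE EXISTS (SELECT 1 FROM enrollment e
                  WHERE e.student_id = s.student_id
                    AND e.course_code = 'MIT101')
    ORDER BY s.student_id
""").fetchall()

# Non-correlated subquery with IN: the inner query can run on its own.
with_in = conn.execute("""
    SELECT student_id, last_name
    FROM student
    WHERE student_id IN (SELECT student_id FROM enrollment
                         WHERE course_code = 'MIT101')
    ORDER BY student_id
""").fetchall()

# Join: the same question answered by matching rows directly.
with_join = conn.execute("""
    SELECT s.student_id, s.last_name
    FROM student s JOIN enrollment e ON s.student_id = e.student_id
    WHERE e.course_code = 'MIT101'
    ORDER BY s.student_id
""").fetchall()

print(with_exists)
```

All three statements return the same two students for this sample data. The join version would return a student more than once if the same enrollment were recorded twice, which the EXISTS and IN versions would not.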

Activities/Exercises
Answer briefly the following questions:

1 When is an outer join used instead of an inner join?

2 Explain the processing order of a correlated subquery.

3 Explain the following statement regarding SQL: Any query that

can be written using the subquery approach can also be written
using the joining approach but not vice versa.


Glossary
Correlated subquery

- A subquery in which processing the inner query depends on

data from the outer query.

Equi-join

- A join in which the joining condition is based on equality

between values in the common columns. Common columns
appear (redundantly) in the result table.

Join

- A relational operation that causes two tables with a

common domain to be combined into a single table.

Natural join

- A join that is the same as an equi-join except that one of the
duplicate columns is eliminated in the result table.

Outer join

- A join in which rows that do not have matching values in

common columns are nevertheless included in the result
table.

Subquery

- A query (inner query) within another query (outer query).

References
Hoffer, J., Ramesh, V., Topi, H. (2013). Modern Database
Management, 11th Ed. New Jersey: Pearson Education, Inc.

Price, J. (2008). Oracle Database 11g SQL: Master SQL and
PL/SQL in the Oracle Database. New York: McGraw-Hill.

PPT4 W3 S4 R0 Predictive Analytics I Data Mining Process
50 pages
Introduction To Data Mining & Business Intelligence
No ratings yet
Introduction To Data Mining & Business Intelligence
25 pages
ML Lect1
100% (1)
ML Lect1
51 pages
Chapter 3-IB
No ratings yet
Chapter 3-IB
69 pages
Data Mining Real
No ratings yet
Data Mining Real
19 pages
DWM Merged
No ratings yet
DWM Merged
125 pages
Data Mining: Prof Jyotiranjan Hota
No ratings yet
Data Mining: Prof Jyotiranjan Hota
17 pages
4 Datamining
No ratings yet
4 Datamining
90 pages
datamining&warehousing
No ratings yet
datamining&warehousing
65 pages
Data Mining - An Overview
No ratings yet
Data Mining - An Overview
40 pages
Data Mining
No ratings yet
Data Mining
25 pages
Data Structures: Notes For Lecture 12 Introduction To Data Mining by Samaher Hussein Ali
No ratings yet
Data Structures: Notes For Lecture 12 Introduction To Data Mining by Samaher Hussein Ali
4 pages
What Is Data Mining: Effective Data Collection Warehousing
No ratings yet
What Is Data Mining: Effective Data Collection Warehousing
21 pages
DM-Unit-I Introduction To Association-1
No ratings yet
DM-Unit-I Introduction To Association-1
97 pages
Survey Paper SN
No ratings yet
Survey Paper SN
4 pages
UNIT-III
No ratings yet
UNIT-III
33 pages
Chapter 6 Data Mining
No ratings yet
Chapter 6 Data Mining
39 pages
Data Mining Overview
No ratings yet
Data Mining Overview
14 pages
Sharda_11e_full_accessible_ppt_04
No ratings yet
Sharda_11e_full_accessible_ppt_04
40 pages
1.1 Data and Information Mining
No ratings yet
1.1 Data and Information Mining
24 pages
Data Mining e Resources
No ratings yet
Data Mining e Resources
98 pages
DM - MOD - 1 Part II
No ratings yet
DM - MOD - 1 Part II
14 pages
Ch1 Overview Kdd_ml
No ratings yet
Ch1 Overview Kdd_ml
23 pages
Data Mining Techniques and Applications
No ratings yet
Data Mining Techniques and Applications
16 pages
Notes Module 2
No ratings yet
Notes Module 2
28 pages
5 What Is Data-WPS Office
No ratings yet
5 What Is Data-WPS Office
19 pages
01-Introduction To Data Mining
No ratings yet
01-Introduction To Data Mining
43 pages
Introduction To Data Mining
No ratings yet
Introduction To Data Mining
45 pages
Data Warehouse and Mining Notes
No ratings yet
Data Warehouse and Mining Notes
12 pages
Machine Learning with Python: Foundations and Applications: ML, #1
From Everand
Machine Learning with Python: Foundations and Applications: ML, #1
Mohammed Nurudeen
No ratings yet
DBMS LAB Important Questions For UNIV LAB
No ratings yet
DBMS LAB Important Questions For UNIV LAB
6 pages
Less08 Data TB3
No ratings yet
Less08 Data TB3
31 pages
Salesforce
100% (1)
Salesforce
50 pages
Cursor PDF
No ratings yet
Cursor PDF
104 pages
4) CKM (Checking Knowledge Module) :: 1) Flow Control
No ratings yet
4) CKM (Checking Knowledge Module) :: 1) Flow Control
4 pages
DBMS Sample Problem Statements
No ratings yet
DBMS Sample Problem Statements
3 pages
Instead of Triggers
No ratings yet
Instead of Triggers
14 pages
Swathi PL - SQL Developer Resume
No ratings yet
Swathi PL - SQL Developer Resume
4 pages
Firebird Gbak
No ratings yet
Firebird Gbak
30 pages
DBMS LAB MANUAL Aiml
No ratings yet
DBMS LAB MANUAL Aiml
57 pages
Dbms Lab Manual RGPV
No ratings yet
Dbms Lab Manual RGPV
38 pages
MSSQL to Tibero Migration
No ratings yet
MSSQL to Tibero Migration
20 pages
Semester 2 Mid Term Exam 2 PDF
No ratings yet
Semester 2 Mid Term Exam 2 PDF
23 pages
Answers
No ratings yet
Answers
5 pages
Sybase Session 4-Document
No ratings yet
Sybase Session 4-Document
11 pages
Midterm Sem2
No ratings yet
Midterm Sem2
20 pages
An Introduction To Triggers
No ratings yet
An Introduction To Triggers
8 pages
SQL Test 1
No ratings yet
SQL Test 1
8 pages
Codd Rules
No ratings yet
Codd Rules
63 pages
SQL, PL/SQL Faq About Triggers:: Insert Update Delete
No ratings yet
SQL, PL/SQL Faq About Triggers:: Insert Update Delete
25 pages
rzahfpdf
No ratings yet
rzahfpdf
34 pages
1 Course Materials 2 Prerequisites 3 Course Outline 5 Setup 9 Microsoft Official Curriculum 12 Microsoft Certified Professional Program 13 Facilities 15
No ratings yet
1 Course Materials 2 Prerequisites 3 Course Outline 5 Setup 9 Microsoft Official Curriculum 12 Microsoft Certified Professional Program 13 Facilities 15
11 pages
Orafaq
No ratings yet
Orafaq
5 pages
Design Patterns Elements of Reusable Object-Oriented Software
No ratings yet
Design Patterns Elements of Reusable Object-Oriented Software
17 pages
PR PLSQL Sem3 Day Wise IMCA
No ratings yet
PR PLSQL Sem3 Day Wise IMCA
12 pages
Database Lecture08
No ratings yet
Database Lecture08
40 pages
36 Anshuman RDBMS
No ratings yet
36 Anshuman RDBMS
15 pages
ELT Process
No ratings yet
ELT Process
80 pages
ENCh 03
No ratings yet
ENCh 03
56 pages