Week 1 - Database Concepts and Systems
Why Databases?
➢ Data is not only ubiquitous and pervasive; it
is also essential for organizations to survive
and prosper.
➢ Imagine trying to operate a business without
knowing who your customers are, what
products you are selling, who is working for
you, who owes you money, and to whom you
owe money.
➢ All businesses must keep this type of data and
much more.
➢ At the heart of all these systems are the
collection, storage, aggregation,
manipulation, dissemination, and
management of data.
➢ Databases are specialized structures that
allow computer-based systems to store,
manage, and retrieve data very quickly.
➢ Virtually all modern business systems rely on
databases.
Data is the new oil…
data value
Data vs. Information
Data
• Raw facts
• Raw data: not yet processed to reveal its meaning
• Building blocks of information
• Raw data must be properly formatted for storage, processing, and presentation (e.g., DD/MM/Y)
• Data management: generation, storage, and retrieval of data
Information
• Produced by processing data
• Reveals the meaning of data
• To reveal meaning, information requires context
• Enables knowledge creation
• Should be accurate, relevant, and timely to enable good decision making
Data / Information / Knowledge / Wisdom Pyramid
Data is conceived of as symbols or signs,
representing stimuli or signals.
Information is defined as data that are endowed
with meaning and purpose.
Knowledge is a fluid mix of framed experience,
values, contextual information, expert insight and
grounded intuition that provides an environment
and framework for evaluating and incorporating
new experiences and information.
It originates and is applied in the minds of
knowers.
In organizations it often becomes embedded
not only in documents and repositories but
also in organizational routines, processes,
practices and norms.
Wisdom is the ability to increase effectiveness.
Wisdom adds value, which requires the mental
function that we call judgment.
The ethical and aesthetic values that this
implies are inherent to the actor and are
unique and personal.
Example 1 - Imagine the string “WifiPassword”. The string alone is data. Understanding
that it is a string is information. Knowing it is your wifi password is knowledge. And
using it to access your wireless network is wisdom.
Example 2 - Fitness tracking devices collect your health and activity data, but your end
goal is to use that to make decisions about how to train or how to manage your health.
Data: The smartwatch collects raw data such as the number of steps taken, heart rate, and
sleep duration.
Information: The smartwatch app organizes and structures the data, displaying it in a
comprehensible format, such as daily step count, average heart rate, and hours of sleep
per night.
Knowledge: Analyzing and interpreting the information may reveal patterns, such as
increased step count leading to improved sleep quality or a correlation between heart rate
and workout intensity.
Wisdom: Understanding these patterns lets you make informed decisions about adjusting
your exercise routine, sleep habits, and other lifestyle factors to improve your health and
fitness.
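The fitness-tracker example can be sketched in a few lines of Python. This is only an illustration: the readings are made up, and the variable names (`daily_steps`, `sleep_hours`, and so on) are assumptions, not part of any real tracker's API.

```python
from statistics import mean

# Data: raw readings as a tracker might record them (values are made up).
daily_steps = [4200, 8100, 10350, 6900, 12050, 3900, 9500]
sleep_hours = [6.1, 7.0, 7.8, 6.5, 8.0, 5.9, 7.4]

# Information: the raw readings organized into a readable summary.
avg_steps = mean(daily_steps)
avg_sleep = mean(sleep_hours)

# Knowledge: look for a pattern, here sleep on more active vs. less active days.
active = [s for steps, s in zip(daily_steps, sleep_hours) if steps >= avg_steps]
inactive = [s for steps, s in zip(daily_steps, sleep_hours) if steps < avg_steps]
sleeps_better_when_active = mean(active) > mean(inactive)

# Wisdom is the decision a person makes with that pattern; code can only report it.
if sleeps_better_when_active:
    print("In this sample, more active days go with longer sleep.")
```

A week of readings is far too little to draw a real conclusion; the point is only to show where each level of the pyramid appears.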
Database
… is a shared, integrated computer structure that stores a
collection of:
End-user data - Raw facts of interest to end user
Metadata - Data about data, through which the end-user data are integrated
and managed
Describes data characteristics and relationships
For example, the metadata component stores information such as the name
of each data element, the type of values (numeric, dates, or text) stored on
each data element, and whether the data element can be left empty.
The metadata provides information that complements and expands the value
and use of the data.
In short, metadata presents a more complete picture of the data in the
database.
Given the characteristics of metadata, you might hear a database described as
a “collection of self-describing data.”
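As a minimal sketch of what "self-describing" means in practice, the following uses Python's built-in `sqlite3` module. The `customer` table and its columns are hypothetical; the point is that the database can report each element's name, type, and whether it may be left empty.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,   -- cannot be left empty
        signup_date TEXT,            -- may be left empty
        balance     REAL
    )
    """
)
conn.execute(
    "INSERT INTO customer (name, signup_date, balance) VALUES (?, ?, ?)",
    ("Ada Lopez", "2024-01-15", 120.5),
)

# End-user data: the raw facts of interest to the end user.
rows = conn.execute("SELECT name, balance FROM customer").fetchall()

# Metadata: SQLite's PRAGMA table_info returns one row per column as
# (cid, name, type, notnull, dflt_value, pk).
metadata = [
    {"name": col[1], "type": col[2], "required": bool(col[3])}
    for col in conn.execute("PRAGMA table_info(customer)")
]

for column in metadata:
    print(column)
```

Other DBMSs expose the same kind of information through their own catalog or information-schema views.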
Database management system (DBMS)
is a collection of programs that manages the database structure and
controls access to the data stored in the database.
Roles of the DBMS:
Intermediary between the user and the database,
Enables data to be shared,
Presents the end user with an integrated view of the data,
Receives and translates application requests into operations
required to fulfill the requests,
Hides database’s internal complexity from the application
programs and users.
The DBMS Manages the Interaction between the
End User and the Database
Advantages of the DBMS
• Better data integration (how actions in one segment of the company
affect other segments) and less data inconsistency
– Data inconsistency: Different versions of the same data appear in different
places
For example, when a company’s sales department stores a sales
representative’s name as Bill Brown and the company’s personnel department
stores that same person’s name as William B. Brown, or
When the company’s regional sales office shows the price of a product as
$45.95, and its national sales office shows the same product’s price as
$43.95.
• Increased end-user productivity
– The availability of data, combined with the tools that transform data into
usable information, empowers end users to make quick, informed decisions
that can make the difference between success and failure in the global
economy.
• Improved:
Data sharing
Data security
Data access
The DBMS makes it possible to produce quick answers to ad hoc
queries.
From a database perspective, a query is a specific request for data
manipulation (for example, to read or update the data) issued to the
DBMS, typically as SQL code.
Decision making
Data quality: Promoting accuracy, validity, and timeliness of data
Types of Databases
Over the years, as technology and innovative uses of databases have evolved,
different methods have been used to classify databases.
For example, databases can be classified by the number of users supported, where
the data is located, the type of data stored, the intended data usage, and the degree
to which the data is structured.
By the number of users:
• Single-user database: Supports one user at a time
– Desktop database: Runs on a PC
• Multiuser database: Supports multiple users at the same time
– Workgroup database: Supports a small number of users (fewer than 50) or a
specific department
– Enterprise database: Supports many users (more than 50) across many
departments
By location:
• Centralized database: Data is located at a single site
• Distributed (decentralized) database: Data is distributed
across different sites
• Cloud database: Created and maintained using cloud data services
(such as Microsoft Azure or Amazon AWS) that provide defined performance
measures for the database.
By the type of data stored:
• General-purpose databases: Contain a wide variety of data used
in multiple disciplines
– For example, a census database that contains general demographic data and the
LexisNexis and ProQuest databases that contain newspaper, magazine, and
journal articles for a variety of topics.
• Discipline-specific databases: Contain data focused on specific
subject areas
– Examples of discipline-specific databases are financial data stored in databases
such as CompuStat or CRSP (Center for Research in Security Prices),
geographic information system (GIS) databases that store geospatial and other
related data, and medical databases that store confidential medical history data.
The most popular categorization:
• Operational database: Designed to support a company’s day-to-day
operations.
– Also known as an online transaction processing (OLTP)
database, transactional database, or production database.
• Analytical database: Stores historical data and business metrics
used exclusively for tactical or strategic decision making.
– … allows the end user to perform advanced analysis of business
data using sophisticated tools for pricing decisions, sales
forecasts, market strategies, and so on.
– «data massaging» = data manipulation for information
production.
Analytical databases comprise two main components: a data
warehouse and an online analytical processing front end.
Data warehouse: Stores data in a format optimized for
decision support.
… contains historical data obtained from the operational databases as well
as data from other external sources.
Online analytical processing (OLAP)
… is a set of tools that work together to provide an advanced data analysis
environment for retrieving, processing, and modeling data from the data
warehouse.
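The idea of analyzing historical data along different dimensions can be sketched with an in-memory SQLite table. This is an illustration only: real data warehouses and OLAP tools are far more elaborate, and the `sales_history` table, its columns, and its figures are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Stand-in for warehouse contents: historical facts copied from operational systems.
conn.execute(
    "CREATE TABLE sales_history (year INTEGER, region TEXT, product TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales_history VALUES (?, ?, ?, ?)",
    [
        (2022, "North", "Widget", 100.0),
        (2022, "South", "Widget", 150.0),
        (2023, "North", "Widget", 120.0),
        (2023, "North", "Gadget", 80.0),
        (2023, "South", "Gadget", 50.0),
    ],
)

# Analysis at a detailed level: total sales per year and region.
by_year_region = conn.execute(
    "SELECT year, region, SUM(amount) FROM sales_history "
    "GROUP BY year, region ORDER BY year, region"
).fetchall()

# "Rolling up" to a coarser level: total sales per year only.
by_year = conn.execute(
    "SELECT year, SUM(amount) FROM sales_history GROUP BY year ORDER BY year"
).fetchall()

print(by_year_region)
print(by_year)
```

The same facts answer different questions depending on which dimensions (year, region, product) the analyst groups by.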
This graphic illustrates the concept
of OLAP.
In recent times, the area of database application has grown in importance and usage, to the point
that it has evolved into its own discipline: business intelligence.
Business intelligence (BI): Captures and processes business data to generate information
that supports decision making.
The lack of a skilled and proficient workforce is one of the biggest challenges most
organizations face when implementing BI tools, and hence acts as a major constraint on the
growth of this market.
By the degree of data structure:
• Unstructured data: Exists in its original (raw) state and
therefore does not lend itself to the processing that yields
information.
• Structured data: Results from formatting
– Structure is applied based on the type of processing to be
performed
– For example, the data value 37890 might refer to a zip code, a sales value, or a
product code. If this value represents a zip code or a product code and is stored
as text, you cannot perform mathematical computations with it. On the
other hand, if this value represents a sales transaction, it must be formatted
as numeric.
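The 37890 example can be tried directly in Python. This is a small sketch; the variable names are illustrative.

```python
# The same digits stored two ways.
zip_code = "37890"      # an identifier: stored as text
sale_amount = 37890     # a quantity: stored as a number

# Arithmetic is meaningful for the numeric value.
total_with_second_sale = sale_amount + 110

# Arithmetic on the text value is rejected by Python.
try:
    zip_code + 1
    zip_supports_math = True
except TypeError:
    zip_supports_math = False

# Storing an identifier as a number can also lose information:
# a code with a leading zero does not survive the conversion.
leading_zero_code = "02134"
as_number = int(leading_zero_code)

print(total_with_second_sale, zip_supports_math, as_number)
```

The choice of structure follows from the intended processing: values you compute with are stored as numbers, values you only look up or display are stored as text.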
Database Design
A problem that has evolved with the use of personal productivity tools such as
spreadsheets and desktop database programs is that users typically lack proper data-
modeling and database design skills.
People naturally have a “narrow” view of the data in their environment.
Database design refers to the activities that focus on the design of the database
structure that will be used to store and manage end-user data.
Even a good DBMS will perform poorly with a badly designed database.
Data is one of an organization’s most valuable assets.
Because current-generation DBMSs are easy to use, an unfortunate side effect is that
many computer-savvy business users gain a false sense of confidence in their ability
to build a functional database.
A well-designed database facilitates data management and generates accurate and valuable
information.
A poorly designed database causes difficult-to-trace errors.
Columns: ID, Enum, Name, Title, HireDate, Skill1, Skill1Date, Skill2, Skill2Date, Skill3, Skill3Date
Designing appropriate data repositories of integrated …
A better solution is…
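The slide does not spell out the better solution. One common alternative to the repeating Skill1/Skill2/Skill3 columns, offered here as an assumption rather than the author's design, is to split the data into separate employee, skill, and employee-skill tables. All table names, column names, and sample values below are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (
        emp_num   INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        title     TEXT,
        hire_date TEXT
    );
    CREATE TABLE skill (
        skill_id INTEGER PRIMARY KEY,
        name     TEXT NOT NULL UNIQUE
    );
    -- One row per (employee, skill) pair instead of SkillN columns.
    CREATE TABLE employee_skill (
        emp_num     INTEGER NOT NULL REFERENCES employee(emp_num),
        skill_id    INTEGER NOT NULL REFERENCES skill(skill_id),
        date_earned TEXT,
        PRIMARY KEY (emp_num, skill_id)
    );
    INSERT INTO employee VALUES (1, 'J. Doe', 'Analyst', '2020-03-01');
    INSERT INTO skill VALUES (1, 'SQL'), (2, 'Python'), (3, 'Excel'), (4, 'Tableau');
    INSERT INTO employee_skill VALUES
        (1, 1, '2020-06-01'), (1, 2, '2021-01-15'),
        (1, 3, '2021-09-30'), (1, 4, '2022-02-10');
    """
)

# A fourth skill needs no new column, only another row.
skills = [
    row[0]
    for row in conn.execute(
        """
        SELECT s.name
        FROM employee_skill AS es
        JOIN skill AS s ON s.skill_id = es.skill_id
        WHERE es.emp_num = 1
        ORDER BY s.name
        """
    )
]
print(skills)
```

With the flat layout, an employee with a fourth skill would force a structural change (Skill4, Skill4Date); here the structure stays fixed and only the data grows.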
Basic File Terminology
Data Anomaly
Data anomaly: Develops when not all the required changes
in the redundant data are made successfully.
Update Anomalies
Insertion Anomalies
Deletion Anomalies
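A small sketch of how redundancy produces update and deletion anomalies, using an in-memory SQLite table. The `orders_flat` table and its rows are made up for illustration; the customer's phone number is deliberately repeated on every order row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders_flat ("
    "order_id INTEGER PRIMARY KEY, customer TEXT, customer_phone TEXT)"
)
conn.executemany(
    "INSERT INTO orders_flat VALUES (?, ?, ?)",
    [
        (1, "Pat Rivera", "555-0100"),
        (2, "Pat Rivera", "555-0100"),   # redundant copy of the phone number
        (3, "Ann Lee", "555-0199"),
    ],
)

# Update anomaly: the phone number changes, but only one copy is updated.
conn.execute("UPDATE orders_flat SET customer_phone = '555-0142' WHERE order_id = 1")
phones = {
    row[0]
    for row in conn.execute(
        "SELECT customer_phone FROM orders_flat WHERE customer = 'Pat Rivera'"
    )
}
print(phones)  # two different numbers for the same person

# Deletion anomaly: removing Ann's only order also removes
# the only record of Ann and her phone number.
conn.execute("DELETE FROM orders_flat WHERE order_id = 3")
ann_rows = conn.execute(
    "SELECT COUNT(*) FROM orders_flat WHERE customer = 'Ann Lee'"
).fetchone()[0]
print(ann_rows)
```

An insertion anomaly is the mirror image: in this layout a new customer cannot be recorded until they place an order.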
Database Systems
Logically related data stored in a single logical data
repository
May be physically distributed among multiple storage
facilities
The DBMS eliminates most of the file system’s problems
Current generation DBMS software:
– Stores data structures, relationships between structures, and
access paths
– Defines, stores, and manages all access paths and components
The Database System Environment
The term database system refers to an organization of components that define and regulate
the collection, storage, management, and use of data within a database environment.
From a general management point of view, the database system is composed of five
major parts: hardware, software, people, procedures, and data.
Hardware. … refers to all the system’s physical
devices, including computers (PCs, tablets,
workstations, servers, and supercomputers), storage
devices, printers, network devices (hubs, switches,
routers, fiber optics), and all other devices.
People. This component includes all users of the
database system. Based on primary job functions, five
types of users can be identified in a database system:
system administrators (oversee the database system’s general
operations), database administrators (ensure that the
database is functioning properly), database designers
(design the database structure), system analysts and
programmers (design and implement the application
programs), and end users (the people who use the
application programs to run the organization’s daily
operations).
Procedures. Procedures are the instructions
and rules that govern the design and use of
the database system.
Data. The word data covers the collection of
facts stored in the database.
DBMS Functions
Data dictionary management
Data dictionary: Stores definitions of the data elements and their relationships. The DBMS provides
data abstraction, and it removes structural and data dependence from the system.
Security management
Enforces user security and data privacy.
Multiuser access control
Sophisticated algorithms ensure that multiple users can access the database
concurrently without compromising its integrity.
Database access languages and application programming interfaces
Query language: Lets the user specify what must be done without having to specify
how.
Structured Query Language (SQL): De facto query language and data access
standard supported by the majority of DBMS vendors.
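To illustrate "what, not how", the sketch below asks the same question twice: once as a SQL query, and once as hand-written Python that spells out each step. The `product` table and its contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (code TEXT PRIMARY KEY, description TEXT, price REAL)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [("P1", "Hammer", 14.40), ("P2", "Saw", 45.95), ("P3", "Drill", 109.99)],
)

# Declarative: state WHAT is wanted; the DBMS decides how to find and sort it.
declarative = conn.execute(
    "SELECT description FROM product WHERE price > ? ORDER BY price",
    (40,),
).fetchall()

# Procedural: spell out HOW, step by step (fetch everything, sort, filter).
all_rows = conn.execute("SELECT description, price FROM product").fetchall()
procedural = [
    (description,)
    for description, price in sorted(all_rows, key=lambda row: row[1])
    if price > 40
]

print(declarative)
print(procedural)
```

Both produce the same rows, but only the first leaves the access strategy to the DBMS.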
Management complexity
Vendor dependence