05 Database Management Systems

Database management systems address problems with traditional file-based data storage like data redundancy, inconsistent data, and difficulty accessing data. A DBMS provides a centralized, organized collection of data and a set of programs to access the data. It allows for more efficient data storage, retrieval, security, and consistency compared to traditional file-based systems. The relational model became popular and allowed for flexible querying of data through SQL. Modern databases now handle big data challenges through technologies like Hadoop and in-memory computing. Data warehouses further integrate heterogeneous data sources for business intelligence purposes.

Uploaded by

murtaza ali
Copyright: © All Rights Reserved

Database Management Systems

[Figure: DBMS, application program, end-user]
Problems
• Data redundancy and inconsistency
• Multiple file formats, duplication of information in different files
• Difficulty in accessing data
• Need to write a new program to carry out each new task
• Data isolation — multiple files and formats
• Time-consuming reporting processes
• Outdated data management technology
Solution
• In an age of nonorganic corporate growth where companies grow by
acquiring other companies, business firms quickly become a
collection of hundreds of databases, e-mail systems, personnel
systems, accounting systems, and manufacturing systems, none of
which can communicate with one another. Even if firms grow
organically without acquisitions it is common for separate
departments and divisions to have their own systems and databases.
Firms in this case suffer the same result: the firm becomes a
collection of systems that cannot share information.
• Replace the disparate systems with an enterprise system and a data
management system
Basics
• Data: Known facts that can be recorded and have an implicit meaning (Information).
• Database
• A collection of related data and its metadata, organized in a structured format for optimized
information management
• Database Management System (DBMS)
• A software package/system that facilitates the creation and maintenance of a computerized
database
• Database System
• An integrated system of hardware, software, people, procedures, and data that define and
regulate the collection, storage, management, and use of data within a database
environment
Database Management System
• Collection of interrelated
data
• Set of programs to access
the data
• DBMS provides an
environment that is both
convenient and efficient to
use.
• Databases touch all aspects
of our lives
Database System Environment
Timeline of Data Models

[Figure: timeline of data models from the 1960s to 2000+, showing file-based, hierarchical, network, relational, entity-relationship, object-oriented, and web-based models]
Manual File System
• To keep track of data
• Used tagged file folders in a filing cabinet
• Organized according to expected use
• e.g. file per customer
• Easy to create, but hard to
• locate data
• aggregate/summarize data
Computerized File System
• To accommodate the data growth and information need
• Manual file system structures were duplicated in the
computer
• Data Processing (DP) specialists wrote customized programs
to
• write, delete, update data (i.e. management)
• extract and present data in various formats (i.e. report)
File System
Database System vs. File System
Entity Relationship Model
• The E-R model can be expressed as a collection of entities, also called real-world
objects, and the relationships between those entities.
• No two entities should be identical.
• Based on Entity, Attributes & Relationships
• Entity is a thing about which data are to be collected and stored
• e.g. EMPLOYEE
• Attributes are characteristics of the entity
• e.g. SSN, last name, first name
• Relationships describe associations between entities
• i.e. one-to-one (1:1), one-to-many (1:M), many-to-many (M:N)
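
The three building blocks above can be sketched in code. This is only an illustration, not part of the original slides: the EMPLOYEE entity and its attributes come from the bullets, while the DEPARTMENT entity, the "employs" relationship, and all sample values are assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Employee:
    """EMPLOYEE entity; each field is one attribute."""
    ssn: str          # identifying attribute, so no two employees are identical
    last_name: str
    first_name: str


@dataclass
class Department:
    """Hypothetical DEPARTMENT entity, added only to show a relationship."""
    name: str
    # 1:M relationship "employs": one department, many employees
    employees: list[Employee] = field(default_factory=list)


# Made-up sample data
sales = Department("Sales")
sales.employees.append(Employee("000-00-0001", "Long", "Kyle"))
sales.employees.append(Employee("000-00-0002", "Sturm", "David"))

print([e.last_name for e in sales.employees])
```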
Relationships
• Connect two or more entity sets.
• Represented by diamonds.

[Figure: entity sets Students and Courses connected by the relationship Taking]

E-R Diagram
• Entity
• represented by a rectangle
with its name in capital
letters.

• Relationships
• represented by an active
or passive verb inside the
diamond that connects
the related entities.
Relational Database
Provides a logical “human-level” view of the data and
associations among groups of data (i.e., tables)

Customer_ID  Customer_Account  Agent_ID
1224         4556              23
1225         4558              25

Agent_ID  Last_Name  First_Name  Phone
23        Sturm      David       334-5678
25        Long       Kyle        556-3421

Customer_ID  Last_Name  First_Name  Phone     Account_Balance
1224         Vira       Dyne        678-9987  1223.95
1225         Davies     Tricia      556-3342  234.25
Relational Database tables for the entities SUPPLIER and PART showing how they represent each entity
and its attributes. Supplier Number is a primary key for the SUPPLIER table and a foreign key for the
PART table.

–Rows (tuples): Records, one for each instance of an entity
–Fields (columns): Each represents an attribute of the entity
–Key field: Field used to uniquely identify each record
–Primary key: The field in a table that serves as its key field
–Foreign key: Primary key of one table used in a second table as a look-up field to identify records from the original table
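
A minimal sketch of how primary and foreign keys behave, using Python's built-in sqlite3 module. The SUPPLIER and PART tables and the Supplier_Number key come from the caption above; the other column names and all sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite checks foreign keys only when this pragma is switched on
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("""
    CREATE TABLE SUPPLIER (
        Supplier_Number INTEGER PRIMARY KEY,   -- primary key
        Supplier_Name   TEXT NOT NULL
    )""")
conn.execute("""
    CREATE TABLE PART (
        Part_Number     INTEGER PRIMARY KEY,
        Part_Name       TEXT NOT NULL,
        -- foreign key: must match an existing SUPPLIER row
        Supplier_Number INTEGER NOT NULL REFERENCES SUPPLIER(Supplier_Number)
    )""")

conn.execute("INSERT INTO SUPPLIER VALUES (1, 'Supplier A')")
conn.execute("INSERT INTO PART VALUES (137, 'Part 137', 1)")

# A part that points at a supplier that does not exist is refused
try:
    conn.execute("INSERT INTO PART VALUES (150, 'Part 150', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

part_count = conn.execute("SELECT COUNT(*) FROM PART").fetchone()[0]
print("Orphan part rejected:", rejected, "| parts stored:", part_count)
```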
Relational Database
• Advantages
• Structural independence
• Separation of database design and physical data storage/access
• Easier database design, implementation, management, and use
• Ad hoc query capability with Structured Query Language (SQL)
• The DBMS translates SQL queries into the operations needed to retrieve the data

• Disadvantages
• Substantial hardware and system software overhead
• more complex system
• Poor design and implementation is made easy
• ease-of-use allows careless use of RDBMS
EXAMPLE OF AN SQL QUERY

SQL statements for a query to select suppliers for parts 137 or 150.
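
The slide's query image is not reproduced here. The following is a hedged reconstruction of what such a query could look like, run through Python's sqlite3 module. Only the table names, Supplier_Number, and the part numbers 137 and 150 come from the text; the remaining column names and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SUPPLIER (Supplier_Number INTEGER PRIMARY KEY, Supplier_Name TEXT);
    CREATE TABLE PART (Part_Number INTEGER PRIMARY KEY, Part_Name TEXT,
                       Supplier_Number INTEGER REFERENCES SUPPLIER(Supplier_Number));
    -- made-up sample rows
    INSERT INTO SUPPLIER VALUES (1, 'Supplier A'), (2, 'Supplier B'), (3, 'Supplier C');
    INSERT INTO PART VALUES (137, 'Part 137', 1), (145, 'Part 145', 2), (150, 'Part 150', 3);
""")

# Join PART to SUPPLIER through the shared key, then keep parts 137 and 150
query = """
    SELECT PART.Part_Number, PART.Part_Name,
           SUPPLIER.Supplier_Number, SUPPLIER.Supplier_Name
    FROM PART
    JOIN SUPPLIER ON PART.Supplier_Number = SUPPLIER.Supplier_Number
    WHERE PART.Part_Number IN (137, 150)
    ORDER BY PART.Part_Number
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```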
MICROSOFT ACCESS DATA DICTIONARY FEATURES
Microsoft Access has a rudimentary data dictionary capability that displays
information about the size, format, and other characteristics of each field
in a database. Displayed here is the information maintained in the SUPPLIER
table. The small key icon to the left of Supplier_Number indicates that it is a key field.
Designing Databases
• Conceptual (logical) design: abstract model from business perspective
• Physical design: how database is arranged on direct-access storage devices
AN UNNORMALIZED RELATION FOR ORDER

•Normalization
–Streamlining complex groupings of data to
minimize redundant data elements
NORMALIZED TABLES CREATED FROM ORDER

The Order table has been broken down into four smaller, related tables.
Order table contains only two unique attributes, Order Number and Order Date.
The multiple items ordered are stored using the Line_Item table.
Normalization means that very little data has to be duplicated when creating orders; most of
the information can be retrieved by using keys to the Part and Supplier tables.
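
A rough sketch of the four normalized tables and of reassembling an order through keys, again with sqlite3. The table roles follow the description above; the table is called ORDERS because ORDER is a reserved word in SQL, and the column names beyond Order Number and Order Date, plus every sample value, are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SUPPLIER (Supplier_Number INTEGER PRIMARY KEY, Supplier_Name TEXT);
    CREATE TABLE PART (Part_Number INTEGER PRIMARY KEY, Part_Name TEXT, Unit_Price REAL,
                       Supplier_Number INTEGER REFERENCES SUPPLIER(Supplier_Number));
    CREATE TABLE ORDERS (Order_Number INTEGER PRIMARY KEY, Order_Date TEXT);
    -- one row per item on an order; the pair of keys identifies the row
    CREATE TABLE LINE_ITEM (Order_Number INTEGER REFERENCES ORDERS(Order_Number),
                            Part_Number  INTEGER REFERENCES PART(Part_Number),
                            Quantity     INTEGER,
                            PRIMARY KEY (Order_Number, Part_Number));
    -- made-up sample rows
    INSERT INTO SUPPLIER VALUES (1, 'Supplier A');
    INSERT INTO PART VALUES (137, 'Part 137', 2.5, 1), (150, 'Part 150', 4.0, 1);
    INSERT INTO ORDERS VALUES (1001, '2024-01-15');
    INSERT INTO LINE_ITEM VALUES (1001, 137, 2), (1001, 150, 1);
""")

# Part and supplier details are stored once and looked up by key
rows = conn.execute("""
    SELECT o.Order_Number, o.Order_Date, p.Part_Name, li.Quantity, s.Supplier_Name
    FROM ORDERS o
    JOIN LINE_ITEM li ON li.Order_Number = o.Order_Number
    JOIN PART p       ON p.Part_Number = li.Part_Number
    JOIN SUPPLIER s   ON s.Supplier_Number = p.Supplier_Number
    ORDER BY p.Part_Number
""").fetchall()
for row in rows:
    print(row)
```

Adding a second order for the same parts would add rows only to ORDERS and LINE_ITEM; the part and supplier names would not be repeated.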
Big data
• Massive sets of unstructured/semi-
structured data from Web traffic,
social media, sensors, and so on
• Petabytes, exabytes of data
• Volumes too great for typical DBMS
• Volume is increasing exponentially.
• Variety (complexity): many different types and formats of data
• Velocity: data needs to be processed fast
A Single View to the Customer

[Figure: data sources combined into a single view of "Our Customer": social media, banking and finance, gaming, known history, purchases, entertainment]
Big Data needs speed
• Velocity refers to the frequency of incoming data that must be
processed. Think text messages, Facebook status updates, credit card
swipes, the multitude of sensors in modern cars, and the stock
exchange.
• Late decisions → missing opportunities
• Examples
• E-Promotions: Based on your current location, your purchase history, what you like → send
promotions right now for the store next to you
• Healthcare monitoring: sensors monitoring your activities and body → any abnormal
measurements require immediate reaction
Big Data Opportunities
Business intelligence infrastructure
• Contemporary tools:
• Data warehouses
• Data marts
• Hadoop
• In-memory computing
Data warehouses
Problem: Heterogeneous Information Sources leading to:
Different interfaces
Different data representations
Duplicate and inconsistent information

“Heterogeneities are everywhere”


[Figure: heterogeneous sources: personal databases, World Wide Web, scientific databases, digital libraries]
Data warehouses
Solution: Unified Access to Data that:
• Collects and combines information
• Provides integrated view, uniform user interface
• Supports sharing

[Figure: an integration system sitting on top of the World Wide Web, personal databases, digital libraries, and scientific databases]
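
The idea of collecting differently shaped records into one integrated view can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not how a real data warehouse is built: the two sources, their field layouts, and the helper names from_crm and from_billing are all hypothetical.

```python
# Two hypothetical sources that describe the same customers differently
crm_rows = [{"CustomerID": 1224, "Name": "Dyne Vira"}]
billing_rows = [("1224", "VIRA, DYNE", 1223.95), ("1225", "DAVIES, TRICIA", 234.25)]


def from_crm(row):
    """Convert a CRM record to the common format (no balance available)."""
    return {"customer_id": int(row["CustomerID"]), "name": row["Name"], "balance": None}


def from_billing(row):
    """Convert a billing record: text ID, 'LAST, FIRST' name, numeric balance."""
    cid, name, balance = row
    last, first = [part.strip().title() for part in name.split(",")]
    return {"customer_id": int(cid), "name": f"{first} {last}", "balance": balance}


# Combine both sources, keeping one record per customer ID
warehouse = {}
for record in [from_crm(r) for r in crm_rows] + [from_billing(r) for r in billing_rows]:
    merged = warehouse.setdefault(record["customer_id"], record)
    for key, value in record.items():
        if merged.get(key) is None:   # fill in values the earlier source lacked
            merged[key] = value

for customer in warehouse.values():
    print(customer)
```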
Data marts
• Subset of data warehouse
• Summarized or focused portion of data for use by specific
population of users
• Typically focuses on single subject or line of business
Hadoop
• Software platform that lets one easily write and run applications that
process vast amounts of data. It includes:
– MapReduce – distributes the application's processing
– HDFS – Hadoop distributed file system: distributes data
– HBase – online data access
• Open-source framework that was created to make it easier to work
with big data.
• Hadoop also is often used interchangeably with big data, but it
shouldn’t be. Hadoop is a framework for working with big data. It is
part of the big data ecosystem.
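
To give a feel for the MapReduce idea, here is a word count written as map, shuffle, and reduce steps in plain Python. It runs on one machine and does not use Hadoop or any Hadoop API; the function names and sample documents are assumptions chosen for the sketch.

```python
from collections import defaultdict

documents = ["big data needs speed", "big data needs storage"]


def map_phase(doc):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word, 1) for word in doc.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


mapped = [pair for doc in documents for pair in map_phase(doc)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(counts)
```

In a real cluster the map and reduce calls would run on many machines in parallel, each working on its own share of the data.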
Hadoop
In-memory computing
• Used in big data analysis
• Uses the computer's main memory (RAM) for data storage to avoid delays
in retrieving data from disk storage
• Can reduce hours/days of processing to seconds
• Storage is done on dedicated servers.
Data Mining
1. Collect Big Data or obtain access to a repository.
2. Perform data analysis to explore patterns (pattern recognition, predictive
analytics).
3. Identify potential correlations.
4. Infer rules to predict future behavior.
• Types of information obtainable from data mining:
• Associations
• Sequences
• Classification
• Clustering
• Forecasting
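
As a small illustration of the "associations" type, the sketch below counts how often pairs of items appear together in the same basket and reports each pair's support (the share of baskets containing both). The baskets are made-up data, and this simple counting stands in for the more elaborate algorithms used in practice.

```python
from collections import Counter
from itertools import combinations

# Made-up shopping baskets
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

# Count every pair of items that occurs together in a basket
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Support = fraction of all baskets that contain the pair
support = {pair: n / len(baskets) for pair, n in pair_counts.items()}
for pair, value in sorted(support.items()):
    print(pair, value)
```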
Text mining and Web Mining
• Text Mining:
• Extracts key elements from large unstructured data sets
• Mines e-mails, blogs, social media to detect opinions

• Web Mining:
• Discovery and analysis of useful patterns and information from the Web
• Understand customer behavior
• Evaluate effectiveness of a Web site, and so on
• Web content mining
• Mines content of Web pages
• Web structure mining
• Analyzes links to and from Web page
• Web usage mining
• Mines user interaction data recorded by Web server
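
Web usage mining can be sketched as counting what a server log records. The three-field log format below (client address, method, path) is a simplified assumption for illustration; real Web server logs carry more fields and need a proper parser.

```python
from collections import Counter

# Hypothetical, simplified log lines: client address, HTTP method, requested path
log_lines = [
    "10.0.0.1 GET /products",
    "10.0.0.2 GET /products",
    "10.0.0.1 GET /checkout",
]

# How often was each page requested?
page_hits = Counter(line.split()[2] for line in log_lines)

# How many distinct clients requested each page?
visitors_per_page = {}
for line in log_lines:
    ip, _method, path = line.split()
    visitors_per_page.setdefault(path, set()).add(ip)

for path, hits in page_hits.most_common():
    print(path, hits, len(visitors_per_page[path]))
```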
Databases and the Web
Many companies use the Web to make some internal databases available to customers
or partners
• Advantages of using Web for database access:
• Ease of use of browser software
• Web interface requires few or no changes to database
• Inexpensive to add Web interface to system
Questions from Business
