5.1 DB2 Introduction

The document provides an introduction to DB2 and database concepts including DBMS, relational model, data normalization, and data integrity. It compares databases to file systems, discusses key database terms and features including data independence, security, and integrity. It also covers database models like hierarchical, network and relational as well as relational components and constraints.

Uploaded by

Harini M

DB2

Introduction to DB2
Topics

• Introduction to DB2
o DBMS concepts
o Relational model
o E-R Diagrams
o Data normalization techniques
Introduction to DB2
Comparing databases with files

• Some limitations of file-based data storage
o Sharing the data in a file concurrently among many people
o Security of data
o Data redundancy
• Why a database?
o To solve some of the problems found in conventional file processing
o The database concept solves the problems of
• Duplication of data and effort
• Programmers needing to know the exact file structure
• Programs needing to be changed when the file structure changes
o Databases are used to meet four main data-related goals
• Increasing data independence
• Reducing data redundancy
• Increasing security
• Maintaining data integrity
Database (contd.)

• Information & Data
o When information is entered and stored in a computer, it is referred to as data.
o After the data is processed (for example, formatted or printed), the output can again be perceived as information.
o Information is something that can be understood in a context, whereas data is just characters.
• Database
o Collection of data that is organized in a way, so that it can easily be
accessed, managed, and modified.
• DBMS
o A database management system (DBMS) is a software application that
interacts with the user, other applications, and the database itself to
capture and analyze data.
o A general-purpose DBMS is designed to allow the definition, creation,
querying, update, and administration of databases.
Database

• Data independence
o In conventional file processing, if the format of the data is changed or reorganized, every program that depends on that data must also be changed.
o For example, if you add city and state to an address field, the program must be modified even though it may not need the newly added fields.
o In a database management system,
• the data and program are independent
• a change in data organization will not usually affect the user programs
• users do not have to worry about how the data is actually stored
• users can focus on the particular part of the data they want to access with an
application program
• user programs manipulate the data only, not the organization of the data
Database (contd.)

• Data redundancy
o In the conventional file system, the same data may be stored in many files to
meet needs of different applications.
• Therefore, some of the data is redundant.
o For example, with traditional file processing it is possible to find a customer’s
address repeated in many files.
• Customer Record System
• Purchase Order System
• Account Receivable System
o This redundancy in a conventional file processing system can cause problems when the address changes.
• The address must be modified in all of these files, which duplicates effort and creates more opportunity for update anomalies.
o In a database system, different applications share the same centralized data.
• A database system is able to reduce data redundancy by storing each element in
only one location.
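The update anomaly described above can be sketched in a few lines of Python. This is only an illustration: the three "files" are plain dictionaries, and names such as `customer_file`, `orders_file`, `receivables_file` and `address_for` are assumptions made for this sketch, not part of the original slides.

```python
# Illustrative only: three "files" that each keep their own copy of an address.
customer_file = {1234: {"address": "6, Smith Road"}}
orders_file = {1234: {"address": "6, Smith Road"}}
receivables_file = {1234: {"address": "6, Smith Road"}}

# File-based approach: a change applied to one file leaves the others stale.
customer_file[1234]["address"] = "10, New Street"
copies = {
    f[1234]["address"] for f in (customer_file, orders_file, receivables_file)
}
inconsistent = len(copies) > 1  # the copies now disagree

# Database approach: the address is stored once and every application reads it.
customer_table = {1234: {"address": "6, Smith Road"}}


def address_for(cust_no):
    """Every application looks the address up in the single shared table."""
    return customer_table[cust_no]["address"]


customer_table[1234]["address"] = "10, New Street"  # one update, seen everywhere
```

With a single stored copy there is nothing left to fall out of step, which is the sense in which a database reduces redundancy.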
Database (contd.)

• Data Security
o Databases often use a database security manager for protection.
o There are restrictions on how a user can access the data itself.
o Most databases have security features that authorize access to the data and place restrictions on the operations that may be performed on the data.
• Such features and restrictions enable the database to meet another of its goals – to increase data security.
• Data Integrity
o Everyone who uses the database is responsible for the integrity of the data.
o Maintaining integrity of the data is very important in a database system
because the database is shared by many users.
o Without integrity the data could become invalid or unreliable.
Database (contd.)

• Characteristics of a database
o Program-Data Independence
• Immunity of applications to change in the storage structure and access strategy of
data
o Database should be a collection of data which has no redundancy
o Data Integrity
• No inconsistency while maintaining data
o Data protection
o Dynamic definition
• e.g. add a column
Database (contd.)

• In a DBMS environment, there are two functionalities


o Design, develop and maintain the database
o Application program that uses the database
• DBMS functions
o Define data
o Read/Write data
o Write audit trail
o Provide recovery and backup
o Provide data security, data independence, multiple user access
o Avoid data redundancy
• Application user functions
o Logically access and process data
Database (contd.)

• Hierarchical model
o Top down structure
o Parent child relationship
o Used even today in mainframe applications
• example – IMS
• Network model
o No concept of parent/child
o Any record type can be associated with any number of arbitrary record
types
o Not a very successful model
• example – IDMS
• Relational model
o Data stored in the form of tables
• Consists of multiple rows and columns.
• examples – Oracle, DB2 (UDB), MySQL, SQL Server
Database (contd.)

• Problems with non-relational systems


o Programmer must know the database structure
o Limited flexibility for change
o Connections maintained by external pointers
o Many-to-many relationship problem
o Structures can be complex, making programming complex
Database (contd.)

• RDBMS
o A relational database is a system that is structured as sets of two-dimensional tables.
o DB2's simplicity comes from automatic navigation.
• Navigation simply means determining the path to the data.
o In some non-relational systems, when you want to retrieve data you must specify
• What data you want to retrieve
• How to find it
o In a relational database, users can concentrate on their business functions instead of wrestling with data processing problems, which makes them more productive.
o Users can easily interact with the system through queries.
o Ease of use and power of SQL.
• Concern only for WHAT data is needed, not HOW to get the data.
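The WHAT-versus-HOW distinction can be sketched as follows. The example uses SQLite through Python's standard `sqlite3` module as a stand-in for DB2; the `emp` table, its columns, the sample rows and both function names are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp (emp_no INTEGER PRIMARY KEY, name TEXT, workdept TEXT)"
)
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [(1, "Asha", "D01"), (2, "Ravi", "D02"), (3, "Meena", "D01")],
)


def names_in_dept_manual(rows, dept):
    """HOW: the program walks the records itself and applies the test."""
    result = []
    for emp_no, name, workdept in rows:
        if workdept == dept:
            result.append(name)
    return sorted(result)


def names_in_dept_sql(dept):
    """WHAT: state the wanted result; the DBMS chooses the access path."""
    cur = conn.execute(
        "SELECT name FROM emp WHERE workdept = ? ORDER BY name", (dept,)
    )
    return [row[0] for row in cur]


all_rows = conn.execute("SELECT emp_no, name, workdept FROM emp").fetchall()
```

Both functions return the same names, but only the first one has to change if the way the rows are stored or scanned changes.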
Database (contd.)

• Advantages of RDBMS
o Data independence (logical/physical)
o Avoiding data redundancy
o Provide multiple views
o Allow sharing of data
o Provide facilities to search quickly
o Provide good security & control over data
o Provide methods for easier backup & recovery
o Provide maximum concurrency without compromising on data integrity
Database (contd.)

• Relational terms
o Candidate key
• Some attribute (or a set of attributes) that may uniquely identify each row(tuple)
in the relation(table)
o Primary key
• The candidate key that is chosen to uniquely identify each row.
o Alternate key
• Candidate keys that were not chosen as the primary key are alternate keys.
o Foreign key
• An attribute (or a set of attributes) of one relation whose values match the primary key of another relation
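As a rough sketch of how these key terms map onto table definitions, the following uses SQLite through Python's `sqlite3` module in place of DB2. The `dept` and `emp` tables, the `email` column and the helper `violates` are hypothetical names chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE dept (
    dept_no TEXT NOT NULL PRIMARY KEY
);
CREATE TABLE emp (
    emp_no   INTEGER NOT NULL PRIMARY KEY,   -- candidate key chosen as primary key
    email    TEXT NOT NULL UNIQUE,           -- candidate key not chosen: alternate key
    workdept TEXT REFERENCES dept (dept_no)  -- foreign key to DEPT's primary key
);
INSERT INTO dept VALUES ('D01');
INSERT INTO emp VALUES (1, 'asha@example.com', 'D01');
""")


def violates(sql):
    """Return True if the statement is rejected by a key constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True
```

A duplicate `emp_no`, a duplicate `email`, or a `workdept` with no matching `dept` row should each be rejected, while a row that respects all three keys is accepted.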
Database (contd.)

• Data structure (Relational)


o All data in the relational model is organized into two-dimensional TABLES called RELATIONS.
o The rows of such tables are referred to as tuples.
o Columns are usually referred to as attributes of a relation.
• All entries in a column must be of the same kind (domain).
• Each column must have a distinct name.
• Ordering of columns is not significant (for SQL processing).
o Each row in a relation must be distinct
• i.e. duplicate rows are not allowed in a RELATION.
• NULL (no value) is not allowed in a primary key.
o Each column/row intersection(cell) in a relation should contain a single value
• i.e. values in a relation must be atomic.
o Sequence of rows is insignificant
Database (contd.)

• Relation
o In general tables are referred to as RELATIONS
o The relational approach to data is based on the realization that, files that
obey certain constraints may be considered as mathematical relations.
• Hence the term RELATION.
o A relational database is a collection of tables.
• Relational model components
o Data structure
o Data integrity
o Data manipulation
Database (contd.)

• Data integrity
o Refers to accuracy and validity of data in database.
o DB2’s security features ensure that data integrity is maintained and that
access to the data is authorized.
o Entity Integrity
• Primary key
o Declarative Integrity
• Declarations of columns
o Referential Integrity
• Foreign Key
Database (contd.)

• Entity Integrity
o A logical consequence of the definition of a primary key is that it guarantees
the uniqueness of each occurrence.
o No attribute participating in the primary key of a base relation is allowed to
accept NULLS.
o An entity with a primary key defined on it satisfies the entity integrity rules.
• Declarative integrity (also called domain integrity)
o User defined constraints for a column value that is enforced by the DBMS
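Entity integrity and declarative (domain) integrity can be illustrated with a small sketch. It uses SQLite via Python's `sqlite3` module rather than DB2, and the `book` table, its `CHECK` rule and the helper `rejected` are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE book (
    -- NOT NULL is spelled out because SQLite would otherwise accept NULL
    -- in this primary key column (a documented SQLite quirk).
    book_no    TEXT NOT NULL PRIMARY KEY,
    book_title TEXT NOT NULL,
    -- Domain rule declared on the column and enforced by the DBMS.
    book_price REAL NOT NULL CHECK (book_price > 0)
)
""")
conn.execute("INSERT INTO book VALUES ('7100', 'COBOL Handbook', 155.00)")


def rejected(row):
    """Return True if the DBMS refuses to insert the given row."""
    try:
        conn.execute("INSERT INTO book VALUES (?, ?, ?)", row)
        return False
    except sqlite3.IntegrityError:
        return True
```

A NULL key and a duplicate key break entity integrity, a negative price breaks the declared domain rule, and in each case the application does not have to perform the check itself.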
Database (contd.)

• Referential constraint
o A rule which enforces the relationship between a parent table and a child
table such that a non-null value of the foreign key must match a value of the
associated primary key.
• Referential Integrity
o The automatic enforcement of the referential constraint is called the
referential integrity.
o Referential integrity rules define what happens when a DELETE, INSERT or UPDATE operation would violate the references defined for a particular database.
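The effect of a referential constraint on INSERT, UPDATE and DELETE can be sketched as below. SQLite (through Python's `sqlite3` module) stands in for DB2 here; the `dept` and `emp` tables, the choice of `ON DELETE RESTRICT` as the delete rule, and the helper `allowed` are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE dept (dept_no TEXT NOT NULL PRIMARY KEY);   -- parent table
CREATE TABLE emp (                                       -- dependent table
    emp_no   INTEGER NOT NULL PRIMARY KEY,
    workdept TEXT REFERENCES dept (dept_no) ON DELETE RESTRICT
);
INSERT INTO dept VALUES ('D01');
INSERT INTO dept VALUES ('D02');
INSERT INTO emp VALUES (1, 'D01');
INSERT INTO emp VALUES (2, NULL);   -- a NULL foreign key needs no matching parent
""")


def allowed(sql):
    """Return True if the DBMS accepts the statement, False if it rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False
```

Inserting or updating a child row so that it points at a missing department is refused, and so is deleting a department that still has employees, while a department without dependents can be deleted.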
Data Modeling
Data Modeling (contd.)

• Primary key
o It identifies its own row
o It must be unique in the entire table.
o Cannot be NULL
o Only one Primary Key can be defined in a table
• Related tables
o A row in one table carries the KEY of another table
o It identifies a row of data.
• Foreign key
o It identifies the row of related data either in the same table or another table.
• The referenced column should be a primary key
• Examples
o EMP, DEPT
o CUSTOMER, ORDER, ORDER DETAILS, PRODUCT
Data Modeling (contd.)

• Table types
o The table containing the primary key which is related to a foreign key of a
dependent table is called parent table
o The table containing the foreign key is called dependent table
o The dependent of a dependent table is called descendant table
• Entity
o A business entity is something that is fundamental to the organization and
an individual instance or occurrence of this thing can be uniquely
identified.
o The relation between business entities can be
• One to One
• One to Many
• Many to Many
Data Modeling

• Business Modeling
o A business model is a formal representation of business information: its objects, the objects' properties or attributes, and the relationships between them
o Business modeling does not depend upon the underlying database
structure or a specific application.
• Instead the business model can span an entire enterprise or a division of the
organization
o It requires a solid understanding of the business and can serve as a
verification of the user's view of the business before the database is even
designed
Data Modeling (contd.)

• E-R Model
o Converting business entities to data entities means turning each business entity into a data entity and then determining the attributes or properties of each entity.
o E-R model is a logical representation of data for a business area
• Business entity becomes data entity
• Represented as entities, relationship between entities and attributes of both
relationships and entities.
• E-R models are conceptual data models expressed in the form of an E-R
Diagram
o Steps in modeling
• Identify Primary and Foreign Keys
• Resolve many to many relationships
• Normalize the data design
Data Modeling (contd.)

• Example:
o CUSTOMER places ORDERS
o ORDERS have PRODUCTS
• Each order relates to only one customer (many-to-one)
• A customer can place any number of orders (one-to-many)
• Many orders can contain many products (many-to-many)
o A product can be a part of many orders
o An order can have many products
• Customer, Order & Product are called ENTITIES.
• An Entity may transform into one or more tables
• The unique identity for information stored in an ENTITY is called a
PRIMARY KEY.
o e.g. CUSTOMER-ID uniquely identifies each customer
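One common way to resolve the many-to-many relationship between orders and products is an intermediate table that holds one row per order/product pair. The sketch below assumes SQLite via Python's `sqlite3` module instead of DB2; the table names (`orders` is used because ORDER is a reserved word), the sample IDs and the two query functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER NOT NULL PRIMARY KEY);
CREATE TABLE product  (product_id  INTEGER NOT NULL PRIMARY KEY);
CREATE TABLE orders (
    order_no    INTEGER NOT NULL PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer (customer_id)
);
-- Intermediate table: one row per (order, product) pair.
CREATE TABLE order_detail (
    order_no   INTEGER NOT NULL REFERENCES orders (order_no),
    product_id INTEGER NOT NULL REFERENCES product (product_id),
    PRIMARY KEY (order_no, product_id)
);
INSERT INTO customer VALUES (1);
INSERT INTO product VALUES (10), (20);
INSERT INTO orders VALUES (100, 1), (101, 1);
INSERT INTO order_detail VALUES (100, 10), (100, 20), (101, 10);
""")


def products_in_order(order_no):
    """An order can have many products."""
    cur = conn.execute(
        "SELECT product_id FROM order_detail WHERE order_no = ? ORDER BY product_id",
        (order_no,),
    )
    return [row[0] for row in cur]


def orders_with_product(product_id):
    """A product can be part of many orders."""
    cur = conn.execute(
        "SELECT order_no FROM order_detail WHERE product_id = ? ORDER BY order_no",
        (product_id,),
    )
    return [row[0] for row in cur]
```

The many-to-many relationship becomes two one-to-many relationships, each carried by an ordinary foreign key in `order_detail`.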
Data Modeling (contd.)

• Relationships
o For example, CUSTOMER is related to ORDERS through ‘ORDER_NO’, which is a foreign key in the CUSTOMER table and the primary key in the ORDER table.
o Relationships transform into foreign keys.
o As per entity integrity, the primary key, ORDER_NO, of the table ‘ORDER’ can never be NULL, while it can be NULL in the table ‘CUSTOMER’.
• Since ORDER_NO is not a primary key in the CUSTOMER table, it can be allowed to have nulls, though in practice this is rare
Data Modeling (contd.)

• E-R Diagrams
o There are many parts in a business model and there is no formal definition of which parts must exist for a model to be valid.
o It can be as simple as just naming the major business entities or objects and their relationships to each other.
o This is commonly manifested in an Entity Relationship Diagram
o An Entity Relationship diagram is a pictorial representation of the user's view of the business
Data Modeling (contd.)

• One to many relationship
o One Employee should work in only one department
[Figure: three EMP rows, each with a WORKDEPT value, all linked to a single DEPT]
Data Modeling (contd.)

• Many to one relationship
o One department can have many employees
[Figure: three EMP rows, each with a WORKDEPT value, all linked to a single DEPT]
Data Modeling (contd.)

• Many to many relationship
o Many employees work in many projects
[Figure: several EMP rows linked to several PROJ rows, with MANY marked at both ends]
Data Modeling (contd.)

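• An E-R Diagram
[Figure: E-R diagram with the entities DEPT, LOCATION, EMPL, PROJECT and SKILLS. Labeled relationships: one department has many locations (1:M); many employees work on many projects (M:M). The remaining connectors carry M cardinality labels.]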
Normalization
Normalization

• Data normalization
o Data Normalization is a step-by-step technique of analyzing data into its
constituent entities and attributes.
o Removes undesirable functional dependencies (partial and transitive) among attributes.
o Eliminates redundancy.
o Prevents update anomalies and data inconsistencies.
o Necessary for transactional data
• De-normalization
o Reverse of normalization
o Useful for 'read-only' data with heavy SQL SELECT access for reports
Normalization (contd.)

• There are many levels of normalization
o Unnormalized entity
o First normal form
• Remove repeating groups
o Second normal form
• Remove partial key dependencies
o Third normal form
• Remove indirect or transitive dependencies
o Higher levels of normalization lead to more complexity
• Trade off between redundancy and easier access
Normalization (contd.)

• Consider an entity PURCHASE-ORDER represented as follows
• PURCHASE-ORDER
o (order-no, cust-no, cust-address, order-date, delivery-date, {book-no, book-title, book-price, qty-ordered, price}, total-price)
o The attribute type order-no is the entity identifier.
o The braces signify that the attribute types enclosed in them form a repeating group.
Normalization (contd.)

• Un-normalized data

Order Number     : 1101
Customer Number  : 1234
Customer Address : 6, Smith Road, Chennai, 600 002
Order Date       : 20/03/2003
Delivery Date    : 20/04/2004

Book No  Book Title      Price   Qty  Total
7100     COBOL Handbook  155.00  2    310.00
3990     LOTUS 1-2-3      80.00  5    400.00
3667     dBASE-IV        100.00  4    400.00
                         Order total: 1110.00
Normalization (contd.)

• First normal form
o Remove any repeating groups of attribute types.
o Rewrite them as new entities.
o In order to link the derived entities to the original entity, the identifier of the original entity must be part of the derived entities.
o Here the repeating group is
{book-no, book-title, book-price, qty-ordered, price}
and order-no is added to the derived entity as the link back to PURCHASE-ORDER.
o The 1NF of the entity PURCHASE-ORDER now becomes:
• PURCHASE-ORDER (order-no, cust-no, cust-address, order-date, delivery-date, total-price)
• PURCHASE-ITEM (order-no, book-no, book-title, book-price, qty-ordered, price)
Normalization (contd.)

• First normal form

PURCHASE-ORDER
ORDER-NO  CUST-NO  CUST-ADDRESS     ORDER-DATE  DELIVERY-DATE  TOTAL-PRICE
1101      1234     6, Smith Road    20/03/2003  20/04/2004     1110.00
1103      1254     7, ITC           12/12/2004  18/12/2004      550.00
1104      1789     10/A UK Council  01/04/2004  08/04/2004      100.00

PURCHASE-ITEM
ORDER-NO  BOOK-NO  BOOK-TITLE      BOOK-PRICE  QTY  PRICE
1101      7100     COBOL Handbook  155.00      2    310.00
1101      3990     LOTUS 1-2-3      80.00      5    400.00
1101      3667     dBASE-IV        100.00      4    400.00
1103      7100     COBOL Handbook  155.00      2    310.00
1103      3990     LOTUS 1-2-3      80.00      3    240.00
1104      3667     dBASE-IV        100.00      1    100.00
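The move from the unnormalized order to first normal form can be sketched in plain Python. The data is order 1101 from the unnormalized form; the dictionary layout and the function name `to_first_normal_form` are assumptions made for this illustration.

```python
# Unnormalized order: header fields plus a repeating group of book lines.
unnormalized_order = {
    "order_no": 1101,
    "cust_no": 1234,
    "cust_address": "6, Smith Road, Chennai, 600 002",
    "order_date": "20/03/2003",
    "delivery_date": "20/04/2004",
    "items": [
        {"book_no": 7100, "book_title": "COBOL Handbook", "book_price": 155.00, "qty": 2},
        {"book_no": 3990, "book_title": "LOTUS 1-2-3", "book_price": 80.00, "qty": 5},
        {"book_no": 3667, "book_title": "dBASE-IV", "book_price": 100.00, "qty": 4},
    ],
}


def to_first_normal_form(order):
    """Split the repeating group into its own entity, linked by order_no."""
    items = [
        {
            "order_no": order["order_no"],  # identifier of the original entity
            **item,
            "price": item["book_price"] * item["qty"],
        }
        for item in order["items"]
    ]
    # The header keeps every attribute except the repeating group.
    header = {key: value for key, value in order.items() if key != "items"}
    header["total_price"] = sum(item["price"] for item in items)
    return header, items


purchase_order, purchase_items = to_first_normal_form(unnormalized_order)
```

Each derived PURCHASE-ITEM row carries `order_no`, so the pair (order_no, book_no) identifies it, and the line prices add up to the 1110.00 shown on the unnormalized order form.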
Normalization (contd.)

• Second normal form


o For an entity type to be in 2NF, it must already be in 1NF and every non-identifying attribute type must be fully functionally dependent on the identifier.
o Consider the PURCHASE-ITEM table.
• Order-no and Book-no constitute the composite primary key.
• But Book-title and Book-price do not depend on Order-no.
• Book-title and Book-price depend only on Book-no.
• So this partial dependency has to be eliminated
o PURCHASE-ORDER has only one attribute type as identifier and is therefore already in 2NF.
o In PURCHASE-ITEM, Book-title & Book-price depend on only part of the identifier (Book-no), so PURCHASE-ITEM is not in 2NF.
Normalization (contd.)

• Second normal form


o PURCHASE-ORDER
• (order-no, cust-no, cust-address, order-date, delivery-date, total-price)
o PURCHASE-ITEM
• (order-no, book-no, qty-ordered, price)
o BOOK
• (book-no, book-title, book-price)
Normalization (contd.)
PURCHASE-ORDER
ORDER-NO  CUST-NO  CUST-ADDRESS     ORDER-DATE  DELIVERY-DATE  TOTAL-PRICE
1101      1234     6, Smith Road    20/03/2003  20/04/2004     1110.00
1103      1254     7, ITC           12/12/2004  18/12/2004      550.00
1104      1789     10/A UK Council  01/04/2004  08/04/2004      100.00

PURCHASE-ITEM
ORDER-NO  BOOK-NO  QTY  PRICE
1101      7100     2    310.00
1101      3990     5    400.00
1101      3667     4    400.00
1103      7100     2    310.00
1103      3990     3    240.00
1104      3667     1    100.00

BOOK
BOOK-NO  BOOK-TITLE      BOOK-PRICE
7100     COBOL Handbook  155.00
3990     LOTUS 1-2-3      80.00
3667     dBASE-IV        100.00
8980     DB2             300.00
9854     CICS            400.00
2786     JCL Ranade      450.00
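The 2NF step can be sketched as a small Python function that moves the attributes depending only on book-no into a separate BOOK entity. The rows are a subset of the PURCHASE-ITEM data, and the function name `to_second_normal_form` and the tuple layout are assumptions for this illustration.

```python
# PURCHASE-ITEM rows in 1NF:
# (order_no, book_no, book_title, book_price, qty, price)
purchase_item_1nf = [
    (1101, 7100, "COBOL Handbook", 155.00, 2, 310.00),
    (1101, 3990, "LOTUS 1-2-3", 80.00, 5, 400.00),
    (1101, 3667, "dBASE-IV", 100.00, 4, 400.00),
    (1103, 7100, "COBOL Handbook", 155.00, 2, 310.00),
    (1104, 3667, "dBASE-IV", 100.00, 1, 100.00),
]


def to_second_normal_form(rows):
    """Move attributes that depend only on book_no into a BOOK entity."""
    book = {}
    purchase_item = []
    for order_no, book_no, title, book_price, qty, price in rows:
        # book_no -> (title, book_price) must hold, or the split would lose data.
        if book.setdefault(book_no, (title, book_price)) != (title, book_price):
            raise ValueError(f"book {book_no} has conflicting title/price")
        # What remains depends on the full key (order_no, book_no).
        purchase_item.append((order_no, book_no, qty, price))
    return purchase_item, book


purchase_item_2nf, book = to_second_normal_form(purchase_item_1nf)
```

After the split each title and price is stored once per book, so a price change touches a single BOOK row instead of every order line for that book.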
Normalization (contd.)

• Third normal form


o Determine whether any transitive dependencies, i.e. dependencies
between non-identifying attributes, exist.
o An entity type is in 3NF if it is already in 2NF and no transitive
dependencies exist.
o In the PURCHASE-ORDER table, cust-address does not depend directly on order-no. It depends on cust-no only.
• So in third normal form this indirect dependency must be eliminated
Normalization (contd.)

• Third normal form


o CUSTOMER
• (cust-no, cust-address)

o PURCHASE- ORDER
• (order-no, cust-no, order-date, delivery- date, total-price)

o PURCHASE-ITEM
• (order-no, book-no, qty-ordered, price)

o BOOK
• (book-no, book-title, book-price)
Normalization (contd.)

• Third normal form

CUSTOMER
CUSTOMER-NO  CUSTOMER-ADDRESS
1234         6, Smith Road
1254         7, ITC
1789         10/A UK Council

PURCHASE-ORDER
ORDER-NO  CUST-NO  ORDER-DATE  DELIVERY-DATE  TOTAL-PRICE
1101      1234     20/03/2003  20/04/2004     1110.00
1103      1254     12/12/2004  18/12/2004      550.00
1104      1789     01/04/2004  08/04/2004      100.00

PURCHASE-ITEM
ORDER-NO  BOOK-NO  QTY  PRICE
1101      7100     2    310.00
1101      3990     5    400.00
1101      3667     4    400.00
1103      7100     2    310.00
1103      3990     3    240.00
1104      3667     1    100.00

BOOK
BOOK-NO  BOOK-TITLE      BOOK-PRICE
7100     COBOL Handbook  155.00
3990     LOTUS 1-2-3      80.00
3667     dBASE-IV        100.00
8980     DB2             300.00
9854     CICS            400.00
2786     JCL Ranade      450.00
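To see that the 3NF design still answers the original question, the four entities can be created as tables and joined back together. This sketch uses SQLite via Python's `sqlite3` module instead of DB2 and loads only order 1101. As a simplification that is not in the slides, the derived columns price and total-price are left out and computed in the query; the function name `order_summary` is also an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customer (
    cust_no      INTEGER NOT NULL PRIMARY KEY,
    cust_address TEXT    NOT NULL
);
CREATE TABLE book (
    book_no    INTEGER NOT NULL PRIMARY KEY,
    book_title TEXT    NOT NULL,
    book_price REAL    NOT NULL
);
CREATE TABLE purchase_order (
    order_no      INTEGER NOT NULL PRIMARY KEY,
    cust_no       INTEGER NOT NULL REFERENCES customer (cust_no),
    order_date    TEXT    NOT NULL,
    delivery_date TEXT    NOT NULL
);
CREATE TABLE purchase_item (
    order_no    INTEGER NOT NULL REFERENCES purchase_order (order_no),
    book_no     INTEGER NOT NULL REFERENCES book (book_no),
    qty_ordered INTEGER NOT NULL,
    PRIMARY KEY (order_no, book_no)
);
INSERT INTO customer VALUES (1234, '6, Smith Road');
INSERT INTO book VALUES
    (7100, 'COBOL Handbook', 155.00),
    (3990, 'LOTUS 1-2-3', 80.00),
    (3667, 'dBASE-IV', 100.00);
INSERT INTO purchase_order VALUES (1101, 1234, '20/03/2003', '20/04/2004');
INSERT INTO purchase_item VALUES (1101, 7100, 2), (1101, 3990, 5), (1101, 3667, 4);
""")


def order_summary(order_no):
    """Rejoin the 3NF tables: order number, customer address, order total."""
    return conn.execute(
        """
        SELECT o.order_no, c.cust_address, SUM(i.qty_ordered * b.book_price)
        FROM purchase_order o
        JOIN customer c ON c.cust_no = o.cust_no
        JOIN purchase_item i ON i.order_no = o.order_no
        JOIN book b ON b.book_no = i.book_no
        WHERE o.order_no = ?
        GROUP BY o.order_no, c.cust_address
        """,
        (order_no,),
    ).fetchone()
```

Because the address lives only in `customer`, a single UPDATE there is reflected in every order summary for that customer, which is the update anomaly that normalization is meant to prevent.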
Questions?

Thank You !
