5.1 DB2 Introduction
Introduction to DB2
Topics
• Introduction to DB2
o DBMS concepts
o Relational model
o E-R Diagrams
o Data normalization techniques
Introduction to DB2
Comparing databases with files
• Data independence
o In a conventional file system, if the format of the data is changed or
re-organized, every program that depends on that data must also be changed
o For example, if city and state are added to an address field, the program
must be modified even though it may not need the newly added fields.
o In a database management system,
• the data and program are independent
• a change in data organization will not usually affect the user programs
• users do not have to worry about how the data is actually stored
• users can focus on the particular part of the data they want to access with an
application program
• user programs manipulate the data only, not the organization of the data
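The address example above can be sketched in a few lines. This is only an illustration: SQLite (from the Python standard library) stands in for DB2, and the table, columns and `get_address` helper are hypothetical names, not part of the course material.

```python
import sqlite3

# SQLite stands in for DB2; all names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT, address TEXT)"
)
conn.execute("INSERT INTO customer VALUES (1, 'A. Smith', '12 High Street')")


def get_address(cust_id):
    """An 'application program' that names only the columns it needs."""
    return conn.execute(
        "SELECT name, address FROM customer WHERE cust_id = ?", (cust_id,)
    ).fetchone()


before = get_address(1)

# The data organization changes: city and state are added to the table.
conn.execute("ALTER TABLE customer ADD COLUMN city TEXT")
conn.execute("ALTER TABLE customer ADD COLUMN state TEXT")

# The program was not modified and returns the same result.
after = get_address(1)
print(before == after)
```

Because the program names its columns instead of relying on a record layout, the two new columns do not affect it.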
Database (contd.)
• Data redundancy
o In the conventional file system, the same data may be stored in many files to
meet needs of different applications.
• Therefore, some of the data is redundant.
o For example, with traditional file processing it is possible to find a customer’s
address repeated in many files.
• Customer Record System
• Purchase Order System
• Account Receivable System
o This redundancy in a conventional file processing system causes problems
when the address changes.
• The address must be modified in every one of these files, which duplicates
effort and creates more opportunity for update anomalies
o In a database system, different applications share the same centralized data.
• A database system is able to reduce data redundancy by storing each element in
only one location.
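A minimal sketch of the shared-data idea, assuming SQLite as a stand-in for DB2 and hypothetical table names for the customer, purchase order and accounts receivable systems: the address is stored once, and a single update is visible to every application that joins to it.

```python
import sqlite3

# SQLite stands in for DB2; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, address TEXT);
    CREATE TABLE purchase_order (
        order_no INTEGER PRIMARY KEY,
        cust_id  INTEGER REFERENCES customer(cust_id)
    );
    CREATE TABLE receivable (
        invoice_no INTEGER PRIMARY KEY,
        cust_id    INTEGER REFERENCES customer(cust_id)
    );
    INSERT INTO customer VALUES (1, '12 High Street');
    INSERT INTO purchase_order VALUES (1101, 1);
    INSERT INTO receivable VALUES (9001, 1);
""")

# The address changes: one UPDATE in one place.
conn.execute("UPDATE customer SET address = '7 Park Lane' WHERE cust_id = 1")

# Both "applications" see the new address through a join.
order_address = conn.execute(
    "SELECT c.address FROM purchase_order p "
    "JOIN customer c ON c.cust_id = p.cust_id WHERE p.order_no = 1101"
).fetchone()[0]
invoice_address = conn.execute(
    "SELECT c.address FROM receivable r "
    "JOIN customer c ON c.cust_id = r.cust_id WHERE r.invoice_no = 9001"
).fetchone()[0]
print(order_address, invoice_address)
```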
Database (contd.)
• Data Security
o Databases often use a database security manager for protection.
o There are restrictions on how a user can access the data itself
o Most databases have security features that authorize access to the data and
place restrictions on the operations that may be performed on the data.
• Such features and restrictions enable the database to meet another of its goals –
to increase data security.
• Data Integrity
o Everyone who uses the database is responsible for the integrity of the data.
o Maintaining integrity of the data is very important in a database system
because the database is shared by many users.
o Without integrity the data could become invalid or unreliable.
Database (contd.)
• Characteristics of a database
o Program-Data Independence
• Immunity of applications to change in the storage structure and access strategy of
data
o Database should be a collection of data with no unnecessary redundancy
o Data Integrity
• No inconsistency while maintaining data
o Data protection
o Dynamic definition
• e.g. add a column
Database (contd.)
• Hierarchical model
o Top down structure
o Parent child relationship
o Used even today in mainframe applications
• example – IMS
• Network model
o No strict parent/child hierarchy
o Any record type can be associated with any number of other record
types
o Not a very successful model
• example – IDMS
• Relational model
o Data stored in the form of tables
• Consists of multiple rows and columns.
• examples – Oracle, DB2 (UDB), MySQL, SQL Server
Database (contd.)
• RDBMS
o A relational database is a system in which data is structured as a set of
two-dimensional tables
o A key simplification in DB2 is automatic navigation.
• Navigation simply means determining the path to the data
o In some non-relational systems, when you want to retrieve data you must
specify
• What data you want to retrieve
• How to find it
o In a relational database, users can concentrate on their business functions
instead of wrestling with data processing problems, which makes them more productive
o Users can easily interact with the system through queries.
o Ease of use and power of SQL.
• Concern only for WHAT data is needed, not HOW to get the data.
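The WHAT versus HOW distinction can be illustrated with a small sketch. It assumes SQLite as a stand-in for DB2 and a hypothetical `emp` table; the loop plays the role of a program that navigates the records itself, while the query states only which data is wanted.

```python
import sqlite3

# Hypothetical sample records: (emp_no, name, workdept).
employees = [("E1", "Ann", "D01"), ("E2", "Raj", "D02"), ("E3", "Lee", "D01")]

# Navigational style: the program spells out HOW to find the rows.
manual = []
for emp_no, name, dept in employees:
    if dept == "D01":
        manual.append(name)

# Relational style: the query states only WHAT is wanted;
# the database engine chooses the access path.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (emp_no TEXT PRIMARY KEY, name TEXT, workdept TEXT)")
conn.executemany("INSERT INTO emp VALUES (?, ?, ?)", employees)
declarative = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM emp WHERE workdept = 'D01' ORDER BY emp_no"
    )
]
print(manual == declarative)
```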
Database (contd.)
• Advantages of RDBMS
o Data independence (logical/physical)
o Avoiding data redundancy
o Provide multiple views
o Allow sharing of data
o Provide facilities to search quickly
o Provide good security & control over data
o Provide methods for easier backup & recovery
o Provide maximum concurrency without compromising on data integrity
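As a rough illustration of "multiple views", the sketch below defines two views over one table so that different users see different slices of the same shared data. SQLite stands in for DB2, and the table, view and column names are hypothetical.

```python
import sqlite3

# SQLite stands in for DB2; all names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (
        emp_no   TEXT NOT NULL PRIMARY KEY,
        name     TEXT,
        workdept TEXT,
        salary   REAL
    );
    INSERT INTO emp VALUES ('E1', 'Ann', 'D01', 5000);
    INSERT INTO emp VALUES ('E2', 'Raj', 'D02', 6000);

    -- A directory view that hides the salary column.
    CREATE VIEW emp_directory AS
        SELECT emp_no, name, workdept FROM emp;

    -- A payroll view that exposes only departmental totals.
    CREATE VIEW dept_payroll AS
        SELECT workdept, SUM(salary) AS total_salary FROM emp GROUP BY workdept;
""")

directory_columns = [
    col[0] for col in conn.execute("SELECT * FROM emp_directory").description
]
payroll = dict(conn.execute("SELECT workdept, total_salary FROM dept_payroll"))
print(directory_columns, payroll)
```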
Database (contd.)
• Relational terms
o Candidate key
• An attribute (or a set of attributes) that uniquely identifies each row (tuple)
in the relation (table)
o Primary key
• The candidate key that is chosen to uniquely identify each
row.
o Alternate key
• Any candidate key that was not chosen as the primary key.
o Foreign key
• An attribute (or a set of attributes) of one relation that references the
primary key of another relation
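The key terms above map onto table declarations roughly as follows. This is a sketch under assumptions: SQLite stands in for DB2, and the `dept` and `emp` tables, the `email` alternate key and the `try_insert` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE dept (
        dept_no   TEXT NOT NULL PRIMARY KEY,
        dept_name TEXT
    );
    CREATE TABLE emp (
        emp_no   TEXT NOT NULL PRIMARY KEY,     -- candidate key chosen as primary key
        email    TEXT NOT NULL UNIQUE,          -- candidate key left as alternate key
        workdept TEXT REFERENCES dept(dept_no)  -- foreign key to the parent table
    );
    INSERT INTO dept VALUES ('D01', 'Sales');
    INSERT INTO emp VALUES ('E1', 'ann@example.com', 'D01');
""")


def try_insert(row):
    """Insert a row into emp and report whether the DBMS accepted it."""
    try:
        conn.execute("INSERT INTO emp VALUES (?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


dup_primary = try_insert(("E1", "other@example.com", "D01"))  # primary key reused
dup_alternate = try_insert(("E2", "ann@example.com", "D01"))  # alternate key reused
bad_foreign = try_insert(("E3", "lee@example.com", "D99"))    # no such department
good_row = try_insert(("E4", "raj@example.com", "D01"))
print(dup_primary, dup_alternate, bad_foreign, good_row)
```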
Database (contd.)
• Relation
o In general tables are referred to as RELATIONS
o The relational approach to data is based on the realization that files that
obey certain constraints may be considered as mathematical relations.
• Hence the term RELATION.
o A relational database is a collection of tables.
• Relational model components
o Data structure
o Data integrity
o Data manipulation
Database (contd.)
• Data integrity
o Refers to accuracy and validity of data in database.
o DB2’s security features ensure that data integrity is maintained and that
access to the data is authorized.
o Entity Integrity
• Primary key
o Declarative Integrity
• Declarations of columns
o Referential Integrity
• Foreign Key
Database (contd.)
• Entity Integrity
o A logical consequence of the definition of a primary key is that it guarantees
the uniqueness of each occurrence.
o No attribute participating in the primary key of a base relation is allowed to
accept NULLS.
o An entity with a primary key defined to it satisfies the entity integrity rules.
• Declarative integrity (also called domain integrity)
o User-defined constraints on column values that are enforced by the DBMS
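Entity integrity and declarative (domain) integrity can both be expressed in the table definition. The sketch below assumes SQLite as a stand-in for DB2; the `book` table and the `attempt` helper are hypothetical. Note that SQLite needs an explicit `NOT NULL` on the primary key to reject nulls there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE book (
        book_no    TEXT NOT NULL PRIMARY KEY,            -- entity integrity
        book_title TEXT NOT NULL,
        book_price REAL NOT NULL CHECK (book_price > 0)  -- domain integrity
    )
""")


def attempt(row):
    """Try to insert a row and report whether the DBMS accepted it."""
    try:
        conn.execute("INSERT INTO book VALUES (?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


valid_row = attempt(("7100", "COBOL Hand Book", 155.00))
null_key = attempt((None, "LOTUS 1-2-3", 80.00))          # NULL in the primary key
duplicate_key = attempt(("7100", "DBASE - IV", 100.00))   # primary key not unique
bad_price = attempt(("3667", "DBASE - IV", -100.00))      # violates the CHECK rule
print(valid_row, null_key, duplicate_key, bad_price)
```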
Database (contd.)
• Referential constraint
o A rule which enforces the relationship between a parent table and a child
table such that a non-null value of the foreign key must match a value of the
associated primary key.
• Referential Integrity
o The automatic enforcement of the referential constraint is called the
referential integrity.
o Referential integrity rules define what happens when a DELETE, INSERT or
UPDATE operation would affect the references defined for a particular
database
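A small sketch of how delete rules behave, assuming SQLite as a stand-in for DB2 (both support `ON DELETE RESTRICT` and `ON DELETE CASCADE`). The `dept`, `emp` and `emp_skill` tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE dept (dept_no TEXT NOT NULL PRIMARY KEY);
    CREATE TABLE emp (
        emp_no   TEXT NOT NULL PRIMARY KEY,
        workdept TEXT REFERENCES dept(dept_no) ON DELETE RESTRICT
    );
    CREATE TABLE emp_skill (
        emp_no TEXT NOT NULL REFERENCES emp(emp_no) ON DELETE CASCADE,
        skill  TEXT NOT NULL,
        PRIMARY KEY (emp_no, skill)
    );
    INSERT INTO dept VALUES ('D01');
    INSERT INTO emp VALUES ('E1', 'D01');
    INSERT INTO emp VALUES ('E2', NULL);   -- a null foreign key needs no match
    INSERT INTO emp_skill VALUES ('E2', 'COBOL');
""")

# RESTRICT: the parent row cannot be deleted while a dependent row refers to it.
try:
    conn.execute("DELETE FROM dept WHERE dept_no = 'D01'")
    delete_parent = "deleted"
except sqlite3.IntegrityError:
    delete_parent = "rejected"

# CASCADE: deleting the parent row also deletes its dependent rows.
conn.execute("DELETE FROM emp WHERE emp_no = 'E2'")
skills_left = conn.execute("SELECT COUNT(*) FROM emp_skill").fetchone()[0]
print(delete_parent, skills_left)
```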
Data Modeling
Data Modeling (contd.)
• Primary key
o It identifies its own row
o It must be unique in the entire table.
o Cannot be NULL
o Only one Primary Key can be defined in a table
• Related tables
o A row in one table carries the key of another table
o That key identifies a row of data in the related table.
• Foreign key
o It identifies the row of related data either in the same table or another table.
• The referenced column(s) must be the primary key (or another unique key) of the parent table
• Examples
o EMP, DEPT
o CUSTOMER, ORDER, ORDER DETAILS, PRODUCT
Data Modeling (contd.)
• Table types
o The table containing the primary key that a foreign key of a
dependent table refers to is called the parent table
o The table containing the foreign key is called the dependent table
o The dependent of a dependent table is called a descendant table
• Entity
o A business entity is something that is fundamental to the organization and
an individual instance or occurrence of this thing can be uniquely
identified.
o The relation between business entities can be
• One to One
• One to Many
• Many to Many
Data Modeling
• Business Modeling
o A business model is a formal representation of business information: its
objects, the objects' properties or attributes, and the relationships between
them
o Business modeling does not depend upon the underlying database
structure or a specific application.
• Instead the business model can span an entire enterprise or a division of the
organization
o It requires a solid understanding of the business and can serve as a
verification of the user's view of the business before the database is even
designed
Data Modeling (contd.)
• E-R Model
o Converting business entities to data entities consists of mapping each
business entity to a data entity and then determining the attributes or
properties of each entity.
o E-R model is a logical representation of data for a business area
• Business entity becomes data entity
• Represented as entities, relationship between entities and attributes of both
relationships and entities.
• E-R models are conceptual data models expressed in the form of an E-R
Diagram
o Steps in modeling
• Identify Primary and Foreign Keys
• Resolve many to many relationships
• Normalize the data design
Data Modeling (contd.)
• Example:
o CUSTOMER places ORDERS
o ORDERS have PRODUCTS
• Each order relates to only one customer (many-to-one, seen from the order side)
• A customer can place any number of orders (one-to-many)
• Many orders can contain many products (many-to-many)
o A product can be a part of many orders
o An order can have many products
• Customer, Order & Product are called ENTITIES.
• An Entity may transform into one or more tables
• The unique identity for information stored in an ENTITY is called a
PRIMARY KEY.
o e.g. CUSTOMER-ID uniquely identifies each customer
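The many-to-many relationship between ORDERS and PRODUCTS in this example is usually resolved with an intermediate table. The sketch below is one way to do that, assuming SQLite as a stand-in for DB2; the `order_item` table and all column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE orders  (order_no   INTEGER NOT NULL PRIMARY KEY);
    CREATE TABLE product (product_id TEXT    NOT NULL PRIMARY KEY);

    -- The intermediate table turns one many-to-many relationship
    -- into two one-to-many relationships.
    CREATE TABLE order_item (
        order_no   INTEGER NOT NULL REFERENCES orders(order_no),
        product_id TEXT    NOT NULL REFERENCES product(product_id),
        PRIMARY KEY (order_no, product_id)
    );

    INSERT INTO orders VALUES (1);
    INSERT INTO orders VALUES (2);
    INSERT INTO product VALUES ('P1');
    INSERT INTO product VALUES ('P2');
    INSERT INTO order_item VALUES (1, 'P1');
    INSERT INTO order_item VALUES (1, 'P2');
    INSERT INTO order_item VALUES (2, 'P1');
""")

# An order can have many products ...
products_in_order_1 = [
    row[0]
    for row in conn.execute(
        "SELECT product_id FROM order_item WHERE order_no = 1 ORDER BY product_id"
    )
]
# ... and a product can be part of many orders.
orders_with_p1 = [
    row[0]
    for row in conn.execute(
        "SELECT order_no FROM order_item WHERE product_id = 'P1' ORDER BY order_no"
    )
]
print(products_in_order_1, orders_with_p1)
```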
Data Modeling (contd.)
• Relationships
o For example, CUSTOMER is related to ORDER through ‘CUSTOMER_ID’, which is
the primary key in the CUSTOMER table and a foreign key in the ORDER table.
o Relationships transform into foreign keys.
o As per entity integrity, the primary key, ORDER_NO, of the table
‘ORDER’ can never be NULL, while the foreign key CUSTOMER_ID can be NULL in
the table ‘ORDER’.
• Since CUSTOMER_ID is not the primary key of the ORDER table, it can be
allowed to have nulls, though in practice this is rare
Data Modeling (contd.)
• E-R Diagrams
o There are many parts in a business model and there is no formal definition
of which parts must exist for a model to be valid.
o It can be as simple as just naming the major business entities or objects
and their relationship to each other.
o This is commonly manifested in an Entity Relationship Diagram
o An Entity Relationship diagram is a pictorial representation of the user's
view of the business
Data Modeling (contd.)
[Figures: E-R diagrams relating EMP and WORKDEPT, and a many-to-many relationship between EMP and PROJ]
Data Modeling (contd.)
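• An E-R Diagram
[Figure: E-R diagram with entities DEPT, LOCATION, EMPL, PROJECT and SKILLS; the remaining links are marked M (many)]
o One department (DEPT) has many locations (LOCATION): 1 to M
o Many employees (EMPL) work on many projects (PROJECT): M to M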
Normalization
• Data normalization
o Data Normalization is a step-by-step technique of analyzing data into its
constituent entities and attributes.
o Removes undesirable (partial and transitive) functional dependencies among attributes.
o Eliminates redundancy.
o Prevents update anomalies and data inconsistencies.
o Necessary for transactional data
• De-normalization
o Reverse of normalization
o Useful for 'read-only' data with heavy SQL SELECT access for reports
Normalization (contd.)
• Un-normalized data
[Table: un-normalized purchase order data (order total 1110.00)]
Normalization (contd.)
PURCHASE-ITEM
ORDER-NO  BOOK-NO  BOOK-TITLE       BOOK-PRICE  QTY   PRICE
1101      7100     COBOL Hand Book      155.00    2  310.00
1101      3990     LOTUS 1-2-3           80.00    5  400.00
1101      3667     DBASE - IV           100.00    4  400.00
1103      7100     COBOL Hand Book      155.00    2  310.00
1103      3900     LOTUS 1-2-3           80.00    3  240.00
1104      3667     DBASE - IV           100.00    1  100.00
Normalization (contd.)
o PURCHASE-ORDER
• (order-no, cust-no, order-date, delivery-date, total-price)
o PURCHASE-ITEM
• (order-no, book-no, qty-ordered, price)
o BOOK
• (book-no, book-title, book-price)
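The decomposition above can be tried out with a short sketch. It assumes SQLite as a stand-in for DB2, uses only the rows for orders 1101 and 1104 from the PURCHASE-ITEM table, and leaves out PURCHASE-ORDER because the customer and date values are not given. The line price is derived from quantity and book price here instead of being stored, which is a simplification of the schema listed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Each book's title and price are stored exactly once.
    CREATE TABLE book (
        book_no    INTEGER NOT NULL PRIMARY KEY,
        book_title TEXT    NOT NULL,
        book_price REAL    NOT NULL
    );
    CREATE TABLE purchase_item (
        order_no    INTEGER NOT NULL,
        book_no     INTEGER NOT NULL REFERENCES book(book_no),
        qty_ordered INTEGER NOT NULL,
        PRIMARY KEY (order_no, book_no)
    );
    INSERT INTO book VALUES (7100, 'COBOL Hand Book', 155.00);
    INSERT INTO book VALUES (3990, 'LOTUS 1-2-3', 80.00);
    INSERT INTO book VALUES (3667, 'DBASE - IV', 100.00);
    INSERT INTO purchase_item VALUES (1101, 7100, 2);
    INSERT INTO purchase_item VALUES (1101, 3990, 5);
    INSERT INTO purchase_item VALUES (1101, 3667, 4);
    INSERT INTO purchase_item VALUES (1104, 3667, 1);
""")

# A join rebuilds the order totals from the normalized tables.
totals = dict(conn.execute("""
    SELECT i.order_no, SUM(i.qty_ordered * b.book_price)
    FROM purchase_item i
    JOIN book b ON b.book_no = i.book_no
    GROUP BY i.order_no
"""))

# How many rows hold the title of book 3667, although it appears in two orders.
title_rows = conn.execute(
    "SELECT COUNT(*) FROM book WHERE book_no = 3667"
).fetchone()[0]
print(totals, title_rows)
```

Because each title and price lives in a single BOOK row, a correction touches one row instead of every order line that mentions the book.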