
Chapter Two

1) Organizations collect and store vast amounts of data about various entities to aid in decision making. As data is analyzed, patterns and insights can emerge. 2) For data to be processed by a computer, it must be organized into basic elements like characters, fields, records, files and databases. A database management system allows users to access, update and manipulate this organized data. 3) There are two approaches to data management - the flat file method and database approach. The database approach centralizes data in a shared database, avoiding data redundancy and improving data sharing, currency and access across users.

Uploaded by

Oscar Frizzi
Copyright
© Attribution Non-Commercial (BY-NC)

Chapter 2: Database Management System

2.1. Introduction
For financial and/or legal reasons, organizations collect and store vast amounts of data about employees, customers, finances, vendors, inventory, competitors, and markets, to name only a few. The amount of data needed is important because people generally make better decisions if they have more data available to them.
Honelign,2012 1

Data cannot be understood until it is analyzed. As the manager begins to process and analyze the data, it eventually begins to tell a story. A computer cannot process data unless it is organized in special ways: into characters, fields, records, files, and databases.

Character
A character is the most basic element of data that can be observed and manipulated. e.g., $, #, and ?

Attribute /Field
A field contains an item of data; that is, a character, or group of characters that are related. For instance, a grouping of related text characters such as "John Smith" makes up a name in the name field.

An attribute is a descriptive property of an entity. Synonyms include element, property, and field. Generally, there are four types of fields: primary key, secondary key, foreign key, and descriptive (non-key) fields.

A primary key is the attribute, or combination of attributes, that uniquely identifies a specific row in a table.
A secondary key is an alternative identifier for records in a database. It may identify either a single record (as a primary key does) or a subset of records.
A foreign key is an attribute in a table that is a primary key in another table. Foreign keys are used to link tables.

Foreign keys are pointers to the records of a different file in a database. A foreign key in one file requires the existence of the corresponding primary key in the other table or file; otherwise it does not point to anything.
A descriptive field is any other non-key field that stores business data.

Record
A record is composed of a group of related fields. A record contains a collection of attributes related to an entity such as a person or product. For example, a payroll record would contain the name, address, social security number, and title of an employee.

Database File
A database file is a collection of related records. A database file is sometimes called a table. A file may be composed of a complete list of individuals on a mailing list, including their addresses and telephone numbers. Files are frequently categorized by the purpose or application for which they are intended. Common examples include mailing lists, quality control files, inventory files, and document files.

Database
A database is composed of related files that are consolidated, organized, and stored together. One collection of related files might pertain to employee information. Another collection of related files might contain sports statistics.
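The hierarchy of characters, fields, records, files, and databases described above can be sketched in a few lines of Python. This is only an illustration: the employee data and all names (record, employee_file, database) are hypothetical and do not come from the slides.

```python
# Hypothetical illustration of the data hierarchy; all names and values are made up.
field_name = "John Smith"          # a field: a group of related characters
characters = list(field_name)      # characters: the most basic data elements

# A record: related fields describing one entity (here, an employee).
record = {"emp_id": 101, "name": field_name, "title": "Clerk"}

# A file (table): a collection of related records.
employee_file = [
    record,
    {"emp_id": 102, "name": "Mary Jones", "title": "Analyst"},
]

# A database: related files consolidated and stored together.
database = {"employees": employee_file, "payroll": []}

print(len(characters), len(record), len(employee_file), len(database))
```

Each level simply groups items from the level below it, which is the point the definitions make.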

Database Management System

A database management system is a software package that enables users to edit, link, and update files as needs dictate. Database management systems are used to access and manipulate data in a database.

There are two basic approaches to data management: the flat-file method and the database approach.

2.2. Flat-File Versus Database Environments

In the flat-file approach, each user group owns its data, which is usually not available to others, even within the organization. As a result, the same data element may be stored separately in every user's files. This is called data redundancy.

2.1. Flat-File Environment

[Figure: each user's transactions are processed by that user's own program against its own data. Program 1 (User 1) holds data elements A, B, C; Program 2 (User 2) holds X, B, Y; Program 3 (User 3) holds L, B, M.]

Data Redundancy & Flat-File Problems

Data Storage - creates excessive storage costs for paper documents and/or magnetic media.
Data Updating - any changes or additions must be performed multiple times.
Currency of Information - there is a risk of failing to update all affected files.
Task-Data Dependency - the user's inability to obtain additional information as his or her needs change.
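The updating and currency problems can be illustrated with a small Python sketch. It assumes three hypothetical user files that each keep their own copy of data element B (here, an invented customer address); the values are made up for illustration.

```python
# Three hypothetical user files, each holding its own copy of data element B.
user1_file = {"A": 10, "B": "12 Elm St", "C": 3}
user2_file = {"X": 7, "B": "12 Elm St", "Y": 1}
user3_file = {"L": 5, "B": "12 Elm St", "M": 9}
files = [user1_file, user2_file, user3_file]

# Data updating: the same change must be repeated in every file.
# Here the update reaches only two of the three files.
for f in (user1_file, user2_file):
    f["B"] = "98 Oak Ave"

# Currency of information: the files now disagree about the value of B.
values_of_b = {f["B"] for f in files}
is_current = len(values_of_b) == 1
print(values_of_b, is_current)
```

With a single shared copy of B there would be only one place to update, so the files could not drift apart.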

2.2. Database Approach


The database approach pools the data from each user's information set into a common database that is shared by all users.

Advantages of the Database Approach


Data sharing through a centralized database resolves the flat-file problems:
No data redundancy - Data is stored only once, eliminating data redundancy and reducing storage costs.
Single update - Because data is in only one place, it requires only a single update procedure, reducing the time and cost of keeping the database current.
Current values - A change to the database made by any user yields current data values for all other users.
Task-data independence - As users' information needs expand beyond their immediate domain, the new needs can be more easily satisfied than under the flat-file approach.

Disadvantages of the Database Approach


Can be costly to implement - additional hardware, software, storage, and network resources are required.
Can only run in certain operating environments - this may make it unsuitable for some system configurations.
Requires training users - because it is so different from the file-oriented approach; there may also be inertia or resistance.

2.3. Elements of the Database Approach

The database approach has four elements: users, the database management system (DBMS), the database administrator (DBA), and the physical database.

2.3.1. Users
Users access data in two ways: via user programs that send data access requests to the DBMS, and through direct query, which requires no formal user program.

2.3.2. DBMS
The purpose of the DBMS is to provide controlled access to the database. The DBMS is a special software system programmed to know which data elements each user is authorized to access and to deny unauthorized requests for data.

Typical DBMS features are:

Program Development - the DBMS contains application development software.
Backup and Recovery - the DBMS makes backup copies of the database.
Database Usage Reporting - captures statistics on database usage (who, when, etc.).
Database Access - authorizes access to sections of the database.

Database access is facilitated by three software modules:
Data definition language
Data manipulation language
Query language

Data Definition Language (DDL)


DDL is a programming language used to define the database to the DBMS. The DDL identifies the names and the relationships of all data elements, records, and files that constitute the database. The DDL defines the database on three levels called views:
Internal view - presents the physical arrangement of records. There is only one internal view of the database.
Conceptual view - a logical and abstract representation of the database. There is only one conceptual view of the database.
User view - defines how a particular user sees his or her portion of the database. There are many user views of a database.
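As a rough sketch of the conceptual view and a user view, the following uses SQL data definition statements through Python's built-in sqlite3 module. The employee table, its columns, and the staff_directory view are hypothetical names chosen for illustration; SQLite manages the internal (physical) view itself, so it does not appear in the code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# DDL: define the table (conceptual level). Names are hypothetical.
conn.execute("""
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        salary REAL
    )
""")

# DDL: define one user view that exposes only part of the data (no salary).
conn.execute("CREATE VIEW staff_directory AS SELECT emp_id, name FROM employee")

conn.execute("INSERT INTO employee VALUES (1, 'John Smith', 4200.0)")
directory_rows = conn.execute("SELECT * FROM staff_directory").fetchall()
print(directory_rows)  # [(1, 'John Smith')]
```

Several such views can be defined over the same table, one per group of users, while the table definition itself exists only once.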

Data Manipulation Language (DML)


DML is the proprietary programming language that a particular DBMS uses to retrieve, process, and store data. Entire user programs may be written in the DML, or selected DML commands can be inserted into programs written in universal languages, such as COBOL and FORTRAN.

Query Language
The query capability permits end users and professional programmers to access data in the database without the need for conventional programs. ANSI's Structured Query Language (SQL) is a fourth-generation language that has emerged as the standard query language. SQL is a nonprocedural language with many commands that allow users to input, retrieve, and modify data easily. The SELECT command is a powerful tool for retrieving data.

SQL is an efficient data processing tool that requires far less training in computer concepts and fewer programming skills than many other languages. This places ad hoc reporting and data processing capability in the hands of the user/manager. By reducing reliance on professional programmers, managers are better able to deal with problems as they arise. The example in the next figure illustrates the use of the SELECT command to produce a user report from a database called Inventory.
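A minimal sketch of a SELECT report of this kind, run through Python's sqlite3 module. The inventory table, its columns (item, on_hand, unit_cost), and the sample rows are assumptions made for illustration and are not taken from the figure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical inventory table and sample data.
conn.execute("CREATE TABLE inventory (item TEXT, on_hand INTEGER, unit_cost REAL)")
conn.executemany(
    "INSERT INTO inventory VALUES (?, ?, ?)",
    [("Bolt", 500, 0.10), ("Gear", 40, 12.50), ("Motor", 5, 240.00)],
)

# SELECT names the columns wanted, FROM names the table,
# and WHERE restricts the rows; no procedural program is needed.
report = conn.execute(
    "SELECT item, on_hand FROM inventory WHERE on_hand < ? ORDER BY item",
    (100,),
).fetchall()
print(report)  # [('Gear', 40), ('Motor', 5)]
```

The query states what data is wanted rather than how to find it, which is what makes SQL nonprocedural.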

2.3.3. The Database Administrator


The DBA is responsible for managing the database resource. Multiple users sharing a common database requires organization, coordination, rules, and guidelines to protect the integrity of the database. The duties of the DBA fall into the following areas:
database planning,
database design,
database implementation,
database operation and maintenance,
database change and growth, and
creation and maintenance of the data dictionary.

[Figure: Functions of the DBA]

Organizational Interactions of the DBA

Of particular importance is the relationship among the DBA, the end users, and the systems professionals of the organization. As information needs arise, users send formal requests for computer applications to the systems professionals (programmers) of the organization. The requests are handled through formal systems development procedures, which produce the programmed applications.

The user requests also go to the DBA, who evaluates them to determine the user's database needs. Once this is established, the DBA grants the user access authority by programming the user's view (subschema). This relationship is shown as the lines between the user and the DBA and between the DBA and the DDL module in the DBMS. By keeping access authority separate from systems development (application programming), the organization is better able to control and protect the database. Intentional and unintentional attempts at unauthorized access are more likely to be discovered when these two groups work independently.

The Data Dictionary


Another important function of the DBA is the creation and maintenance of the data dictionary. The data dictionary describes every data element in the database. This enables all users (and programmers) to share a common view of the data resource and greatly facilitates the analysis of user needs.


2.3.4. The Physical Database

The physical database is the fourth major element of the database approach and the lowest level of the database. It consists of magnetic spots on magnetic disks. The other levels of the database (for example, the user view, conceptual view, and internal view) are abstract representations of the physical level. At the physical level, the database is a collection of records and files. This section deals with the data structures used in the physical database.

Data structures
Data structures are the bricks and mortar of the database. They allow records to be located, stored, and retrieved, and enable movement from one record to another.

In general, data structures must support the following file processing operations:
Retrieve a record from the file based on its primary key (PK) value
Insert a record into a file
Update a record in the file
Read a complete file of records
Find the next record in a file
Scan a file for records with common secondary keys
Delete a record from a file

Components of data structures


Data structures have two components: organization and access method.
Organization of a file refers to the way records are physically arranged on secondary storage devices. This may be either sequential or random. The records in sequential files are stored in contiguous locations that occupy a specified area of disk space. Records in random files are stored without regard for their physical relationship to other records of the same file.
Access method is the technique used to locate records and to navigate through the database. During data processing, the access method program responds to requests for data from the user's application and locates and retrieves or stores the record.

Criteria for data structure selection

No single structure is best for all processing tasks/operations. Therefore, the following criteria are used to select a data structure:
Rapid file access and data retrieval
Efficient use of disk storage space
High throughput for transaction processing
Protection from data loss
Ease of recovery from system failure
Accommodation of file growth

Four basic data structures


i. Sequential data structures
ii. Indexed data structures
iii. Hashing data structures
iv. Pointer data structures
(Figures - Data structures-2.3.pptx)

i. Sequential structure
Also called the sequential access method.
Records in the file lie in contiguous storage spaces in a specified sequence arranged by their primary key.
Sequential files are simple and easy to process.
The structure does not permit accessing a record directly. Thus, it is efficient only for the following operations:
Read a complete file of records
Find the next record in a file

ii. Indexed structure


Contains both the actual data file and a separate index that is itself a file of record addresses.
This index contains the numeric value of the physical disk storage location (cylinder, surface, and record block) for each record in the associated data file.
The data file itself may be organized either sequentially or randomly.

Indexed Random File


Records in an indexed random file are dispersed throughout a disk without regard for their physical proximity to other records. A record's physical location is unimportant as long as the operating system software can find it when needed. When a new record is added to the file, the data management software randomly selects a vacant disk location, stores the record, and adds the new address to the index.

The physical organization of the index itself may be either sequential (by key value) or random.
Advantages: indexed random files are efficient in the following single-record processing operations:
Retrieve a record from the file based on its PK value
Insert a record into a file
Update a record in the file
Scan a file for records with common secondary keys
They also make efficient use of disk storage.
Disadvantage: not efficient for operations that involve processing a large portion of a file.
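The insert and retrieve steps of an indexed random file can be sketched as follows. This is a simplified model, not a real access method: the "disk" is a Python list of 20 slots, the index is a dictionary from primary key to slot number, and the function names insert and retrieve are hypothetical.

```python
import random

DISK_SIZE = 20
disk = [None] * DISK_SIZE   # hypothetical disk: a list of record slots
index = {}                  # the index: primary key -> slot address

def insert(pk, data):
    """Store the record in a randomly chosen vacant slot and index its address."""
    vacant = [addr for addr, slot in enumerate(disk) if slot is None]
    addr = random.choice(vacant)
    disk[addr] = {"pk": pk, **data}
    index[pk] = addr

def retrieve(pk):
    """Direct access: look up the address in the index, then read that slot."""
    return disk[index[pk]]

insert(86432, {"name": "Sethi"})
insert(98653, {"name": "Mills"})
print(retrieve(98653)["name"])  # Mills
```

Retrieval takes one index lookup regardless of where the record landed, but reading the whole file in key order would mean jumping around the disk, which matches the disadvantage noted above.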

Indexed Sequential Files


Uses an index in conjunction with a sequential file organization.
Allows both direct access to individual records and batch processing of the entire file, e.g., the indexed sequential access method (ISAM).
The ISAM structure is used for very large files that require routine batch processing and a moderate degree of individual record processing.
ISAM is moderately effective for the following operations:
Retrieve a record from the file based on its PK value
Update a record in the file

Direct access speed is sacrificed to achieve very efficient performance in the following operations:
Read a complete file of records
Find the next record in a file
Scan a file for records with common secondary keys
Disadvantage: inefficient in record insertion operations. This problem can be resolved by storing new records in an overflow area that is physically separate from the other data records in the file.

An ISAM file has three physical components: the indexes, the prime data storage area, and the overflow area.
ISAM is a popular option for large and stable files that need both direct access and batch processing, but not for highly volatile files.

iii. Hashing Structure


It employs an algorithm that converts the primary key of a record directly into a storage address.
E.g., dividing a prime number by the key: 99997 / 15943 = 6.27215705. The residual (the digits after the decimal point, 27215705) translates to:
cylinder 272
surface 15
record # 705

Hashing eliminates the need for a separate index. It uses a random file organization, since the process of calculating residuals and converting them into storage locations produces widely dispersed record addresses.
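The division example above can be reproduced with a short Python function. This is a sketch of that one worked example only: the function name hash_address is hypothetical, and splitting the eight decimal digits into 3, 2, and 3 digits for cylinder, surface, and record number is inferred from the numbers on the slide. Integer arithmetic is used so that the digits are exact.

```python
def hash_address(key, prime=99997):
    """Divide the prime by the key and read the first eight decimal digits
    of the quotient as (cylinder, surface, record number)."""
    digits = (prime * 10**8 // key) % 10**8   # e.g. 27215705 for key 15943
    text = f"{digits:08d}"                    # keep leading zeros
    return int(text[0:3]), int(text[3:5]), int(text[5:8])

cylinder, surface, record_no = hash_address(15943)
print(cylinder, surface, record_no)  # 272 15 705
```

Because the address is computed from the key alone, no index has to be searched before the record is read.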

Advantage: access speed. Hashing is efficient in the following operations:
Retrieve a record from the file based on its PK value
Insert a record into a file
Update a record in the file
Scan a file for records with common secondary keys

Disadvantages:
It does not use storage space efficiently, as some disk locations will never be selected by the algorithm.
Collisions (the reverse of the first problem): different record keys can generate the same address, which slows down access (see the book, p. 421).

iv. Pointer structures


Creates a linked-list file. Records in this type of file are randomly distributed, but pointers provide connections between records found in the same file or in different files (see Fig. 9-14, p. 423).

Types of Pointers
There are three types of pointers.

1. Physical address pointer

Contains the actual disk storage location (cylinder, surface, and record number) and allows the system to access records directly.
Advantage: speed.
Disadvantages:
Pointers must be changed whenever the related record is moved from one disk location to another. This is a problem when disks are periodically reorganized or copied.
Physical pointers bear no logical relationship to the records they identify.

2. Relative address pointer

Contains the relative position of a record in the file (see Fig. 9-15, p. 424).

3. Logical key pointer

Contains the PK of the related record. A hashing algorithm then converts this PK value into the record's physical address.
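A linked-list file built with relative address pointers might look like the sketch below. The records, keys, and the field name "next" are hypothetical; the file is modelled as a Python list, so a record's relative address is simply its position in the list.

```python
# Hypothetical file: each record holds a relative address pointer ("next")
# giving the position of the related record in the same file; None ends the list.
records = [
    {"pk": 124, "item": "Gear",  "next": 2},     # position 0
    {"pk": 127, "item": "Motor", "next": None},  # position 1
    {"pk": 125, "item": "Bolt",  "next": 3},     # position 2
    {"pk": 126, "item": "Shaft", "next": 1},     # position 3
]

def follow_chain(file, start):
    """Walk the linked list from the starting position and collect the keys."""
    keys, pos = [], start
    while pos is not None:
        keys.append(file[pos]["pk"])
        pos = file[pos]["next"]
    return keys

chain = follow_chain(records, 0)
print(chain)  # [124, 125, 126, 127]
```

The records are stored out of key order, yet following the pointers visits them in sequence. Unlike physical address pointers, these positions stay valid if the whole file is copied to another disk location.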

2.4. Database Models

What is a data model?
It is an abstract representation of the data about entities and their relationships in an organization.
Its purpose is to represent entity attributes in a way that is understandable to users.

Three Conceptual models


Hierarchical model
Network model
The hierarchical and network models are termed navigational models, as they possess explicit links/paths among data elements.

Relational model
The relational model possesses implicit linkage among data elements.

Database Terms

Entity - An entity is anything about which the organization wishes to capture data. Entities may be physical, such as inventories, customers, or employees. They may also be conceptual, such as sales, accounts receivable (AR), or accounts payable (AP).
Data element (attribute) - Attributes are the data elements that define an entity. For example, an Employee entity may be defined by the following partial set of attributes: Name, Address, Job Skill, Years of Service, and Hourly Rate of Pay.
Record type - a group of data elements that logically pertain to an entity.
Record association - the relationship that exists among record types.

Record associations
Record types exist in relation to other record types. This is called an association. There are three types of record associations:
One-to-One, e.g., employee record to year-to-date earnings record
One-to-Many, e.g., customer record to sales order record
Many-to-Many (two-way relationship), e.g., inventory record to vendor record

2.4.1. The Hierarchical model


It is constructed of sets of files. Each set contains a parent and a child.
A file can be both the child in one set and the parent in another set, but it cannot be both within the same set.
Files at the same level with the same parent are called siblings.
This structure is also called a tree structure.
The file at the most aggregated level in the tree is the root segment, and the file at the most detailed level in a particular branch is called a leaf.

The only way to access data at lower levels in the tree is to start from the root and follow the pointers down the navigational path to the desired records; that is, the model allows only one path to each record.
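Navigation from the root can be sketched with a small tree in Python. The record types follow the customer, sales invoice, and cash receipts example used in this section; the line item level, the key values, and the function name find_path are assumptions added for illustration.

```python
# Hypothetical hierarchical database: each record points to its child records.
root = {
    "type": "Customer", "key": "C1",
    "children": [
        {"type": "Sales Invoice", "key": "INV-1", "children": [
            {"type": "Line Item", "key": "LI-1", "children": []},
        ]},
        {"type": "Cash Receipt", "key": "CR-1", "children": []},
    ],
}

def find_path(node, key, path=()):
    """Search downward from the root; return the single path to the record."""
    path = path + (node["key"],)
    if node["key"] == key:
        return path
    for child in node["children"]:
        found = find_path(child, key, path)
        if found:
            return found
    return None

path_to_item = find_path(root, "LI-1")
print(path_to_item)  # ('C1', 'INV-1', 'LI-1')
```

There is no way to reach the line item except through its customer and invoice, which is the single navigational path the model imposes.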

Limitations of Hierarchical data model


The following rules, which govern the hierarchical model, reveal its operating constraints:

1. A parent record may have one or more child records. For example, in the figure above, the customer is the parent of both sales invoice and cash receipts.
2. No child record can have more than one parent. Therefore, a many-to-many record association is impossible (limitation).

2.4.2. The Network Database Model


The network model allows a child record to have multiple parents (its principal distinguishing feature).
It allows multiple paths to a single record.
Many-to-many record associations are possible (each file in a set can be both parent and child).
Navigating an M:M association requires creating a separate link file that contains pointer records in a linked-list structure as well as accounting data.

2.4.3. The Relational Model


It has its own terminology:
Relation - a database table
Attributes - data elements, which form the columns
Tuples (records) - form the rows
Data - the intersection of rows and columns

A system is relational if it:
1. Represents data in the form of two-dimensional tables, such as the database table called Customer.
2. Supports the relational algebra functions of restrict, project, and join.
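The three relational algebra functions can be sketched in plain Python over tables held as lists of rows. This is a teaching illustration, not how a DBMS implements them; the customer and invoice tables, their columns, and their values are hypothetical.

```python
# Hypothetical tables: each row is a dict of column name -> value.
customer = [
    {"cust_no": 1, "name": "Ada"},
    {"cust_no": 2, "name": "Ben"},
]
invoice = [
    {"inv_no": 10, "cust_no": 1, "amount": 500},
    {"inv_no": 11, "cust_no": 1, "amount": 120},
    {"inv_no": 12, "cust_no": 2, "amount": 75},
]

def restrict(table, predicate):
    """Keep only the rows that satisfy the condition."""
    return [row for row in table if predicate(row)]

def project(table, columns):
    """Keep only the named columns, dropping duplicate rows."""
    seen, result = set(), []
    for row in table:
        values = tuple(row[c] for c in columns)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(columns, values)))
    return result

def join(left, right, column):
    """Combine rows from both tables that share a value in the common column."""
    return [{**l, **r} for l in left for r in right if l[column] == r[column]]

big = restrict(join(customer, invoice, "cust_no"), lambda r: r["amount"] > 100)
result = project(big, ["name", "amount"])
print(result)
```

Restrict selects rows, project selects columns, and join builds a new table from two tables that share an attribute, here cust_no.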

Properly designed tables possess the following four characteristics:
1. All occurrences at the intersection of a row and a column are a single value. No multiple values or repeating groups are allowed.
2. All attribute values in any column must be of the same class.
3. Each column in a given table must be uniquely named. However, different tables may contain columns with the same name.
4. Each row in the table must be unique in at least one attribute. This attribute is the PK.

Data linkage in the Relational model

Implicit linkage (absence of explicit pointers).
Data is presented as a collection of independent tables (absence of a tree or network structure).
Relations are formed by an attribute common to both tables in the relation (absence of pointers or explicit links).

Advantages of Relational Tables


Removes all three anomalies:
Data update anomaly
Data insertion anomaly
Data deletion anomaly

Various items of interest (customers, inventory, sales) are stored in separate tables.
Space is used efficiently.
Very flexible: users can form ad hoc relationships.

Foreign key Assignment


The nature of the association between two tables determines the method used for assigning foreign keys.
1:1 - either table's primary key can serve as the foreign key in the other table.
1:M - the primary key of the one side is embedded as a foreign key in the table on the many side.
M:M - neither primary key is embedded; instead, a separate link table containing the keys of the related tables must be created.
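The 1:M and M:M rules can be sketched with SQL table definitions run through Python's sqlite3 module. The customer to sales order and inventory to vendor associations come from the record association examples in this chapter; all table names, column names, and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (cust_no INTEGER PRIMARY KEY, name TEXT);
    -- 1:M - the customer key is embedded in the table on the many side.
    CREATE TABLE sales_order (
        order_no INTEGER PRIMARY KEY,
        cust_no  INTEGER REFERENCES customer(cust_no)
    );
    CREATE TABLE inventory (item_no INTEGER PRIMARY KEY, descr TEXT);
    CREATE TABLE vendor (vendor_no INTEGER PRIMARY KEY, name TEXT);
    -- M:M - a separate link table holds the keys of both related tables.
    CREATE TABLE inventory_vendor (
        item_no   INTEGER REFERENCES inventory(item_no),
        vendor_no INTEGER REFERENCES vendor(vendor_no),
        PRIMARY KEY (item_no, vendor_no)
    );
    INSERT INTO inventory VALUES (1, 'Bolt'), (2, 'Gear');
    INSERT INTO vendor VALUES (10, 'Acme'), (20, 'Best');
    INSERT INTO inventory_vendor VALUES (1, 10), (1, 20), (2, 10);
""")

# Which vendors supply item 1? The link table connects the two sides.
vendors_of_bolt = conn.execute("""
    SELECT v.name FROM vendor v
    JOIN inventory_vendor iv ON iv.vendor_no = v.vendor_no
    WHERE iv.item_no = 1 ORDER BY v.name
""").fetchall()
print(vendors_of_bolt)  # [('Acme',), ('Best',)]
```

The link table's own primary key is the combination of the two foreign keys, so each item and vendor pairing is recorded once.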

2.5. Data normalization and its importance


Correctly designed database tables are critical to the success of the DBMS. Poorly designed tables can cause operational problems that restrict, or even deny, users access to the information they need. Data normalization is a process that promotes effective database design by grouping data attributes into tables that comply with specific conditions.

Importance
Tables that have not been normalized are associated with three types of problems called anomalies: the update anomaly, the insertion anomaly, and the deletion anomaly. The importance of data normalization lies in making the database tables free from these anomalies.

The Normalization Process


Data normalization is the process of systematically reducing a complex table to a set of simple, efficient tables that meet two conditions:
1. All nonkey attributes in the table are dependent on (defined by) the primary key.
2. All nonkey attributes are independent of the other nonkey attributes.
When these conditions are met, the table in question is in third normal form (3NF).

Accountants and Data Normalization


The update anomaly can generate conflicting and obsolete database values. The insertion anomaly can result in unrecorded transactions and incomplete audit trails. The deletion anomaly can cause the loss of accounting records and the destruction of audit trails. Accountants should have an understanding of the data normalization process and be able to determine whether a database is properly normalized.


Normalization process
Step 1. Identify and remove any repeating groups.
Repeating groups are multiple data values at the intersection of rows and columns. When this is done, the table is in 1NF.

Step 2. Identify and remove any partial dependencies.
These are nonkey attributes dependent on (defined by) only part of the PK. This condition exists only when the PK is a composite key. At this point the table is in 2NF.

Step 3. Remove any transitive dependencies.
These are nonkey attributes dependent on another nonkey attribute in the table. At this point the table is in 3NF (freed from all three anomalies).

EXAMPLE: see the tables that follow.

Table 1. Unnormalized database of Student Enrollment

Stdnt# | Stdnt | Major | Course | Crse desc | Instr | Office hrs | Loc | Tel no | Grade
86432 | Sethi | Acctg | Acct 315 | Fin Acct | Ray | 9-11 | 442 | 8-4545 | A
86432 | Sethi | Acctg | Acct 324 | Mgt Acct | Paul | 8-11 | 448 | 8-8945 | A
86432 | Sethi | Acctg | Math 21 | Calc | Jones | 1-3 | 323 | 8-2345 | B
86789 | Archer | Mgt | Mgt 1 | Intr Mgt | Buel | 4-5 | 463 | 8-3436 | C
86789 | Archer | Mgt | Hist 1 | Us Hist | Patch | 9-11 | 342 | 8-2378 | B
98653 | Mills | Acctg | Acct 1 | Intr Acct | Ray | 9-11 | 442 | 8-4545 | B
98653 | Mills | Acctg | Math 21 | Calc | Jones | 1-3 | 323 | 8-2345 | B
98653 | Mills | Acctg | Mgt 1 | Intr Mgt | Buel | 4-5 | 463 | 8-3436 | C

Table 1 is split into Table 2 and Table 3.

Table 2: Student (3NF)

Stdnt# | Stdnt | Major
86432 | Sethi | Acctg
86789 | Archer | Mgt
98653 | Mills | Acctg

Table 3: Course Grade (1NF)

Stdnt# | Course | Crse desc | Instr | Office hrs | Loc | Tel no | Grade
86432 | Acct 315 | Fin Acct | Ray | 9-11 | 442 | 8-4545 | A
86432 | Acct 324 | Mgt Acct | Paul | 8-11 | 448 | 8-8945 | A
86432 | Math 21 | Calc | Jones | 1-3 | 323 | 8-2345 | B
86789 | Mgt 1 | Intr Mgt | Buel | 4-5 | 463 | 8-3436 | C
86789 | Hist 1 | Us Hist | Patch | 9-11 | 342 | 8-2378 | B
98653 | Acct 1 | Intr Acct | Ray | 9-11 | 442 | 8-4545 | B
98653 | Math 21 | Calc | Jones | 1-3 | 323 | 8-2345 | B
98653 | Mgt 1 | Intr Mgt | Buel | 4-5 | 463 | 8-3436 | C

Table 3 is split into Table 4 and Table 5.

Table 4: Student Grade (3NF)

Stdnt# | Course | Grade
86432 | Acct 315 | A
86432 | Acct 324 | A
86432 | Math 21 | B
86789 | Mgt 1 | C
86789 | Hist 1 | B
98653 | Acct 1 | B
98653 | Math 21 | B
98653 | Mgt 1 | C

Table 5: Course Instructor (2NF)

Course | Crse desc | Instr | Office hrs | Loc | Tel no
Acct 315 | Fin Acct | Ray | 9-11 | 442 | 8-4545
Acct 324 | Mgt Acct | Paul | 8-11 | 448 | 8-8945
Math 21 | Calc | Jones | 1-3 | 323 | 8-2345
Mgt 1 | Intr Mgt | Buel | 4-5 | 463 | 8-3436
Hist 1 | Us Hist | Patch | 9-11 | 342 | 8-2378
Acct 1 | Intr Acct | Ray | 9-11 | 442 | 8-4545

Table 5 is split into Table 6 and Table 7.

Table 6: Course (3NF)

Course | Crse desc | Instr
Acct 315 | Fin Acct | Ray
Acct 324 | Mgt Acct | Paul
Math 21 | Calc | Jones
Mgt 1 | Intr Mgt | Buel
Hist 1 | Us Hist | Patch
Acct 1 | Intr Acct | Ray

Table 7: Instructor (3NF)

Instr | Office hrs | Loc | Tel no
Ray | 9-11 | 442 | 8-4545
Paul | 8-11 | 448 | 8-8945
Jones | 1-3 | 323 | 8-2345
Buel | 4-5 | 463 | 8-3436
Patch | 9-11 | 342 | 8-2378

The final normalized database consists of Table 2 (Student), Table 4 (Student Grade), Table 6 (Course), and Table 7 (Instructor).
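The decomposition of the Student Enrollment data into 3NF tables can be sketched in Python. The rows are those of Table 1; the variable names and the use of sets to drop duplicate rows are illustrative choices, not part of the original example.

```python
# Rows of Table 1:
# (stdnt_no, stdnt, major, course, crse_desc, instr, office_hrs, loc, tel_no, grade)
table1 = [
    (86432, "Sethi", "Acctg", "Acct 315", "Fin Acct", "Ray", "9-11", 442, "8-4545", "A"),
    (86432, "Sethi", "Acctg", "Acct 324", "Mgt Acct", "Paul", "8-11", 448, "8-8945", "A"),
    (86432, "Sethi", "Acctg", "Math 21", "Calc", "Jones", "1-3", 323, "8-2345", "B"),
    (86789, "Archer", "Mgt", "Mgt 1", "Intr Mgt", "Buel", "4-5", 463, "8-3436", "C"),
    (86789, "Archer", "Mgt", "Hist 1", "Us Hist", "Patch", "9-11", 342, "8-2378", "B"),
    (98653, "Mills", "Acctg", "Acct 1", "Intr Acct", "Ray", "9-11", 442, "8-4545", "B"),
    (98653, "Mills", "Acctg", "Math 21", "Calc", "Jones", "1-3", 323, "8-2345", "B"),
    (98653, "Mills", "Acctg", "Mgt 1", "Intr Mgt", "Buel", "4-5", 463, "8-3436", "C"),
]

# Student (PK: stdnt_no) - attributes that depend only on the student.
student = sorted({(r[0], r[1], r[2]) for r in table1})
# Student Grade (PK: stdnt_no + course) - the grade depends on the whole key.
student_grade = [(r[0], r[3], r[9]) for r in table1]
# Course (PK: course) - description and instructor depend only on the course.
course = sorted({(r[3], r[4], r[5]) for r in table1})
# Instructor (PK: instr) - office details depend on the instructor, not the course.
instructor = sorted({(r[5], r[6], r[7], r[8]) for r in table1})

print(len(student), len(student_grade), len(course), len(instructor))  # 3 8 6 5
```

After the split, each fact is stored once: for example, Ray's telephone number appears in a single instructor row instead of once per enrolled student.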

Making a Relational Database Using Microsoft Access

Relational Database

2.6. Database control issues


Control techniques for dealing with database exposures fall into two general categories: backup controls and access controls.
Backup controls ensure that a current copy of the database exists at all times. Access controls ensure that only authorized users access the database and that those who do so perform only authorized actions.
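Both kinds of control can be illustrated with a small Python sketch. The backup part uses the backup method of Python's sqlite3 connections; the access part is a deliberately simplified authorization table, and the account table, the user names, and the function is_allowed are all hypothetical. A real DBMS enforces access rules internally rather than in application code like this.

```python
import sqlite3

# Backup control: keep a current copy of the database.
live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE account (acct_no INTEGER PRIMARY KEY, balance REAL)")
live.execute("INSERT INTO account VALUES (1, 250.0)")
live.commit()

backup_copy = sqlite3.connect(":memory:")
live.backup(backup_copy)  # copies the whole database into the backup connection

# Access control: a hypothetical table of which actions each user may perform.
AUTHORIZED = {"clerk": {"SELECT"}, "dba": {"SELECT", "UPDATE"}}

def is_allowed(user, action):
    """Return True only if the user is known and authorized for the action."""
    return action in AUTHORIZED.get(user, set())

restored = backup_copy.execute("SELECT balance FROM account").fetchall()
print(restored, is_allowed("clerk", "UPDATE"))
```

The backup copy holds the same data as the live database, and a request is refused unless the user has been explicitly authorized for that action.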
