
Chapter 4

written report

Uploaded by

Ram
Copyright
© © All Rights Reserved

WRITTEN REPORT

· Data Management Approach:

1. Flat-File Approach

- A collection of data in which the tables and records have no relationship to any other
table. (It could even be a single table.)

- Example: a “My Small Business Data” file with everything stored in it, from customers and
sales to orders and invoices. It sounds messy, but it has its uses: one does not have to
normalise the database.

- Normalising means “breaking out” the repeating values in tables and saving them in other,
related tables.

- If the database is simple, a flat-file database makes more sense.

- If you have created an Excel spreadsheet, you have created a basic flat file. A workbook
with multiple tabs makes up a flat-file database. Many values may be the same in different
worksheets, but they are not linked together.

Example:

Sheet 1 of the workbook holds the employee data.

Employee Code | First Name | Surname | Address | Town | Post Code | Salary

Sheet 2 is for the payroll.

Payroll Code | First Name | Surname | Address | Town | Post Code | Monthly Pay

Sheet 3 is for the stock account.

Stock Code | Item | Description | Supplier Name | Address | Town | Post Code

Sheet 4 is for orders from customers.

Order Code | Customer Name | Customer Address | Customer Town | Customer Postcode | Stock Code | Quantity

- From the above set-up, you can now analyse the problems you might
encounter using the flat-file database.

1. Handling is difficult.

- Inserting

- Updating

- Deleting

Example:

EMPLOYEE

Employee Code | First Name | Surname    | Address | Town   | Post Code | Salary
001           | Leni       | Robredo    | 228     | Naga   | SW1       | 120,000
002           | Ferdinand  | Marcos Jr. | 121     | Makati | HA1       | 120,000

PAYROLL

Payroll Code | First Name | Surname    | Address | Town   | Post Code | Monthly Pay
0101         | Leni       | Robredo    | 228     | Naga   | SW1       | 10,000
0102         | Ferdinand  | Marcos Jr. | 121     | Makati | HA1       | 10,000

- Notice the redundancy: the same data is duplicated in the employee worksheet and the
payroll worksheet.

- What if Ferdinand Marcos Jr. is convicted of a crime involving moral turpitude because he
did not pay the right amount of taxes? He can no longer go to work. With the flat-file
approach, removing Ferdinand Marcos Jr. from the list of employees means updating and
deleting rows in both the employee worksheet and the payroll worksheet.
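The deletion problem described above can be sketched in a few lines of Python. This is only an illustration: the two lists stand in for the two worksheets, and the helper `remove_person` is a hypothetical name, not part of any spreadsheet tool.

```python
# Two "worksheets" kept as independent lists; nothing links them.
employee_sheet = [
    {"code": "001", "first_name": "Leni", "surname": "Robredo"},
    {"code": "002", "first_name": "Ferdinand", "surname": "Marcos Jr."},
]
payroll_sheet = [
    {"code": "0101", "first_name": "Leni", "surname": "Robredo"},
    {"code": "0102", "first_name": "Ferdinand", "surname": "Marcos Jr."},
]


def remove_person(sheet, first_name, surname):
    """Return a copy of the sheet without the rows for the named person."""
    return [row for row in sheet
            if (row["first_name"], row["surname"]) != (first_name, surname)]


# Deleting from one sheet leaves a stale row behind in the other.
employee_sheet = remove_person(employee_sheet, "Ferdinand", "Marcos Jr.")
stale_rows = [row for row in payroll_sheet if row["surname"] == "Marcos Jr."]
print(len(employee_sheet), len(stale_rows))  # 1 1

# The flat-file fix: repeat the same delete by hand on every sheet.
payroll_sheet = remove_person(payroll_sheet, "Ferdinand", "Marcos Jr.")
```

Forgetting the second call is exactly the kind of inconsistency that the flat-file approach invites.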

2. Database Approach

- All the data is stored in a single repository, and multiple users can access the parts of
it that they need.

- Because each item is stored once and shared, duplication of data is kept to a minimum.

Characteristics of DBMS Approach:

1. Self-Describing Nature of a Database System

- A database system consists not only of the database itself but also of metadata that
describes it. The metadata is stored in the DBMS catalog and is used by the DBMS software
and by database users.

Example:

RELATIONS

Relation_Name | No_of_Columns
EMPLOYEE      | 7
PAYROLL       | 7
STOCK         | 7
ORDERS        | 7

COLUMNS

Col_Name      | Data_Type | Belongs_to_Relation
Name          | Character | EMPLOYEE
Employee Code | Integer   | EMPLOYEE
Address       | Integer   | EMPLOYEE
Monthly Pay   | Integer   | PAYROLL


- DBMS Software must work equally well with any number of database
applications.
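As a rough illustration of a self-describing database, the sketch below uses Python's built-in `sqlite3` module and asks SQLite for its own catalog. The table and column names are assumptions chosen to echo the example above; other DBMSs expose their catalogs differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (employee_code INTEGER PRIMARY KEY, "
             "name TEXT, address TEXT)")
conn.execute("CREATE TABLE payroll (payroll_code INTEGER PRIMARY KEY, "
             "monthly_pay INTEGER)")

# The catalog: SQLite records every table it knows about in sqlite_master.
relations = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk). Keep only name and type.
columns = {
    rel: [(col[1], col[2]) for col in conn.execute(f"PRAGMA table_info({rel})")]
    for rel in relations
}

for rel in relations:
    print(rel, len(columns[rel]), columns[rel])
```

The program never hard-codes the structure it prints; it reads the structure back from the database, which is the point of the catalog.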

2. Insulation between programs and Data, and Data Abstraction

- The structure of the data files is stored in the DBMS catalog, separately from the
programs that access the data. A change to the structure of the data files therefore does
not affect the programs. This is called “program-data independence.”

- The characteristic that allows program-data independence is called “data abstraction.”

3. Support of Multiple Views of the Data

- A database has many users, each of whom may require a different view
of the database.

- For example, one user may be interested only in an employee’s salary, while another may
be interested in the quantity of orders from customers.
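A minimal sketch of multiple views, assuming SQLite through Python's `sqlite3` module. The view names, the order row, and the customer name are hypothetical; each view exposes only the columns one kind of user cares about.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (employee_code TEXT PRIMARY KEY, surname TEXT, salary INTEGER);
CREATE TABLE orders (order_code TEXT PRIMARY KEY, customer_name TEXT, quantity INTEGER);
INSERT INTO employee VALUES ('001', 'Robredo', 120000);
INSERT INTO orders VALUES ('A1', 'Hypothetical Customer', 5);

-- One view per kind of user; each shows only what that user needs.
CREATE VIEW salary_view AS SELECT employee_code, salary FROM employee;
CREATE VIEW order_quantity_view AS SELECT order_code, quantity FROM orders;
""")

salary_rows = conn.execute("SELECT * FROM salary_view").fetchall()
order_rows = conn.execute("SELECT * FROM order_quantity_view").fetchall()
print(salary_rows, order_rows)
```

Both views read from the same stored tables, so neither user works from a private copy of the data.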

4. Sharing of Data and Multi User Transaction Processing

- Allows multiple users to access the database at the same time.

- DBMS must include concurrency control.

- Concurrency control: when multiple users share the same database at the same time, the
DBMS should prevent two users from editing the same data at the same time.

- Example: ticket reservations for a concert. While one agent is assigning a particular
seat to a customer, that seat should be blocked so that no other agent can access it or
book it for another customer.
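The seat-booking example can be sketched with a lock from Python's `threading` module. This is a simplified stand-in for what a DBMS does internally, not how any particular DBMS implements concurrency control; the seat label, agent names, and customer names are made up.

```python
import threading

seats = {"A1": None}          # seat -> customer holding it (None means free)
seat_lock = threading.Lock()  # guards every check-then-assign on `seats`
results = {}                  # agent -> True if that agent's booking succeeded


def book(agent, seat, customer):
    # Only one agent at a time may check and assign a seat.
    with seat_lock:
        if seats[seat] is None:
            seats[seat] = customer
            results[agent] = True
        else:
            results[agent] = False


threads = [
    threading.Thread(target=book, args=(f"agent{i}", "A1", f"customer{i}"))
    for i in range(5)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

successful = [agent for agent, ok in results.items() if ok]
print(len(successful))  # 1
```

Whichever agent reaches the lock first gets the seat; the other four find it already taken.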

Key Elements of the Database Environment

Topic Outline
● Database Management System
● Users
● The Database Administrator
● The Physical Database
● DBMS Models

Database Management System


● A DBMS is software used to manage a database. It provides an interface for carrying out a
variety of tasks, including building databases, creating tables, putting data in them, and
updating that data. It also provides database security and safety, and it preserves data
consistency when there are many users.

Features of DBMS

Program Development
● The DBMS includes application development software, which both programmers and end users
can use to build applications that access the database. Program development is the
process of gathering and assessing real-world requirements, designing the system's data
and features, and then putting those designs into practice.

Backup and recovery


● During processing, the DBMS periodically makes backup copies of the physical database.
If a disaster (disk failure, software error, or malicious act) makes the database
unusable, the DBMS can restore an earlier version that is known to be correct. Some
data loss may occur, but without the backup and recovery feature the database would
be at risk of total destruction. The purpose of this feature is both to safeguard the
database against data loss and to rebuild it after a loss. Typical backup
administration tasks include developing and testing recovery solutions for various
types of failure.

Database usage reporting


● This feature records statistics on what data is used, when it is used, and by whom. The
DBA uses this information to help assign user authorisations and to maintain the
database. User administration covers preserving data security, managing concurrency,
registering and monitoring users, maintaining data integrity, tracking performance,
and restoring data that has been compromised by unexpected failure.

Database Access
● The most important feature of a DBMS is that it gives authorised users access, both
formal and informal, to the database. Through this software every kind of data can
be created, stored, managed, manipulated, retrieved, and updated. Supporting an
accounting system is one of the primary uses of a DBMS.

Three software modules facilitate database access: the data definition language, the data
manipulation language, and the query language.

Data Definition Language


● A data definition language (DDL) is a programming language used to define the database to
the DBMS. The DDL identifies the names of, and the relationships among, all the data
elements, records, and files that make up the database. DDL for database objects can be
generated in a script, for example to keep a copy of the database structure or to create a
test system whose database behaves like the production system but contains no data. The
definition has three levels, called views: the physical internal view, the conceptual view
(schema), and the user view (subschema). By working with descriptions of the database
schema, the DDL is used to construct and modify the structure of objects in a database.
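The DDL uses mentioned above can be sketched with SQLite through Python's `sqlite3` module: define a structure, alter it, then replay the stored definition into an empty test database. The table layout is an assumption for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL statements define structure; they do not add any rows.
conn.execute("""
    CREATE TABLE employee (
        employee_code TEXT PRIMARY KEY,
        first_name    TEXT NOT NULL,
        surname       TEXT NOT NULL
    )""")
conn.execute("ALTER TABLE employee ADD COLUMN salary INTEGER")

# SQLite keeps the DDL text in its catalog, so the structure can be
# scripted out and replayed elsewhere.
ddl_script = conn.execute(
    "SELECT sql FROM sqlite_master WHERE name = 'employee'").fetchone()[0]

# A test database with the same structure as "production" but no data.
test_db = sqlite3.connect(":memory:")
test_db.execute(ddl_script)
test_columns = [col[1] for col in test_db.execute("PRAGMA table_info(employee)")]
row_count = test_db.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(test_columns, row_count)
```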

Database Views or DBMS Schemas


Internal View/Physical View ( The Database Level )
● The internal view presents the physical arrangement of the records in the database.
It is the lowest level of representation and sits one step above the physical
database itself. The internal view describes the structures of data records, the
linkages between files, and the internal organization of the files. A database has
only one internal view. The internal schema specifies how the database's physical
storage is organized and gives a very low-level representation of the complete
database. It consists of many occurrences of the various types of internal record,
which ANSI terminology calls "stored records."

Conceptual View/Logical View ( The Server Level )


● The schema (or conceptual view) describes the entire database. Rather than showing
how the database is physically stored, this view represents it logically and
abstractly. For the user community, the conceptual schema represents the overall
structure of the database. The schema hides the details of the physical storage
structures and concentrates on describing data types, entities, relationships, and
so on. This logical level sits between the user level and the physical storage view.
A database has only one conceptual view.

External View/User View ( The User/Client Level )


● The user view, or subschema, defines the section of the database that a particular
user is authorised to access. To that user, the user view is the database. Unlike the
internal and conceptual views, there may be many distinct user views. For example, a
user in the personnel department may view the database as a collection of employee
records, whereas the inventory control clerk sees only supplier and related
inventory records through his user view.
● An external schema describes the portion of the database that a particular user is
interested in and hides the rest of the database from that user. Any number of
external views may exist for a given database. Each external view is specified by an
external schema, which contains descriptions of the various types of external record
in that view. An external view is simply the content of the database as it appears to
a particular user. A person from the sales department, for instance, will see only
data relating to sales.

Users
● A user is anyone who accesses or uses the database.

Two types of Access

Formal Access/ Application Interfaces


● Formal access takes place through authorised application interfaces. User applications
developed by systems specialists send data access requests to the DBMS, which validates the
requests and retrieves the data for processing. With this mode of access, the users are
unaware of the DBMS's existence. For transactions such as sales, cash receipts, and purchases,
data processing operations (both batch and real time) are essentially the same as they would
be in a flat-file system.

Data Manipulation Language


● A data manipulation language (DML) is the proprietary programming language that a particular
DBMS uses to retrieve, process, and store data. It is used to add (insert), remove (delete),
and alter (update) data in a database. A DML is frequently a sublanguage of a more general
database language such as SQL and contains some of that language's operators. Entire user
programs can be written in the DML; alternatively, selected DML commands can be inserted into
programs written in older languages such as COBOL and FORTRAN or in universal languages such
as Java and C++. Standard programs originally written for the flat-file environment can
therefore be converted to work in a database environment by inserting DML commands. The use
of standard-language programs also gives the company a degree of independence from the DBMS
vendor. If the company switches to a vendor whose DBMS uses a different DML, it does not need
to rewrite all its user programs; they can be made to work in the new environment by
replacing the old DML commands with the new ones.
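As a sketch of DML commands embedded in a general-purpose host language, the fragment below issues SQL insert, update, and delete statements from Python using the built-in `sqlite3` module. SQLite and the updated salary figure are assumptions for illustration; the report does not name a specific DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (employee_code TEXT PRIMARY KEY, "
             "surname TEXT, salary INTEGER)")

# Add (insert) two records.
conn.execute("INSERT INTO employee VALUES (?, ?, ?)", ("001", "Robredo", 120000))
conn.execute("INSERT INTO employee VALUES (?, ?, ?)", ("002", "Marcos Jr.", 120000))

# Alter (update) one record; 130000 is a made-up value.
conn.execute("UPDATE employee SET salary = ? WHERE employee_code = ?",
             (130000, "001"))

# Remove (delete) one record.
conn.execute("DELETE FROM employee WHERE employee_code = ?", ("002",))

remaining = conn.execute(
    "SELECT employee_code, surname, salary FROM employee").fetchall()
print(remaining)  # [('001', 'Robredo', 130000)]
```

The surrounding program is ordinary Python; only the strings passed to `execute` are DML, which is why swapping vendors mainly means swapping those strings.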

(Figure 4.3)
DBMS Operation
1. A user program sends a request for data to the DBMS. The requests are written in a
special DML that is embedded in the user program.
2. The DBMS analyses the request by matching the called data elements against the user view
and the conceptual view. If the data request matches, it is authorised, and processing proceeds
to Step 3. If it does not match the views, access is denied.
3. The DBMS determines the data structure parameters from the internal view and passes them
to the operating system, which performs the actual data retrieval. Data structure parameters
describe the organisation and access method for retrieving the requested data.
4. Using the appropriate access method (an operating system utility program), the operating
system interacts with the disk storage device to retrieve the data from the physical database.
5. The operating system then stores the data in a main memory buffer area managed by the
DBMS.
6. The DBMS transfers the data to the user’s work location in main memory. At this point, the
user’s program is free to access and manipulate the data.
7. When processing is complete, Steps 4, 5, and 6 are reversed to restore the processed data to
the database.
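The first six steps can be mimicked with a toy model in Python. Everything here is an assumption made for illustration: the views are plain sets of element names, the physical database is a list, and `handle_request` is a hypothetical function, not a real DBMS interface.

```python
# Conceptual view: every data element the database defines.
CONCEPTUAL_VIEW = {"employee_code", "surname", "salary", "monthly_pay"}

# User views (subschemas): the elements each user may access.
USER_VIEWS = {"payroll_clerk": {"employee_code", "monthly_pay"}}

# Stand-in for the physical database on disk.
PHYSICAL_DB = [
    {"employee_code": "001", "surname": "Robredo",
     "salary": 120000, "monthly_pay": 10000},
]


def handle_request(user, elements):
    """Steps 1 to 6 in miniature for one request from a user program."""
    requested = set(elements)
    # Step 2: match the request against the user view and the conceptual view.
    allowed = USER_VIEWS.get(user, set())
    if not (requested <= allowed and requested <= CONCEPTUAL_VIEW):
        raise PermissionError("access denied")
    # Steps 3 to 5: retrieve the records into a buffer managed by the DBMS.
    buffer = [dict(record) for record in PHYSICAL_DB]
    # Step 6: pass only the requested elements to the user's work area.
    return [{name: record[name] for name in elements} for record in buffer]


granted = handle_request("payroll_clerk", ["employee_code", "monthly_pay"])

try:
    handle_request("payroll_clerk", ["salary"])
    denied = False
except PermissionError:
    denied = True

print(granted, denied)
```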

Informal Access/ Query Language


● The second way to access the database is the informal method of queries. A query is an ad
hoc access method for extracting data from a database. A "query language" (QL) is any
programming language that sends queries to databases and information systems in order to find
and retrieve data. It works on structured, formal commands entered by the user to locate and
extract data from host databases. The DBMS's built-in query facility lets users access data
directly, without the need for conventional application programs. This feature offers a
"friendly" environment for integrating and retrieving data to produce ad hoc management
reports, so authorised users can process data independently of professional programmers.

Structured Query Language


● Structured Query Language (SQL) is a programming language used to store and process data in
a relational database and to communicate with that database. A relational database stores
data in tabular form, with rows and columns representing data attributes and the relationships
between the values of those attributes. IBM's Structured Query Language (often pronounced
"sequel" or S-Q-L) has become the standard query language for DBMSs on both mainframes and
microcomputers. SQL is a fourth-generation, nonprocedural language with many commands that let
users enter, retrieve, and alter data easily. The SELECT command is a powerful tool for
retrieving data.

( figure 4.5 )

● This example demonstrates how effective SQL is as a data processing tool. Although SQL is
not plain English, it requires far less programming knowledge and training than
third-generation languages. In fact, some query tools require no knowledge of SQL at all:
users select data visually by "pointing and clicking" on the desired attributes, and the
visual interface then generates the necessary SQL commands automatically. The great advantage
of the query feature is that it places ad hoc reporting and data processing in the hands of
the manager or end user. Reducing the manager's reliance on professional programmers improves
the manager's ability to respond quickly to emerging problems. The query feature does,
however, create a significant control issue: management must ensure that it is not used to
gain unauthorised access to the database.
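A SELECT query of the kind discussed above might look like the following sketch, run here against SQLite from Python. It is not the query shown in the report's figure; the table and its rows are borrowed from the earlier employee example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (employee_code TEXT, surname TEXT, "
             "town TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)", [
    ("001", "Robredo", "Naga", 120000),
    ("002", "Marcos Jr.", "Makati", 120000),
])

# SELECT names the attributes wanted, FROM names the table, and WHERE states
# the condition. The user says what is wanted, not how to find it.
query = "SELECT surname, salary FROM employee WHERE town = ? ORDER BY surname"
report = conn.execute(query, ("Naga",)).fetchall()
print(report)  # [('Robredo', 120000)]
```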

The Database Administrator


● The database administrator (DBA) is in charge of administering the database resource. When
several users share a common database, organization, collaboration, rules, and guidelines are
needed to protect the database's integrity. The DBA is the person responsible for managing
and backing up the data produced and consumed by today's enterprises through their IT systems,
and for ensuring that it remains available. The DBA plays a key role in many of today's IT
departments, and thus in their corporations as a whole.

Data Dictionary
● A data dictionary is a collection of names, definitions, and attributes of the data elements
that are used or captured in a database or information system. Every data element in the
database is described in the data dictionary. This gives all users (and programmers) a common
view of the data resource, which greatly facilitates the analysis of user needs. The data
dictionary may be kept in hard copy or in digital form.

(Table 4.1)
● In large corporations, the DBA function may be supported by an entire department of
technical personnel. DBA roles may be assumed by someone in the computer services group
in smaller organisations. The DBA's responsibilities include database planning, database
design, database installation, operation, and maintenance, and database growth and change.

(figure 4.6)
● This figure depicts some of the DBA's organisational interfaces; particularly significant is the
relationship that exists between the DBA, end users, and system specialists of the
organisation.

The Physical Database


● The physical database holds the data files and the access paths used to retrieve data from
them. Physical database design is the process of translating a data model into the physical
data structures of a database management system.
● This is the lowest level of the database and the only level that exists in tangible form. At
this level the database consists of magnetic spots on metallic-coated disks, which form a
logical collection of files and records. This section discusses the data structures used in
the physical database.

(table 4.2)
● This table lists the file processing activities that data structures must support. The
efficiency with which the DBMS performs these tasks is a major determinant of its overall
success, and it depends heavily on how a particular file is structured.

Data Structures
● Data structures are the foundation of a database. A data structure is a specialised format
for organising, processing, accessing, and storing data. It allows records to be located,
stored, and retrieved, and makes it possible to move from one record to another. Data
structures come in both simple and complex forms, each designed to organise data for a
particular purpose, so that users can easily reach the data they need and use it
appropriately. A data structure has two main components: (1) organisation and (2) access
method.

No single structure is best for all processing tasks, so choosing one involves a trade-off
between desirable features. The following criteria influence the selection of a data structure:

1. Rapid file access and data retrieval


2. Efficient use of disk storage space
3. High throughput for transaction processing
4. Protection from data loss
5. Ease of recovery from system failure
6. Accommodation of file growth

Data Organization
● The organization of a file is the physical arrangement of its records on the storage device.
It may be either sequential or random. Records in sequential files are stored in contiguous
locations that occupy a specified area of disk space. Records in random files are stored
without regard for their physical relationship to one another and may be scattered
around the disk.
● Data organisation is also the process of categorising and classifying data to make it more
usable. Data should be arranged in the most logical and orderly way possible, much as
important documents are organised in file folders, so that anyone who accesses it can
quickly find what they are looking for.
Data Access Methods
● Access methods are operating system-integrated computer programs that are used to search
for records and browse the database. In response to requests for data from the user's
application, the access method program finds and retrieves or stores the records during
database processing. The user is fully unaware of any tasks performed by the access method.

DBMS Models
● A data model is an abstract representation of the data about entities of interest to an
organisation. These entities include resources (assets), events (transactions), and agents
(personnel, customers, and so on), together with the relationships among them. The purpose of
a data model is to represent entities and their attributes in a form that users can understand.
Each DBMS is based on a particular conceptual model.

Database Terminology

Entity - anything about which the organization wants to capture data. Entities may be physical,
such as customers, employees, or inventories. They may also be conceptual, such as sales (to a
customer), accounts receivable (AR), or accounts payable (AP).

Record Type - the physical database representation of an entity. Database designers group the
record types that pertain to a particular entity into tables (files). For instance, the Sales
Order record type, which physically represents the Sales Order entity, consists of the records
of sales to customers.

Occurrence - refers to the number of records represented by a specific record type. For instance,
if an organization has 100 employees, the Employee entity (record type) is said to consist of 100
occurrences.

Attributes - pieces of data that describe an entity. For example, in a customer database, the
attributes might be name, address, and phone number. In a product database, the attributes might
be name, price, and date of manufacture.

Database - the collection of record types that an organization needs to support its business
processes. Some businesses use a distributed database approach and build separate databases for
each of their main functional areas. Such a company might have different databases for production,
marketing, and accounting.

Associations - The record types that make up a database exist in relation to one another. An
association defines a relationship between two entities based on common attributes. The
relationship can be one-to-one, one-to-many, or many-to-many. The association allows one entity to
reach the data of another through a persistent reference.
One-to-one
● For every occurrence of Record Type X, there is one occurrence of Record Type Y (or possibly
zero). For instance, for every occurrence of Employee in the Employee table, there is only one
occurrence of Year-to-Date Earnings (or zero for a new employee).
One-to-many
● For every occurrence of Record Type X, there are zero, one, or many occurrences of Record
Type Y. For instance, for every occurrence (customer) in the customer table, there are zero,
one, or many sales orders in the sales order table. In other words, a specific customer may
have made zero, one, or many purchases from the business during the period under consideration.
Many-to-many
● For each occurrence of Record Type X, there are zero, one, or many occurrences of Record
Type Y, and vice versa. The M:M association is illustrated by the business relationship between
an organization's inventory and its suppliers. A given supplier provides the business with
zero, one, or many inventory items (zero when the supplier is listed in the database but the
business does not currently purchase from it). Similarly, the business may purchase a given
inventory item from zero, one, or many different suppliers (zero when, for example, the
company manufactures the item in-house).
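One common way to realise these associations in a relational DBMS is sketched below with SQLite through Python's `sqlite3` module. The table names, key names, and sample rows are hypothetical: a foreign key in the "many" table expresses 1:M, and a separate link table expresses M:M.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (cust_no INTEGER PRIMARY KEY, name TEXT);
-- 1:M: each sales order carries the key of its one customer.
CREATE TABLE sales_order (order_no INTEGER PRIMARY KEY,
                          cust_no  INTEGER REFERENCES customer(cust_no));

CREATE TABLE inventory (item_no INTEGER PRIMARY KEY);
CREATE TABLE supplier (supplier_no INTEGER PRIMARY KEY);
-- M:M: a link table pairs inventory items with suppliers.
CREATE TABLE item_supplier (item_no     INTEGER REFERENCES inventory(item_no),
                            supplier_no INTEGER REFERENCES supplier(supplier_no),
                            PRIMARY KEY (item_no, supplier_no));

INSERT INTO customer VALUES (1, 'Customer A'), (2, 'Customer B');
INSERT INTO sales_order VALUES (10, 1), (11, 1);
INSERT INTO inventory VALUES (100), (101);
INSERT INTO supplier VALUES (7), (8);
INSERT INTO item_supplier VALUES (100, 7), (100, 8), (101, 7);
""")

# Customer 1 has many orders; customer 2 has zero.
orders_per_customer = dict(conn.execute("""
    SELECT c.cust_no, COUNT(o.order_no)
    FROM customer c LEFT JOIN sales_order o ON o.cust_no = c.cust_no
    GROUP BY c.cust_no"""))

# Item 100 comes from many suppliers; supplier 7 provides many items.
suppliers_of_100 = [row[0] for row in conn.execute(
    "SELECT supplier_no FROM item_supplier WHERE item_no = 100 ORDER BY supplier_no")]
items_from_7 = [row[0] for row in conn.execute(
    "SELECT item_no FROM item_supplier WHERE supplier_no = 7 ORDER BY item_no")]
print(orders_per_customer, suppliers_of_100, items_from_7)
```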

Three common models: hierarchical, network, and relational

The Hierarchical Model


● The earliest database management systems were based on the hierarchical model. In this
model the data is organized in a hierarchical tree structure. The hierarchy starts at the root,
which holds the root data, and grows into a tree as child nodes are added to parent nodes. This
type of data representation was widely used because it reasonably reflected the many aspects of
an organization that are hierarchical in nature. The best-known example of a hierarchical
database is IBM's Information Management System (IMS). More than 40 years after its 1968 debut,
it was still a widely used database design.

(figure 4.8)
● This is an illustration of a hierarchical database's data structure. The hierarchical model
is built from sets that describe the relationship between two linked files. Each set contains a
parent and a child. Note that File B, at the second level, is both the child in one set and the
parent in another. Files at the same level that share the same parent are called siblings. This
design is also known as a tree structure. The highest level in the tree is the root segment,
and the lowest file on a particular branch is called a leaf.

Navigational Database
● The hierarchical model is also called a navigational database because traversing the files
requires following a predetermined path. The path is established by explicit pointers (links)
between related records. The only way to reach data at lower levels of the tree is from the
root, by way of the pointers leading down to the relevant records. For example (figure 4.9),
to retrieve an invoice line item record, the DBMS must first access the customer record (the
root). That record contains a pointer to the sales invoice record, which in turn points to the
invoice line item record. The steps in this process are examined in more detail below.

Data Integration in the Hierarchical Model (figure 4.10)


● Figure 4.10 shows the detailed file structures for the partial database in Figure 4.9. The
data content of the records has been simplified because the purpose of the example is to
demonstrate the navigational nature of the model.
● Suppose a user wants to retrieve, for inquiry purposes, data on a particular sales invoice
(Number 1921) for customer John Smith (Account Number 1875). The user enters the primary key
(Cust #1875) into the query program, which searches the customer file for a matching key value.
When the key matches, it directly accesses John Smith's record. Notice that the customer record
contains only summary data. The current balance shows the total amount owed by John Smith
($1,820). This is the difference between the total sales to this customer and the total cash
received in payment on account. The supporting details of these transactions are held in the
lower-level sales invoice and cash receipts records.
● When the user selects the menu item "List Invoices," the query program reads the pointer
value stored in the customer record, which directs it to the specific location (the disk
address) of the first invoice for customer John Smith. The invoice records are kept as a linked
list, with each record containing a pointer to the next one in the list. The program follows
each pointer and retrieves each record in the list.
● The sales invoice records contain only summary information about the sales. Additional
pointers in these records give the locations of the supporting detail records (the specific
items sold) in the invoice line item file. The application then prompts the user to enter the
key value sought (Invoice Number 1921) or to select it from a menu. On receiving this input,
the program reads the pointer to the first line item record. Starting with the head (first)
record, the application retrieves the complete list of line items for Invoice Number 1921. In
this example, only two records are associated with the invoice: item numbers 9215 and 3914. The
sales invoice and line item details are then displayed on the user's computer screen.
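The pointer-following retrieval described above can be sketched in Python. The "disk" is just a dictionary from made-up addresses to records, and invoice 1918 is a hypothetical extra invoice added so that the linked list has more than one entry; only the customer number, invoice 1921, and the two item numbers come from the example.

```python
# A toy "disk": address -> record. All addresses are hypothetical.
disk = {
    10: {"cust_no": 1875, "name": "John Smith", "first_invoice": 20},
    20: {"invoice_no": 1918, "next_invoice": 21, "first_item": None},
    21: {"invoice_no": 1921, "next_invoice": None, "first_item": 30},
    30: {"item_no": 9215, "next_item": 31},
    31: {"item_no": 3914, "next_item": None},
}
customer_index = {1875: 10}  # primary key -> address of the root record


def follow(address, next_field):
    """Yield each record of a linked list, starting at `address`."""
    while address is not None:
        record = disk[address]
        yield record
        address = record[next_field]


def line_items(cust_no, invoice_no):
    # Access always starts at the root (the customer record).
    customer = disk[customer_index[cust_no]]
    for invoice in follow(customer["first_invoice"], "next_invoice"):
        if invoice["invoice_no"] == invoice_no:
            return [item["item_no"]
                    for item in follow(invoice["first_item"], "next_item")]
    return []


items = line_items(1875, 1921)
print(items)  # [9215, 3914]
```

There is no way to reach a line item except by walking down from the customer record, which is what makes the model navigational.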

Limitations of the Hierarchical Model


● Lack of flexibility: one of the main drawbacks of hierarchical databases is that they are
less flexible than other database models. They are poorly equipped to handle complex data
relationships or changes in data structures.
● Because of the tree-like structure, the same data must often be stored repeatedly in
separate entities.
● Searching is sequential, so the database management system must traverse the model from
the top down until the required data is found.
● The hierarchical model presents an artificially constrained view of data relationships. It
is based on the assumption that all business relationships are hierarchical (or can be
represented as such), so it does not always reflect reality. The following rules show the
model's operating constraints:
1. A parent record may have one or more child records. For example, in Figure 4.9, customer is the
parent of both sales invoice and cash receipts.
2. No child record can have more than one parent.

The second rule is often restrictive and limits the usefulness of the hierarchical model. Many
businesses need a view of data associations that permits multiple parents, as in the following
example (figure 4.11).
Natural Relationship
● In this instance, the customer file and the salesperson file are the two natural parents of
the sales invoice file. A specific sales order results from both the customer's purchase event
and the salesperson's selling event. If management wants to relate sales activity to both
customer service and the evaluation of staff performance, it must be able to treat sales order
records as the logical child of both parents. Although this relationship is logical, it
violates the hierarchical model's single-parent rule. Because complex relationships cannot be
represented, data integration is restricted.
Hierarchical Representation with Data Redundancy
● This representation demonstrates the most common method of overcoming the problem. By
duplicating the sales invoice file and the associated line item file, we produce two separate
hierarchical representations. Unfortunately, this improved functionality comes at the cost of
increased data redundancy. The network model, which we look at next, deals with this problem
more efficiently.

Network Model
● The network model is a database model used in database management systems to represent
many-to-many relationships between records. Although its structure resembles that of the
hierarchical model, it differs from the hierarchical database model in that a member (child)
record may have several parents.
● The Conference on Data Systems Languages (CODASYL) formed a database task group to develop
standards for database design, and the network model for databases grew out of that work. The
best-known example of the network model is the Integrated Database Management System (IDMS),
which Cullinane (later Cullinet Software) introduced to the commercial market in the 1970s.
Although the model has undergone many changes over the years, it is still in use today.

(figure 4.12)
● The network model is a
navigable database with clear
linkages between records and
files, similar to the hierarchical
model. The network model differs
in that it allows a child record to
have several parents.
● For example, Salesperson Number 1 and Customer Number 5 are both parents of Invoice
Number 1. Pointer fields in both parent records explicitly define the path to the invoice
(child) record. The invoice record in turn has two links to related (sibling) records. The
first is the salesperson (SP) link to Invoice Number 2.
● Invoice Number 2 resulted from a sale by Salesperson Number 1 to Customer Number 6. The
second pointer is the customer (C) link to Invoice Number 3. Invoice Number 3 is the second
sale to Customer Number 5, processed this time by Salesperson Number 2. With this data
structure, management can track and report sales information for both customers and
salespeople.
● By supplying the proper primary key (SP # or Cust #), the structure can be accessed at either
of the root-level records (salesperson or customer). Beyond this, the access procedure
resembles the hierarchical model's description.
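
The pointer structure described above can be sketched in a few lines of Python. This is only an illustration: the class and field names (Owner, Invoice, next_for_salesperson, next_for_customer) are assumptions made for the sketch, and a real network DBMS stores physical addresses rather than Python references.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Invoice:
    number: int
    # Sibling pointers: one chain per parent type (names are hypothetical).
    next_for_salesperson: Optional["Invoice"] = None  # SP link
    next_for_customer: Optional["Invoice"] = None     # C link


@dataclass
class Owner:
    """A root-level record (salesperson or customer) pointing at its first invoice."""
    key: int
    first_invoice: Optional[Invoice] = None


def invoices_of(owner, link):
    """Navigate from a parent record along one chain of sibling pointers."""
    found = []
    current = owner.first_invoice
    while current is not None:
        found.append(current.number)
        current = getattr(current, link)
    return found


inv1, inv2, inv3 = Invoice(1), Invoice(2), Invoice(3)
inv1.next_for_salesperson = inv2  # Salesperson 1: Invoice 1, then Invoice 2
inv1.next_for_customer = inv3     # Customer 5: Invoice 1, then Invoice 3

salesperson_1 = Owner(1, inv1)
customer_5 = Owner(5, inv1)

by_salesperson = invoices_of(salesperson_1, "next_for_salesperson")
by_customer = invoices_of(customer_5, "next_for_customer")
```

Invoice 1 is reachable from either root record, which is what "a child with several parents" means in practice.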

Relational Model
● The relational model is an abstract model used to organize and manage the data stored in a
database. It stores information in two-dimensional, interrelated tables (relations), where
each row corresponds to an entity and each column to one of its attributes.
● E. F. Codd first proposed the principles of the relational model in the late 1960s. The
formal model rests on relational algebra and set theory, which provide the theoretical basis
for most of the data manipulation operations used. The most obvious difference between the
relational model and the navigational models is the way data associations are presented to
the user: the relational model represents data in two-dimensional tables.

(figure 4.13)
● Figure 4.13 shows an example of a database table called Customer.
● Columns of data attributes (data fields) run across the top of the table. The rows that
intersect the columns are tuples. A tuple is a normalized array of data that is similar, but
not precisely equivalent, to a record in a flat-file system.
Four characteristics of a properly designed table
1. All occurrences at the intersection of a row and a column are a single value. No multiple
values (repeating groups) are allowed.
2. The attribute values in any column must all be of the same class.
3. Each column in a given table must be uniquely named.
4. Each row in the table must be unique in at least one attribute. This attribute is the primary key.

● Tables must be normalized. Each attribute in a row should be dependent on (and uniquely
defined by) the primary key and independent of the other attributes. The previous section
showed how navigational databases create associations between records by using explicit
linkages (pointers). In the relational model, the linkages are implicit.
Difference between the hierarchical and relational models
● To illustrate this distinction, compare the file structures of the relational tables in Figure 4.14
with those of the hierarchical example in Figure 4.10. The conceptual relationship between
files is the same, but note the absence of explicit pointers in the relational tables.
● A relation between two tables is formed by an attribute that is common to both. For
instance, the Cash Receipts and Sales Invoice tables both contain an embedded foreign key
that corresponds to the Customer table's primary key (Cust #). Similarly, the Line Item table
contains a foreign key (Invoice #) that corresponds to the primary key of the Sales Invoice
table. Invoice # and Item # together make up the Line Item table's composite primary key.
Although both values are required to uniquely identify each record in the table, only the
invoice number portion of the key provides the logical link to the Sales Invoice table.
● Links between records in related tables are established by logical operations of the DBMS
rather than by explicit addresses built into the database. For instance, if a user wanted to
see all the invoices for Customer 1875, the system would search the Sales Invoice table for
records with a foreign key value of 1875. Figure 4.14 shows that there is only one, Invoice
1921. To find the line item data for this invoice, the system searches the Line Item table
for records with a foreign key value of 1921. Two records are retrieved.
● The rule for assigning foreign keys depends on the degree of association (cardinality)
between the two tables. In a one-to-one relationship, either table's primary key may be
embedded as a foreign key in the other. In a one-to-many relationship, the primary key on the
"one" side is embedded as the foreign key on the "many" side. One customer, for instance, can
have many invoice and cash receipt records. As a result, Cust # is embedded in the records of
the Cash Receipts and Sales Invoice tables.
● The Sales Invoice and Line Item tables have a one-to-many relationship and are linked in the
same way. Many-to-many relationships are not linked through embedded foreign keys. Instead, a
separate link table must be created that contains the keys of the related tables.
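
The foreign-key rules above can be tried out with Python's built-in sqlite3 module. The sketch below is an assumption-laden illustration, not the layout of Figure 4.14: the table and column names, the customer name, the item numbers 101 and 102, and the inventory/vendor pair used for the link table are all made up. Only Customer 1875 and Invoice 1921 come from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    cust_num INTEGER PRIMARY KEY,
    name     TEXT                       -- placeholder attribute
);
CREATE TABLE sales_invoice (
    invoice_num INTEGER PRIMARY KEY,
    cust_num    INTEGER REFERENCES customer(cust_num)  -- key of the "one" side
);
CREATE TABLE line_item (
    invoice_num INTEGER REFERENCES sales_invoice(invoice_num),
    item_num    INTEGER,
    PRIMARY KEY (invoice_num, item_num)  -- composite primary key
);
-- Hypothetical many-to-many pair resolved through a separate link table.
CREATE TABLE inventory (item_num INTEGER PRIMARY KEY);
CREATE TABLE vendor (vendor_num INTEGER PRIMARY KEY);
CREATE TABLE inventory_vendor (
    item_num   INTEGER REFERENCES inventory(item_num),
    vendor_num INTEGER REFERENCES vendor(vendor_num),
    PRIMARY KEY (item_num, vendor_num)
);
INSERT INTO customer VALUES (1875, 'Customer A');
INSERT INTO sales_invoice VALUES (1921, 1875);
INSERT INTO line_item VALUES (1921, 101), (1921, 102);
""")

# Step 1: find the invoices whose foreign key matches Customer 1875.
invoices = [row[0] for row in conn.execute(
    "SELECT invoice_num FROM sales_invoice WHERE cust_num = ?", (1875,))]

# Step 2: find the line items whose foreign key matches that invoice.
items = [row[0] for row in conn.execute(
    "SELECT item_num FROM line_item WHERE invoice_num = ? ORDER BY item_num",
    (invoices[0],))]
```

No pointer is stored anywhere. Both lookups are value matches carried out by the DBMS, which is the sense in which relational links are implicit.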

Databases in a Distributed Environment


Centralized Databases

Remote IT units send requests for data to the central site, which processes the requests and transmits
the data back to the requesting IT unit. The actual processing of the data is performed at the remote IT
unit. The central site performs the functions of a file manager that services the data needs of the
remote sites.

Data Currency in a DDP Environment


During data processing, account balances pass through a state of temporary inconsistency in
which their values are incorrectly stated. This occurs during the execution of a transaction.
To illustrate, consider the computer logic for recording a credit sale of $2,000 to customer Jones.

Database Lockout
A database lockout is a software control (usually a function of the DBMS) that prevents
multiple simultaneous accesses to data. The previous example can be used to illustrate this
technique.
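
As a rough sketch of the idea, the Python fragment below records the $2,000 credit sale to Jones under a lock, so that no reader can observe the moment when one account has been updated and the other has not. The account names and opening balances are assumed for illustration, and threading.Lock stands in for the lockout facility of a real DBMS.

```python
import threading

# Hypothetical opening balances; only the $2,000 sale comes from the text.
balances = {"AR-Sub (Jones)": 1500, "AR-Control": 10000}
lockout = threading.Lock()
observed = []


def record_credit_sale(amount):
    """Update both accounts while holding the lock."""
    with lockout:
        balances["AR-Sub (Jones)"] += amount
        # At this point the two accounts disagree (temporary inconsistency).
        balances["AR-Control"] += amount


def read_balances():
    """A second user's query; it must wait until the lock is released."""
    with lockout:
        observed.append(dict(balances))


writer = threading.Thread(target=record_credit_sale, args=(2000,))
reader = threading.Thread(target=read_balances)
writer.start()
reader.start()
writer.join()
reader.join()
```

Whichever thread runs first, the reader sees either both old balances or both new ones, never a mixture.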

Distributed Databases

Distributed databases can be either partitioned or replicated. We examine both approaches in the
following pages.

Partitioned Databases
The partitioned database approach splits the central database into segments or partitions that are
distributed to their primary users.

Advantages of Partitioned Databases


* Having data stored at local sites increases users’ control.
* Transaction processing response time is improved by permitting local access to data
and reducing the volume of data that must be transmitted between IT units.
* Partitioned databases can reduce the potential effects of a disaster. Because data are located at
several sites, the loss of a single IT unit does not halt all of the organization's data processing.

The Deadlock Phenomenon

In a distributed environment, it is possible for multiple sites to lock out each other from the database,
thus preventing each from processing its transactions. A deadlock occurs here because there is mutual
exclusion to the data resource, and the transactions are in a “wait” state until the locks are removed.
This can result in transactions being incompletely processed and the database being corrupted. A
deadlock is a permanent condition that must be resolved by special software that analyzes each
deadlock condition to determine the best solution. Because of the implication for transaction
processing, accountants should be aware of the issues pertaining to deadlock resolutions.
Deadlock Resolution

Resolving a deadlock usually involves terminating one or more transactions to complete processing of
the other transactions in the deadlock. In preempting transactions, the deadlock resolution software
attempts to minimize the total cost of breaking the deadlock. Some of the factors that are considered
in this decision follow:
* The resources currently invested in the transaction. This may be measured by the number of updates
that the transaction has already performed and that must be repeated if the transaction is terminated.
* The transaction’s stage of completion. In general, deadlock resolution software will avoid
terminating transactions that are close to completion.
* The number of deadlocks associated with the transaction. Because terminating the transaction
breaks all deadlock involvement, the software should attempt to terminate transactions that are part
of more than one deadlock.
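
One way to picture deadlock resolution is as two steps: find a cycle of transactions waiting on each other, then pick the cheapest one to terminate. The sketch below assumes a simplified wait-for graph in which each transaction waits on at most one other. The transaction names, the statistics, and the way the three factors are ranked are hypothetical; real resolution software weighs them in its own way.

```python
def find_cycle(waits_for):
    """Return one cycle in a wait-for graph as a list of transactions, or []."""
    for start in waits_for:
        path = [start]
        current = waits_for.get(start)
        while current is not None and current not in path:
            path.append(current)
            current = waits_for.get(current)
        if current is not None:
            # 'current' was reached twice, so the cycle starts there.
            return path[path.index(current):]
    return []


def choose_victim(cycle, stats):
    """Pick the transaction to terminate (illustrative ranking only).

    Prefer a transaction involved in more deadlocks, then one with fewer
    updates to repeat, then one that is further from completion.
    """
    return min(
        cycle,
        key=lambda t: (
            -stats[t]["deadlocks"],
            stats[t]["updates_done"],
            stats[t]["pct_complete"],
        ),
    )


# T1 waits for T2 and T2 waits for T1; T3 is blocked but not part of the cycle.
waits_for = {"T1": "T2", "T2": "T1", "T3": "T1"}
stats = {
    "T1": {"deadlocks": 1, "updates_done": 8, "pct_complete": 90},
    "T2": {"deadlocks": 1, "updates_done": 2, "pct_complete": 20},
    "T3": {"deadlocks": 0, "updates_done": 1, "pct_complete": 10},
}

cycle = find_cycle(waits_for)
victim = choose_victim(cycle, stats)
```

Here T2 is terminated because it has performed few updates and is far from completion, so restarting it costs the least.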

Replicated Databases

Replicated databases are effective in companies where there is a high degree of data sharing but no
primary user. Since common data are replicated at each IT unit site, the data traffic between sites is
reduced considerably. Figure 4.18 illustrates the replicated database model.
The problem with this approach is maintaining current versions of the database at each site. Since
each IT unit processes only its own transactions, common data replicated at each site are affected by
different transactions and reflect different values.

Concurrency Control
Database concurrency is the presence of complete and accurate data at all user sites. System
designers need to employ methods to ensure that transactions processed at each site are accurately
reflected in the databases of all the other sites.

A commonly used method is to serialize transactions, labelling each one by two criteria. First, special
software groups transactions into classes to identify potential conflicts. For example, read-only
(query) transactions do not conflict with other classes of transactions.

The second part of the control process is to time-stamp each transaction. A system-wide clock is used
to keep all sites, some of which may be in different time zones, on the same logical time.
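
The effect of time-stamping can be shown with a small sketch. It assumes each transaction carries a logical timestamp from the system-wide clock and that every site applies update transactions in timestamp order, whatever order they arrive in. The Txn fields and the "add" and "set" operations are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Txn:
    timestamp: int   # logical time from the system-wide clock
    site: str        # originating site, used only to break ties
    op: str          # "add", "set", or "read" (hypothetical classes)
    amount: int = 0


def apply_in_timestamp_order(opening_balance, txns):
    """Apply update transactions in timestamp order; skip read-only ones."""
    balance = opening_balance
    for txn in sorted(txns, key=lambda t: (t.timestamp, t.site)):
        if txn.op == "read":
            continue  # queries do not conflict with other transactions
        if txn.op == "add":
            balance += txn.amount
        elif txn.op == "set":
            balance = txn.amount
    return balance


t1 = Txn(timestamp=1, site="A", op="add", amount=50)
t2 = Txn(timestamp=2, site="B", op="set", amount=500)
query = Txn(timestamp=3, site="A", op="read")

# The two sites receive the same transactions in different arrival orders.
site_a_balance = apply_in_timestamp_order(100, [t1, query, t2])
site_b_balance = apply_in_timestamp_order(100, [t2, t1, query])
```

Without the timestamps, site B would apply the "set" before the "add" and end with 550 instead of 500, so the two replicas would disagree.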

Database Distribution Methods and the Accountant

The decision to distribute databases is one that should be entered into thoughtfully. There are many
issues and trade-offs to consider. Here are some of the most basic questions to be addressed:
* Should the organization’s data be centralized or distributed?
* If data distribution is desirable, should the databases be replicated or partitioned?
* If replicated, should the databases be totally replicated or partially replicated?
* If the database is to be partitioned, how should the data segments be allocated among the sites?

CONTROLLING AND AUDITING DATA MANAGEMENT SYSTEMS

Controls over data management systems fall into two categories: access controls and backup controls.

ACCESS CONTROLS
- Designed to prevent unauthorised individuals from viewing, retrieving, corrupting, or
destroying the entity’s data.

•A user view (subschema) is a subset of the database that defines the user's data domain and
access.
•The database authorization table contains rules that limit user actions.
•User-defined procedures allow users to create a personal security program or routine.
•Data encryption procedures protect sensitive data.
•Biometric devices, such as fingerprint or retina scanners, control access to the database.
•Inference controls should prevent users from inferring, through query options, specific data
values they are unauthorised to access.
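
Two of these controls lend themselves to a short sketch. The first function looks up a user's permitted actions in an authorization table. The second applies one common form of inference control, which refuses aggregate queries over groups so small that the result would reveal an individual's value. The roles, table names, actions, and the minimum group size of 3 are all assumptions chosen for the example.

```python
# Hypothetical authorization table: (user role, table) -> permitted actions.
AUTHORITY_TABLE = {
    ("clerk", "customer"): {"read"},
    ("clerk", "sales_invoice"): {"read", "insert"},
    ("manager", "sales_invoice"): {"read", "insert", "modify", "delete"},
}


def is_authorized(role, table, action):
    """Return True only if the table lists this action for the role."""
    return action in AUTHORITY_TABLE.get((role, table), set())


def average_salary(salaries, min_group_size=3):
    """Answer an aggregate query only if the group is large enough.

    A query over one or two employees would let the user infer an
    individual salary, so it is refused.
    """
    if len(salaries) < min_group_size:
        raise PermissionError("query set too small")
    return sum(salaries) / len(salaries)


clerk_can_insert = is_authorized("clerk", "sales_invoice", "insert")
clerk_can_delete = is_authorized("clerk", "sales_invoice", "delete")
group_average = average_salary([100000, 120000, 140000])
```

Anything not listed in the table is denied by default, which is the safer choice for an access control.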

Audit Procedures for Testing Database Access Controls


•Verify DBA personnel retain responsibility for authority tables and designing user views.
•Select a sample of users and verify access privileges are consistent with the job description.
•Evaluate cost benefits of biometric controls.
•Verify database query controls to prevent unauthorised access via inference.
•Verify sensitive data are properly encrypted.
BACKUP CONTROLS
- Ensure that in the event of data loss due to unauthorised access, equipment failure, or physical
disaster, the organisation can still recover its database.

•Since data sharing is a fundamental objective of the database approach, the environment is
vulnerable to damage from individual users.
•Four backup and recovery features are needed:
•Backup feature makes periodic backup of the entire database which is stored in a secure,
remote location.
•Transaction log provides an audit trail of all processed transactions.
•Checkpoint facility suspends all processing while the system reconciles transaction log and
database change log against the database.
•Recovery module uses logs and backup files to restart the system after a failure.
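
How the backup and the transaction log work together during recovery can be sketched as follows. The example assumes a very simple database (a dictionary of account balances) and a log of (account, amount) entries recorded since the last backup. The account names and figures are hypothetical.

```python
import copy


def recover(backup, transaction_log):
    """Rebuild the database by replaying logged transactions over the backup."""
    database = copy.deepcopy(backup)  # never modify the backup copy itself
    for account, amount in transaction_log:
        database[account] = database.get(account, 0) + amount
    return database


# State saved by the last periodic backup (hypothetical figures).
backup = {"AR-Control": 10000, "Cash": 4000}

# Transactions processed after the backup and before the failure.
transaction_log = [("AR-Control", 2000), ("Cash", -500), ("AR-Control", -300)]

restored = recover(backup, transaction_log)
```

The more often backups are taken, the shorter the log that has to be replayed after a failure.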

Audit Procedures for Testing Database Backup Controls


•Verify backups are performed routinely and frequently.
•Backup policy should balance inconvenience of frequent activity against business disruption
caused by system failure.

•Verify that automatic backup procedures are in place and functioning and that copies of the
database are stored off-site
