INFORMATION MANAGEMENT NOTES 1ST SEM
- GOD
Database Systems
Data - Raw facts, or facts that have not yet been processed to reveal their meaning to the end user.
Information – The result of processing raw data to reveal its meaning. Information consists of transformed data
and facilitates decision making. (Coronel and Morris, 2017, p. 4)
A database is a shared, integrated computer structure that houses a collection of the following:
• End-user data – that is, raw facts of interest to the end user.
• Metadata, or data about data, through which the end-user data is integrated and managed. (Coronel
and Morris, 2017, p. 6).
Roles and Advantages of DBMS (Coronel and Morris, 2017)
A Database Management System (DBMS) is a collection of programs that manages the database structure and
controls access to the data stored in the database. The DBMS serves as the intermediary between the user and
the database. The database structure itself is stored as a collection of files, and the only way to access the data
in those files is through the DBMS.
Advantages of DBMS: IDS, IDS, BDI, MDI, IDA, IDM, IE-UP
• Improved data sharing - The DBMS helps create an environment in which end users have better access to
more and better-managed data. Such access makes it possible for end users to respond quickly to changes
in their environment.
• Improved data security - The more users access the data, the greater the risks of data security breaches.
Corporations invest considerable amounts of time, effort, and money to ensure that corporate data is
used properly. A DBMS provides a framework for better enforcement of data privacy and security
policies.
• Better data integration - Wider access to well-managed data promotes an integrated view of the
organization’s operations and a clearer view of the big picture. It becomes much easier to see how
actions in one segment of the company affect other segments.
• Minimized data inconsistency - Data inconsistency exists when different versions of the same data
appear in different places.
• Improved data access - The DBMS makes it possible to produce quick answers to ad hoc queries. From a
database perspective, a query is a specific request issued to the DBMS for data manipulation—for
example, to read or update the data.
• Improved decision making - Better-managed data and improved data access make it possible to
generate better-quality information, on which better decisions are based. The quality of the information
generated depends on the quality of the underlying data.
• Increased end-user productivity - The availability of data, combined with the tools that transform data
into usable information, empowers end users to make quick, informed decisions that can make the
difference between success and failure in the global economy.
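As a minimal sketch of the ad hoc queries mentioned under improved data access, the following uses Python's built-in sqlite3 module with an in-memory database. The customer table, its column names, and its rows are hypothetical and exist only for illustration.

```python
import sqlite3

# In-memory database; the table and its data are hypothetical examples.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (cus_id INTEGER PRIMARY KEY, cus_name TEXT, cus_balance REAL)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "Ramas", 0.0), (2, "Dunne", 345.86), (3, "Smith", 536.75)],
)

# Ad hoc read query: which customers have a balance over 300?
high_balance = conn.execute(
    "SELECT cus_name FROM customer WHERE cus_balance > 300 ORDER BY cus_name"
).fetchall()

# Ad hoc update query: clear one customer's balance.
conn.execute("UPDATE customer SET cus_balance = 0 WHERE cus_id = 2")
remaining = conn.execute(
    "SELECT COUNT(*) FROM customer WHERE cus_balance > 300"
).fetchone()[0]

print(high_balance, remaining)
```

Neither question required a dedicated program to be written in advance; each is a single request issued to the DBMS.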
Types of Databases
A Database Management System (DBMS) can be used to build many different types of databases. The number
of users determines whether the database is classified as single-user or multiuser.
Types of Databases:
• Single-user database – A type of database that supports only one user at a time.
• Desktop database – A single-user database that runs on a personal computer.
• Multiuser database – A type of database that supports multiple users at the same time.
• Workgroup database – A type of database that supports a relatively small number of users or a specific
department within an organization.
• Enterprise database – A type of database that is used by the entire organization and supports many
users across many departments.
01 Handout 1
• Centralized database – A type of database that supports data located at a single site.
• Distributed database – A type of database that supports data distributed across several different sites.
• Cloud database – A database that is created and maintained using cloud services, such as Microsoft
Azure or Amazon AWS.
• Discipline-specific database – A type of database that contains data focused on specific subject areas.
• Operational database – A type of database designed primarily to support a company's day-to-day
operations.
• Analytical database – A type of database focused primarily on storing historical data and business
metrics used for tactical or strategic decision making.
• General-purpose database – A database that contains a wide variety of data used in multiple disciplines.
Importance of Database Design
Database Design refers to the activities that focus on the design of the database structure that will be used to
store and manage end-user data. A database that meets all user requirements does not just happen; its structure
must be designed carefully. In fact, database design is such a crucial aspect of working with databases that
most of this book is dedicated to the development of good database design techniques. (Coronel and Morris,
2017, p. 11)
Oftentimes the database design does not get the attention it deserves. This can occur for numerous reasons such
as:
• Insufficient specifications and/or poor logical data modeling
• Not enough time in the development schedule
• Too many changes occurring throughout the development cycle
• Database design assigned to, or performed by, novices
The first step in constructing a physical database should be transforming the logical design using best practices.
The transformation consists of the following:
• Transforming entities into tables
• Transforming attributes into columns
• Transforming domains into data types and constraints
• Transforming relationships into primary and foreign keys
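The four transformations above can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the AGENT and CUSTOMER entities, their attributes, and the "balance must not be negative" domain rule are hypothetical, and SQLite is just one possible target DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Entity AGENT -> table agent; attributes -> columns.
conn.execute("""
    CREATE TABLE agent (
        agent_code INTEGER PRIMARY KEY,
        agent_name TEXT NOT NULL
    )
""")

# Entity CUSTOMER -> table customer.
# Domain "balance is a non-negative amount" -> REAL type plus a CHECK constraint.
# Relationship "agent serves customer" -> foreign key column agent_code.
conn.execute("""
    CREATE TABLE customer (
        cus_code    INTEGER PRIMARY KEY,
        cus_name    TEXT NOT NULL,
        cus_balance REAL NOT NULL CHECK (cus_balance >= 0),
        agent_code  INTEGER NOT NULL REFERENCES agent (agent_code)
    )
""")

conn.execute("INSERT INTO agent VALUES (501, 'Alby')")
conn.execute("INSERT INTO customer VALUES (10010, 'Ramas', 0.0, 501)")


def try_insert(row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO customer VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


negative_balance_ok = try_insert((10011, "Dunne", -5.0, 501))  # violates the domain rule
missing_agent_ok = try_insert((10012, "Smith", 10.0, 999))     # agent 999 does not exist
```

Both rejected rows show the constraints doing their job: the CHECK constraint guards the domain, and the foreign key guards the relationship.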
File System Data Processing Issue (Coronel and Morris, 2017)
The file system method of organizing and managing data was a definite improvement over the manual system,
and the file system served a useful purpose in data management for over two (2) decades. Nonetheless, many
problems and limitations became evident in this approach.
A critique of the file system method serves two (2) major purposes:
• Understanding the shortcomings of the file system enables you to understand the development of
modern databases.
• Many of the problems are not unique to file systems. Failure to understand such problems is likely to
lead to their duplication in a database environment, even though database technology makes it easy to
avoid them.
The following problems associated with file systems, whether created by Data Processing (DP) specialists or
through a series of spreadsheets, severely challenge the types of information that can be created from the data
as well as the accuracy of the information:
• Lengthy development times – The first and most glaring problem with the file system approach is that
even the simplest data-retrieval task requires extensive programming. With the older file systems,
programmers had to specify what must be done and how to do it. As you will learn in upcoming
chapters, modern databases use a nonprocedural data manipulation language that allows the user to
specify what must be done without specifying how.
• Difficulty of getting quick answers – The need to write programs to produce even the simplest reports
makes ad hoc queries impossible.
• Complex system administration – System administration becomes more difficult as the number of files
in the system expands. Even a simple file system with a few files requires creating and maintaining
several file management programs.
• Lack of security and limited data sharing – Another fault of a file system data repository is a lack of
security and limited data sharing. Data sharing and security are closely related. Sharing data among
multiple geographically dispersed users introduces a lot of security risks.
• Extensive programming – Making changes to an existing file structure can be difficult in a file system
environment.
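The contrast between specifying "what and how" in a file system and specifying only "what" in a database can be sketched as follows. The file contents and the customer table are hypothetical, and SQL through Python's sqlite3 module stands in for a nonprocedural data manipulation language.

```python
import csv
import io
import sqlite3

# Hypothetical customer file contents (name, balance).
FILE_CONTENTS = "Ramas,0.00\nDunne,345.86\nSmith,536.75\n"

# File system style: the program spells out HOW to find the data
# (read each record, convert types, test the condition, collect matches).
procedural = []
for name, balance in csv.reader(io.StringIO(FILE_CONTENTS)):
    if float(balance) > 300:
        procedural.append(name)

# Database style: the query states only WHAT is wanted.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cus_name TEXT, cus_balance REAL)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(n, float(b)) for n, b in csv.reader(io.StringIO(FILE_CONTENTS))],
)
declarative = [
    row[0]
    for row in conn.execute(
        "SELECT cus_name FROM customer WHERE cus_balance > 300 ORDER BY rowid"
    )
]
```

Both approaches return the same names, but a new question needs a new loop in the first case and only a new query in the second.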
Structural dependence – A data characteristic in which a change in the database schema affects data access,
thus requiring changes in all access programs.
Structural independence – A data characteristic in which changes in the database schema do not affect data
access.
Data dependence – A data condition in which data representation and manipulation are dependent on the
physical data storage characteristics.
Data independence – A condition in which data access is unaffected by changes in the physical data storage
characteristics.
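A small sketch of structural dependence, assuming a hypothetical comma-separated record layout: the first access function depends on field positions and breaks when the structure changes, while access by field name (which a DBMS makes possible) is unaffected.

```python
# Hypothetical record layout: the access program hard-codes
# which position holds which field.
def read_balance(record: str) -> float:
    fields = record.split(",")
    return float(fields[1])  # assumes balance is always the second field


old_record = "Ramas,345.86"
# The file structure changes: a phone field is inserted before balance.
new_record = "Ramas,615-844-2573,345.86"

old_balance = read_balance(old_record)

try:
    read_balance(new_record)
    still_works = True
except ValueError:
    still_works = False  # the program must be rewritten for the new layout


# Access by name is unaffected by the added field.
def read_balance_by_name(row: dict) -> float:
    return row["balance"]


named_balance = read_balance_by_name(
    {"name": "Ramas", "phone": "615-844-2573", "balance": 345.86}
)
```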
Data redundancy – It exists when the same data is stored unnecessarily at different places.
Uncontrolled data redundancy sets the stage for the following:
• Poor data security – Having multiple copies of data increases the chances for a copy of the data to be
susceptible to unauthorized access.
• Data inconsistency – Data inconsistency exists when different and conflicting versions of the same data
appear in different places.
• Data-entry errors – Data-entry errors are more likely to occur when complex entries are made in several
different files or recur frequently in one or more files.
• Data integrity problems – It is possible to enter a nonexistent sales agent's name and phone number
into the Customer file, but customers are not likely to be impressed if the insurance agency supplies the
name and phone number of an agent who does not exist.
Data Anomalies
• A data abnormality in which inconsistent changes have been made to a database.
• A data anomaly develops when not all of the required changes in the redundant data are made
successfully.
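The following sketch shows how a data anomaly can arise from redundant data. The customer records and phone numbers are hypothetical; the agent's phone is deliberately stored once per customer to mimic uncontrolled redundancy.

```python
# Hypothetical customer records that redundantly store the agent's phone.
customers = [
    {"customer": "Ramas", "agent": "Alby", "agent_phone": "555-1000"},
    {"customer": "Dunne", "agent": "Alby", "agent_phone": "555-1000"},
    {"customer": "Smith", "agent": "Hahn", "agent_phone": "555-2000"},
]

# Alby's phone changes, but only the first matching record is updated.
for record in customers:
    if record["agent"] == "Alby":
        record["agent_phone"] = "555-9999"
        break  # the remaining copies are missed


def phones_for(agent: str) -> set:
    """Collect every distinct phone number stored for one agent."""
    return {r["agent_phone"] for r in customers if r["agent"] == agent}


alby_phones = phones_for("Alby")
has_anomaly = len(alby_phones) > 1  # two conflicting versions of the same fact
```

Storing the phone number once, in a separate agent record, would leave only one place to update.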
REFERENCES:
Coronel, C. and Morris, S. (2017). Database systems: design, implementation, and management, 12th edition. USA: Cengage Learning.
Elmasri, R. and Navathe, S. (2016). Fundamentals of database systems, 7th edition. USA: Pearson Higher Education.
Kroenke, D. and Auer, D. (2016). Database processing: fundamentals, design, and implementation. England: Pearson Education Limited.
Data Models
Database design focuses on how the database structure will be used to store and manage end-user data. Data
Modeling, the first step in designing a database, refers to the process of creating a specific data model for a
determined problem domain.
A data model is a relatively simple representation, usually graphical, of more complex real-world data structures.
In general terms, a model is an abstraction of a more complex real-world object or event. (Coronel and Morris,
2017, p. 36)
The importance of data modeling cannot be overstated. Data constitutes the most basic information used by a
system. Applications are created to manage data and to help transform data into information, but data is viewed
in different ways by different people.
Hierarchical Model
- It was developed in the 1960s to manage large amounts of data for complex manufacturing projects.
- The model's basic logical structure is represented by an upside-down tree. It contains levels, or segments.
- Segment is the equivalent of a file system's record type.
Network Model
- It was created to represent complex data relationships more effectively than the hierarchical model, to
improve database performance, and to impose a database standard.
- Although the network database model is generally not used today, the standard database concepts that
emerged with the network model are still used by modern data models:
o Schema – It is the conceptual organization of the entire database as viewed by the database
administrator.
o Subschema – It defines the portion of the database "seen" by the application programs that actually
produce the desired information from the data in the database.
o Data Manipulation Language (DML) – It defines the environment in which data can be managed.
o Data Definition Language (DDL) – It allows the database administrator to define the schema
components.
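The difference between a DDL and a DML can be sketched with SQL, run here through Python's sqlite3 module. SQL is used only because it is a familiar language with both a DDL part and a DML part; network-model systems used their own languages. The part table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: defines schema components (here, one table).
conn.execute("CREATE TABLE part (part_code TEXT PRIMARY KEY, part_qty INTEGER)")

# DML: works with the data held in that schema.
conn.execute("INSERT INTO part VALUES ('P-100', 12)")
conn.execute("INSERT INTO part VALUES ('P-200', 0)")
conn.execute("UPDATE part SET part_qty = part_qty + 5 WHERE part_code = 'P-100'")
conn.execute("DELETE FROM part WHERE part_qty = 0")

rows = conn.execute("SELECT part_code, part_qty FROM part").fetchall()
```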
Relational Model
- It was introduced in 1970 by E. F. Codd of IBM.
- The relational model represented a major breakthrough for both users and designers.
- Its foundation is a mathematical concept known as a relation.
Entity Relationship Model
- It was introduced in 1976 by Peter Chen.
- The graphical representation of entities and their relationships in a database structure quickly became
popular, because it complemented the relational data model concepts.
- The relational data model and ERM are combined to provide the foundation for tightly structured
database design.
Object-Oriented Model
- Increasingly complex real-world problems demonstrated a need for a data model that more closely
represented the real world. In the Object-Oriented Data Model (OODM), both data and its relationships
are contained in a single structure known as an object. In turn, the OODM is the basis for the Object-
Oriented Database Management System (OODBMS).
- The OODM is said to be a semantic data model because it indicates meaning.
- The Object-Oriented Data Model is based on the following components:
o An object is an abstraction of a real-world entity
o Attributes describe the properties of an object.
o Objects that share similar characteristics are grouped in classes. A class is a collection of similar
objects with shared structure (attributes) and behavior (methods).
o Classes are organized in a class hierarchy. The class hierarchy resembles an upside-down tree in
which each class has only one parent.
o Inheritance is the ability of an object within the class hierarchy to inherit the attributes and
methods of the classes above it.
o Object-oriented data models are typically depicted using Unified Modeling Language (UML) class
diagrams. UML is a language based on Object-Oriented concepts that describes a set of diagrams
and symbols you can use to graphically model a system.
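The OODM components listed above (object, attribute, class, class hierarchy, inheritance) map closely onto the class features of an object-oriented programming language. The sketch below uses Python classes as an analogy only; it is not an OODBMS, and the Person and Student classes are hypothetical.

```python
class Person:
    """A class: a collection of similar objects with shared attributes and methods."""

    def __init__(self, name: str, birth_year: int):
        self.name = name              # attribute
        self.birth_year = birth_year  # attribute

    def age_in(self, year: int) -> int:  # method (behavior)
        return year - self.birth_year


class Student(Person):
    """A subclass: inherits attributes and methods from its single parent class."""

    def __init__(self, name: str, birth_year: int, program: str):
        super().__init__(name, birth_year)
        self.program = program


s = Student("Ana", 2005, "BSIT")   # an object: an abstraction of a real-world entity
student_age = s.age_in(2025)       # inherited method, not redefined in Student
parent_classes = Student.__bases__  # the one parent in the class hierarchy
```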
Extensible Markup Language (XML) – A metalanguage used to represent and manipulate data elements. Unlike
other markup languages, XML permits the manipulation of a document's data elements.
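A brief sketch of representing and manipulating data elements in XML, using Python's standard xml.etree.ElementTree module. The document and its tag names (customers, customer, name, balance) are hypothetical; XML lets the author define such tags.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; tag names are chosen by the author, not fixed by XML.
doc = """
<customers>
  <customer id="1"><name>Ramas</name><balance>0.00</balance></customer>
  <customer id="2"><name>Dunne</name><balance>345.86</balance></customer>
</customers>
"""

root = ET.fromstring(doc)

# Read data elements by tag name.
names = [c.findtext("name") for c in root.findall("customer")]

# Manipulate a data element: clear the balance of customer 2.
for c in root.findall("customer"):
    if c.get("id") == "2":
        c.find("balance").text = "0.00"

balances = [c.findtext("balance") for c in root.findall("customer")]
```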
Emerging Data Models: Big Data and NoSQL
Big Data
- It refers to a movement to find new and better ways to manage large amounts of web and sensor-
generated data and derive business insight from it, while simultaneously providing high performance and
scalability at a reasonable cost.
- The term seems to have been first used in a computing framework by John Mashey, a Silicon
Graphics scientist, in the 1990s. However, it seems to be Douglas Laney, a data analyst from the Gartner
Group, who first described the basic characteristics of Big Data databases: volume, velocity, and
variety (the 3 Vs).
NoSQL
- It is a large-scale distributed database system that stores structured and unstructured data in efficient
ways.
- Searching for products on Amazon, sending messages on Facebook, watching videos on YouTube, and
searching for directions in Google Maps are examples of activities that use a NoSQL database.
- The following are the general characteristics of NoSQL databases:
o They are not based on the relational model and SQL, hence the name NoSQL.
o They support distributed database architectures.
o They provide high scalability, high availability, and fault tolerance.
o They support very large amounts of sparse data.
o They are geared toward performance rather than transaction consistency.
- NoSQL supports distributed database architecture – One of the big advantages of NoSQL
databases is that they generally use a distributed architecture.
- NoSQL supports very large amounts of sparse data – NoSQL databases can handle very high
volumes of data. In particular, they are suited for sparse data – that is, for cases in which the number of
attributes is very large but the number of actual data instances is low.
- NoSQL provides high scalability, high availability, and fault tolerance – True to its web origins,
NoSQL databases are designed to support web operations, such as the ability to add capacity in the form
of nodes to the distributed database when the demand is high, and to do it transparently and without
downtime.
- Most NoSQL databases are geared toward performance rather than transaction consistency –
One of the biggest problems of very large distributed databases is enforcing data consistency.
Distributed databases automatically make copies of data elements at multiple nodes to ensure high
availability and fault tolerance.
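The idea of sparse data in the list above can be illustrated with plain Python dictionaries standing in for schema-less records. This is not a NoSQL database, only a sketch of why a fixed table wastes space when most attributes are empty; the patient records and attribute names are hypothetical.

```python
# Hypothetical patient records. The full attribute list could run to
# thousands of possible tests, but each patient has results for only a few.
patients = {
    "p1": {"name": "Ramas", "blood_glucose": 5.4},
    "p2": {"name": "Dunne", "cholesterol": 190, "vitamin_d": 31},
    "p3": {"name": "Smith"},
}

# Every attribute that appears anywhere: a fixed relational table would
# need all of these as columns, most of them empty in most rows.
all_attributes = sorted({key for doc in patients.values() for key in doc})

stored_values = sum(len(doc) for doc in patients.values())
table_cells = len(all_attributes) * len(patients)
empty_cells = table_cells - stored_values
```

Here half of the cells in the equivalent fixed table would be empty, and the proportion grows as more rarely used attributes are added.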
In the early 1970s, the American National Standards Institute (ANSI) Standards Planning and Requirements Committee
(SPARC) defined a framework for data modeling based on degrees of data abstraction. The resulting ANSI/SPARC
architecture defines three (3) levels of data abstraction: external, conceptual, and internal (Coronel and Morris,
2017, p. 60).
External Model
- It is the end user's view of the data environment.
- The term end user refers to people who use the application programs to manipulate the data and generate information.
- ER diagrams will be used to represent the external views. A specific representation of an external view is
known as an external schema.
Conceptual Model
- It represents a global view of the entire database by the entire organization.
- Also known as a conceptual schema, it is the basis for the identification and high-level description of the
main data objects.
Internal Model
- It is the representation of the database as "seen" by the DBMS.
- It requires the designer to match the conceptual model's characteristics and constraints to those of the
selected implementation model.
- Internal schema depicts a specific representation of an internal model, using the database constructs
supported by the chosen database.
Physical Model operates at the lowest level of abstraction, describing the way data is saved on storage media such
as magnetic, solid state, or optical media. The physical model requires the definition of both the physical storage
devices and the (physical) access methods required to reach the data within those storage devices, making it both
software and hardware dependent (Coronel and Morris, 2017, p. 63).
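To make the external level more concrete, the sketch below uses a SQL view as a stand-in for one end user's view of the data. The notes describe external views with ER diagrams; a SQL view is a swapped-in illustration, not the notation used there. The employee table, the directory view, and all data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Base table: stands in for the organization-wide definition of the data,
# as implemented in the chosen DBMS (SQLite here).
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, emp_name TEXT, dept TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Ana", "Sales", 52000.0), (2, "Ben", "IT", 61000.0)],
)

# External level: a directory application sees names and departments only.
conn.execute("CREATE VIEW directory AS SELECT emp_name, dept FROM employee")

cursor = conn.execute("SELECT * FROM directory ORDER BY emp_name")
view_columns = [col[0] for col in cursor.description]
view_rows = cursor.fetchall()
```

The directory user never sees the salary column, even though it is part of the full definition of the table.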