Database Concepts

Uploaded by

prashanthrc127

Database concepts

Total Marks: 16
MCQ- 1
Fill in the blanks-05
2marks- 01
3marks-01
5 mark -01
INTRODUCTION

• Gathering and processing of data can be seen in day-to-day life.
• Large and complex data is collected, entered, stored, and accessed according to user needs, usually through queries.
• Specialized software is developed and used for data management.
Applications of database
• Banking
• PWD software
• Railways and airlines reservation and maintenance
• Schools, Colleges and universities
• Credit and transactions
• Telecom
• Finance – share market
• Sales
• Manufacturing
Data and information
Data: A collection of raw facts, numbers, figures, statistics, or symbols that the computer processes into meaningful information.

Information: Processed data with a definite meaning, which can be stored and transmitted.
Evolution of database
Data processing cycle
Data input:
• Raw material such as letters, numbers, symbols, shapes, and images is fed into the computer.
• This raw material requires processing.
• Input to the computer can be done through a keyboard, mouse, scanner, microphone, digital camera, etc.
• Data is converted to a computer-understandable format.
Data processing:
• Processing is a series of actions or operations performed on data to generate output.
• Examples: calculation, sorting, indexing, accessing data, extracting part of the data, and condition-based operations.
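The processing operations listed above can be illustrated with a minimal Python sketch. The records and variable names are invented for this illustration and are not part of the original notes.

```python
# Hypothetical input data: (name, marks) pairs invented for this sketch.
records = [("CCC", 75), ("AAA", 80), ("BBB", 90)]

# Calculation: total and average of the marks.
total = sum(marks for _, marks in records)
average = total / len(records)

# Sorting: order the records by name.
by_name = sorted(records)

# Indexing: build a lookup from name to marks for direct access.
index = {name: marks for name, marks in records}

# Extracting part of the data: keep only the names.
names = [name for name, _ in by_name]

# Condition-based operation: select names with marks of 80 or more.
high_scorers = [name for name, marks in by_name if marks >= 80]

print(total, average, names, high_scorers, index["BBB"])
```

Each step takes the same raw input and produces a different output, which is the sense in which processing turns data into information.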
Data Storage:
• Data that is not currently required must be kept safely for later use. This process is known as data storage.
• Primary storage: the computer's main memory (RAM) holds data temporarily while it is being processed.
• Secondary storage: data is stored permanently, on a floppy disk, hard disk, or CD-ROM.
Communication:
• Wired and wireless communication allows data to be input from a distance.
• Processing can take place at a remote location.
• Data can be stored at different places.
• Data is transmitted through a modem.
Database terms
1. File: A large collection of related data is called a file. It is the basic unit of storage in a computer.
2. Database: A collection of logically related data organized in such a way that it can be accessed, managed (processed), and updated.
3. Table: A collection of data elements organized in terms of rows and columns.
Employee
EMP_ID  NAME  AGE  SALARY
1       AAA   43   45000
2       BBB   54   60000
3       CCC   23   30000
4       DDD   19   25000

Table: Employee
Columns: EMP_ID, NAME, AGE, SALARY
Rows: There are four rows
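As a hedged illustration, the Employee table can be built with Python's built-in sqlite3 module. The choice of SQLite and of the column types is an assumption made for this sketch; the notes do not name a particular DBMS for this table.

```python
import sqlite3

# In-memory database so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")

# Table: Employee, with the four columns from the notes (types are assumed).
conn.execute(
    "CREATE TABLE Employee (EMP_ID INTEGER, NAME TEXT, AGE INTEGER, SALARY INTEGER)"
)

# Each tuple below is one row (record) of the table.
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?, ?)",
    [(1, "AAA", 43, 45000), (2, "BBB", 54, 60000),
     (3, "CCC", 23, 30000), (4, "DDD", 19, 25000)],
)

cursor = conn.execute("SELECT * FROM Employee ORDER BY EMP_ID")
columns = [description[0] for description in cursor.description]
rows = cursor.fetchall()

print(columns)    # the column (field) names
print(len(rows))  # the number of rows
print(rows[2])    # a single record
```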
4. Records: A single entry in a table is called a record or row. A record represents a set of related data.
Employee
3       CCC   23   30000
5. Tuple: A record or row in a table is also called a tuple.
6. Fields: Each column is identified by a distinct header called an attribute or field.
Employee
EMP_ID  NAME  AGE  SALARY

7. Domain: The set of values allowed for an attribute in a column.

8. Entity: A thing, place, person, or object that is independent of another, represented by an object such as a table or form. An entity relationship describes how tables link to each other.
Data types in dbms
1. Integer: A whole number without a fractional part.
2. Single and double precision: Numbers with a fractional part. Single precision holds about seven significant digits; double precision holds about fifteen.
3. Logical: The stored data has one of two values, TRUE or FALSE.
4. Characters: Includes letters, numbers, spaces, symbols, special characters, punctuation, etc. Character fields store text information such as a name or address.
5. Strings: A sequence of more than one character.
6. Memo data type: Stores more than 255 characters.
7. Index field: Used to store relevant information along with the documents.
8. Currency fields: Used to store currency values. Accepts data in the form of dollars by default.
9. Date field: Used to accept data in the form of a date.
10. Text field: Used to accept data in the form of an alphanumeric text string.
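The "seven significant" figure quoted for single precision can be checked with a small sketch. It uses Python's standard struct module to round-trip a number through a 4-byte and an 8-byte floating-point encoding; the sample value is an arbitrary choice for illustration.

```python
import struct

value = 1.23456789  # arbitrary sample with nine significant digits

# Round-trip through a 4-byte (single precision) encoding.
as_single = struct.unpack("f", struct.pack("f", value))[0]

# Round-trip through an 8-byte (double precision) encoding.
as_double = struct.unpack("d", struct.pack("d", value))[0]

print(f"single: {as_single:.10f}")
print(f"double: {as_double:.10f}")
```

The single-precision result agrees with the original only in roughly the first seven digits, while the double-precision result comes back unchanged.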
Example for index field
DBMS- DATABASE MANAGEMENT SYSTEM
• A DBMS is software that allows the creation, definition, and manipulation of a database.
• It is a tool used to perform operations on a database.
• It provides protection and security for the database.
• It maintains data consistency when there are multiple users.
• Examples of DBMS (product and vendor):
MySQL (Oracle), Sybase (SAP), Microsoft Access (Microsoft), Oracle (Oracle), DB2 (IBM)
• Software used to manage the storage, organization, processing, and retrieval of data in a database.
• A DBMS is a general-purpose software system that facilitates:
• Defining: specifying the data types, structures, and constraints for the data stored in the database.
• Constructing: storing the data on a storage medium that is controlled by the DBMS.
• Manipulating: retrieving specific data, updating the database, generating reports, etc.
DBMS consists of
• DB: A collection of interrelated data. This part of the DBMS is known as the database (DB).

• MS: A set of application programs used to access, update, and manage data. This part of the DBMS is known as the management system (MS).
Features or advantages of database system
• Centralized data management: Data is stored at a central location and is shared among multiple users.

• Controlled data redundancy: Because of centralized data management, files are integrated and each logical data item is stored at one central location. This eliminates unnecessary data duplication (redundancy).

• Data integrity: The validity of data is called data integrity. It can be checked automatically by the DBMS software.

• Data sharing: Stored data can be shared among multiple users and application programs. A new application can use the stored data without creating any additional data, or with minimal modification.
• Data security: The DBMS provides security tools such as user codes and passwords. This feature protects sensitive data whenever an access attempt is made.
• Ease of application development: The DBMS handles security, data access, and data integrity. This makes the development of application software an easier task.
• Multiple user interfaces: To meet the needs of users with varying technical knowledge, the DBMS provides the following user interfaces:
• Query language
• Application program interface
• Graphical user interface: form style and menu driven
• Backup and recovery: The DBMS provides a backup and recovery subsystem. This subsystem is responsible for recovering data after hardware and software failures.
Disadvantages of database system
• Danger of overkill: For small, simple applications a database system is not advisable.
• Complexity: A database system adds complexity and requirements. This makes the application costly.
• Qualified personnel: Professional operation of a database requires trained staff.
• Costs: A database system needs additional hardware, which makes the application system comparatively costly.
• Lower efficiency: A database system is multi-user, general-purpose software. This makes it less efficient than specialized software designed to solve exactly one problem.
Data independence:
• The ability to access or modify data at one level of the database without disturbing the related data at other levels is known as data independence.
Logical data independence:
• Data about the database is called logical data.
• Logical data stores information about how data is organized in the database.
• Logical data independence is the ability to change the conceptual level without changing the external view of each user group.
• Ex: In a book database, if an additional data item such as quantity is added, it should not change the external view of each user group.
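The book example can be sketched with Python's sqlite3 module. The table name `book`, the view name `book_catalog`, and the columns are hypothetical names chosen for this illustration; the view stands in for the external view of one user group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: the stored table (hypothetical columns).
conn.execute("CREATE TABLE book (book_id INTEGER PRIMARY KEY, title TEXT, price REAL)")
conn.execute("INSERT INTO book VALUES (1, 'Database Concepts', 250.0)")

# External level: what one user group sees.
conn.execute("CREATE VIEW book_catalog AS SELECT book_id, title FROM book")


def external_view():
    """Return the column names and rows visible through the view."""
    cursor = conn.execute("SELECT * FROM book_catalog")
    columns = [description[0] for description in cursor.description]
    return columns, cursor.fetchall()


before = external_view()

# Change the conceptual level: add the new data item 'quantity'.
conn.execute("ALTER TABLE book ADD COLUMN quantity INTEGER DEFAULT 0")

after = external_view()
print(before == after)
```

The users of `book_catalog` see the same columns and rows before and after the change, which is what logical data independence asks for.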
Physical data independence:
• Physical data independence is the ability to change how data is physically stored without changing the logical data.
• Ex: If the storage system of the database is changed (from hard disk to SSD), it should not affect the logical data or the design of the database. The database should work as before.
DBMS users
• End user: People who require access to the database for querying, updating, and generating reports.
• System analyst: Determines the requirements of end users, especially naïve and parametric end users, and develops specifications for transactions that meet these requirements.
• Application programmers: Turn the specifications provided by system analysts into computer programs.
• Database administrator (DBA): Responsible for authorizing access to the database and monitoring its usage. The DBA is also responsible for acquiring the required software and hardware.
• Database designers: Responsible for identifying the data to be stored and choosing a suitable structure for storing it in the database.
DATA ABSTRACTION
• Hiding the complexity of the DBMS from its users is known as data abstraction.

• The objective of data abstraction is to retrieve the desired data efficiently.

• The system hides certain details of how the data is stored and maintained.
• The following are the three levels of abstraction:
1. Physical Level (Internal Level)
• This level describes how the data is actually stored.
• This level describes complex low-level data structures in detail.
• It contains the definition of the stored data, the method of representing the data fields, and the access aids used.
2. Conceptual Level (Logical Level)
• The next higher level of abstraction.
• This level describes what data is stored in the database and what relationships exist among that data.
• It contains the method of deriving objects in the conceptual view from the objects in the internal view.
3. View Level (External Level)
• The highest level of abstraction.
• This level describes only a part of the database.
• It consists of the definition of logical records and relationships in the external view.
• It contains the method of deriving objects in the external view from the objects in the conceptual view.
• Schema: The overall design of the database is called the schema.

• Instance: The collection of information stored in the database at a particular moment is called an instance of the database.

• Mapping: The process of transforming requests and results between the various levels of the DBMS architecture is known as mapping.
File organization
The systematic arrangement of the contents of a file, such as file structure, records, and fields, for efficient access is known as file organization.
1. Serial file organization
• The data records are stored in chronological order (time of creation) on the physical medium.
• No particular sequence is followed when storing the data.
• It can be used as a temporary transaction file but not as a master file.
• Magnetic tape storage is used to store the data.
2. Sequential file organization
• Records are arranged in a predetermined order of their keys.
• The contents of the file are accessed sequentially.
• Adding a record requires creating a new file, since the file must remain sorted.

• In the above example, records are organized on the key field value REGNO. The values are in ascending order.
• Magnetic tape is the most common storage device used to implement sequential files.
Advantages:
• Simple file design that is easy to understand
• Uses a less expensive storage medium: magnetic tape
• Uses storage space efficiently
• Easy to reconstruct the file

Disadvantages:
• The entire file must be processed even if a single record is to be accessed
• Transactions have to be sorted before processing
• Data redundancy is high, since the same data may be stored at different places with different keys
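A minimal Python sketch of sequential file organization follows. A list held in key order stands in for the file; the register numbers and names are invented, and `sequential_search` and `add_record` are hypothetical helper names.

```python
import bisect

# Hypothetical master file kept sorted on the key field REGNO.
master = [(101, "AAA"), (104, "BBB"), (109, "CCC")]


def sequential_search(records, regno):
    """Read records in order until the key is found; also count the reads."""
    reads = 0
    for key, name in records:
        reads += 1
        if key == regno:
            return name, reads
        if key > regno:  # sorted order lets the search stop early
            break
    return None, reads


def add_record(records, new_record):
    """Build a new sorted file that contains the extra record."""
    new_file = list(records)          # copy: the old file is left untouched
    bisect.insort(new_file, new_record)
    return new_file


new_master = add_record(master, (106, "DDD"))
print(new_master)
print(sequential_search(new_master, 109))
```

Finding the last record requires reading every record before it, which is the main disadvantage listed above.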
3. Direct/random file organization
• Data is stored in random order at a known physical address.
• Data is also accessed randomly, with the help of the record key.
• Storage devices such as magnetic disks, CDs, or DVDs are used.
• The desired data can be accessed using various methods. One basic technique used to access data in random file organization is "hashing".
• Hashing has two parts:
1. Hashing: generates the physical address for the new record key.
2. Conflict resolution technique: resolves the conflicts that occur when the same physical address is assigned to multiple record keys.
• Used in information retrieval systems such as reservation of bus, air, or train tickets.
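The two parts of hashing can be sketched in Python as follows. The notes do not say which hash function or conflict resolution technique is meant, so this sketch assumes the division method (key modulo table size) and linear probing; `TABLE_SIZE`, `store`, and `fetch` are hypothetical names.

```python
TABLE_SIZE = 7                # hypothetical number of physical addresses
slots = [None] * TABLE_SIZE   # the "file": one record per address


def hash_address(key):
    """Part 1, hashing: compute a physical address from the record key."""
    return key % TABLE_SIZE


def store(key, value):
    """Part 2, conflict resolution: probe forward until a free address is found."""
    address = hash_address(key)
    for step in range(TABLE_SIZE):
        probe = (address + step) % TABLE_SIZE
        if slots[probe] is None or slots[probe][0] == key:
            slots[probe] = (key, value)
            return probe
    raise RuntimeError("file is full")


def fetch(key):
    """Follow the same probe sequence used when the record was stored."""
    address = hash_address(key)
    for step in range(TABLE_SIZE):
        probe = (address + step) % TABLE_SIZE
        if slots[probe] is None:
            return None
        if slots[probe][0] == key:
            return slots[probe][1]
    return None


first = store(10, "seat 10")    # 10 % 7 = 3
second = store(17, "seat 17")   # 17 % 7 = 3 as well, so a conflict occurs
print(first, second, fetch(17))
```

Keys 10 and 17 hash to the same address, so the second record is placed in the next free slot and is still found by `fetch`.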
Advantages:
• No index is needed to store records. This saves memory space
• Any record can be accessed directly at high speed
• Transactions can be processed concurrently
• Online processing of data can be done effectively
Disadvantages:
• More complex, and requires comparatively expensive devices
• Can be implemented only on devices that support random/direct access
• A separate algorithm has to be written for conflict management during hashing
4. Indexed sequential file organization (ISAM)
• Combines features of sequential and direct file organization and is one of the popular methods.
• It consists of a sequential file arranged by key field, together with an index on that key field.
• It uses random access storage such as magnetic disks.
• It is used in applications where transactions are processed both sequentially and randomly.
• It is also called Indexed Sequential Access Method (ISAM).
Advantages:
• Provides flexibility for users, as it supports both sequential and random access
• Provides quick access to records, provided the index file is properly organized
• Permits quick access to records with a high activity ratio
• Online processing of data can be done effectively

Disadvantages:
• Extra storage and processing time are required for the index
• The hardware and software used are relatively expensive
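The idea of a sequential file plus an index can be sketched in Python. This is a simplified illustration under assumed details: the data is split into small blocks, the index holds the highest key of each block, and all names and values are invented.

```python
import bisect

# Hypothetical data file: blocks of records stored in key order.
blocks = [
    [(101, "AAA"), (104, "BBB")],
    [(109, "CCC"), (115, "DDD")],
    [(120, "EEE"), (131, "FFF")],
]

# Index file: the highest key held in each block.
index = [block[-1][0] for block in blocks]


def random_access(key):
    """Use the index to jump to one block, then scan only that block."""
    position = bisect.bisect_left(index, key)
    if position == len(blocks):
        return None                      # key is larger than every stored key
    for record_key, name in blocks[position]:
        if record_key == key:
            return name
    return None


def sequential_access():
    """Read every record in key order, block after block."""
    return [record for block in blocks for record in block]


print(index)
print(random_access(115))
print(len(sequential_access()))
```

The same file serves both access patterns: the index gives direct access to one block, and reading the blocks in order gives sequential access.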
Architecture of database(DBMS)
• The design of a database system depends heavily on its architecture. The design can be centralized or decentralized.
• DBMS architecture can be either single layer, called 1-tier, or multi layer, called multi-tier.
• In a multi-tier architecture the tiers are related but work as independent modules, so each tier can be independently modified, altered, changed, or replaced.
1-tier Architecture
• The user interacts directly with the database.
• This type of system is the simplest and most direct.
• Example: using SQL commands, a user may extract information directly from the database. Any changes made are reflected directly in the DBMS itself.
2-tier Architecture
• It is a software architecture.
• A presentation layer (software interface) runs on a client, and a data layer (data structure) is stored on a server.
• A client, which is an application program, sends a query to the server; the server processes the request and sends the required details back to the client.
• A server may have many clients.
3-tier Architecture
• A widely used architecture.
• It is a client-server architecture.
• Client: a user computer that requests some service from the server.
• Server: a high-speed computer that provides services in response to client requests.
Layers of 3-tier Architecture
1. Presentation tier
• The end user interacts with this layer. The user is not aware of the application and database layers.
• Different views of the database are provided at this layer as per the request of the user.
• All views are generated by applications that reside in the application tier.
2. Application tier
• It is called the middle layer or middle tier.
• It controls the application functionality.
• It takes requests from the presentation layer and interacts with the database tier. The user is not aware of this layer.
3. Database tier
• It contains the database server, where information is stored and retrieved.
• Data in this tier is independent of the application servers.
• It contains all relations and their constraints.
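The division of work among the three tiers can be sketched in one Python file. In a real deployment each tier would usually run as a separate program or machine; here three functions stand in for the tiers, and the table, column, and function names are invented for illustration.

```python
import sqlite3


# Database tier: stores the relations and their constraints.
def open_database():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE student (reg_no TEXT PRIMARY KEY, name TEXT, marks INTEGER)"
    )
    conn.executemany(
        "INSERT INTO student VALUES (?, ?, ?)",
        [("S110", "AAA", 80), ("C223", "BBB", 90)],
    )
    return conn


# Application tier: takes a request and talks to the database tier.
def get_marks(conn, reg_no):
    row = conn.execute(
        "SELECT name, marks FROM student WHERE reg_no = ?", (reg_no,)
    ).fetchone()
    if row is None:
        return None
    return {"name": row[0], "marks": row[1]}


# Presentation tier: the only part the end user sees; it contains no SQL.
def show_marks(conn, reg_no):
    result = get_marks(conn, reg_no)
    if result is None:
        return f"No student with register number {reg_no}"
    return f"{result['name']} scored {result['marks']} marks"


conn = open_database()
message = show_marks(conn, "C223")
print(message)
```

Because the presentation function never touches SQL, the database tier could be replaced without changing what the user sees.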
Keys and types of keys
• A key is a column or attribute of a database table.
• Keys are used to identify records in the database table.
• Keys are used for efficient access to the database and to avoid duplicate records.

SL NO  REG NO  NAME  COMBINATION  MARKS
1      S110    AAA   PCMC         80
2      C223    BBB   CEBA         90
3      S312    AAA   PCMC         75

• In the above table the attributes SL NO, REG NO, NAME, COMBINATION, and MARKS are called keys. Using keys, table data can be accessed.
Types of keys
1. Candidate Key:
• A candidate key is an attribute or set of attributes that uniquely identifies each row.
Ex: In the above table, SL NO and REG NO each identify all the tuples (rows) uniquely. They are called candidate keys.
2. Primary Key:
• A column or attribute that identifies each row of the table uniquely is called the primary key.
• A primary key column must not be empty or NULL. A primary key is a minimal super key.
• The primary key is chosen from among the candidate keys.
• Ex: In the above table, among SL NO and REG NO, REG NO is considered the primary key since it identifies every record uniquely.
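The uniqueness test behind candidate keys can be tried on the three sample rows with a small Python sketch. The function name `identifies_uniquely` is invented for this illustration.

```python
# Rows copied from the sample table in the notes.
COLUMNS = ["SL NO", "REG NO", "NAME", "COMBINATION", "MARKS"]
ROWS = [
    (1, "S110", "AAA", "PCMC", 80),
    (2, "C223", "BBB", "CEBA", 90),
    (3, "S312", "AAA", "PCMC", 75),
]


def identifies_uniquely(attributes):
    """Return True if no two rows share the same values for these attributes."""
    positions = [COLUMNS.index(attribute) for attribute in attributes]
    seen = set()
    for row in ROWS:
        value = tuple(row[position] for position in positions)
        if value in seen:
            return False
        seen.add(value)
    return True


single_column_keys = [c for c in COLUMNS if identifies_uniquely([c])]
print(single_column_keys)
```

MARKS also passes the test on these three rows, but only by coincidence, since two students can score the same marks. Uniqueness in sample data is necessary for a candidate key but not sufficient; the attribute must be unique by its meaning, as SL NO and REG NO are.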
3. Composite Key:
• A key made up of two or more attributes that together uniquely identify a record (tuple) in the table is called a composite key.
• A primary key that consists of more than one attribute is a composite key.
Ex: In the above table, SL NO and REG NO taken together identify all the tuples (rows) uniquely, so the pair can serve as a composite key.
4. Alternate Key:
• A candidate key that is not currently chosen as the primary key is called an alternate key. It is also called a secondary key.
• In the above table, if REG NO is the primary key, then SL NO can be called an alternate key or secondary key.
5. Foreign Key:
• A foreign key is an attribute or set of attributes that appears as a non-key attribute in one relation and as the primary key in another relation.
• It is used to extract data from two tables.
• Ex: In the two tables STUDENT and COURSE, COURSE CODE is called a foreign key since it is a non-key attribute in STUDENT but the primary key in the table COURSE.

STUDENT
SL NO  REG NO  NAME  COURSE CODE
1      S110    AAA   C1
2      C223    BBB   C2
3      A312    AAA   C3
4      S201    CCC   C1

COURSE
COURSE CODE  NAME      FEES    DURATION IN YEARS
C1           SCIENCE   50,000  2 YEARS
C2           COMMERCE  48,000  2 YEARS
C3           ARTS      40,000  2 YEARS
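The STUDENT and COURSE tables can be reproduced with Python's sqlite3 module to show a foreign key at work. SQLite, the lower-case column names, and the extra row used to trigger the error are assumptions made for this sketch. SQLite enforces foreign keys only when the `foreign_keys` pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default

# COURSE: course_code is the primary key.
conn.execute(
    "CREATE TABLE course (course_code TEXT PRIMARY KEY, name TEXT, "
    "fees INTEGER, duration_years INTEGER)"
)
# STUDENT: course_code is a non-key attribute that refers to COURSE.
conn.execute(
    "CREATE TABLE student (sl_no INTEGER, reg_no TEXT PRIMARY KEY, name TEXT, "
    "course_code TEXT REFERENCES course(course_code))"
)

conn.executemany(
    "INSERT INTO course VALUES (?, ?, ?, ?)",
    [("C1", "SCIENCE", 50000, 2), ("C2", "COMMERCE", 48000, 2), ("C3", "ARTS", 40000, 2)],
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [(1, "S110", "AAA", "C1"), (2, "C223", "BBB", "C2"),
     (3, "A312", "AAA", "C3"), (4, "S201", "CCC", "C1")],
)

# The foreign key lets data be extracted from both tables at once.
joined = conn.execute(
    "SELECT s.reg_no, c.name, c.fees FROM student s "
    "JOIN course c ON s.course_code = c.course_code ORDER BY s.sl_no"
).fetchall()

# A student row pointing at a course that does not exist is rejected.
try:
    conn.execute("INSERT INTO student VALUES (5, 'S999', 'DDD', 'C9')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(joined)
print(rejected)
```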
Database model
• A database model describes the logical design of data in a database.
• It describes the relationships between different entities.
• It is a collection of conceptual tools for describing the data and relationships.
• There are three database models:
• Hierarchical model
• Network model
• Relational model
Hierarchical Model
• It uses a tree structure to represent relationships among records.
• Multiple entities are arranged in hierarchical order.
• Entities are related in one-to-many relationships.
Network Model
• Introduced in the late 1960s.
• Data is represented by a group of records, and relationships are represented by links.
• Links are associations between the records.
• It allows modelling of many-to-many relationships among the data.
• Data is organised in the form of a graph.
Relational Model
• Proposed by E F Codd’s theoretical paper

• Hierarchical model relaced by neat ana easily understandable


structure is called relational model

• It represents the information about any entity and its relationships


with other entities in the form of attributes and tuples
Properties of Relational Model

• Each column has a distinct name, and each value must be simple.

• Each row is distinct, i.e. one row cannot duplicate another row for the selected key attribute.

• The sequence of rows is immaterial.


Entity Relationship (E-R) model
• An entity, a real-world object, is a component of the data domain.
• An entity relationship diagram (ERD) shows the relationships of entity sets stored in a database.
• An ERD illustrates the logical structure of a database using symbols, much like a flow chart.
Components of E-R model
• Entity: A real-world object or concept about which data is stored. The symbol used to represent it is a RECTANGLE.
• Attributes: Properties or characteristics of entities. The symbol used to represent them is an OVAL.
• Relationship: Connects entities and represents meaningful dependencies among them. The symbol used is a RHOMBUS.
Components of E-R model

Other Symbols used in E-R diagram


Different types of relationships between
entity sets
1. One to One relationship: One instance of entity E1 is associated with one instance of another entity E2.

2. One to Many relationship: One instance of entity set E1 is associated with zero, one, or more instances of another entity set E2.

3. Many to One relationship: Zero, one, or more instances of entity set E1 are associated with one instance of another entity set E2, but each instance of E1 is associated with only one instance of E2.

4. Many to Many relationship: One instance of entity set E1 is associated with zero, one, or many instances of another entity set E2, and one instance of E2 is associated with zero, one, or many instances of E1.
Cardinality
• The maximum number of times an instance in one entity can be associated with instances in the related entity is known as cardinality.
• It specifies the maximum number of relationships.
• It specifies the occurrences of relationships.

Ordinality
• The minimum number of times an instance in one entity can be associated with instances in the related entity is known as ordinality.
• It specifies the absolute minimum number of relationships.
• It describes a relationship as either mandatory or optional.
• When the minimum number is zero, the relationship is called optional.
• When the minimum number is one or more, the relationship is called mandatory.
Relational algebra
• Relational algebra is a procedural query language that performs a set of operations on relations.
• It has operators to perform queries.
• An operator can be either unary or binary.
• Terminology:
1. Relation: a set of tuples
2. Tuple: a collection of attributes that describe some real-world entity
3. Attribute: a real-world role played by a named domain
4. Domain: a set of atomic values
5. Set: a collection of objects that contains no duplicates
• Some operations in relational algebra are SELECT and PROJECT (unary operations) and UNION, INTERSECTION, DIFFERENCE, and CARTESIAN PRODUCT (binary operations).
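Since a relation is a set of tuples, the binary operations map directly onto Python's set operators. The sketch below also includes selection and projection, the standard unary operators of relational algebra. The two sample relations and the helper names `select` and `project` are invented for illustration.

```python
from itertools import product

# Each relation is a set of tuples (REG NO, NAME); the data is hypothetical.
science = {("S110", "AAA"), ("S201", "CCC")}
sports = {("S201", "CCC"), ("C223", "BBB")}


def select(relation, predicate):
    """Unary: keep only the tuples that satisfy a condition."""
    return {row for row in relation if predicate(row)}


def project(relation, positions):
    """Unary: keep only the chosen attributes; duplicates vanish in a set."""
    return {tuple(row[p] for p in positions) for row in relation}


# Binary operations on two relations with the same attributes.
union = science | sports
intersection = science & sports
difference = science - sports

# Cartesian product: every tuple of one relation paired with every tuple of the other.
cartesian = {a + b for a, b in product(science, sports)}

print(sorted(union))
print(intersection, difference, len(cartesian))
```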
Data warehouse

• The process of gathering, cleaning, and integrating data from various sources, usually long-term existing systems, is known as data warehousing.
• It is used for storing data, reporting, and data analysis.
• A data warehouse is also defined as "a central repository of current and historical data for creating reports".
Benefits or advantages of data warehouse
Disadvantages of data warehouse
• The process of extracting, cleaning, and loading is time consuming
• A data warehouse can get outdated quickly
• Compatibility issues with already existing systems
• Users need to be trained
• Security issues for the data when the data warehouse is connected to the internet
Evolution of data warehouse
• Offline operational data warehouse: data is copied from an operational system to the data warehouse.
• Offline data warehouse: data is updated on a regular basis.
• Real-time (on-time) data warehouse: data is updated online in real time.
• Integrated data warehouse: gathers information from various areas of the business and consolidates it.
Components of data warehouse
1. Data Source: Electronic repositories of information such as mainframe computers, client-server databases, PC databases, and ESS are called data sources. Their data should be transferred to the data warehouse on a regular basis.

2. Data Transformation: The data transformation layer receives data from the data sources, cleans and standardizes it, and loads it into the data warehouse. This is done through a specific tool called ETL (Extract, Transform, and Load).

3. Reporting: Data in the data warehouse is made available to the organization in a useful form. This process is called reporting.
4. Meta Data: Data about the data is known as meta data. It gives information about the data in the data warehouse.

5. Operations: The process of loading, manipulating, and extracting data from the data warehouse is called operations.
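The path from data source through transformation to reporting can be sketched in Python. Real ETL tools are far more elaborate; here an in-memory SQLite database stands in for the warehouse, and the sales records, table name, and function names are all invented for illustration.

```python
import sqlite3

# Data source: hypothetical raw sales records with inconsistent formatting.
source_rows = [
    {"city": " mysuru ", "amount": "1200"},
    {"city": "BENGALURU", "amount": "950"},
    {"city": "Mysuru", "amount": None},   # incomplete record
]


def extract():
    """Extract: read the records from the source."""
    return list(source_rows)


def transform(rows):
    """Transform: drop incomplete records and standardize the rest."""
    cleaned = []
    for row in rows:
        if row["amount"] is None:
            continue
        cleaned.append((row["city"].strip().title(), int(row["amount"])))
    return cleaned


def load(conn, rows):
    """Load: write the cleaned records into the warehouse table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (city TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)


warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(extract()))

# Reporting: summarize the warehouse data in a useful form.
report = warehouse.execute(
    "SELECT city, SUM(amount) FROM sales GROUP BY city ORDER BY city"
).fetchall()
print(report)
```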
Data mining
• Analyzing a data warehouse and picking out the relevant information is known as data mining.
Various steps in data mining
1. Data Integration: Data is collected and integrated from all possible sources.
2. Data Selection: Useful data is selected.
3. Data Cleaning: Errors in the collected data are corrected.
4. Data Transformation: Data is made usable and navigable by applying various processes and techniques.
5. Data Mining: After cleaning and transformation, the data is ready for mining. Mining draws out various interesting patterns from the available data. Clustering and association analysis are among the techniques used in data mining.
6. Interpretation and Evaluation: The patterns identified by mining are used to support processes such as decision making, prediction, classification, and summarizing.
