
Class XII IT Notes


Unit - 1: Database Concepts

Basic Concepts and Definitions

Data is a collection of raw facts which have not been processed to reveal useful information.
Information is produced by processing data.

A collection of related data that has been recorded, organized, and made available for searching is
called a Database.

A database has the following properties:


1) A database is a representation of some aspect of the real world, also called the miniworld.
Whenever there are changes in this miniworld, they are also reflected in the database.
2) It is designed, built and populated with data for a specific purpose.
3) It can be of any size and complexity.
4) It can be maintained manually or it may be computerized.

Need for a Database


In traditional file processing, data is stored in the form of separate files. This approach leads to the following problems:
1. Data Redundancy: Same information is stored in more than one file. This would result in
wastage of space.
2. Data Inconsistency: If data in one file is updated, every other file holding the same information
must also be updated; otherwise the data becomes inconsistent.
3. Lack of Data Integration: As data files are independent, accessing information out of
multiple files becomes very difficult.

In database approach, a single repository of data is maintained which is accessed by different users
as per their needs.
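
The contrast between the two approaches can be sketched in Python. This is only an illustration: the two "files" are modelled as dictionaries, and the names fees_file, library_file and students, as well as the student record itself, are hypothetical and not part of these notes.

```python
# File-processing approach: each application keeps its own copy of the data.
fees_file = {"S101": {"name": "Asha", "phone": "555-0100"}}
library_file = {"S101": {"name": "Asha", "phone": "555-0100"}}  # redundant copy

# The phone number changes, but only the fees application is updated.
fees_file["S101"]["phone"] = "555-0199"
inconsistent = fees_file["S101"]["phone"] != library_file["S101"]["phone"]

# Database approach: one repository that every application reads.
students = {"S101": {"name": "Asha", "phone": "555-0100"}}
students["S101"]["phone"] = "555-0199"  # a single update is enough


def phone_for_fees(student_id):
    """Fees application reading from the shared repository."""
    return students[student_id]["phone"]


def phone_for_library(student_id):
    """Library application reading from the same repository."""
    return students[student_id]["phone"]


consistent = phone_for_fees("S101") == phone_for_library("S101")
print(inconsistent, consistent)  # prints: True True
```

With two copies, one missed update is enough to make the files disagree. With a single repository there is no second copy to forget.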

Database Management System (DBMS)


A database management system is a collection of programs that enables users to create, maintain
and use a database. It enables creation of a repository of data that is defined once
and then accessed by different users as per their requirements. Thus there is a single repository of
data which is accessed by all the application programs.
The various operations that need to be performed on a database are as follows:
1. Defining the Database: It involves specifying the data type of data that will be stored in
the database and also any constraints on that data.
2. Populating the Database: It involves storing the data on some storage medium that is
controlled by DBMS.
3. Manipulating the Database: It involves modifying the database, retrieving data or querying the
database, generating reports from the database etc.
4. Sharing the Database: Allow multiple users to access the database at the same time.
5. Protecting the Database: It enables protection of the database from software/
hardware failures and unauthorized access.
6. Maintaining the Database: It involves adapting the database to changing requirements over time.

Some examples of DBMS are – MySQL, Oracle, DB2, IMS, IDS etc.

Characteristics of Database Management Systems


The main characteristics of a DBMS are as follows:
1. Self-describing Nature of a Database System: A DBMS contains not only the database
but also the description of the data that it stores. This description of data is called metadata.
Metadata is stored in a database catalogue or data dictionary. It contains the structure of the
data and also the constraints that are imposed on the data.
2. Insulation Between Programs and Data: Since the definition of data is stored separately in a
DBMS, any change in the structure of data would be done in the catalogue and hence programs
which access this data need not be modified. This property is called Program-Data Independence.
3. Sharing of Data: A multiuser environment allows multiple users to access the database
simultaneously. Thus a DBMS must include concurrency control software to allow simultaneous
access of data in the database without any inconsistency problems.
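
The self-describing nature of a DBMS can be observed with SQLite, used here through Python's built-in sqlite3 module. This is a sketch under assumptions: SQLite is chosen only because it needs no installation, and the employee table is a made-up example. SQLite keeps its catalogue in a table named sqlite_master, and PRAGMA table_info lists the columns of a table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # temporary in-memory database
conn.execute(
    "CREATE TABLE employee ("
    " employee_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " salary INTEGER)"
)

# The catalogue records the definition of every table in the database.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [(row[1], row[2])
           for row in conn.execute("PRAGMA table_info(employee)")]

print(tables)   # ['employee']
print(columns)  # [('employee_id', 'INTEGER'), ('name', 'TEXT'), ('salary', 'INTEGER')]
conn.close()
```

A program can therefore ask the DBMS what the data looks like instead of hard-coding the structure, which is the basis of program-data independence.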

Types of Users of DBMS

DBMS is used by many types of users depending on their requirements and interaction with
the DBMS. There are mainly four types of users:
1. End Users: Users who use the database for querying, modifying and generating reports
as per their needs. They are not concerned about the working and designing of the database. They
simply use the DBMS to get their task done.
2. Database Administrator (DBA): As the name implies, the DBA administers the database and the
DBMS. The DBA is responsible for authorizing access, monitoring its use, providing technical
support, and acquiring software and hardware resources.
3. Application Programmers: Application programmers write application programs that interact with
the database. These programs are written in high-level languages together with SQL.
4. System Analyst: A system analyst determines the requirements of the end users and then
develops specifications to meet these requirements. A system analyst plays a major role in the
database design and in assessing its technical and economic feasibility.

Advantages of using DBMS Approach


Following are the advantages of using a DBMS:
1. Reduction in Redundancy: Data in a DBMS is more concise because of the central repository of
data. All the data is stored at one place. There is no repetition of the same data. This also reduces
the cost of storing data on hard disks or other memory devices.
2. Improved Consistency: The chances of data inconsistencies in a database are also reduced as
there is a single copy of data that is accessed or updated by all the users.
3. Improved Availability: Same information is made available to different users. This helps sharing
of information by various users of the database.
4. Improved Security: Though there is improvement in the availability of information to users, it
may also be required to restrict the access to confidential information. By making use of passwords
and controlling users' database access rights, the DBA can provide security to the database.
5. User Friendly: Using a DBMS, it becomes very easy to access, modify and delete data. It reduces
the dependency of users on computer specialists to perform various data related operations in a
DBMS because of its user friendly interface.

Limitations of using DBMS Approach


The two main disadvantages of using a DBMS are:
1. High Cost: The cost of implementing a DBMS is very high. It is also a very time-consuming
process that involves analyzing user requirements, designing the database specifications, writing
application programs and then providing training.
2. Security and Recovery Overheads: Unauthorized access to a database can pose a threat to the
individual or organization, depending on the data stored. Also, the data must be regularly backed up
to prevent its loss due to fire, earthquakes, etc.
Hence the DBMS approach is usually not preferred when the database is small, well defined, less
frequently changed and used by few users.

Relational Database

The relational model, proposed by E. F. Codd at IBM in 1970, organizes data
as a collection of relations, where each relation corresponds to a table of values. Each row in the
table corresponds to a unique instance of data, and each column name is used to interpret the
meaning of the data in each row.

In relational model,
• A row is called a Tuple.
• A column is called an Attribute.
• A table is called a Relation.
• The data type of values in each column is called the Domain.
• The number of attributes in a relation is called the Degree of a relation.
• The number of rows in a relation is called the Cardinality of a relation.
• Relation Schema R is denoted by R (A1, A2, A3 …, An) where R is the relation name and A1,
A2, A3 …, An is the list of attributes.
• Relation State is the set of tuples in the relation at a point in time. A relation state r of the
relation schema R (A1, A2, A3 …, An), denoted r(R) is a set of n-tuples r = {t1, t2, …, tm}, where
each n-tuple is an ordered list of values t = <v1, v2, ..., vn>, where vi is in the domain of Ai or
is NULL. Here n is the degree of the relation and m is the cardinality of the relation.

Hence, in the EMPLOYEE table shown in the figure (not reproduced here),


• EMPLOYEE table is a relation.
• There are three tuples in EMPLOYEE relation.
• Name, Employee_ID, Gender, Salary, Date_of_Birth are attributes.
• The domain is a set of atomic (or indivisible) values. The domain of a database attribute is
the set of all the possible values that attribute may contain. To specify a domain, we specify
the data type of that attribute. Following are the domain of attributes of the EMPLOYEE
relation:
(a) Name – Set of character strings representing names of persons.
(b) Employee_ID – Set of 4-digit numbers
(c) Gender – male or female
(d) Salary – Number
(e) Date_of_Birth – Should have a valid date, month and year. The birth year of the
employee must be greater than 1985. Also the format should be dd-mm-yyyy.

• The degree of the EMPLOYEE relation is 5 as there are five attributes in this relation.
• The cardinality of the EMPLOYEE relation is 3 as there are three tuples in this relation.
• Relation Schema – EMPLOYEE (Name, Employee_ID, Gender, Salary, Date_of_Birth)
• Relation State – {<Neha Mehta, 1121, Female, 20000, 04-03-1990>,
<Paras Bansal, 2134, Male, 25000, 19-10-1993>,
<Himani Verma, 3145, Female, 20000, 23-11-1992>}
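
The degree and cardinality of the EMPLOYEE relation can be computed directly from its schema and state. The sketch below is only an illustration: it stores the schema as a Python tuple of attribute names and the state as a set of tuples, and it keeps the dates as plain strings.

```python
# Relation schema: the list of attribute names.
schema = ("Name", "Employee_ID", "Gender", "Salary", "Date_of_Birth")

# Relation state: the set of tuples at one point in time.
state = {
    ("Neha Mehta", 1121, "Female", 20000, "04-03-1990"),
    ("Paras Bansal", 2134, "Male", 25000, "19-10-1993"),
    ("Himani Verma", 3145, "Female", 20000, "23-11-1992"),
}

degree = len(schema)      # number of attributes (n)
cardinality = len(state)  # number of tuples (m)

# Every tuple must supply exactly one value per attribute.
assert all(len(t) == degree for t in state)

print(degree, cardinality)  # prints: 5 3
```

Adding or deleting a row changes the cardinality but not the degree. The degree changes only when the schema itself is altered.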

Some More Characteristics of Relations:


• Ordering of tuples is not important in a Relation.
• The ordering of attributes is also unimportant.
• No two tuples of a relation should be identical, i.e. for any pair of tuples, the value in at
least one column must be different.
• The value in each tuple is an atomic value (indivisible).
• If the value of an attribute in a tuple is not known or not applicable or not available, a
special value called null is used to represent them.
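
These characteristics follow from treating a relation as a mathematical set. A rough Python sketch, under the assumption that each tuple is modelled as a frozenset of (attribute, value) pairs and that None stands in for null; the helper make_tuple is hypothetical:

```python
def make_tuple(**values):
    """Build a tuple whose attribute order carries no meaning."""
    return frozenset(values.items())


t1 = make_tuple(Name="Neha Mehta", Employee_ID=1121, Salary=20000)
t1_reordered = make_tuple(Salary=20000, Employee_ID=1121, Name="Neha Mehta")
t2 = make_tuple(Name="Paras Bansal", Employee_ID=2134, Salary=None)  # salary not known

relation_a = {t1, t2}
relation_b = {t2, t1_reordered, t1}  # different order, one repeated tuple

same_tuple = t1 == t1_reordered          # attribute order is unimportant
same_relation = relation_a == relation_b  # tuple order is unimportant
print(same_tuple, same_relation, len(relation_b))  # prints: True True 2
```

The repeated tuple in relation_b is silently absorbed, which mirrors the rule that no two tuples of a relation are identical.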

Examples of RDBMS are Oracle, MySQL, IBM DB2.

Relational Model Constraints


Constraints are restrictions on the values stored in a database, based on the requirements.
The various types of constraints in the relational model are:
• Domain Constraint: It specifies that the value of every attribute in each tuple must be
from the domain of that attribute. For example, the Employee_ID must be a 4-digit
number. Hence a value such as “12321” or “A234” violates the domain constraint, as the
former is not 4 digits long and the latter contains a letter.

• Key Constraint:
(i) A superkey is a set of attributes in a relation for which no two tuples in a relation state
have the same combination of values. Every relation has at least one superkey,
namely the combination of all its attributes. Thus, for the EMPLOYEE relation,
following are some of the superkeys:

(a) {Name, Employee_ID, Gender, Salary, Date_of_Birth} - default superkey
consisting of all attributes.
(b) {Name, Employee_ID, Date_of_Birth}
(c) {Employee_ID, Gender, Salary}
(d) {Name, Employee_ID, Gender}
(e) {Employee_ID}
However {Gender, Salary} is not a superkey because both these attributes have
identical values for employees Neha and Himani.

(ii) Key is the minimal superkey, which means it is the superkey of a relation from which
if any attribute is removed then it no longer remains a superkey.
For example, the superkey {Name, Employee_ID, Gender} is not a key, as we can remove
Name and Gender from this combination and what is left, {Employee_ID}, is still a
superkey. {Employee_ID} is a key because it is a superkey and no further removals are
possible. A relation may have more than one key. Consider the relation PERSON with the
following schema: PERSON (Aadhar_number, PAN, Voter_ID_cardno, Name, Date_of_birth,
Address). This relation has three keys, namely {Aadhar_number}, {PAN} and {Voter_ID_cardno}, as
every individual in India has a unique Aadhar card number, PAN and Voter ID card
number.

(iii) Candidate key: A key as described above is called candidate key of the relation. For
example, the PERSON relation has three candidate keys as discussed above.

(iv) Primary Key: One of the candidate keys may be designated as the primary key. The primary
key is used to identify tuples in a relation. If a relation has many candidate keys, it is
preferable to choose the one with the fewest attributes as the primary key. The primary
key is usually underlined in the schema of the relation. For example, in the relation
schema: PERSON (Aadhar_number, PAN, Voter_ID_cardno, Name, Date_of_birth, Address),
Aadhar_number is the primary key.

• Null Value Constraint: Sometimes it is required that certain attributes cannot have null
values. For example, if every EMPLOYEE must have a valid name then the Name attribute
is constrained to be NOT NULL.
• Entity Integrity Constraint: This constraint specifies that primary key of a relation cannot
have null value. The reason behind this constraint is that we know primary key contains no
duplicates. However, if we allow null values for a primary key then there can be multiple
tuples for which primary key is having null values. This would imply that we are allowing
duplicate values (NULL) for a primary key which itself violates the definition of primary key.
• Referential Integrity Constraint: This constraint is specified between two relations.
A foreign key in a relation R1 is a set of attributes in R1 that refers to the primary key of another
relation R2, provided that the domain of the foreign key attributes is the same as that of the primary
key attributes, and each value of the foreign key either occurs as a value of the primary key in some
tuple of R2 or is NULL.

R1 is called the referencing relation and R2 is called referenced relation, and a referential
integrity constraint holds from R1 to R2. The main purpose of this constraint is to check that
data entered in one relation is consistent with the data entered in another relation. For
example, consider two relation schemas:

Department (Dept_Name, Dept_ID, No_of_Teachers)


Teacher (Teacher_Name, Teacher_ID, Dept_ID, Subject)

➔ Dept_ID is the primary key of Department relation.


➔ Teacher_ID is the primary key of Teacher relation.

Dept_ID, the primary key of the Department relation, is also present in the Teacher relation. The reason
is that every teacher belongs to a particular department. That means Dept_ID of the Teacher relation
must have a value that exists in the Dept_ID attribute of the Department relation, or it can be NULL in case
a teacher has not yet been assigned to a department. We say that Dept_ID of the Teacher relation is a
foreign key that references the primary key of the Department relation (Dept_ID).
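
The Department and Teacher example can be tried out with SQLite through Python's sqlite3 module. This is a sketch under assumptions: the column data types and the sample rows are invented for illustration, and SQLite enforces foreign keys only after PRAGMA foreign_keys = ON has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.execute(
    "CREATE TABLE Department ("
    " Dept_Name TEXT,"
    " Dept_ID INTEGER PRIMARY KEY,"
    " No_of_Teachers INTEGER)"
)
conn.execute(
    "CREATE TABLE Teacher ("
    " Teacher_Name TEXT,"
    " Teacher_ID INTEGER PRIMARY KEY,"
    " Dept_ID INTEGER REFERENCES Department(Dept_ID),"
    " Subject TEXT)"
)

conn.execute("INSERT INTO Department VALUES ('Science', 10, 2)")
conn.execute("INSERT INTO Teacher VALUES ('Teacher A', 1, 10, 'Physics')")  # Dept 10 exists
conn.execute("INSERT INTO Teacher VALUES ('Teacher B', 2, NULL, 'Maths')")  # not yet assigned

try:
    # Department 99 does not exist, so this row violates referential integrity.
    conn.execute("INSERT INTO Teacher VALUES ('Teacher C', 3, 99, 'History')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

teacher_count = conn.execute("SELECT COUNT(*) FROM Teacher").fetchone()[0]
print(rejected, teacher_count)  # prints: True 2
conn.close()
```

The row with a NULL Dept_ID is accepted, matching the rule that a foreign key value must either exist in the referenced relation or be NULL.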

A foreign key does not need to have the same name as the primary key that it references.

A foreign key may also refer to the same relation.

E.g.: Residents (Name, RID, Block_no, House_no, Floor, Neighbor_RID)

The primary key of this relation is RID (Resident ID). To store information about a resident's
neighbor, the foreign key Neighbor_RID references RID of Residents. Note that the
referencing and referenced relations are the same in this case.
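
The domain and key constraints described earlier in this section can also be checked mechanically against the EMPLOYEE relation state. The helper names below are hypothetical. Note the limitation of this sketch: a single relation state can show that a set of attributes is not a superkey, but it cannot prove that the set stays unique in every future state.

```python
from itertools import combinations

schema = ("Name", "Employee_ID", "Gender", "Salary", "Date_of_Birth")
state = [
    ("Neha Mehta", 1121, "Female", 20000, "04-03-1990"),
    ("Paras Bansal", 2134, "Male", 25000, "19-10-1993"),
    ("Himani Verma", 3145, "Female", 20000, "23-11-1992"),
]


def in_employee_id_domain(value):
    """Domain constraint: Employee_ID must be a 4-digit number."""
    text = str(value)
    return len(text) == 4 and all(ch in "0123456789" for ch in text)


def unique_in_state(attrs):
    """True if no two tuples in `state` agree on all attributes in `attrs`."""
    positions = [schema.index(a) for a in attrs]
    projected = [tuple(t[i] for i in positions) for t in state]
    return len(set(projected)) == len(projected)


def is_minimal(attrs):
    """True if `attrs` is unique in this state and no proper subset is."""
    if not unique_in_state(attrs):
        return False
    return not any(
        unique_in_state(subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )


print(in_employee_id_domain("12321"), in_employee_id_domain("A234"))  # False False
print(unique_in_state(("Gender", "Salary")))                # False: Neha and Himani match
print(is_minimal(("Name", "Employee_ID", "Gender")))        # False: attributes can be removed
print(is_minimal(("Employee_ID",)))                         # True
```

In this particular state {Name} also happens to be unique, yet it would not normally be declared a key, because two employees may share a name. Keys are decided from the meaning of the data, not from one snapshot.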

Structured Query Language (SQL)

SQL is a language that is used to manage data stored in an
RDBMS. It comprises a Data Definition Language (DDL) and a Data Manipulation Language
(DML): the DDL is used to define the structure and constraints of data, and the DML
is used to insert, modify and delete data in a database.

SQL commands are used to perform all the operations. SQL uses the terms table, row and column
for the relational model terms relation, tuple and attribute.
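
The split between DDL and DML can be illustrated with a few SQL statements run through Python's sqlite3 module. This is a sketch under assumptions: SQLite is used only for convenience, the column data types and the CHECK condition are one possible way to express the EMPLOYEE constraints, and other RDBMS products differ in details of syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure of the table and the constraints on its data.
conn.execute(
    "CREATE TABLE EMPLOYEE ("
    " Name TEXT NOT NULL,"
    " Employee_ID INTEGER PRIMARY KEY CHECK (Employee_ID BETWEEN 1000 AND 9999),"
    " Gender TEXT,"
    " Salary INTEGER,"
    " Date_of_Birth TEXT)"
)

# DML: insert, modify and delete rows.
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?, ?)",
    [
        ("Neha Mehta", 1121, "Female", 20000, "04-03-1990"),
        ("Paras Bansal", 2134, "Male", 25000, "19-10-1993"),
        ("Himani Verma", 3145, "Female", 20000, "23-11-1992"),
    ],
)
conn.execute("UPDATE EMPLOYEE SET Salary = 22000 WHERE Employee_ID = 1121")
conn.execute("DELETE FROM EMPLOYEE WHERE Employee_ID = 3145")

# Query the table to see the effect of the statements above.
rows = conn.execute(
    "SELECT Employee_ID, Name, Salary FROM EMPLOYEE ORDER BY Employee_ID"
).fetchall()
print(rows)  # [(1121, 'Neha Mehta', 22000), (2134, 'Paras Bansal', 25000)]
conn.close()
```

The CREATE TABLE statement talks about columns and constraints only, while the INSERT, UPDATE and DELETE statements talk about rows. This matches the table, row and column vocabulary that SQL uses in place of relation, tuple and attribute.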
