Relational Databases - 3rd Semester
Relational databases are a type of data storage system that organizes information into tables with rows
and columns. Each table represents a specific type of data, and the columns define the attributes of that
data.
Data is typically structured across multiple tables, which can be joined by matching a foreign key in
one table to the primary key of another. For example, a student table may be linked to a course table
through a common student ID.
History: Entity–relationship modeling was developed for database design by Peter Chen and
published in a 1976 paper. Variants of the idea existed earlier. Today it is commonly
used to teach students the basics of database structure. (Wikipedia)
WHAT IS AN ER DIAGRAM?
- The diagram represents the database and the interrelations between the tables.
- It is a visual representation of the data model which uses entities and relationships to illustrate
the associations between entities.
- In a relational database, a relationship between entities is implemented by storing the
primary key of one entity as a pointer or "foreign key" in the table of another entity.
- Some columns serve as keys, and those keys help connect tables. The key types include
primary, foreign, and secondary keys.
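The foreign-key mechanism described above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration only: the table and column names (student, car, owner_id) and the sample rows are assumptions made for the example, not part of the notes.

```python
import sqlite3

def build_demo():
    """Create two linked tables in an in-memory database (illustrative schema)."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE car ("
        " car_id INTEGER PRIMARY KEY,"
        " color TEXT NOT NULL,"
        " owner_id INTEGER NOT NULL REFERENCES student(student_id))"
    )
    conn.execute("INSERT INTO student VALUES (1, 'Jack')")
    conn.execute("INSERT INTO car VALUES (10, 'Blue', 1)")
    return conn

conn = build_demo()

# Join the tables by matching the foreign key to the primary key.
rows = conn.execute(
    "SELECT s.name, c.color FROM car AS c "
    "JOIN student AS s ON c.owner_id = s.student_id"
).fetchall()
print(rows)

# A car that points at a non-existent student is rejected.
try:
    conn.execute("INSERT INTO car VALUES (11, 'Red', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```

The car table stores the primary key of student as owner_id, which is what lets the join find the owner of each car.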
An entity can be a person, place, object, event, or concept about which data is to be maintained. For
example: Car, Student, etc.
❖ Strong Entity is a type of entity that has a key attribute. It does not depend on other entities in the
schema. It has a primary key that helps in identifying it uniquely.
➢ It is represented by a rectangle.
➢ Relations can be created between entities.
➢ Creates smaller tables (normalization)
➢ Hierarchy dataset
➢ Partitional relations
➢ Independent
Example: In a university database, the "Student" entity is a strong entity. Each student can be
uniquely identified by their student ID, which serves as the primary key.
❖ Weak entity is an entity that cannot be uniquely identified by its attributes alone. It relies on a
related strong entity (known as the "owner" or "parent" entity) for identification. The existence of
a weak entity is dependent on the existence of a related strong entity.
➢ No primary key.
➢ Weak relations.
➢ Bigger tables
➢ No hierarchy
➢ Optional relations
➢ Dependent
An entity is an object of an entity type, and the set of all entities of that type is called an entity
set. In a relational database, an entity set corresponds to a table.
Attribute is the property or characteristic of the entity. For example: Color of the car entity, Name of
the student entity. In the ER diagram, the attribute is represented by an oval. Types of attributes are
string/char/int/bool.
❖ KEY ATTRIBUTE: The attribute which uniquely identifies each entity in the entity set is called
the key attribute. It is represented by an oval with the attribute name underlined.
❖ COMPOSITE ATTRIBUTE: An attribute composed of several other attributes.
Example: the Address attribute of the Student entity type consists of Street, City, State, and
Country.
❖ MULTIVALUED ATTRIBUTE: An attribute consisting of more than one value for a given entity.
It is represented by a double oval.
❖ DERIVED ATTRIBUTE: An attribute that can be derived from other attributes of the entity type
is known as a derived attribute. It is represented by a dashed oval.
The Complete Entity Type Student with its Attributes can be represented as the following.
Relation is the association between the instances of one or more entity types. For example: Blue car
belongs to Student Jack.
The relationship type is represented by a diamond and connects the entities with lines.
A set of relationships of the same type is known as a relationship set. The following relationship set
depicts S1 as enrolled in C2, S2 as enrolled in C1, and S3 as registered in C3.
Relationship cardinality (size)
The number of times an entity of an entity set participates in a relationship set is known as cardinality.
Cardinality can be of different types:
❖ One-to-One: When each entity in each entity set can take part only once in the relationship, the
cardinality is one-to-one.
Example: Let us assume that a male can marry one female and a female can marry one male. So
the relationship will be one-to-one. The total number of tables that can be used in this is 2.
❖ One-to-Many: When an entity in one entity set can take part more than once in the relationship set
while each entity in the other entity set takes part only once, the cardinality is one-to-many. The
total number of tables that can be used in this is 2.
Example: Let us assume that one surgeon department can accommodate many doctors. So the
Cardinality will be 1 to M. It means one department has many Doctors.
❖ Many-to-One: When entities in one entity set can take part only once in the relationship set and
entities in other entity sets can take part more than once in the relationship set, cardinality is
many-to-one. This is the mirror image of one-to-many, so it can likewise be implemented with 2 tables,
with the foreign key stored on the "many" side.
Example: Let us assume that a student can take only one course but one course can be taken by
many students. So the cardinality will be n to 1. It means that for one course there can be n
students but for one student, there will be only one course.
❖ Many-to-Many: When entities in all entity sets can take part more than once in the relationship,
the cardinality is many-to-many. The total number of tables that can be used in this is 3.
Example: Let us assume that a student can take more than one course and one course can be taken
by many students. So the relationship will be many to many.
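The four basic cardinalities demonstrated here are enough for most designs. When a many-to-many
relationship is implemented, it is resolved through an additional linking table, which forms a weak
entity set because its rows depend on the two entities they connect.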
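A many-to-many relationship with its third, linking table can be sketched as follows, using Python's built-in sqlite3 module. The identifiers S1 to S3 and C1 to C3 follow the enrollment example in the notes; the table names and the extra enrollment of S1 in C1 are assumptions added so that the relationship is genuinely many-to-many.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (student_id TEXT PRIMARY KEY);
CREATE TABLE course  (course_id  TEXT PRIMARY KEY);
-- Third table: one row per (student, course) pair.
CREATE TABLE enrolled (
    student_id TEXT NOT NULL REFERENCES student(student_id),
    course_id  TEXT NOT NULL REFERENCES course(course_id),
    PRIMARY KEY (student_id, course_id)
);
INSERT INTO student VALUES ('S1'), ('S2'), ('S3');
INSERT INTO course  VALUES ('C1'), ('C2'), ('C3');
INSERT INTO enrolled VALUES ('S1', 'C2'), ('S2', 'C1'), ('S3', 'C3'), ('S1', 'C1');
""")

def courses_of(student_id):
    """All courses taken by one student."""
    cur = conn.execute(
        "SELECT course_id FROM enrolled WHERE student_id = ? ORDER BY course_id",
        (student_id,),
    )
    return [row[0] for row in cur]

def students_in(course_id):
    """All students taking one course."""
    cur = conn.execute(
        "SELECT student_id FROM enrolled WHERE course_id = ? ORDER BY student_id",
        (course_id,),
    )
    return [row[0] for row in cur]

print(courses_of("S1"))
print(students_in("C1"))
```

Each row of enrolled only makes sense while both the student and the course it references exist, which is why the linking table behaves like a dependent entity.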
A relational database is a database based on the relational model of data. A system used to maintain
relational databases is a relational database management system (RDBMS). Many relational database
systems are equipped with the option of using SQL (Structured Query Language) for querying and
updating the database.
● DATA: Data refers to facts, information, or raw material that is often in the form of numbers, text,
symbols, or multimedia. In the context of computing and databases, data can represent a wide
range of things, from individual pieces of information to large sets of structured or unstructured
information.
● DATABASE: A database is a structured collection of data that is organized to be easily accessed,
managed, and updated. It acts as a centralized repository for storing and managing data.
DATA INDEPENDENCE
There are two types of data independence. In both cases, data independence is about making
improvements (logical or physical) without causing a lot of disruption. It's like updating a phone app –
you get new features, but you don't need to relearn how to use the app. Similarly, in databases, you can
make changes without breaking how data is stored or accessed.
● LOGICAL: The ability to make changes to the logical structure of the database schema without
affecting the application programs that access the data. Logical data independence is used to
change the conceptual schema without changing the following things:
○ External view
○ External API or programs
Example: Imagine your database is like a library, and each book is a piece of data. If you decide
to rearrange the sections of the library, logical data independence means you can move books to
different shelves without telling people how to find them. The way books are organized (logical
structure) can change, but readers don't need to know or change how they search for a book
(applications still work).
● PHYSICAL: The ability to make changes to the physical storage structures (like indexes, storage
devices, or file organization) without affecting the application programs or the logical structure.
Example: Imagine your library has electronic systems to track books, and you decide to upgrade
to a faster system. Physical data independence is like upgrading the electronic system without
changing how the books are organized on the shelves. Readers can still find books in the same
way (logical structure remains), even though the behind-the-scenes technology (physical
structure) has been improved.
Changes covered by physical data independence include:
➔ Using a new storage device, such as a hard drive or magnetic tape
➔ Modifying the file organization technique in the database
➔ Switching to different data structures for better performance
➔ Changing the access method
➔ Modifying indexes
➔ Changing the compression techniques or hashing algorithms
➔ Changing the location of the database, say from the C drive to the D drive
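Both kinds of data independence can be illustrated with a small sketch in Python's built-in sqlite3 module. It assumes a hypothetical library schema (book, catalog, library_item) chosen to match the library analogy; the "application" is a single query that reads through a view. Adding an index stands in for a physical change, and renaming the base table stands in for a logical change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (book_id INTEGER PRIMARY KEY, title TEXT, shelf TEXT)")
conn.executemany(
    "INSERT INTO book VALUES (?, ?, ?)",
    [(1, "Databases", "A1"), (2, "Networks", "B4")],
)
# Applications read through a view rather than the base table.
conn.execute("CREATE VIEW catalog AS SELECT book_id, title FROM book")

# The "application program": it never changes below.
APP_QUERY = "SELECT title FROM catalog ORDER BY book_id"
before = conn.execute(APP_QUERY).fetchall()

# Physical change: add an index. Access paths change, the query does not.
conn.execute("CREATE INDEX idx_book_title ON book(title)")
after_physical = conn.execute(APP_QUERY).fetchall()

# Logical change: rename the base table and redefine the view over it.
conn.execute("DROP VIEW catalog")
conn.execute("ALTER TABLE book RENAME TO library_item")
conn.execute("CREATE VIEW catalog AS SELECT book_id, title FROM library_item")
after_logical = conn.execute(APP_QUERY).fetchall()

print(before == after_physical == after_logical)
```

The view plays the role of the external schema: as long as it keeps presenting the same columns, the query that depends on it keeps working.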
In essence, data independence ensures smooth operations, easy adaptation, and efficient development in
the dynamic world of databases.
1. Maintenance Ease:
a. Physical: Simplifies changes to storage without affecting applications.
b. Logical: Supports modifications to data structure without disrupting applications.
2. System Evolution:
a. Physical: Allows adopting new technologies without changing data structure.
b. Logical: Facilitates adapting the database design to evolving needs.
3. Application Development:
a. Physical: Lets developers focus on application logic, not storage details.
b. Logical: Enables modular development without worrying about data changes.
4. Minimizing Disruptions:
a. Physical: Changes at the storage level with minimal impact on users.
b. Logical: Modifications to data structure without disrupting existing functionality.
5. Database Performance:
a. Physical: Optimizes storage for better performance.
b. Logical: Supports efficient data organization for improved query processing.
6. Adaptability to Changes:
a. Physical: Adapts to new technologies without affecting applications.
b. Logical: Adjusts data design to changing business rules without impacting storage.
LEVELS OF DATABASE
Database systems involve complex data structures. To make data retrieval efficient and to reduce
complexity for users, developers use data abstraction.
● Internal Level/Schema: The internal schema defines the physical storage structure of the
database. The internal schema is a very low-level representation of the entire database. It contains
multiple occurrences of multiple types of internal record. In ANSI terminology, it is also called
the "stored record". Facts:
○ The internal schema is the lowest level of data abstraction
○ It helps you to keep information about the actual representation of the entire database.
Like the actual storage of the data on the disk in the form of records
○ The internal view tells us what data is stored in the database and how
○ It never deals with the physical devices. Instead, internal schema views a physical device
as a collection of physical pages
● Conceptual Schema/Level: It describes the structure of the whole database for the
community of users. This schema hides information about the physical storage structures and
focuses on describing data types, entities, relationships, etc.
○ This logical level comes between the user level and the physical storage view. There is
only a single conceptual view per database.
○ Defines all database entities, their attributes, and their relationships
○ Security and integrity information
○ In the conceptual level, the data available to a user must be contained in or derivable from
the physical level
● External Schema/Level: An external schema describes the part of the database which a specific
user is interested in. It hides the unrelated details of the database from the user. There may be “n”
number of external views for each database.
○ Each external view is defined using an external schema, which consists of definitions of
various types of external record of that specific view.
○ An external view is just the content of the database as it is seen by some specific user. For
example, a user from the sales department will see only sales related data.
○ An external level is only related to the data which is viewed by specific end users.
○ This level includes some external schemas.
○ External schema level is nearest to the user
○ The external schema describes the segment of the database which is needed by a certain
user group and hides the remaining details of the database from that user group
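An external view such as the sales department example can be sketched with an SQL view, here run through Python's built-in sqlite3 module. The employee table, its columns, and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    emp_id     INTEGER PRIMARY KEY,
    name       TEXT,
    department TEXT,
    salary     INTEGER
);
INSERT INTO employee VALUES
    (1, 'Asha', 'Sales', 4000),
    (2, 'Ben',  'HR',    3500),
    (3, 'Chen', 'Sales', 4200);
-- External schema for the sales department: only sales rows, no salary column.
CREATE VIEW sales_staff AS
    SELECT emp_id, name FROM employee WHERE department = 'Sales';
""")

cursor = conn.execute("SELECT * FROM sales_staff ORDER BY emp_id")
columns = [col[0] for col in cursor.description]  # column names visible to this user group
sales_view = cursor.fetchall()

print(columns)
print(sales_view)
```

The base table corresponds to the conceptual level, while sales_staff is one of possibly many external views defined over it.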
CONSTRAINTS ON RDM
Constraints are the restrictions or sets of rules imposed on the database contents. They safeguard the
quality of the database by validating operations such as data insertion and update, so that these
operations are performed without affecting the integrity of the data. They protect the database against
threats and damage.
● Domain constraints
○ Every domain must contain atomic values(smallest indivisible units) which means
composite and multi-valued attributes are not allowed.
○ We perform a data type check here, which means that when we assign a data type to a column
we limit the values that it can contain. E.g. if we assign the data type of the attribute age as
int, we can't give it values of any other data type.
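A domain constraint on the age attribute can be sketched as follows with Python's built-in sqlite3 module. SQLite is flexibly typed by default, so this example uses a CHECK constraint with typeof() to enforce the integer domain; the extra rule that age must not be negative is an assumption added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " student_id INTEGER PRIMARY KEY,"
    " age INTEGER NOT NULL CHECK (typeof(age) = 'integer' AND age >= 0))"
)

def try_insert(student_id, age):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?)", (student_id, age))
        return True
    except sqlite3.IntegrityError:
        return False

print(try_insert(1, 21))      # a valid integer age
print(try_insert(2, "abc"))   # text is outside the integer domain
print(try_insert(3, -5))      # violates the assumed non-negative rule
```

Other database systems enforce the declared column type directly, so the typeof() check is specific to this SQLite-based sketch.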
In the relational model, there are several fundamental operations used to manipulate data in tables. These
operations are part of relational algebra, a set of mathematical operations designed for querying and
transforming relational databases. Here are some key operations in the relational model:
- Selection (σ): Selects rows from a table that satisfy a specified condition.
Example: σ_age > 25(Student) selects rows from the "Student" table where the age is
greater than 25.
- Projection (π): Selects specific columns from a table.
Example: π_name, age(Student) selects only the "name" and "age" columns from the
"Student" table.
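- Union (∪): Combines rows from two tables without duplicates. Inserting rows can be viewed as a
union of the table with the new rows.
Example: R ∪ S combines all rows from tables R and S, removing duplicate rows.
- Intersection (∩): Returns the rows that appear in both tables.
Example: R ∩ S returns only the rows present in both R and S.
- Set Difference (- or \): Returns the rows of one table that do not appear in another. Deleting rows
can be viewed as a set difference.
Example: R - S returns rows from table R that are not present in table S, effectively
removing from R the rows that exist in both tables.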
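The operations above can be mimicked in plain Python to make their behavior concrete. This is an illustrative sketch, not how a database engine implements them: relations are modeled as lists of dicts or sets of tuples, and the sample data is made up.

```python
def select(relation, predicate):
    """Selection: keep the rows that satisfy the condition."""
    return [row for row in relation if predicate(row)]

def project(relation, columns):
    """Projection: keep only the named columns, dropping duplicate rows."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(columns, key)))
    return result

student = [
    {"name": "Jack", "age": 27},
    {"name": "Mia", "age": 22},
    {"name": "Omar", "age": 31},
]

older = select(student, lambda row: row["age"] > 25)   # selection with age > 25
names = project(student, ["name"])                     # projection on name

# Set operations need relations with the same columns; tuples in sets model that.
R = {(1, "a"), (2, "b")}
S = {(2, "b"), (3, "c")}
union = R | S
intersection = R & S
difference = R - S

print(older, names, union, intersection, difference, sep="\n")
```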
DISADVANTAGES OF RELATIONAL DATABASES
1. Limitations:
a. Some relations have fixed limits that can't be expanded.
2. Complexity:
a. Can be complex and challenging, especially for users unfamiliar with SQL and database
design.
3. Performance Issues:
a. Suffers from performance challenges with large data sets and complex queries.
b. Joins and indexing strategies may slow down operations.
4. Scalability Challenges:
a. Managing scalability becomes complex as the database grows.
b. Adding tables or indexes can be time-consuming.
5. Cost:
a. Relational databases can be expensive to license and maintain, especially for large
deployments.
b. Requires dedicated hardware and specialized software.
6. Limited Flexibility:
a. Designed for predefined structures, limiting flexibility with unstructured or
semi-structured data.
7. Data Redundancy:
a. May lead to data redundancy, causing inefficiencies and challenges in maintaining
consistency.
1. Relational Databases
a. Database (row and column)
b. Application
2. Non-relational databases
a. Apache HBase
b. IBM Domino
c. Oracle NoSQL Database
SQL
● Structured Query Language
○ SQL is a domain-specific language used for managing and manipulating relational
databases. SQL provides a standardized way to interact with relational database
management systems (RDBMS) and is widely used for tasks such as querying data,
updating data, inserting data, and creating or modifying database structures.
● MySQL
○ MySQL is an open-source relational database management system (RDBMS) that uses SQL. It is
known for its reliability, ease of use, and wide adoption.
● MS Access
○ Microsoft Access is a relational database management system (RDBMS) developed by
Microsoft. It provides a user-friendly interface for building and managing databases. MS
Access is often used for smaller-scale applications and is part of the Microsoft Office
suite.
● Oracle
○ Oracle Database, often referred to simply as Oracle, is a powerful and widely used
relational database management system. It is known for its scalability, reliability, and
comprehensive feature set. Oracle is commonly used in enterprise-level applications.
● Sybase
○ Sybase is a relational database management system developed by Sybase Inc. It is known
for its performance and is often used in financial and banking applications. Sybase's
Adaptive Server Enterprise (ASE) is a popular version of the database.
● Postgres
○ PostgreSQL, commonly referred to as Postgres, is an open-source relational database
management system. It is known for its extensibility, support for complex data types, and
compliance with SQL standards. Postgres is used in a variety of applications.
● SQL Server
○ Microsoft SQL Server is a comprehensive relational database management system
developed by Microsoft. It is widely used in enterprise environments for various
applications, including business intelligence, data warehousing, and web development.
TYPES OF SQL
1. Data Definition Language (DDL)
a. Allows you to store shared data
b. Data independence, improved integrity
c. Allows multiple users
d. Improved security, efficient data access
2. Data Manipulation Language (DML)
a. INSERT
b. UPDATE
c. DELETE
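The difference between defining structure and changing rows can be sketched with Python's built-in sqlite3 module. INSERT, UPDATE and DELETE change the rows stored in a table and are usually classed as data manipulation statements, while statements such as CREATE TABLE and ALTER TABLE define the structure. The product table and its values are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define or change the structure.
conn.execute(
    "CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT, quantity INTEGER)"
)
conn.execute("ALTER TABLE product ADD COLUMN price REAL")

# DML: change the rows stored in that structure.
conn.execute(
    "INSERT INTO product (product_id, name, quantity, price) VALUES (1, 'Pen', 10, 1.5)"
)
conn.execute(
    "INSERT INTO product (product_id, name, quantity, price) VALUES (2, 'Ink', 3, 4.0)"
)
conn.execute("UPDATE product SET quantity = 50 WHERE product_id = 1")
conn.execute("DELETE FROM product WHERE product_id = 2")
conn.commit()

remaining = conn.execute(
    "SELECT product_id, name, quantity, price FROM product"
).fetchall()
print(remaining)
```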
Maintenance refers to the activities and tasks performed to ensure the proper functioning,
performance, and security of a database system.
➢ Offline: Offline maintenance involves temporarily taking the database or specific services offline
to perform maintenance tasks without any active user interactions. E.g. Major updates,
configuration changes, or tasks that require exclusive access to the database.
○ Schedule tasks: Regularly scheduled tasks, such as backups, indexing, and data
consistency checks, are essential for maintaining the health and performance of the
database.
○ Update: Updating the database involves applying patches, bug fixes, or new versions to
ensure that the database software is up-to-date.
➢ Online: Online maintenance allows the database to remain operational while certain maintenance
tasks are performed, minimizing disruptions.
○ MongoDB sync: In the context of MongoDB, synchronization refers to keeping
multiple instances of the database consistent with each other, ensuring that data changes
in one instance are reflected in others. It supports high availability, fault tolerance,
and load balancing in distributed MongoDB deployments.
Maintenance
- Index reorganize
- Index rebuild
- Update
- Integrity
- repair
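A few of these maintenance tasks can be tried out on SQLite through Python's built-in sqlite3 module. This is only a sketch under the assumption of a small made-up product table: REINDEX stands in for an index rebuild, ANALYZE refreshes the statistics used by the query planner, and PRAGMA integrity_check performs an integrity check. Other database systems use different commands for the same jobs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE INDEX idx_product_name ON product(name)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [(i, f"item{i}") for i in range(100)],
)
# Simulate churn: remove every second row.
conn.execute("DELETE FROM product WHERE product_id % 2 = 0")
conn.commit()

conn.execute("REINDEX idx_product_name")  # rebuild the index from the table data
conn.execute("ANALYZE")                   # update planner statistics
integrity = conn.execute("PRAGMA integrity_check").fetchone()[0]
remaining = conn.execute("SELECT COUNT(*) FROM product").fetchone()[0]

print(integrity, remaining)
```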
Maintenance process
• upload, insert, DML
• Delete/Drop
• Structure modification
• Search, sort, filter, order
Maintenance query
• When?
• Table create
• Update
• Join
• Delete
• XOR
Stock operations
• Open/import option
• Save modes (backup)
• Security functions
• Joins update
NORMALIZATION
Normalization is the process of organizing data in a database. It includes creating tables and establishing
relationships between those tables according to rules designed both to protect the data and to make the
database more flexible by eliminating redundancy and inconsistent dependency.
In a simple way, normalization is like tidying up your data to make it organized and efficient. Imagine you
have a bunch of information scattered all over the place. Normalization helps by putting that information
neatly into tables and making sure these tables work together smoothly. It's a set of rules that not only
keeps your data safe but also makes your database more flexible. Redundancy (repeating the same stuff)
is minimized, and we ensure that one piece of information doesn't rely on something that might change
randomly. In simple terms, it's about keeping things neat, organized, and well-connected in your database
world.
Database Normal Forms: rules for organizing information in a neat and efficient way. The main idea is
to reduce redundancy (not repeating things too much) and make sure your data is organized well.
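The effect of removing redundancy can be sketched in plain Python. The flat rows below are invented for illustration: student and course details repeat on every enrollment, and the hypothetical normalize function splits them into three smaller structures that play the role of separate tables.

```python
# Flat, redundant rows: student and course details repeat on every enrollment.
flat_rows = [
    {"student_id": 1, "student_name": "Jack", "course_id": "C1", "course_title": "Databases"},
    {"student_id": 1, "student_name": "Jack", "course_id": "C2", "course_title": "Networks"},
    {"student_id": 2, "student_name": "Mia", "course_id": "C1", "course_title": "Databases"},
]

def normalize(rows):
    """Split flat rows into students, courses, and the enrollments linking them."""
    students, courses, enrollments = {}, {}, []
    for row in rows:
        students[row["student_id"]] = row["student_name"]   # one entry per student
        courses[row["course_id"]] = row["course_title"]     # one entry per course
        enrollments.append((row["student_id"], row["course_id"]))
    return students, courses, enrollments

students, courses, enrollments = normalize(flat_rows)

# Renaming a course is now a single update instead of one per enrollment row.
courses["C1"] = "Relational Databases"

print(students, courses, enrollments, sep="\n")
```

In the flat form the same rename would have to touch two rows, and missing one of them would leave the data inconsistent.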
States of Transactions: Different phases a transaction goes through, including the active, partially
committed, committed, failed, aborted, and terminated states.
• Active State
• Partially Committed
• Committed State
• Failed State
• Terminated State
ACID Properties: Fundamental principles ensuring reliable and consistent transactions
• Atomicity
• Consistency
• Isolation
• Durability
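Atomicity, the first of these properties, can be demonstrated with a small sketch using Python's built-in sqlite3 module. The account table, the names, and the amounts are assumptions for the example. A transfer consists of two updates, and if the second one fails, the first is rolled back as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 20)])
conn.commit()

def transfer(src, dst, amount):
    """Move money atomically: both updates happen, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances():
    return dict(conn.execute("SELECT name, balance FROM account"))

ok = transfer("alice", "bob", 30)       # succeeds
failed = transfer("alice", "bob", 500)  # would overdraw alice, so it is rolled back
final = balances()
print(ok, failed, final)
```

The CHECK constraint also touches on consistency: the database refuses to move into a state where a balance is negative.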
Types of Transactions
• Based on Application areas
• Based on Actions
• Based on Structure
Schedule
Initial Product Quantity is 10
Transaction 1: Update Product Quantity to 50
Transaction 2: Read Product Quantity
Types of equivalence
1. Result Equivalence
2. View Equivalence
3. Conflict Equivalence
Serializability: Ensuring that the outcome of concurrent transactions is equivalent to some sequential
execution of those transactions.
• Conflict
• View
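Conflict serializability is commonly tested by building a precedence graph and checking it for cycles. The sketch below is one way to do that in plain Python. The schedule encoding (transaction, operation, item) and the function names are assumptions made for this example, and the item name "quantity" echoes the product quantity schedule in the notes.

```python
from itertools import combinations

def precedence_edges(schedule):
    """schedule: list of (transaction, operation, item), operation is 'R' or 'W'."""
    edges = set()
    for (t1, op1, x1), (t2, op2, x2) in combinations(schedule, 2):
        # Two operations conflict if they come from different transactions,
        # touch the same item, and at least one of them is a write.
        if t1 != t2 and x1 == x2 and "W" in (op1, op2):
            edges.add((t1, t2))  # the earlier operation's transaction must come first
    return edges

def is_conflict_serializable(schedule):
    """True if the precedence graph has no cycle."""
    edges = precedence_edges(schedule)
    nodes = {t for t, _, _ in schedule}
    while nodes:
        # Nodes with no incoming edge from a remaining node can be placed next.
        free = [
            n for n in nodes
            if not any(dst == n and src in nodes for src, dst in edges)
        ]
        if not free:
            return False  # every remaining node waits on another one: a cycle
        nodes -= set(free)
    return True

# T1 writes the quantity, then T2 reads it: equivalent to the serial order T1, T2.
simple = [("T1", "W", "quantity"), ("T2", "R", "quantity")]

# T2 writes between T1's read and write: T1 before T2 and T2 before T1, a cycle.
interleaved = [
    ("T1", "R", "quantity"),
    ("T2", "W", "quantity"),
    ("T1", "W", "quantity"),
]

print(is_conflict_serializable(simple))
print(is_conflict_serializable(interleaved))
```

View serializability is a weaker condition and is not covered by this check.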
DATA WAREHOUSE