DBMS

A Database Management System (DBMS) is software that facilitates the storage, retrieval, and management of data, ensuring data consistency, security, and efficient access. It offers various functions such as data storage, manipulation, security, and backup, and is characterized by features like self-describing nature and support for multiple views. The document also discusses DBMS architecture, client/server models, data models, and the Entity-Relationship (ER) model, providing insights into database design and SQL operations.

Uploaded by

citrus4candy

1. Overview of Database Management System (DBMS)

A Database Management System (DBMS) is software designed to store, retrieve, define, and manage data in a
structured manner. It provides an interface between users and databases, ensuring data consistency, security,
and efficient access.

Key Functions of a DBMS:

 Data Storage & Retrieval: Efficiently stores and retrieves data.

 Data Definition: Allows defining data structures (tables, constraints).

 Data Manipulation: Supports operations like insertion, deletion, and updates.

 Data Security & Integrity: Enforces access control and validation rules.

 Concurrency Control: Manages multiple users accessing data simultaneously.

 Backup & Recovery: Provides mechanisms to recover data after failures.

Advantages of DBMS over File Systems:

 Redundancy Control: Minimizes data duplication.

 Data Sharing: Allows multiple users to access data concurrently.

 Data Consistency: Ensures uniform and accurate data.

 Security & Privacy: Provides user authentication and authorization.

 Efficient Query Processing: Optimizes data retrieval using query languages (e.g., SQL).

2. Characteristics of the Database Approach

The database approach differs from traditional file-based systems in several ways:

1. Self-Describing Nature:

o The database contains metadata (data about data), such as table schemas, constraints, and
relationships.

2. Program-Data Independence:

o Changes in the database structure do not require changes in application programs.

3. Support for Multiple Views:

o Different users can have customized views of the same data.

4. Data Sharing & Multi-User Access:

o Multiple users can access and modify data simultaneously.

5. Data Integrity & Constraints:


o Enforces rules (e.g., primary keys, foreign keys) to maintain data correctness.

6. Transaction Management:

o Ensures ACID (Atomicity, Consistency, Isolation, Durability) properties.

7. Backup & Recovery:

o Provides mechanisms to restore data after failures.
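
The atomicity part of ACID can be illustrated with a small sketch. It assumes Python's built-in sqlite3 module and a hypothetical accounts table with a CHECK constraint; neither appears in these notes and both are chosen only for illustration. A transfer either applies both updates or neither.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: balances may never go negative.
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, 1, 2, 30)       # valid transfer
failed = transfer(conn, 1, 2, 500)  # debit violates the CHECK constraint
balances = conn.execute(
    "SELECT id, balance FROM accounts ORDER BY id"
).fetchall()
print(ok, failed, balances)
```

In the failed transfer the credit to account 2 has already been executed when the debit is rejected, yet the rollback removes it, so the stored balances reflect only the first transfer.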

3. DBMS Architecture

DBMS architecture defines how data is stored, accessed, and managed. The most common architecture is
the Three-Schema Architecture:

Three-Schema Architecture:

1. External Schema (View Level):

o Defines how different users view the data (e.g., customized reports).

o Example: A student sees only their grades, while a teacher sees all grades.

2. Conceptual Schema (Logical Level):

o Describes the entire database structure (tables, relationships, constraints).

o Example: University database schema with tables for students, courses, and enrollments.

3. Internal Schema (Physical Level):

o Defines how data is stored physically (file structures, indexing).

o Example: Data stored in B-trees or hash files.

Advantages:

 Data Independence: Changes in one schema do not affect others.

 Security: Users see only relevant data.

 Flexibility: Different applications can use the same database.
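
The student/teacher example for the external schema can be sketched with a view. The snippet below is a minimal illustration that assumes Python's sqlite3 module and a hypothetical grades table and alice_grades view; the names are not from these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE grades (
    student_id INTEGER, student_name TEXT, course TEXT, grade TEXT
);
INSERT INTO grades VALUES
    (1, 'Alice', 'DBMS', 'A'),
    (2, 'Bob',   'DBMS', 'B'),
    (1, 'Alice', 'OS',   'B');

-- External schema for one student: only that student's rows and columns.
CREATE VIEW alice_grades AS
    SELECT course, grade FROM grades WHERE student_id = 1;
""")

# The teacher works against the full (conceptual-level) table.
teacher_view = conn.execute(
    "SELECT * FROM grades ORDER BY student_id, course"
).fetchall()

# The student works against the view and never sees other students' rows.
student_view = conn.execute(
    "SELECT * FROM alice_grades ORDER BY course"
).fetchall()

print(teacher_view)
print(student_view)
```

How the rows are physically stored (the internal schema) is decided by the DBMS and is not visible in either query.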

4. Client/Server Architecture in DBMS

The client/server model divides DBMS functionality between:

 Client: Requests data (e.g., a web application).

 Server: Processes requests and returns results (e.g., database server).

Types of Client/Server DBMS:


1. Two-Tier Architecture:

o Client directly interacts with the database server.

o Example: Desktop applications using SQL queries.

2. Three-Tier Architecture:

o Client Tier (UI) → Application Tier (Business Logic) → Database Tier (DBMS).

o Example: Web applications with middleware (e.g., PHP, Node.js).

Advantages:

 Scalability: Servers can handle multiple clients.

 Centralized Management: Easier to maintain and secure.

 Improved Performance: Servers optimize query processing.

5. Data Models

A data model defines how data is structured, stored, and manipulated. Common models include:

1. Relational Model:

o Data stored in tables (relations) with rows (tuples) and columns (attributes).

o Uses SQL for querying.

o Example: MySQL, PostgreSQL.

2. Hierarchical Model:

o Tree-like structure (parent-child relationships).

o Example: IBM’s IMS.

3. Network Model:

o Graph-like structure allowing multiple parent-child relationships.

o Example: CODASYL DBMS.

4. Object-Oriented Model:

o Data stored as objects (similar to OOP concepts).

o Example: db4o, ObjectDB. (MongoDB, sometimes listed here, is a document-oriented database rather than an object-oriented one.)

5. Entity-Relationship (ER) Model:

o Used for database design (entities, attributes, relationships).


6. Introduction to Distributed Database Processing

A distributed database system stores data across multiple locations (servers, sites).

Types of Distributed Databases:

1. Homogeneous: Same DBMS at all locations.

2. Heterogeneous: Different DBMS at different locations.

Advantages:

 Improved Availability: Data accessible even if one site fails.

 Faster Query Processing: Parallel execution across nodes.

 Scalability: Easier to expand.

Challenges:

 Complexity: Requires synchronization.

 Security Risks: Multiple access points.

7. Schema and Instances

 Schema: The logical structure of the database (e.g., table definitions).

o Example: Student (ID, Name, Age)

 Instance: The actual data stored at a given time.

o Example: (101, "Alice", 20)

Types of Schemas:

1. Physical Schema: Storage details (file structures).

2. Logical Schema: Tables, relationships.

3. Subschema (View): User-specific views.
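
The difference between schema and instance can be checked directly. The sketch below assumes Python's sqlite3 module and uses SQLite's PRAGMA table_info to read the schema from the catalog; the helper names schema and instance are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (ID INTEGER PRIMARY KEY, Name TEXT, Age INTEGER)")


def schema(conn):
    """Column names of Student, read from the catalog (the schema)."""
    return [row[1] for row in conn.execute("PRAGMA table_info(Student)")]


def instance(conn):
    """Rows currently stored in Student (the instance)."""
    return conn.execute("SELECT * FROM Student ORDER BY ID").fetchall()


schema_before, instance_before = schema(conn), instance(conn)
conn.execute("INSERT INTO Student VALUES (101, 'Alice', 20)")
schema_after, instance_after = schema(conn), instance(conn)

print(schema_before, instance_before)
print(schema_after, instance_after)
```

Inserting a row changes the instance but leaves the schema untouched.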

8. Data Independence

The ability to modify one schema level without affecting others.

1. Logical Data Independence:

o Changes in the conceptual schema (e.g., adding a column) do not affect external schemas.

2. Physical Data Independence:

o Changes in the internal schema (e.g., indexing) do not affect the conceptual schema.
Importance:

 Flexibility: Easier to modify databases.

 Reduced Maintenance Costs: Applications remain unaffected by structural changes.

Entity-Relationship (ER) Model: Detailed Notes

The Entity-Relationship (ER) Model is a conceptual framework used to design and represent databases. It
helps in visualizing data requirements and relationships in a structured manner.

1. Entity

 An entity is a real-world object or concept that can be distinctly identified.

 Examples:

o Student, Employee, Book, Customer, Order.

 Entities are represented as rectangles in ER diagrams.

2. Entity Type

 An entity type defines a category or a collection of similar entities.

 It describes the structure (attributes) of entities.

 Example:

o Student (with attributes: ID, Name, Age, Course).

o Employee (with attributes: EmpID, Name, Salary, Department).

3. Entity Set

 An entity set is a collection of entities of the same entity type.

 Example:

o All students in a university form the Student entity set.

o All books in a library form the Book entity set.

4. Notation for ER Diagrams


Symbol                 Meaning

Rectangle              Entity

Ellipse                Attribute

Diamond                Relationship

Double Rectangle       Weak Entity

Double Ellipse         Multivalued Attribute

Dashed Ellipse         Derived Attribute

Underlined Attribute   Primary Key

Line                   Connects entities to relationships/attributes

5. Attributes and Keys

Attributes

 Properties or characteristics of an entity.

 Example:

o For Student: Roll_No, Name, Age, Course.

 Represented as ovals connected to the entity.

Types of Attributes

1. Simple (Atomic) Attribute

o Cannot be divided further.

o Example: Age, Roll_No.

2. Composite Attribute

o Can be divided into smaller sub-attributes.

o Example:

 Address → (Street, City, State, Pincode).

3. Derived Attribute

o Value is derived from another attribute.

o Represented with a dashed oval.

o Example:
 Age (derived from Date_of_Birth).

 Total_Salary (sum of Basic_Salary + Bonus).

4. Multivalued Attribute

o Can hold multiple values.

o Represented with a double oval.

o Example:

 Phone_Number (a student can have multiple numbers).

 Email_Addresses.

5. Key Attribute

o Uniquely identifies an entity.

o Example:

 Student_ID for a Student entity.

o Represented with an underlined name.

Keys in DBMS

1. Super Key

o A set of one or more attributes that can uniquely identify an entity.

o Example:

 For Student, (Roll_No, Name) is a super key, but Roll_No alone is sufficient.

2. Candidate Key

o A minimal super key (no subset is a super key).

o Example:

 Roll_No or Aadhar_No (if both are unique).

3. Primary Key

o The chosen candidate key to uniquely identify records.

o Example:

 Roll_No is selected as the primary key for Student.

4. Foreign Key

o An attribute in one table that refers to the primary key of another table.
o Example:

 Dept_ID in Employee refers to Dept_ID in Department.

6. Relationships

 A relationship defines how two or more entities are related.

 Represented as a diamond in ER diagrams.

 Example:

o Student "enrolls in" Course.

o Employee "works in" Department.

Types of Relationships

1. One-to-One (1:1)

o One entity relates to exactly one other entity.

o Example:

 One Person has one Passport.

2. One-to-Many (1:N)

o One entity relates to multiple entities.

o Example:

 One Department has many Employees.

3. Many-to-One (N:1)

o Many entities relate to one entity.

o Example:

 Many Students belong to one Class.

4. Many-to-Many (M:N)

o Multiple entities relate to multiple entities.

o Example:

 Students enroll in multiple Courses, and a Course has many Students.

7. Weak Entity

 An entity that cannot be uniquely identified without a relationship with another entity.
 Depends on a strong (owner) entity.

 Represented with a double rectangle.

 Example:

o Dependent (weak entity) depends on Employee (strong entity).

o Order_Item depends on Order.

8. Enhanced ER Model (EER)

Extends the basic ER model with additional concepts:

1. Specialization

o Dividing an entity into subclasses based on characteristics.

o Example:

 Employee → Manager, Engineer, Clerk.

o Represented with a triangle (ISA relationship).

2. Generalization

o Combining multiple entities into a higher-level entity.

o Example:

 Car, Bike, Truck → Vehicle.

o Opposite of specialization.

3. Aggregation

o Treating a relationship as an entity for higher-level relationships.

o Example:

 Employee works on a Project using a Tool.


UNIT 2

1. SELECT

 Purpose: Retrieves data from a database table.

 Basic Syntax:

    SELECT column1, column2 FROM table_name;

o This retrieves only the specified columns (column1, column2) from the table.

 Selecting All Columns:

    SELECT * FROM table_name;

o The asterisk (*) selects all columns in the table.

 Where to Use: To view data stored in a table without modifying anything.

 Example: To view names and ages from an employees table:

    SELECT name, age FROM employees;

2. INSERT

 Purpose: Adds new rows of data into a table.

 Basic Syntax:

    INSERT INTO table_name (column1, column2)
    VALUES (value1, value2);

o You specify the table, columns, and the values that should be inserted.

 Example: Inserting a new employee into the employees table:

    INSERT INTO employees (name, age, department)
    VALUES ('John Doe', 30, 'HR');

 Important Note: The number of values must match the number of columns listed.

3. UPDATE

 Purpose: Modifies existing data in a table.

 Basic Syntax:

    UPDATE table_name
    SET column1 = value1, column2 = value2
    WHERE condition;

o You specify which columns you want to update and the new values.

o The WHERE clause is essential to avoid updating every row in the table.

 Example: To update the age of an employee in the employees table:

    UPDATE employees
    SET age = 31
    WHERE name = 'John Doe';

 Important Note: Without the WHERE clause, all rows in the table will be updated.

4. DELETE

 Purpose: Deletes existing rows from a table.

 Basic Syntax:

    DELETE FROM table_name
    WHERE condition;

o You specify which rows to delete using a condition in the WHERE clause.

 Example: To delete an employee named 'John Doe' from the employees table:

    DELETE FROM employees
    WHERE name = 'John Doe';

 Important Note: If you omit the WHERE clause, all rows will be deleted. Unless the statement runs inside a transaction that is later rolled back, the data is permanently lost.
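
The effect of the WHERE clause on DELETE can be sketched as follows. This is a minimal illustration assuming Python's sqlite3 module; the sample rows other than 'John Doe' are hypothetical. The same caution applies to UPDATE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, age INTEGER, department TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("John Doe", 30, "HR"), ("Jane Roe", 41, "IT"), ("Sam Poe", 25, "IT")],
)


def row_count(conn):
    return conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]


# With WHERE: only the matching row is removed.
targeted = conn.execute(
    "DELETE FROM employees WHERE name = ?", ("John Doe",)
).rowcount
remaining = row_count(conn)

# Without WHERE: every remaining row is removed.
conn.execute("DELETE FROM employees")
left = row_count(conn)

print(targeted, remaining, left)
```

Passing the name as a parameter (the ? placeholder) rather than building the SQL string by hand is a design choice of this sketch, not something the notes require.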

5. CREATE TABLE

 Purpose: Creates a new table in the database.

 Basic Syntax:

    CREATE TABLE table_name (
        column1 datatype,
        column2 datatype,
        column3 datatype
    );

o You define the table's name and the columns it will have, along with their data types (e.g., INT,
VARCHAR).

 Example: Creating an employees table:

    CREATE TABLE employees (
        id INT PRIMARY KEY,
        name VARCHAR(100),
        age INT,
        department VARCHAR(50)
    );

 Data Types: You define the type of data each column will hold (e.g., VARCHAR for text, INT for integers).

6. ALTER TABLE

 Purpose: Modifies an existing table’s structure.

 Basic Syntax to Add a Column:

    ALTER TABLE table_name
    ADD column_name datatype;

o You can add new columns, modify data types, or even rename columns.

 Example: Adding an email column to the employees table:

    ALTER TABLE employees
    ADD email VARCHAR(100);

 Important Note: You can also use DROP COLUMN to remove a column. The syntax for changing a column's datatype varies by DBMS: MODIFY in MySQL and Oracle, ALTER COLUMN in SQL Server and PostgreSQL.

7. DROP TABLE

 Purpose: Completely deletes a table from the database, including its structure and all data.

 Basic Syntax:

    DROP TABLE table_name;

 Example: To drop the employees table:

    DROP TABLE employees;

 Important Note: Once committed, this action cannot be undone; the table can only be restored from a backup.

8. WHERE

 Purpose: Filters rows based on a condition.

 Basic Syntax:

    SELECT * FROM table_name WHERE condition;

 Example: To select only employees older than 30:

    SELECT * FROM employees WHERE age > 30;


 Operators in WHERE:

o =, != (written <> in standard SQL), >, <, >=, <=

o AND, OR, NOT

o IN, BETWEEN, LIKE (for pattern matching)

9. ORDER BY

 Purpose: Sorts the result set by one or more columns, either in ascending or descending order.

 Basic Syntax:

    SELECT * FROM table_name
    ORDER BY column_name [ASC | DESC];

 Example: To order employees by their age in descending order:

    SELECT * FROM employees
    ORDER BY age DESC;

 Note: By default, ORDER BY sorts in ascending order (ASC).

10. GROUP BY

 Purpose: Groups rows based on the values in one or more columns, often used with aggregate
functions like COUNT, SUM, AVG, etc.

 Basic Syntax:

    SELECT column_name, COUNT(*)
    FROM table_name
    GROUP BY column_name;

 Example: To count how many employees are in each department:

    SELECT department, COUNT(*)
    FROM employees
    GROUP BY department;

 Important Note: After GROUP BY, you can apply aggregate functions like:

o COUNT(), SUM(), AVG(), MAX(), MIN().

11. HAVING

 Purpose: Filters results after GROUP BY. It’s similar to WHERE, but HAVING works on aggregated data.

 Basic Syntax:

    SELECT column_name, COUNT(*)
    FROM table_name
    GROUP BY column_name
    HAVING COUNT(*) > 5;

 Example: To display only departments with more than 5 employees:

    SELECT department, COUNT(*)
    FROM employees
    GROUP BY department
    HAVING COUNT(*) > 5;

 Difference from WHERE: Use WHERE to filter rows before grouping; use HAVING to filter after
grouping.
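
The order in which WHERE and HAVING apply can be seen in a single query. The sketch below assumes Python's sqlite3 module and a handful of hypothetical employee rows; the threshold of 2 (instead of 5) is chosen only to keep the sample data small.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, age INTEGER, department TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("A", 25, "IT"), ("B", 35, "IT"), ("C", 45, "IT"),
        ("D", 50, "HR"), ("E", 28, "HR"), ("F", 33, "Sales"),
    ],
)

query = """
SELECT department, COUNT(*)
FROM employees
WHERE age > 30            -- row filter: applied before grouping
GROUP BY department
HAVING COUNT(*) >= 2      -- group filter: applied after aggregation
ORDER BY department
"""
result = conn.execute(query).fetchall()
print(result)
```

Four rows survive the WHERE filter (two in IT, one in HR, one in Sales), and only the IT group survives the HAVING filter.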

12. JOIN

 Purpose: Combines rows from two or more tables based on a related column, allowing you to retrieve
data from multiple tables at once.

 Types of Joins:

o INNER JOIN: Returns only rows that have matching values in both tables.

o LEFT JOIN (or LEFT OUTER JOIN): Returns all rows from the left table, and matching rows from
the right table. If there’s no match, NULL is returned for columns from the right table.

o RIGHT JOIN (or RIGHT OUTER JOIN): Returns all rows from the right table, and matching rows
from the left table.

o FULL JOIN (or FULL OUTER JOIN): Returns all rows from both tables, with NULL where there is
no match.

 Basic Syntax:

    SELECT columns
    FROM table1
    JOIN table2
    ON table1.column_name = table2.column_name;

 Example:

    SELECT employees.name, departments.department_name
    FROM employees
    INNER JOIN departments
    ON employees.department_id = departments.id;

 Use Case: Joining employees and departments based on department_id.
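
The difference between INNER JOIN and LEFT JOIN shows up as soon as one row has no match. The sketch below assumes Python's sqlite3 module; the sample rows, including an employee with no department, are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE departments (id INTEGER PRIMARY KEY, department_name TEXT);
CREATE TABLE employees (name TEXT, department_id INTEGER);
INSERT INTO departments VALUES (1, 'HR'), (2, 'IT');
INSERT INTO employees VALUES ('John Doe', 1), ('Jane Roe', 2), ('Sam Poe', NULL);
""")

# INNER JOIN keeps only employees that have a matching department.
inner = conn.execute("""
    SELECT e.name, d.department_name
    FROM employees e
    INNER JOIN departments d ON e.department_id = d.id
    ORDER BY e.name
""").fetchall()

# LEFT JOIN keeps every employee; unmatched department columns are NULL.
left = conn.execute("""
    SELECT e.name, d.department_name
    FROM employees e
    LEFT JOIN departments d ON e.department_id = d.id
    ORDER BY e.name
""").fetchall()

print(inner)
print(left)
```

SQL NULL is returned to Python as None, so the unmatched employee appears in the LEFT JOIN result with None as the department name.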

1. BETWEEN

 The BETWEEN operator is used to filter the result set within a specified range.

 It can be used for both numerical and date ranges.

 It is inclusive, meaning the boundary values are included in the results.

Syntax:

SELECT column_name

FROM table_name

WHERE column_name BETWEEN value1 AND value2;

Example:

SELECT * FROM products

WHERE price BETWEEN 100 AND 500;

2. IN

 The IN operator allows you to specify multiple possible values for a column.

 It is equivalent to multiple OR conditions.

Syntax:

SELECT column_name

FROM table_name

WHERE column_name IN (value1, value2, ...);

Example:

SELECT * FROM employees

WHERE department IN ('HR', 'Sales', 'IT');

3. AND

 The AND operator is used to combine two or more conditions.

 All conditions must be true for a record to be included in the result set.

Syntax:

SELECT column_name

FROM table_name
WHERE condition1 AND condition2;

Example:

SELECT * FROM employees

WHERE department = 'Sales' AND salary > 50000;

4. OR

 The OR operator is used to combine multiple conditions where at least one condition must be true for a
record to be included in the result set.

Syntax:

SELECT column_name

FROM table_name

WHERE condition1 OR condition2;

Example:

SELECT * FROM employees

WHERE department = 'Sales' OR department = 'HR';

5. NOT

 The NOT operator is used to negate a condition.

 It can be combined with other operators like BETWEEN, IN, and LIKE.

Syntax:

SELECT column_name

FROM table_name

WHERE NOT condition;

Example:

SELECT * FROM employees

WHERE department NOT IN ('Sales', 'HR');

Combining these operators:

 You can combine BETWEEN, IN, AND, OR, and NOT to form more complex queries.

Example:

SELECT * FROM employees

WHERE (department = 'Sales' OR department = 'HR')


AND salary BETWEEN 40000 AND 70000

AND NOT name = 'John Doe';

Integrity Constraints in SQL

Integrity constraints are rules that ensure the accuracy and consistency of data in relational databases. These
constraints are applied to columns in a table to enforce business logic and maintain data integrity. Below is a
summary of various types of integrity constraints along with other important SQL features such as nested
queries and set-comparison operators.

1. Primary Key Constraint

 Definition: A Primary Key constraint uniquely identifies each row in a table.

 Key Points:

o A primary key must contain unique values.

o It cannot contain NULL values.

o A table can have only one primary key.

Example:

CREATE TABLE students (

student_id INT PRIMARY KEY,

name VARCHAR(100)

);

2. Not Null Constraint

 Definition: The Not Null constraint ensures that a column cannot have NULL values.

 Key Points:

o Forces the column to always contain a value.

o It is used to prevent missing or incomplete data.

Example:

CREATE TABLE employees (

employee_id INT PRIMARY KEY,

email VARCHAR(100) NOT NULL

);
3. Unique Constraint

 Definition: The Unique constraint ensures that all values in a column (or a group of columns) are
unique.

 Key Points:

o Unlike the primary key, a table can have multiple unique constraints.

o A unique constraint allows NULL values (depending on the DBMS).

Example:

CREATE TABLE products (

product_id INT PRIMARY KEY,

product_code VARCHAR(50) UNIQUE

);

4. Check Constraint

 Definition: The Check constraint ensures that values in a column meet a specified condition.

 Key Points:

o Validates domain integrity (e.g., ensuring a valid age range or salary limit).

Example:

CREATE TABLE employees (

employee_id INT PRIMARY KEY,

salary INT CHECK (salary > 0)

);

5. Referential (Foreign Key) Constraint

 Definition: A Foreign Key constraint enforces a link between two tables. It ensures referential integrity
by making sure that values in one table match the primary key of another table.

 Key Points:

o Foreign keys establish relationships between tables, enforcing consistency.

o Can be defined on one or more columns.

Example:
CREATE TABLE orders (

order_id INT PRIMARY KEY,

customer_id INT,

FOREIGN KEY (customer_id) REFERENCES customers(customer_id)

);
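
The five constraints above can be exercised together. The sketch below assumes Python's sqlite3 module and a simplified, hypothetical customers/orders schema; note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON, which is a SQLite-specific detail. The helper rejected is also hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: off by default
conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    email TEXT NOT NULL UNIQUE
);
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER,
    amount INTEGER CHECK (amount > 0),
    FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);
INSERT INTO customers VALUES (1, 'a@example.com');
""")


def rejected(sql):
    """Return True if the DBMS refuses the statement on integrity grounds."""
    try:
        with conn:  # roll back automatically if the statement fails
            conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


results = {
    "primary key": rejected("INSERT INTO customers VALUES (1, 'b@example.com')"),
    "not null":    rejected("INSERT INTO customers VALUES (2, NULL)"),
    "unique":      rejected("INSERT INTO customers VALUES (3, 'a@example.com')"),
    "check":       rejected("INSERT INTO orders VALUES (10, 1, 0)"),
    "foreign key": rejected("INSERT INTO orders VALUES (11, 99, 5)"),
    "valid row":   rejected("INSERT INTO orders VALUES (12, 1, 5)"),
}
print(results)
```

Each of the first five statements violates exactly one constraint and is refused; the last one satisfies all of them and is stored.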

Introduction to Nested Queries

 Definition: A nested query (also known as a subquery) is a query within another query. It is often used
to retrieve data that will be used in the outer query.

 Key Points:

o A nested query can be placed in the SELECT, WHERE, or FROM clause.

o It is typically used for complex filtering or calculations.

Example (employees who work in the same department as John Doe):

SELECT name

FROM employees

WHERE department IN (SELECT department FROM employees WHERE name = 'John Doe');

Correlated Nested Queries

 Definition: A correlated subquery is a type of nested query that depends on the outer query for its
values.

 Key Points:

o Unlike regular subqueries, correlated subqueries use values from the outer query.

o They are often executed once for each row selected by the outer query.

Example:

SELECT name, salary

FROM employees e

WHERE salary > (SELECT AVG(salary) FROM employees WHERE department = e.department);
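
The correlated query above can be run on a small data set to see that the average is recomputed per department. The sketch assumes Python's sqlite3 module, hypothetical sample rows, and an explicit alias i for the inner table (added here only for readability).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER, department TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Ann", 60000, "IT"), ("Bob", 40000, "IT"),
        ("Cy", 30000, "HR"), ("Di", 50000, "HR"),
        ("Ed", 45000, "Sales"),
    ],
)

# The inner query refers to e.department, so it is evaluated
# against the department of each outer row.
above_dept_avg = conn.execute("""
    SELECT name, salary
    FROM employees e
    WHERE salary > (SELECT AVG(salary)
                    FROM employees i
                    WHERE i.department = e.department)
    ORDER BY name
""").fetchall()
print(above_dept_avg)
```

Ed is the only employee in Sales, so his salary equals the department average and he is not returned.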

Set-Comparison Operators
 Definition: Set-comparison operators allow comparison of a value to a set of values. These operators
are used to perform operations on multiple values.

 Key Points:

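o The comparison operators =, >, <, >=, <=, <> can be combined with ANY or ALL to compare a value against the values returned by a subquery.

o IN tests membership in a list or subquery result; it is equivalent to = ANY.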

Examples:

o IN: Check if a value exists in a list.

    SELECT * FROM employees WHERE department IN ('Sales', 'IT');

o ANY/ALL: Compare a value with any or all values in a subquery.

    SELECT * FROM employees
    WHERE salary > ALL (SELECT salary FROM employees WHERE department = 'Sales');
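
The meaning of > ALL and > ANY can be checked with an equivalent rewrite. The sketch below assumes Python's sqlite3 module; SQLite does not implement ANY or ALL, so the queries use MAX and MIN instead, which gives the same result when the subquery returns at least one row and no NULLs. The sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER, department TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Ann", 60000, "IT"), ("Bob", 40000, "Sales"),
        ("Cy", 55000, "Sales"), ("Di", 50000, "HR"),
    ],
)

# salary > ALL (Sales salaries)  is rewritten as  salary > MAX(Sales salaries)
gt_all = conn.execute("""
    SELECT name FROM employees
    WHERE salary > (SELECT MAX(salary) FROM employees WHERE department = 'Sales')
    ORDER BY name
""").fetchall()

# salary > ANY (Sales salaries)  is rewritten as  salary > MIN(Sales salaries)
gt_any = conn.execute("""
    SELECT name FROM employees
    WHERE salary > (SELECT MIN(salary) FROM employees WHERE department = 'Sales')
    ORDER BY name
""").fetchall()

print(gt_all)
print(gt_any)
```

On a DBMS that supports ALL and ANY directly, the original form of the query can be used without this rewrite.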

Aggregate Functions in SQL

 Definition: Aggregate functions are used to perform a calculation on a set of values and return a single
result. They are often used with the GROUP BY clause.

 Key Points:

o Common aggregate functions include:

 COUNT(): Counts the number of rows.

 SUM(): Returns the sum of values.

 AVG(): Returns the average of values.

 MIN(): Returns the minimum value.

 MAX(): Returns the maximum value.

Examples:

o COUNT:

    SELECT COUNT(*) FROM employees WHERE department = 'Sales';

o SUM:

    SELECT department, SUM(salary) FROM employees GROUP BY department;


Relational Model Terminology and Concepts

1. Domains

 A domain is a set of atomic (indivisible) values from which attributes can take their values.

 It defines the possible values for a column in a relation.

 Example:

o Student_Age domain: Positive integers between 10 and 30.

o Gender domain: {'Male', 'Female', 'Other'}.

Characteristics of Domains:

 Data Type: Specifies the type of data (e.g., integer, string, date).

 Constraints: May have constraints like range or format restrictions.

 Atomicity: Values must be indivisible (no composite or multivalued attributes).

2. Attributes

 An attribute is a named column in a relation that represents a property of the entity.

 Each attribute is associated with a domain.

 Example: In a STUDENT relation, attributes could be Roll_No, Name, Age.

Types of Attributes:

1. Simple (Atomic) Attribute: Cannot be divided further (e.g., Age).

2. Composite Attribute: Can be divided into sub-attributes (e.g., Name → First_Name, Last_Name).

3. Single-Valued Attribute: Holds only one value (e.g., Roll_No).

4. Multi-Valued Attribute: Can hold multiple values (not directly supported in relational model; requires
normalization).

5. Derived Attribute: Computed from other attributes (e.g., Age derived from DOB).

3. Tuples

 A tuple is a single row in a relation representing a record.

 Each tuple contains values corresponding to the attributes of the relation.

 Example:

o In a STUDENT relation, a tuple could be (101, "John", 20).


Properties of Tuples:

 Unordered: No inherent order among tuples.

 Unique: No duplicate tuples in a relation, because a relation is a set of tuples; key constraints enforce this in practice.

 Atomic Values: Each attribute value must be atomic (no repeating groups or nested relations).

4. Relations

 A relation is a table with rows (tuples) and columns (attributes).

 Mathematically, a relation is a subset of the Cartesian product of domains.

 Example:

o STUDENT (Roll_No, Name, Age) is a relation.

Characteristics of Relations:

1. No Duplicate Tuples: Every tuple must be unique.

2. No Ordering of Tuples: Tuples have no predefined sequence.

3. No Ordering of Attributes: Columns can be rearranged without changing the relation.

4. Atomic Values: Each cell contains a single value (no composite/multi-valued attributes).

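5. Every Relation has a Key: At least one candidate key exists (in the worst case, all attributes together), and one is chosen as the primary key to ensure uniqueness of tuples.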

5. Relational Constraints

Constraints ensure data integrity in a relational database.

(a) Domain Constraints

 Restrict attribute values to their defined domain.

 Example:

o Age must be an integer ≥ 0.

o Gender must be in {'Male', 'Female', 'Other'}.

(b) Key Constraints

 Ensure that tuples in a relation are uniquely identifiable.

 Types of Keys:

1. Super Key: A set of attributes that uniquely identifies a tuple.

2. Candidate Key: A minimal super key (no subset is a super key).


3. Primary Key: The chosen candidate key to uniquely identify tuples.

4. Foreign Key: An attribute that refers to the primary key of another relation (ensures referential
integrity).

(c) Constraints on NULL Values

 NULL represents missing or unknown data.

 Constraints:

o NOT NULL: Attribute cannot be NULL (e.g., Roll_No).

o UNIQUE: Ensures no duplicates (but allows NULL unless combined with NOT NULL).

o Primary Key: Implicitly NOT NULL and UNIQUE.

6. Relational Database Schema

 A relational schema defines the structure of a relation, including:

o Relation name

o Attribute names and domains

o Constraints (key, domain, referential integrity)

 Example:

STUDENT (Roll_No: INT, Name: STRING, Age: INT)

PRIMARY KEY (Roll_No)

Components of a Relational Schema:

1. Relation Name: Identifier for the table (e.g., STUDENT).

2. Attributes & Domains: Columns with their data types.

3. Constraints: Rules governing data integrity (PK, FK, NOT NULL, etc.).

7. Codd’s 12 Rules

Dr. E.F. Codd proposed 12 rules (often counted as 13, numbered 0 to 12, with Rule 0 as the foundation) that a DBMS must satisfy to be considered a true relational database.

Summary of Codd’s Rules:

1. Information Rule: All data must be represented as values in tables.

2. Guaranteed Access Rule: Every data item must be accessible via table name, primary key, and column
name.
3. Systematic Treatment of NULLs: NULL must represent missing/inapplicable data uniformly.

4. Active Online Catalog: Metadata (data dictionary) must be stored as relations.

5. Comprehensive Data Sublanguage: Must support a language (like SQL) for defining and manipulating
data.

6. View Updating Rule: Every view that is theoretically updatable must be updatable through the system.

7. High-Level Insert, Update, Delete: Must support set-at-a-time operations.

8. Physical Data Independence: Applications should not be affected by changes to storage structures or access methods.

9. Logical Data Independence: Applications should not be affected by information-preserving changes to table structures.

10. Integrity Independence: Integrity constraints must be definable in the data sublanguage and stored in the catalog, not in application programs.

11. Distribution Independence: Applications should work unchanged whether or not the data is distributed.

12. Non-Subversion Rule: A low-level (record-at-a-time) interface must not be able to bypass integrity constraints.

Importance of Codd’s Rules:

 Ensures relational databases maintain data integrity, consistency, and flexibility.

 Modern RDBMS like Oracle, MySQL, and PostgreSQL follow most of these rules.

Relational Algebra and Database Concepts

1. Relational Algebra: Basic Operations

Relational algebra is a procedural query language that operates on relations (tables) and produces relations as
results. It consists of a set of operations that can be performed on relations.

1.1 Selection (σ)

 Purpose: Selects rows (tuples) from a relation that satisfy a given condition.

 Notation: σ_{condition}(R)

 Example:

o Relation: Employee(EmpID, Name, Salary, Dept)

o Query: Select employees with Salary > 50000

o Relational Algebra: σ_{Salary > 50000}(Employee)

1.2 Projection (π)

 Purpose: Selects specific columns (attributes) from a relation, eliminating duplicates.

 Notation: π_{attribute_list}(R)

 Example:

o Relation: Employee(EmpID, Name, Salary, Dept)

o Query: Retrieve only Name and Dept

o Relational Algebra: π_{Name, Dept}(Employee)
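
Selection and projection can be sketched in a few lines of plain Python. The representation (a list of dictionaries for the relation, a set of tuples for the projection result) and the function names select and project are assumptions made for illustration, as are the sample rows.

```python
# Employee(EmpID, Name, Salary, Dept) as a list of attribute -> value mappings.
employee = [
    {"EmpID": 1, "Name": "Ann", "Salary": 60000, "Dept": "IT"},
    {"EmpID": 2, "Name": "Bob", "Salary": 40000, "Dept": "IT"},
    {"EmpID": 3, "Name": "Ann", "Salary": 55000, "Dept": "IT"},
]


def select(relation, predicate):
    """Selection: keep the tuples for which the predicate holds."""
    return [t for t in relation if predicate(t)]


def project(relation, attributes):
    """Projection: keep only the listed attributes; a set drops duplicates."""
    return {tuple(t[a] for a in attributes) for t in relation}


high_paid = select(employee, lambda t: t["Salary"] > 50000)
names_depts = project(employee, ["Name", "Dept"])

print([t["EmpID"] for t in high_paid])
print(sorted(names_depts))
```

The two employees named Ann collapse into a single (Name, Dept) pair, which is the duplicate elimination that projection performs.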

2. Set Theoretic Operations

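Union, intersection, and set difference require the relations to be union-compatible (same number of attributes with matching domains). Division does not: the attributes of S must instead be a subset of the attributes of R.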

2.1 Union (∪)

 Purpose: Combines tuples from two relations, removing duplicates.

 Notation: R ∪ S

 Example:

o R: Employees in Dept A

o S: Employees in Dept B

o Query: All employees in either Dept A or B

o Relational Algebra: R ∪ S

2.2 Intersection (∩)

 Purpose: Retrieves tuples present in both relations.

 Notation: R ∩ S

 Example:

o R: Employees with Salary > 50000

o S: Employees in Dept IT

o Query: High-salary IT employees

o Relational Algebra: R ∩ S

2.3 Set Difference (–)

 Purpose: Retrieves tuples in R but not in S.

 Notation: R – S

 Example:

o R: All employees

o S: Employees in Dept HR
o Query: Employees not in HR

o Relational Algebra: R – S

2.4 Division (÷)

 Purpose: Finds tuples in R that are related to all tuples in S.

 Notation: R ÷ S

 Example:

o R(Supplier, Part): Suppliers and parts they supply

o S(Part): Specific parts needed

o Query: Suppliers who supply all parts in S

o Relational Algebra: R ÷ S
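
Division is the least intuitive of these operations, so a small sketch may help. It assumes relations are represented as Python sets, with R holding (Supplier, Part) pairs and S holding part names; the function name divide and the sample data are hypothetical.

```python
def divide(r, s):
    """R(Supplier, Part) ÷ S(Part): suppliers paired in R with every part in S."""
    suppliers = {supplier for supplier, _ in r}
    return {
        supplier
        for supplier in suppliers
        if all((supplier, part) in r for part in s)
    }


r = {
    ("S1", "P1"), ("S1", "P2"),
    ("S2", "P1"),
    ("S3", "P1"), ("S3", "P2"), ("S3", "P3"),
}
s = {"P1", "P2"}

result = divide(r, s)
print(sorted(result))
```

S2 is excluded because it supplies only P1, while S3 qualifies even though it also supplies a part that is not in S.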

3. Join Operations

Joins combine related tuples from two relations based on a condition.

3.1 Inner Join (⨝)

 Purpose: Returns only matching tuples from both relations.

 Notation: R ⨝_{condition} S

 Example:

o Employee(EmpID, Name, DeptID)

o Department(DeptID, DeptName)

o Query: Employees with their department names

o Relational Algebra: Employee ⨝_{Employee.DeptID = Department.DeptID} Department

3.2 Left Outer Join (⟕)

 Purpose: Returns all tuples from the left relation and matching tuples from the right (fills with NULL if
no match).

 Notation: R ⟕ S

 Example:

o Query: All employees, including those without a department

o Relational Algebra: Employee ⟕ Department

3.3 Right Outer Join (⟖)


 Purpose: Returns all tuples from the right relation and matching tuples from the left (fills with NULL if
no match).

 Notation: R ⟖ S

 Example:

o Query: All departments, including those without employees

o Relational Algebra: Employee ⟖ Department

3.4 Full Outer Join (⟗)

 Purpose: Returns all tuples from both relations, filling NULL where no match exists.

 Notation: R ⟗ S

 Example:

o Query: All employees and all departments, with NULL where no match exists

o Relational Algebra: Employee ⟗ Department

4. ER to Relational Mapping

Steps to convert an Entity-Relationship (ER) Diagram into a relational schema:

4.1 Mapping Entities

 Each entity type becomes a table.

 Attributes become columns.

 The primary key of the entity becomes the primary key of the table.

4.2 Mapping Relationships

 One-to-One (1:1): Foreign key in either table.

 One-to-Many (1:N): Foreign key in the "many" side table.

 Many-to-Many (M:N): Create a junction table with foreign keys from both entities.

4.3 Handling Weak Entities

 Weak entities depend on a strong entity.

 The table includes the primary key of the strong entity as a foreign key (part of its composite primary
key).

4.4 Handling Multi-valued Attributes

 Create a separate table for the multi-valued attribute linked via a foreign key.
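
The mapping rules for a many-to-many relationship and a multi-valued attribute can be sketched as table definitions. The example assumes Python's sqlite3 module and hypothetical table names (student, course, enrollment, student_phone) and sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: off by default
conn.executescript("""
-- Entity types become tables.
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE course  (course_id  INTEGER PRIMARY KEY, title TEXT);

-- The M:N relationship "enrolls in" becomes a junction table.
CREATE TABLE enrollment (
    student_id INTEGER REFERENCES student(student_id),
    course_id  INTEGER REFERENCES course(course_id),
    PRIMARY KEY (student_id, course_id)
);

-- The multi-valued attribute Phone_Number becomes its own table.
CREATE TABLE student_phone (
    student_id INTEGER REFERENCES student(student_id),
    phone TEXT,
    PRIMARY KEY (student_id, phone)
);

INSERT INTO student VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO course  VALUES (10, 'DBMS'), (20, 'OS');
INSERT INTO enrollment VALUES (1, 10), (1, 20), (2, 10);
INSERT INTO student_phone VALUES (1, '555-0100'), (1, '555-0101');
""")

courses_of_alice = conn.execute("""
    SELECT c.title FROM enrollment e
    JOIN course c ON c.course_id = e.course_id
    WHERE e.student_id = 1 ORDER BY c.title
""").fetchall()

students_in_dbms = conn.execute("""
    SELECT s.name FROM enrollment e
    JOIN student s ON s.student_id = e.student_id
    WHERE e.course_id = 10 ORDER BY s.name
""").fetchall()

alice_phone_count = conn.execute(
    "SELECT COUNT(*) FROM student_phone WHERE student_id = 1"
).fetchone()[0]

print(courses_of_alice, students_in_dbms, alice_phone_count)
```

The junction table can be read in both directions, which is what makes the relationship many-to-many.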
5. Data Normalization

Normalization reduces redundancy and anomalies by decomposing tables.

5.1 Functional Dependencies (FDs)

 A functional dependency X → Y means that any two tuples that agree on X must also agree on Y; that is, the value of X determines the value of Y.

 Example: EmpID → Name (EmpID uniquely determines Name).

5.2 Armstrong’s Inference Rules

Used to derive all possible FDs from a given set:

1. Reflexivity: If Y ⊆ X, then X → Y.

2. Augmentation: If X → Y, then XZ → YZ.

3. Transitivity: If X → Y and Y → Z, then X → Z.
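
In practice the inference rules are usually applied through the attribute closure X+, the set of all attributes determined by X. The sketch below is one common way to compute it, not the only one; the function names closure and is_superkey are hypothetical, and the dependencies are taken from the StudentCourse example used in these notes.

```python
def closure(attributes, fds):
    """Return the set of attributes functionally determined by `attributes`.

    `fds` is a list of (lhs, rhs) pairs, each side a set of attribute names.
    """
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already determined, so is the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_superkey(attributes, all_attributes, fds):
    """A superkey determines every attribute of the relation."""
    return closure(attributes, fds) == set(all_attributes)


all_attrs = {"StudentID", "CourseID", "Name", "CourseName", "Grade"}
fds = [
    ({"StudentID"}, {"Name"}),
    ({"CourseID"}, {"CourseName"}),
    ({"StudentID", "CourseID"}, {"Grade"}),
]

student_closure = closure({"StudentID"}, fds)
pair_is_key = is_superkey({"StudentID", "CourseID"}, all_attrs, fds)
student_is_key = is_superkey({"StudentID"}, all_attrs, fds)
print(sorted(student_closure), pair_is_key, student_is_key)
```

X → Y follows from the given dependencies exactly when Y is contained in the closure of X, so the same function can test any candidate dependency.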

5.3 Normal Forms

Normal Form   Condition

1NF           All attributes are atomic (no repeating groups).

2NF           In 1NF + no partial dependency (every non-prime attribute depends on the whole of each candidate key).

3NF           In 2NF + no transitive dependency (no non-prime attribute depends on a key through another non-prime attribute).

BCNF          In 3NF + for every non-trivial FD X → Y, X must be a superkey.

Example: Normalization Steps

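1. Unnormalized Table:

StudentCourse(StudentID, Name, {CourseID, CourseName, Grade})

2. 1NF: Remove repeating groups by storing one row per student-course pair; the key becomes (StudentID, CourseID) →

StudentCourse(StudentID, CourseID, Name, CourseName, Grade)

3. 2NF: Remove partial dependencies. Name depends only on StudentID and CourseName only on CourseID, so each moves to its own table →

Student(StudentID, Name)

Course(CourseID, CourseName)

Enrollment(StudentID, CourseID, Grade)

4. 3NF: Remove transitive dependencies. None remain here, because no non-key attribute determines another non-key attribute.

5. BCNF: Ensure every determinant is a superkey. StudentID, CourseID, and (StudentID, CourseID) are the keys of their respective tables, so the design is already in BCNF.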
