DBMS
1-Tier Architecture
○ In this architecture, the database is directly available to the user. The user
works directly on the DBMS itself.
○ Any changes made here are applied directly to the database. This architecture
does not provide a convenient tool for end users.
○ The 1-Tier architecture is used for developing local applications, where
programmers communicate directly with the database for a quick response.
2-Tier Architecture
○ The 2-Tier architecture is the same as the basic client-server model. In the two-tier
architecture, applications on the client end communicate directly with the
database on the server side. APIs such as ODBC and JDBC are used for this interaction.
○ The user interfaces and application programs run on the client side.
○ The server side is responsible for providing functionality such as query processing
and transaction management.
3-Tier Architecture
○ The 3-Tier architecture contains another layer between the client and the server. In
this architecture, the client cannot communicate directly with the server.
○ The application on the client end interacts with an application server, which
in turn communicates with the database system.
○ The end user has no idea about the existence of the database beyond the application
server. The database likewise has no idea about any user beyond the
application.
Three-tier Architecture:
● Presentation Tier (Front-end): The user interface layer through which users
interact with the application.
● Application (Middle) Tier: Manages application logic and processing, including
business rules and data validation.
● Data Tier (Back-end): Stores and manages the data. It includes the database
server and storage.
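The separation between the three tiers can be sketched in a few lines of Python. This is
only an illustration under stated assumptions, not a real deployment: an in-memory SQLite
database stands in for the data tier, and the names DataTier, ApplicationTier, and
presentation are hypothetical.

```python
import sqlite3


class DataTier:
    """Back-end: owns the database connection and runs SQL."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT)")

    def insert_student(self, name):
        cur = self.conn.execute("INSERT INTO students (name) VALUES (?)", (name,))
        self.conn.commit()
        return cur.lastrowid

    def fetch_student(self, student_id):
        return self.conn.execute(
            "SELECT id, name FROM students WHERE id = ?", (student_id,)
        ).fetchone()


class ApplicationTier:
    """Middle tier: business rules and validation; the only caller of DataTier."""

    def __init__(self, data_tier):
        self._data = data_tier

    def register(self, name):
        name = name.strip()
        if not name:
            raise ValueError("name must not be empty")
        return self._data.insert_student(name)

    def lookup(self, student_id):
        row = self._data.fetch_student(student_id)
        return None if row is None else {"id": row[0], "name": row[1]}


def presentation(app, name):
    """Front-end: talks only to the application tier, never to the database."""
    student_id = app.register(name)
    return f"Registered {app.lookup(student_id)['name']} with id {student_id}"


app = ApplicationTier(DataTier())
message = presentation(app, "  Asha ")
```

The presentation function never sees SQL, and the data tier never sees raw user input,
which mirrors the point that neither end knows about the other beyond the application.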
Components of DBMS:
● Data Definition Language (DDL): Used to define the structure of the database,
such as creating, altering, and dropping tables and relationships.
● Data Manipulation Language (DML): Involves operations like inserting, updating,
deleting, and querying data.
● Database Engine: Responsible for processing SQL queries, managing
transactions, and handling data integrity.
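The difference between DDL and DML is easiest to see side by side. The sketch below
assumes Python's built-in sqlite3 module as the database engine; the students table and
its columns are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure.
conn.execute(
    "CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT NOT NULL, age INTEGER)"
)

# DML: insert, update, and query rows.
conn.execute("INSERT INTO students (name, age) VALUES (?, ?)", ("Asha", 19))
conn.execute("INSERT INTO students (name, age) VALUES (?, ?)", ("Ravi", 21))
conn.execute("UPDATE students SET age = age + 1 WHERE name = ?", ("Asha",))
rows = conn.execute("SELECT name, age FROM students ORDER BY name").fetchall()

# DDL again: alter the structure, inspect it, then drop the table.
conn.execute("ALTER TABLE students ADD COLUMN email TEXT")
columns = [row[1] for row in conn.execute("PRAGMA table_info(students)")]
conn.execute("DROP TABLE students")
```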
Database Schema:
● Physical Schema: Describes how data is stored in the database, including details
like table structures, indexes, and storage mechanisms.
● Logical Schema: Represents the logical organization of data without concerning
itself with how the data will be stored physically.
Concurrency Control:
● Transaction Management: Involves ensuring the consistency and isolation of
transactions, typically through techniques like locking and timestamps to
manage concurrent access to data.
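A small sketch can show how a transaction keeps the database consistent. It assumes
SQLite through Python's sqlite3 module; the accounts table and the transfer function are
hypothetical. Locking and isolation are handled inside the engine and are not shown here;
the example only shows that a failed transaction leaves no partial changes behind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; either both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice, so it is rolled back
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

In the failed transfer the credit to bob has already been executed when the debit violates
the CHECK constraint, so the rollback is what restores the consistent state.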
Data Independence:
● Logical Data Independence: Allows modification of the logical structure of the
database without affecting the application programs using the data.
● Physical Data Independence: Allows modification of the physical storage
structures without affecting the application programs.
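Both kinds of independence can be illustrated with one unchanged query. The sketch assumes
SQLite via Python's sqlite3 module; APP_QUERY plays the role of the application program,
and the table, index, and view names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO students (name, city) VALUES (?, ?)",
    [("Asha", "Pune"), ("Ravi", "Delhi")],
)

APP_QUERY = "SELECT name FROM students WHERE city = 'Pune'"  # the "application program"
before = conn.execute(APP_QUERY).fetchall()

# Physical change: add an index. The application query is untouched.
conn.execute("CREATE INDEX idx_students_city ON students (city)")
after_physical = conn.execute(APP_QUERY).fetchall()

# Logical change: rename the base table, then expose the old shape as a view.
conn.execute("ALTER TABLE students RENAME TO students_base")
conn.execute("CREATE VIEW students AS SELECT id, name, city FROM students_base")
after_logical = conn.execute(APP_QUERY).fetchall()
```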
Query Optimization:
● The DBMS optimizes queries to execute them in the most efficient manner,
considering factors like indexing, caching, and query execution plans.
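One way to watch the optimizer at work is to ask the engine for its execution plan before
and after an index exists. The sketch below assumes SQLite, whose EXPLAIN QUERY PLAN
statement returns the plan description in the fourth column of each row; the table, the
index name, and the plan helper are hypothetical, and the exact plan wording varies between
SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO students (name, age) VALUES (?, ?)",
    [(f"student{i}", 18 + i % 5) for i in range(1000)],
)

QUERY = "SELECT * FROM students WHERE name = 'student42'"


def plan(conn, sql):
    """Return the human-readable detail column of SQLite's query plan."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


plan_before = plan(conn, QUERY)  # no usable index yet, so a table scan is expected
conn.execute("CREATE INDEX idx_students_name ON students (name)")
plan_after = plan(conn, QUERY)   # the plan should now mention idx_students_name
```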
Security and Authorization:
● DBMS provides mechanisms for authentication and authorization to ensure that
only authorized users can access the database and perform specific operations.
Backup and Recovery:
● DBMS includes features for regular data backups and recovery mechanisms to
restore the database to a consistent state in case of failures.
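Backup and recovery can be sketched with the backup method that Python's sqlite3
connections provide (available since Python 3.7). The in-memory databases and the notes
table are assumptions made for the example; a real system would back up to durable storage
on a schedule.

```python
import sqlite3

live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
live.execute("INSERT INTO notes (body) VALUES ('first note')")
live.commit()

# Backup: copy the live database into a second database.
backup = sqlite3.connect(":memory:")
live.backup(backup)

# Simulate a failure that destroys data in the live database.
live.execute("DELETE FROM notes")
live.commit()
lost = live.execute("SELECT body FROM notes").fetchall()

# Recovery: restore the backup into a fresh database and continue from there.
recovered = sqlite3.connect(":memory:")
backup.backup(recovered)
restored = recovered.execute("SELECT body FROM notes").fetchall()
```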
Data models
Data models provide a structured way to represent and organize data, defining how data
elements relate to each other and how they can be manipulated. They serve as
blueprints for designing databases and help ensure the efficient storage, retrieval, and
manipulation of information within a system. Data models can be conceptual, logical, or
physical, offering varying levels of abstraction and detail in describing the structure and
relationships of data in a given context.
○ The entity-relationship (ER) model develops a conceptual design for the database. It
also provides a simple, easy-to-design view of the data.
For example, suppose we design a school database. In this database, the student is
an entity with attributes like address, name, id, and age. The address can be another
entity with attributes like city, street name, and pin code, and there is a relationship
between them.
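The school example can be written down as two small Python classes. This is only a sketch
of the conceptual design, with hypothetical sample values; the address field on Student
stands for the relationship between the two entities.

```python
from dataclasses import dataclass


@dataclass
class Address:
    city: str
    street_name: str
    pin_code: str


@dataclass
class Student:
    id: int
    name: str
    age: int
    address: Address  # the relationship between Student and Address


student = Student(
    id=1,
    name="Asha",
    age=19,
    address=Address(city="Pune", street_name="MG Road", pin_code="411001"),
)
```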
1. Entity:
An entity may be any object, class, person, or place. In the ER diagram, an entity is
represented as a rectangle.
An entity that depends on another entity is called a weak entity. The weak entity doesn't
contain any key attribute of its own. The weak entity is represented by a double
rectangle.
2. Attribute
An attribute describes a property of an entity. An ellipse is used to represent
an attribute.
For example, id, age, contact number, name, etc. can be attributes of a student.
a. Key Attribute
The key attribute uniquely identifies an entity. It represents a primary key. The key
attribute is represented by an ellipse with the text underlined.
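b. Composite Attribute
An attribute composed of several other attributes is known as a composite attribute.
For example, an address can be composed of street name, city, and pin code.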
c. Multivalued Attribute
An attribute that can have more than one value is known as a multivalued
attribute. A double oval is used to represent a multivalued attribute.
For example, a student can have more than one phone number.
d. Derived Attribute
An attribute that can be derived from another attribute is known as a derived attribute. It
is represented by a dashed ellipse.
For example, a person's age changes over time and can be derived from another
attribute such as date of birth.
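The attribute kinds above map naturally onto fields of a class. The following is a minimal
sketch with hypothetical names and sample values: student_id plays the key attribute,
phone_numbers the multivalued attribute, and age is derived from date_of_birth instead of
being stored.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Student:
    student_id: int                                     # key attribute
    name: str                                           # simple attribute
    date_of_birth: date                                 # stored attribute
    phone_numbers: list = field(default_factory=list)   # multivalued attribute

    def age(self, today):
        """Derived attribute: computed from date_of_birth, not stored."""
        years = today.year - self.date_of_birth.year
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month,
            self.date_of_birth.day,
        )
        return years if had_birthday else years - 1


s = Student(1, "Asha", date(2005, 6, 15), ["555-0101", "555-0102"])
age_before_birthday = s.age(date(2024, 6, 14))
age_on_birthday = s.age(date(2024, 6, 15))
```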
3. Relationship
a. One-to-One Relationship
When only one instance of each entity is associated with the relationship, it is known
as a one-to-one relationship.
For example, a woman can marry one man, and a man can marry one woman.
b. One-to-many relationship
When only one instance of the entity on the left and more than one instance of the entity
on the right are associated with the relationship, it is known as a one-to-many
relationship.
For example, a scientist can make many inventions, but each invention is made by one
specific scientist.
c. Many-to-one relationship
When more than one instance of the entity on the left and only one instance of the entity
on the right are associated with the relationship, it is known as a many-to-one
relationship.
For example, a student enrolls in only one course, but a course can have many students.
d. Many-to-many relationship
When more than one instance of the entity on the left and more than one instance of the
entity on the right are associated with the relationship, it is known as a many-to-many
relationship.
For example, an employee can be assigned to many projects, and a project can have many
employees.
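In a relational database, a many-to-many relationship is commonly stored in a separate
junction table. The sketch below uses the employee and project example with SQLite via
Python's sqlite3 module; the table names, column names, and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE projects  (id INTEGER PRIMARY KEY, title TEXT);
    -- Junction table: one row per (employee, project) pairing.
    CREATE TABLE assignments (
        employee_id INTEGER REFERENCES employees(id),
        project_id  INTEGER REFERENCES projects(id),
        PRIMARY KEY (employee_id, project_id)
    );
    INSERT INTO employees VALUES (1, 'Asha'), (2, 'Ravi');
    INSERT INTO projects  VALUES (10, 'Payroll'), (20, 'Website');
    INSERT INTO assignments VALUES (1, 10), (1, 20), (2, 20);
""")

# One employee, many projects.
projects_of_asha = [r[0] for r in conn.execute(
    "SELECT p.title FROM projects p JOIN assignments a ON a.project_id = p.id "
    "WHERE a.employee_id = 1 ORDER BY p.title")]

# One project, many employees.
staff_on_website = [r[0] for r in conn.execute(
    "SELECT e.name FROM employees e JOIN assignments a ON a.employee_id = e.id "
    "WHERE a.project_id = 20 ORDER BY e.name")]
```

A one-to-many or many-to-one relationship needs no junction table: a foreign key column on
the "many" side is enough.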
An object-based data model is a type of data model that represents data using the
concepts of objects, classes, and inheritance. It is closely associated with
object-oriented programming and is designed to store and manipulate data in a way that
mirrors the real-world entities and their relationships. Here are some key points about
object-based data models:
★ Objects:
○ An object is an instance of a class, representing a real-world entity or
concept.
○ Objects encapsulate both data (attributes) and behaviors (methods or
functions).
★ Classes:
○ A class is a blueprint or template for creating objects.
○ It defines the properties (attributes) and behaviors (methods) that objects
of the class will have.
★ Attributes:
○ Attributes are properties or characteristics associated with objects.
○ They represent the data that objects of a class can hold.
★ Methods:
○ Methods are functions or procedures associated with a class.
○ They define the behavior or actions that objects of the class can perform.
★ Encapsulation:
○ Encapsulation is the bundling of data and methods within a class,
restricting access to the internal details of the object.
○ It promotes information hiding and helps maintain data integrity.
★ Inheritance:
○ Inheritance allows a class to inherit attributes and behaviors from another
class.
○ It promotes code reuse and the creation of a hierarchy of classes.
★ Polymorphism:
○ Polymorphism enables objects of different classes to be treated as
objects of a common base class.
○ It allows for flexibility in handling various object types through a common
interface.
★ Object Identity:
○ Each object has a unique identity, allowing it to be distinguished from
other objects.
○ Object identity is often represented by a unique identifier.
★ Complex Relationships:
○ Object-based data models can represent complex relationships between
objects, mirroring real-world scenarios more accurately.
★ Persistence:
○ Object-based data models often support the persistence of objects,
allowing them to be stored in databases and retrieved later.
★ Object Query Language (OQL):
○ OQL is a query language designed for object-oriented databases, allowing
users to query and manipulate objects using a syntax similar to SQL.
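Several of the points above (classes, encapsulation, inheritance, polymorphism, and object
identity) can be seen together in a short Python sketch. The Person and Student classes
and the object_id counter are hypothetical and only illustrate the concepts; they are not
an object database.

```python
class Person:
    _next_id = 1  # class-level counter used to give each object an identity

    def __init__(self, name):
        self._name = name  # leading underscore marks an internal detail
        self.object_id = Person._next_id
        Person._next_id += 1

    @property
    def name(self):
        """Read-only access to the encapsulated name."""
        return self._name

    def describe(self):
        return f"{self.name} is a person"


class Student(Person):  # inheritance: Student reuses Person's attributes
    def __init__(self, name, course):
        super().__init__(name)
        self.course = course

    def describe(self):  # polymorphism: same call, different behavior
        return f"{self.name} studies {self.course}"


people = [Person("Ravi"), Student("Asha", "DBMS")]
descriptions = [p.describe() for p in people]
identities = [p.object_id for p in people]
```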
The semi-structured data model
The semi-structured data model is a type of data model that does not require a rigid
schema, offering flexibility in representing data with varying structures. Unlike the
structured data model of relational databases, semi-structured data does not adhere to
a strict, predefined schema. Instead, it allows for irregularities and variations in the data
structure. Here are key characteristics and points about the semi-structured data model:
1. Flexible Schema:
a. Semi-structured data does not conform to a fixed schema, allowing for
variations in the structure of the data.
2. Self-Describing:
a. Data is often self-describing, meaning that it may include metadata or tags
that provide information about the structure and meaning of the data
elements.
3. Common Representations:
a. Semi-structured data is commonly represented in formats like XML
(eXtensible Markup Language) and JSON (JavaScript Object Notation),
which can represent nested and hierarchical structures.
4. Hierarchy and Nesting:
a. Semi-structured data often exhibits hierarchical relationships and can be
organized in nested structures, such as trees or graphs.
5. Example Formats:
a. XML (eXtensible Markup Language): Uses tags to define elements and
their relationships in a hierarchical structure.
b. JSON (JavaScript Object Notation): Represents data as key-value pairs
and supports nested structures.
6. No Formal Data Definition Language (DDL):
a. Unlike relational databases with a formal Data Definition Language (DDL),
semi-structured data does not require a predefined schema.
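A short JSON example makes the flexible schema concrete. In the sketch below, which uses
Python's standard json module and made-up records, the two records share only the name
key; one carries a list of phones and the other a nested address.

```python
import json

raw = """
[
  {"name": "Asha", "phones": ["555-0101", "555-0102"]},
  {"name": "Ravi", "address": {"city": "Delhi", "pin_code": "110001"}}
]
"""

records = json.loads(raw)

# Each record describes its own structure through its keys.
keys_per_record = [sorted(r.keys()) for r in records]

# Code that reads the data has to tolerate missing fields.
cities = [r.get("address", {}).get("city") for r in records]
```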
Normalization
Normalization is a database design technique used to organize data in a relational
database efficiently. The goal of normalization is to eliminate data redundancy, reduce
the likelihood of data anomalies, and ensure data integrity. This process involves
breaking down a large table into smaller, related tables while maintaining the
relationships between them. The result is a set of well-structured tables that adhere to
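specific normal forms. The most common normal forms are the First Normal Form
(1NF), in which every column holds atomic values; the Second Normal Form (2NF), in
which the table is in 1NF and every non-key attribute depends on the whole primary key;
and the Third Normal Form (3NF), in which the table is in 2NF and no non-key attribute
depends on another non-key attribute.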
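As a rough illustration of breaking a large table into smaller related ones, the sketch
below starts from a flat orders table in which the customer's city is repeated on every
row, then moves the customer facts into their own table. It assumes SQLite via Python's
sqlite3 module, and all table names and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Before: the customer's city is repeated on every order row.
    CREATE TABLE orders_flat (
        order_id INTEGER PRIMARY KEY, customer TEXT, city TEXT, item TEXT);
    INSERT INTO orders_flat VALUES
        (1, 'Asha', 'Pune', 'Pen'),
        (2, 'Asha', 'Pune', 'Book'),
        (3, 'Ravi', 'Delhi', 'Bag');

    -- After: customer facts are stored once and referenced by key.
    CREATE TABLE customers (customer TEXT PRIMARY KEY, city TEXT);
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer TEXT REFERENCES customers(customer),
        item TEXT);
    INSERT INTO customers SELECT DISTINCT customer, city FROM orders_flat;
    INSERT INTO orders SELECT order_id, customer, item FROM orders_flat;
""")

# A city change is now a single-row update instead of one update per order.
conn.execute("UPDATE customers SET city = 'Mumbai' WHERE customer = 'Asha'")
joined = conn.execute(
    "SELECT o.order_id, c.customer, c.city, o.item "
    "FROM orders o JOIN customers c ON c.customer = o.customer ORDER BY o.order_id"
).fetchall()
```

In the flat table, city depends on customer and not directly on order_id, which is the
kind of dependency that 3NF removes.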
Indexing:
1. Definition:
a. Indexing is a technique that creates an additional data structure, known as
an index, to improve the speed of data retrieval
operations.
b. The index contains a subset of the columns from the actual table and
pointers to the corresponding rows.
2. Types of Indexes:
a. Clustered Index: Determines the physical order of data rows in the table
based on the indexed column. There can be only one clustered index per
table.
b. Non-Clustered Index: Creates a separate structure containing the indexed
column values and pointers to the actual rows.
3. Advantages:
a. Faster data retrieval: Indexes provide a shortcut to the relevant rows,
reducing the number of data pages that need to be accessed.
b. Improved query performance: Queries involving the indexed columns
benefit from quicker access to the required data.
c. Supports range queries: Indexes are beneficial for range-based searches.
4. Disadvantages:
a. Increased storage space: Indexes consume additional storage space, as
they are separate structures.
b. Overhead during updates: Maintaining indexes during insert, update, or
delete operations can impact performance.
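The idea of a separate structure holding key values and row pointers can be sketched in
plain Python. This is a simplified stand-in for a non-clustered index, not how a real DBMS
stores one (real systems typically use B-trees); the rows, age_index, and range_lookup
names are hypothetical.

```python
import bisect

# The "table": rows stored in arrival order (hypothetical sample data).
rows = [
    {"id": 3, "name": "Ravi", "age": 21},
    {"id": 1, "name": "Asha", "age": 19},
    {"id": 2, "name": "Meera", "age": 25},
    {"id": 4, "name": "Kabir", "age": 23},
]

# The index: (age, row position) pairs kept sorted by age.
age_index = sorted((row["age"], pos) for pos, row in enumerate(rows))
index_keys = [key for key, _ in age_index]


def range_lookup(low, high):
    """Return names of rows with low <= age <= high, using binary search on the index."""
    start = bisect.bisect_left(index_keys, low)
    end = bisect.bisect_right(index_keys, high)
    return [rows[pos]["name"] for _, pos in age_index[start:end]]


names = range_lookup(20, 24)
```

Because the index is sorted, a range query touches only the matching slice. The cost is
the extra storage for age_index and the work of keeping it sorted when rows change.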
Hashing:
1. Definition:
a. Hashing is a technique that uses a hash function to map data to a
fixed-size array, known as a hash table.
b. The hash function takes an input (e.g., a key) and generates a hash code,
which is used as an index to access the corresponding location in the
hash table.
2. Hash Collisions:
a. Hash collisions occur when two different inputs produce the same hash
code. There are various methods to handle collisions, such as chaining
(linked lists at each hash table slot) or open addressing (probing for the
next available slot).
3. Advantages:
a. Constant-time access: Hashing can provide constant-time access to data
on average, and exactly constant time when there are no collisions.
b. Well-suited for equality searches: Hashing is efficient for equality-based
queries.
4. Disadvantages:
a. Poor support for range queries: Hashing is not well-suited for range-based
searches.
b. Collision resolution: Handling collisions adds complexity to the
implementation.
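Collision handling by chaining can be sketched as a small Python class. The
ChainedHashTable name, the table size of 4, and the sample keys are assumptions chosen so
that collisions are easy to see; Python lists stand in for the linked lists at each slot.

```python
class ChainedHashTable:
    def __init__(self, size=4):
        self.buckets = [[] for _ in range(size)]  # one chain per slot

    def _slot(self, key):
        # Hash function: map the key to a slot index in the fixed-size array.
        return hash(key) % len(self.buckets)

    def put(self, key, value):
        bucket = self.buckets[self._slot(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))  # new key, possibly a collision

    def get(self, key, default=None):
        for existing_key, value in self.buckets[self._slot(key)]:
            if existing_key == key:
                return value
        return default


table = ChainedHashTable(size=4)
for student_id in (1, 5, 9):  # 1, 5, and 9 all map to slot 1 when size is 4
    table.put(student_id, f"student{student_id}")
chain_length = len(table.buckets[1])
found = table.get(9)
```

An equality lookup only walks one chain, but nothing in the structure keeps keys in order,
which is why range searches are poorly supported.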
The choice between indexing and hashing depends on the specific requirements of the
application, the nature of queries, and the expected workload on the database. In some
cases, a combination of both techniques may be employed to address different types of
queries efficiently.
Questions
1. A collection of raw facts and figures is data, whereas data that has been manipulated
and processed is information.