ADBMS

Uploaded by

Alex Parker
Copyright
© © All Rights Reserved

ADBMS –

Chapter 1:

Indexing and Hashing:

What is Indexing?

A single-level index is an auxiliary file that makes it more efficient to search for a record in the data file.

The index file usually occupies considerably fewer disk blocks than the data file because its entries are much smaller.

A binary search on the index yields a pointer to the file record.

Indexes can also be characterized as dense or sparse:

- A dense index has an index entry for every search key value (and hence every record) in the data file.
- A sparse (or non-dense) index, on the other hand, has index entries for only some of the search values.
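
The saving from a smaller index file can be checked with a little arithmetic. The sketch below counts worst-case block reads for a binary search on the data file, on a dense index and on a sparse index. All sizes (block size, record size, entry size, record count) are assumed figures chosen only for illustration; they do not come from these notes.

```python
import math

def binary_search_cost(num_blocks):
    """Worst-case block reads for a binary search over num_blocks blocks."""
    return math.ceil(math.log2(num_blocks)) if num_blocks > 1 else 1

# Assumed figures, for illustration only.
BLOCK_SIZE = 1024      # bytes per disk block
RECORD_SIZE = 100      # bytes per data record
ENTRY_SIZE = 15        # bytes per index entry (key + pointer)
NUM_RECORDS = 30_000

records_per_block = BLOCK_SIZE // RECORD_SIZE              # 10
data_blocks = math.ceil(NUM_RECORDS / records_per_block)   # 3000
entries_per_block = BLOCK_SIZE // ENTRY_SIZE               # 68

def index_blocks(num_entries):
    """Blocks needed to hold num_entries index entries."""
    return math.ceil(num_entries / entries_per_block)

dense_blocks = index_blocks(NUM_RECORDS)    # one entry per record -> 442
sparse_blocks = index_blocks(data_blocks)   # one entry per block  -> 45

# Searching an index costs the binary search plus one read of the data block.
cost_no_index = binary_search_cost(data_blocks)       # 12
cost_dense = binary_search_cost(dense_blocks) + 1     # 10
cost_sparse = binary_search_cost(sparse_blocks) + 1   # 7

print(cost_no_index, cost_dense, cost_sparse)
```

Under these assumptions the sparse index needs about a tenth of the blocks of the dense one, which is why it wins on search cost whenever the data file is ordered on the search field.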

YT Link - https://youtu.be/E--yzX05_k8?feature=shared

Types of Indexing:

YT Link - https://youtu.be/vjrHiaIfOl8?feature=shared

- Single-Level Indexing:

Primary Index

1) Defined on an ordered data file.
2) The data file is ordered on a key field.
3) Includes one index entry for each block in the data file; the index entry has the key field value for the first
record in the block, which is called the block anchor.
4) A similar scheme can use the last record in a block.
5) A primary index is a non-dense (sparse) index, since it includes one entry for each disk block of the data file,
holding the key of its anchor record, rather than one entry for every search value.
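
The lookup described in the points above can be sketched as follows. This is a minimal in-memory illustration, not a real storage engine: `data_blocks`, `primary_index` and `lookup` are hypothetical names, and each Python list stands in for one disk block.

```python
from bisect import bisect_right

# Hypothetical data file: blocks of (key, name) records, ordered on the key.
data_blocks = [
    [(1, "Ann"), (4, "Bob"), (7, "Cy")],
    [(10, "Dee"), (13, "Eli"), (16, "Fay")],
    [(19, "Gus"), (22, "Hal"), (25, "Ida")],
]

# One index entry per block: (key of the block anchor, block number).
primary_index = [(block[0][0], block_no)
                 for block_no, block in enumerate(data_blocks)]
anchor_keys = [key for key, _ in primary_index]

def lookup(key):
    """Return the record with the given key, or None if it is absent."""
    # Binary search for the last anchor that is <= key.
    pos = bisect_right(anchor_keys, key) - 1
    if pos < 0:
        return None  # key is smaller than every block anchor
    block_no = primary_index[pos][1]
    # Only this one block has to be read and scanned.
    for record in data_blocks[block_no]:
        if record[0] == key:
            return record
    return None

print(lookup(13))   # (13, 'Eli')
print(lookup(14))   # None
```

The index holds three entries for nine records, which is the sparse property stated in point 5.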

YT Link - https://youtu.be/4E-MGnjMhRw?feature=shared

Clustering Index

1) Defined on an ordered data file.
2) The data file is ordered on a non-key field, unlike a primary index, which requires that the ordering field of the
data file have a distinct value for each record.
3) Includes one index entry for each distinct value of the field; the index entry points to the first data block that
contains records with that field value.
4) It is another example of a sparse index. Insertion and deletion are relatively straightforward with a
clustering index, particularly when a whole block (or cluster of blocks) is reserved for each distinct value.
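
A clustering index can be sketched in the same in-memory style. The department numbers, names and helper functions below are hypothetical; the point is only that the index keeps one entry per distinct value, pointing at the first block that contains it.

```python
# Hypothetical data file ordered on a non-key field (department number).
data_blocks = [
    [(1, "Ann"), (1, "Bob"), (1, "Cy")],
    [(2, "Dee"), (2, "Eli"), (3, "Fay")],
    [(3, "Gus"), (3, "Hal"), (5, "Ida")],
]

def build_clustering_index(blocks):
    """One entry per distinct field value -> first block holding that value."""
    index = {}
    for block_no, block in enumerate(blocks):
        for dept, _name in block:
            index.setdefault(dept, block_no)  # keep only the first block seen
    return sorted(index.items())

clustering_index = build_clustering_index(data_blocks)

def find_all(dept):
    """Return the names of all records whose field value equals dept."""
    first_block = dict(clustering_index).get(dept)
    if first_block is None:
        return []
    matches = []
    # Records with the same value are contiguous, so scan forward and stop
    # as soon as a larger value shows up.
    for block in data_blocks[first_block:]:
        for d, name in block:
            if d == dept:
                matches.append(name)
            elif d > dept:
                return matches
    return matches

print(clustering_index)   # [(1, 0), (2, 1), (3, 1), (5, 2)]
print(find_all(3))        # ['Fay', 'Gus', 'Hal']
```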

YT Link - https://youtu.be/UpJ9ICmzaAM?feature=shared

Secondary Index –
1) A secondary index provides a secondary means of accessing a file for which some primary access already
exists.
2) The secondary index may be on a field which is a candidate key and has a unique value in every record, or a
non-key with duplicate values.
3) The index is an ordered file with two fields.
4) The first field is of the same data type as some non-ordering field of the data file that is an indexing field.
5) The second field is either a block pointer or a record pointer.
6) There can be many secondary indexes (and hence, indexing fields) for the same file.
7) Includes one entry for each record in the data file; hence, it is a dense index.
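
As a rough sketch of the points above, the example below builds a secondary index on a non-ordering field with duplicate values. The file layout, the `(block, slot)` record pointers and the function name are assumptions made for illustration.

```python
from bisect import bisect_left

# Hypothetical data file ordered on emp_id; name is a non-ordering field.
data_blocks = [
    [(1, "Mia"), (2, "Zed")],
    [(3, "Ann"), (4, "Mia")],
    [(5, "Bob"), (6, "Kim")],
]

# Dense index: one (field value, record pointer) entry per record,
# kept sorted on the field value. A record pointer is (block, slot).
secondary_index = sorted(
    (name, (block_no, slot))
    for block_no, block in enumerate(data_blocks)
    for slot, (_emp_id, name) in enumerate(block)
)
indexed_names = [name for name, _ in secondary_index]

def find_by_name(name):
    """Return every record whose name field equals the given name."""
    pos = bisect_left(indexed_names, name)   # binary search on the index
    records = []
    while pos < len(indexed_names) and indexed_names[pos] == name:
        block_no, slot = secondary_index[pos][1]
        records.append(data_blocks[block_no][slot])
        pos += 1
    return records

print(find_by_name("Mia"))   # [(1, 'Mia'), (4, 'Mia')]
```

Because the data file is not ordered on `name`, the two matching records live in different blocks, so the index needs an entry for each record rather than one per block.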

YT Link - https://youtu.be/Ua08uVgsk4k?feature=shared

Multi-Level Indexing –

1) Because a single-level index is an ordered file, we can create a primary index to the index itself.
2) In this case, the original index file is called the first-level index and the index to the index is called the
second-level index.
3) We can repeat the process, creating a third, fourth, ..., top level until all entries of the top level fit in one disk
block.
4) A multi-level index can be created for any type of first-level index (primary, secondary, clustering) as long as
the first-level index consists of more than one disk block.
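
The repeated "index on the index" step can be worked through numerically. The function below is a small sketch, assuming a fixed number of index entries per block (the fan-out); the figures 30,000 and 68 are illustrative assumptions, not values from these notes.

```python
import math

def index_levels(first_level_entries, fan_out):
    """Blocks needed at each level, from the first level up to the top."""
    if first_level_entries < 1 or fan_out < 2:
        raise ValueError("need at least one entry and a fan-out of 2 or more")
    levels = []
    entries = first_level_entries
    while True:
        blocks = math.ceil(entries / fan_out)
        levels.append(blocks)
        if blocks == 1:          # the top level fits in one disk block
            return levels
        entries = blocks         # next level: one entry per block of this level

# Assumed: a dense first level with 30,000 entries, 68 entries per block.
levels = index_levels(30_000, 68)
print(levels)            # [442, 7, 1]
print(len(levels) + 1)   # 4 block reads: one per level plus the data block
```

With these numbers a search reads one block per index level plus one data block, instead of the binary search over 442 first-level blocks that a single-level index would need.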

B-Trees (Balanced Trees) –

YT Link - https://youtu.be/KcApkM5WYGw?feature=shared

Insertion of Elements in B-Trees –

YT Link - https://youtu.be/YUtUNlLNB5c?feature=shared

Types of Databases –
- Multimedia Database:
A multimedia database is a collection of interrelated multimedia data that includes text, graphics
(sketches, drawings), images, animations, video, audio etc., and typically holds vast amounts of multisource
multimedia data. The framework that manages the different types of multimedia data, which can be stored,
delivered and utilized in different ways, is known as a multimedia database management system. Multimedia
databases fall into three classes: static media, dynamic media and dimensional media.
Content of a multimedia database management system:
1) Media data – The actual data representing an object.
2) Media format data – Information such as sampling rate, resolution, encoding scheme etc. about
the format of the media data after it goes through the acquisition, processing and encoding
phases.
3) Media keyword data – Keyword descriptions relating to the generation of the data. It is also known
as content descriptive data. Example: date, time and place of recording.
4) Media feature data – Content-dependent data such as the distribution of colours, kinds of
texture and different shapes present in the data.
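
One way to picture the four kinds of content is as the fields of a single stored object. The sketch below is only an illustration: the class name, field names and sample values are hypothetical and do not correspond to any particular multimedia DBMS.

```python
from dataclasses import dataclass, field

@dataclass
class MediaObject:
    media_data: bytes      # 1) the actual encoded object
    format_data: dict      # 2) sampling rate, resolution, encoding scheme ...
    keyword_data: dict     # 3) date, time and place of recording ...
    feature_data: dict = field(default_factory=dict)  # 4) colours, textures, shapes

def search_by_keyword(objects, key, value):
    """Retrieve objects by their descriptive keywords, not their raw bytes."""
    return [obj for obj in objects if obj.keyword_data.get(key) == value]

clips = [
    MediaObject(b"...", {"encoding": "jpeg", "resolution": "640x480"},
                {"place": "Pune", "date": "2023-01-05"},
                {"dominant_colour": "blue"}),
    MediaObject(b"...", {"encoding": "wav", "sampling_rate_hz": 44100},
                {"place": "Delhi", "date": "2023-02-11"}),
]

found = search_by_keyword(clips, "place", "Pune")
print(found[0].format_data["encoding"])   # jpeg
```

Queries typically run against the keyword and feature data, since the raw media bytes are not directly searchable.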

There are still many challenges to multimedia databases, some of which are:
1) Modelling
2) Design
3) Storage
4) Performance
5) Queries and Retrieval

Areas where multimedia databases are applied:

1) Documents and record management
2) Knowledge dissemination
3) Education and training
4) Marketing, advertising, retailing, entertainment and travel. Example: a virtual tour of cities.
5) Real-time control and monitoring

- Mobility Database:
A Mobility Database generally refers to a digital repository or collection of data related
to transportation and mobility patterns within a geographic area. These databases are
typically used to gather, store, and analyse information about how people and goods
move within a region. Here are some key aspects of mobility databases:

1. Data Types: Mobility databases may include data on various transportation modes,
such as road networks, public transit systems, walking and biking routes, traffic flow,
and more. They can also encompass data on travel times, congestion levels, vehicle
counts, and even data from GPS and mobile devices.

2. Applications:

- Urban Planning: Helps cities plan infrastructure based on real mobility data.
- Traffic Management: Enables real-time traffic monitoring and congestion mitigation.
- Transportation Research: Supports studies on travel behaviour and mode choices.

3. Drawbacks:

- Privacy Concerns: Involves tracking individuals' movements, raising privacy issues.
- Data Security: Vulnerable to breaches, potentially exposing sensitive information.
- Data Accuracy: Relies on accurate data collection, which can be challenging.
- Accessibility: Not all stakeholders may have access to the database, limiting its utility.
- Ethical Issues: Ethical dilemmas arise when using personal mobility data for public
benefit.
- Cost: Establishing and maintaining a comprehensive database can be expensive.

- NoSQL

NoSQL is a type of database management system (DBMS) that is designed to
handle and store large volumes of unstructured and semi-structured data. Unlike
traditional relational databases that use tables with pre-defined schemas to store
data, NoSQL databases use flexible data models that can adapt to changes in
data structures and are capable of scaling horizontally to handle growing
amounts of data.

NoSQL databases are generally classified into four main categories:

1. Document databases: These databases store data as semi-structured
documents, such as JSON or XML, and can be queried using document-oriented
query languages.
2. Key-value stores: These databases store data as key-value pairs, and are
optimized for simple and fast read/write operations.
3. Column-family stores: These databases store data as column families, which
are sets of columns that are treated as a single entity. They are optimized for
fast and efficient querying of large amounts of data.
4. Graph databases: These databases store data as nodes and edges, and are
designed to handle complex relationships between data.
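
To make the four categories concrete, the sketch below lays out the same small, made-up fact ("user u1 is Ann, lives in Pune, follows u2") in each model using plain Python structures. It only mimics the shape of the data; real products add their own query languages, storage formats and distribution.

```python
# 1. Document: one self-contained, nested document per entity.
document_store = {
    "u1": {"name": "Ann", "address": {"city": "Pune"}, "follows": ["u2"]},
}

# 2. Key-value: an opaque value fetched by its exact key.
key_value_store = {"user:u1:name": "Ann", "user:u1:city": "Pune"}

# 3. Column-family: row key -> column family -> columns.
column_family_store = {
    "u1": {
        "profile": {"name": "Ann", "city": "Pune"},
        "social": {"follows:u2": "1"},
    },
}

# 4. Graph: nodes plus explicit, labelled edges between them.
graph_nodes = {"u1": {"name": "Ann"}, "u2": {"name": "Bob"}}
graph_edges = [("u1", "FOLLOWS", "u2")]

# The same question, "which city does u1 live in?", in three of the models:
print(document_store["u1"]["address"]["city"])        # Pune
print(key_value_store["user:u1:city"])                # Pune
print(column_family_store["u1"]["profile"]["city"])   # Pune

# "Whom does u1 follow?" is the kind of question the graph model targets:
followed = [dst for src, label, dst in graph_edges
            if src == "u1" and label == "FOLLOWS"]
print([graph_nodes[n]["name"] for n in followed])     # ['Bob']
```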

Key Features of NoSQL:

1. Dynamic schema: NoSQL databases do not have a fixed schema and can
accommodate changing data structures without the need for migrations or
schema alterations.
2. Horizontal scalability: NoSQL databases are designed to scale out by adding
more nodes to a database cluster, making them well-suited for handling large
amounts of data and high levels of traffic.
3. Document-based: Some NoSQL databases, such as MongoDB, use a document-
based data model, where data is stored in semi-structured format, such as JSON
or BSON.
4. Performance: NoSQL databases are optimized for high performance and can
handle a high volume of reads and writes, making them suitable for big data and
real-time applications.

Advantages of NoSQL: There are many advantages of working with NoSQL databases
such as MongoDB and Cassandra. The main advantages are high scalability and high
availability.
1. High availability: Automatic replication makes NoSQL databases highly
available, because after a failure the data can be recovered from a replica
in its last consistent state.
2. Scalability: NoSQL databases are highly scalable, which means that they can
handle large amounts of data and traffic with ease.
3. Performance: NoSQL databases are designed to handle large amounts of data
and traffic, which means that they can offer improved performance compared
to traditional relational databases.
4. Cost-effectiveness: NoSQL databases are often more cost-effective than
traditional relational databases, as they are typically less complex and do not
require expensive hardware or software.
5. Agility: Ideal for agile development.
Disadvantages of NoSQL: NoSQL has the following disadvantages.

1. Lack of ACID compliance: Many NoSQL databases are not fully ACID-compliant, which
means that they may not guarantee the consistency, integrity, and durability of data
to the extent that a relational database does.
2. GUI is not available: GUI tools for accessing the database are not as widely
available in the market.
3. Backup: Backup is a weak point for some NoSQL databases such as MongoDB, where
taking a consistent backup of the data can be difficult.

- XML Database

An XML database is a type of database management system (DBMS) that is designed
to store and manage XML (Extensible Markup Language) data. XML is a popular format
for structuring and organizing data, commonly used for representing and exchanging
information between different systems, applications, and platforms. XML databases are
specifically optimized for the storage and retrieval of XML data, making them well-
suited for applications where structured and semi-structured data needs to be managed.

Here are some key characteristics and features of XML databases:

1. Native XML Storage: XML databases store XML documents in their native format,
preserving the hierarchical structure and metadata associated with the data. This allows
for efficient querying and retrieval of XML content.

2. Querying and Indexing: XML databases provide query languages (such as XQuery or
XPath) and indexing mechanisms tailored for XML data. Users can search, filter, and
extract specific elements or attributes from XML documents.

3. Web Services Integration: XML databases are commonly used in conjunction with
web services and web applications, as XML is a fundamental data format for
representing data exchanged over the internet.

4. Semi-Structured Data: XML databases can handle both structured and semi-
structured data, making them suitable for applications with flexible data schemas.

5. Scalability: Depending on the specific database system, XML databases can be
designed for scalability to handle large volumes of XML data efficiently.
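
The querying feature (point 2) can be tried without an XML database. The sketch below uses Python's standard `xml.etree.ElementTree` module, which supports a limited subset of XPath. The `library` document and its element names are made up for the example; a real XML database would run full XPath or XQuery over stored, indexed documents.

```python
import xml.etree.ElementTree as ET

# A made-up XML document, kept in its native hierarchical form.
xml_text = """
<library>
  <book id="b1" year="2019"><title>Indexing Basics</title><author>Ann</author></book>
  <book id="b2" year="2021"><title>Graph Stores</title><author>Bob</author></book>
  <book id="b3" year="2021"><title>XML in Practice</title><author>Ann</author></book>
</library>
"""
root = ET.fromstring(xml_text)

# Filter on an attribute: titles of books published in 2021.
titles_2021 = [book.findtext("title")
               for book in root.findall("./book[@year='2021']")]

# Filter on a child element's text: ids of books written by Ann.
ids_by_ann = [book.get("id")
              for book in root.findall("./book[author='Ann']")]

print(titles_2021)   # ['Graph Stores', 'XML in Practice']
print(ids_by_ann)    # ['b1', 'b3']
```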
- Graph Database:

A graph database (GDB) is a database that uses graph structures for storing data. It
uses nodes, edges, and properties instead of tables or documents to represent and store
data. The edges represent relationships between the nodes. This helps in retrieving data
more easily and, in many cases, with one operation. Graph databases are commonly
classified as NoSQL databases.

1. It solves many-to-many relationship problems

Relationships such as friends of friends are many-to-many relationships.
A graph database is useful when the equivalent query in a relational database would be very complex.

2. When relationships between data elements are more important than the elements themselves

For example, a profile holds some specific information, but the major selling point is
the relationships between different profiles, that is, how you get connected within a
network.
In the same way, a graph database may store many user data elements, but it is the
relationships between them that matter most for the data stored in the graph
database.
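
The friends-of-friends case can be sketched with an adjacency structure in plain Python. The names and the friendship edges are invented for the example; a graph database would store such edges natively and traverse them through its own query language.

```python
from collections import defaultdict

# Made-up undirected friendship edges between profiles.
edges = [("ann", "bob"), ("bob", "cy"), ("bob", "dee"),
         ("ann", "eli"), ("eli", "cy")]

# Each node keeps direct references to its neighbours.
adjacency = defaultdict(set)
for a, b in edges:
    adjacency[a].add(b)
    adjacency[b].add(a)

def friends_of_friends(person):
    """People exactly two hops away who are not already direct friends."""
    direct = adjacency.get(person, set())
    two_hops = set()
    for friend in direct:
        two_hops |= adjacency[friend]   # follow each edge one more step
    return two_hops - direct - {person}

print(sorted(friends_of_friends("ann")))   # ['cy', 'dee']
```

The traversal follows stored relationships hop by hop. In a relational schema the same question would normally need a self-join on a friendship table for every extra hop.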
Advantages: The graph model handles frequent schema changes, growing volumes of data,
real-time query response times, and more intelligent data activation requirements.
Disadvantages: Graph databases are not always the best solution for an
application. The needs of the application must be assessed before deciding on the
architecture.
Limitations of Graph Databases:
• Graph databases may not offer a better choice than other NoSQL variations.
• If the application needs to scale horizontally, this may introduce poor performance.
• They are not very efficient when all nodes with a given parameter need to be updated.

- Federated Database:

Federated Database Management Systems (FDBMS) are distributed database
management systems that integrate data from multiple sources, providing a
unified view for users. These systems are useful for integrating data across
multiple autonomous databases, offering a hybrid of distributed and centralized
systems, with each server acting as an autonomous and centralized DBMS.
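
The idea of a unified view over autonomous members can be sketched with two independent in-memory SQLite databases. The member names, the `customers` table and the `federated_query` helper are all hypothetical. A real FDBMS also has to handle differing schemas, query optimisation and distributed transactions, none of which is attempted here.

```python
import sqlite3

def make_member(rows):
    """Create one autonomous member database holding its own customers."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, city TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", rows)
    return conn

# Two independent databases that know nothing about each other.
members = {
    "sales_db": make_member([(1, "Ann", "Pune"), (2, "Bob", "Delhi")]),
    "support_db": make_member([(3, "Cy", "Pune")]),
}

def federated_query(sql, params=()):
    """Run the same query on every member and merge the rows into one view."""
    results = []
    for source, conn in members.items():
        for row in conn.execute(sql, params):
            results.append((source, *row))   # tag each row with its origin
    return results

pune = federated_query(
    "SELECT id, name FROM customers WHERE city = ?", ("Pune",))
print(pune)   # [('sales_db', 1, 'Ann'), ('support_db', 3, 'Cy')]
```

Each member still answers queries on its own, and the federation layer only forwards the query and combines the answers.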
Advantages of using a federated database management system:

Data integration: A federated database management system integrates data from various
databases and platforms, enabling organizations to analyze and gain insights from data
distributed across multiple systems.

Scalability: A federated database management system can scale to accommodate large
volumes of data and high traffic loads, by adding or removing members from the
federation as required.

Flexibility: A federated database management system is flexible and can be customized
to meet the specific needs of an organization, enabling it to accommodate changes in
data requirements, business processes, and technology platforms.

Cost-effective: Federated database management systems offer cost-effective alternatives to
centralized systems by leveraging existing infrastructure and reducing the need for new
investments.

Security: A federated database management system can provide a higher level of
security, as each member of the federation can implement their own security policies
and access controls to protect their data.

Disadvantages of federated database management:

Complexity: Federated databases are more complex than traditional centralized ones,
because they involve multiple data sources, schemas, and distributed transactions, which
makes system design, implementation, and maintenance challenging.

Performance: Federated databases may face performance issues due to the overhead of
managing distributed transactions and retrieving data from multiple sources, resulting in
slower response times and increased network traffic.

Security: Federated databases may be more vulnerable to security breaches since they
are spread across multiple locations and may be accessed by different users and
applications. Ensuring data privacy, integrity, and security across all the distributed
databases can be a significant challenge.

Cost: Federated databases can be costly to implement and maintain due to specialized
hardware, software, network infrastructure, licensing, and support costs for individual
databases.
