Unit-5 DBMS
Increased Cost
Staff Training and Expense: A large amount of money is required for training and
educating the staff who maintain the database. Hiring new staff and training them
also increases the overall expense.
Cost of Data Conversion: All existing data has to be converted into the database
management system, and skilled database designers are required to design the
entire database. A large amount of money is therefore needed for their salaries
and for the software used to design the database. All of this adds up to
increased costs.
Complexity
A database management system is complex to use, and people without proper training
cannot be expected to understand its software. Skilled engineers, developers, and
database administrators are therefore required for the proper design and management
of the database. The database structure can also be complex, and if it is designed
or mapped incorrectly it can lead to data loss or mismanagement of data, which
could affect the organization’s data. Because maintaining data in a database
management system is a complex task, it requires a lot of manpower and supporting
software.
Database Failure
Performance
A database management system works very fast when there is little data to work on.
But as the organization's data grows, the system becomes heavier and the
performance of the DBMS can decrease, so a file management system is sometimes
preferred over a database management system.
Frequent Updates/Upgrades
Huge Size
As the data acquired by the organization increases, more storage space has to be
set up. But increasing the storage space makes the database heavier, so searching
and storing data becomes slower and the DBMS software takes more time to answer
queries, which makes it less efficient.
Multimedia Database
Multimedia data is an interactive way to represent information to a user. It
includes several categories of data like textual data, audio data, video data, etc.
The database which is used to hold these different kinds of multimedia data is
known as a multimedia database.
Nowadays, users rely on various forms of media such as text, images, audio, video,
and graphic objects to communicate or to obtain information. These media forms are
collectively known as multimedia. Because multimedia offers an interactive way to
present information to a user, managing and storing these different kinds of data
is essential. This is done using a database known as a multimedia database.
A multimedia database is a special type of database that helps us organize, query,
and store inter-related multimedia data. It facilitates the storage and retrieval of
multimedia data elements. In these databases, all media files are stored in the
form of binary strings and are encoded according to their file types. Let's look at
the different kinds of data held in a multimedia database.
Media data: It is the actual multimedia data or the primary data stored in the
multimedia database. It represents a multimedia object and can be an image,
audio, video, animation, graphic object, or text.
Media format data: It is the information related to the format of the
multimedia data. It contains data such as frame rates, encoding schemes, etc.
Media keyword data: It is also known as content descriptive data and
contains information related to the generation of multimedia data, such as the
date and time of recording in the case of images and videos.
Media feature data: It is used to describe the features of multimedia data,
such as the distribution of colors, etc.
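The four kinds of data listed above can be pictured as fields of a single record.
The following is only a minimal sketch: the class name MediaRecord, its field
names, and the sample values are assumptions made for illustration and do not
come from any particular multimedia DBMS.

```python
from dataclasses import dataclass, field


@dataclass
class MediaRecord:
    """One multimedia object plus its descriptive data (hypothetical layout)."""
    media_data: bytes      # media data: the raw, encoded object
    media_format: dict     # media format data: encoding scheme, frame rate, ...
    keywords: dict         # media keyword data: date, time of recording, ...
    features: dict = field(default_factory=dict)  # media feature data: colors, ...


clip = MediaRecord(
    media_data=b"\x00\x01\x02",  # placeholder bytes, not a real video
    media_format={"encoding": "H.264", "frame_rate": 30},
    keywords={"recorded_on": "2024-01-15"},
    features={"dominant_color": "blue"},
)


def find_by_keyword(records, key, value):
    """Return the records whose keyword data has the given key/value pair."""
    return [r for r in records if r.keywords.get(key) == value]


matches = find_by_keyword([clip], "recorded_on", "2024-01-15")
```

Searching on the keyword data rather than on the binary media data is what makes
such a record queryable at all.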
Temporal Database
A temporal database is a database that uses some aspect of time to organize its
information. In a temporal database, each tuple in a relation is associated with
time, so the database stores information about the states of the real world over
time. A conventional database does not store information about past states; it
stores only the current state, and whenever the state changes, the information in
the database is overwritten. In many fields, however, it is necessary to keep
information about past states. For example, a stock database must store past stock
prices for analysis. In a conventional database, such historical information has
to be stored manually in the schema.
There are various terminologies in the temporal database:
Valid Time: The valid time is the time period during which a fact is true with
respect to the real world.
Transaction Time: The transaction time is the time period during which a fact is
stored in the database.
Decision Time: Decision time in the temporal database is the time at which
the decision is made about the fact.
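The difference between valid time and transaction time can be sketched with a
small in-memory example. This is an illustration under stated assumptions, not a
real temporal DBMS: the PriceFact class, its field names, the symbol "ACME", and
the prices are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class PriceFact:
    symbol: str
    price: float
    valid_from: date          # valid time: start of the period the fact holds
    valid_to: Optional[date]  # end of that period; None means "still valid"
    recorded_on: date         # transaction time: when the fact was stored


history = [
    PriceFact("ACME", 100.0, date(2024, 1, 1), date(2024, 2, 1), date(2024, 1, 1)),
    PriceFact("ACME", 110.0, date(2024, 2, 1), None, date(2024, 2, 1)),
]


def price_as_of(facts, symbol, day):
    """Return the price that was valid on `day`, or None if no fact covers it."""
    for f in facts:
        starts_before = f.valid_from <= day
        ends_after = f.valid_to is None or day < f.valid_to
        if f.symbol == symbol and starts_before and ends_after:
            return f.price
    return None


january_price = price_as_of(history, "ACME", date(2024, 1, 15))
```

Because old facts are closed (given an end date) rather than overwritten, a query
about a past date can still be answered.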
Temporal databases are usually built on top of a relational database. But plain
relational databases have some problems handling temporal data: they do not
provide support for complex temporal operations, and their query operations offer
poor support for temporal queries.
Applications of Temporal Databases
1. Finance: It is used to maintain the stock price histories.
2. Factory monitoring: It can be used in a factory monitoring system for storing
information about current and past readings of sensors in the factory.
3. Healthcare: The histories of the patient need to be maintained for giving
the right treatment.
4. Banking: For maintaining the credit histories of the user.
Temporal Relation
The temporal database provides built-in support for the time dimension.
A temporal database stores data related to time aspects.
A temporal database contains historical data in addition to current data.
It provides a uniform way to deal with historical data.
Spatial Databases
A spatial database is a database that is enhanced to store and access spatial data or
data that defines a geometric space. These data are often associated with
geographic locations and features or constructed features like cities. Data on spatial
databases are stored as coordinates, points, lines, polygons, and topology. Some
spatial databases handle more complex data like three-dimensional objects,
topological coverage, and linear networks.
Spatial data is associated with geographic locations such as cities, towns, etc. A
spatial database is optimized to store and query data representing objects that
are defined in a geometric space.
It is a database system
It offers spatial data types (SDTs) in its data model and query language.
It supports spatial data types in its implementation, providing at least spatial
indexing and efficient algorithms for spatial join.
Example
Vector data: This data is represented as discrete points, lines and polygons
Raster data: This data is represented as a matrix of square cells.
The spatial data in the form of points, lines, polygons etc. is used by many different
databases.
Spatial data is diverse. Over the years, spatial data has grown. Now, spatial data
covers everything from simple vector data (points, lines, or polygons) to imagery,
complex 3D scenes, and even indoor locations. Representing real-world objects with
accuracy or performing analysis can be quite complex. This is why we need spatial
databases (also known as geospatial databases).
Spatial databases are built to store and provide powerful query capabilities for
spatial data. Spatial data is often much larger in size than traditional data because of
its additional locational component. Spatial databases make the storage of complex
spatial data possible. Traditional database management systems are not capable of
storing, querying, and indexing spatial data.
You can find spatial databases supported natively through a database (e.g. Microsoft
SQL Server), or as an extension to an existing database (e.g. the ever-popular and
powerful PostGIS extension for PostgreSQL).
How do Spatial Databases differ from each other?
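Spatial databases differ from each other in three components: the spatial data
types they support, their spatial queries, and their spatial indexes. Together,
these three components comprise the basis of a spatial database, and they will
help you decide which spatial database is most suitable for your enterprise or
business.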
Spatial data comes in all shapes and sizes. All databases typically support points,
lines, and polygons, but some support many more spatial data types. Some
databases abide by the standards set by the Open Geospatial Consortium. Yet, that
doesn’t mean it is easy to move the data between databases.
This is where the FME platform reveals some of its strengths. Database barriers no
longer matter, as you can move your data wherever you want. With support for over
450 different systems and applications, it can handle all your data tasks, spatial and
otherwise.
Spatial Queries
Spatial queries perform an action on spatial data stored in the database. Some
spatial queries can be used to perform simple operations. However, some queries
can become much more complex, invoking spatial functions that span multiple tables.
A spatial query using SQL allows you to retrieve a specific subset of spatial data. This
helps you retrieve only what you need from your database.
This is how data is retrieved in spatial databases. The spatial query capabilities can
vary from database to database, both in terms of performance and functionality.
This is important to consider when you select your database.
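As a rough illustration of what a simple spatial query does, the sketch below
selects the points that lie within a given distance of a location. It is plain
Python rather than SQL, the point names and planar coordinates are made up, and
the function within_distance is a hypothetical helper, not part of any spatial
database.

```python
import math

# Hypothetical point features: name -> (x, y) in arbitrary planar units.
cities = {"A": (0.0, 0.0), "B": (3.0, 4.0), "C": (10.0, 10.0)}


def within_distance(points, center, radius):
    """Return the names of all points at most `radius` away from `center`."""
    cx, cy = center
    return sorted(
        name
        for name, (x, y) in points.items()
        if math.hypot(x - cx, y - cy) <= radius
    )


nearby = within_distance(cities, (0.0, 0.0), 5.0)
```

Only the subset of the data that satisfies the spatial condition is returned,
which is the point made above about retrieving only what you need.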
Spatial queries drive a whole new class of business decisions by retrieving the
requested data efficiently for your business systems.
Spatial Indexes
What does the added size and complexity of spatial data mean for your data? Will
your database run slower? Will large spatial datasets be too bulky for your
database to store?
This is why spatial indexes are important. Spatial indexes are created with SQL
commands. These are issued from the database management interface or from an
external program (e.g. FME) with access to your spatial database. Spatial indexes
vary from database to database and are responsible for the database performance
that is necessary for adding spatial data to your decision making.
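To see why an index helps, consider a uniform grid, one of the simplest spatial
indexing schemes. The sketch below is an assumption-laden illustration: real
spatial databases typically use more elaborate structures, and the cell size,
point names, and function names here are all hypothetical.

```python
from collections import defaultdict

CELL = 10.0  # hypothetical grid cell size, in the same units as the coordinates


def cell_of(x, y):
    """Map a coordinate to the grid cell that contains it."""
    return (int(x // CELL), int(y // CELL))


def build_index(points):
    """Group point names by grid cell."""
    index = defaultdict(list)
    for name, (x, y) in points.items():
        index[cell_of(x, y)].append(name)
    return index


def query_box(index, points, xmin, ymin, xmax, ymax):
    """Visit only the cells overlapping the box, then filter exactly."""
    cx0, cy0 = cell_of(xmin, ymin)
    cx1, cy1 = cell_of(xmax, ymax)
    hits = []
    for cx in range(cx0, cx1 + 1):
        for cy in range(cy0, cy1 + 1):
            for name in index.get((cx, cy), []):
                x, y = points[name]
                if xmin <= x <= xmax and ymin <= y <= ymax:
                    hits.append(name)
    return sorted(hits)


points = {"p1": (1.0, 2.0), "p2": (12.0, 3.0), "p3": (55.0, 70.0)}
index = build_index(points)
found = query_box(index, points, 0.0, 0.0, 15.0, 5.0)
```

The query never looks at p3, because its cell does not overlap the search box.
Skipping irrelevant data in this way is what keeps large spatial tables fast.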
Cloud Databases
A cloud database is a database that is deployed in a cloud environment as opposed
to an on-premise environment. The database itself can be offered as
a SaaS (Software-as-a-Service) application or simply be hosted in a cloud-based
virtual machine. Applications can then access all the data stored in a cloud database
over a network from any device.
With a cloud database, there is no need for dedicated hardware to host a database.
Rather than the organization itself installing, configuring, and maintaining a database
instance or instances, the cloud provider can provision, manage, and scale the
underlying database cluster.
You can deploy any type of database in the cloud. This includes traditional SQL
databases and more modern NoSQL types of databases. MongoDB Atlas is a
general-purpose document database that can be deployed on any of the major cloud
providers, like Amazon Web Services (AWS), Microsoft Azure, and Google Cloud.
Cloud databases work in most cases that traditional databases do. They are
particularly valuable when building software products that:
Bigtable is ideal for storing large amounts of single-keyed data with low latency. It
supports high read and write throughput at low latency, and it's an ideal data source
for MapReduce operations.
Bigtable was designed to support applications requiring massive scalability; from its
first iteration, the technology was intended to be used with petabytes of data. The
database was designed to be deployed on clustered systems and uses a simple data
model that Google has described as "a sparse, distributed, persistent
multidimensional sorted map." Data is assembled in order by row key, and indexing
of the map is arranged according to row, column keys and
timestamps. Compression algorithms help achieve high capacity.
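The "sparse, distributed, persistent multidimensional sorted map" can be pictured,
leaving out distribution and persistence, as a dictionary indexed by row key,
column key and timestamp. The sketch below is a toy model only: the function
names, row keys, and column names are hypothetical and this is not the Bigtable
API.

```python
# Toy model: (row_key, column_key, timestamp) -> value.
# Cells that were never written take up no space, which makes the map sparse.
table = {}


def put(row, column, timestamp, value):
    """Store one version of a cell."""
    table[(row, column, timestamp)] = value


def get_latest(row, column):
    """Return the value with the highest timestamp for this cell, or None."""
    versions = [
        (ts, value)
        for (r, c, ts), value in table.items()
        if r == row and c == column
    ]
    if not versions:
        return None
    return max(versions, key=lambda pair: pair[0])[1]


def scan_rows():
    """Return the distinct row keys in sorted order, as a row scan would."""
    return sorted({row for (row, _, _) in table})


put("com.example/index", "contents:html", 1, "<html>v1</html>")
put("com.example/index", "contents:html", 2, "<html>v2</html>")
put("com.example/about", "anchor:home", 1, "About us")

latest = get_latest("com.example/index", "contents:html")
row_keys = scan_rows()
```

Keeping several timestamped versions of a cell and reading rows in key order are
the two behaviours this model is meant to show.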
Google Bigtable serves as the database for applications such as the Google App
Engine Datastore, Google Personalized Search, Google Earth and Google Analytics.
Google has maintained the software as a proprietary, in-house technology.
Nevertheless, Bigtable has had a large impact on NoSQL database design. Google
software developers publicly disclosed Bigtable details in a technical paper
presented at the USENIX Symposium on Operating Systems Design and
Implementation in 2006.
NoSQL
A NoSQL database is a non-relational data management system that does not require
a fixed schema. It avoids joins and is easy to scale. The major purpose of using a
NoSQL database is for distributed data stores with very large data storage needs.
NoSQL is used for big data and real-time web apps. For example, companies like
Twitter, Facebook and Google collect terabytes of user data every single day.
NoSQL stands for “Not Only SQL” or “Not SQL.” Though a better term
would be “NoREL”, NoSQL caught on. Carlo Strozzi introduced the NoSQL concept in
1998.
A traditional RDBMS uses SQL syntax to store and retrieve data for further insights.
In contrast, a NoSQL database system encompasses a wide range of database
technologies that can store structured, semi-structured, unstructured and
polymorphic data.
Features of NoSQL
Non-relational
Schema-free
Simple API
Offers easy-to-use interfaces for storing and querying data
APIs allow low-level data manipulation & selection methods
Text-based protocols, mostly HTTP REST with JSON
Mostly no standards-based NoSQL query language is used
Web-enabled databases running as internet-facing services
Distributed
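Key-Value Pair Based
Key-value pair storage databases store data as a hash table where each key is
unique, and the value can be JSON, a BLOB (Binary Large Object), a string, etc.
For example, a key-value pair may contain a key like “Website” associated with a
value like “Guru99”.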
It is one of the most basic kinds of NoSQL database. This kind of NoSQL database is
used as a collection, dictionary, associative array, etc. Key-value stores help the
developer to store schema-less data. They work well for shopping cart contents.
Redis, Dynamo and Riak are some examples of key-value store databases. Riak, in
particular, is based on Amazon’s Dynamo paper.
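A key-value store behaves much like a dictionary with put, get and delete
operations. The class below is a minimal in-memory sketch: the name KeyValueStore,
its method names, and the shopping-cart key are assumptions for illustration, not
the API of Redis, Dynamo or Riak.

```python
class KeyValueStore:
    """Minimal in-memory key-value store; each key maps to exactly one value."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # Writing to an existing key replaces the old value.
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def delete(self, key):
        # Deleting a missing key is silently ignored.
        self._data.pop(key, None)


store = KeyValueStore()
store.put("Website", "Guru99")
# The value can be any structure; the store never looks inside it.
store.put("cart:user42", {"items": ["book", "pen"]})

website = store.get("Website")
cart = store.get("cart:user42")
```

The store treats the value as opaque, so a lookup is only possible by key. That
is the main difference from the document stores described further on.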
Column-based
Column-oriented databases work on columns and are based on the BigTable paper by
Google. Every column is treated separately. The values of a single column are
stored contiguously.
They deliver high performance on aggregation queries like SUM, COUNT, AVG, MIN
etc. as the data is readily available in a column.
HBase, Cassandra and Hypertable are examples of column-based databases.
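The benefit for aggregation can be seen by laying the same data out by row and by
column. This is only a sketch of the storage idea with made-up sample data; it
does not reflect how HBase, Cassandra or Hypertable organize data internally.

```python
# Row-oriented layout: one record per row.
rows = [
    {"id": 1, "region": "north", "amount": 120},
    {"id": 2, "region": "south", "amount": 80},
    {"id": 3, "region": "north", "amount": 50},
]

# Column-oriented layout: the values of each column are kept together.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# An aggregate reads a single contiguous list and never touches other columns.
total = sum(columns["amount"])
count = len(columns["amount"])
average = total / count
smallest = min(columns["amount"])
```

In the row layout, the same SUM would have to visit every record and pick one
field out of each.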
Document-Oriented:
A document-oriented NoSQL DB stores and retrieves data as a key-value pair, but the
value part is stored as a document. The document is stored in JSON or XML format.
The value is understood by the DB and can be queried.
A relational database stores data in rows and columns, whereas a document database
stores records whose structure is similar to JSON. For a relational database, you
have to know in advance what columns you have and so on. For a document database,
you store data as JSON-like objects and do not need to define the structure
beforehand, which makes it flexible.
The document type is mostly used for CMS systems, blogging platforms, real-time
analytics & e-commerce applications. It should not be used for complex transactions
which require multiple operations or queries against varying aggregate structures.
Amazon SimpleDB, CouchDB, MongoDB, Riak and Lotus Notes are popular
document-oriented DBMS systems.
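The sketch below illustrates the idea that the value is a document the database
can look inside. It uses plain Python and the standard json module; the keys,
field names, and the helper find_by_tag are hypothetical and do not correspond to
the query API of any of the products named above.

```python
import json

# Each value is a JSON document; the two documents need not share a structure.
documents = {
    "post:1": json.dumps(
        {"title": "Intro to NoSQL", "tags": ["nosql", "db"], "author": {"name": "Asha"}}
    ),
    "post:2": json.dumps({"title": "Graph basics", "tags": ["graph"]}),
}


def find_by_tag(docs, tag):
    """Return the keys of all documents whose "tags" field contains `tag`."""
    return sorted(
        key for key, raw in docs.items() if tag in json.loads(raw).get("tags", [])
    )


nosql_posts = find_by_tag(documents, "nosql")
```

Note that "post:2" has no author field at all, and nothing had to be declared in
advance for either document.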
Graph-Based
A graph database stores entities as well as the relations amongst those entities.
Each entity is stored as a node and each relationship as an edge. An edge gives a
relationship between nodes. Every node and edge has a unique identifier.
Graph databases are mostly used for social networks, logistics and spatial data.
Neo4J, Infinite Graph, OrientDB, FlockDB are some popular graph-based databases.
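A tiny social network shows how nodes and edges with unique identifiers can be
stored and traversed. The sketch is plain Python with made-up names and a
hypothetical FRIEND_OF relationship, treated as undirected for simplicity; it is
not the data model or query language of any graph database listed above.

```python
from collections import deque

# Nodes and edges each carry a unique identifier.
nodes = {1: {"name": "Alice"}, 2: {"name": "Bob"}, 3: {"name": "Carol"}}
edges = {
    "e1": (1, 2, "FRIEND_OF"),  # (source node, target node, relationship)
    "e2": (2, 3, "FRIEND_OF"),
}


def neighbours(node_id):
    """Return the nodes directly connected to `node_id`, in either direction."""
    linked = set()
    for source, target, _ in edges.values():
        if source == node_id:
            linked.add(target)
        elif target == node_id:
            linked.add(source)
    return linked


def hops_between(start, goal):
    """Breadth-first search: number of edges on a shortest path, or None."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, distance = queue.popleft()
        if node == goal:
            return distance
        for nxt in neighbours(node):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, distance + 1))
    return None


alice_to_carol = hops_between(1, 3)
```

Questions such as "how many steps separate two people" are traversals of this
kind, which is why graph stores suit social networks.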
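Document store databases support more complex queries because they understand the
value in a key-value pair. For example, CouchDB allows defining views with MapReduce.
CAP Theorem
The CAP theorem states that a distributed data store cannot guarantee all three of
the following properties at the same time:
1. Consistency
2. Availability
3. Partition Tolerance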
Consistency:
The data should remain consistent even after the execution of an operation. This
means once data is written, any future read request should return that data. For
example, after updating the order status, all the clients should be able to see the
same data.
Availability:
The database should always be available and responsive. It should not have any
downtime.
Partition Tolerance:
Partition Tolerance means that the system should continue to function even if the
communication among the servers is not stable. For example, the servers can be
partitioned into multiple groups which may not communicate with each other. Here,
if part of the database is unavailable, other parts are always unaffected.
Eventual Consistency
The term “eventual consistency” refers to keeping copies of data on multiple
machines to get high availability and scalability. Changes made to a data item on
one machine therefore have to be propagated to the other replicas, and until that
happens different replicas may return different values.
BASE stands for Basically Available, Soft state, Eventual consistency:
Basically Available means the DB is available all the time, as per the CAP theorem
Soft state means that the system state may change even without an input
Eventual consistency means that the system will become consistent over
time
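The behaviour described above can be simulated with a few in-memory replicas. This
is a deliberately simplified sketch: the classes Replica and Cluster, the explicit
propagate step, and the order key are assumptions for illustration, and real
systems propagate changes in the background and must also resolve conflicting
writes.

```python
class Replica:
    """One copy of the data."""

    def __init__(self):
        self.data = {}


class Cluster:
    """Writes land on replica 0 first and reach the others only on propagate()."""

    def __init__(self, replica_count):
        self.replicas = [Replica() for _ in range(replica_count)]
        self.pending = []  # (replica index, key, value) updates not yet delivered

    def write(self, key, value):
        self.replicas[0].data[key] = value
        for i in range(1, len(self.replicas)):
            self.pending.append((i, key, value))

    def read(self, replica_index, key):
        return self.replicas[replica_index].data.get(key)

    def propagate(self):
        """Deliver all pending updates, making every replica consistent."""
        for i, key, value in self.pending:
            self.replicas[i].data[key] = value
        self.pending.clear()


cluster = Cluster(3)
cluster.write("order:7", "shipped")
stale = cluster.read(2, "order:7")  # replica 2 has not received the update yet
cluster.propagate()
fresh = cluster.read(2, "order:7")  # after propagation, all replicas agree
```

Every replica answers reads at all times, which keeps the system available, but a
read between the write and the propagation can return out-of-date data.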
Advantages of NoSQL
Disadvantages of NoSQL
No standardization rules
Limited query capabilities
RDBMS databases and tools are comparatively mature
Many NoSQL systems do not offer traditional database capabilities, such as
consistency when multiple transactions are performed simultaneously.
When the volume of data increases, it becomes difficult to maintain unique
keys
Doesn’t work as well with relational data
The learning curve is steep for new developers
Many options are open source, which makes them less popular with enterprises.