
Dw Midterms Notes

The document provides an overview of data warehousing, including its architecture, components, and key characteristics such as being subject-oriented, integrated, time-variant, and non-volatile. It discusses the differences between data warehouses, databases, data lakes, and data marts, as well as the benefits and challenges associated with data warehousing. Additionally, it outlines the types of data warehouses, their uses, implementation steps, and best practices for effective data management.

Uploaded by

Livia Swift
Copyright
© © All Rights Reserved

INTRODUCTION TO DATA WAREHOUSING AND MANAGEMENT

Data Warehouse/Enterprise Data Warehouse (EDW)
− system that aggregates data from different (heterogeneous) sources into a single, central, consistent data store to support data analysis, data mining, artificial intelligence (AI), and machine learning.
− enables an organization to run powerful analytics on huge volumes (petabytes) of historical data in ways that a standard database cannot.
− core of the BI system which is built for data analysis and reporting.

KEY CHARACTERISTICS OF DATA WAREHOUSE

1. Subject-Oriented
- it provides topic-wise information rather than a business's overall processes (sales, promotion, inventory, etc.)
- ex: analyzing a company’s sales data needs a data warehouse that concentrates on sales.

2. Integrated
- developed by integrating data from varied sources into a consistent format.
- the data must be stored in the warehouse in a consistent and universally acceptable manner in terms of naming, format, and coding for effective analysis.

3. Time-Variant
- data stored in a data warehouse is documented with an element of time (explicitly or implicitly)
- ex: the Primary Key, which must have an element of time like the day, week, or month.

4. Non-Volatile
- data once entered into a data warehouse must remain unchanged, thus all data is read-only.
- previous data is not erased when current data is entered, helping in analyzing what has happened and when.

A SHORT HISTORY OF DATA WAREHOUSE ARCHITECTURE
• built around a relational database system, either on-premise or in the cloud, where data is both stored and processed.
• other components would include metadata management and an API connectivity layer allowing the warehouse to pull data from organizational sources and provide access to analytics and visualization tools.
• typical data warehouse has four main components: a central database, ETL tools, metadata, and access tools.
• born in the 1980s, it addressed the need for optimized analytics on data. As companies’ business applications began to grow and generate/store more data, they needed a system that could both manage the data and analyze it.
• at a high level, database admins could pull data from their operational systems and add a schema to it via transformation before loading it into their data warehouse
• metadata became important when data warehouse architecture evolved and grew in popularity: more people within a company started using it to access data, and the data warehouse made it easy to do so with structured data
• reporting and dashboarding became a key use case, and SQL (structured query language) became the de facto way of interacting with that data.

COMPONENTS OF DATA WAREHOUSE ARCHITECTURE

ETL (Extract, Transform, Load)
− moving of data from a data source into data warehouse
− converts data into a usable format so that once it’s in the data warehouse, it can be analyzed/queried/etc.

Metadata
− data about data
− describes all of the data that are stored in a system to make it searchable (authors, dates or locations of an article, create date of a file, the size of a file, etc)
− allows data to be organized to make it usable, so it can be analyzed to create dashboards and reports

SQL Query Processing
− SQL is the de facto standard language for querying data; the language that analysts use to pull out insights from their data stored in the data warehouse.

Data Layer
− the access layer that allows users to actually get to the data
− typically where data mart is.
− partitions segments of your data out depending on who you want to give access to, so you can get very granular across your organization

Governance and Security
− related to the data layer in that you need to be able to provide fine-grained access and security policies across all of your organization’s data

DATA WAREHOUSE SYSTEM
• Decision Support System (DSS)
• Executive Information System
• Management Information System
• Business Intelligence Solution
• Analytic Application
• Data Warehouse

HOW DATA WAREHOUSE WORKS?
− It works as a central repository where information arrives from one or more data sources.
− data flows into a data warehouse from the transactional system and other relational databases
− data is processed, transformed, and ingested so that users can access the processed data in the Data Warehouse through Business Intelligence tools, SQL clients, and spreadsheets
− data warehouse merges information coming from different sources into one comprehensive database so an organization can analyze its customers more holistically while considering all the information available.
− data may be:
• Structured - generally stored in tables in the form of rows and columns (relational data)
• Semi-Structured - organized up to some extent only and the rest is unstructured (XML/RDF)
• Unstructured - unprocessed and unorganized data (Text files, Emails, Media logs)
− data warehousing makes data mining (looking for patterns in the data that may lead to higher sales and profits) possible
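
The ETL flow described in these notes (extract from a source, transform into a consistent format, load into the warehouse) can be sketched in a few lines of Python. This is only an illustrative sketch, not how any particular warehouse product works: the source rows, the `sales_fact` table, and the function names are all hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical source rows, as they might arrive from different systems
# with inconsistent naming and formatting.
SOURCE_ROWS = [
    {"customer": " Alice ", "amount": "120.50", "sold_on": "2024-01-05"},
    {"customer": "BOB", "amount": "80", "sold_on": "2024-01-06"},
    {"customer": "alice", "amount": "19.50", "sold_on": "2024-02-01"},
]


def extract():
    """Extract: read raw rows from the (hypothetical) source."""
    return list(SOURCE_ROWS)


def transform(rows):
    """Transform: normalize customer names and convert amounts to numbers."""
    return [
        (row["customer"].strip().title(), float(row["amount"]), row["sold_on"])
        for row in rows
    ]


def load(conn, rows):
    """Load: append the cleaned rows to the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_fact "
        "(customer TEXT, amount REAL, sold_on TEXT)"
    )
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl():
    """Run the three steps against an in-memory stand-in for the warehouse."""
    conn = sqlite3.connect(":memory:")
    load(conn, transform(extract()))
    return conn


conn = run_etl()
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM sales_fact "
    "GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)
```

Each row keeps its `sold_on` date and is only ever appended, which loosely mirrors the time-variant and non-volatile characteristics listed in the notes.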
DATA WAREHOUSE VS. DATABASE, DATA LAKE, AND DATA MART

Data Warehouse
– gathers raw data from multiple sources into a central repository, structured using predefined schemas designed for data analytics

Data Lake
– centralized repository designed to store, process, and secure large amounts of structured, semi-structured, and unstructured data (without predefined schemas)
– can store data in its native format and process any variety of it, ignoring size limits; compared to a hierarchical data warehouse, which stores data in files or folders, a data lake uses a flat architecture and object storage to store the data
– commonly built on big data platforms such as Apache Hadoop
o designed to capture raw data (structured, semi-structured, and unstructured)
o made for large amounts of data
o used for ML and AI in its current state or for analytics with processing
o can organize and put into databases or DW

Data Mart
– subset of a data warehouse that contains data specific to a particular business line or department
– enables a department or business line to discover more-focused insights more quickly than possible when working with the broader data warehouse data set
– built from an existing data warehouse (or other data sources) through a complex procedure that involves multiple technologies and tools to design and construct a physical database, populate it with data, and set up intricate access and management protocols

Database
– an organized collection of structured information, or data, typically stored electronically in a computer system
– usually controlled by a database management system (DBMS)
– used to store and manage large amounts of structured and unstructured data, and can be used to support a wide range of activities, including data storage, data analysis, and data management

❖ Relational Database
o designed to capture and record data (OLTP)
o live, real-time data
o data stored in tables with rows and columns
o data is highly detailed
o flexible schema (how data is organized)

TYPES OF DATA WAREHOUSES

Cloud Data Warehouse
− specifically built to run in the cloud, and it is offered to customers as a managed service
− seeks to reduce on-premises data center footprint
− the physical data warehouse infrastructure is managed by the cloud company

Data Warehouse Software (on-premise/license)
– business can purchase a data warehouse license and then deploy a data warehouse on their own on-premises infrastructure
– expensive but might be a better choice for government entities, financial institutions, or other organizations that want more control over their data or need to comply with strict security or data privacy standards or regulations

Data Warehouse Appliance
– pre-integrated bundle of hardware and software (CPUs, storage, operating system, and data warehouse software) that a business can connect to its network and start using as-is
– sits somewhere between cloud and on-premises implementations in terms of upfront cost, speed of deployment, ease of scalability, and management control

BENEFITS OF A DATA WAREHOUSE
1. Better data quality: A data warehouse centralizes data from a variety of data sources, such as transactional systems, operational databases, and flat files. It then cleanses it, eliminates duplicates, and standardizes it to create a single source of the truth.
2. Faster business insights: Data warehouses enable data integration, allowing business users to leverage all of a company’s data in each business decision.
3. Smarter decision-making: Data warehouse supports large-scale BI functions such as data mining (finding unseen patterns and relationships in data), artificial intelligence, and machine learning.
4. Gaining and growing competitive advantage: All of the above combine to help an organization find more opportunities.

CHALLENGES WITH DATA WAREHOUSE ARCHITECTURE
• as companies start housing more data and needing more advanced analytics and a wide range of data, the data warehouse starts to become expensive and not so flexible
• open data lakehouse allows you to run warehouse workloads on all kinds of data in an open and flexible architecture (instead of a tightly coupled system it is much more flexible and also can manage unstructured and semi-structured data like photos, videos, IoT data etc)
• data lakehouse can also support your data science, ML and AI workloads in addition to your reporting and dashboarding workloads (to upgrade from data warehouse architecture, developing an open data lakehouse is the way to go)
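
Returning to the data mart described in these notes (a subset of the warehouse scoped to one department), the idea can be illustrated with a minimal Python sketch. It is a deliberate simplification: the notes describe a data mart as a separately built physical database, whereas here a plain SQL view over an in-memory SQLite table stands in for it. The table `warehouse_facts`, the view `sales_mart`, and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical warehouse table holding facts for every department.
conn.execute(
    "CREATE TABLE warehouse_facts (department TEXT, metric TEXT, value REAL)"
)
conn.executemany(
    "INSERT INTO warehouse_facts VALUES (?, ?, ?)",
    [
        ("sales", "revenue", 1000.0),
        ("sales", "orders", 40.0),
        ("finance", "expenses", 700.0),
    ],
)

# A view restricted to one department plays the role of a simple data mart.
conn.execute(
    "CREATE VIEW sales_mart AS "
    "SELECT metric, value FROM warehouse_facts WHERE department = 'sales'"
)

sales_rows = conn.execute(
    "SELECT metric, value FROM sales_mart ORDER BY metric"
).fetchall()
print(sales_rows)
```

Users of `sales_mart` see only the sales rows, which is the narrower, department-specific access that the notes attribute to a data mart.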

TYPES OF DATA WAREHOUSE

1. ENTERPRISE DATA WAREHOUSE (EDW)
– a centralized warehouse that provides decision support service across the enterprise
– also provides the ability to classify data according to the subject and give access according to divisions
2. OPERATIONAL DATA STORE
– a data store required when neither the data warehouse nor OLTP systems support an organization's reporting needs
3. DATA MART
– subset of the data warehouse
– specially designed for a particular line of business, such as sales or finance. Data can be collected directly from sources

STAGES OF USE OF THE DATA WAREHOUSE
➢ OFFLINE OPERATIONAL DATABASE
– data is just copied from an operational system to another system (loading, processing, and reporting of the copied data do not impact the operational system’s performance)

➢ OFFLINE DATA WAREHOUSE
– data in the data warehouse is regularly updated from the operational database
– data in the data warehouse is mapped and transformed to meet the data warehouse objectives

➢ REAL TIME DATA WAREHOUSE
– data warehouses are updated whenever any transaction takes place in the operational database
– ex: Airline or railway booking system

➢ INTEGRATED DATA WAREHOUSE
– data warehouses are updated continuously when the operational system performs a transaction (the data warehouse then generates transactions which are passed back to the operational system)

FOUR COMPONENTS OF DATA WAREHOUSE
Load Manager (the front component)
– performs all the operations associated with the extraction and load of data into the warehouse.
– prepares the data for entering into the DW

Warehouse Manager
– performs operations associated with the management of the data in the warehouse.
– performs operations like analysis of data to ensure consistency, creation of indexes and views, generation of denormalization and aggregations, transformation and merging of source data, and archiving and backing up data

Query Manager (backend component)
– performs all the operations related to the management of user queries.

End-User Access Tools
– categorized into five different groups:
1. Data reporting
2. Query Tools
3. Application Development tools
4. Executive Information System (EIS) tools
5. OLAP

WHO NEEDS DATA WAREHOUSE?
• Decision makers who rely on mass amounts of data
• Users who use customized, complex processes to obtain information from multiple data sources
• People who want simple technology to access the data
• People who want a systematic approach for making decisions
• Users who want a huge amount of data which is a necessity for reports, grids or charts
• DW is a first step if you want to discover ‘hidden patterns’ or data-flows and groupings

WHAT IS A DATA WAREHOUSE USED FOR?
✓ Airline
✓ Banking
✓ Healthcare
✓ Public Sector
✓ Investment and insurance sector
✓ Retail chain
✓ Telecommunication
✓ Hospitality industry

STEPS TO IMPLEMENT DATA WAREHOUSE
a) ENTERPRISE STRATEGY
Identify technical aspects, including the current architecture and tools, as well as facts, dimensions, and attributes (data mapping and transformation)

b) PHASED DELIVERY
Data warehouse implementation should be phased based on subject areas. Related business entities like booking and billing should be first implemented and then integrated with each other

c) ITERATIVE PROTOTYPING
Data warehouse should be developed and tested iteratively

BEST PRACTICES TO IMPLEMENT A DATA WAREHOUSE
o The data warehouse must be well integrated, well defined and time stamped.
o While designing the data warehouse, make sure you use the right tool, stick to the life cycle, take care of data conflicts, and be ready to learn from your mistakes.
o Never replace operational systems and reports.
o Don’t spend too much time on extracting, cleaning and loading data.
o Ensure to involve all stakeholders including business personnel in the data warehouse implementation process.
o Prepare a training plan for the end users.

ADVANTAGES OF DATA WAREHOUSE
▪ Allows business users to quickly access critical data from multiple sources all in one place.
▪ Provides consistent information on various cross-functional activities.
▪ Helps to integrate many sources of data to reduce stress on the production systems.
▪ Helps reduce total turnaround time for analysis and reporting.
▪ Restructuring and integration make it easier for the user to use for reporting and analysis.
▪ Stores a large amount of historical data which helps users to analyze different time periods and trends to make future predictions

DISADVANTAGES OF DATA WAREHOUSE
• Not an ideal option for unstructured data.
• Creation and implementation of DW is surely a time-consuming affair.
• Can be outdated relatively quickly.
• Difficult to make changes in data types and ranges, data source schema, indexes, and queries.
• May seem easy, but actually, it is too complex for the average users.
• Sometimes warehouse users will develop different business rules
DATA WAREHOUSE TOOLS
MarkLogic
- useful data warehousing solution that makes data
integration easier and faster using an array of
enterprise features. This tool helps to perform very complex
search operations. It can query different types of data
like documents, relationships, and metadata

Oracle
- the industry-leading database. It offers a wide range
of data warehouse solutions, both on-premises
and in the cloud

Amazon Redshift
- a simple and cost-effective tool to analyze all types of
data using standard SQL and existing BI tools
INTRODUCTION TO
DATABASE
RGRAFIA
DATA AND ITS MANAGEMENT

• Basics of Data

• Database Systems

• Database Architecture

• Data Management
INTRODUCTION TO DATABASE: ITS CONCEPT
• When data is stored in Database Systems, it can be stored in any format. Data can be
presented in either a structured or unstructured format. The complex combination of
structured and unstructured data sets is known as Big Data.
• Due to the 3V’s (Volume, Velocity, Variety) of Big Data, traditional technologies and
methods can’t be used to analyze them.
• Database Systems have been developed to address the issues of Big Data
WHAT IS DATABASE SYSTEMS OR DBMS?

• A Database System, or Database Management System (DBMS), is software
that handles the collection of electronic and digital records in order to
store that information and extract useful information from it.
• The purpose of a standard database is to store and retrieve data.
Databases, such as Standard Relational Databases, are specifically
designed to store and process structured data.
• Generally, databases store data in tables and use Structured Query
Language (SQL) to access the data in these tables.
• Databases and Database Systems play a vital role in processing hard, fast
and diverse datasets. Without a Database Management System, businesses
won’t receive valuable insights and deep analytics.
• In the Database environment, data is accessed, modified, controlled, and
then presented into a well-organized form, allowing the business
corporations to execute multiple data-processing operations.
• The data is usually organized in the form of rows and columns to minimize
the workload pressure and achieve accurate results instantly.
• Different types of data that can be stored, processed, or retrieved in
Database Management System include numerical, time series, textual and
binary data.
LANGUAGES SUPPORTED BY DATABASE SYSTEMS
• Database Systems comprise specific languages that are used by operators,
programmers and end-users to issue Database queries and updates.
• There are generally 4 types of Database Languages:
• Data Definition Language (DDL)
• Data Control Language (DCL)
• Data Manipulation Language (DML)
• Transaction Control Language (TCL)
• Data Definition Language (DDL)
• It is also called Data Description Language and is used to describe data
structures, that is, to create and modify the objects that hold the data.
SQL commands and statements like Create, Alter, Drop, Truncate, Rename,
and Comment are used to form the pattern of the Database.
• Data Control Language (DCL)
• DCL commands include Grant and Revoke, which give and withdraw access
privileges on database objects. These statements play an essential role in
describing the ‘‘Rights & Permissions’’ across the Database system.
• Data Manipulation Language (DML)
• DML commands include Select, Insert, Update, Delete, Merge and Call.
These are used to access and manipulate data in the Database. These
statements are commonly meant for handling user requests.
• Transaction Control Language (TCL)
• TCL is used to handle all the transactions within Database Systems. TCL
commands include Commit, Rollback and SavePoint.
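
As a rough illustration of three of these language groups, the sketch below runs DDL, DML, and TCL statements through Python's built-in `sqlite3` module. The `student` table and its rows are made up for the example. DCL is left out because SQLite does not implement GRANT and REVOKE; those statements would need a server DBMS such as the ones listed later in this document.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# BEGIN / COMMIT / ROLLBACK statements below are issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)

# DDL: define the structure of the data.
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")

# DML + TCL: insert a row and make it permanent with COMMIT.
conn.execute("BEGIN")
conn.execute("INSERT INTO student VALUES (1, 'Ana')")
conn.execute("COMMIT")

# DML + TCL: insert another row, then undo it with ROLLBACK.
conn.execute("BEGIN")
conn.execute("INSERT INTO student VALUES (2, 'Ben')")
conn.execute("ROLLBACK")

# DML: query the data; only the committed row remains.
names = [row[0] for row in conn.execute("SELECT name FROM student")]
print(names)
```

The rolled-back insert never becomes visible, which is the behavior the TCL commands exist to control.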
DATABASE SYSTEMS LANGUAGE EXAMPLES
• SQL: SQL unifies data definition, data manipulation, and querying in a single language. It
was one of the earliest commercial languages for the relational paradigm, albeit it differs
in some ways from Codd’s description (for example, rows and columns in a table can be
sorted).
• OQL: It is an object model language standard (developed by the Object Data Management
Group). It inspired the design of various subsequent query languages, such as JDOQL and
EJB QL.
• XQuery: XQuery is a standard XML query language that is supported by XML database
systems like MarkLogic and eXist, relational databases with XML capabilities like Oracle
and Db2, and in-memory XML processors like Saxon.
TYPES OF DATABASE SYSTEMS

• There are 4 main types of Database Systems:


• Hierarchical Database System
• Network Database System
• Relational Database System
• Object-Oriented Database System
INTRODUCTION OF ER MODEL
The Entity Relationship Model is a model for identifying entities (like student, car or
company) to be represented in the database and representation of how those entities are
related. The ER data model specifies enterprise schema that represents the overall logical
structure of a database graphically.

Several steps are followed in designing a database for an application:


• Gather the requirements (functional and data) by asking questions to the database users.

• Do a logical or conceptual design of the database. This is where ER model plays a role. It is
the most used graphical representation of the conceptual design of a database.
• Do the physical database design (like indexing) and the external design (like views).
WHY USE ER DIAGRAMS IN DBMS?

• ER diagrams represent the E-R model in a database, making them
easy to convert into relations (tables).
• ER diagrams model real-world objects, which makes them
especially useful.
• ER diagrams require no technical knowledge of the underlying DBMS
used.
• It gives a standard solution for visualizing the data logically.
SYMBOLS USED IN ER MODEL

ER Model is used to model the logical view of the system from a data
perspective which consists of these symbols:
• Rectangles: Rectangles represent Entities in the ER Model.
• Ellipses: Ellipses represent Attributes in the ER Model.
• Diamond: Diamonds represent Relationships among Entities.
• Lines: Lines link attributes to entities and entity sets to
relationship types.
• Double Ellipse: Double Ellipses represent Multi-Valued Attributes.
• Double Rectangle: Double Rectangle represents a Weak Entity.
COMPONENTS OF ER DIAGRAM
• ER Model consists of Entities, Attributes, and Relationships among Entities in a
Database System.
WHAT IS AN ENTITY?

• An Entity may be an object with a physical existence (a particular
person, car, house, or employee) or it may be an object with a
conceptual existence (a company, a job, or a university course).
TYPES OF ENTITY
1. Strong Entity
• A Strong Entity is a type of entity that has a key Attribute. Strong Entity does not
depend on other Entity in the Schema. It has a primary key, that helps in identifying
it uniquely, and it is represented by a rectangle. These are called Strong Entity
Types.
2. Weak Entity
• An entity type normally has a key attribute that uniquely identifies each entity in the
entity set. However, some entity types exist for which a key attribute cannot be defined.
These are called Weak Entity Types.
A weak entity type is represented by a Double Rectangle. The participation of weak
entity types is always total. The relationship between the weak entity type and its
identifying strong entity type is called identifying relationship and it is represented by
a double diamond.
WHAT ARE ATTRIBUTES?
• Attributes are the properties that define the entity type.
• For example, Roll_No, Name, DOB, Age, Address, and Mobile_No
are the attributes that define entity type Student. In ER diagram,
the attribute is represented by an oval.
TYPES OF ATTRIBUTES
1. Key Attribute
• The attribute which uniquely identifies each entity in the entity
set is called the key attribute. For example, Roll_No will be unique
for each student. In an ER diagram, the key attribute is represented by
an oval with the attribute name underlined.
2. Composite Attribute
• An attribute composed of many other attributes is called a composite attribute.
For example, the Address attribute of the student Entity type consists of Street,
City, State, and Country. In an ER diagram, the composite attribute is represented
by an oval connected to the ovals of its component attributes.
3. Multivalued Attribute
• An attribute consisting of more than one value for a given entity. For example,
Phone_No (can be more than one for a given student). In ER diagram, a
multivalued attribute is represented by a double oval.
4. Derived Attribute
• An attribute that can be derived from other attributes of the entity type is
known as a derived attribute. e.g.; Age (can be derived from DOB). In ER
diagram, the derived attribute is represented by a dashed oval.
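
One way to make the four attribute types concrete is to model the Student entity type in code. The sketch below is only an illustration using Python dataclasses, not part of ER notation: the class and field names follow the examples in these notes, while the sample values and the `age_on` helper are assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Address:
    """Composite attribute: made up of several simpler attributes."""
    street: str
    city: str
    state: str
    country: str


@dataclass
class Student:
    roll_no: int                      # key attribute: unique per student
    name: str
    dob: date
    address: Address                  # composite attribute
    phone_nos: list[str] = field(default_factory=list)  # multivalued attribute

    def age_on(self, today: date) -> int:
        """Derived attribute: age is computed from dob, not stored."""
        had_birthday = (today.month, today.day) >= (self.dob.month, self.dob.day)
        return today.year - self.dob.year - (0 if had_birthday else 1)


student = Student(
    roll_no=1,
    name="Ana",
    dob=date(2000, 6, 15),
    address=Address("1 Main St", "Manila", "NCR", "Philippines"),
    phone_nos=["555-0100", "555-0101"],
)
print(student.age_on(date(2024, 6, 14)))
```

Age is exposed as a method rather than a stored field because a derived attribute can always be recomputed from the attributes it depends on.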
THE COMPLETE ENTITY TYPE STUDENT WITH ITS ATTRIBUTES
CAN BE REPRESENTED AS:
RELATIONSHIP TYPE AND RELATIONSHIP SET
• A Relationship Type represents the association between entity types. For example,
‘Enrolled in’ is a relationship type that exists between entity type Student and
Course. In ER diagram, the relationship type is represented by a diamond and
connecting the entities with lines.
DEGREE OF A RELATIONSHIP SET
• The number of different entity sets participating in a relationship set is called
the degree of a relationship set.
1. Unary Relationship: When there is only ONE entity set participating in a relation,
the relationship is called a unary relationship. For example, one person is married to
only one person.
2. Binary Relationship: When there are TWO entity sets participating in a
relationship, the relationship is called a binary relationship. For example, a Student is
enrolled in a Course.
3. Ternary Relationship: When there are three entity sets participating in a
relationship, the relationship is called a ternary relationship.
4. N-ary Relationship: When there are n entity sets participating in a relationship,
the relationship is called an n-ary relationship.
WHAT IS CARDINALITY?
• The number of times an entity of an entity set participates in a relationship set is
known as cardinality . Cardinality can be of different types:
• 1. One-to-One: When each entity in each entity set can take part only once in the
relationship, the cardinality is one-to-one. Let us assume that a male can marry one
female and a female can marry one male. So the relationship will be one-to-one.

the total number of tables that can be used in this is 2.


2. One-to-Many: In a one-to-many mapping, an entity in one entity set can be related to
more than one entity in the other set, and the total number of tables that can be used in
this is 2. Let us assume that one surgeon department can accommodate many doctors. So the
Cardinality will be 1 to M. It means one department has many Doctors.

the total number of tables that can be used in this is 2.


3. Many-to-One: When entities in one entity set can take part only once in the
relationship set and entities in other entity sets can take part more than once in the
relationship set, cardinality is many to one. Let us assume that a student can take only
one course but one course can be taken by many students. So the cardinality will be n
to 1. It means that for one course there can be n students but for one student, there
will be only one course.

the total number of tables that can be used in this is 3.


4. Many-to-Many: When entities in all entity sets can take part more than once in the
relationship, the cardinality is many-to-many. Let us assume that a student can take more
than one course and one course can be taken by many students. So the relationship
will be many-to-many.

the total number of tables that can be used in this is 3.
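
The three-table layout for a many-to-many relationship can be sketched as follows: one table per entity set plus a junction table for the relationship. This is an illustrative mapping of the Student / Course example, run against in-memory SQLite. The table names, column names, and sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course (course_id INTEGER PRIMARY KEY, title TEXT);
    -- Junction table: one row per (student, course) enrollment.
    CREATE TABLE enrolled_in (
        student_id INTEGER REFERENCES student(student_id),
        course_id INTEGER REFERENCES course(course_id),
        PRIMARY KEY (student_id, course_id)
    );
    INSERT INTO student VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO course VALUES (10, 'Databases'), (20, 'Statistics');
    INSERT INTO enrolled_in VALUES (1, 10), (1, 20), (2, 10);
    """
)

# Join through the junction table to list who is enrolled in what.
rows = conn.execute(
    """
    SELECT s.name, c.title
    FROM enrolled_in AS e
    JOIN student AS s ON s.student_id = e.student_id
    JOIN course AS c ON c.course_id = e.course_id
    ORDER BY s.name, c.title
    """
).fetchall()
print(rows)
```

Ana appears in two courses and Databases has two students, so neither side is limited to a single partner. That is why the relationship needs its own table.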


ACTIVITY #1. LOGICAL DATABASE DESIGN USING ENTITY
RELATIONSHIP DIAGRAM
The ER diagram of the Company has the following description:
• The Company has several departments.
• Each department may have several Locations.
• Departments are identified by a name, d_no, Location.
• A Manager controls a particular department.
• Each department is associated with a number of projects.
• Employees are identified by name, id, address, job, date_of_joining.
• An employee works in only one department but can work on several projects.
• We also keep track of the number of hours worked by an employee on a single project.
• Each employee has a dependent.
• A Dependent has D_name, Gender, and relationship.
QUESTIONS ?

10 popular database management systems (DBMS)


18/04/2023 Systems Databases

We have collected some of the most popular database management systems (DBMS) nowadays. Let’s start by defining what a database
management system is.

Table of contents

1 What is a database management system?

2 Popular database management systems

2.1 MySQL

2.2 MariaDB

2.3 Microsoft SQL Server

2.4 Oracle DBMS

2.5 PostgreSQL

2.6 MongoDB

2.7 Redis

2.8 IBM DB2

2.9 Elasticsearch

2.10 SQLite

3 Comparing database management systems

4 Top 10 database management systems

What is a database management system?


A database management system (DBMS) is software used to define, manipulate, retrieve, store and manage data in databases.

To sum up, database management systems are in charge of:

• Defining rules to validate and manipulate data.


• Interacting with databases, applications and end users.
• Retrieving, storing and analyzing data.
• Updating data.

Popular database management systems

MySQL

MySQL is a free, open source relational database management system (RDBMS). It was initially owned by MySQL AB, before being acquired by
Sun Microsystems (part of Oracle Corporation since 2010). MySQL was originally developed by Ulf Michael Widenius together with the Swedes
David Axmark and Allan Larsson, founders of MySQL AB.

Many database-driven web applications, such as WordPress, Joomla and phpBB, as well as many popular websites like MediaWiki, Twitter and
Facebook, use MySQL.

Developer: Oracle Corporation.

Original author: MySQL AB.

Latest MySQL release: MySQL 8.0.32.

MySQL license: GNU General Public License version 2 and proprietary.

MariaDB

MariaDB is a community-developed, free and open source relational database management system. It is a fork of MySQL. MariaDB was originally
developed by Ulf Michael Widenius, Swedes David Axmark and Allan Larsson, founders of MySQL AB and the MariaDB Foundation. Ulf Michael
Widenius is the current lead developer and CTO of MariaDB.

MariaDB is also included in numerous Linux distributions, such as CentOS, Debian and RHEL. Besides, it is used by many organizations such as
Wikipedia, Google or Tumblr.

Developer: MariaDB Corporation Ab and MariaDB Foundation.

Latest MariaDB release: MariaDB 11.1.0.

MariaDB license: GPL version 2.

Microsoft SQL Server

Microsoft SQL Server is a commercial relational database management system. It is available in multiple editions, divided into three main
categories: mainstream, specialized and discontinued editions.

Developer: Microsoft.

Latest Microsoft SQL Server release: Microsoft SQL Server 2022.

Microsoft SQL Server license: proprietary license.

Oracle DBMS

Oracle DBMS is a commercial, multi-model database management system. It is also known as Oracle Database or just Oracle. It is commonly used
for running: online transaction processing (OLTP) and data warehousing (DW).

Developer: Oracle Corporation.

Latest Oracle DBMS long-term release: Oracle DBMS 19c.

Latest Oracle DBMS release: Oracle DBMS 23c beta.

Oracle DBMS license: proprietary license.

PostgreSQL

PostgreSQL is a free, open source relational database management system (RDBMS). It was initially developed as a successor of the Ingres
database, developed at the University of California, Berkeley.

Developer: PostgreSQL Global Development Group.

Latest PostgreSQL release: PostgreSQL 15.2.

PostgreSQL license: PostgreSQL license.

MongoDB

MongoDB is an open source, NoSQL, document-oriented database management system. MongoDB Inc. offers an integrated suite of cloud
database services, as well as commercial support. This document-oriented database software is commonly used for high-volume data storage.

Developer: MongoDB Inc.

Latest MongoDB release: MongoDB 6.0.4.

MongoDB license: Server Side Public License (SSPL).

Redis

Redis, short for “Remote Dictionary Server”, is an open source, NoSQL, key-value database management system.

Developer: Redis.

Original author: Salvatore Sanfilippo.

Latest Redis release: Redis 7.0.

Redis license: BSD 3-clause.

IBM DB2

IBM DB2 is a database management product developed by IBM, formerly known as DB2 for Linux, UNIX and Windows.

Developer: IBM.

Latest IBM DB2 release: IBM DB2 11.5.8.

IBM DB2 license: proprietary license.

Elasticsearch

Elasticsearch is a distributed, RESTful search and analytics engine. It is based on the Lucene library. Elasticsearch is the successor to a previous
search engine called Compass, also designed by Shay Banon.

Developer: Elastic NV.

Original author: Shay Banon.

Latest Elasticsearch release: Elasticsearch 8.7.

Elasticsearch license: dual-licensed Elastic license and Server Side Public License.

SQLite

SQLite is a public domain database engine that belongs to the embedded, relational database management systems family. It has bindings to
many programming languages.

Developer: Dwayne Richard Hipp.

Latest SQLite release: SQLite 3.41.2.

SQLite license: Public domain.

Comparing database management systems

DBMS | Type | Operating systems | License | Written in
MySQL | RDBMS | Canonical, FreeBSD, Linux, MacOS, Solaris and Windows | GNU GPL v2 and proprietary | C and C++
MariaDB | RDBMS | Linux, MacOS and Windows | GNU GPL v2 | Bash, C, C++, and Perl
Microsoft SQL Server | RDBMS | Linux and Windows | Proprietary | C and C++
Oracle DBMS | Multi-model database management system | AIX, BS2000, HP-UX, Linux, MacOS and Windows | Proprietary | Assembly language, C and C++
PostgreSQL | RDBMS | FreeBSD, Linux, MacOS, OpenBSD and Windows | PostgreSQL license | C
MongoDB | Document-oriented database | FreeBSD, Linux, MacOS and Windows | Server Side Public License | C++, JavaScript and Python
Redis | Key-value database | Unix-like | BSD 3-clause | C
IBM DB2 | RDBMS | Linux, Unix-like and Windows | Proprietary | Assembly, C, C++ and Java
Elasticsearch | Search and index | Linux, MacOS and Windows | Dual-licensed Elastic license and Server Side Public License | Java
SQLite | RDBMS | Android, BSD, iOS, Linux, MacOS, Solaris, VxWorks and Windows | Public domain | C

Top 10 database management systems


Finally, according to DB-Engines ranking, as of April 2023*, these are the top 10 database management systems:

1. Oracle
2. MySQL
3. Microsoft SQL Server
4. PostgreSQL
5. MongoDB
6. Redis
7. IBM DB2
8. Elasticsearch
9. SQLite
10. Microsoft Access

*DB-Engines ranking is updated on a monthly basis.
