Maria DB Concepts
Objective
This document explains how a relational database deployed with MariaDB works. It is intended to
give a better understanding of the server when troubleshooting.
• Atomicity: Every transaction (a set of multiple statements) is treated as a "single unit", meaning
either the whole transaction succeeds or the whole transaction fails. It prevents partial updates
on the database.
• Consistency: Every change on the database has to bring the database from one consistent
state to another. Only the right data must be written, according to specific rules, constraints,
etc.
• Durability: Completed transactions (or their effects) are recorded in non-volatile storage.
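The atomicity property can be illustrated with a minimal sketch. It uses Python's built-in sqlite3
module as a stand-in for a MariaDB connection (an assumption made only so the example runs without
a server); the accounts table and the transfer helper are hypothetical names. The second UPDATE
violates a CHECK constraint, so the first UPDATE is rolled back with it.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction; return True on commit."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint failed: neither UPDATE is kept.
        return False

def balances(conn):
    """Return {name: balance} for every account."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()

failed = transfer(conn, "alice", "bob", 500)   # would overdraw alice
after_failure = balances(conn)                 # unchanged: no partial update
succeeded = transfer(conn, "alice", "bob", 30)
after_success = balances(conn)
```

The same pattern (one transaction around both statements, rollback on error) applies to a MariaDB
client, although the driver calls would differ.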
Figure 1: Tablespaces and data files
A tablespace is a set of physical files that store the tables and indexes of the database.
Each table and index is stored in one tablespace, which can be composed of one or more files (data
files). A data file is a single file that is part of a tablespace.
Tablespaces are useful because they allow data to be isolated on multiple disks or locations.
This can improve database performance, data management, data access control, and table and index
size management.
The default storage engine can be changed by setting the default_storage_engine variable.
A different default can be specified for temporary tables by setting default_tmp_storage_engine.
MariaDB uses Aria for system tables and temporary tables created internally to store the intermediate
results of a query.
Figure 2: MariaDB / MySQL InnoDB Architecture
InnoDB
An InnoDB primary key is equivalent to a clustered index, which means an InnoDB table is always
ordered by its primary key. If the table does not have a user-defined primary key, the first unique
index whose columns are all NOT NULL is used as the primary key. If there is no such index, InnoDB
creates a hidden clustered index on a 6-byte value that it adds to each row (invisible to users).
• For performance reasons, a primary key value should be inserted in order (the last value
should be the highest).
• A big primary key means that all secondary indexes are also big, because each secondary index
entry stores the primary key value.
• We shouldn’t explicitly include the primary key in a secondary index. If we do so, the primary
key column will be duplicated in the index.
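The effect of primary key width on secondary indexes can be sketched with simple arithmetic. This
is a deliberately rough model: it assumes every secondary index entry is just the indexed column
plus a copy of the primary key, and it ignores page and record overhead. The row count and column
widths are hypothetical.

```python
def secondary_index_bytes(rows, key_bytes, pk_bytes):
    """Lower-bound size of one secondary index.

    Each entry holds the indexed column(s) plus a copy of the primary key.
    Page headers, record headers and fill factor are ignored.
    """
    return rows * (key_bytes + pk_bytes)

ROWS = 1_000_000
KEY_BYTES = 8  # hypothetical width of the indexed column

small_pk = secondary_index_bytes(ROWS, KEY_BYTES, pk_bytes=4)   # e.g. an INT key
large_pk = secondary_index_bytes(ROWS, KEY_BYTES, pk_bytes=36)  # e.g. a 36-byte textual UUID

print(f"INT primary key:     {small_pk:,} bytes per secondary index")
print(f"36-byte primary key: {large_pk:,} bytes per secondary index")
```

Under these assumptions the wider key makes each secondary index several times larger, and the
cost is paid again for every additional secondary index on the table.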
Tablespaces
A tablespace is a set of files that contain the data of tables and indexes, stored in pages and
storage blocks. The types of tablespaces are:
• System tablespace: It is stored in the ibdata1 file. It contains information used by InnoDB
internally (rollback segments, system tables, etc.). User tables are created in the system
tablespace only if the innodb_file_per_table system variable is set to 0 when the table is
created. By default, innodb_file_per_table is 1.
• File-per-table tablespaces: They are the tablespaces associated to the user tables. These are
.ibd files.
• Temporary tablespaces: They store InnoDB temporary tables and are written to ibtmp* files.
Note: It is important to remember that tablespaces can never shrink. If a file-per-table tablespace
grows too much, deleting data won’t recover space. You can reclaim the space by running OPTIMIZE
TABLE on that table. OPTIMIZE TABLE will create a new identical empty table. Then it will copy
row by row data from the old table to the new one. In this process, a new .ibd tablespace will be
created and space will be reclaimed.
Transaction logs
These logs are used during crash recovery. The redo log is written to two files,
called ib_logfile0 and ib_logfile1, while the undo log is written by default to the system
tablespace, in the ibdata1 file. InnoDB transaction logs are written in a circular fashion: their
size is normally fixed, and when the end is reached, InnoDB continues to write from the beginning.
However, if very long transactions are running, InnoDB cannot overwrite the oldest data, so it has
to expand the log size instead.
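The circular write pattern can be sketched as a fixed-size ring buffer. This is a toy model, not
InnoDB's on-disk format: the CircularLog class and its fields are hypothetical, and the model
overwrites old bytes unconditionally, whereas a real log only reuses space that is no longer needed.

```python
class CircularLog:
    """Fixed-size log that continues from the beginning when the end is reached."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buf = bytearray(capacity)
        self.pos = 0    # offset of the next byte to write
        self.wraps = 0  # number of times the write position wrapped around

    def write(self, record: bytes):
        for byte in record:
            self.buf[self.pos] = byte
            self.pos += 1
            if self.pos == self.capacity:
                # End of the log reached: continue from the beginning.
                self.pos = 0
                self.wraps += 1

log = CircularLog(8)
log.write(b"AAAAAA")  # fills offsets 0-5
log.write(b"BBBB")    # fills offsets 6-7, wraps, then overwrites offsets 0-1
print(bytes(log.buf), log.pos, log.wraps)
```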
The most important server system variable is innodb_buffer_pool_size. The buffer pool should be
large enough to hold most of the active data set of your server, so that SQL requests can work
directly with data in the buffer pool cache. Several gigabytes of memory is a good starting point
if you have that RAM available.
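Why the pool should hold the active data set can be shown with a small cache simulation. The sketch
below uses a plain least-recently-used policy, which is a simplification and not the buffer pool's
actual replacement algorithm; PageCache, hit_ratio and the workload are hypothetical.

```python
from collections import OrderedDict

class PageCache:
    """Plain LRU page cache that counts hits and misses."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, page_id):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)  # mark as most recently used
            self.hits += 1
        else:
            self.misses += 1                 # would be a disk read
            self.pages[page_id] = True
            if len(self.pages) > self.capacity:
                self.pages.popitem(last=False)  # evict least recently used

def hit_ratio(capacity, accesses):
    """Fraction of accesses served from the cache."""
    cache = PageCache(capacity)
    for page_id in accesses:
        cache.read(page_id)
    return cache.hits / len(accesses)

# Hypothetical workload: an active set of 10 pages scanned 100 times.
workload = list(range(10)) * 100

fits = hit_ratio(10, workload)      # pool holds the whole active set
too_small = hit_ratio(5, workload)  # pool holds half of it
print(fits, too_small)
```

With this cyclic access pattern, a pool that is too small evicts each page just before it is needed
again, so almost every read goes to disk.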
In case of a system crash, hardware failure or power outage, a page could be half-written on
disk. For some pages, this causes a disaster. Therefore, InnoDB writes essential pages to disk
twice. A backup copy of the new page version is written first. Then, the old page is overwritten.
The backup copies are written into a file called the doublewrite buffer.
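The write-twice idea can be sketched as follows. This is a toy model under stated assumptions:
pages are tiny byte strings with a CRC32 checksum appended, the data file and the doublewrite
buffer are plain dictionaries, and a crash is simulated by writing only the first half of the new
page. All names are hypothetical.

```python
import zlib

PAGE_SIZE = 16  # toy page size in bytes, far smaller than a real page

def make_page(payload: bytes) -> bytes:
    """Pad the payload to PAGE_SIZE and append a 4-byte CRC32 checksum."""
    body = payload.ljust(PAGE_SIZE, b"\x00")
    return body + zlib.crc32(body).to_bytes(4, "big")

def is_valid(page: bytes) -> bool:
    """A page is valid when its stored checksum matches its body."""
    if len(page) != PAGE_SIZE + 4:
        return False
    body, stored = page[:PAGE_SIZE], page[PAGE_SIZE:]
    return zlib.crc32(body).to_bytes(4, "big") == stored

def write_page(doublewrite, datafile, page_no, page, crash_midway=False):
    doublewrite[page_no] = page  # 1. backup copy of the new version first
    if crash_midway:
        # Simulated torn write: only the first half reaches the data file.
        old = datafile[page_no]
        half = len(page) // 2
        datafile[page_no] = page[:half] + old[half:]
        return
    datafile[page_no] = page     # 2. then overwrite the old page

def recover(doublewrite, datafile):
    """Replace any corrupted page with its intact doublewrite copy."""
    for page_no, copy in doublewrite.items():
        if not is_valid(datafile[page_no]) and is_valid(copy):
            datafile[page_no] = copy

datafile = {0: make_page(b"old row")}
doublewrite = {}
new_page = make_page(b"new row data")

write_page(doublewrite, datafile, 0, new_page, crash_midway=True)
torn = not is_valid(datafile[0])  # half-written page fails its checksum
recover(doublewrite, datafile)
```

After recover runs, the data file holds the complete new page again, taken from the backup copy.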
Note: An InnoDB page is a fixed-size storage block (similar to Linux pages). Table data is
organized in pages, and pages are linked to other pages sequentially.
Aria
Even if we only create InnoDB tables, we use Aria indirectly, in two ways:
• For system tables.
• For internal temporary tables.
Aria is a non-transactional storage engine. By default it is crash-safe, meaning that all changes
to data are written and fsynced to a write-ahead log and can always be recovered in case of a crash.
Aria caches indexes into the pagecache. Data are not directly cached by Aria, so it’s important that
the underlying filesystem caches reads and writes.
Databases
MariaDB does not support the concept of schema. In MariaDB SQL, schema and schemas are
synonyms for database and databases. When a user connects to MariaDB, they don’t connect to a
specific database. Instead, they can access any table they have permissions for.
A database is a container for database objects like tables and views. A database serves the
following purposes:
• A database is a namespace.
• A database is a logical container to separate objects.
• A database has a default character set and collation, which are inherited by its tables.
• Permissions can be assigned on a whole database, to make permission maintenance simpler.
• Physical data files are stored in a directory which has the same name as the database to which
they belong.
System Databases
• mysql, for internal use. It should not be read or written directly.
• information_schema, which contains information about all databases.
• performance_schema, which contains information about the MariaDB runtime. It is disabled by
default. Enabling it requires setting the performance_schema system variable to 1 and
restarting MariaDB.
Plugins
MariaDB supports the use of plugins: software components that may be added to the core software
without having to rebuild the MariaDB server from source code. Plugins can be loaded at start-up,
or loaded and unloaded while the server is running without interruption. Storage engines are a
special type of plugin. Plugins are commonly used for adding desired storage engines, additional
security requirements, and logging special information about the server.
Thread Pool
If we don’t use the thread pool, MariaDB will use its traditional method to handle connections. It
consists of using a dedicated thread for each client connection. Creating a new thread has a cost in
terms of CPU time. To mitigate this cost, after a client disconnects, the thread may be preserved
for a certain time in the thread cache.
Handling too many connections
Systems that get too busy can return the "Too many connections" error. When the number of
connected threads exceeds the max_connections server variable, it is time to make a change. The
Threads_connected status variable shows only the current number of connections, but it is more
useful to see what the value has peaked at, which is shown by the Max_used_connections status
variable. This error may be a symptom of slow queries and other bottlenecks, but if the system is
running smoothly it can be addressed by increasing the value of max_connections.
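The comparison between current connections, peak connections and the configured limit can be
sketched as a small helper. It assumes the status and variable values have already been fetched
from the server and passed in as dictionaries of strings; the connection_report function, the
sample numbers and the 0.85 warning ratio are all hypothetical choices for illustration.

```python
def connection_report(status, variables, warn_ratio=0.85):
    """Compare current and peak connection counts with the configured limit."""
    limit = int(variables["max_connections"])
    current = int(status["Threads_connected"])
    peak = int(status["Max_used_connections"])
    ratio = peak / limit
    return {
        "current": current,
        "peak": peak,
        "limit": limit,
        "peak_ratio": round(ratio, 2),
        # True when the peak came close to (or reached) the limit.
        "near_limit": ratio >= warn_ratio,
    }

# Hypothetical sample values, not taken from a real server.
status = {"Threads_connected": "42", "Max_used_connections": "140"}
variables = {"max_connections": "151"}

report = connection_report(status, variables)
print(report)
```

A peak close to the limit on an otherwise healthy server suggests raising max_connections; on a
struggling server it is more likely a symptom of slow queries.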
• Static or Dynamic.
• Global or Session.