SQL

SQL (Structured Query Language) is a computer programming language designed for the retrieval and management of data in relational databases such as MySQL, MS Access, SQL Server, Oracle, Sybase, Informix, and PostgreSQL. SQL is not a database management system; it is a query language used to store and retrieve data from a database. In simple words, SQL is a language that communicates with databases.

SQL Basic Commands

Data Definition Language (DDL) is the part of SQL used to create and modify the structure of database objects, including tables, views, schemas, and indexes.

CREATE: Create a database or its objects (table, index, view, function, stored procedure, trigger).
Syntax: CREATE TABLE table_name (column1 data_type, column2 data_type, ...);

DROP: Delete objects from the database.
Syntax: DROP TABLE table_name;

ALTER: Alter the structure of the database.
Syntax: ALTER TABLE table_name ADD COLUMN column_name data_type;

TRUNCATE: Remove all records from a table, including the space allocated for them.
Syntax: TRUNCATE TABLE table_name;

COMMENT: Add a comment to the data dictionary.
Syntax: COMMENT ON TABLE table_name IS 'comment_text';

RENAME: Rename an existing database object.
Syntax: RENAME TABLE old_table_name TO new_table_name;
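For instance, a minimal DDL sequence might look like the following sketch; the employees table and its columns are hypothetical.

-- Create a hypothetical table, change its structure, then remove it.
CREATE TABLE employees (
    id     INT PRIMARY KEY,
    name   VARCHAR(100),
    salary DECIMAL(10, 2)
);

ALTER TABLE employees ADD COLUMN hire_date DATE;   -- add a column
RENAME TABLE employees TO staff;                   -- MySQL rename syntax
TRUNCATE TABLE staff;                              -- remove all rows
DROP TABLE staff;                                  -- remove the table itself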

Data Manipulation Language (DML) is the part of SQL used for adding, deleting, and modifying data in a database.
INSERT: Insert data into a table.
Syntax: INSERT INTO table_name (column1, column2, ...) VALUES (value1, value2, ...);

UPDATE: Update existing data within a table.
Syntax: UPDATE table_name SET column1 = value1, column2 = value2 WHERE condition;

DELETE: Delete records from a database table.
Syntax: DELETE FROM table_name WHERE condition;

LOCK: Control table concurrency.
Syntax: LOCK TABLE table_name IN lock_mode;

CALL: Call a PL/SQL or Java subprogram.
Syntax: CALL procedure_name(arguments);

EXPLAIN PLAN: Describe the access path to the data.
Syntax: EXPLAIN PLAN FOR SELECT * FROM table_name;
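As a quick sketch, reusing the hypothetical staff table from the DDL example above:

-- Insert a row, modify it, then remove it.
INSERT INTO staff (id, name, salary) VALUES (1, 'Asha', 55000.00);
UPDATE staff SET salary = 60000.00 WHERE id = 1;
DELETE FROM staff WHERE id = 1;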

Data Control Language (DCL) is the part of SQL used to control access to data stored in a database.

GRANT: Gives a privilege to a user.

REVOKE: Takes back privileges granted to a user.
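For example (the table and user name are hypothetical):

-- Grant read and insert access on a hypothetical table to a hypothetical user.
GRANT SELECT, INSERT ON staff TO report_user;

-- Take the INSERT privilege back.
REVOKE INSERT ON staff FROM report_user;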

Transactional Control Commands (TCL) are only used with DML commands such as INSERT, UPDATE, and DELETE. They cannot be used while creating or dropping tables, because those operations are automatically committed in the database. The following commands are used to control transactions.

COMMIT: Saves the changes.

ROLLBACK: Rolls back the changes.

SAVEPOINT: Creates points within a group of transactions to which you can ROLLBACK.

SET TRANSACTION: Places a name on a transaction.
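A sketch of how these fit together; the table, values, and savepoint name are hypothetical:

-- Commit an insert while undoing a later update via a savepoint.
BEGIN;                        -- START TRANSACTION in MySQL
INSERT INTO staff (id, name, salary) VALUES (2, 'Bilal', 48000.00);
SAVEPOINT after_insert;
UPDATE staff SET salary = 50000.00 WHERE id = 2;
ROLLBACK TO after_insert;     -- undo the UPDATE, keep the INSERT
COMMIT;                       -- make the INSERT permanent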

Data Integrity

The following categories of data integrity exist with each RDBMS −

 Entity Integrity − This ensures that there are no duplicate rows in a table.

 Domain Integrity − Enforces valid entries for a given column by restricting the type,
the format, or the range of values.

 Referential Integrity − Rows that are referenced by other records cannot be deleted.

 User-Defined Integrity − Enforces some specific business rules that do not fall into
entity, domain or referential integrity.
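In practice, these rules are enforced with constraints. A hedged sketch (the schema is hypothetical):

-- Constraints mapping to the integrity categories above.
CREATE TABLE departments (
    dept_id INT PRIMARY KEY                       -- entity integrity: unique rows
);

CREATE TABLE employees (
    emp_id  INT PRIMARY KEY,                      -- entity integrity
    salary  DECIMAL(10, 2) CHECK (salary > 0),    -- domain integrity: valid range
    dept_id INT REFERENCES departments(dept_id)   -- referential integrity
);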

Types of Backups in SQL

Full backup: When your backup software takes a full backup, it copies the entire dataset, regardless of whether any changes were made to the data. This type of backup is generally taken less frequently for practical reasons: it is time-consuming and takes up a large amount of storage space. Alternatives to full backups include differential and incremental backups.

BACKUP DATABASE database_name TO DISK = 'filepath';
GO

Incremental backup: An incremental backup copies only the data modified since the last backup. For example, if you took a full backup on Sunday, Monday's incremental backup would copy only the changes made since the Sunday backup. Tuesday's backup would copy only the changes made since the Monday backup.

Differential backup: A differential backup copies only the data added or changed since the last full backup. If your last full backup was on Sunday, a backup on Monday would copy all changes since Sunday; a backup on Tuesday would again copy all changes since Sunday. The backup file size therefore grows progressively until the next full backup.

BACKUP DATABASE my_db TO DISK = 'filepath' WITH DIFFERENTIAL;
GO

Transaction Log (T-log) backup: A transaction log backup includes all the transactions since the last transaction log backup. The BACKUP LOG command is used to perform it.

BACKUP LOG database_name TO DISK = 'filepath';
GO

Database Recovery Models

1. Simple

Description

- No transaction log backups. In the Simple recovery model, the transaction log is automatically purged, which keeps its file size under control; because of this, you cannot take log backups.
- Supports full and differential backup operations only.
- Features unsupported under the Simple recovery model: log shipping, AlwaysOn or mirroring, and point-in-time restore.

Data loss

- Yes; the database cannot be restored to an arbitrary point in time. In the event of a failure, it is only possible to restore the database to the most recent full or differential backup, so data loss is a real possibility.

Point-in-time restore

- No
2. Full

Description

- Supports transaction log backups.
- No work is lost due to a lost or damaged data file; the Full recovery model completely records every transaction that occurs on the database.
- You can choose an arbitrary point in time for the database restore. This is done by first restoring a full backup and the most recent differential backup, then replaying the changes recorded in the transaction log backups. If no differential backup exists, the series of t-log backups is applied directly.
- Under the Full recovery model the transaction log would otherwise grow indefinitely, so transaction log backups must be taken on a regular basis.

Data loss

- Minimal or zero data loss.

Point-in-time restore

- Yes; this setup enables more restore options, including point-in-time recovery.
3. Bulk logged

Description

- This model is similar to the Full recovery model in that a transaction log is kept, but certain transactions, such as bulk loading operations, are minimally logged, while all other transactions are fully logged. This makes bulk data imports faster and keeps the transaction log file smaller, but it does not support point-in-time recovery of the bulk-loaded data.
- This can improve the performance of bulk load operations.
- Reduces log space usage by using minimal logging for most bulk operations.

Data loss

- If you run transactions under the bulk-logged recovery model that might require a transaction log restore, those transactions could be exposed to data loss.

Point-in-time restore

- Point-in-time recovery is not possible with the bulk-logged model. It is possible only if the following conditions are satisfied:
- Users are currently not allowed in the database.
- You are able to re-run the bulk processes.
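The recovery model is set per database. A hedged T-SQL sketch (the database name is hypothetical):

-- Switch a hypothetical database between recovery models.
ALTER DATABASE sales_db SET RECOVERY SIMPLE;
ALTER DATABASE sales_db SET RECOVERY FULL;
ALTER DATABASE sales_db SET RECOVERY BULK_LOGGED;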

DROP vs DELETE vs TRUNCATE

DROP
- Definition: Completely removes the table from the database.
- Type of command: DDL.
- Syntax: DROP TABLE table_name;
- Memory management: Completely removes the space allocated for the table.
- Effect on table: Removes the entire table structure.
- Speed and performance: Faster than DELETE but slower than TRUNCATE, as it first deletes the rows and then removes the table from the database.
- Use with WHERE clause: Not applicable, as it operates on the entire table.

DELETE
- Definition: Removes one or more records from the table.
- Type of command: DML.
- Syntax: DELETE FROM table_name WHERE conditions;
- Memory management: Does not free the allocated space of the table.
- Effect on table: Does not affect the table structure.
- Speed and performance: Slower than the DROP and TRUNCATE commands, as it deletes one row at a time based on the specified conditions.
- Use with WHERE clause: Can be used.

TRUNCATE
- Definition: Removes all the rows from the existing table.
- Type of command: DDL.
- Syntax: TRUNCATE TABLE table_name;
- Memory management: Deallocates the data pages but keeps the table definition.
- Effect on table: Does not affect the table structure.
- Speed and performance: Faster than both the DELETE and DROP commands, as it deletes all the records at a time without any condition.
- Use with WHERE clause: Cannot be used, as it applies to the entire table.
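A quick illustration on a hypothetical table:

-- The three removal commands side by side.
DELETE FROM staff WHERE salary < 30000;  -- removes matching rows only; logged row by row
TRUNCATE TABLE staff;                    -- removes all rows, keeps the table definition
DROP TABLE staff;                        -- removes the rows and the table itself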

SQL - Clone Tables

The SQL cloning operation creates an exact copy of an existing table, along with its definition. Three types of cloning are possible using SQL in various RDBMSs; they are listed below.

Simple Cloning creates a new replica table from the existing table and copies all the records into the newly created table. To break this process down: a new table is created using the CREATE TABLE statement, and the data from the existing table, produced by a SELECT statement, is copied into the new table. Here, the clone table inherits only the basic column definitions, like the NULL settings and default values, from the original table. It does not inherit the indexes or AUTO_INCREMENT definitions.

 CREATE TABLE new_table SELECT * FROM original_table;

Shallow Cloning creates a new replica table from the existing table but does not copy any data records into the newly created table, so only a new, empty table is created. Here, the clone table contains the full structure of the original table, including the column attributes, indexes, and AUTO_INCREMENT definition.

 CREATE TABLE new_table LIKE original_table;

Deep Cloning is a combination of simple cloning and shallow cloning. It copies not only the structure of the existing table but also its data into the newly created table. Hence, the new table has all the contents of the existing table and all of its attributes, including indexes and AUTO_INCREMENT definitions.

 CREATE TABLE new_table LIKE original_table;


INSERT INTO new_table SELECT * FROM original_table;

Temporary Tables

Temporary tables are tables created in a database to store temporary data. We can perform SQL operations on them just as on permanent tables, such as CREATE, UPDATE, DELETE, INSERT, and JOIN. These tables are automatically deleted once the current client session terminates, and they can also be dropped explicitly if the user decides to remove them manually.
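A hedged sketch (MySQL syntax; SQL Server instead uses a # prefix on the table name):

-- Temporary table, dropped automatically when the session ends.
CREATE TEMPORARY TABLE temp_orders (
    order_id INT,
    total    DECIMAL(10, 2)
);

INSERT INTO temp_orders VALUES (1, 99.50);
SELECT * FROM temp_orders;

DROP TEMPORARY TABLE temp_orders;  -- explicit cleanup is optional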

The ALTER TABLE command is part of the Data Definition Language (DDL) and modifies the structure of a table: it can add or delete columns, create or destroy indexes, change the type of existing columns, or rename columns or the table itself.
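For example (MySQL syntax; the table and column names are hypothetical):

-- A few common ALTER TABLE forms.
ALTER TABLE staff ADD COLUMN email VARCHAR(255);     -- add a column
ALTER TABLE staff MODIFY COLUMN email VARCHAR(320);  -- change its type
ALTER TABLE staff DROP COLUMN email;                 -- remove it
ALTER TABLE staff RENAME TO employees;               -- rename the table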

The DROP TABLE statement is a Data Definition Language (DDL) command used to remove a table's definition along with its data, indexes, triggers, constraints, and permission specifications (if any).
T-SQL Vs PL/SQL

PL/SQL
- Developed by Oracle.
- A natural programming language that is compatible with SQL and provides better functionality.
- Performs well with the Oracle database server.
- More complex to understand and use.
- There is an AUTOCOMMIT command (in the SQL*Plus client) which saves transactions automatically.
- The INSERT INTO statement must be used to add rows.
- Provides OOP concepts such as data encapsulation and function overriding.

T-SQL
- Developed by Microsoft.
- Provides the highest degree of control to the programmer.
- Performs well with Microsoft SQL Server.
- Much simpler and easier to use.
- There is no AUTOCOMMIT command; transactions are saved manually.
- The SELECT INTO statement is used to create and populate a table from a query.
- Allows inserting multiple rows into a table using the BULK INSERT statement.

OLTP Vs OLAP

Purpose: OLTP processes transactions; OLAP analyzes data.

Data: OLTP works with current, detailed data; OLAP works with historical, summarized data.

Operations: OLTP performs inserts, updates, and deletes; OLAP runs complex queries and aggregations.

Users: OLTP serves many concurrent users; OLAP serves a few analytical users.

Performance: OLTP delivers high performance for small operations; OLAP delivers high performance for large queries.
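To make the contrast concrete, a hedged sketch of a typical query from each world (the schemas are hypothetical):

-- OLTP: a point update touching a single row.
UPDATE accounts SET balance = balance - 100 WHERE account_id = 42;

-- OLAP: an aggregation scanning historical data.
SELECT region, SUM(amount) AS total_sales
FROM sales
WHERE sale_date >= '2023-01-01'
GROUP BY region;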

ACID Properties

The acronym ACID stands for Atomicity, Consistency, Isolation, and Durability. ACID
properties are essential for ensuring database transactions are reliable and consistent.
Atomicity: Ensures that all parts of a transaction are completed; if one part fails, the entire transaction fails. Example: all items in a customer's order must be added to the database, or none at all.

Consistency: Ensures that the database remains in a valid state before and after a transaction. Example: a bank transfer should never result in money disappearing from both accounts.

Isolation: Ensures that concurrent transactions do not interfere with each other. Example: two users withdrawing money from an ATM do not affect each other's transactions.

Durability: Ensures that once a transaction is committed, its effects are permanent, even in the case of a crash. Example: after a power outage, the bank's system still shows the correct account balance.
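The classic bank transfer illustrates atomicity. A hedged sketch (the accounts table and IDs are hypothetical):

-- Either both updates commit or neither does.
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE account_id = 1;
UPDATE accounts SET balance = balance + 100 WHERE account_id = 2;
COMMIT;  -- on any error, issue ROLLBACK instead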

What are database indexes, and why are they used?

Indexing in SQL is a technique used to improve the speed of data retrieval from a database
by creating a lookup table for faster access.

They are used because they:

- Work like a book index to find data quickly.
- Reduce the time required for SELECT queries.
- Are automatically updated when data is inserted, updated, or deleted.
- Are typically created on columns frequently used in WHERE, JOIN, and ORDER BY clauses.

Clustered index: Determines the physical order of the data in the table. Use case: primary key columns where sorted data access is essential.

Non-clustered index: Creates a separate structure with pointers to the data. Use case: frequently queried columns like email or date_of_birth.

Unique index: Ensures that all values in the index are unique. Use case: ensuring uniqueness in fields like email or username.

Composite index: Indexes multiple columns in combination. Use case: queries filtering on multiple columns, like first_name and last_name.

Full-text index: Facilitates fast text searches in large text fields. Use case: searching through large text fields like description or comments.

Bitmap index: Uses bitmaps (binary representations) to represent the presence or absence of a value in a column. Use case: best suited for columns with low cardinality (few unique values), such as gender, status flags, or categories.

B-tree index: Organizes data in a hierarchical structure, enabling efficient data retrieval operations. Use case: ideal for columns with high cardinality (many unique values).
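A few of these in hedged SQL (the table and column names are hypothetical; full-text and bitmap syntax varies by RDBMS):

-- Non-clustered index on a frequently filtered column.
CREATE INDEX idx_staff_email ON staff (email);

-- Unique index enforcing one row per username.
CREATE UNIQUE INDEX idx_staff_username ON staff (username);

-- Composite index for queries filtering on both name columns.
CREATE INDEX idx_staff_name ON staff (first_name, last_name);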

Difference between Clustered and Non-Clustered Index

Clustered Index
- Stores the data physically sorted.
- Only one per table.
- Faster for data retrieval.
- Automatically created on the primary key.
- Affects the physical order of the table.

Non-Clustered Index
- Stores pointers to the data.
- Multiple indexes allowed per table.
- Slower than a clustered index.
- Created manually on any column.
- Does not affect the physical order of the table.

How do you optimize a slow-running query, or how do you improve query performance?

Use indexes, avoid SELECT *, limit results, use EXISTS instead of IN, and avoid functions in the WHERE clause.
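A hedged before-and-after sketch (the orders table is hypothetical):

-- Before: SELECT * plus a function in WHERE defeats any index on created_at.
SELECT * FROM orders WHERE YEAR(created_at) = 2024;

-- After: fetch only the needed columns and use an index-friendly range predicate.
SELECT order_id, total
FROM orders
WHERE created_at >= '2024-01-01' AND created_at < '2025-01-01';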

How would you handle database deadlocks?

A Deadlock occurs when two or more transactions block each other by holding locks on
resources that the other transactions need.

Ways to Avoid Deadlocks:

1. Access Tables in the Same Order: Always access tables in a consistent sequence across transactions (see the sketch after this list).
2. Minimize Lock Time: Keep transactions short and fast to reduce lock holding time.
3. Use Lower Isolation Levels: Use the READ COMMITTED isolation level instead of SERIALIZABLE if possible.
4. Avoid User Interaction Inside Transactions: Do not wait for user input during a transaction.
5. Use NOLOCK or READ UNCOMMITTED: Allow non-blocking reads for read-only operations.
6. Proper Indexing: Use indexes to minimize the number of rows locked.
7. Break Large Transactions: Split large transactions into smaller batches.
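A hedged sketch of point 1, where every transaction touches the tables in the same order (the schema is hypothetical):

-- All transactions lock accounts first, then audit_log, so no
-- transaction can wait on a lock the other holds in reverse order.
BEGIN;
UPDATE accounts SET balance = balance - 50 WHERE account_id = 1;
INSERT INTO audit_log (account_id, change) VALUES (1, -50);
COMMIT;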

What is database partitioning and when would you use it?

Table Partitioning is a technique used to divide large tables into smaller, more manageable
pieces without changing the table structure. We use partitioning when a table grows so large
that query performance starts to degrade.

Used for:

- Improves query performance on large datasets.
- Simplifies data management.
- Helps in faster data retrieval.
- Each partition is stored separately.
- Data can be partitioned by range, list, hash, or composite methods.

Types of Partitioning:

1. Range Partitioning – Divides data based on value ranges.
2. List Partitioning – Divides data based on specific column values.
3. Hash Partitioning – Distributes data evenly using a hash function.
4. Composite Partitioning – Combination of Range and Hash partitioning.
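A hedged range-partitioning sketch (MySQL syntax; the table and ranges are hypothetical):

-- Partition sales rows by year of sale_date.
CREATE TABLE sales (
    sale_id   INT,
    sale_date DATE
)
PARTITION BY RANGE (YEAR(sale_date)) (
    PARTITION p2022 VALUES LESS THAN (2023),
    PARTITION p2023 VALUES LESS THAN (2024),
    PARTITION pmax  VALUES LESS THAN MAXVALUE
);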

What is database replication, and when would you use it?

Database replication involves copying and maintaining database objects across multiple
servers to ensure data redundancy and high availability. It can be synchronous or
asynchronous.

Synchronous replication ensures that changes are reflected in real time across servers.

Asynchronous replication updates replicas with a slight delay.


Replication is particularly useful in scenarios where uptime is critical, such as e-commerce platforms, where users expect the database to always be available, even during maintenance or hardware failures.

Types of database replication:

1. Master-slave replication: In this setup, one database (the master) handles all write
operations, while one or more replicas (slaves) handle read operations.
2. Master-master replication: In a master-master setup, two or more databases can
handle both read and write operations.
3. Snapshot replication: This involves taking a snapshot of the database at a specific
point in time and copying it to another location.
4. Transactional replication: This method replicates data incrementally as
transactions occur.

What are stored procedures, and how do they improve database performance?

A stored procedure is a precompiled set of SQL statements that can be executed as a unit.
Stored procedures improve performance by reducing the amount of data sent between the
database and the application, as multiple queries can be executed with a single call.
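A minimal hedged sketch (MySQL syntax; the procedure name and table are hypothetical):

-- Stored procedure returning staff above a salary threshold.
DELIMITER //
CREATE PROCEDURE get_high_earners(IN min_salary DECIMAL(10, 2))
BEGIN
    SELECT id, name, salary FROM staff WHERE salary >= min_salary;
END //
DELIMITER ;

CALL get_high_earners(50000.00);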

What is database sharding, and when would you implement it?

Sharding is a horizontal partitioning strategy where a large database is split into smaller,
more manageable pieces called shards. Sharding is typically used when dealing with large
datasets where the database needs to handle high transaction volumes and millions of users.

What methods would you use to ensure database scalability?

1. Vertical scaling: This involves adding more resources, such as CPU, memory, or
storage, to the existing database server.
2. Horizontal scaling (sharding): For larger databases or when dealing with massive
datasets, horizontal scaling, or sharding, is more effective. This involves distributing
the database across multiple servers or nodes, where each shard holds a subset of the
data.
3. Replication: Replication involves copying data to multiple database servers to
distribute the read workload.
4. Database indexing and query optimization: Efficient indexing and query
optimization can significantly improve performance, making the database more
scalable.
5. Caching: Implementing a caching layer, like Redis or Memcached, helps offload
frequently accessed data from the database.
6. Partitioning: Database partitioning involves splitting a large table into smaller, more
manageable pieces, improving query performance and making data management
more efficient.

What are database views, and what are their benefits?

A View is a virtual table based on the result of a SQL query. It doesn't store data itself but
displays data retrieved from one or more underlying tables. Views also enhance security by
restricting user access to specific data fields without giving them access to the underlying
tables.
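For example (hedged; the underlying staff table and its columns are hypothetical):

-- A view exposing only non-sensitive columns.
CREATE VIEW staff_directory AS
SELECT id, name, email
FROM staff;

-- Users query the view like a table, without access to salary data.
SELECT * FROM staff_directory WHERE name LIKE 'A%';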

What is a Trigger in SQL, and what is it used for?

A Trigger is an automatic action executed when a specified event occurs in a table.

Uses:

- Automatically executes on INSERT, UPDATE, or DELETE.
- Used for data validation and logging.
- Cannot be called manually.
- Improves data integrity.

Types Of Triggers:

1. A row-level trigger is executed once for each row affected by the triggering event,
which is typically an INSERT, UPDATE, or DELETE statement.
2. A statement-level trigger is executed once for the entire triggering event, instead of
once for each row affected by the event.
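A hedged row-level trigger sketch (MySQL syntax; the staff and salary_audit tables are hypothetical):

-- Fires once per updated row, recording the old and new salary.
CREATE TRIGGER trg_salary_audit
AFTER UPDATE ON staff
FOR EACH ROW
INSERT INTO salary_audit (staff_id, old_salary, new_salary)
VALUES (OLD.id, OLD.salary, NEW.salary);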

What is database Mirroring or Shadowing?

Mirroring, also known as shadowing, is the process of creating multiple copies of data and the database. Generally, in mirroring, the database is copied to a machine or location different from that of the main database. It is a technique used to improve the availability and reliability of SQL Server databases, and it is one of the SQL Server high-availability solutions that can be configured on databases with a full recovery model.

Uses of Mirroring or Shadowing:

1. High Availability: The principal justification for using database mirroring is high availability.
2. Disaster Recovery: The mirror database is promoted to principal in a disaster. This way, businesses can return to operations quickly with minimal data loss.
3. Data Protection: Synchronous mirroring ensures that all transactions are committed to both the primary and mirror databases. It provides excellent data protection in the event of a server failure.
4. Scalability: Database mirroring can be configured with numerous mirrors, so businesses can scale their high-availability solution to meet their needs.

SQL Log Shipping

SQL Log Shipping is a technique involving two or more SQL Server instances, in which transaction log files are copied from one SQL Server instance to another.

Always on Failover Cluster

This technology consists of a number of servers known as cluster nodes. All of these servers
have the same hardware and software components to facilitate high availability for the
failover cluster instance.

Always on Availability Groups


As one of the high-availability SQL Server solutions, an Availability Group consists of a primary server, named the Primary Replica, and up to eight secondary servers, named Secondary Replicas. The primary replica handles read-write connections, while the secondary replicas handle read-only connections for reporting purposes.

File Streaming

SQL Server FILESTREAM is a feature that allows you to store large binary data (e.g., images,
videos, and documents) directly in the file system while maintaining transactional
consistency with your SQL Server database. This is particularly beneficial for applications
that require quick access to large binary objects (BLOBs) without the overhead of traditional
SQL Server data types like VARBINARY.

SQL Normalization

Normalization is a database design technique used to minimize redundancy and dependency by organizing the attributes and relations of a database. The goal is to ensure that the database structure is efficient and free from anomalies during insertion, deletion, and update operations. The process involves dividing large tables into smaller ones and defining relationships between them.

SQL Normalization categorized by type

1. First Normal Form (1NF)
A table is in First Normal Form (1NF) if:
- All columns contain atomic (indivisible) values. There are no repeating groups or arrays in a column.
- Each record (row) is unique, i.e., there is a primary key to identify each row.
2. Second Normal Form (2NF)
A table is in Second Normal Form (2NF) if:
- It is already in 1NF.
- There are no partial dependencies. This means that non-key attributes must depend on the entire primary key and not just a part of it. If a table has a composite primary key, all non-key attributes must be fully functionally dependent on the entire composite key.
3. Third Normal Form (3NF)
A table is in Third Normal Form (3NF) if:
- It is already in 2NF.
- There are no transitive dependencies. This means that non-key attributes should not depend on other non-key attributes. Every non-key attribute must depend only on the primary key, not on any other non-key attribute.
4. Boyce-Codd Normal Form (BCNF)
A table is in Boyce-Codd Normal Form (BCNF) if:
- It is already in 3NF.
- Every determinant is a candidate key. In other words, if any non-prime attribute (i.e., an attribute not part of a candidate key) determines another attribute, the determinant must be a candidate key.
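A small hedged example of moving to 3NF (the orders schema is hypothetical):

-- Before: customer_city depends on customer_id, not on order_id (a transitive
-- dependency): orders(order_id, customer_id, customer_city, amount)

-- After: the transitive dependency is moved into its own table.
CREATE TABLE customers (
    customer_id   INT PRIMARY KEY,
    customer_city VARCHAR(100)
);

CREATE TABLE orders (
    order_id    INT PRIMARY KEY,
    customer_id INT REFERENCES customers(customer_id),
    amount      DECIMAL(10, 2)
);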
