Advanced Database Integration
Members:
Cherry May Paghasian
Ranel O. Duhaylungsod
Fritchel Vian C. Batbat
Jayhan Christine Casar
Jenn Claire Fe E. Bation
Advanced Database Integration is essential for
organizations that rely on diverse data sources
for operations, analytics, and strategic decision-making.
It enables better data management,
enhances operational efficiency, and supports
data-driven initiatives.
1. Data Warehousing
5. Microservices Architecture
7. Data Synchronization
1. Conceptual Modeling
Defines the high-level, business-oriented view of
the data, focusing on entities and relationships.
2. Logical Modeling
Translates the conceptual model into a detailed,
technology-independent representation of the data
structure.
3. Physical Modeling
Maps the logical model to the actual
implementation within the chosen DBMS,
optimizing for performance.
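As a rough illustration of the physical modeling step, the sketch below maps a small logical model (one customer has many orders) onto concrete tables. The table and column names are hypothetical, and SQLite (through Python's built-in sqlite3 module) stands in for whichever DBMS is actually chosen.

```python
import sqlite3

# Hypothetical logical model: Customer (1) --- (many) Order.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Physical model: concrete types, keys, and an index chosen for this DBMS.
conn.executescript("""
CREATE TABLE customers (
    customer_id   INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    order_date  TEXT NOT NULL
);
-- Performance decision made at the physical level: index the foreign key.
CREATE INDEX idx_orders_customer ON orders(customer_id);
""")

conn.execute("INSERT INTO customers VALUES (1, 'Alice')")
conn.execute("INSERT INTO orders VALUES (10, 1, '2024-01-15')")

# An order pointing at a customer that does not exist violates the relationship.
try:
    conn.execute("INSERT INTO orders VALUES (11, 99, '2024-01-16')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(orphan_rejected, order_count)
```

The relationship from the conceptual model survives as a foreign key, while the index is a purely physical choice that the earlier modeling levels do not mention.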
Normalization Principles
2. Data Manipulation
Master the art of selecting, filtering, and
transforming data to meet your specific needs.
1. Subqueries
A subquery is a query nested within another query. It allows you
to retrieve data based on the results of another query. Subqueries
can be used in SELECT, FROM, and WHERE clauses.
For example, to find customers who made purchases in the last month:
SQL
SELECT customer_name
FROM customers
WHERE customer_id IN (
SELECT customer_id
FROM orders
WHERE order_date >= DATE_SUB(NOW(), INTERVAL 1 MONTH)
);
Explanation
This query retrieves the names of customers who have made purchases in the last month.
It does this by:
1. Selecting customer_id from the orders table where the order_date is within the
last month.
2. Using these customer_id values to find corresponding customer_name entries in
the customers table.
Hypothetical Example
If only Alice and Bob placed orders in the last month, the query returns:
customer_name
Alice
Bob
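The subquery above can be tried end to end with a small sketch. It assumes SQLite through Python's built-in sqlite3 module, so the MySQL expression DATE_SUB(NOW(), INTERVAL 1 MONTH) is replaced by SQLite's date('now', '-1 month'); the sample rows (including Carol) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, customer_name TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, order_date TEXT);
INSERT INTO customers VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
-- Hypothetical orders dated relative to today so the example stays current.
INSERT INTO orders VALUES
    (1, 1, date('now', '-5 days')),
    (2, 2, date('now', '-20 days')),
    (3, 3, date('now', '-90 days'));
""")

# The inner query collects recent customer_id values;
# the outer query maps them to names.
rows = conn.execute("""
    SELECT customer_name
    FROM customers
    WHERE customer_id IN (
        SELECT customer_id
        FROM orders
        WHERE order_date >= date('now', '-1 month')
    )
    ORDER BY customer_name
""").fetchall()

recent_customers = [name for (name,) in rows]
print(recent_customers)
```

Carol's only order is about three months old, so she is filtered out by the inner query.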
2. Joins
Joins are used to combine data from multiple tables
based on a related column. Types of joins include:
• INNER JOIN: Returns only the rows that have a match in both tables.
• LEFT JOIN: Returns all rows from the left table, and matched rows from the
right table. Unmatched right-side columns are NULL.
• RIGHT JOIN: Returns all rows from the right table, and matched rows from the
left table. Unmatched left-side columns are NULL.
• FULL JOIN: Returns all rows from both tables, with NULLs on whichever side
has no match.
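The difference between the join types is easiest to see on a tiny dataset. The sketch below uses hypothetical customers and orders tables in SQLite (via Python's sqlite3 module) and compares INNER JOIN with LEFT JOIN. RIGHT and FULL joins are left out because SQLite only supports them from version 3.39 onward.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, customer_name TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER);
INSERT INTO customers VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
-- Carol has no orders, so she only appears in the LEFT JOIN result.
INSERT INTO orders VALUES (1, 1, 50), (2, 1, 30), (3, 2, 20);
""")

inner_rows = conn.execute("""
    SELECT c.customer_name, o.order_id
    FROM customers c
    INNER JOIN orders o ON o.customer_id = c.customer_id
    ORDER BY o.order_id
""").fetchall()

left_rows = conn.execute("""
    SELECT c.customer_name, o.order_id
    FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.customer_id
    ORDER BY c.customer_id, o.order_id
""").fetchall()

print(inner_rows)
print(left_rows)
```

In the LEFT JOIN result the unmatched customer is kept, with None (SQL NULL) in place of the missing order_id.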
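Here are some key techniques:
3. Common Table Expressions (CTEs)
A CTE is a named, temporary result set defined with a WITH clause.
It exists only for the duration of the query that references it.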
SQL
WITH Sales_CTE AS (
SELECT salesperson_id, SUM(sales_amount) AS total_sales
FROM sales
GROUP BY salesperson_id
)
SELECT * FROM Sales_CTE WHERE total_sales > 10000;
Explanation
This query calculates the total sales for each salesperson and then filters
to show only those with total sales greater than 10,000.
Sample Data
sales table:
salesperson_id  sales_amount
1               5000
1               7000
2               15000
3               8000
3               3000
Query Execution
1. CTE Calculation: The Sales_CTE calculates the total sales
for each salesperson:
SQL
WITH Sales_CTE AS (
    SELECT salesperson_id, SUM(sales_amount) AS total_sales
    FROM sales
    GROUP BY salesperson_id
)
Based on the sample data, the Sales_CTE would produce:
salesperson_id  total_sales
1               12000
2               15000
3               11000
2. Filtering: The main query selects only those salespeople
with total sales greater than 10,000:
SQL
SELECT * FROM Sales_CTE WHERE total_sales > 10000;
Result
After applying the filter (total_sales > 10000), the result will be:
salesperson_id  total_sales
1               12000
2               15000
3               11000
This output shows the salespeople who have achieved total sales greater
than 10,000.
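The walkthrough above can be reproduced with a short script. This is a minimal sketch that loads the same sample rows into an in-memory SQLite database through Python's sqlite3 module; the ORDER BY clause is an addition so that the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (salesperson_id INTEGER, sales_amount INTEGER)")
# Same sample rows as the sales table above.
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(1, 5000), (1, 7000), (2, 15000), (3, 8000), (3, 3000)],
)

# The CTE aggregates per salesperson; the main query filters the totals.
top_sellers = conn.execute("""
    WITH Sales_CTE AS (
        SELECT salesperson_id, SUM(sales_amount) AS total_sales
        FROM sales
        GROUP BY salesperson_id
    )
    SELECT * FROM Sales_CTE
    WHERE total_sales > 10000
    ORDER BY salesperson_id
""").fetchall()

print(top_sellers)
```

With this sample data every salesperson clears the threshold, so the filter removes nothing. Salesperson 3 qualifies only because the CTE sums both of their rows first.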
4. Window Functions
Window functions perform calculations across a set of table rows
related to the current row. They are often used for ranking,
cumulative sums, moving averages, etc.
SQL
SELECT employee_id, department_id, salary,
    RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS salary_rank
FROM employees;
Explanation
This query ranks employees within each department based on their
salary in descending order. The RANK() function assigns a rank to each
employee, with the highest salary getting rank 1 within each
department. Employees with equal salaries share a rank, and the next
rank is skipped. The column is aliased as salary_rank because RANK is a
reserved word in some systems, such as MySQL 8.0.
Sample Data
employees table:
employee_id  department_id  salary
Result
Based on the sample data, the result will be:
employee_id  department_id  salary  salary_rank
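To make the ranking behaviour concrete, the sketch below runs a RANK() query on a few hypothetical employee rows (they are not the sample data from the slide). It assumes Python's sqlite3 module linked against SQLite 3.25 or newer, which is when window functions were added, and it names the output column salary_rank.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (employee_id INTEGER, department_id INTEGER, salary INTEGER)"
)
# Hypothetical rows: employees 2 and 3 tie for the top salary in department 10.
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, 10, 5000), (2, 10, 7000), (3, 10, 7000), (4, 20, 4000)],
)

ranked = conn.execute("""
    SELECT employee_id, department_id, salary,
           RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS salary_rank
    FROM employees
    ORDER BY department_id, salary_rank, employee_id
""").fetchall()

for row in ranked:
    print(row)
```

The two tied employees both receive rank 1 and the next employee in that department receives rank 3, while the ranking restarts at 1 for department 20.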
5. Pivoting Data
Pivoting turns row values into columns. The query below uses the PIVOT
operator in SQL Server syntax.
SQL
SELECT *
FROM (
    SELECT product_id, year, sales
    FROM sales_data
) src
PIVOT (
    SUM(sales)
    FOR year IN ([2021], [2022], [2023])
) pvt;
Explanation
This query pivots the sales_data table to transform rows into columns,
aggregating sales data by year for each product.
Sample Data
sales_data table:
product_id  year  sales
1           2021  500
1           2022  600
2           2022  400
2           2023  500
Result
Based on the sample data, the result will be:
product_id  2021  2022  2023
1           500   600   NULL
2           NULL  400   500
Explanation of the Result
The PIVOT operator transforms the year values into columns.
The SUM(sales) function aggregates the sales for each product_id by year.
A product with no sales in a given year shows NULL for that year.
This output shows the total sales for each product across the specified years.
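Not every database has a PIVOT operator. As a portable alternative, the sketch below swaps in conditional aggregation (SUM over a CASE expression), which is a different technique from the PIVOT query above but yields the same shape of result. It runs on SQLite through Python's sqlite3 module, and the column names y2021, y2022, and y2023 are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_data (product_id INTEGER, year INTEGER, sales INTEGER)")
# Same sample rows as the sales_data table above.
conn.executemany(
    "INSERT INTO sales_data VALUES (?, ?, ?)",
    [(1, 2021, 500), (1, 2022, 600), (2, 2022, 400), (2, 2023, 500)],
)

# One CASE expression per target column; rows from other years contribute NULL.
pivoted = conn.execute("""
    SELECT product_id,
           SUM(CASE WHEN year = 2021 THEN sales END) AS y2021,
           SUM(CASE WHEN year = 2022 THEN sales END) AS y2022,
           SUM(CASE WHEN year = 2023 THEN sales END) AS y2023
    FROM sales_data
    GROUP BY product_id
    ORDER BY product_id
""").fetchall()

print(pivoted)
```

Years with no sales for a product come back as None (SQL NULL), because SUM over only NULL inputs is NULL.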
6. Recursive Queries
Recursive queries are used to query hierarchical data, such as
organizational charts or file systems. They are defined using a CTE
with a recursive union.
SQL
WITH RECURSIVE EmployeeHierarchy AS (
SELECT employee_id, manager_id, 1 AS level
FROM employees
WHERE manager_id IS NULL
UNION ALL
SELECT e.employee_id, e.manager_id, eh.level + 1
FROM employees e
INNER JOIN EmployeeHierarchy eh ON e.manager_id =
eh.employee_id
)
SELECT * FROM EmployeeHierarchy;
Explanation
This query uses a recursive Common Table Expression (CTE) to build an employee
hierarchy. It starts with employees who have no manager (i.e., manager_id IS NULL)
and recursively finds their subordinates, incrementing the level each time.
Sample Data
employees table:
employee_id  manager_id
1            NULL
2            1
3            1
4            2
5            2
6            3
Result
Based on the sample data, the result will be:
employee_id  manager_id  level
1            NULL        1
2            1           2
3            1           2
4            2           3
5            2           3
6            3           3
Explanation of the Result
Level 1: Employee 1 has no manager.
Level 2: Employees 2 and 3 report to Employee 1.
Level 3: Employees 4 and 5 report to Employee 2,
and Employee 6 reports to Employee 3.
This output shows the hierarchical structure of
employees, with each level indicating the depth of the
hierarchy.
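The recursive query can be checked against the sample data with a short script. This sketch assumes SQLite through Python's sqlite3 module, which accepts the same WITH RECURSIVE syntax; the final ORDER BY is an addition to make the row order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, manager_id INTEGER)")
# Same sample rows as the employees table above; None becomes SQL NULL.
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [(1, None), (2, 1), (3, 1), (4, 2), (5, 2), (6, 3)],
)

hierarchy = conn.execute("""
    WITH RECURSIVE EmployeeHierarchy AS (
        -- Anchor member: employees with no manager start at level 1.
        SELECT employee_id, manager_id, 1 AS level
        FROM employees
        WHERE manager_id IS NULL
        UNION ALL
        -- Recursive member: attach direct reports one level deeper.
        SELECT e.employee_id, e.manager_id, eh.level + 1
        FROM employees e
        INNER JOIN EmployeeHierarchy eh ON e.manager_id = eh.employee_id
    )
    SELECT * FROM EmployeeHierarchy
    ORDER BY level, employee_id
""").fetchall()

for row in hierarchy:
    print(row)
```

The recursion stops on its own once a pass finds no further direct reports, which here happens after level 3.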
7. Indexing
Indexes improve the speed of data retrieval operations on a
database table, at the cost of extra storage and slower writes.
Proper indexing can significantly enhance query
performance, especially for large datasets.
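One way to see an index at work is to ask the database for its query plan before and after the index is created. The sketch below does this with SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module; the orders table, its contents, and the index name idx_orders_customer are hypothetical, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
# Hypothetical data: 1000 orders spread over 100 customers.
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

def query_plan(sql):
    """Return SQLite's textual plan for a statement (last column of each plan row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[-1] for row in rows)

lookup = "SELECT order_id, amount FROM orders WHERE customer_id = 42"

plan_before = query_plan(lookup)  # no index yet: full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders(customer_id)")
plan_after = query_plan(lookup)   # the planner can now search the index

print(plan_before)
print(plan_after)
```

The first plan reports a scan of the whole table, while the second reports a search that uses idx_orders_customer.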
8. Query Optimization
Optimizing queries involves techniques like reducing the amount of
data processed by filtering early, avoiding unnecessary columns in
SELECT statements, and using efficient join operations.
Using Object-Relational Mapping Tools
Object–relational mapping in computer science is a programming
technique for converting data between a relational database and the
memory of an object-oriented programming language. This creates, in
effect, a virtual object database that can be used from within the
programming language.
• When interacting with a database using OOP languages,
you'll have to perform different operations like creating,
reading, updating, and deleting (CRUD) data from a
database. By design, you use SQL for performing these
operations in relational databases.
• ORM tools can generate methods for these operations, so you do
not have to write the SQL by hand.
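To show what an ORM automates, the sketch below hand-writes the mapping between one class and one table, with a method for each CRUD operation. It is not the API of any real ORM: the Customer class, the CustomerRepository class, and the customers table are hypothetical, and SQLite is used through Python's sqlite3 module.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional

@dataclass
class Customer:
    """In-memory object that mirrors one row of the customers table."""
    customer_id: Optional[int]
    customer_name: str

class CustomerRepository:
    """Hypothetical hand-written mapper; an ORM tool generates code of this kind."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS customers ("
            "customer_id INTEGER PRIMARY KEY, customer_name TEXT NOT NULL)"
        )

    def create(self, customer):
        cur = self.conn.execute(
            "INSERT INTO customers (customer_name) VALUES (?)",
            (customer.customer_name,),
        )
        customer.customer_id = cur.lastrowid  # copy the generated key onto the object
        return customer

    def read(self, customer_id):
        row = self.conn.execute(
            "SELECT customer_id, customer_name FROM customers WHERE customer_id = ?",
            (customer_id,),
        ).fetchone()
        return Customer(*row) if row else None

    def update(self, customer):
        self.conn.execute(
            "UPDATE customers SET customer_name = ? WHERE customer_id = ?",
            (customer.customer_name, customer.customer_id),
        )

    def delete(self, customer_id):
        self.conn.execute("DELETE FROM customers WHERE customer_id = ?", (customer_id,))

repo = CustomerRepository(sqlite3.connect(":memory:"))
alice = repo.create(Customer(None, "Alice"))
alice.customer_name = "Alice Smith"
repo.update(alice)
loaded = repo.read(alice.customer_id)
repo.delete(alice.customer_id)
after_delete = repo.read(alice.customer_id)
print(loaded, after_delete)
```

The calling code at the bottom works only with objects and method calls. The SQL is confined to the repository, which is the part an ORM tool would take over.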
Popular ORM Tools for Java
1. Hibernate
Hibernate enables developers to write data-persistent classes following
OOP concepts like inheritance, polymorphism, association, and
composition. Hibernate is highly performant and is also scalable.
2. Apache OpenJPA
Apache OpenJPA is also a Java persistence tool. It can be used as a
stand-alone POJO (plain old Java object) persistence layer.
3. EclipseLink
EclipseLink is an open source Java persistence solution for relational,
XML, and database web services.
4. jOOQ
jOOQ generates Java code from the schema of a database. You can also
use this tool to build type-safe SQL queries.
5. Oracle TopLink
You can use Oracle TopLink to build high-performance applications that
store persistent data. The data can be transformed into either
relational data or XML elements.
Popular ORM Tools for PHP
1. Laravel
2. CakePHP
3. Qcodo
4. RedBeanPHP
Popular ORM Tools for .NET
1. Entity Framework
• Entity Framework is a multi-database object-database mapper. It
supports SQL, SQLite, MySQL, PostgreSQL, and Azure Cosmos DB.
2. NHibernate
• NHibernate is an open source object relational mapper with tons of
plugins and tools to make development easier and faster.
3. Dapper
• Dapper is a micro-ORM. It is mainly used to map queries to objects.
This tool doesn't do most of the things an ORM tool would do like SQL
generation, caching results, lazy loading, and so on.
4. Base One Foundation Component Library (BFC)
• BFC is a framework for creating networked database applications
with Visual Studio and DBMS software from Microsoft, Oracle, IBM,
Sybase, and MySQL.
Advantages of Using ORM Tools
Here are some of the advantages of using an ORM
tool:
• Speeds up development time for teams.
• Decreases the cost of development.
• Handles the logic required to interact with
databases.
• Improves security. ORM tools typically use parameterized
queries, which greatly reduces the risk of SQL injection attacks.
• Requires less code than writing the SQL by hand.
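The security point in the list above comes down to how user input reaches the SQL statement. The sketch below contrasts string concatenation with a parameterized query, using Python's sqlite3 module directly rather than a real ORM; the users table and the malicious input are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, username TEXT);
INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
""")

# Hypothetical attacker input that tries to turn the WHERE clause into "always true".
malicious = "nobody' OR '1'='1"

# Unsafe: the input is pasted into the SQL text and changes the query's logic.
unsafe_sql = "SELECT username FROM users WHERE username = '" + malicious + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the input is bound as a parameter and treated purely as a value.
safe_rows = conn.execute(
    "SELECT username FROM users WHERE username = ?", (malicious,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
```

The concatenated query returns every user, while the parameterized query returns none because no username equals the literal input string. ORM tools generally build their queries the second way, but raw SQL passed through an ORM still needs the same care.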
Disadvantages of Using ORM Tools