Advanced Databased Integration RepotMaamJho

Copyright
© All Rights Reserved

Advanced Database Integration

Members:
Cherry May Paghasian
Ranel O. Duhaylungsod
Fritchel Vian C. Batbat
Jayhan Christine Casar
Jenn Claire Fe E. Bation
Advanced database integration is essential for
organizations that rely on diverse data sources
for operations, analytics, and strategic
decision-making. It enables better data management,
enhances operational efficiency, and supports
data-driven initiatives.

Advanced database integration involves several
strategies and techniques to ensure that
different database systems can work together
efficiently, providing a seamless experience for
users and applications.
Here are some key concepts and methods:

1. Data Warehousing

• ETL (Extract, Transform, Load): Processes for
extracting data from multiple sources, transforming it
into a suitable format, and loading it into a central
warehouse for analysis.

• OLAP (Online Analytical Processing): Tools that
enable complex queries and analysis on the data
stored in the warehouse.
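
The ETL steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a production pipeline: the CSV text, the fact_orders table, and the extract, transform, and load function names are all hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source data: a CSV export from an operational system.
RAW_CSV = """order_id,customer,amount,order_date
101,alice,19.99,2024-09-15
102,BOB,5.00,2024-09-20
103,charlie,,2024-08-10
"""


def extract(text):
    """Extract: read raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with no amount, tidy names, convert types."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # incomplete record, skipped in this sketch
        cleaned.append(
            (int(row["order_id"]), row["customer"].title(),
             float(row["amount"]), row["order_date"])
        )
    return cleaned


def load(conn, rows):
    """Load: write the cleaned rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders ("
        "order_id INTEGER PRIMARY KEY, customer TEXT, "
        "amount REAL, order_date TEXT)"
    )
    conn.executemany("INSERT INTO fact_orders VALUES (?, ?, ?, ?)", rows)
    conn.commit()


warehouse = sqlite3.connect(":memory:")  # stand-in for a real warehouse
load(warehouse, transform(extract(RAW_CSV)))

# An OLAP-style question asked of the loaded data: how many orders, and how much?
total = warehouse.execute(
    "SELECT COUNT(*), SUM(amount) FROM fact_orders"
).fetchone()
print(total)
```

Keeping the three stages as separate functions mirrors the E, T, and L in the name and makes each stage easy to test on its own.
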
2. API Integration

• RESTful APIs: Allow different systems to
communicate over HTTP, enabling integration with
web services and databases.

• GraphQL: A query language for APIs that allows
clients to request only the data they need,
facilitating integration with multiple data sources.
3. Database Federation

• Virtual Database: A layer that provides a unified
view of data from different sources, allowing
queries across multiple databases without needing
to replicate data.

• Data Virtualization: Tools that abstract the
underlying data sources, providing a single
interface to access and query data.
4. Message Queues

• Event-Driven Architecture: Using message brokers
(like Kafka or RabbitMQ) to facilitate communication
between different databases and services, allowing for
real-time data updates and processing.

5. Microservices Architecture

• Breaking down applications into smaller, independently
deployable services that can each manage their own
databases, allowing for better scalability and integration.
6. Middleware Solutions

• Software that acts as a bridge between different
databases and applications, ensuring smooth data flow
and integration.

7. Data Synchronization

• Techniques for keeping data consistent across different
databases, using tools or custom scripts to ensure that
updates in one system are reflected in others.
8. Cloud Integration

• Utilizing cloud-based services (like AWS
Glue, Azure Data Factory) to integrate and manage
data across various cloud and on-premises
databases.

9. Security and Compliance

• Ensuring that data integration methods comply with
regulations (like GDPR) and include robust security
measures to protect sensitive data.
• What is a simple example of a database?

• Some real-life examples of databases include
eCommerce platforms, healthcare systems, social
media platforms, online banking systems, hotel
booking systems, airline reservation systems,
HRMS, email services, ride-hailing applications,
and online learning platforms.

• Databases often store information about people,
such as customers or users. For example, social
media platforms use databases to store user
information, such as names, email addresses, and
user behavior. The data is used to recommend
content to users and improve the user experience.
Database Design and Normalization
This presentation will provide an overview of the key principles
and techniques involved in database design and normalization.
We'll cover the fundamentals of database concepts, modeling
approaches, and the normalization process to ensure efficient
and scalable data structures.
Introduction to Database Concepts

What is a Database?
A database is a structured collection of data that is
organized and stored to provide efficient access,
management, and updating.

Database Management Systems
DBMS software allows users to create, maintain, and query
databases, ensuring data integrity and security.

Data Models
Data models define how data is structured, including
entities, attributes, and relationships, to support
business requirements.
Database Modeling Techniques

1. Conceptual Modeling
Defines the high-level, business-oriented view of
the data, focusing on entities and relationships.

2. Logical Modeling
Translates the conceptual model into a detailed,
technology-independent representation of the data
structure.

3. Physical Modeling
Maps the logical model to the actual
implementation within the chosen DBMS,
optimizing for performance.
Normalization Principles

1. Minimize Data Duplication
Normalization aims to eliminate redundant data,
reducing storage requirements and potential data
inconsistencies.

2. Enforce Data Integrity
Proper normalization ensures that data is stored in a
logically consistent and coherent manner.

3. Improve Query Performance
Normalized tables are smaller and free of redundant data,
which keeps updates cheap and many lookups efficient,
although queries that combine several tables need joins.
First Normal Form (1NF)

Eliminate Repeating Groups
1NF requires that all attributes in a table be
single-valued, with no repeating groups or arrays.

Establish Primary Keys
Each table must have a unique primary key that
identifies each record distinctly.

Ensure Atomic Values
All attributes must be atomic, meaning they cannot be
further divided into smaller components.
Second Normal Form (2NF)

Full Functional Dependency
2NF requires that the table is already in 1NF and that
all non-key attributes be fully dependent on the primary
key, with no partial dependencies.

Eliminate Partial Dependencies
Partial dependencies occur when a non-key attribute
depends on only a part of a composite primary key.

Create New Tables
To address partial dependencies, new tables may
need to be created with their own primary keys.
Third Normal Form (3NF)

Eliminate Transitive Dependencies
3NF ensures that all non-key attributes are directly
dependent on the primary key, with no transitive
dependencies.

Identify Determinant Attributes
A determinant is an attribute (or set of attributes) whose
value determines the value of another attribute. Non-key
determinants point to the transitive dependencies that
need to be removed.

Decompose Tables
Tables may need to be decomposed into smaller, more
focused entities to eliminate transitive dependencies.
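
To make the 2NF and 3NF rules concrete, here is a minimal sketch in Python using an in-memory SQLite database. All table names, column names, and rows are hypothetical. It starts from one flat table that contains a partial and a transitive dependency, decomposes it, and then checks that joining the smaller tables reproduces the original rows, so no information was lost.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Flat starting point (hypothetical). Primary key: (order_id, product_id).
--   product_name depends only on product_id   : partial dependency (breaks 2NF)
--   customer_id depends only on order_id      : partial dependency (breaks 2NF)
--   customer_city depends on customer_id      : transitive dependency (breaks 3NF)
CREATE TABLE order_lines_flat (
    order_id INTEGER, product_id INTEGER, quantity INTEGER,
    product_name TEXT, customer_id INTEGER, customer_city TEXT,
    PRIMARY KEY (order_id, product_id)
);
INSERT INTO order_lines_flat VALUES
    (1, 10, 2, 'Pen', 7, 'Cebu'),
    (1, 11, 1, 'Pad', 7, 'Cebu'),
    (2, 10, 5, 'Pen', 8, 'Davao');

-- Decomposition: each non-key column depends on the whole key of its table.
CREATE TABLE products  (product_id INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, customer_city TEXT);
CREATE TABLE orders    (order_id INTEGER PRIMARY KEY,
                        customer_id INTEGER REFERENCES customers(customer_id));
CREATE TABLE order_lines (
    order_id INTEGER REFERENCES orders(order_id),
    product_id INTEGER REFERENCES products(product_id),
    quantity INTEGER,
    PRIMARY KEY (order_id, product_id)
);

INSERT INTO products    SELECT DISTINCT product_id, product_name FROM order_lines_flat;
INSERT INTO customers   SELECT DISTINCT customer_id, customer_city FROM order_lines_flat;
INSERT INTO orders      SELECT DISTINCT order_id, customer_id FROM order_lines_flat;
INSERT INTO order_lines SELECT order_id, product_id, quantity FROM order_lines_flat;
""")

# Joining the normalized tables should give back exactly the flat rows.
rebuilt = conn.execute("""
    SELECT ol.order_id, ol.product_id, ol.quantity,
           p.product_name, o.customer_id, c.customer_city
    FROM order_lines ol
    JOIN products  p ON p.product_id = ol.product_id
    JOIN orders    o ON o.order_id = ol.order_id
    JOIN customers c ON c.customer_id = o.customer_id
    ORDER BY ol.order_id, ol.product_id
""").fetchall()
original = conn.execute(
    "SELECT * FROM order_lines_flat ORDER BY order_id, product_id"
).fetchall()
print(rebuilt == original)  # True
```

After the decomposition, the product name 'Pen' is stored once instead of once per order line, which is the duplication that normalization is meant to remove.
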
Conclusion and Best Practices

Best Practice: Description

Understand the Business: Thoroughly analyze the
organization's requirements and data needs to create an
effective data model.

Normalize Appropriately: Apply normalization principles
judiciously, balancing data integrity and query
performance.

Maintain Documentation: Ensure that the database
design and schema are well-documented for easy
maintenance and future reference.
Advanced Querying Techniques
Unlock the full power of your data with
advanced SQL querying techniques.
Explore innovative ways to extract valuable
insights and make informed decisions.

Advanced querying techniques in databases
allow you to perform complex data
manipulations and analyses with greater
precision and efficiency.
Intro to SQL Queries
1. Basics of SQL
Learn the fundamental syntax and structure
of SQL queries to build a strong foundation.

2. Data Manipulation
Master the art of selecting, filtering, and
transforming data to meet your specific
needs.

3. Aggregation and Grouping
Unlock powerful insights by summarizing
and grouping data using aggregate
functions.
Here are some key techniques:

1. Subqueries
A subquery is a query nested within another query. It allows you
to retrieve data based on the results of another query. They can be
used in SELECT, FROM, and WHERE clauses.
For example, to find customers who made purchases in the last month:
SQL
SELECT customer_name
FROM customers
WHERE customer_id IN (
SELECT customer_id
FROM orders
WHERE order_date >= DATE_SUB(NOW(), INTERVAL 1 MONTH)
);

Explanation
This query retrieves the names of customers who have made purchases in the last month.
It does this by:
1. Selecting customer_id from the orders table where the order_date is within the
last month.
2. Using these customer_id values to find corresponding customer_name entries in
the customers table.
Hypothetical Example

Assume we have the following data in our tables:

customers table:
customer_id   customer_name
1             Alice
2             Bob
3             Charlie

orders table:
order_id   customer_id   order_date
101        1             2024-09-15
102        2             2024-09-20
103        3             2024-08-10

Given the current date is October 6, 2024,
the query will look for orders placed from
September 6, 2024, onwards. Therefore, it
will find orders with order_date 2024-09-15
and 2024-09-20.

The result of the query will be:

customer_name
Alice
Bob

This output shows that Alice and Bob made
purchases in the last month.
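
The example above can be reproduced with a short Python script. This is a sketch under two assumptions: it runs on an in-memory SQLite database, so SQLite's date(?, '-1 month') replaces the MySQL expression DATE_SUB(NOW(), INTERVAL 1 MONTH), and the current date is pinned to 2024-10-06 so the result does not change over time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, customer_name TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, order_date TEXT);
INSERT INTO customers VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Charlie');
INSERT INTO orders VALUES
    (101, 1, '2024-09-15'), (102, 2, '2024-09-20'), (103, 3, '2024-08-10');
""")

today = "2024-10-06"  # fixed reference date taken from the example

# The inner query finds customer_ids with a recent order;
# the outer query turns those ids into names.
recent_customers = [
    name
    for (name,) in conn.execute(
        """
        SELECT customer_name
        FROM customers
        WHERE customer_id IN (
            SELECT customer_id
            FROM orders
            WHERE order_date >= date(?, '-1 month')
        )
        ORDER BY customer_name
        """,
        (today,),
    )
]
print(recent_customers)  # ['Alice', 'Bob']
```
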

2. Joins
Joins are used to combine data from multiple tables
based on a related column. Types of joins include:
• INNER JOIN: Returns only the rows that have a match in both tables.
• LEFT JOIN: Returns all rows from the left table, and matched rows from the
right table (NULLs where there is no match).
• RIGHT JOIN: Returns all rows from the right table, and matched rows from the
left table (NULLs where there is no match).
• FULL JOIN: Returns all rows from both tables, matching them where possible
and filling in NULLs where there is no match on either side.
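
The difference between the join types is easiest to see on a small data set. The sketch below uses Python with an in-memory SQLite database; the tables and rows are hypothetical. It shows only INNER JOIN and LEFT JOIN, because older SQLite versions do not support RIGHT JOIN or FULL JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, customer_name TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Charlie');
-- Order 103 refers to customer 9, who is not in the customers table.
INSERT INTO orders VALUES (101, 1, 50.0), (102, 1, 20.0), (103, 9, 10.0);
""")

# INNER JOIN: only customer/order pairs that match on customer_id.
inner_rows = conn.execute("""
    SELECT c.customer_name, o.order_id
    FROM customers c
    INNER JOIN orders o ON o.customer_id = c.customer_id
    ORDER BY o.order_id
""").fetchall()

# LEFT JOIN: every customer, with NULL (None in Python) when there is no order.
left_rows = conn.execute("""
    SELECT c.customer_name, o.order_id
    FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.customer_id
    ORDER BY c.customer_id, o.order_id
""").fetchall()

print(inner_rows)  # [('Alice', 101), ('Alice', 102)]
print(left_rows)   # [('Alice', 101), ('Alice', 102), ('Bob', None), ('Charlie', None)]
```

A RIGHT JOIN on the same data would keep order 103 with a NULL customer name instead of keeping Bob and Charlie, and a FULL JOIN would keep all of them.
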

3. Common Table Expressions (CTEs)


CTEs provide a way to create temporary result sets that can be
referenced within a SELECT, INSERT, UPDATE, or DELETE statement.
They are defined using the WITH keyword.

SQL
WITH Sales_CTE AS (
SELECT salesperson_id, SUM(sales_amount) AS total_sales
FROM sales
GROUP BY salesperson_id
)
SELECT * FROM Sales_CTE WHERE total_sales > 10000;

Explanation
This query calculates the total sales for each salesperson and then filters
to show only those with total sales greater than 10,000.
Sample Data
sales table:
salesperson_id   sales_amount
1                5000
1                7000
2                15000
3                8000
3                3000

Query Execution
1. CTE Calculation: The Sales_CTE calculates the total sales
for each salesperson:
SQL
WITH Sales_CTE AS (
SELECT salesperson_id, SUM(sales_amount) AS total_sales
FROM sales
GROUP BY salesperson_id
)
2. Filtering: The main query selects only those salespeople
with total sales greater than 10,000:
SQL
SELECT * FROM Sales_CTE WHERE total_sales > 10000;

Result
Based on the sample data, the Sales_CTE would produce:
salesperson_id   total_sales
1                12000
2                15000
3                11000

After applying the filter (total_sales > 10000), the result will be:
salesperson_id   total_sales
1                12000
2                15000
3                11000

This output shows the salespeople who have achieved total sales
greater than 10,000.

4. Window Functions
Window functions perform calculations across a set of table rows
related to the current row. They are often used for ranking,
cumulative sums, moving averages, etc.

SQL
SELECT employee_id, department_id, salary,
RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS rank
FROM employees;

Explanation
This query ranks employees within each department based on their
salary in descending order. The RANK() function assigns a rank to each
employee, with the highest salary getting rank 1 within each
department.
Sample Data
employees table:
employee_id   department_id   salary
1             101             70000
2             101             60000
3             101             60000
4             102             80000
5             102             75000

Query Execution
The query:
SQL
SELECT employee_id, department_id, salary,
RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS rank
FROM employees;

Result
Based on the sample data, the result will be:
employee_id   department_id   salary   rank
1             101             70000    1
2             101             60000    2
3             101             60000    2
4             102             80000    1
5             102             75000    2

Explanation of the Result
• In department 101, the employee with employee_id 1 has the highest
salary and is ranked 1.
• Employees with employee_id 2 and 3 have the same salary, so they
share the rank 2.
• In department 102, the employee with employee_id 4 has the highest
salary and is ranked 1.
• The employee with employee_id 5 is ranked 2 in department 102.
This output shows how the RANK() function works
within partitions defined by department_id,
ordering by salary in descending order.
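
One detail the sample data does not show is what RANK() does after a tie. The sketch below adds one hypothetical employee (employee_id 6, not part of the sample data) and compares RANK() with DENSE_RANK(). It uses Python with an in-memory SQLite database and assumes SQLite 3.25 or later, which is when window functions were added. The alias is salary_rank rather than rank because rank is a reserved word in MySQL 8.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (
    employee_id INTEGER PRIMARY KEY, department_id INTEGER, salary INTEGER
);
INSERT INTO employees VALUES
    (1, 101, 70000), (2, 101, 60000), (3, 101, 60000),
    (4, 102, 80000), (5, 102, 75000),
    (6, 101, 50000);  -- hypothetical extra row, added to show the gap after a tie
""")

rows = conn.execute("""
    SELECT employee_id,
           RANK() OVER (
               PARTITION BY department_id ORDER BY salary DESC
           ) AS salary_rank,
           DENSE_RANK() OVER (
               PARTITION BY department_id ORDER BY salary DESC
           ) AS salary_dense_rank
    FROM employees
    WHERE department_id = 101
    ORDER BY salary DESC, employee_id
""").fetchall()

# Columns: employee_id, RANK(), DENSE_RANK()
print(rows)  # [(1, 1, 1), (2, 2, 2), (3, 2, 2), (6, 4, 3)]
```

After the two employees tied at rank 2, RANK() skips to 4, while DENSE_RANK() continues with 3.
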

5. Pivoting and Unpivoting


Pivoting transforms rows into columns, and unpivoting does the
reverse. These techniques are useful for reshaping data for analysis.
SQL
-- Pivot example
SELECT *
FROM (
SELECT product_id, year, sales
FROM sales_data
) src
PIVOT (
SUM(sales)
FOR year IN ([2021], [2022], [2023])
) pvt;

Explanation
This query pivots the sales_data table to transform rows into columns,
aggregating sales data by year for each product.
Sample Data
sales_data table:
product_id   year   sales
1            2021   500
1            2022   600
1            2023   700
2            2021   300
2            2022   400
2            2023   500

Query Execution
The query:
SQL
SELECT *
FROM (
SELECT product_id, year, sales
FROM sales_data
) src
PIVOT (
SUM(sales)
FOR year IN ([2021], [2022], [2023])
) pvt;

Result
Based on the sample data, the result will be:
product_id   2021   2022   2023
1            500    600    700
2            300    400    500

Explanation of the Result
• The PIVOT operator transforms the year values into columns.
• The SUM(sales) function aggregates the sales for each product_id by year.
This output shows the total sales for each
product across the specified years.
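
The PIVOT operator with bracketed column names is SQL Server syntax, and several other systems (MySQL, PostgreSQL, SQLite) have no PIVOT operator. A portable alternative is conditional aggregation: one SUM(CASE ...) per output column. The sketch below shows it in Python with an in-memory SQLite database, using the same sample data. The column names sales_2021, sales_2022, and sales_2023 are chosen here for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales_data (product_id INTEGER, year INTEGER, sales INTEGER);
INSERT INTO sales_data VALUES
    (1, 2021, 500), (1, 2022, 600), (1, 2023, 700),
    (2, 2021, 300), (2, 2022, 400), (2, 2023, 500);
""")

# Each CASE keeps the sales of one year and contributes 0 for the others,
# so each SUM becomes one pivoted column.
pivoted = conn.execute("""
    SELECT product_id,
           SUM(CASE WHEN year = 2021 THEN sales ELSE 0 END) AS sales_2021,
           SUM(CASE WHEN year = 2022 THEN sales ELSE 0 END) AS sales_2022,
           SUM(CASE WHEN year = 2023 THEN sales ELSE 0 END) AS sales_2023
    FROM sales_data
    GROUP BY product_id
    ORDER BY product_id
""").fetchall()
print(pivoted)  # [(1, 500, 600, 700), (2, 300, 400, 500)]
```

One difference to keep in mind: with ELSE 0, a product with no sales in a year shows 0, whereas PIVOT would show NULL for that cell.
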

6. Recursive Queries
Recursive queries are used to query hierarchical data, such as
organizational charts or file systems. They are defined using a CTE
with a recursive union.
SQL
WITH RECURSIVE EmployeeHierarchy AS (
SELECT employee_id, manager_id, 1 AS level
FROM employees
WHERE manager_id IS NULL
UNION ALL
SELECT e.employee_id, e.manager_id, eh.level + 1
FROM employees e
INNER JOIN EmployeeHierarchy eh ON e.manager_id =
eh.employee_id
)
SELECT * FROM EmployeeHierarchy;

Explanation
This query uses a recursive Common Table Expression (CTE) to build an employee
hierarchy. It starts with employees who have no manager (i.e., manager_id IS NULL)
and recursively finds their subordinates, incrementing the level each time.
Sample Data
employees table:
employee_id   manager_id
1             NULL
2             1
3             1
4             2
5             2
6             3

Query Execution
The query:
SQL
WITH RECURSIVE EmployeeHierarchy AS (
SELECT employee_id, manager_id, 1 AS level
FROM employees
WHERE manager_id IS NULL
UNION ALL
SELECT e.employee_id, e.manager_id, eh.level + 1
FROM employees e
INNER JOIN EmployeeHierarchy eh ON e.manager_id =
eh.employee_id
)
SELECT * FROM EmployeeHierarchy;

Result
Based on the sample data, the result will be:
employee_id   manager_id   level
1             NULL         1
2             1            2
3             1            2
4             2            3
5             2            3
6             3            3

Explanation of the Result
• Level 1: Employee 1 has no manager.
• Level 2: Employees 2 and 3 report to Employee 1.
• Level 3: Employees 4 and 5 report to Employee 2,
and Employee 6 reports to Employee 3.
This output shows the hierarchical structure of
employees, with each level indicating the depth of the
hierarchy.

7. Indexing
Indexes improve the speed of data retrieval operations on a
database table. Proper indexing can significantly enhance query
performance, especially for large datasets.
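
A quick way to see an index at work is to ask the database for its query plan before and after creating one. The sketch below does this in Python with an in-memory SQLite database. The orders table, the index name idx_orders_customer, and the plan helper are hypothetical, and the exact wording of the plan text varies between SQLite versions, so no exact output is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)


def plan(sql):
    """Return the description column of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


query = "SELECT amount FROM orders WHERE customer_id = 42"

plan_before = plan(query)  # without an index: a scan over the whole table
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan(query)   # with the index: a search that names the index

print(plan_before)
print(plan_after)
```

The speed-up is not free: every index takes extra storage and has to be updated on each insert, update, and delete, so it pays to index the columns that queries actually filter or join on.
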

8. Query Optimization
Optimizing queries involves techniques like reducing the number
of rows processed by filtering data early, avoiding unnecessary
columns in SELECT statements, and using efficient join operations.
Using Object-Relational Mapping Tools

Object-Relational Mapping Tools
Object–relational mapping in computer science is a programming
technique for converting data between a relational database and the
memory of an object-oriented programming language. This creates, in
effect, a virtual object database that can be used from within the
programming language.
• When interacting with a database using OOP languages,
you'll have to perform different operations like creating,
reading, updating, and deleting (CRUD) data from a
database. By design, you use SQL for performing these
operations in relational databases.

• While using SQL for this purpose isn't necessarily a bad
idea, ORM and ORM tools help simplify the
interaction between relational databases and different
OOP languages.
What is an ORM Tool?
• An ORM tool is software designed to help OOP
developers interact with relational databases. So
instead of creating your own ORM software from
scratch, you can make use of these tools.

• Here's an example of SQL code that retrieves
information about a particular user from a database:
SQL
SELECT id, name, email, country, phone_number FROM users
WHERE id = 20;
• The code above returns information about a user — name, email, country, and
phone_number — from a table called users. Using the WHERE clause, we specified
that the information should be from a user with an id of 20.
• On the other hand, an ORM tool can do the same query as above with simpler
methods.
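
As an illustration, here is a minimal sketch in Python of what such a method can look like. The User class, the UserRepository class with its get method, and the sample row are hypothetical and hand-written on top of the standard sqlite3 module. They do not reproduce the API of any particular ORM tool.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class User:
    """Plain object that mirrors one row of the users table."""
    id: int
    name: str
    email: str
    country: str
    phone_number: str


class UserRepository:
    """Toy mapper (hypothetical, not a real ORM's API): hides the SQL behind a method."""

    def __init__(self, conn):
        self.conn = conn

    def get(self, user_id):
        """Return the User with the given id, or None if there is no such row."""
        row = self.conn.execute(
            "SELECT id, name, email, country, phone_number "
            "FROM users WHERE id = ?",
            (user_id,),  # bound parameter, never string-formatted into the SQL
        ).fetchone()
        return User(*row) if row else None


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT, "
    "country TEXT, phone_number TEXT)"
)
# Made-up sample row for the demonstration.
conn.execute(
    "INSERT INTO users VALUES (20, 'Jane Doe', 'jane@example.com', 'PH', '555-0100')"
)

users = UserRepository(conn)
user = users.get(20)          # same lookup as the SQL query above
print(user.name, user.email)  # Jane Doe jane@example.com
```

The calling code works with a User object and a method call, and never sees the SQL string.
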
• So the code above does the same as the SQL query. Note
that every ORM tool is built differently so the methods are
never the same, but the general purpose is similar.

• ORM tools can generate methods like the one in the last
example.

• Most OOP languages have a variety of ORM tools that you
can choose from. Here are some of the most popular for
Java, Python, PHP, and .NET development:
Popular ORM Tools for Java

1. Hibernate
Hibernate enables developers to write persistent data
classes following OOP concepts like inheritance,
polymorphism, association, and composition. Hibernate is
highly performant and is also scalable.

2. Apache OpenJPA
Apache OpenJPA is also a Java persistence tool. It can be
used as a stand-alone POJO (plain old Java object)
persistence layer.

3. EclipseLink
EclipseLink is an open source Java persistence solution
for relational, XML, and database web services.

4. jOOQ
jOOQ generates Java code from your database schema.
You can also use this tool to build type-safe SQL queries.

5. Oracle TopLink
You can use Oracle TopLink to build high-performance
applications that store persistent data. The data can be
transformed into either relational data or XML elements.
Popular ORM Tools for PHP

1. Laravel
Laravel comes with an object-relational mapper called
Eloquent which makes interaction with databases easier.

2. CakePHP
CakePHP provides two object types: repositories, which give
you access to a collection of data, and entities, which
represent individual records of data.

3. Qcodo
Qcodo provides different commands that can be run in
the terminal to interact with databases.

4. RedBeanPHP
RedBeanPHP is a zero-config object-relational mapper.
Popular ORM Tools for .NET

1. Entity Framework
• Entity Framework is a multi-database object-database
mapper. It supports SQL Server, SQLite, MySQL, PostgreSQL,
and Azure Cosmos DB.

2. NHibernate
• NHibernate is an open source object-relational mapper
with tons of plugins and tools to make development easier
and faster.

3. Dapper
• Dapper is a micro-ORM. It is mainly used to map
queries to objects. This tool doesn't do most of
the things an ORM tool would do, like SQL
generation, caching results, lazy loading, and so on.

4. Base One Foundation Component Library (BFC)
• BFC is a framework for creating networked
database applications with Visual Studio and
DBMS software from Microsoft, Oracle, IBM,
Sybase, and MySQL.
Advantages of Using ORM Tools
Here are some of the advantages of using an ORM
tool:
• It speeds up development time for teams.
• Decreases the cost of development.
• Handles the logic required to interact with
databases.
• Improves security. ORM tools typically build parameterized
queries, which greatly reduces the risk of SQL injection attacks.
• You write less code when using ORM tools than
with SQL.
Disadvantages of Using ORM Tools

• Learning how to use ORM tools can be
time consuming.
• They are unlikely to perform well
when very complex queries are involved.
• ORMs are generally slower than hand-written SQL.
Summary
• In this article, we talked about Object Relational Mapping. This is a technique
used to connect object oriented programs to relational databases.
• We listed some of the popular ORM tools for different programming languages.
• We concluded with some of the advantages and disadvantages of using ORM
tools.
THANK YOU🩶🩶
