Managing and Visualisation of RDS Database Cloud Computing Project PDF

The document describes a methodology for managing and visualizing data from an RDS database using EC2 containers. Key steps include: 1. Creating a VPC with public and private subnets to isolate resources and regulate traffic. 2. Deploying an RDS MySQL database in the private subnet for high availability across availability zones. 3. Deploying an ECS cluster in the public subnet to run phpMyAdmin and Metabase containers. 4. Defining task definitions for the containers that specify images, resources, and configurations. 5. Adding EFS to the Metabase container to centrally store and access data files across multiple containers.

Uploaded by

Aditi Bhatia

Managing and Visualising an RDS database using EC2 containers

GAURAV AMARNANI D12A 02
CHETANIYA BAJAJ D12A 04
ADITI BHATIA D12A 06
INTRODUCTION
Containerized databases are becoming increasingly popular because they offer several advantages over traditional
databases that are installed directly on a host machine. Here are some of the key benefits of using a containerized database:

Consistency: Containerized databases are built to run consistently across different environments, which means that they
can be deployed more easily and with greater reliability than traditional databases.
Scalability: Containerization allows for easy horizontal scaling, meaning that you can easily add more instances of your
database to handle increased demand.
Portability: Containerized databases can be easily moved from one environment to another, making it easy to test and
deploy new versions of your application.
Isolation: By running the database in a container, you can ensure that it is isolated from the rest of the host system. This
makes it easier to manage dependencies and avoid conflicts.
Efficiency: Containers are lightweight and consume fewer resources than traditional virtual machines, making them more
efficient in terms of both memory and CPU usage.
Security: Containerization provides an additional layer of security by isolating the database from the rest of the system,
reducing the risk of security breaches.
PROBLEM DEFINITION
The primary aim of this project is to demonstrate the process of managing and visualizing data through the use of containers. The project
involves setting up a database to store information relevant to a computer store, such as customer lists, product lists, and order histories.

Two containers will be deployed to manage the database and visualize its contents.

To handle the database management, we will make use of a Database Management System (DBMS) tool. In this project, we have opted
for phpMyAdmin.

Data visualization tools are utilized to create visual representations of the stored data, such as charts and graphs. In this project,
Metabase, an open-source business intelligence platform, will be used for this purpose. Metabase can be embedded into any
application, enabling customers to explore their data independently.

The database will be deployed on Amazon RDS, a fully-managed database service provided by Amazon Web Services (AWS). Amazon
EC2 will be used to host the containers responsible for managing and visualizing the data.

To regulate access to the resources, a Virtual Private Cloud (VPC) will be set up to enable users to define a virtual network within AWS,
isolating resources and regulating network traffic. Within the VPC, we will establish a public subnet and a private subnet. Resources that
require public access, such as EC2 and Elastic File System (EFS) instances, will be placed in the public subnet. On the other hand,
resources that need to be accessible only internally, such as the RDS instance, will be placed in the private subnet.
LITERATURE SURVEY
The paper titled "Impact and Implications of Big Data Analytics" [1] discusses the potential of big data analytics in various industries and its
impact on society. It emphasizes the importance of leveraging big data to drive innovation and improve decision-making. The paper also
examines the challenges associated with big data analytics, such as data privacy concerns and the need for skilled professionals. Finally, the
paper recommends strategies for organizations to overcome these challenges and effectively utilize big data analytics to achieve their goals.

The paper "Performance Analysis of Various Server Hosting Techniques" [2] investigates and compares the performance of different server
hosting techniques, including AWS EC2, AWS ECS with Fargate, and AWS Lambda.

The paper "Performance of Containerized Database Management Systems" [4] evaluates the performance of containerized database
management systems (DBMS) and compares it with the traditional non-containerized approach. The study uses three different container
orchestration tools, Docker Swarm, Kubernetes, and Apache Mesos, to deploy the DBMS containers. The results show that containerized DBMS
performs comparably with non-containerized DBMS and can provide better scalability and fault tolerance. The paper also provides insights
into the trade-offs between performance and resource utilization for different container orchestration tools.

The article "Deploying Relational Databases in AWS" [3] provides an overview of how to deploy relational databases in Amazon Web Services
(AWS). It discusses different AWS database services, such as Amazon RDS and Amazon Aurora, and their features. The article also covers the
basics of setting up and configuring a database instance in AWS, including choosing the right database engine and selecting the appropriate
database instance type. Overall, the article serves as a useful guide for those looking to deploy and manage relational databases in AWS.
ARCHITECTURE
METHODOLOGY
Creation of VPC and Subnets

A virtual private cloud (VPC) is a secure, isolated private cloud hosted within a public cloud. VPC
customers can run code, store data, host websites, and do anything else they could do in an
ordinary private cloud, but the private cloud is hosted remotely by a public cloud provider.

Within the VPC, we will create two public subnets. These subnets will be used to interact with the
tools installed in the containers, such as the web-based GUIs for managing the database.

To ensure high availability and fault tolerance, we will choose two availability zones (AZs) for the
RDS database. This means that the database will be replicated across two different physical
locations, providing redundancy in case of a failure in one of the AZs.
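
The subnet layout described above can be sketched with Python's standard `ipaddress` module. This is only a planning aid, not the configuration used in the project: the VPC range `10.0.0.0/16`, the `/24` subnet size, and the AZ names are assumptions, since the slides do not state them.

```python
import ipaddress

# Assumption: the VPC address range is not given in the slides.
vpc_cidr = ipaddress.ip_network("10.0.0.0/16")

# Assumption: hypothetical AZ names; the slides only mention a "1a" zone.
availability_zones = ["us-east-1a", "us-east-1b"]

# Split the VPC range into /24 blocks and hand them out in order:
# one public subnet per AZ first, then one private subnet per AZ.
blocks = vpc_cidr.subnets(new_prefix=24)
plan = []
for tier in ("public", "private"):
    for az in availability_zones:
        plan.append({"tier": tier, "az": az, "cidr": next(blocks)})

for subnet in plan:
    print(f'{subnet["tier"]:8} {subnet["az"]:12} {subnet["cidr"]}')
```

Giving each tier one subnet per AZ matches the two-AZ requirement for the database, and keeping the blocks non-overlapping lets the route tables treat the public and private tiers differently.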
Creation of RDS MySQL Database

Amazon Relational Database Service (Amazon RDS) is a collection of managed services that
makes it simple to set up, operate, and scale databases in the cloud.

Set up a free-tier RDS instance of MySQL.

Select the newly created VPC and choose "Deny public access" for connectivity to ensure the
database is in a private subnet and not publicly accessible.

Choose the default VPC Security Group and validate the creation.
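
The same console choices could be written down as parameters for a scripted setup. The sketch below only builds a dictionary whose keys follow boto3's RDS `create_db_instance` call; it does not contact AWS. The instance class, storage size, identifiers, and credentials are placeholders assumed for illustration and do not come from the slides.

```python
def rds_mysql_params(identifier, username, password, subnet_group, security_group_id):
    """Return keyword arguments shaped for boto3's rds.create_db_instance."""
    return {
        "DBInstanceIdentifier": identifier,
        "Engine": "mysql",
        "DBInstanceClass": "db.t3.micro",  # assumption: a small, free-tier-style class
        "AllocatedStorage": 20,            # GiB; assumption
        "MasterUsername": username,
        "MasterUserPassword": password,
        "DBSubnetGroupName": subnet_group,           # subnets inside the project VPC
        "VpcSecurityGroupIds": [security_group_id],  # the VPC security group
        "PubliclyAccessible": False,                 # mirrors "Deny public access"
    }


# All values below are hypothetical placeholders.
params = rds_mysql_params(
    identifier="computer-store-db",
    username="admin",
    password="change-me",
    subnet_group="project-private-subnets",
    security_group_id="sg-00000000",
)

# Print everything except the password.
for key, value in params.items():
    print(key, "=", "****" if key == "MasterUserPassword" else value)
```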
Deploy ECS Cluster in EC2

An Amazon ECS cluster is a logical grouping of tasks or services. Your tasks and services are
run on infrastructure that is registered to a cluster.

A "Linux + Networking" cluster is created to host the phpMyAdmin and Metabase
containers.

For networking, we have linked the instance to the project VPC and selected the public
subnet 1a in the same AZ as the RDS database.
Task Definition of phpMyAdmin

A Task Definition is an element containing all the information related to the container (the
image used, the size of the container, the environment variables, etc.). It is therefore
necessary to create a Task Definition specific to the container, and it will be possible to
deploy the container from it.

A task definition for phpMyAdmin on Amazon ECS specifies how a container running
phpMyAdmin should be run, including which Docker image to use, CPU and memory
requirements, network and storage settings, and any container-specific configurations. It
provides a blueprint for how the container should be deployed and managed within the
ECS environment.

With this definition, ECS can manage the container and ensure that it is running as
expected, making it easy to deploy and manage phpMyAdmin in the cloud.
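
As an illustration of what such a blueprint contains, the sketch below builds a task definition as a Python dictionary and prints it as JSON. The field names follow the ECS task definition format, and `PMA_HOST` and `PMA_PORT` are environment variables read by the phpMyAdmin Docker image. The endpoint, memory and CPU sizes, and host port are assumptions, not the values used in the project.

```python
import json

# Placeholder: the real RDS endpoint is not given in the slides.
RDS_ENDPOINT = "computer-store.example.us-east-1.rds.amazonaws.com"

phpmyadmin_task = {
    "family": "phpmyadmin",
    "requiresCompatibilities": ["EC2"],
    "networkMode": "bridge",
    "containerDefinitions": [
        {
            "name": "phpmyadmin",
            "image": "phpmyadmin/phpmyadmin",  # public image on Docker Hub
            "memory": 256,                     # MiB; assumption
            "cpu": 128,                        # CPU units; assumption
            "essential": True,
            # Assumption: expose the web GUI on host port 8080.
            "portMappings": [
                {"containerPort": 80, "hostPort": 8080, "protocol": "tcp"}
            ],
            # Tell phpMyAdmin which MySQL server to connect to.
            "environment": [
                {"name": "PMA_HOST", "value": RDS_ENDPOINT},
                {"name": "PMA_PORT", "value": "3306"},
            ],
        }
    ],
}

print(json.dumps(phpmyadmin_task, indent=2))
```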
Task Definition of Metabase

A task definition for Metabase in Amazon ECS is a blueprint that defines how to run the
Metabase container in the ECS environment. It specifies the Docker image to be used, the
resources required by the container, how to handle networking and storage, and any
additional settings, such as environment variables or command overrides. Once the task
definition is created, it can be used to launch one or more containers as tasks in an ECS
service.
Addition of EFS to Metabase Container

Amazon Elastic File System (Amazon EFS) is a simple, serverless, set-and-forget, elastic file
system. There is no minimum fee or setup charge. You pay only for the storage you use, for
read and write access to data stored in Infrequent Access storage classes, and for any
provisioned throughput.

By adding EFS to a Metabase container, you can store and access Metabase's data files and
configuration files in a centralized and highly available location that can be shared across
multiple containers and instances.

This can help to simplify deployment, improve scalability, and increase resilience of your
Metabase deployment by ensuring that your data is always accessible, even if an individual
container or instance fails. With EFS, you can also easily scale your storage capacity up or
down as your data needs evolve, without having to manage the underlying infrastructure.
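
One way this could look in a task definition is sketched below: an EFS-backed volume is declared at task level and mounted into the Metabase container. The field names (`volumes`, `efsVolumeConfiguration`, `mountPoints`) follow the ECS task definition format, and `MB_DB_FILE` is the Metabase setting for the location of its application database file. The file system ID, mount path, and memory size are placeholders assumed for illustration.

```python
import json

EFS_ID = "fs-00000000"          # placeholder file system ID
VOLUME_NAME = "metabase-data"   # hypothetical volume name
MOUNT_PATH = "/metabase-data"   # hypothetical path inside the container

metabase_task = {
    "family": "metabase",
    "requiresCompatibilities": ["EC2"],
    # Task-level volume backed by the EFS file system.
    "volumes": [
        {
            "name": VOLUME_NAME,
            "efsVolumeConfiguration": {"fileSystemId": EFS_ID, "rootDirectory": "/"},
        }
    ],
    "containerDefinitions": [
        {
            "name": "metabase",
            "image": "metabase/metabase",
            "memory": 1024,  # MiB; assumption
            "essential": True,
            "portMappings": [{"containerPort": 3000, "hostPort": 3000}],
            # Mount the EFS volume and keep Metabase's own data on it.
            "mountPoints": [{"sourceVolume": VOLUME_NAME, "containerPath": MOUNT_PATH}],
            "environment": [
                {"name": "MB_DB_FILE", "value": f"{MOUNT_PATH}/metabase.db"}
            ],
        }
    ],
}


def unresolved_mounts(task):
    """Return mount points that reference a volume the task does not declare."""
    declared = {v["name"] for v in task.get("volumes", [])}
    return [
        m
        for c in task["containerDefinitions"]
        for m in c.get("mountPoints", [])
        if m["sourceVolume"] not in declared
    ]


print(json.dumps(metabase_task, indent=2))
print("unresolved mounts:", unresolved_mounts(metabase_task))
```

Because the data lives on the mounted volume rather than in the container's own file system, a replacement container started from the same task definition finds the existing Metabase data.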
Database Structure in phpMyAdmin
Customers Table
Products Table
Orders Table
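
The table definitions themselves are not reproduced in the text. The sketch below reconstructs a minimal schema from the columns that the dashboard queries reference (`products.id`, `products.name`, `products.category`, `orders.product_id`, `orders.customer_id`, `orders.cost`, `customers.id`, `customers.first_name`, `customers.last_name`). Column types, the `orders.id` column, and all sample rows are assumptions, and an in-memory SQLite database stands in for the RDS MySQL instance so the example runs locally.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the RDS MySQL database

# Schema inferred from the dashboard queries; types are assumptions.
conn.executescript("""
CREATE TABLE customers (
    id INTEGER PRIMARY KEY,
    first_name TEXT,
    last_name TEXT
);
CREATE TABLE products (
    id INTEGER PRIMARY KEY,
    name TEXT,
    category TEXT
);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(id),
    product_id INTEGER REFERENCES products(id),
    cost REAL
);

-- Made-up sample rows for illustration only.
INSERT INTO customers VALUES (1, 'Asha', 'Rao'), (2, 'Ben', 'Lee');
INSERT INTO products VALUES (1, 'Laptop', 'Computers'), (2, 'Mouse', 'Accessories');
INSERT INTO orders VALUES (1, 1, 1, 900.0), (2, 1, 2, 20.0), (3, 2, 2, 20.0);
""")

# List each order with the customer and product it links to.
order_lines = conn.execute("""
    SELECT orders.id, customers.first_name, products.name, orders.cost
    FROM orders
    JOIN customers ON customers.id = orders.customer_id
    JOIN products ON products.id = orders.product_id
    ORDER BY orders.id
""").fetchall()

for line in order_lines:
    print(line)
```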
Creating dashboards with Metabase

MOST PURCHASED PRODUCTS

SQL Query

SELECT products.name, COUNT(*) AS purchased
FROM products, orders
WHERE products.id = orders.product_id
GROUP BY products.name
ORDER BY COUNT(*) DESC;

MOST PURCHASED CATEGORIES OF PRODUCTS

SQL Query

SELECT products.category, COUNT(*) AS purchased
FROM products, orders
WHERE products.id = orders.product_id
GROUP BY products.category
ORDER BY COUNT(*) DESC;

BEST CUSTOMERS

SQL Query

SELECT CONCAT(customers.first_name, ' ', customers.last_name) AS customer,
       COUNT(*) AS purchases
FROM customers, orders
WHERE customers.id = orders.customer_id
GROUP BY customer
ORDER BY COUNT(*) DESC;
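
A point worth checking with this query is that it groups by the displayed name, so two different customers who share a name are counted as one. The sketch below shows the difference on made-up data, using an in-memory SQLite database as a stand-in for MySQL. SQLite's `||` operator replaces `CONCAT` here, and grouping by `customers.id` is a suggested variation, not the query used in the project.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, cost REAL);

-- Made-up rows: customers 1 and 2 are different people with the same name.
INSERT INTO customers VALUES (1, 'Sam', 'Roy'), (2, 'Sam', 'Roy'), (3, 'Ben', 'Lee');
INSERT INTO orders VALUES
    (1, 1, 10.0), (2, 1, 10.0), (3, 2, 10.0), (4, 3, 10.0), (5, 3, 10.0);
""")

# Grouping by the displayed name, as in the dashboard query.
by_name = conn.execute("""
    SELECT customers.first_name || ' ' || customers.last_name AS customer,
           COUNT(*) AS purchases
    FROM customers JOIN orders ON customers.id = orders.customer_id
    GROUP BY customer
    ORDER BY purchases DESC
""").fetchall()

# Grouping by the primary key keeps namesakes apart.
by_id = conn.execute("""
    SELECT customers.id,
           customers.first_name || ' ' || customers.last_name AS customer,
           COUNT(*) AS purchases
    FROM customers JOIN orders ON customers.id = orders.customer_id
    GROUP BY customers.id
    ORDER BY purchases DESC, customers.id
""").fetchall()

print("by name:", by_name)
print("by id:  ", by_id)
```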

SALES REVENUE

SQL Query

SELECT SUM(orders.cost) FROM orders;
CONCLUSION
This project has successfully demonstrated the process of managing and visualizing data using
containers. By deploying a database management system tool (phpMyAdmin) and a data
visualization tool (Metabase), we were able to effectively manage and visualize data stored in
an Amazon RDS instance. The use of Amazon EC2 and VPC provided a scalable and secure
environment for deploying the containers, and Amazon RDS offered a fully managed database
service, freeing us from routine administrative tasks. Overall, this project highlights the
potential benefits of containerization and cloud computing in managing and visualizing data
for businesses and organizations.
