Notes Unit IV & V
3. Multi-Cloud Service: Clients use a dedicated service to access multiple clouds. The
client hosts this service either in-house or externally, and the service contains broker
components. Inter-cloud initiatives such as OPTIMIS, Contrail, mOSAIC, and STRATOS, as
well as commercial cloud-management solutions, leverage multi-cloud services.
4. Multi-Cloud Libraries: Clients build their own brokers by using a uniform cloud API
provided as a library. Inter-clouds that employ libraries make it easier to use multiple
clouds in a consistent way. Examples of multi-cloud libraries include the Java library
Apache jclouds, the Python library Apache Libcloud, and the Ruby library Apache
Deltacloud, as the sketch below illustrates.
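As a hedged illustration of how such a uniform library hides provider differences, the short Python sketch below uses Apache Libcloud to list nodes on two different clouds through the same calls; the credentials, region, and project values are placeholders, not real accounts.

# Minimal sketch: Apache Libcloud's uniform API driving two different providers.
# Credentials, region and project values below are placeholders.
from libcloud.compute.types import Provider
from libcloud.compute.providers import get_driver

Ec2 = get_driver(Provider.EC2)
Gce = get_driver(Provider.GCE)

ec2 = Ec2("ACCESS_KEY", "SECRET_KEY", region="us-east-1")
gce = Gce("service-account@example.iam.gserviceaccount.com", "key.json", project="my-project")

# The same methods work regardless of the underlying provider.
for driver in (ec2, gce):
    for node in driver.list_nodes():
        print(driver.name, node.name, node.state)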
Resource provisioning in the context of Computer Science refers to the technique of allocating
virtualized resources to users based on their demands and needs. It involves creating and
assigning virtual machines to users in order to meet quality of service parameters and match
upcoming workloads.
The allocation of resources and services from a cloud provider to a customer is known as
resource provisioning in cloud computing, sometimes called cloud provisioning. Resource
provisioning is the process of choosing, deploying, and managing software (like load balancers
and database server management systems) and hardware resources (including CPU, storage, and
networks) to assure application performance.
To effectively utilize the resources without going against SLA and achieving the QoS
requirements, Static Provisioning/Dynamic Provisioning and Static/Dynamic Allocation of
resources must be established based on the application needs. Resource over- and under-
provisioning must be prevented. Power usage is another significant constraint: care should be
taken to reduce power consumption and heat dissipation and to optimize VM placement, and
there should be techniques in place to avoid excess power consumption.
Therefore, the ultimate objective of a cloud user is to rent resources at the lowest possible cost,
while the objective of a cloud service provider is to maximize profit by effectively distributing
resources.
Scalability: Being able to actively scale up and down with flux in demand for resources is
one of the major points of cloud computing
Speed: Users can quickly spin up multiple machines as per their usage without the need for
an IT Administrator
Savings: The pay-as-you-go model allows for enormous cost savings for users; it is facilitated
by provisioning or removing resources according to demand, as the sketch below illustrates.
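As a rough sketch of how dynamic provisioning can track demand, the snippet below scales a pool of virtual machines up or down when average utilization crosses fixed thresholds; the thresholds, pool limits, and the get_average_utilization/provision_vm/release_vm helpers are hypothetical stand-ins for the monitoring and provisioning calls of a real cloud, not part of any specific API.

# Hypothetical threshold-based dynamic provisioning sketch (not a real cloud API).
SCALE_UP_THRESHOLD = 0.80    # scale out above 80% average utilization
SCALE_DOWN_THRESHOLD = 0.30  # scale in below 30% average utilization
MIN_VMS, MAX_VMS = 2, 20     # keep the pool within SLA-safe bounds

def autoscale(vm_pool, get_average_utilization, provision_vm, release_vm):
    """Adjust the pool size by one VM per call, based on current utilization."""
    utilization = get_average_utilization(vm_pool)
    if utilization > SCALE_UP_THRESHOLD and len(vm_pool) < MAX_VMS:
        vm_pool.append(provision_vm())      # avoid under-provisioning
    elif utilization < SCALE_DOWN_THRESHOLD and len(vm_pool) > MIN_VMS:
        release_vm(vm_pool.pop())           # avoid over-provisioning and wasted power
    return vm_pool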
Efficient resource allocation is critical for optimizing cloud infrastructure, but automated
provisioning also introduces its own challenges:
Complex management: Cloud providers have to use a variety of tools and techniques to
actively monitor the usage of resources
Policy enforcement: Organisations have to ensure that users are not able to access the
resources they shouldn’t.
Cost: With automated provisioning, costs can rise very quickly if proper checks are not put
in place; alerts on reaching cost thresholds are required.
Introduction
Cloud Exchange (CEx) serves as a market maker, bringing service providers and
users together. The University of Melbourne proposed it under Intercloud architecture
(Cloudbus). It supports brokering and exchanging cloud resources for scaling
applications across multiple clouds. It aggregates the infrastructure demands from
application brokers and evaluates them against the available supply. It supports the
trading of cloud services based on competitive economic models such as commodity
markets and auctions.
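As a toy illustration of the auction-style trading such an exchange supports, the sketch below clears consumer bids against provider asks in a simple double-auction fashion; the matching rule and the example prices are illustrative assumptions, not the actual CEx mechanism.

# Toy double-auction matcher, illustrating (not reproducing) how a cloud exchange
# could clear consumer bids against provider asks. Prices/quantities are made up.
def match_orders(bids, asks):
    """bids/asks: lists of (name, price_per_unit, units). Returns matched trades."""
    bids = sorted(bids, key=lambda b: b[1], reverse=True)  # highest bid first
    asks = sorted(asks, key=lambda a: a[1])                # lowest ask first
    matches = []
    while bids and asks and bids[0][1] >= asks[0][1]:
        buyer, bid_price, bid_units = bids.pop(0)
        seller, ask_price, ask_units = asks.pop(0)
        units = min(bid_units, ask_units)
        matches.append((buyer, seller, units, (bid_price + ask_price) / 2))
        if bid_units > units:
            bids.insert(0, (buyer, bid_price, bid_units - units))
        if ask_units > units:
            asks.insert(0, (seller, ask_price, ask_units - units))
    return matches

print(match_orders(bids=[("broker-A", 0.12, 50), ("broker-B", 0.09, 30)],
                   asks=[("provider-X", 0.08, 40), ("provider-Y", 0.11, 60)]))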
Figure: Global exchange of cloud resources
Source: https://snscourseware.org/snsctnew/files/1583815568.pdf
Now we will talk about various entities of the Global exchange of cloud resources.
Global Exchange of Cloud Resources
An open compute exchange may provide a centralized point where cloud consumers and
providers can decide which cloud resources they want to utilize, as well as a clearing
house for providers with excess capacity. Another example may be based on geographical
cloud computing.
Market directory
Banking system
Brokers
Price setting mechanism
Admission control mechanism
Resource management system
Consumer's utility function
Resource management proxy
Challenges:
Unwillingness to shift from the traditional controlled environment.
Regulatory pressure
How to obtain restitution in case of SLA violation.
Security Overview
Cloud computing has emerged as a widely accepted approach for accessing resources
remotely while simultaneously reducing costs. Cloud computing security concerns can be
effectively mitigated through proper configuration of your cloud resources.
Misconfiguration is the top cloud computing security challenge, as users must
appropriately protect their data and applications in the cloud.
To prevent this cloud security threat, users must ensure their data is protected, and
applications are configured correctly. It can be accomplished using a cloud storage
service that offers security features such as encryption or access control. Additionally,
implementing security measures such as authentication and password requirements can
help protect sensitive data in the cloud. By taking these steps, users can increase the
security of their cloud computing infrastructure and stay protected from cyber threats.
Unauthorized Access
Unauthorized access to data is one of the most common cloud security problems
businesses face. The cloud provides a convenient way for companies to store and access
data, which can make data vulnerable to cyber threats. Security and cloud computing
threats can include unauthorized access to user data, theft of data, and malware attacks.
To protect their data from these threats, businesses must ensure that only authorized users
can access it. Another security feature businesses can implement is encrypting sensitive
data in the cloud. It will help ensure that only authorized users can access it. By
implementing security measures such as encryption and backup procedures, businesses
can safeguard their data from unauthorized access and ensure its integrity.
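To make the encryption advice above concrete, here is a minimal sketch of encrypting data on the client before it is uploaded to cloud storage, using the symmetric Fernet scheme from the Python cryptography package; keeping the key in a local variable is a deliberate simplification, and a real deployment would hold it in a key-management service.

# Minimal client-side encryption sketch using the cryptography package.
from cryptography.fernet import Fernet

key = Fernet.generate_key()              # symmetric key; store it securely, not in code
cipher = Fernet(key)

plaintext = b"customer-record: example data"
ciphertext = cipher.encrypt(plaintext)   # this ciphertext is what gets uploaded

# ... later, after downloading the object from cloud storage ...
assert cipher.decrypt(ciphertext) == plaintext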
Hijacking of Accounts
Hijacking of user accounts is one of the major cloud security issues. Using cloud-based
applications and services will increase the risk of account hijacking. As a result, users
must be vigilant about protecting their passwords and other confidential information to
stay secure in the cloud.
Users can protect themselves using strong passwords, security questions, and two-factor
authentication to access their accounts. They can also monitor their account activity and
take steps to protect themselves from unauthorized access or usage. This will help ensure
that hackers cannot access their data or hijack their accounts. Overall, staying vigilant
about security and updating your security measures are vital to the security of cloud
computing.
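As one concrete form of the two-factor authentication mentioned above, the sketch below verifies a time-based one-time password (TOTP) using the third-party pyotp library; the secret is generated on the spot purely for illustration and would normally be provisioned once per user and stored server-side.

# Sketch of TOTP-based two-factor verification with the pyotp library.
import pyotp

secret = pyotp.random_base32()       # provisioned once per user account (illustrative)
totp = pyotp.TOTP(secret)

print("Current one-time code:", totp.now())

# On login, the service checks the code the user typed in:
user_code = totp.now()               # stand-in for the user's input
print("Accepted:", totp.verify(user_code))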
Lack of Visibility
Cloud computing has made it easier for businesses to access and store their data online,
but this convenience comes with risks. As a result, companies need to protect their data
from unauthorized access and theft. However, cloud computing also poses security threats
due to its reliance on remote servers. To ensure that their systems are accessible only to
authorized sources, businesses must implement security measures such as strong
authentication, data loss prevention (DLP), data breach detection, and data breach
response.
With cloud computing, visibility is vital, and businesses must regularly audit security
operations and procedures to detect vulnerabilities and threats before they become a real
problem. By taking the necessary precautions and implementing security in cloud
computing, organizations can ensure that their data remains secure in this cloud-based
environment.
Data Privacy/Confidentiality
Data privacy and confidentiality are critical issues when it comes to cloud computing.
With cloud computing, businesses can access their data from anywhere worldwide,
raising concerns about securing cloud computing. Companies don’t have control over
who can access their data, so they must ensure that only authorized users can access it.
Data breaches can happen when hackers gain access to company data. In the coming
years, there will be even more data privacy and confidentiality issues due to the rise of
big data and the increased use of cloud computing in business.
Data privacy and confidentiality issues will continue to be essential concerns for
businesses in the years ahead as data-intensive applications grow in popularity. Proper
security measures and data-handling practices help a cloud-ready organization avoid
data breach risks.
External Sharing of Data
External data sharing is one of the leading security issues in cloud computing that
businesses face. This issue arises when data is shared with third-party providers who must
be vetted and approved by the organization. As a result, external data sharing can lead to
the loss of critical business information and theft and fraud. To prevent these issues in
cloud security, companies must implement robust security measures, such as encryption
and data management practices. In addition, it will help ensure that sensitive data remains
secure and confidential.
By implementing appropriate security measures, companies can protect their data from
unauthorized access and ensure its reliability and integrity. Overall, external data sharing
is a major cloud security concern that businesses must address to stay ahead of the
competition.
A cloud is a powerful tool that can help organizations reduce costs and improve the
efficiency of their operations. However, cloud computing presents new security
challenges that must be addressed to protect data and ensure compliance with legal and
regulatory requirements.
Organizations must ensure data security for cloud and comply with legal and regulatory
requirements to ensure the safety and integrity of their cloud-based systems. Cyber threats
such as malware, data breaches, and phishing are just a few challenges organizations face
when using cloud computing.
To combat these cloud-based security issues, it is vital to perform regular security audits,
maintain up-to-date security configurations, implement robust authentication procedures,
use strong passwords, use multi-factor authentication methods, and regularly update
software and operating systems. While cloud computing can increase the risk of
cyberattacks, organizations that are diligent about their security posture can stay ahead of
their competitors in this rapidly changing market.
Unsecured Third-Party Resources
Third-party resources are applications, websites, and services outside the cloud provider's
control. These resources may have cloud security vulnerabilities, and unauthorized access
to your data is possible. Additionally, unsecured third-party resources may allow hackers
to access your cloud data. These vulnerabilities can put your security at risk. Therefore,
ensuring that only trusted, secure resources are used for cloud computing is essential. In
addition, it will help ensure that only authorized individuals access data and reduce the
risk of unauthorized data loss or breach.
Unsecured third-party resources can pose a threat to cloud security, especially when
interacting with sensitive data in cloud storage accounts. Hackers can access these
resources to gain access to your cloud data and systems. Implementing strong security
controls such as multi-factor authentication and enforcing strict password policies can
help safeguard against this risk. In addition, by restricting access to only trusted
resources, you can ensure that only authorized individuals access data and reduce the risk
of unauthorized data loss or breach.
In this part, we will discuss an overview of cloud computing and its need, with the main
focus on security issues in cloud computing. Let's discuss them one by one.
Cloud Computing:
Cloud Computing is a type of technology that provides remote services over the internet to
manage, access, and store data rather than storing it on servers or local drives. This
technology is sometimes loosely referred to as serverless technology. Here the data can be
anything, such as images, audio, video, documents, and files.
5. Lack of Skill –
Shifting to another service provider, needing an extra feature, or figuring out how to use a
feature are common problems in IT companies that do not have skilled employees. So working
with cloud computing requires skilled personnel.
OpenStack Architecture
Introduction
In 2010, OpenStack began as a joint project of NASA and Rackspace Hosting. It is managed
by the OpenStack Foundation, a non-profit entity established in September 2012 to promote
the OpenStack community and software. More than 50 enterprises have joined the project.
Architecture of OpenStack
OpenStack contains a modular architecture along with several code names for the
components.
Introduction to OpenStack
It is a free, open-standard cloud computing platform that first came into existence on
July 21, 2010. It was a joint project of Rackspace Hosting and NASA to make cloud
computing more ubiquitous in nature. It is deployed as Infrastructure-as-a-Service (IaaS)
in both public and private clouds, where virtual resources are made available to the users.
The software platform contains interrelated components that control multi-vendor hardware
pools of processing, storage, and networking resources throughout a data center. In
OpenStack, the tools which are used to build this platform
are referred to as “projects”. These projects handle a large number of services
including computing, networking, and storage services. Unlike virtualization, in
which resources such as RAM, CPU, etc are abstracted from the hardware using
hypervisors, OpenStack uses a number of APIs to abstract those resources so that
users and the administrators are able to directly interact with the cloud services.
OpenStack components
Apart from the various projects which constitute the OpenStack platform, there are nine
major services, namely Nova, Neutron, Swift, Cinder, Keystone, Glance, Horizon, Ceilometer,
and Heat. Here is a basic definition of each component, which will give us a basic idea
about these components.
1. Nova (compute service): It manages the compute resources like creating, deleting,
and handling the scheduling. It can be seen as a program dedicated to the
automation of resources that are responsible for the virtualization of services and
high-performance computing.
2. Neutron (networking service): It is responsible for connecting all the networks
across OpenStack. It is an API driven service that manages all networks and IP
addresses.
3. Swift (object storage): It is an object storage service with high fault-tolerance
capabilities, and it is used to store and retrieve unstructured data objects with the help of a RESTful
API. Being a distributed platform, it is also used to provide redundant storage
within servers that are clustered together. It is able to successfully manage
petabytes of data.
4. Cinder (block storage): It is responsible for providing persistent block storage
that is made accessible using an API (self- service). Consequently, it allows users
to define and manage the amount of cloud storage required.
5. Keystone (identity service provider): It is responsible for all types of
authentications and authorizations in the OpenStack services. It is a directory-
based service that uses a central repository to map the correct services with the
correct user.
6. Glance (image service provider): It is responsible for registering, storing, and
retrieving virtual disk images from the complete network. These images are stored
in a wide range of back-end systems.
7. Horizon (dashboard): It is responsible for providing a web-based interface for
OpenStack services. It is used to manage, provision, and monitor cloud resources.
8. Ceilometer (telemetry): It is responsible for metering and billing of services used.
Also, it is used to generate alarms when a certain threshold is exceeded.
9. Heat (orchestration): It is used for on-demand service provisioning with auto-scaling
of cloud resources. It works in coordination with Ceilometer.
These are the services around which this platform revolves. They individually handle
storage, compute, networking, identity, and so on. These services form the base on which
the rest of the projects rely, and they make it possible to orchestrate services, allow
bare-metal provisioning, handle dashboards, etc.
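As a hedged illustration of how these services are driven through their APIs, the sketch below uses the openstacksdk Python library to boot a server via Nova; the cloud name, image, flavor, and network IDs are placeholders that would come from your own deployment's clouds.yaml, Glance images, and Neutron networks.

# Sketch: launching a compute instance through the openstacksdk library.
# "mycloud" must be defined in clouds.yaml; the UUIDs below are placeholders.
import openstack

conn = openstack.connect(cloud="mycloud")   # authentication goes through Keystone

server = conn.compute.create_server(
    name="demo-vm",
    image_id="IMAGE_UUID",                  # image registered in Glance
    flavor_id="FLAVOR_UUID",                # flavor defined in Nova
    networks=[{"uuid": "NETWORK_UUID"}],    # network managed by Neutron
)
server = conn.compute.wait_for_server(server)
print("Server status:", server.status)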
Features of OpenStack
Modular architecture: OpenStack is designed with a modular architecture that
enables users to deploy only the components they need. This makes it easier to
customize and scale the platform to meet specific business requirements.
Multi-tenancy support: OpenStack provides multi-tenancy support, which
enables multiple users to access the same cloud infrastructure while maintaining
security and isolation between them. This is particularly important for cloud
service providers who need to offer services to multiple customers.
Open-source software: OpenStack is an open-source software platform that is
free to use and modify. This enables users to customize the platform to meet their
specific requirements, without the need for expensive proprietary software
licenses.
Distributed architecture: OpenStack is designed with a distributed architecture
that enables users to scale their cloud infrastructure horizontally across multiple
physical servers. This makes it easier to handle large workloads and improve
system performance.
API-driven: OpenStack is API-driven, which means that all components can be
accessed and controlled through a set of APIs. This makes it easier to automate and
integrate with other tools and services.
Comprehensive dashboard: OpenStack provides a comprehensive dashboard that
enables users to manage their cloud infrastructure and resources through a user-
friendly web interface. This makes it easier to monitor and manage cloud resources
without the need for specialized technical skills.
Resource pooling: OpenStack enables users to pool computing, storage, and
networking resources, which can be dynamically allocated and de-allocated based
on demand. This enables users to optimize resource utilization and reduce waste.
Nova (Compute)
Nova is the OpenStack project that provides a way to provision compute instances. Nova
supports creating virtual machines and bare-metal servers and has limited support for
system containers. It runs as a set of daemons on top of existing Linux servers to provide
that service.
What is VirtualBox?
It allows users to create and run virtual machines on their computers, enabling
them to install and run multiple operating systems simultaneously. This is
particularly useful for testing software, experimenting with different
configurations, and isolating environments for increased security.
In this article, you will learn about Oracle VirtualBox. Some of you may already use
VirtualBox to run more than one operating system, such as Linux, on your computer. So
basically, it is software that enables us to run operating systems like Ubuntu, Windows,
and many others. This part describes its origin, usage, and ownership.
What is a Virtual Box?
VirtualBox, also known as VB, is developed by Oracle Corporation. It acts as a hypervisor
for x86 machines. Originally, it was created by Innotek GmbH, which made it publicly
available in 2007. It was then acquired by Sun Microsystems in 2008. Since Oracle's
acquisition of Sun, it has been developed by Oracle and is referred to as Oracle VM
VirtualBox. VirtualBox comes in a variety of flavors, depending on the operating
system for which it is configured. VirtualBox Ubuntu is more common, however,
VirtualBox for Windows is also popular. With the introduction of Android phones,
VirtualBox for Android has emerged as the new face of virtual machines in
smartphones.
Use of Virtual Box
In general, a Virtual Box is a software virtualization program that may be run as an
application on any operating system. It's one of the numerous advantages of Virtual
Box. It supports the installation of additional operating systems, known as Guest OS.
It can then set up and administer guest virtual machines, each with its own
operating system and virtual environment. Virtual Box runs on several
operating systems, including Windows XP, Windows 7, Linux, Windows Vista, Mac
OS X, Solaris, and Open Solaris. Windows, Linux, OS/2, BSD, Haiku, and other
guest operating systems are supported in various versions and derivatives.
It can be used in the following kinds of projects:
Software portability
Application development
System testing and debugging
Network simulation
General computing
Advantages of Virtual Box
Isolation - A virtual machine's isolated environment is suitable for testing software
or running programmes that demand more resources than are accessible in other
settings.
Virtualization- VirtualBox allows users to run another OS on a single computer
without purchasing a new device. It generates a virtual machine that functions just
like a real computer, with its own processing cores, RAM, and hard disc space
dedicated only to the virtual environment.
Cross-Platform Compatibility- VirtualBox can run Windows, Linux, Solaris,
Open Solaris, and MacOS as its host operating system (OS). Users do not have to
be concerned about compatibility difficulties while setting up virtual computers on
numerous devices or platforms.
Easy Control Panel- VirtualBox's simple control interface makes it easier to
configure parameters like CPU cores and RAM. Users may begin working on their
projects within a few moments of installing the software program on their PCs or
laptops.
Multiple Modes- Users have control over how they interact with their installations,
whether in full-screen mode, seamless window mode, scaled window mode, or with 3D
graphics acceleration. This allows users to customize their experience according to the
kind of project they are working on.
Disadvantages of Virtual Box
VirtualBox, however, relies on the computer's hardware. Thus, the virtual machine
will only be effective if the host is faster and more powerful. As a result,
VirtualBox is dependent on its host computer.
If the host computer has any defects and the OS only has one virtual machine, just
that system will be affected; if there are several virtual machines operating on the
same OS, all of them would be affected.
Though these machines act like real machines, they are not genuine; every request must
pass through the host CPU, which results in slower performance. So, when compared to real
computers, these virtual machines are not as efficient.
Differences Between VMware and VirtualBox
VMware: Offers virtualization at the hardware level. VirtualBox: Offers virtualization at
both the hardware and software levels.
VMware: 3D acceleration is enabled by default. VirtualBox: 3D acceleration needs to be
manually enabled.
One of the three components of Hadoop is Map Reduce. The first component of
Hadoop that is, Hadoop Distributed File System (HDFS) is responsible for storing the
file. The second component that is, Map Reduce is responsible for processing the
file.
MapReduce has mainly two tasks, which are divided phase-wise: in the first phase Map is
utilised, and in the next phase Reduce is utilised.
Map and Reduce interfaces
Suppose there is a word file containing some text. Let us name this file as sample.txt.
Note that we use Hadoop to deal with huge files but for the sake of easy explanation
over here, we are taking a text file as an example. So, let's assume that this sample.txt
file contains a few lines of text. The content of the file is as follows:
Hello I am GeeksforGeeks
How can I help you
How can I assist you
Are you an engineer
Are you looking for coding
Are you looking for interview questions
what are you doing these days
what are your strengths
Hence, the above 8 lines are the content of the file. Let’s assume that while storing
this file in Hadoop, HDFS broke this file into four parts and named each part as
first.txt, second.txt, third.txt, and fourth.txt. So, you can easily see that the above file
will be divided into four equal parts and each part will contain 2 lines. First two lines
will be in the file first.txt, next two lines in second.txt, next two in third.txt and the last
two lines will be stored in fourth.txt. All these files will be stored in Data Nodes and
the Name Node will contain the metadata about them. All this is the task of HDFS.
Now, suppose a user wants to process this file. This is where Map-Reduce comes into
the picture. Suppose this user wants to run a query on this sample.txt. So, instead of
bringing sample.txt on the local computer, we will send this query on the data. To
keep a track of our request, we use Job Tracker (a master service). Job Tracker traps
our request and keeps track of it. Now suppose that the user wants to run his query
on sample.txt and wants the output in the result.output file. Let the name of the file
containing the query be query.jar. So, the user will write a query like:
$ hadoop jar query.jar DriverCode sample.txt result.output
1. query.jar : query file that needs to be processed on the input file.
2. sample.txt: input file.
3. result.output: directory in which output of the processing will be received.
So, now the Job Tracker traps this request and asks Name Node to run this request on
sample.txt. Name Node then provides the metadata to the Job Tracker. Job Tracker
now knows that sample.txt is stored in first.txt, second.txt, third.txt, and fourth.txt. As
all these four files have three copies stored in HDFS, so the Job Tracker
communicates with the Task Tracker (a slave service) of each of these files but it
communicates with only one copy of each file which is residing nearest to
it. Note: Applying the desired code on the local first.txt, second.txt, third.txt and
fourth.txt files is a process. This process is called Map. In Hadoop terminology, the main
file sample.txt is called the input file and its four subfiles are called input splits. So,
in Hadoop the number of mappers for an input file is equal to the number of input splits
of this input file. In the above case, the input file sample.txt has four input splits,
hence four mappers will be running to process it. The responsibility of handling these
mappers is of Job Tracker. Note that the task trackers are slave services to the Job
Tracker. So, in case any of the local machines breaks down then the processing over
that part of the file will stop and it will halt the complete process. So, each task
tracker sends heartbeat and its number of slots to Job Tracker in every 3 seconds.
This is called the status of Task Trackers. In case any task tracker goes down, the Job
Tracker then waits for 10 heartbeat times, that is, 30 seconds, and even after that if it
does not get any status, then it assumes that either the task tracker is dead or is
extremely busy. So it then communicates with the task tracker of another copy of the
same file and directs it to process the desired code over it. Similarly, the slot
information is used by the Job Tracker to keep a track of how many tasks are being currently
served by the task tracker and how many more tasks can be assigned to it. In this way, the Job
Tracker keeps track of our request. Now, suppose that the system has generated output for
individual first.txt, second.txt, third.txt, and fourth.txt. But this is not the user’s desired output. To
produce the desired output, all these individual outputs have to be merged or reduced to a single
output. This reduction of multiple outputs to a single one is also a process which is done
by the REDUCER. In Hadoop, as many output files are generated as there are reducers.
By default, there is always one reducer per cluster. Note: Map and Reduce are
two different processes of the second component of Hadoop, that is, Map Reduce. These are also
called phases of Map Reduce. Thus we can say that Map Reduce has two phases. Phase 1 is Map
and Phase 2 is Reduce.
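To make the two phases concrete, the following minimal, in-memory sketch imitates what happens conceptually across the four input splits; it is plain Python written for illustration, not actual Hadoop code.

# Conceptual (non-Hadoop) sketch of the Map and Reduce phases for word count.
from collections import defaultdict

splits = [
    "Hello I am GeeksforGeeks\nHow can I help you",
    "How can I assist you\nAre you an engineer",
    "Are you looking for coding\nAre you looking for interview questions",
    "what are you doing these days\nwhat are your strengths",
]

# Map phase: each split independently emits (word, 1) pairs.
def map_phase(split_text):
    return [(word, 1) for word in split_text.split()]

mapped = [pair for split in splits for pair in map_phase(split)]

# Shuffle + Reduce phase: group the pairs by key and sum the values.
counts = defaultdict(int)
for word, one in mapped:
    counts[word] += one

print(counts["How"])   # 2
print(counts["Are"])   # 3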
Functioning of Map Reduce
Now, let us move back to our sample.txt file with the same content. Again it is divided into
four input splits, namely first.txt, second.txt, third.txt, and fourth.txt. Now, suppose we
want to count the number of occurrences of each word in the file, whose content looks like:
Hello I am GeeksforGeeks
How can I help you
How can I assist you
Are you an engineer
Are you looking for coding
Are you looking for interview questions
what are you doing these days
what are your strengths
Then the output of the ‘word count’ code will be like:
Hello - 1
I - 1
am - 1
geeksforgeeks - 1
How - 2 (How is written two times in the entire file)
Similarly
Are - 3
are - 2
….and so on
Thus in order to get this output, the user will have to send his query on the data.
Suppose the query ‘word count’ is in the file wordcount.jar. So, the query will look
like:
$ hadoop jar wordcount.jar DriverCode sample.txt result.output
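The notes above assume a Java driver packaged in wordcount.jar. As an alternative, hedged illustration, the same word count can be expressed as a mapper/reducer pair for Hadoop Streaming written in Python; the file name and the streaming-jar path in the comment are placeholders that depend on the installation.

# wordcount_streaming.py -- mapper and reducer for Hadoop Streaming (illustrative).
# Invoked roughly as (paths are placeholders for your installation):
#   hadoop jar hadoop-streaming.jar -input sample.txt -output result.output \
#     -mapper "python3 wordcount_streaming.py map" \
#     -reducer "python3 wordcount_streaming.py reduce"
import sys

def mapper():
    # Map phase: emit "<word>\t1" for every word read from stdin.
    for line in sys.stdin:
        for word in line.split():
            print(f"{word}\t1")

def reducer():
    # Reduce phase: input arrives sorted by key, so counts can be summed per key run.
    current, count = None, 0
    for line in sys.stdin:
        word, value = line.rstrip("\n").rsplit("\t", 1)
        if word != current:
            if current is not None:
                print(f"{current}\t{count}")
            current, count = word, 0
        count += int(value)
    if current is not None:
        print(f"{current}\t{count}")

if __name__ == "__main__":
    mapper() if sys.argv[1] == "map" else reducer()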
Each level of the cloud federation poses unique problems and operates at a different layer
of the IT stack, so different strategies and technologies are needed at each level. Taken
together, the answers to the problems encountered at these levels form a reference model
for a cloud federation.
Logical and Operational Level
The obstacles in creating a framework that allows the aggregation of providers from
various administrative domains within the context of a single overlay infrastructure,
or cloud federation, are identified and addressed at the logical and operational level of
a federated cloud.
Policies and guidelines for cooperation are established at this level. Additionally, this
is the layer where choices are made regarding how and when to use a service from
another provider that is being leased or leveraged. The operational component
characterizes and molds the dynamic behavior of the federation as a result of the
decisions made by the individual providers, while the logical component specifies the
context in which agreements among providers are made and services are negotiated.
At this level, market-oriented cloud computing (MOCC) is put into practice and becomes a
reality. At this stage, it is crucial to deal with the following difficulties:
How should a federation be represented?
How should a cloud service, a cloud provider, or an agreement be modeled and
represented?
How should the regulations and standards that permit providers to join a federation
be defined?
What procedures are in place to resolve disputes between providers?
What obligations does each supplier have to the other?
When should consumers and providers utilize the federation?
What categories of services are more likely to be rented than purchased?
Which percentage of the resources should be leased, and how should we value the
resources that are leased?
The logical and operational level offers open opportunities for both academia and industry.
Infrastructure Level
The technological difficulties in making it possible for various cloud computing systems to
work together seamlessly are dealt with at the infrastructure level. It addresses the
technical barriers that keep distinct cloud computing systems belonging to different
administrative domains from interoperating. These barriers can be overcome by using
standardized protocols and interfaces.
The following concerns should be addressed at this level:
What types of standards ought to be applied?
How should interfaces and protocols be created to work together?
Which technologies should be used for collaboration?
How can we design platform components, software systems, and services that
support interoperability?
Interoperability and composition among different cloud computing vendors are possible only
through open standards and interfaces. Additionally, each layer of the Cloud Computing
Reference Model has significantly different interfaces and protocols.