Google Cloud Platform
Chapter 2
Prelude
If you have heard the term Cloud Computing, then you are probably aware of the
Virtual Machine (VM). Google Compute Engine (GCE) lets you run VMs on Google
Cloud Platform infrastructure. You configure a VM much as you would build a
physical server: by specifying its CPU power, memory, storage types, and OS.
Introduction
Developers build many applications that need to store large amounts of data.
The data can take many forms, such as media, confidential data from devices,
customer account balances, and so on.
Earlier, we read that data can be stored in Persistent disks. GCP also provides
other storage options for structured, unstructured, transactional, and
relational data.
In this topic, you will learn about the various storage options: Cloud Storage,
Cloud Bigtable, Cloud SQL, Cloud Spanner, and Cloud Datastore.
More About Cloud Storage
Google Cloud Storage is an object store.
Google Cloud Storage stores data as objects. Each object is an arbitrary
sequence of bytes that is addressed by a unique key.
These unique keys take the form of URLs, which makes it easy to interact
with web technologies.
Cloud Storage is made up of buckets, which hold the storage objects.
Storage objects are immutable: you do not edit an object in place. Instead,
every change creates a new version.
Object Versioning is another important concept in Cloud Storage. If you
turn this feature ON, every version of an object is kept. Otherwise, the
newer version overwrites the old one.
When you create a bucket, you give it a globally unique name, specify
a geographic location in which to store the data, and choose a default
storage class.
You can control access to the bucket using Cloud IAM and Access
Control Lists (ACLs).
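To make the effect of Object Versioning concrete, here is a minimal sketch in plain Python. It is a toy in-memory model, not the Cloud Storage client library: the class ToyBucket, its methods, and the bucket names are all hypothetical, and generation numbers are simplified to a counter.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBucket:
    """In-memory stand-in for a bucket (hypothetical, not the real API)."""
    name: str
    location: str
    storage_class: str
    versioning: bool = False
    # Maps each object key to its retained versions, oldest first.
    _objects: dict = field(default_factory=dict)

    def upload(self, key: str, data: bytes) -> int:
        """Store data under key and return its generation number."""
        versions = self._objects.setdefault(key, [])
        generation = versions[-1][0] + 1 if versions else 1
        if not self.versioning:
            versions.clear()  # the newer version overwrites the old one
        versions.append((generation, data))
        return generation

    def download(self, key: str) -> bytes:
        """Return the live (most recent) version of the object."""
        return self._objects[key][-1][1]

    def generations(self, key: str) -> list:
        """List the generation numbers still retained for key."""
        return [gen for gen, _ in self._objects[key]]


plain = ToyBucket("demo-bucket", "US", "STANDARD")
versioned = ToyBucket("demo-bucket-v", "US", "STANDARD", versioning=True)
for bucket in (plain, versioned):
    bucket.upload("report.txt", b"first draft")
    bucket.upload("report.txt", b"second draft")
```

After the two uploads, plain retains only the latest generation of report.txt, while versioned retains both. In either case a download returns the newest data.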
Cloud Bigtable
Scalability: With Bigtable, you can increase the machine count of your
cluster without any downtime. Bigtable also handles administration tasks
such as upgrades and restarts.
The data in Cloud Bigtable is encrypted. You can use IAM roles to specify
who can access the data.
From an application API perspective, data is read from and written to
Bigtable through data service layers such as Managed VMs, the HBase REST
Server, or a Java server using the HBase client.
Cloud Bigtable serves this data to applications, dashboards, and data
services.
Data can also be read from and written to Bigtable through batch processes
such as Hadoop MapReduce, Dataflow, or Spark.
CONTAINERS
Google Compute Engine and Google App Engine - Drawbacks
In Google Compute Engine (GCE), you share resources by virtualizing the
hardware using VMs. The developer can deploy the OS, access the hardware,
and build applications in a self-contained environment with access to
networking interfaces, file systems, RAM, and so on.
When the demand for your application increases, you need to copy the
entire VM and boot the guest OS for each instance of the application, which
can be slow and expensive.
With Google App Engine (GAE), you gain access to programming services.
Your only job is to write the code. Your self-contained workloads use these
services and include any dependent libraries.
As the demand for your application increases, the platform scales your
application seamlessly and independently, but you give up control of the
underlying server architecture.
Containers
Containers are preferred because they avoid these drawbacks of Compute
Engine and App Engine.
Containers give you independent scalability of workloads. They also provide
an abstraction layer over the OS and hardware.
All you need is a host whose OS supports containers, plus a container
runtime.
Your code becomes portable, and you can treat the OS and hardware as a black
box. This lets you move between stages such as development, staging, and
production, and from on-premises to the cloud.
For example, you can scale a web server a hundred times in seconds on a
single host, depending on the size of the workload.
Kubernetes
If you build your application from containers that act as microservices and
are connected through network connections, you can make them modular,
independently scalable, and easily deployable across a group of hosts.
The hosts can scale containers up or down and start or stop them on demand,
as the demand for your application changes or as hosts fail.
This process can be handled by a tool called Kubernetes.
Kubernetes orchestrates many containers across hosts, scales them as
microservices, and performs rollouts and rollbacks.
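The orchestration described above can be pictured as a loop that keeps comparing the desired number of containers with what is actually running. The sketch below is a toy model in plain Python, not the Kubernetes API. The function reconcile and the host names are hypothetical, and the "least-loaded host first" rule is an assumption chosen for simplicity.

```python
def reconcile(desired: int, running: dict, healthy_hosts: list) -> dict:
    """Return a host -> replica-count placement that totals `desired`.

    `running` maps host name to replica count. Replicas on hosts that are
    no longer healthy are treated as lost and are rescheduled elsewhere.
    """
    if desired < 0:
        raise ValueError("desired replica count must not be negative")
    if not healthy_hosts:
        raise ValueError("no healthy hosts to schedule on")
    # Keep only the replicas that live on healthy hosts.
    placement = {host: running.get(host, 0) for host in healthy_hosts}
    # Scale up: add replicas to the least-loaded host first.
    while sum(placement.values()) < desired:
        target = min(placement, key=placement.get)
        placement[target] += 1
    # Scale down: remove replicas from the most-loaded host first.
    while sum(placement.values()) > desired:
        target = max(placement, key=placement.get)
        placement[target] -= 1
    return placement


# Host "b" has failed, so its two replicas are rescheduled onto host "c".
after_failure = reconcile(4, {"a": 2, "b": 2}, ["a", "c"])
# Demand has dropped, so the replica count shrinks from three to one.
after_scale_down = reconcile(1, {"a": 2, "b": 1}, ["a", "b"])
```

A real orchestrator also weighs resource requests, health probes, and rollout strategy. The sketch covers only the scale-up, scale-down, and host-failure cases that the text mentions.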
Docker, an open-source tool, defines a format for bundling your application,
its dependencies, and machine-specific settings into a container.
GCP also has a separate tool called Google Container Builder, a managed
service for building containers.
Introduction
By now you are familiar with two crucial GCP products, Compute
Engine and Kubernetes Engine.
One feature these two products share is that you choose the infrastructure
on which your application runs: virtual machines for Compute Engine and
containers for Kubernetes Engine.
If you want to focus only on the application code, then your choice should
be App Engine.
App Engine is a Platform as a Service (PaaS). It manages both the hardware
and the network infrastructure.
App Engine has many built-in services that your application can use, such as
NoSQL databases, in-memory caching, load balancing, health checks, logging,
and user authentication.
App Engine automatically scales applications such as web applications and
mobile backends.
Service Model     | Hybrid                    | PaaS                        | PaaS
Primary Use-Case  | Container-based workloads | Web and mobile applications | Both web and mobile applications and container-based workloads
1. Development
2. Deployment
3. Monitoring
Most developers store and maintain their code in a Git repository.
You can run your own Git instance (which gives you great control) or use a
hosted Git provider (which lessens your work).
In addition to these options, you can keep your code private and protect it
with IAM permissions by using Cloud Source Repositories (CSR).
Deployment
Setting up an environment in GCP can take a lot of steps:
1. You set up the compute, network, and storage resources and their
configurations.
2. If you want to change the environment, you run some commands.
3. If you want to clone the environment, you run more commands.
These steps take a lot of time, which you can reduce by using a template.
(A template is a specification of the environment as you want it to be.)
GCP provides Deployment Manager to automate the creation and management of
these resources by using the template.
You can write the template file in either YAML or Python. Deployment Manager
then consumes the template and performs the actions that need to be done.
To change the environment, you edit the template and tell Deployment Manager
to update the environment to match.
Deployment Manager templates can be stored and version-controlled in Cloud
Source Repositories.
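As an illustration of a Python template, the sketch below follows the documented Deployment Manager convention of a generate_config(context) function that returns a dictionary of resources. The zone and machineType property names, the "-vm" suffix, the default machine type, and the Debian image family are assumptions made for this example, so check them against the current Deployment Manager and Compute Engine references before use.

```python
COMPUTE_URL_BASE = "https://www.googleapis.com/compute/v1/"


def generate_config(context):
    """Describe one Compute Engine VM for Deployment Manager to create."""
    project = context.env["project"]
    zone = context.properties["zone"]
    # Assumed default; the deployment can override it with a property.
    machine_type = context.properties.get("machineType", "f1-micro")
    project_url = COMPUTE_URL_BASE + "projects/" + project

    instance = {
        "name": context.env["name"] + "-vm",
        "type": "compute.v1.instance",
        "properties": {
            "zone": zone,
            "machineType": project_url + "/zones/" + zone
                           + "/machineTypes/" + machine_type,
            "disks": [{
                "deviceName": "boot",
                "type": "PERSISTENT",
                "boot": True,
                "autoDelete": True,
                "initializeParams": {
                    # Assumed image family, chosen only for illustration.
                    "sourceImage": COMPUTE_URL_BASE
                    + "projects/debian-cloud/global/images/family/debian-11",
                },
            }],
            "networkInterfaces": [{
                "network": project_url + "/global/networks/default",
            }],
        },
    }
    return {"resources": [instance]}
```

Because the template is ordinary Python, you can call generate_config locally with a stand-in context object and inspect the result before handing the file to Deployment Manager.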
Monitoring
You cannot run an application stably without monitoring it.
Monitoring lets you find out whether the changes you made work as intended.
It also gives you information to respond with whenever your application is
down.
Stackdriver is the tool that you can use in GCP for monitoring, logging, and
diagnostics.
Stackdriver gives you access to signals from the infrastructure platform,
virtual machines, middleware, and application tier, in the form of logs,
metrics, and traces.
It also helps you check application health, performance, and availability.
With Stackdriver, you can perform Monitoring, Logging, Tracing, Error
Reporting, and Debugging.
You can configure uptime checks that are associated with URLs and with
resources such as instances and load balancers.
You can also set up alerts that fire when a health check fails or uptime
falls.
Stackdriver
You can also combine Stackdriver Monitoring with notification tools. You can
view the state of your application by creating dashboards in Stackdriver.
Stackdriver Logging lets you view the logs of your application.
Logging also lets you define metrics based on log content. These metrics can
then be included in dashboards and alerts.
You can export logs to Cloud Pub/Sub, Cloud Storage and BigQuery.
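The idea of a metric defined on log content can be sketched in a few lines of plain Python. This is a toy model and not the Stackdriver Logging API. The function count_matching, the log entry layout, and the alert threshold are all hypothetical.

```python
import re
from collections import Counter


def count_matching(log_entries, pattern, label_key="severity"):
    """Count entries whose message matches pattern, grouped by a label."""
    regex = re.compile(pattern)
    counts = Counter()
    for entry in log_entries:
        if regex.search(entry["message"]):
            counts[entry.get(label_key, "UNKNOWN")] += 1
    return counts


logs = [
    {"severity": "ERROR", "message": "payment failed: timeout"},
    {"severity": "INFO", "message": "payment ok"},
    {"severity": "ERROR", "message": "payment failed: card declined"},
    {"severity": "WARNING", "message": "payment failed: retrying"},
]

failed_payments = count_matching(logs, r"payment failed")
# A dashboard would chart these counts; an alert compares them to a threshold.
ALERT_THRESHOLD = 2  # assumed value for illustration
should_alert = failed_payments["ERROR"] >= ALERT_THRESHOLD
```

The counter plays the role of the metric: it turns free-form log text into numbers that a dashboard can chart and an alerting rule can compare against a threshold.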
Stackdriver Error Reporting tracks and groups the errors in your cloud
application.
Stackdriver Trace is used to sample the latency of App Engine applications.
Stackdriver Debugger connects your production data to the source code. It
works best when your source code is available in Cloud Source
Repositories.
Cloud Dataproc
When you request a Hadoop cluster, it is built in less than 90 seconds
on top of Compute Engine VMs. You can scale the cluster up and down based on
the processing power you need.
You can monitor the cluster using Stackdriver.
Running clusters on-premises requires hardware investment. Running them in
Dataproc lets you pay only for the hardware resources that you use during
the life of the cluster.
Cloud Dataproc is billed per second, and GCP stops the billing once the
cluster is deleted.
You can also use Preemptible instances for the batch processing to save
costs.
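The effect of per-second billing and preemptible instances can be worked through with simple arithmetic. The sketch below is only an estimate under stated assumptions: the function cluster_cost is hypothetical, the hourly rates are made-up placeholders rather than real GCP prices, and any minimum billing period is ignored.

```python
def cluster_cost(nodes, seconds, hourly_rate, preemptible_nodes=0,
                 preemptible_hourly_rate=0.0):
    """Estimate the cost of a cluster billed per second until deletion."""
    regular = nodes * hourly_rate
    preemptible = preemptible_nodes * preemptible_hourly_rate
    return (regular + preemptible) * seconds / 3600


# Placeholder rates (NOT real prices): 0.20 per node-hour for regular
# nodes and 0.05 per node-hour for preemptible nodes.
fifteen_minutes = 15 * 60
all_regular = cluster_cost(8, fifteen_minutes, 0.20)
mixed = cluster_cost(2, fifteen_minutes, 0.20,
                     preemptible_nodes=6, preemptible_hourly_rate=0.05)
```

With these placeholder rates, a 15-minute job on eight regular nodes costs about 0.40, while two regular plus six preemptible nodes cost about 0.175. Deleting the cluster as soon as the job finishes is what keeps the seconds term small.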
Once the data is in the cluster, you can use Spark and Spark SQL for
data mining.
You can also use the Apache Spark machine learning libraries to discover
patterns through machine learning.
Cloud Dataflow
Cloud Dataproc is suitable when you know your cluster size. But if your
cluster size is unpredictable or your data arrives in real time, then your
choice should be Cloud Dataflow.
Cloud Dataflow is a managed service that lets you develop and execute
a wide range of data processing patterns: extract, transform, and load
(ETL), batch computation, and continuous computation.
Cloud Dataflow is used to build data pipelines for
both batch and streaming data.
It automates the management of processing resources, freeing you from
operational tasks such as performance optimization and resource management.
For example, a Cloud Dataflow pipeline can read data from BigQuery, process
it by applying transforms such as map and reduce operations, and write the
results to Cloud Storage.
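The map and reduce transforms mentioned above can be illustrated without any cloud service. The sketch below is plain Python, not the Dataflow SDK. The helper functions map_step and reduce_step are hypothetical, the in-memory rows stand in for records read from BigQuery, and the final write to Cloud Storage is left out.

```python
from collections import defaultdict
from functools import reduce


def map_step(records, fn):
    """Apply fn to every record, producing (key, value) pairs."""
    return [fn(record) for record in records]


def reduce_step(pairs, combine):
    """Group the pairs by key, then fold each group's values together."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: reduce(combine, values) for key, values in groups.items()}


# Stand-in for rows that a pipeline would read from a source table.
rows = [
    {"country": "IN", "amount": 120},
    {"country": "US", "amount": 80},
    {"country": "IN", "amount": 30},
]

pairs = map_step(rows, lambda row: (row["country"], row["amount"]))
totals = reduce_step(pairs, lambda a, b: a + b)
```

Here totals holds the summed amount per country. In a managed pipeline the same two steps would be declared as transforms, and the service would decide how to distribute them across workers.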
Use cases include Fraud Detection and Financial Services, IoT Analytics,
Manufacturing, Logistics, HealthCare and so on.
BigQuery
If you have a large dataset and need to run ad hoc SQL queries on it, then
BigQuery is the service to use.
More on ML Platform
If you want to add machine learning capabilities to your applications, you
can do so through the Machine Learning APIs that Google Cloud provides.
Cloud Machine Learning Platform can be used in different applications
depending on the type of data, namely structured or unstructured.
For structured data, ML is used for classification and regression
tasks such as customer churn analysis, product diagnostics, and
forecasting.
For unstructured data, you can use ML for image analytics such as
identifying damaged shipments, identifying styles, and flagging content.
You can also perform text analytics such as blog analysis, language
identification, and topic classification.
Machine Learning APIs
There are several Machine Learning APIs that are provided to the users by
Google Cloud. They are as follows:
Cloud Vision API helps its users to identify the content of an image and to
classify the image into a predictable number of categories.
Cloud Speech API helps to convert audio to text. It can recognize around 80
languages.
Cloud Natural Language API offers natural language technologies to
developers. It performs syntax analysis, identifying verbs, nouns, adverbs,
and adjectives, and it can also find the relationships between words.
Cloud Translation API translates an arbitrary string into a supported
language through a simple interface.
Cloud Video Intelligence API helps to annotate videos in different formats.
You can use it to make your video content searchable.
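As an example of how an application talks to one of these APIs, the sketch below builds the JSON body for a Cloud Vision label-detection request. It follows the v1 REST interface (images:annotate) as commonly documented, but treat the field names as something to verify against the current API reference. The helper build_label_request is hypothetical, and the example only constructs the request; sending it requires credentials, which are not shown.

```python
import base64
import json

VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_label_request(image_bytes, max_results=5):
    """Build the JSON body that asks for labels describing one image."""
    return {
        "requests": [{
            # Image content is sent as base64-encoded text.
            "image": {
                "content": base64.b64encode(image_bytes).decode("ascii"),
            },
            "features": [
                {"type": "LABEL_DETECTION", "maxResults": max_results},
            ],
        }]
    }


# Placeholder bytes; a real call would read an image file instead.
body = json.dumps(build_label_request(b"raw image bytes"))
```

An authenticated POST of body to VISION_ENDPOINT would return label annotations for the image. The other APIs in the list follow a similar request and response pattern.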
Conclusion
Learn as if you were to live forever -Mahatma Gandhi
In this course, we have discussed the following topics that laid the foundation
for Google Cloud Platform: