Practice_Qns_GCP_DevOps_Set1
The application makes several HTTP requests to dependent applications. You want to anticipate which
dependent applications might cause performance issues. What should you do?
You created a Stackdriver chart for CPU utilization in a dashboard within your workspace project.
You want to share the chart with your Site Reliability Engineering
(SRE) team only. You want to ensure you follow the principle of least privilege. What should you do?
• A. Share the workspace Project ID with the SRE team. Assign the SRE team the Monitoring
Viewer IAM role in the workspace project.
• B. Share the workspace Project ID with the SRE team. Assign the SRE team the Dashboard
Viewer IAM role in the workspace project.
• C. Click "Share chart by URL" and provide the URL to the SRE team. Assign the SRE team
the Monitoring Viewer IAM role in the workspace project. Most Voted
• D. Click "Share chart by URL" and provide the URL to the SRE team. Assign the SRE team
the Dashboard Viewer IAM role in the workspace project.
Your organization wants to implement Site Reliability Engineering (SRE) culture and principles.
Recently, a service that you support had a limited outage. A manager on another team asks you to
provide a formal explanation of what happened so they can action remediations. What should you
do?
• A. Develop a postmortem that includes the root causes, resolution, lessons learned, and a
prioritized list of action items. Share it with the manager only.
• B. Develop a postmortem that includes the root causes, resolution, lessons learned, and a
prioritized list of action items. Share it on the engineering organization's document
portal. Most Voted
• C. Develop a postmortem that includes the root causes, resolution, lessons learned, the list
of people responsible, and a list of action items for each person. Share it with the manager
only.
• D. Develop a postmortem that includes the root causes, resolution, lessons learned, the list
of people responsible, and a list of action items for each person. Share it on the engineering
organization's document portal.
You have a set of applications running on a Google Kubernetes Engine (GKE) cluster, and you are
using Stackdriver Kubernetes Engine Monitoring. You are bringing a new containerized application
required by your company into production. This application is written by a third party and cannot be
modified or reconfigured. The application writes its log information to /var/log/app_messages.log,
and you want to send these log entries to Stackdriver Logging. What should you do?
You are running an application in a virtual machine (VM) using a custom Debian image. The image
has the Stackdriver Logging agent installed. The VM has the cloud-platform scope. The application
is logging information via syslog. You want to use Stackdriver Logging in the Google Cloud Platform
Console to visualize the logs. You notice that syslog is not showing up in the "All logs" dropdown
list of the Logs Viewer. What is the first thing you should do?
• A. Look for the agent's test log entry in the Logs Viewer.
• B. Install the most recent version of the Stackdriver agent.
• C. Verify the VM service account access scope includes the monitoring.write scope.
• D. SSH to the VM and run the following command: ps ax | grep fluentd. Most Voted
You use a multi-step Cloud Build pipeline to build and deploy your application to Google
Kubernetes Engine (GKE). You want to integrate with a third-party monitoring platform by performing
an HTTP POST of the build information to a webhook. You want to minimize the development effort.
What should you do?
• A. Add logic to each Cloud Build step to HTTP POST the build information to a webhook.
• B. Add a new step at the end of the pipeline in Cloud Build to HTTP POST the build
information to a webhook.
• C. Use Stackdriver Logging to create a logs-based metric from the Cloud Build logs. Create
an Alert with a Webhook notification type.
• D. Create a Cloud Pub/Sub push subscription to the Cloud Build cloud-builds PubSub topic
to HTTP POST the build information to a webhook. Most Voted
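To make the push-subscription option more concrete, here is a minimal Python sketch of what a webhook receiver might do with the request body it is sent. It assumes the standard Pub/Sub push envelope, in which the payload arrives base64-encoded under `message.data`, and it assumes the decoded payload is a JSON description of the build with `id` and `status` fields. The function names and sample values are hypothetical.

```python
import base64
import json


def build_info_from_push(envelope):
    """Extract a small summary from a Pub/Sub push request body.

    Assumes the standard push envelope: {"message": {"data": <base64>, ...}}
    and that the decoded data is a JSON description of the build.
    """
    message = envelope["message"]
    payload = json.loads(base64.b64decode(message["data"]).decode("utf-8"))
    # The field names "id" and "status" are assumptions about the build payload.
    return {"id": payload.get("id"), "status": payload.get("status")}


def make_sample_envelope(build):
    """Build a fake push envelope for local experimentation (hypothetical data)."""
    data = base64.b64encode(json.dumps(build).encode("utf-8")).decode("ascii")
    return {"message": {"data": data, "messageId": "1"}, "subscription": "example"}


sample = make_sample_envelope({"id": "build-123", "status": "SUCCESS"})
print(build_info_from_push(sample))
```

With a push subscription the pipeline itself needs no extra steps. Only the receiving side has to understand the envelope, which is why this approach tends to involve little development effort.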
You use Spinnaker to deploy your application and have created a canary deployment stage in the
pipeline. Your application has an in-memory cache that loads objects at start time. You want to
automate the comparison of the canary version against the production version. How should you
configure the canary analysis?
• A. Compare the canary with a new deployment of the current production version. Most Voted
• B. Compare the canary with a new deployment of the previous production version.
• C. Compare the canary with the existing deployment of the current production version.
• D. Compare the canary with the average performance of a sliding window of previous
production versions.
You support a high-traffic web application and want to ensure that the home page loads in a timely
manner. As a first step, you decide to implement a Service
Level Indicator (SLI) to represent home page request latency with an acceptable page load time set
to 100 ms. What is the Google-recommended way of calculating this SLI?
• A. Bucketize the request latencies into ranges, and then compute the percentile at 100 ms.
• B. Bucketize the request latencies into ranges, and then compute the median and 90th
percentiles.
• C. Count the number of home page requests that load in under 100 ms, and then divide by
the total number of home page requests. Most Voted
• D. Count the number of home page requests that load in under 100 ms, and then divide by the
total number of all web application requests.
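The counting approach in the options above can be sketched in Python as a ratio of good events to valid events, which is how SLIs are commonly framed in SRE practice. The function name and the sample latencies are hypothetical. The sketch also assumes the input already contains home page requests only, so the denominator is not diluted by other traffic.

```python
def latency_sli(latencies_ms, threshold_ms=100.0):
    """Return the fraction of requests that finished in under threshold_ms.

    latencies_ms must contain home page requests only; mixing in other
    request types would change the denominator and distort the SLI.
    """
    if not latencies_ms:
        raise ValueError("no requests in the measurement window")
    good = sum(1 for latency in latencies_ms if latency < threshold_ms)
    return good / len(latencies_ms)


# Hypothetical sample of home page request latencies, in milliseconds.
home_page_latencies_ms = [42.0, 87.5, 99.9, 100.0, 130.2, 64.3, 250.0, 91.1]
sli = latency_sli(home_page_latencies_ms)
print(f"{sli:.1%} of home page requests loaded in under 100 ms")
```

A request that takes exactly 100 ms counts as not meeting the target here because the comparison is strict. Whether the boundary is inclusive is a design choice that the question does not settle.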
You deploy a new release of an internal application during a weekend maintenance window when
there is minimal user traffic. After the window ends, you learn that one of the new features isn't
working as expected in the production environment. After an extended outage, you roll back the new
release and deploy a fix.
You want to modify your release process to reduce the mean time to recovery so you can avoid
extended outages in the future. What should you do? (Choose two.)
• A. Before merging new code, require 2 different peers to review the code changes.
• B. Adopt the blue/green deployment strategy when releasing new code via a CD server. Most
Voted
• C. Integrate a code linting tool to validate coding standards before any code is accepted into
the repository.
• D. Require developers to run automated integration tests on their local development
environments before release.
• E. Configure a CI server. Add a suite of unit tests to your code and have your CI server run
them on commit and verify any changes. Most Voted
You have a pool of application servers running on Compute Engine. You need to provide a secure
solution that requires the least amount of configuration and allows developers to easily access
application logs for troubleshooting. How would you implement the solution on GCP?
• A. Deploy the Stackdriver logging agent to the application servers. Give the
developers the IAM Logs Viewer role to access Stackdriver and view logs. Most Voted
• B. Deploy the Stackdriver logging agent to the application servers. Give the
developers the IAM Private Logs Viewer role to access Stackdriver and view logs.
• C. Deploy the Stackdriver monitoring agent to the application servers. Give the
developers the IAM Monitoring Viewer role to access Stackdriver and view metrics.
• D. Install the gsutil command line tool on your application servers. Write a script
using gsutil to upload your application log to a Cloud Storage bucket, and then schedule it to
run via cron every 5 minutes. Give the developers the IAM Object Viewer access to view
the logs in the specified bucket.
You support the backend of a mobile phone game that runs on a Google Kubernetes Engine (GKE)
cluster. The application is serving HTTP requests from users.
You need to implement a solution that will reduce the network cost. What should you do?
You encountered a major service outage that affected all users of the service for multiple hours.
After several hours of incident management, the service returned to normal, and user access was
restored. You need to provide an incident summary to relevant stakeholders following the Site
Reliability Engineering recommended practices. What should you do first?
• A. Verify the maximum node pool size, enable a horizontal pod autoscaler, and then perform
a load test to verify your expected resource needs. Most Voted
• B. Because you are deployed on GKE and are using a cluster autoscaler, your GKE cluster will
scale automatically, regardless of growth rate.
• C. Because you are at only 30% utilization, you have significant headroom and you won't need
to add any additional capacity for this rate of growth.
• D. Proactively add 60% more node capacity to account for six months of 10% growth rate,
and then perform a load test to make sure you have enough capacity.
Your application images are built and pushed to Google Container Registry (GCR). You want to build
an automated pipeline that deploys the application when the image is updated while minimizing the
development effort. What should you do?
Your product is currently deployed in three Google Cloud Platform (GCP) zones with your users
divided between the zones. You can fail over from one zone to another, but it causes a 10-minute
service disruption for the affected users. You typically experience a database failure once per
quarter and can detect it within five minutes. You are cataloging the reliability risks of a new
real-time chat feature for your product. You catalog the following information for each risk:
* Mean Time to Detect (MTTD) in minutes
* Mean Time to Repair (MTTR) in minutes
* Mean Time Between Failures (MTBF) in days
* User Impact Percentage
The chat feature requires a new database system that takes twice as long to successfully fail over
between zones. You want to account for the risk of the new database failing in one zone. What
would be the values for the risk of database failover with the new system?
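The arithmetic behind this scenario can be worked through in a short Python sketch. It reads the figures from the question under stated assumptions: a quarter is treated as roughly 90 days, users are assumed to be split evenly across the three zones, and the failover time is taken as the repair time. The `Risk` class and the "bad minutes per year" estimate are illustrative additions and not part of the question.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    mttd_minutes: float  # Mean Time to Detect
    mttr_minutes: float  # Mean Time to Repair
    mtbf_days: float     # Mean Time Between Failures
    user_impact: float   # fraction of users affected, 0.0 to 1.0

    def expected_bad_minutes_per_year(self):
        """Rough yearly cost: incidents per year x outage length x impact."""
        incidents_per_year = 365 / self.mtbf_days
        outage_minutes = self.mttd_minutes + self.mttr_minutes
        return incidents_per_year * outage_minutes * self.user_impact


current_failover_minutes = 10  # stated: failover causes a 10-minute disruption
zones = 3                      # stated: deployed in three zones

new_db_failover = Risk(
    mttd_minutes=5,                             # stated: detected within five minutes
    mttr_minutes=2 * current_failover_minutes,  # new system takes twice as long
    mtbf_days=90,                               # assumption: one quarter is about 90 days
    user_impact=1 / zones,                      # assumption: users split evenly
)
print(new_db_failover)
print(round(new_db_failover.expected_bad_minutes_per_year(), 1))
```

Under these assumptions the detection time and failure frequency carry over unchanged from the current system, and only the repair time doubles.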
You are managing the production deployment to a set of Google Kubernetes Engine (GKE) clusters.
You want to make sure only images which are successfully built by your trusted CI/CD pipeline are
deployed to production. What should you do?
You support an e-commerce application that runs on a large Google Kubernetes Engine (GKE) cluster
deployed on-premises and on Google Cloud Platform. The application consists of microservices that
run in containers. You want to identify containers that are using the most CPU and memory. What
should you do?
• A. Create an automated testing script in production to detect failures as soon as they occur.
• B. Create a development environment with smaller server capacity and give access only to
developers and testers.
• C. Secure the production environment to ensure that developers can't change it and set up
one controlled update per year.
• D. Create a development environment for writing code and a test environment for
configurations, experiments, and load testing. Most Voted
You support an application running on App Engine. The application is used globally and accessed
from various device types. You want to know the number of connections. You are using Stackdriver
Monitoring for App Engine. What metric should you use?
You support an application deployed on Compute Engine. The application connects to a Cloud SQL
instance to store and retrieve data. After an update to the application, users report errors showing
database timeout messages. The number of concurrent active users remained stable. You need to
find the most probable cause of the database timeout. What should you do?
Your application images are built using Cloud Build and pushed to Google Container Registry (GCR).
You want to be able to specify a particular version of your application for deployment based on the
release version tagged in source control. What should you do when you push the image?
You are on-call for an infrastructure service that has a large number of dependent systems. You
receive an alert indicating that the service is failing to serve most of its requests and all of its
dependent systems with hundreds of thousands of users are affected. As part of your Site Reliability
Engineering (SRE) incident management protocol, you declare yourself Incident Commander (IC) and
pull in two experienced people from your team as Operations Lead (OL) and
Communications Lead (CL). What should you do next?
• A. Look for ways to mitigate user impact and deploy the mitigations to production.
• B. Contact the affected service owners and update them on the status of the incident.
• C. Establish a communication channel where incident responders and leads can
communicate with each other. Most Voted
• D. Start a postmortem, add incident information, circulate the draft internally, and ask
internal stakeholders for input.
You are developing a strategy for monitoring your Google Cloud Platform (GCP) projects in
production using Stackdriver Workspaces. One of the requirements is to be able to quickly identify
and react to production environment issues without false alerts from development and staging
projects. You want to ensure that you adhere to the principle of least privilege when providing
relevant team members with access to Stackdriver Workspaces. What should you do?
• A. Grant relevant team members read access to all GCP production projects. Create
Stackdriver workspaces inside each project.
• B. Grant relevant team members the Project Viewer IAM role on all GCP production projects.
Create Stackdriver workspaces inside each project.
• C. Choose an existing GCP production project to host the monitoring workspace. Attach the
production projects to this workspace. Grant relevant team members read access to the
Stackdriver Workspace.
• D. Create a new GCP monitoring project and create a Stackdriver Workspace inside it. Attach
the production projects to this workspace. Grant relevant team members read access to the
Stackdriver Workspace. Most Voted
You currently store the virtual machine (VM) utilization logs in Stackdriver. You need to provide an
easy-to-share interactive VM utilization dashboard that is updated in real time and contains
information aggregated on a quarterly basis. You want to use Google Cloud Platform solutions. What
should you do?