Based on the model evaluation results, why is this a viable model for production?
A. The model is 86% accurate and the cost incurred by the company as a result of false negatives is less than
the false positives.
B. The precision of the model is 86%, which is less than the accuracy of the model.
C. The model is 86% accurate and the cost incurred by the company as a result of false positives is less than the
false negatives.
D. The precision of the model is 86%, which is greater than the accuracy of the model.
Answer: C
Explanation:
There are more false positives (FPs) than false negatives (FNs), but the per-unit cost of an FN is far larger than that of an FP: count(FP) > count(FN) while cost_per_unit(FP) << cost_per_unit(FN). On its own that only suggests total_cost(FP) < total_cost(FN), and it would be somewhat subjective, since the question does not state how far apart the per-unit costs are. What the question does establish, however, is that the model is viable (it asks WHY the model is viable, not WHETHER it is viable). Without the model there would be no FPs or FNs, but churns would still occur, and a churn carries the same cost as an FN. It follows that the total cost of the false positives must be less than the total cost of the false negatives (churns), which is what option C states.
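To make the cost comparison concrete, here is a small Python sketch; the counts and per-unit costs are purely hypothetical numbers chosen to illustrate the inequality, not values taken from the question.

# Hypothetical confusion-matrix counts and per-unit costs (illustrative only).
num_fp = 150           # false positives: retention offers sent to customers who would have stayed
num_fn = 50            # false negatives: churners the model missed
cost_per_fp = 10.0     # cost of one unnecessary retention offer
cost_per_fn = 200.0    # cost of losing one customer (same as the cost of a churn)

total_cost_fp = num_fp * cost_per_fp   # 1,500
total_cost_fn = num_fn * cost_per_fn   # 10,000

# Even though there are more FPs than FNs, the total FP cost is far smaller,
# which is the situation described by option C.
print(total_cost_fp < total_cost_fn)   # True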
A. Build a content-based filtering recommendation engine with Apache Spark ML on Amazon EMR
B. Build a collaborative filtering recommendation engine with Apache Spark ML on Amazon EMR.
C. Build a model-based filtering recommendation engine with Apache Spark ML on Amazon EMR
D. Build a combinative filtering recommendation engine with Apache Spark ML on Amazon EMR
Answer: B
Explanation:
Many developers want to implement the famous Amazon model that powers the "People who bought this also bought these items" feature on Amazon.com. This model is based on a method called Collaborative Filtering: it takes items such as movies, books, and products that were rated highly by a set of users and recommends them to other users who also gave those items high ratings. This method works well in domains where explicit ratings or implicit user actions can be gathered and analyzed.
Reference:
https://aws.amazon.com/blogs/big-data/building-a-recommendation-engine-with-spark-ml-on-amazon-emr-using-zeppelin/
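The blog above builds its recommender with Spark ML's alternating least squares (ALS) implementation of collaborative filtering. A minimal PySpark sketch of that approach is shown below; the column names and the S3 path of the ratings file are assumptions for illustration.

from pyspark.sql import SparkSession
from pyspark.ml.recommendation import ALS

spark = SparkSession.builder.appName("cf-recommender").getOrCreate()

# Assumed schema: userId, itemId, rating (e.g. a CSV exported to S3).
ratings = spark.read.csv("s3://my-bucket/ratings.csv", header=True, inferSchema=True)

als = ALS(userCol="userId", itemCol="itemId", ratingCol="rating",
          coldStartStrategy="drop")       # drop NaN predictions for unseen users/items
model = als.fit(ratings)

# Top-10 item recommendations for every user.
user_recs = model.recommendForAllUsers(10)
user_recs.show(truncate=False)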
A. Ingest .CSV data using Apache Kafka Streams on Amazon EC2 instances and use Kafka Connect S3 to
serialize data as Parquet
B. Ingest .CSV data from Amazon Kinesis Data Streams and use AWS Glue to convert data into Parquet.
C. Ingest .CSV data using Apache Spark Structured Streaming in an Amazon EMR cluster and use Apache Spark
to convert data into Parquet.
D. Ingest .CSV data from Amazon Kinesis Data Streams and use Amazon Kinesis Data Firehose to convert data
into Parquet.
Answer: B
Explanation:
Ingest .CSV data from Amazon Kinesis Data Streams and use AWS Glue to convert data into Parquet.
Reference:
https://github.com/ecloudvalley/Building-a-Data-Lake-with-AWS-Glue-and-Amazon-S3
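As a rough sketch of the Glue half of option B, an AWS Glue ETL job that rewrites catalogued .CSV data as Parquet could look like the following; the database, table, and output path are placeholders, and the exact wiring depends on how the Kinesis data lands in S3 and the Data Catalog.

import sys
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read the catalogued .CSV data (database/table names are placeholders).
csv_dyf = glue_context.create_dynamic_frame.from_catalog(
    database="streaming_db", table_name="raw_csv")

# Write it back to S3 as Parquet.
glue_context.write_dynamic_frame.from_options(
    frame=csv_dyf,
    connection_type="s3",
    connection_options={"path": "s3://my-bucket/parquet/"},
    format="parquet")

job.commit()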
A. Use the Amazon SageMaker k-Nearest-Neighbors (kNN) algorithm on the single time series consisting of the
full year of data with a predictor_type of regressor.
B. Use Amazon SageMaker Random Cut Forest (RCF) on the single time series consisting of the full year of data.
C. Use the Amazon SageMaker Linear Learner algorithm on the single time series consisting of the full year of
data with a predictor_type of regressor.
D. Use the Amazon SageMaker Linear Learner algorithm on the single time series consisting of the full year of
data with a predictor_type of classifier.
Answer: C
Explanation:
Use the Amazon SageMaker Linear Learner algorithm on the single time series consisting of the full year of
data with a predictor_type of regressor.
Reference:
https://aws.amazon.com/blogs/machine-learning/build-a-model-to-predict-the-impact-of-weather-on-urban-air-quality-using-amazon-sagemaker/?ref=Welcome.AI
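A minimal sketch of launching the built-in Linear Learner algorithm with predictor_type set to regressor via the SageMaker Python SDK is shown below; the IAM role, bucket paths, and instance type are placeholders.

import sagemaker
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput

session = sagemaker.Session()
region = session.boto_region_name
role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder

image_uri = sagemaker.image_uris.retrieve("linear-learner", region)

linear = Estimator(image_uri=image_uri,
                   role=role,
                   instance_count=1,
                   instance_type="ml.m5.xlarge",
                   output_path="s3://my-bucket/linear-learner/output",
                   sagemaker_session=session)

# predictor_type='regressor' is the key setting for forecasting a numeric target.
linear.set_hyperparameters(predictor_type="regressor", mini_batch_size=100)

linear.fit({"train": TrainingInput("s3://my-bucket/linear-learner/train/",
                                   content_type="text/csv")})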
A. Use a custom encryption algorithm to encrypt the data and store the data on an Amazon SageMaker instance
in a VPC. Use the SageMaker DeepAR algorithm to randomize the credit card numbers.
B. Use an IAM policy to encrypt the data on the Amazon S3 bucket and Amazon Kinesis to automatically discard
credit card numbers and insert fake credit card numbers.
C. Use an Amazon SageMaker launch configuration to encrypt the data once it is copied to the SageMaker
instance in a VPC. Use the SageMaker principal component analysis (PCA) algorithm to reduce the length of the
credit card numbers.
D. Use AWS KMS to encrypt the data on Amazon S3 and Amazon SageMaker, and redact the credit card
numbers from the customer data with AWS Glue.
Answer: D
Explanation:
Use AWS KMS to encrypt the data on Amazon S3 and Amazon SageMaker, and redact the credit card
numbers from the customer data with AWS Glue.
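As a rough illustration of the encryption half of option D, the sketch below attaches a KMS key to a SageMaker training job through the Python SDK; the key ARN, role, image, and S3 paths are all placeholders, and the AWS Glue redaction of the card numbers would be a separate ETL job not shown here.

import sagemaker
from sagemaker.estimator import Estimator

role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"                    # placeholder
kms_key = "arn:aws:kms:us-east-1:123456789012:key/1234abcd-12ab-34cd-56ef-1234567890ab"  # placeholder

estimator = Estimator(
    image_uri=sagemaker.image_uris.retrieve("xgboost", "us-east-1", version="1.7-1"),
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://my-bucket/output",
    output_kms_key=kms_key,     # encrypts model artifacts written to S3
    volume_kms_key=kms_key)     # encrypts the attached training volume

estimator.fit({"train": "s3://my-bucket/redacted-train/"})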
A. Amazon SageMaker notebook instances are based on the EC2 instances within the customer account, but
they run outside of VPCs.
B. Amazon SageMaker notebook instances are based on the Amazon ECS service within customer accounts.
C. Amazon SageMaker notebook instances are based on EC2 instances running within AWS service accounts.
D. Amazon SageMaker notebook instances are based on AWS ECS instances running within AWS service
accounts.
Answer: C
Explanation:
Amazon SageMaker notebook instances are based on EC2 instances running within AWS service accounts.
Reference:
https://docs.aws.amazon.com/sagemaker/latest/dg/gs-setup-working-env.html
Question: 7
A Machine Learning Specialist is building a model that will perform time series forecasting using Amazon
SageMaker. The Specialist has finished training the model and is now planning to perform load testing on the
endpoint so they can configure Auto Scaling for the model variant.
Which approach will allow the Specialist to review the latency, memory utilization, and CPU utilization during the
load test?
A. Review SageMaker logs that have been written to Amazon S3 by leveraging Amazon Athena and Amazon
QuickSight to visualize logs as they are being produced.
B. Generate an Amazon CloudWatch dashboard to create a single view for the latency, memory utilization, and
CPU utilization metrics that are outputted by Amazon SageMaker.
C. Build custom Amazon CloudWatch Logs and then leverage Amazon ES and Kibana to query and visualize the
log data as it is generated by Amazon SageMaker.
D. Send Amazon CloudWatch Logs that were generated by Amazon SageMaker to Amazon ES and use Kibana to
query and visualize the log data.
Answer: B
Explanation:
Generate an Amazon CloudWatch dashboard to create a single view for the latency, memory utilization, and
CPU utilization metrics that are outputted by Amazon SageMaker.
Reference:
https://docs.aws.amazon.com/sagemaker/latest/dg/monitoring-cloudwatch.html
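As an illustration, the boto3 sketch below pulls the relevant endpoint metrics during a load test; the endpoint and variant names are placeholders, and the same metrics can be added as widgets on a CloudWatch dashboard for a single consolidated view.

import boto3
from datetime import datetime, timedelta

cloudwatch = boto3.client("cloudwatch")
dims = [{"Name": "EndpointName", "Value": "forecast-endpoint"},   # placeholder names
        {"Name": "VariantName", "Value": "AllTraffic"}]

def stats(namespace, metric, stat="Average"):
    return cloudwatch.get_metric_statistics(
        Namespace=namespace,
        MetricName=metric,
        Dimensions=dims,
        StartTime=datetime.utcnow() - timedelta(hours=1),
        EndTime=datetime.utcnow(),
        Period=60,
        Statistics=[stat])["Datapoints"]

# Invocation latency lives in the AWS/SageMaker namespace;
# instance-level CPU and memory live under /aws/sagemaker/Endpoints.
latency = stats("AWS/SageMaker", "ModelLatency")
cpu = stats("/aws/sagemaker/Endpoints", "CPUUtilization")
memory = stats("/aws/sagemaker/Endpoints", "MemoryUtilization")
print(len(latency), len(cpu), len(memory))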
A. Use AWS Data Pipeline to transform the data and Amazon RDS to run queries.
B. Use AWS Glue to catalogue the data and Amazon Athena to run queries.
C. Use AWS Batch to run ETL on the data and Amazon Aurora to run the queries.
D. Use AWS Lambda to transform the data and Amazon Kinesis Data Analytics to run queries.
Answer: B
Explanation:
AWS Glue is a fully managed ETL service that makes it easy to move data between data stores. It can automatically crawl, catalogue, and classify data stored in Amazon S3, and make it available for querying and analysis. With AWS Glue, you don't have to worry about the underlying infrastructure and can focus on your data. Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. It integrates with AWS Glue, so you can use the catalogued data directly in Athena without any additional data movement or transformation.
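As a small illustration of that integration, the boto3 sketch below starts a Glue crawler and then queries the catalogued table from Athena; the crawler, database, table, and output bucket names are placeholders and assume the crawler already exists.

import boto3

glue = boto3.client("glue")
athena = boto3.client("athena")

# Crawl the S3 data so it appears in the Glue Data Catalog.
glue.start_crawler(Name="sales-data-crawler")

# Query the catalogued table with standard SQL from Athena.
response = athena.start_query_execution(
    QueryString="SELECT region, SUM(amount) AS total FROM sales GROUP BY region",
    QueryExecutionContext={"Database": "analytics_db"},
    ResultConfiguration={"OutputLocation": "s3://my-bucket/athena-results/"})
print(response["QueryExecutionId"])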
Question: 9
A Machine Learning Specialist is developing a custom video recommendation model for an application. The dataset
used to train this model is very large with millions of data points and is hosted in an Amazon S3 bucket. The
Specialist wants to avoid loading all of this data onto an Amazon SageMaker notebook instance because it would
take hours to move and will exceed the attached 5 GB Amazon EBS volume on the notebook instance.
Which approach allows the Specialist to use all the data to train the model?
A. Load a smaller subset of the data into the SageMaker notebook and train locally. Confirm that the training
code is executing and the model parameters seem reasonable. Initiate a SageMaker training job using the full
dataset from the S3 bucket using Pipe input mode.
B. Launch an Amazon EC2 instance with an AWS Deep Learning AMI and attach the S3 bucket to the instance.
Train on a small amount of the data to verify the training code and hyperparameters. Go back to Amazon
SageMaker and train using the full dataset
C. Use AWS Glue to train a model using a small subset of the data to confirm that the data will be compatible
with Amazon SageMaker. Initiate a SageMaker training job using the full dataset from the S3 bucket using Pipe
input mode.
D. Load a smaller subset of the data into the SageMaker notebook and train locally. Confirm that the training
code is executing and the model parameters seem reasonable. Launch an Amazon EC2 instance with an AWS
Deep Learning AMI and attach the S3 bucket to train the full dataset.
Answer: A
Explanation:
The question hinges on Pipe input mode from Amazon S3, which narrows the choices to A and C. AWS Glue cannot be used to train models, which rules out option C, so the correct answer is A. Pipe mode streams the training data directly from the S3 bucket to the training containers, so the full dataset never has to be copied onto the notebook instance's EBS volume.
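A minimal sketch of starting the full-dataset training job with Pipe input mode from the notebook, using the SageMaker Python SDK; the algorithm image, IAM role, and S3 paths are placeholders.

import sagemaker
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput

role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"   # placeholder
region = sagemaker.Session().boto_region_name

estimator = Estimator(
    image_uri=sagemaker.image_uris.retrieve("image-classification", region),
    role=role,
    instance_count=1,
    instance_type="ml.p3.2xlarge",
    input_mode="Pipe",                       # stream data from S3 instead of copying it
    output_path="s3://my-bucket/output")

# The full dataset stays in S3 and is streamed to the container during training.
estimator.fit({"train": TrainingInput("s3://my-bucket/full-dataset/",
                                      content_type="application/x-recordio")})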
A. Write a direct connection to the SQL database within the notebook and pull data in
B. Push the data from Microsoft SQL Server to Amazon S3 using an AWS Data Pipeline and provide the S3
location within the notebook.
C. Move the data to Amazon DynamoDB and set up a connection to DynamoDB within the notebook to pull data
in.
D. Move the data to Amazon ElastiCache using AWS DMS and set up a connection within the notebook to pull
data in for fast access.
Answer: B
Explanation:
With the approach in option B, the Specialist can use AWS Data Pipeline to automate the movement of data from Amazon RDS to Amazon S3. This creates a reliable and scalable data pipeline that can handle large amounts of data and ensures the data is available for training. In the Amazon SageMaker notebook, the Specialist can then access the data stored in Amazon S3 and use it to train the model. Using Amazon S3 as the source of training data is a common and scalable approach, and it also provides durability and high availability of the data.
B is the correct answer. Official AWS documentation: "Amazon ML allows you to create a datasource object from data stored in a MySQL database in Amazon Relational Database Service (Amazon RDS). When you perform this action, Amazon ML creates an AWS Data Pipeline object that executes the SQL query that you specify, and places the output into an S3 bucket of your choice. Amazon ML uses that data to create the datasource."
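Once AWS Data Pipeline has placed the query output in Amazon S3, reading it from the SageMaker notebook is straightforward. A minimal sketch with boto3 and pandas is below; the bucket and key are placeholder locations.

import boto3
import pandas as pd
from io import BytesIO

s3 = boto3.client("s3")
# Placeholder bucket/key where the Data Pipeline export landed.
obj = s3.get_object(Bucket="my-bucket", Key="exports/customers.csv")
df = pd.read_csv(BytesIO(obj["Body"].read()))
print(df.shape)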
Question: 11
A Machine Learning Specialist receives customer data for an online shopping website. The data includes
demographics, past visits, and locality information. The
Specialist must develop a machine learning approach to identify the customer shopping patterns, preferences, and
trends to enhance the website for better service and smart recommendations.
Which solution should the Specialist recommend?
A. Latent Dirichlet Allocation (LDA) for the given collection of discrete data to identify patterns in the customer
database.
B. A neural network with a minimum of three layers and random initial weights to identify patterns in the
customer database.
C. Collaborative filtering based on user interactions and correlations to identify patterns in the customer
database.
D. Random Cut Forest (RCF) over random subsamples to identify patterns in the customer database.
Answer: C
Explanation:
Collaborative filtering is a machine learning technique that recommends products or services to users based
on the ratings or preferences of other users. This technique is well-suited for identifying customer shopping patterns
and preferences because it takes into account the interactions between users and products.
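To make "user interactions and correlations" concrete, here is a tiny user-based collaborative-filtering sketch in NumPy on a made-up interaction matrix; it only illustrates the similarity and scoring step, not a production recommender.

import numpy as np

# Rows = users, columns = items; values are interaction strengths (made-up data).
interactions = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 0, 5, 4],
], dtype=float)

# Cosine similarity between users based on their interaction vectors.
norms = np.linalg.norm(interactions, axis=1, keepdims=True)
similarity = (interactions / norms) @ (interactions / norms).T

# For user 0, score items by similarity-weighted sums of other users' interactions.
user = 0
scores = similarity[user] @ interactions
scores[interactions[user] > 0] = -np.inf   # mask items the user already interacted with
print("recommended item:", int(np.argmax(scores)))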
A. Linear regression
B. Classification
C. Clustering
D. Reinforcement learning
Answer: B
Explanation:
The goal of classification is to determine to which class or category a data point (a customer, in this case) belongs. For classification problems, data scientists use historical data with predefined target variables, also known as labels (churner/non-churner), that is, the answers that need to be predicted, to train an algorithm. With classification, businesses can answer the following questions:
- Will this customer churn or not?
- Will a customer renew their subscription?
- Will a user downgrade a pricing plan?
- Are there any signs of unusual customer behavior?
Reference:
https://www.kdnuggets.com/2019/05/churn-prediction-machine-learning.html
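As a minimal illustration of the labelled-target setup described above, the scikit-learn sketch below trains a churn classifier on synthetic data; the features and labels are made up purely for demonstration.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic features: monthly_spend, support_tickets, months_as_customer.
X = rng.normal(size=(500, 3))
# Synthetic churn label loosely tied to the features (illustrative only).
y = (X[:, 1] - X[:, 2] + rng.normal(scale=0.5, size=500) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression().fit(X_train, y_train)
print("churn probability for first test customer:",
      clf.predict_proba(X_test[:1])[0, 1])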