Name: Amazon MLS-C01 Exam
Brand: CertsLeader
SKU: MLS-C01
Rating: 5.0 (979 reviews)

Amazon MLS-C01 Sample Questions

Question # 1

A data scientist stores financial datasets in Amazon S3. The data scientist uses AmazonAthena to query the datasets by using SQL.The data scientist uses Amazon SageMaker to deploy a machine learning (ML) model. Thedata scientist wants to obtain inferences from the model at the SageMaker endpointHowever, when the data …. ntist attempts to invoke the SageMaker endpoint, the datascientist receives SOL statement failures The data scientist's 1AM user is currently unableto invoke the SageMaker endpointWhich combination of actions will give the data scientist's 1AM user the ability to invoke the SageMaker endpoint? (Select THREE.)

A. Attach the AmazonAthenaFullAccess AWS managed policy to the user identity.
B. Include a policy statement for the data scientist's 1AM user that allows the 1AM user toperform the sagemaker: lnvokeEndpoint action,
C. Include an inline policy for the data scientist’s 1AM user that allows SageMaker to readS3 objects
D. Include a policy statement for the data scientist's 1AM user that allows the 1AM user toperform the sagemakerGetRecord action.
E. Include the SQL statement "USING EXTERNAL FUNCTION ml_function_name" in theAthena SQL query.
F. Perform a user remapping in SageMaker to map the 1AM user to another 1AM user thatis on the hosted endpoint.

Question # 2

A Machine Learning Specialist is designing a scalable data storage solution for AmazonSageMaker. There is an existing TensorFlow-based model implemented as a train.py scriptthat relies on static training data that is currently stored as TFRecords.Which method of providing training data to Amazon SageMaker would meet the businessrequirements with the LEAST development overhead?

A. Use Amazon SageMaker script mode and use train.py unchanged. Point the AmazonSageMaker training invocation to the local path of the data without reformatting the trainingdata.
B. Use Amazon SageMaker script mode and use train.py unchanged. Put the TFRecorddata into an Amazon S3 bucket. Point the Amazon SageMaker training invocation to the S3bucket without reformatting the training data.
C. Rewrite the train.py script to add a section that converts TFRecords to protobuf andingests the protobuf data instead of TFRecords.
D. Prepare the data in the format accepted by Amazon SageMaker. Use AWS Glue orAWS Lambda to reformat and store the data in an Amazon S3 bucket.

Question # 3

A credit card company wants to identify fraudulent transactions in real time. A data scientistbuilds a machine learning model for this purpose. The transactional data is captured andstored in Amazon S3. The historic data is already labeled with two classes: fraud (positive)and fair transactions (negative). The data scientist removes all the missing data and buildsa classifier by using the XGBoost algorithm in Amazon SageMaker. The model producesthe following results:• True positive rate (TPR): 0.700• False negative rate (FNR): 0.300• True negative rate (TNR): 0.977• False positive rate (FPR): 0.023• Overall accuracy: 0.949Which solution should the data scientist use to improve the performance of the model?

A. Apply the Synthetic Minority Oversampling Technique (SMOTE) on the minority class inthe training dataset. Retrain the model with the updated training data.
B. Apply the Synthetic Minority Oversampling Technique (SMOTE) on the majority class in the training dataset. Retrain the model with the updated training data.
C. Undersample the minority class.
D. Oversample the majority class.

Question # 4

A pharmaceutical company performs periodic audits of clinical trial sites to quickly resolvecritical findings. The company stores audit documents in text format. Auditors haverequested help from a data science team to quickly analyze the documents. The auditorsneed to discover the 10 main topics within the documents to prioritize and distribute thereview work among the auditing team members. Documents that describe adverse eventsmust receive the highest priority. A data scientist will use statistical modeling to discover abstract topics and to provide a listof the top words for each category to help the auditors assess the relevance of the topic.Which algorithms are best suited to this scenario? (Choose two.)

A. Latent Dirichlet allocation (LDA)
B. Random Forest classifier
C. Neural topic modeling (NTM)
D. Linear support vector machine
E. Linear regression

Question # 5

A media company wants to create a solution that identifies celebrities in pictures that usersupload. The company also wants to identify the IP address and the timestamp details fromthe users so the company can prevent users from uploading pictures from unauthorizedlocations.Which solution will meet these requirements with LEAST development effort?

A. Use AWS Panorama to identify celebrities in the pictures. Use AWS CloudTrail tocapture IP address and timestamp details.
B. Use AWS Panorama to identify celebrities in the pictures. Make calls to the AWSPanorama Device SDK to capture IP address and timestamp details.
C. Use Amazon Rekognition to identify celebrities in the pictures. Use AWS CloudTrail tocapture IP address and timestamp details.
D. Use Amazon Rekognition to identify celebrities in the pictures. Use the text detectionfeature to capture IP address and timestamp details.

Question # 6

A retail company stores 100 GB of daily transactional data in Amazon S3 at periodicintervals. The company wants to identify the schema of the transactional data. Thecompany also wants to perform transformations on the transactional data that is in AmazonS3.The company wants to use a machine learning (ML) approach to detect fraud in thetransformed data.Which combination of solutions will meet these requirements with the LEAST operationaloverhead? {Select THREE.)

A. Use Amazon Athena to scan the data and identify the schema.
B. Use AWS Glue crawlers to scan the data and identify the schema.
C. Use Amazon Redshift to store procedures to perform data transformations
D. Use AWS Glue workflows and AWS Glue jobs to perform data transformations.
E. Use Amazon Redshift ML to train a model to detect fraud.
F. Use Amazon Fraud Detector to train a model to detect fraud.

Question # 7

An automotive company uses computer vision in its autonomous cars. The companytrained its object detection models successfully by using transfer learning from aconvolutional neural network (CNN). The company trained the models by using PyTorch through the Amazon SageMaker SDK.The vehicles have limited hardware and compute power. The company wants to optimizethe model to reduce memory, battery, and hardware consumption without a significantsacrifice in accuracy.Which solution will improve the computational efficiency of the models?

A. Use Amazon CloudWatch metrics to gain visibility into the SageMaker training weights,gradients, biases, and activation outputs. Compute the filter ranks based on the traininginformation. Apply pruning to remove the low-ranking filters. Set new weights based on thepruned set of filters. Run a new training job with the pruned model.
B. Use Amazon SageMaker Ground Truth to build and run data labeling workflows. Collecta larger labeled dataset with the labelling workflows. Run a new training job that uses thenew labeled data with previous training data.
C. Use Amazon SageMaker Debugger to gain visibility into the training weights, gradients,biases, and activation outputs. Compute the filter ranks based on the training information.Apply pruning to remove the low-ranking filters. Set the new weights based on the prunedset of filters. Run a new training job with the pruned model.
D. Use Amazon SageMaker Model Monitor to gain visibility into the ModelLatency metricand OverheadLatency metric of the model after the company deploys the model. Increasethe model learning rate. Run a new training job.

Question # 8

A media company is building a computer vision model to analyze images that are on socialmedia. The model consists of CNNs that the company trained by using images that thecompany stores in Amazon S3. The company used an Amazon SageMaker training job inFile mode with a single Amazon EC2 On-Demand Instance.Every day, the company updates the model by using about 10,000 images that thecompany has collected in the last 24 hours. The company configures training with only oneepoch. The company wants to speed up training and lower costs without the need to makeany code changes.Which solution will meet these requirements?

A. Instead of File mode, configure the SageMaker training job to use Pipe mode. Ingest thedata from a pipe.
B. Instead Of File mode, configure the SageMaker training job to use FastFile mode withno Other changes.
C. Instead Of On-Demand Instances, configure the SageMaker training job to use SpotInstances. Make no Other changes.
D. Instead Of On-Demand Instances, configure the SageMaker training job to use SpotInstances. Implement model checkpoints.

Question # 9

A data scientist is building a forecasting model for a retail company by using the mostrecent 5 years of sales records that are stored in a data warehouse. The dataset containssales records for each of the company's stores across five commercial regions The datascientist creates a working dataset with StorelD. Region. Date, and Sales Amount ascolumns. The data scientist wants to analyze yearly average sales for each region. Thescientist also wants to compare how each region performed compared to average salesacross all commercial regions.Which visualization will help the data scientist better understand the data trend?

A. Create an aggregated dataset by using the Pandas GroupBy function to get averagesales for each year for each store. Create a bar plot, faceted by year, of average sales foreach store. Add an extra bar in each facet to represent average sales.
B. Create an aggregated dataset by using the Pandas GroupBy function to get averagesales for each year for each store. Create a bar plot, colored by region and faceted by year,of average sales for each store. Add a horizontal line in each facet to represent averagesales.
C. Create an aggregated dataset by using the Pandas GroupBy function to get averagesales for each year for each region Create a bar plot of average sales for each region. Addan extra bar in each facet to represent average sales.
D. Create an aggregated dataset by using the Pandas GroupBy function to get average sales for each year for each region Create a bar plot, faceted by year, of average sales foreach region Add a horizontal line in each facet to represent average sales.

Question # 10

A data scientist is training a large PyTorch model by using Amazon SageMaker. It takes 10hours on average to train the model on GPU instances. The data scientist suspects thattraining is not converging and thatresource utilization is not optimal.What should the data scientist do to identify and address training issues with the LEASTdevelopment effort?

A. Use CPU utilization metrics that are captured in Amazon CloudWatch. Configure aCloudWatch alarm to stop the training job early if low CPU utilization occurs.
B. Use high-resolution custom metrics that are captured in Amazon CloudWatch. Configurean AWS Lambda function to analyze the metrics and to stop the training job early if issuesare detected.
C. Use the SageMaker Debugger vanishing_gradient and LowGPUUtilization built-in rulesto detect issues and to launch the StopTrainingJob action if issues are detected.
D. Use the SageMaker Debugger confusion and feature_importance_overweight built-inrules to detect issues and to launch the StopTrainingJob action if issues are detected.

Question # 11

A company builds computer-vision models that use deep learning for the autonomousvehicle industry. A machine learning (ML) specialist uses an Amazon EC2 instance thathas a CPU: GPU ratio of 12:1 to train the models.The ML specialist examines the instance metric logs and notices that the GPU is idle half ofthe time The ML specialist must reduce training costs without increasing the duration of thetraining jobs.Which solution will meet these requirements?

A. Switch to an instance type that has only CPUs.
B. Use a heterogeneous cluster that has two different instances groups.
C. Use memory-optimized EC2 Spot Instances for the training jobs.
D. Switch to an instance type that has a CPU GPU ratio of 6:1.

Question # 12

An engraving company wants to automate its quality control process for plaques. Thecompany performs the process before mailing each customized plaque to a customer. Thecompany has created an Amazon S3 bucket that contains images of defects that shouldcause a plaque to be rejected. Low-confidence predictions must be sent to an internal teamof reviewers who are using Amazon Augmented Al (Amazon A2I).Which solution will meet these requirements?

A. Use Amazon Textract for automatic processing. Use Amazon A2I with AmazonMechanical Turk for manual review.
B. Use Amazon Rekognition for automatic processing. Use Amazon A2I with a privateworkforce option for manual review.
C. Use Amazon Transcribe for automatic processing. Use Amazon A2I with a privateworkforce option for manual review.
D. Use AWS Panorama for automatic processing Use Amazon A2I with AmazonMechanical Turk for manual review

Question # 13

An Amazon SageMaker notebook instance is launched into Amazon VPC The SageMakernotebook references data contained in an Amazon S3 bucket in another account Thebucket is encrypted using SSE-KMS The instance returns an access denied error whentrying to access data in Amazon S3.Which of the following are required to access the bucket and avoid the access deniederror? (Select THREE)

A. An AWS KMS key policy that allows access to the customer master key (CMK)
B. A SageMaker notebook security group that allows access to Amazon S3
C. An 1AM role that allows access to the specific S3 bucket
D. A permissive S3 bucket policy
E. An S3 bucket owner that matches the notebook owner
F. A SegaMaker notebook subnet ACL that allow traffic to Amazon S3.

Question # 14

A machine learning (ML) engineer has created a feature repository in Amazon SageMakerFeature Store for the company. The company has AWS accounts for development,integration, and production. The company hosts a feature store in the developmentaccount. The company uses Amazon S3 buckets to store feature values offline. Thecompany wants to share features and to allow the integration account and the productionaccount to reuse the features that are in the feature repository. Which combination of steps will meet these requirements? (Select TWO.)

A. Create an IAM role in the development account that the integration account andproduction account can assume. Attach IAM policies to the role that allow access to thefeature repository and the S3 buckets.
B. Share the feature repository that is associated the S3 buckets from the developmentaccount to the integration account and the production account by using AWS ResourceAccess Manager (AWS RAM).
C. Use AWS Security Token Service (AWS STS) from the integration account and theproduction account to retrieve credentials for the development account.
D. Set up S3 replication between the development S3 buckets and the integration andproduction S3 buckets.
E. Create an AWS PrivateLink endpoint in the development account for SageMaker.

Question # 15

A network security vendor needs to ingest telemetry data from thousands of endpoints thatrun all over the world. The data is transmitted every 30 seconds in the form of records thatcontain 50 fields. Each record is up to 1 KB in size. The security vendor uses AmazonKinesis Data Streams to ingest the data. The vendor requires hourly summaries of therecords that Kinesis Data Streams ingests. The vendor will use Amazon Athena to querythe records and to generate the summaries. The Athena queries will target 7 to 12 of theavailable data fields.Which solution will meet these requirements with the LEAST amount of customization totransform and store the ingested data?

A. Use AWS Lambda to read and aggregate the data hourly. Transform the data and storeit in Amazon S3 by using Amazon Kinesis Data Firehose.
B. Use Amazon Kinesis Data Firehose to read and aggregate the data hourly. Transformthe data and store it in Amazon S3 by using a short-lived Amazon EMR cluster.
C. Use Amazon Kinesis Data Analytics to read and aggregate the data hourly. Transformthe data and store it in Amazon S3 by using Amazon Kinesis Data Firehose.
D. Use Amazon Kinesis Data Firehose to read and aggregate the data hourly. Transform the data and store it in Amazon S3 by using AWS Lambda.

Question # 16

A data scientist is building a linear regression model. The scientist inspects the dataset andnotices that the mode of the distribution is lower than the median, and the median is lowerthan the mean.Which data transformation will give the data scientist the ability to apply a linear regressionmodel?

A. Exponential transformation
B. Logarithmic transformation
C. Polynomial transformation
D. Sinusoidal transformation

Question # 17

A car company is developing a machine learning solution to detect whether a car is presentin an image. The image dataset consists of one million images. Each image in the datasetis 200 pixels in height by 200 pixels in width. Each image is labeled as either having a caror not having a car.Which architecture is MOST likely to produce a model that detects whether a car is presentin an image with the highest accuracy?

A. Use a deep convolutional neural network (CNN) classifier with the images as input.Include a linear output layer that outputs the probability that an image contains a car.
B. Use a deep convolutional neural network (CNN) classifier with the images as input.Include a softmax output layer that outputs the probability that an image contains a car.
C. Use a deep multilayer perceptron (MLP) classifier with the images as input. Include alinear output layer that outputs the probability that an image contains a car.
D. Use a deep multilayer perceptron (MLP) classifier with the images as input. Include asoftmax output layer that outputs the probability that an image contains a car.

Question # 18

A university wants to develop a targeted recruitment strategy to increase new studentenrollment. A data scientist gathers information about the academic performance history ofstudents. The data scientist wants to use the data to build student profiles. The universitywill use the profiles to direct resources to recruit students who are likely to enroll in theuniversity.Which combination of steps should the data scientist take to predict whether a particularstudent applicant is likely to enroll in the university? (Select TWO)

A. Use Amazon SageMaker Ground Truth to sort the data into two groups named"enrolled" or "not enrolled."
B. Use a forecasting algorithm to run predictions.
C. Use a regression algorithm to run predictions.
D. Use a classification algorithm to run predictions
E. Use the built-in Amazon SageMaker k-means algorithm to cluster the data into twogroups named "enrolled" or "not enrolled."

Question # 19

An insurance company developed a new experimental machine learning (ML) model toreplace an existing model that is in production. The company must validate the quality ofpredictions from the new experimental model in a production environment before thecompany uses the new experimental model to serve general user requests.Which one model can serve user requests at a time. The company must measure theperformance of the new experimental model without affecting the current live trafficWhich solution will meet these requirements?

A. A/B testing
B. Canary release
C. Shadow deployment
D. Blue/green deployment

Question # 20

A company wants to detect credit card fraud. The company has observed that an averageof 2% of credit card transactions are fraudulent. A data scientist trains a classifier on ayear's worth of credit card transaction data. The classifier needs to identify the fraudulenttransactions. The company wants to accurately capture as many fraudulent transactions aspossible.Which metrics should the data scientist use to optimize the classifier? (Select TWO.)

A. Specificity
B. False positive rate
C. Accuracy
D. Fl score
E. True positive rate

Exam Code	MLS-C01
Exam Name	AWS Certified Machine Learning - Specialty
Questions	208 Questions Answers With Explanation
Update Date	November 10,2024
Price	Was : ~~$81~~ Today : $45 Was : ~~$99~~ Today : $55 Was : ~~$117~~ Today : $65

Amazon MLS-C01 Exam Dumps

AWS Certified Machine Learning - Specialty

Why Certsleader is Your Best Choice:

Guaranteed Success

Start Your Journey with Certsleader

Amazon MLS-C01 Sample Questions

Leave Your Review

Top Microsoft Exams

Top Cisco Exams

Top Amazon Exams