Free AWS Certified Machine Learning – Specialty MLS-C01 Exam Practice Test

Total Questions: 307

Question 1
A data scientist uses an Amazon SageMaker notebook instance to conduct data exploration and analysis. This requires certain Python packages that are not natively available on Amazon SageMaker to be installed on the notebook instance. How can a machine learning specialist ensure that required packages are automatically available on the notebook instance for the data scientist to use?
- Install AWS Systems Manager Agent on the underlying Amazon EC2 instance and use Systems Manager Automation to execute the package installation commands.
- Create a Jupyter notebook file (.ipynb) with cells containing the package installation commands to execute and place the file under the /etc/init directory of each Amazon SageMaker notebook instance.
- Use the conda package manager from within the Jupyter notebook console to apply the necessary conda packages to the default kernel of the notebook.
- Create an Amazon SageMaker lifecycle configuration with package installation commands and assign the lifecycle configuration to the notebook instance.
Answer: D
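To make the lifecycle configuration in the last option more concrete, here is a minimal sketch that builds the request for an on-start script. The package list, the configuration name, and the `python3` conda environment name are assumptions for illustration; the script content is base64-encoded because that is the form the SageMaker API expects.

```python
import base64

# Hypothetical package list; replace with whatever the data scientist needs.
PACKAGES = ["xgboost", "shap"]


def build_on_start_script(packages):
    """Return a shell script that installs packages each time the notebook starts."""
    lines = [
        "#!/bin/bash",
        "set -e",
        # Run as ec2-user so packages land in the user's conda environment.
        "sudo -u ec2-user -i <<'EOF'",
        "source activate python3  # assumed environment name",
        "pip install " + " ".join(packages),
        "EOF",
    ]
    return "\n".join(lines) + "\n"


def build_lifecycle_config_request(name, script):
    """Build keyword arguments for create_notebook_instance_lifecycle_config."""
    content = base64.b64encode(script.encode("utf-8")).decode("ascii")
    return {
        "NotebookInstanceLifecycleConfigName": name,
        "OnStart": [{"Content": content}],
    }


request = build_lifecycle_config_request(
    "install-packages", build_on_start_script(PACKAGES)
)
# With boto3 installed and credentials configured, the request could be sent as:
# boto3.client("sagemaker").create_notebook_instance_lifecycle_config(**request)
```

The configuration then has to be attached to the notebook instance, which is what makes the packages available automatically on every start.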
Question 2
A Machine Learning Specialist deployed a model that provides product recommendations on a company's website. Initially, the model was performing very well and resulted in customers buying more products on average. However, within the past few months, the Specialist has noticed that the effect of product recommendations has diminished and customers are starting to return to their original habits of spending less. The Specialist is unsure of what happened, as the model has not changed from its initial deployment over a year ago. Which method should the Specialist try to improve model performance?
- The model needs to be completely re-engineered because it is unable to handle product inventory changes
- The model's hyperparameters should be periodically updated to prevent drift
- The model should be periodically retrained from scratch using the original data while adding a regularization term to handle product inventory changes
- The model should be periodically retrained using the original training data plus new data as product inventory changes
Answer: D
Question 3
A Data Scientist is developing a binary classifier to predict whether a patient has a particular disease based on a series of test results. The Data Scientist has data on 400 patients randomly selected from the population. The disease is seen in 3% of the population. Which cross-validation strategy should the Data Scientist adopt?
- A k-fold cross-validation strategy with k=5
- A stratified k-fold cross-validation strategy with k=5
- A k-fold cross-validation strategy with k=5 and 3 repeats
- An 80/20 stratified split between training and validation
Answer: B
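The reason stratification matters here can be seen in a small sketch. It assumes the 400-patient sample contains exactly 12 positives (3% of 400, which a random sample would only approximate) and uses a hand-written fold assignment rather than any particular library's implementation.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=5, seed=0):
    """Assign each sample index to one of k folds, preserving class ratios."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal indices round-robin so every fold gets a near-equal share of the class.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


# Assumed sample: 12 positive patients and 388 negative patients.
labels = [1] * 12 + [0] * 388
folds = stratified_k_fold(labels, k=5)
positives_per_fold = [sum(labels[i] for i in fold) for fold in folds]
```

With stratification, every fold holds two or three positive cases. A plain k-fold split gives no such guarantee, so some folds could end up with almost no positives to validate on.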
Question 4
A Machine Learning Specialist at a company sensitive to security is preparing a dataset for model training. The dataset is stored in Amazon S3 and contains Personally Identifiable Information (PII). The dataset must be accessible from a VPC only and must not traverse the public internet. How can these requirements be satisfied?
- Create a VPC endpoint and apply a bucket access policy that restricts access to the given VPC endpoint and the VPC.
- Create a VPC endpoint and apply a bucket access policy that allows access from the given VPC endpoint and an Amazon EC2 instance.
- Create a VPC endpoint and use Network Access Control Lists (NACLs) to allow traffic between only the given VPC endpoint and an Amazon EC2 instance.
- Create a VPC endpoint and use security groups to restrict access to the given VPC endpoint and an Amazon EC2 instance.
Answer: A
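A bucket policy of the kind described in the first option might look like the sketch below. The bucket name and VPC endpoint ID are hypothetical placeholders; the `aws:SourceVpce` condition key is what ties access to a specific VPC endpoint.

```python
import json


def vpc_only_bucket_policy(bucket, vpce_id):
    """Build a bucket policy that denies any request not arriving via the endpoint."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyAccessOutsideVpcEndpoint",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                # Requests from any other source fail this condition and are denied.
                "Condition": {"StringNotEquals": {"aws:SourceVpce": vpce_id}},
            }
        ],
    }


# Hypothetical identifiers, for illustration only.
policy = vpc_only_bucket_policy("example-training-data", "vpce-0123456789abcdef0")
policy_json = json.dumps(policy, indent=2)
```

Because traffic to a gateway VPC endpoint stays on the AWS network, this combination addresses both requirements: VPC-only access and no traversal of the public internet.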
Question 5
A manufacturing company wants to use machine learning (ML) to automate quality control in its facilities. The facilities are in remote locations and have limited internet connectivity. The company has 20 of training data that consists of labeled images of defective product parts. The training data is in the corporate on-premises data center. The company will use this data to train a model for real-time defect detection in new parts as the parts move on a conveyor belt in the facilities. The company needs a solution that minimizes costs for compute infrastructure and that maximizes the scalability of resources for training. The solution also must facilitate the company's use of an ML model in the low-connectivity environments. Which solution will meet these requirements?
- Move the training data to an Amazon S3 bucket. Train and evaluate the model by using Amazon SageMaker. Optimize the model by using SageMaker Neo. Deploy the model on a SageMaker hosting services endpoint.
- Train and evaluate the model on premises. Upload the model to an Amazon S3 bucket. Deploy the model on an Amazon SageMaker hosting services endpoint.
- Move the training data to an Amazon S3 bucket. Train and evaluate the model by using Amazon SageMaker. Optimize the model by using SageMaker Neo. Set up an edge device in the manufacturing facilities with AWS IoT Greengrass. Deploy the model on the edge device.
- Train the model on premises. Upload the model to an Amazon S3 bucket. Set up an edge device in the manufacturing facilities with AWS IoT Greengrass. Deploy the model on the edge device.
Answer: C
Question 6
An online store is predicting future book sales by using a linear regression model that is based on past sales data. The data includes duration, a numerical feature that represents the number of days that a book has been listed in the online store. A data scientist performs an exploratory data analysis and discovers that the relationship between book sales and duration is skewed and non-linear. Which data transformation step should the data scientist take to improve the predictions of the model?
- One-hot encoding
- Cartesian product transformation
- Quantile binning
- Normalization
Answer: C
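As a rough illustration of quantile binning, the sketch below turns a skewed numeric feature into bin indices using only the standard library. The sample `durations` values are made up for the example, and the choice of four bins is an assumption.

```python
import bisect
import statistics


def quantile_bin(values, n_bins=4):
    """Map each value to the index of the quantile bin it falls into."""
    # statistics.quantiles returns the n_bins - 1 cut points between bins.
    edges = statistics.quantiles(values, n=n_bins)
    return [bisect.bisect_right(edges, v) for v in values]


# Hypothetical, heavily right-skewed listing durations in days.
durations = [1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233]
bins = quantile_bin(durations, n_bins=4)
```

Each bin receives roughly the same number of rows regardless of how skewed the raw values are, which lets a linear model learn a separate effect per bin instead of a single slope.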
Question 7
A retail company wants to build a recommendation system for the company's website. The system needs to provide recommendations for existing users and needs to base those recommendations on each user's past browsing history. The system also must filter out any items that the user previously purchased. Which solution will meet these requirements with the LEAST development effort?
- Train a model by using a user-based collaborative filtering algorithm on Amazon SageMaker. Host the model on a SageMaker real-time endpoint. Configure an Amazon API Gateway API and an AWS Lambda function to handle real-time inference requests that the web application sends. Exclude the items that the user previously purchased from the results before sending the results back to the web application.
- Use an Amazon Personalize PERSONALIZED_RANKING recipe to train a model. Create a real-time filter to exclude items that the user previously purchased. Create and deploy a campaign on Amazon Personalize. Use the GetPersonalizedRanking API operation to get the real-time recommendations.
- Use an Amazon Personalize USER_PERSONALIZATION recipe to train a model. Create a real-time filter to exclude items that the user previously purchased. Create and deploy a campaign on Amazon Personalize. Use the GetRecommendations API operation to get the real-time recommendations.
- Train a neural collaborative filtering model on Amazon SageMaker by using GPU instances. Host the model on a SageMaker real-time endpoint. Configure an Amazon API Gateway API and an AWS Lambda function to handle real-time inference requests that the web application sends. Exclude the items that the user previously purchased from the results before sending the results back to the web application.
Answer: C
Question 8
A Data Scientist is developing a machine learning model to predict future patient outcomes based on information collected about each patient and their treatment plans. The model should output a continuous value as its prediction. The data available includes labeled outcomes for a set of 4,000 patients. The study was conducted on a group of individuals over the age of 65 who have a particular disease that is known to worsen with age. Initial models have performed poorly. While reviewing the underlying data, the Data Scientist notices that, out of 4,000 patient observations, there are 450 where the patient age has been input as 0. The other features for these observations appear normal compared to the rest of the sample population. How should the Data Scientist correct this issue?
- Drop all records from the dataset where age has been set to 0.
- Replace the age field value for records with a value of 0 with the mean or median value from the dataset.
- Drop the age feature from the dataset and train the model using the rest of the features.
- Use k-means clustering to handle missing features.
Answer: B
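The imputation in the second option can be sketched in a few lines. The sample ages are invented for illustration, and the median is computed only from the valid (non-zero) ages so that the bad entries do not drag it down.

```python
import statistics


def impute_invalid_age(ages, invalid=0):
    """Replace invalid age entries with the median of the valid ages."""
    valid = [age for age in ages if age != invalid]
    median_age = statistics.median(valid)
    return [median_age if age == invalid else age for age in ages]


# Hypothetical sample: two records have age wrongly entered as 0.
ages = [67, 0, 72, 81, 0, 69, 75]
cleaned = impute_invalid_age(ages)
```

This keeps all records, including the 450 whose other features look normal, rather than discarding more than 10% of the dataset.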
Question 9
A machine learning (ML) specialist must develop a classification model for a financial services company. A domain expert provides the dataset, which is tabular with 10,000 rows and 1,020 features. During exploratory data analysis, the specialist finds no missing values and a small percentage of duplicate rows. There are correlation scores of > 0.9 for 200 feature pairs. The mean value of each feature is similar to its 50th percentile. Which feature engineering strategy should the ML specialist use with Amazon SageMaker?
- Apply dimensionality reduction by using the principal component analysis (PCA) algorithm.
- Drop the features with low correlation scores by using a Jupyter notebook.
- Apply anomaly detection by using the Random Cut Forest (RCF) algorithm.
- Concatenate the features with high correlation scores by using a Jupyter notebook.
Answer: A
Question 10
A company ingests machine learning (ML) data from web advertising clicks into an Amazon S3 data lake. Click data is added to an Amazon Kinesis data stream by using the Kinesis Producer Library (KPL). The data is loaded into the S3 data lake from the data stream by using an Amazon Kinesis Data Firehose delivery stream. As the data volume increases, an ML specialist notices that the rate of data ingested into Amazon S3 is relatively constant. There also is an increasing backlog of data for Kinesis Data Streams and Kinesis Data Firehose to ingest. Which next step is MOST likely to improve the data ingestion rate into Amazon S3?
- Increase the number of S3 prefixes for the delivery stream to write to.
- Decrease the retention period for the data stream.
- Increase the number of shards for the data stream.
- Add more consumers using the Kinesis Client Library (KCL).
Answer: C
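A back-of-the-envelope calculation shows why shard count caps ingestion. The sketch assumes a provisioned-mode stream, where each shard accepts up to 1 MB per second or 1,000 records per second of writes; the traffic figures in the example call are hypothetical.

```python
import math

# Per-shard write limits for a provisioned Kinesis data stream.
SHARD_WRITE_MB_PER_SEC = 1.0
SHARD_WRITE_RECORDS_PER_SEC = 1000


def shards_needed(mb_per_sec, records_per_sec):
    """Estimate the minimum shard count for a given write load."""
    by_bytes = math.ceil(mb_per_sec / SHARD_WRITE_MB_PER_SEC)
    by_records = math.ceil(records_per_sec / SHARD_WRITE_RECORDS_PER_SEC)
    # Whichever limit is hit first determines the shard count.
    return max(1, by_bytes, by_records)


# Hypothetical load: 4.5 MB/s arriving as 3,000 records/s.
required = shards_needed(4.5, 3000)
```

If the stream has fewer shards than the estimate, producers are throttled and a backlog builds up, which matches the flat ingestion rate described in the question.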
