Given the large number of stores and the legacy data ingestion, which change will require the LEAST amount of development effort?
A retail chain has been ingesting purchasing records from its network of 20,000 stores to Amazon S3 using Amazon Kinesis Data Firehose. To support training an improved machine learning model, training records will require new but simple transformations, and some attributes will be combined. The model needs to be retrained...
How should the records be stored in Amazon S3 to improve query performance?
A monitoring service generates 1 TB of scale metrics record data every minute. A research team performs queries on this data using Amazon Athena. The queries run slowly due to the large volume of data, and the team requires better performance. How should the records be stored in Amazon S3...
Which prior probability distribution should the ML Specialist use for this variable?
A Machine Learning Specialist is implementing a full Bayesian network on a dataset that describes public transit in New York City. One of the random variables is discrete, and represents the number of minutes New Yorkers wait for a bus given that the buses cycle every 10 minutes, with a...
What metric is BEST suited to score the model?
A Machine Learning Specialist is working for a credit card processing company and receives an unbalanced dataset containing credit card transactions. It contains 99,000 valid transactions and 1,000 fraudulent transactions. The Specialist is asked to score a model that was run against the dataset. The Specialist has been advised that...
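The class counts in this question make the imbalance easy to quantify. The following is a minimal sketch, using only the 99,000 and 1,000 figures from the question, of how a hypothetical baseline that labels every transaction as valid would score. The function names are illustrative and not part of the original question.

```python
# Class counts taken from the question text.
n_valid = 99_000
n_fraud = 1_000


def accuracy_of_always_valid(n_valid, n_fraud):
    """Accuracy of a hypothetical baseline that labels every transaction valid."""
    return n_valid / (n_valid + n_fraud)


def recall_of_always_valid(n_fraud):
    """Fraud-class recall of the same baseline: it never flags any fraud."""
    true_positives = 0
    return true_positives / n_fraud


baseline_accuracy = accuracy_of_always_valid(n_valid, n_fraud)
baseline_recall = recall_of_always_valid(n_fraud)
print(f"accuracy={baseline_accuracy:.2%}, fraud recall={baseline_recall:.2%}")
# prints: accuracy=99.00%, fraud recall=0.00%
```

A model that detects no fraud at all still reaches 99% accuracy on this dataset, which is why the choice of scoring metric matters here.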
How should the Specialist frame this business problem?
A Machine Learning Specialist works for a credit card processing company and needs to predict which transactions may be fraudulent in near-real time. Specifically, the Specialist must train a model that returns the probability that a given transaction may be fraudulent. How should the Specialist frame this business problem? A. ...
What should the Specialist do to ensure better convergence during backpropagation?
While working on a neural network project, a Machine Learning Specialist discovers that some features in the data have very high magnitude, resulting in this data being weighted more in the cost function. What should the Specialist do to ensure better convergence during backpropagation? A. Dimensionality reduction B. Data normalization C...
What feature engineering and model development approach should the Specialist take with a dataset this large?
A Machine Learning Specialist is working with multiple data sources containing billions of records that need to be joined. What feature engineering and model development approach should the Specialist take with a dataset this large? A. Use an Amazon SageMaker notebook for both feature engineering and model development B. Use...
Which model is MOST likely to provide the best results in Amazon SageMaker?
A city wants to monitor its air quality to address the consequences of air pollution. A Machine Learning Specialist needs to forecast the air quality in parts per million of contaminants for the next 2 days in the city. As this is a prototype, only daily data from the last...
Which approach should the Specialist use to continue working?
A Machine Learning Specialist is assigned a TensorFlow project using Amazon SageMaker for training, and needs to continue working for an extended period with no Wi-Fi access. Which approach should the Specialist use to continue working? A. Install Python 3 and boto3 on their laptop and continue the code development...
A Machine Learning Specialist is working with a media company to perform classification on popular articles from the company's website. The company is using random forests to classify how popular an article will be before it is published. A sample of the data being used is below.
A Machine Learning Specialist is working with a media company to perform classification on popular articles from the company's website. The company is using random forests to classify how popular an article will be before it is published. A sample of the data being used is below. Given the dataset,...