Which combination of steps should the Data Scientist take to reduce the number of false positive predictions by the model?

A Data Scientist is developing a machine learning model to classify whether a financial transaction is fraudulent. The labeled data available for training consists of 100,000 non-fraudulent observations and 1,000 fraudulent observations. The Data Scientist applies the XGBoost algorithm to the data, resulting in the following confusion matrix when the...

September 4, 2024

Which services are integrated with Amazon SageMaker to track this information?

A Machine Learning Specialist is configuring Amazon SageMaker so multiple Data Scientists can access notebooks, train models, and deploy endpoints. To ensure the best operational performance, the Specialist needs to be able to track how often the Scientists are deploying models, GPU and CPU utilization on the deployed SageMaker endpoints,...

September 4, 2024

What model should be used to complete this work?

A Machine Learning Specialist was given a dataset consisting of unlabeled data. The Specialist must create a model that can help the team classify the data into different buckets. What model should be used to complete this work?
A. K-means clustering
B. Random Cut Forest (RCF)
C. XGBoost
D. BlazingText

September 4, 2024

Which method should the Specialist try to improve model performance?

A Machine Learning Specialist deployed a model that provides product recommendations on a company's website. Initially, the model was performing very well and resulted in customers buying more products on average. However, within the past few months, the Specialist has noticed that the effect of product recommendations has diminished and...

September 4, 2024

Which solution should the Data Scientist build to satisfy the requirements?

A Data Scientist needs to create a serverless ingestion and analytics solution for high-velocity, real-time streaming data. The ingestion process must buffer and convert incoming records from JSON to a query-optimized, columnar format without data loss. The output datastore must be highly available, and Analysts must be able to run...

September 3, 2024

Which of the following services can feed data to the MapReduce jobs?

A Machine Learning Specialist needs to move and transform data in preparation for training. Some of the data needs to be processed in near-real time, and other data can be moved hourly. There are existing Amazon EMR MapReduce jobs to clean and perform feature engineering on the data. Which...

September 3, 2024

Which approach allows the Specialist to use all the data to train the model?

A Machine Learning Specialist is developing a custom video recommendation model for an application. The dataset used to train this model is very large, with millions of data points, and is hosted in an Amazon S3 bucket. The Specialist wants to avoid loading all of this data onto an Amazon...

January 20, 2024

Which of the following metrics should a Machine Learning Specialist generally use to compare/evaluate machine learning classification models against each other?

Which of the following metrics should a Machine Learning Specialist generally use to compare/evaluate machine learning classification models against each other?
A. Recall
B. Misclassification rate
C. Mean absolute percentage error (MAPE)
D. Area Under the ROC Curve (AUC)
Answer: D
Explanation: Area Under the ROC Curve (AUC) is a...

January 20, 2024
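
As a rough illustration of why AUC is convenient for comparing classifiers, the sketch below computes it from scratch as the fraction of positive/negative pairs that a model ranks correctly (ties count as half). The function name `roc_auc` and the toy labels and scores are assumptions made up for this example, not data from the question.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    labels: iterable of 0/1 ground-truth values (1 = positive class).
    scores: iterable of model scores, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical scores from two models on the same four labeled examples.
labels = [0, 0, 1, 1]
model_a = [0.1, 0.4, 0.35, 0.8]
model_b = [0.1, 0.2, 0.7, 0.9]

auc_a = roc_auc(labels, model_a)
auc_b = roc_auc(labels, model_b)
print(auc_a)  # 0.75
print(auc_b)  # 1.0
```

On this toy data the second score list separates the two classes perfectly, so it gets the higher AUC, and no classification threshold had to be chosen to make the comparison.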

Which solution takes the LEAST effort to implement?

A Mobile Network Operator is building an analytics platform to analyze and optimize a company's operations using Amazon Athena and Amazon S3. The source systems send data in CSV format in real time. The Data Engineering team wants to transform the data to the Apache Parquet format before storing it...

January 19, 2024

Which solution requires the LEAST effort to be able to query this data?

A manufacturing company has structured and unstructured data stored in an Amazon S3 bucket. A Machine Learning Specialist wants to use SQL to run queries on this data. Which solution requires the LEAST effort to be able to query this data?
A. Use AWS Data Pipeline to transform the data...

January 19, 2024