You are working as a machine learning engineer for a startup that provides image recognition services. The service is currently in its beta phase, and the company expects varying levels of traffic, with some days having very few requests and other days experiencing sudden spikes. The company wants to minimize costs during low-traffic periods while still being able to handle large, infrequent spikes of requests efficiently. Given these requirements, you are considering using Amazon SageMaker for your deployment.
Which of the following statements is the BEST recommendation for the given scenario?
A. Use Amazon SageMaker Asynchronous Inference that minimizes costs during low-traffic periods while managing large infrequent spikes of requests efficiently
B. Use Batch transform to run inference with Amazon SageMaker that minimizes costs during low-traffic periods while managing large infrequent spikes of requests efficiently
C. Use Amazon SageMaker Serverless Inference that minimizes costs during low-traffic periods while managing large infrequent spikes of requests efficiently
D. Use Amazon SageMaker Real-time Inference that minimizes costs during low-traffic periods while managing large infrequent spikes of requests efficiently
Answer: C
Explanation:
Correct option:
Use Amazon SageMaker Serverless Inference that minimizes costs during low-traffic periods while managing large infrequent spikes of requests efficiently
Serverless Inference is designed to automatically scale the compute resources based on incoming requests, making it highly efficient for handling varying levels of traffic. It is cost-effective because you only pay for the compute time used when requests are being processed. This makes it an excellent choice for scenarios where traffic is unpredictable, with periods of low or no traffic. It is ideal for workloads that have idle periods between traffic spikes and can tolerate cold starts.
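As an illustrative sketch of how such an endpoint is configured (the endpoint and model names and the sizing values below are assumptions, not from the source), a serverless endpoint is defined by attaching a ServerlessConfig to a production variant in the request payload for the boto3 create_endpoint_config call:

```python
# Sketch: request payload for a SageMaker serverless endpoint configuration.
# Names and sizing values are illustrative assumptions; the payload would be
# passed to boto3's create_endpoint_config to apply it.
endpoint_config_request = {
    "EndpointConfigName": "image-recognition-serverless-config",  # hypothetical name
    "ProductionVariants": [
        {
            "VariantName": "AllTraffic",
            "ModelName": "image-recognition-model",  # hypothetical model name
            "ServerlessConfig": {
                # Memory allocated per concurrent invocation.
                "MemorySizeInMB": 2048,
                # Upper bound on concurrent invocations for this endpoint;
                # SageMaker scales to zero when there is no traffic.
                "MaxConcurrency": 20,
            },
        }
    ],
}

# To create the endpoint configuration for real (requires AWS credentials):
# import boto3
# boto3.client("sagemaker").create_endpoint_config(**endpoint_config_request)
```

Note that no InstanceType appears in the variant: with serverless inference there are no instances to provision, which is what keeps costs near zero during idle periods.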
Incorrect options:
Use Amazon SageMaker Asynchronous Inference that minimizes costs during low-traffic periods while managing large infrequent spikes of requests efficiently – Asynchronous Inference queues incoming requests and processes them in the background, making it ideal for large or long-running inference requests that do not require an immediate response. It is not the best fit here, where low-latency handling of fluctuating traffic and minimal idle cost are the priorities.
Use Amazon SageMaker Real-time Inference that minimizes costs during low-traffic periods while managing large infrequent spikes of requests efficiently – Real-time Inference is ideal for workloads with real-time, interactive, low-latency requirements. However, it keeps instances provisioned continuously, so you pay for idle capacity even during low-traffic periods, which conflicts with the cost-minimization requirement.
Use Batch transform to run inference with Amazon SageMaker that minimizes costs during low-traffic periods while managing large infrequent spikes of requests efficiently – Batch transform is used to get predictions for an entire dataset as an offline, job-based workload. It is not designed to serve individual on-demand requests, so it cannot respond to unpredictable traffic spikes as they arrive.
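For contrast, Batch transform is invoked as a one-off job over data in Amazon S3 rather than as a standing endpoint, which is why it suits scheduled offline scoring but not spiky request traffic. A minimal sketch of the boto3 create_transform_job payload (all names, paths, and instance choices are assumptions for illustration):

```python
# Sketch: request payload for a SageMaker batch transform job, which scores an
# entire dataset offline. All names, S3 paths, and instance settings below are
# illustrative assumptions.
transform_job_request = {
    "TransformJobName": "image-recognition-batch-001",  # hypothetical name
    "ModelName": "image-recognition-model",             # hypothetical model name
    "TransformInput": {
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": "s3://example-bucket/batch-input/",  # hypothetical bucket
            }
        },
        "ContentType": "application/x-image",
    },
    "TransformOutput": {"S3OutputPath": "s3://example-bucket/batch-output/"},
    # Compute is provisioned only for the duration of the job, then released.
    "TransformResources": {"InstanceType": "ml.m5.xlarge", "InstanceCount": 1},
}

# To start the job for real (requires AWS credentials):
# import boto3
# boto3.client("sagemaker").create_transform_job(**transform_job_request)
```

The job reads every object under the input prefix, writes predictions to the output path, and then tears down its instances; there is no endpoint left running to answer ad-hoc requests.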
References:
https://docs.aws.amazon.com/sagemaker/latest/dg/serverless-endpoints.html
https://docs.aws.amazon.com/sagemaker/latest/dg/how-it-works-deployment.html