Which AWS service strategy is best for this use case?

A new algorithm has been written in Python to identify spam e-mails. The algorithm analyzes the free text contained within a sample set of 1 million e-mails stored on Amazon S3. The algorithm must now be scaled to a production dataset of 5 PB, which also resides in Amazon S3.

Which AWS service strategy is best for this use case?
A. Copy the data into Amazon ElastiCache to perform text analysis on the in-memory data and export the results of the model into Amazon Machine Learning.
B. Use Amazon EMR to parallelize the text analysis tasks across the cluster using a streaming program step.
C. Use Amazon Elasticsearch Service to store the text and then use the Python Elasticsearch Client to run analysis against the text index.
D. Initiate a Python job from AWS Data Pipeline to run directly against the Amazon S3 text files.

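Answer: B

Explanation:

Amazon EMR can read the e-mails directly from Amazon S3, and a streaming program step (Hadoop streaming) lets the existing Python script run as the mapper and reducer on every node, so the algorithm scales across the 5 PB dataset without being rewritten. Option A does not fit because 5 PB cannot reasonably be held in an in-memory cache. Option C would require indexing all 5 PB into Amazon Elasticsearch Service first, and the Python Elasticsearch Client issues queries against that index rather than distributing a custom Python algorithm. Option D starts a single Python job, which does not by itself parallelize the work across the dataset.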
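The streaming program step named in option B can be sketched as a small Python mapper. This is only an illustration under stated assumptions: the question does not describe the real algorithm, so `is_spam` and `SPAM_HINTS` are hypothetical stand-ins, and the sketch assumes each input line holds the text of one e-mail. Hadoop streaming passes input records to the script on standard input and collects tab-separated key/value lines from standard output.

```python
#!/usr/bin/env python3
"""Hadoop-streaming-style mapper: reads e-mail text on stdin, emits label<TAB>1."""
import sys

# Hypothetical placeholder for the real spam algorithm (not given in the question).
SPAM_HINTS = ("free money", "click here", "winner")


def is_spam(text):
    """Return True if the text contains any of the placeholder spam phrases."""
    lowered = text.lower()
    return any(hint in lowered for hint in SPAM_HINTS)


def map_lines(lines):
    """Yield one 'label<TAB>1' record per non-empty input line.

    Assumes one e-mail per line; blank lines are skipped.
    """
    for line in lines:
        text = line.strip()
        if not text:
            continue
        label = "spam" if is_spam(text) else "ham"
        yield f"{label}\t1"


if __name__ == "__main__":
    # Hadoop streaming feeds records on stdin and reads results from stdout.
    for record in map_lines(sys.stdin):
        print(record)
```

Because the mapper only depends on standard input and output, the same script can be tried on a small local sample before it is submitted as a streaming step, where the cluster runs one copy per input split and a reducer can sum the counts per label.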

Reference: https://aws.amazon.com/blogs/database/indexing-metadata-in-amazon-elasticsearch-serviceusing-aws-lambda-and-python/