A monitoring service generates 1 TB of scale metrics record data every minute. A research team performs queries on this data using Amazon Athena. The queries run slowly due to the large volume of data, and the team requires better performance.
How should the records be stored in Amazon S3 to improve query performance?
A. CSV files
B. Parquet files
C. Compressed JSON
D. RecordIO
Answer: B
Explanation:
Parquet is a columnar storage format that stores data compactly, with compression applied per column. Because Athena reads only the columns a query references, Parquet reduces the amount of data that must be scanned. Parquet also supports predicate pushdown: the files carry per-column statistics such as minimum and maximum values, so filter conditions can be used to skip blocks of data that cannot match, further reducing what has to be processed. Athena reads Parquet natively and bills by the amount of data scanned, so the columnar format makes queries both faster and cheaper. CSV and JSON are row-oriented text formats that force a scan of entire records even when compressed, and RecordIO is a format used for SageMaker training data rather than for Athena queries. Therefore, the records should be stored as Parquet files in Amazon S3 to improve query performance.
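The effect of reading only the relevant columns can be illustrated with a small sketch. This is a toy model using only the Python standard library, not Parquet itself: it serializes the same records once row by row (as CSV) and once column by column, then compares how many bytes a query touching a single column would need to read. The record fields (`host`, `metric`, `value`, `ts`) and the helper function names are hypothetical and chosen purely for illustration; real Parquet adds encoding, compression, and row-group statistics on top of this idea.

```python
import csv
import io

# Hypothetical metric records; the field names are illustrative only.
records = [
    {"host": f"host-{i % 10}", "metric": "cpu", "value": i * 0.5, "ts": 1_700_000_000 + i}
    for i in range(1000)
]


def row_layout_bytes(rows):
    """Serialize rows as CSV; a query on one column must still read every byte."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return len(buf.getvalue().encode("utf-8"))


def column_layout_bytes(rows):
    """Serialize each column separately and return the byte size of each one."""
    sizes = {}
    for name in rows[0].keys():
        column_text = "\n".join(str(row[name]) for row in rows)
        sizes[name] = len(column_text.encode("utf-8"))
    return sizes


total_row_bytes = row_layout_bytes(records)
per_column_bytes = column_layout_bytes(records)

# A query such as SELECT avg(value) needs only the "value" column.
scanned_columnar = per_column_bytes["value"]

print(f"row layout scans      : {total_row_bytes} bytes")
print(f"columnar layout scans : {scanned_columnar} bytes")
```

In the row layout the whole file is read regardless of which column the query needs, while in the columnar layout only the bytes of the `value` column are read. The same principle is what lets Athena scan less data when the records are stored as Parquet.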
References:
Columnar Storage Formats – Amazon Athena
Parquet SerDe – Amazon Athena
Optimizing Amazon Athena Queries – Amazon Athena
Parquet – Apache Software Foundation