

You are designing an Azure Data Lake Storage solution that will transform raw JSON files for use in an analytical workload.

You need to recommend a format for the transformed files.

The solution must meet the following requirements:

✑ Contain information about the data types of each column in the files.

✑ Support querying a subset of columns in the files.

✑ Support read-heavy analytical workloads.

✑ Minimize the file size.

What should you recommend?
A. JSON
B. CSV
C. Apache Avro
D. Apache Parquet

Answer: D

Explanation:

Parquet, an open-source file format from the Hadoop ecosystem, stores nested data structures in a flat columnar layout and embeds the schema, including the data type of each column, in the file itself.

Compared with a traditional row-oriented format, Parquet is more efficient in both storage and performance: columnar data compresses well, which minimizes file size, and queries that read only particular columns from a “wide” (many-column) table scan just the needed columns, so I/O is minimized. This makes it a good fit for read-heavy analytical workloads.

By contrast, CSV and JSON carry no embedded type information, and Avro, although it does store a schema, is a row-based format better suited to write-heavy workloads than to column-subset reads.

Reference: https://www.clairvoyant.ai/blog/big-data-file-formats
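For illustration, a minimal PySpark sketch of the transformation and of a column-subset read is shown below; the storage account, container, paths, and column names are placeholders and are not part of the question.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("json-to-parquet").getOrCreate()

# Transform raw JSON files into Parquet (hypothetical ADLS Gen2 paths).
raw = spark.read.json("abfss://raw@mydatalake.dfs.core.windows.net/events/")
raw.write.mode("overwrite").parquet(
    "abfss://curated@mydatalake.dfs.core.windows.net/events/"
)

# Read-heavy analytics: only the selected columns are scanned from Parquet.
events = spark.read.parquet(
    "abfss://curated@mydatalake.dfs.core.windows.net/events/"
)
events.select("event_type", "event_time").groupBy("event_type").count().show()

# Column names and data types are stored in the Parquet file footer,
# so no schema inference is needed on read.
events.printSchema()
```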
