Each day, the company plans to store hundreds of files in Azure Blob Storage and Azure Data Lake Storage. The company uses the Parquet format.
You must develop a pipeline that meets the following requirements:
Process data every six hours
Offer interactive data analysis capabilities
Offer the ability to process data using solid-state drive (SSD) caching
Use Directed Acyclic Graph (DAG) processing mechanisms
Provide support for REST API calls to monitor processes
Provide native support for Python
Integrate with Microsoft Power BI
You need to select the appropriate data technology to implement the pipeline.
Which data technology should you implement?
A. Azure SQL Data Warehouse
B. HDInsight Apache Storm cluster
C. Azure Stream Analytics
D. HDInsight Apache Hadoop cluster using MapReduce
E. HDInsight Spark cluster
Answer: E
Explanation:
An HDInsight Spark cluster meets every requirement. Spark executes jobs as directed acyclic graphs (DAGs) of stages, supports interactive data analysis through Jupyter notebooks and Spark SQL, and the HDInsight IO Cache feature caches data on the cluster nodes' local SSDs. PySpark provides native Python support, the Apache Livy REST API on the cluster can be used to submit and monitor jobs, and Power BI connects to Spark through its built-in connector. A scheduler or orchestration pipeline can trigger the Spark batch job every six hours.
By contrast, Apache Storm is designed for continuous stream processing rather than six-hourly batch runs and does not offer interactive analysis, SSD caching, or Power BI integration.
References: https://docs.microsoft.com/en-us/azure/hdinsight/spark/apache-spark-overview
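As an illustrative sketch only (not part of the exam item), a minimal PySpark batch job of the kind described above could look as follows; the abfss:// paths and column names are hypothetical placeholders.

# Minimal PySpark sketch of a six-hourly batch job over Parquet files.
# The abfss:// paths and column names below are hypothetical placeholders;
# substitute your own storage account, containers, and schema.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("six-hourly-parquet-batch")
    .getOrCreate()
)

# Read the Parquet files landed in the lake (hypothetical path).
raw = spark.read.parquet(
    "abfss://landing@examplestorage.dfs.core.windows.net/events/"
)

# A simple aggregation; Spark builds a DAG of stages for this plan and
# only executes it when the write action below triggers the job.
summary = (
    raw.groupBy("device_id")
       .agg(
           F.count("*").alias("event_count"),
           F.max("event_time").alias("last_seen"),
       )
)

# Write the curated output back to the lake for downstream tools such as Power BI.
summary.write.mode("overwrite").parquet(
    "abfss://curated@examplestorage.dfs.core.windows.net/device_summary/"
)

spark.stop()

In the same spirit, a hedged sketch of monitoring Spark batch jobs through the Livy REST endpoint that HDInsight exposes; the cluster name and credentials are placeholders.

import requests
from requests.auth import HTTPBasicAuth

# Hypothetical cluster URL and credentials; replace with your own values.
livy_url = "https://examplecluster.azurehdinsight.net/livy/batches"
auth = HTTPBasicAuth("admin", "<cluster-login-password>")

# List Spark batch jobs submitted through Livy and print their states.
resp = requests.get(livy_url, auth=auth, timeout=30)
resp.raise_for_status()
for batch in resp.json().get("sessions", []):
    print(batch["id"], batch["state"])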