Each day, a company plans to store hundreds of files in Azure Blob Storage and Azure Data Lake Storage. The company uses the Apache Parquet format.
You must develop a pipeline that meets the following requirements:
Process data every six hours
Offer interactive data analysis capabilities
Offer the ability to process data using solid-state drive (SSD) caching
Use directed acyclic graph (DAG) processing mechanisms
Provide support for REST API calls to monitor processes
Provide native support for Python
Integrate with Microsoft Power BI
You need to select the appropriate data technology to implement the pipeline.
Which data technology should you implement?
A. Azure SQL Data Warehouse
B. HDInsight Apache Storm cluster
C. Azure Stream Analytics
D. HDInsight Apache Hadoop cluster using MapReduce
E. HDInsight Spark cluster
Answer: B
Explanation:
Storm runs topologies instead of the Apache Hadoop MapReduce jobs that you might be familiar with. Storm topologies are composed of multiple components that are arranged in a directed acyclic graph (DAG). Data flows between the components in the graph. Each component consumes one or more data streams, and can optionally emit one or more streams.
Python can be used to develop Storm components.
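As a minimal sketch of what a Python Storm component can look like, the example below uses the third-party streamparse library (an assumption here; the source only states that Python is supported, not which tooling is used). The bolt class name, field names, and word-count logic are illustrative only.

from collections import Counter

from streamparse import Bolt


class WordCountBolt(Bolt):
    # Fields this bolt emits to the next component in the topology DAG.
    outputs = ["word", "count"]

    def initialize(self, conf, ctx):
        # Called once per task before any tuples are processed.
        self.counts = Counter()

    def process(self, tup):
        # Each incoming tuple is one item from an upstream spout or bolt.
        word = tup.values[0]
        self.counts[word] += 1
        # Emit a new tuple; Storm routes it along the DAG to downstream bolts.
        self.emit([word, self.counts[word]])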
References: https://docs.microsoft.com/en-us/azure/hdinsight/storm/apache-storm-overview
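For the requirement to monitor processes over REST, the Storm UI exposes a REST API (for example, the topology summary endpoint). The sketch below assumes a hypothetical HDInsight cluster URL, gateway credentials, and that the Storm UI is reachable under the /stormui path; substitute your own values.

import requests
from requests.auth import HTTPBasicAuth

# Hypothetical cluster URL and credentials -- replace with real values.
STORM_UI = "https://mystormcluster.azurehdinsight.net/stormui"
AUTH = HTTPBasicAuth("admin", "<cluster-login-password>")

# /api/v1/topology/summary is part of the standard Storm UI REST API.
resp = requests.get(f"{STORM_UI}/api/v1/topology/summary", auth=AUTH, timeout=30)
resp.raise_for_status()

# Print the name and status of each running topology.
for topo in resp.json().get("topologies", []):
    print(topo["name"], topo["status"])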