What should you do?

Exam: DP-100

Your team is building a data engineering and data science development environment.

The environment must support the following requirements:

✑ support Python and Scala

✑ compose data storage, movement, and processing services into automated data pipelines

✑ the same tool should be used for the orchestration of both data engineering and data science

✑ support workload isolation and interactive workloads

✑ enable scaling across a cluster of machines

You need to create the environment.

What should you do?
A. Build the environment in Apache Hive for HDInsight and use Azure Data Factory for orchestration.
B. Build the environment in Azure Databricks and use Azure Data Factory for orchestration.
C. Build the environment in Apache Spark for HDInsight and use Azure Container Instances for orchestration.
D. Build the environment in Azure Databricks and use Azure Container Instances for orchestration.

Answer: B

Explanation:

In Azure Databricks, you can create two types of clusters:

Standard: the default cluster type, which can be used with Python, R, Scala, and SQL.

High-concurrency: a cluster type designed for sharing among multiple concurrent users.
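To make the "scaling across a cluster of machines" requirement concrete, here is a minimal sketch of a cluster specification with autoscaling, written as a Python dictionary. The field names (`cluster_name`, `spark_version`, `node_type_id`, `autoscale`) follow the Databricks Clusters API; the helper function, the cluster name, and the default runtime and VM size are illustrative assumptions, not values taken from the question.

```python
import json


def autoscaling_cluster_spec(name, min_workers, max_workers,
                             spark_version="13.3.x-scala2.12",
                             node_type_id="Standard_DS3_v2"):
    """Build a cluster spec dictionary with an autoscaling worker range.

    The default spark_version and node_type_id are placeholder values
    (assumptions); replace them with ones available in your workspace.
    """
    if not 1 <= min_workers <= max_workers:
        raise ValueError("expected 1 <= min_workers <= max_workers")
    return {
        "cluster_name": name,
        "spark_version": spark_version,
        "node_type_id": node_type_id,
        # Databricks adds or removes workers within this range based on load.
        "autoscale": {"min_workers": min_workers, "max_workers": max_workers},
    }


# Hypothetical cluster for interactive data science work.
spec = autoscaling_cluster_spec("ds-interactive", min_workers=2, max_workers=8)
print(json.dumps(spec, indent=2))
```

The sketch only builds and prints the specification; submitting it to a workspace would require an authenticated call to the Clusters API, which is outside the scope of this example.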

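Azure Databricks is fully integrated with Azure Data Factory, which composes data storage, movement, and processing services into automated data pipelines. The same Data Factory pipelines can therefore orchestrate both the data engineering and the data science workloads.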
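As an illustration of that integration, the sketch below assembles a Data Factory pipeline definition in which two Databricks notebooks run in sequence: a data engineering step followed by a data science step. The activity type `DatabricksNotebook` and the `notebookPath`, `baseParameters`, and `dependsOn` fields follow the Data Factory pipeline JSON format; the pipeline name, notebook paths, parameter, and linked service name `AzureDatabricksLS` are hypothetical.

```python
import json


def notebook_activity(name, notebook_path, linked_service,
                      parameters=None, depends_on=None):
    """Return a Data Factory activity that runs one Databricks notebook."""
    return {
        "name": name,
        "type": "DatabricksNotebook",
        "linkedServiceName": {
            "referenceName": linked_service,
            "type": "LinkedServiceReference",
        },
        "typeProperties": {
            "notebookPath": notebook_path,
            "baseParameters": parameters or {},
        },
        # Run only after every listed activity has succeeded.
        "dependsOn": [
            {"activity": dep, "dependencyConditions": ["Succeeded"]}
            for dep in (depends_on or [])
        ],
    }


# Hypothetical names and paths, used only for illustration.
pipeline = {
    "name": "PrepareAndTrain",
    "properties": {
        "activities": [
            notebook_activity("PrepareData", "/Shared/prepare_data",
                              "AzureDatabricksLS", {"run_date": "2024-01-01"}),
            notebook_activity("TrainModel", "/Shared/train_model",
                              "AzureDatabricksLS", depends_on=["PrepareData"]),
        ]
    },
}

print(json.dumps(pipeline, indent=2))
```

Keeping both steps in one pipeline is what satisfies the requirement that the same tool orchestrates data engineering and data science.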

Incorrect Answers:

D: Azure Container Instances is suited to development and testing, not to production workloads, and it is not a pipeline orchestration service.

Reference: https://docs.microsoft.com/en-us/azure/architecture/data-guide/technology-choices/data-science-and-machinelearning