Microsoft DP-100 Designing and Implementing a Data Science Solution on Azure Online Training
Microsoft DP-100 Online Training
The questions for DP-100 were last updated at Jan 21,2025.
- Exam Code: DP-100
- Exam Name: Designing and Implementing a Data Science Solution on Azure
- Certification Provider: Microsoft
- Latest update: Jan 21,2025
HOTSPOT
You are performing a classification task in Azure Machine Learning Studio.
You must prepare balanced testing and training samples based on a provided data set.
You need to split the data with a 0.75:0.25 ratio.
Which value should you use for each parameter? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.
HOTSPOT
You create a binary classification model using Azure Machine Learning Studio.
You must use a Receiver Operating Characteristic (RO C) curve and an F1 score to evaluate the model.
You need to create the required business metrics.
How should you complete the experiment? To answer, select the appropriate options in the dialog box in the answer area. NOTE: Each correct selection is worth one point.
HOTSPOT
You are tuning a hyperparameter for an algorithm. The following table shows a data set with different hyperparameter, training error, and validation errors.
Use the drop-down menus to select the answer choice that answers each question based on the information presented in the graphic.
You use Azure Machine Learning Studio to build a machine learning experiment.
You need to divide data into two distinct datasets.
Which module should you use?
- A . Partition and Sample
- B . Assign Data to Clusters
- C . Group Data into Bins
- D . Test Hypothesis Using t-Test
You are developing a hands-on workshop to introduce Docker for Windows to attendees.
You need to ensure that workshop attendees can install Docker on their devices.
Which two prerequisite components should attendees install on the devices? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.
- A . Microsoft Hardware-Assisted Virtualization Detection Tool
- B . Kitematic
- C . BIOS-enabled virtualization
- D . VirtualBox
- E . Windows 10 64-bit Professional
Your team is building a data engineering and data science development environment.
The environment must support the following requirements:
✑ support Python and Scala
✑ compose data storage, movement, and processing services into automated data pipelines
✑ the same tool should be used for the orchestration of both data engineering and data science
✑ support workload isolation and interactive workloads
✑ enable scaling across a cluster of machines
You need to create the environment.
What should you do?
- A . Build the environment in Apache Hive for HDInsight and use Azure Data Factory for orchestration.
- B . Build the environment in Azure Databricks and use Azure Data Factory for orchestration.
- C . Build the environment in Apache Spark for HDInsight and use Azure Container Instances for orchestration.
- D . Build the environment in Azure Databricks and use Azure Container Instances for orchestration.
DRAG DROP
You are building an intelligent solution using machine learning models.
The environment must support the following requirements:
✑ Data scientists must build notebooks in a cloud environment
✑ Data scientists must use automatic feature engineering and model building in machine learning pipelines.
✑ Notebooks must be deployed to retrain using Spark instances with dynamic worker allocation.
✑ Notebooks must be exportable to be version controlled locally.
You need to create the environment.
Which four actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.
You plan to build a team data science environment. Data for training models in machine learning pipelines will be over 20 GB in size.
You have the following requirements:
✑ Models must be built using Caffe2 or Chainer frameworks.
✑ Data scientists must be able to use a data science environment to build the machine learning pipelines and train models on their personal devices in both connected and disconnected network environments.
✑ Personal devices must support updating machine learning pipelines when connected to a network.
You need to select a data science environment.
Which environment should you use?
- A . Azure Machine Learning Service
- B . Azure Machine Learning Studio
- C . Azure Databricks
- D . Azure Kubernetes Service (AKS)
You are implementing a machine learning model to predict stock prices.
The model uses a PostgreSQL database and requires GPU processing.
You need to create a virtual machine that is pre-configured with the required tools.
What should you do?
- A . Create a Data Science Virtual Machine (DSVM) Windows edition.
- B . Create a Geo Al Data Science Virtual Machine (Geo-DSVM) Windows edition.
- C . Create a Deep Learning Virtual Machine (DLVM) Linux edition.
- D . Create a Deep Learning Virtual Machine (DLVM) Windows edition.
- E . Create a Data Science Virtual Machine (DSVM) Linux edition.
You are developing deep learning models to analyze semi-structured, unstructured, and structured data types.
You have the following data available for model building:
✑ Video recordings of sporting events
✑ Transcripts of radio commentary about events
✑ Logs from related social media feeds captured during sporting events
You need to select an environment for creating the model.
Which environment should you use?
- A . Azure Cognitive Services
- B . Azure Data Lake Analytics
- C . Azure HDInsight with Spark MLib
- D . Azure Machine Learning Studio