Microsoft DP-203 Data Engineering on Microsoft Azure Online Training
The questions for DP-203 were last updated on Jan 21, 2025.
- Exam Code: DP-203
- Exam Name: Data Engineering on Microsoft Azure
- Certification Provider: Microsoft
- Latest update: Jan 21, 2025
A company purchases IoT devices to monitor manufacturing machinery. The company uses an IoT appliance to communicate with the IoT devices.
The company must be able to monitor the devices in real-time.
You need to design the solution.
What should you recommend?
- A . Azure Stream Analytics cloud job using Azure PowerShell
- B . Azure Analysis Services using Azure Portal
- C . Azure Data Factory instance using Azure Portal
- D . Azure Analysis Services using Azure PowerShell
You are designing a statistical analysis solution that will use custom proprietary Python functions on near real-time data from Azure Event Hubs.
You need to recommend which Azure service to use to perform the statistical analysis. The solution must minimize latency.
What should you recommend?
- A . Azure Stream Analytics
- B . Azure SQL Database
- C . Azure Databricks
- D . Azure Synapse Analytics
HOTSPOT
You have the following Azure Stream Analytics query.
For each of the following statements, select Yes if the statement is true. Otherwise, select No. NOTE: Each correct selection is worth one point.
HOTSPOT
You are designing an Azure Stream Analytics solution that receives instant messaging data from an Azure Event Hub.
You need to ensure that the output from the Stream Analytics job counts the number of messages per time zone every 15 seconds.
How should you complete the Stream Analytics query? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.
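As an illustration of the requirement only (not the Stream Analytics query itself), counting messages per time zone every 15 seconds can be read as grouping events into fixed, non-overlapping 15-second windows. The sketch below shows that grouping in plain Python. The function name `count_per_time_zone`, the `(epoch_seconds, time_zone)` event shape, and the sample values are assumptions made for this example, and the start-inclusive window boundary is a simplification that may differ from how the service treats events on a boundary.

```python
from collections import Counter

WINDOW_SECONDS = 15  # window length taken from the question


def count_per_time_zone(events, window_seconds=WINDOW_SECONDS):
    """Count events per (window_start, time_zone) in fixed, non-overlapping windows.

    `events` is an iterable of (epoch_seconds, time_zone) pairs. This shape is
    an assumption for the sketch, not the Event Hub payload format.
    """
    counts = Counter()
    for timestamp, time_zone in events:
        # Align each event to the start of the window that contains it.
        window_start = int(timestamp // window_seconds) * window_seconds
        counts[(window_start, time_zone)] += 1
    return dict(counts)


# Hypothetical sample events: (seconds since start, time zone)
sample = [(0, "UTC"), (3, "UTC"), (14, "PST"), (15, "UTC"), (29, "PST"), (30, "PST")]
result = count_per_time_zone(sample)
```

Each event lands in exactly one window, so no message is counted twice; that property is what separates this kind of window from overlapping or sliding ones.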
You are designing an Azure Databricks interactive cluster. The cluster will be used infrequently and will be configured for auto-termination.
You need to ensure that the cluster configuration is retained indefinitely after the cluster is terminated. The solution must minimize costs.
What should you do?
- A . Clone the cluster after it is terminated.
- B . Terminate the cluster manually when processing completes.
- C . Create an Azure runbook that starts the cluster every 90 days.
- D . Pin the cluster.
You have an Azure Synapse Analytics job that uses Scala.
You need to view the status of the job.
What should you do?
- A . From Azure Monitor, run a Kusto query against the AzureDiagnostics table.
- B . From Azure Monitor, run a Kusto query against the SparkLoggingEvent_CL table.
- C . From Synapse Studio, select the workspace. From Monitor, select Apache Spark applications.
- D . From Synapse Studio, select the workspace. From Monitor, select SQL requests.
You configure monitoring for a Microsoft Azure SQL Data Warehouse implementation. The implementation uses PolyBase to load data from comma-separated values (CSV) files stored in Azure Data Lake Storage Gen2 using an external table.
Files with an invalid schema cause errors to occur.
You need to monitor for an invalid schema error.
For which error should you monitor?
- A . EXTERNAL TABLE access failed due to internal error: ‘Java exception raised on call to HdfsBridge_Connect: Error [com.microsoft.polybase.client.KerberosSecureLogin] occurred while accessing external files.’
- B . EXTERNAL TABLE access failed due to internal error: ‘Java exception raised on call to HdfsBridge_Connect: Error [No FileSystem for scheme: wasbs] occurred while accessing external file.’
- C . Cannot execute the query "Remote Query" against OLE DB provider "SQLNCLI11" for linked server "(null)". Query aborted-- the maximum reject threshold (0 rows) was reached while reading from an external source: 1 rows rejected out of total 1 rows processed.
- D . EXTERNAL TABLE access failed due to internal error: ‘Java exception raised on call to HdfsBridge_Connect: Error [Unable to instantiate LoginClass] occurred while accessing external files.’
You use Azure Data Lake Storage Gen2.
You need to ensure that workloads can use filter predicates and column projections to filter data at the time the data is read from disk.
Which two actions should you perform? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.
- A . Reregister the Microsoft Data Lake Store resource provider.
- B . Reregister the Azure Storage resource provider.
- C . Create a storage policy that is scoped to a container.
- D . Register the query acceleration feature.
- E . Create a storage policy that is scoped to a container prefix filter.
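To make the two terms in this question concrete: a filter predicate discards rows that fail a condition, and a column projection keeps only the requested columns, so less data leaves the reader. The sketch below shows both on an in-memory CSV using only the Python standard library. It is a local stand-in for the idea, not a call to any Azure Storage API; the function name `read_filtered` and the sample data are hypothetical.

```python
import csv
import io


def read_filtered(csv_text, columns, predicate):
    """Yield only the requested columns of the rows that satisfy the predicate."""
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        if predicate(row):  # filter predicate: skip non-matching rows
            yield {name: row[name] for name in columns}  # column projection


# Hypothetical sample data standing in for a file in the data lake
data = "device,temp,site\na,71,north\nb,64,south\nc,80,north\n"
hot = list(read_filtered(data, ["device", "temp"], lambda r: int(r["temp"]) > 70))
```

Here `hot` holds two rows with two columns each, while the source has three rows and three columns. Applying the same reduction where the data is stored, rather than after transfer, is the behavior the question describes.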
DRAG DROP
You plan to monitor an Azure data factory by using the Monitor & Manage app.
You need to identify the status and duration of activities that reference a table in a source database.
Which three actions should you perform in sequence? To answer, move the actions from the list of actions to the answer area and arrange them in the correct order.
You have an enterprise data warehouse in Azure Synapse Analytics named DW1 on a server named Server1.
You need to verify whether the size of the transaction log file for each distribution of DW1 is smaller than 160 GB.
What should you do?
- A . On the master database, execute a query against the sys.dm_pdw_nodes_os_performance_counters dynamic management view.
- B . From Azure Monitor in the Azure portal, execute a query against the logs of DW1.
- C . On DW1, execute a query against the sys.database_files dynamic management view.
- D . Execute a query against the logs of DW1 by using the Get-AzOperationalInsightSearchResult PowerShell cmdlet.