Microsoft DP-203 Data Engineering on Microsoft Azure Online Training
Microsoft DP-203 Online Training
The questions for DP-203 were last updated at Nov 19,2024.
- Exam Code: DP-203
- Exam Name: Data Engineering on Microsoft Azure
- Certification Provider: Microsoft
- Latest update: Nov 19,2024
You need to design a data retention solution for the Twitter feed data records. The solution must meet the customer sentiment analytics requirements.
Which Azure Storage functionality should you include in the solution?
- A . change feed
- B . soft delete
- C . time-based retention
- D . lifecycle management
Topic 2, Litware, inc.
Case study
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to complete each case. However, there may be additional case studies and sections on this exam. You must manage your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the case study. Case studies might contain exhibits and other resources that provide more information about the scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to make changes before you move to the next section of the exam. After you begin a new section, you cannot return to this section.
To start the case study
To display the first question in this case study, click the Next button. Use the buttons in the left pane to explore the content of the case study before you answer the questions. Clicking these buttons displays information such as business requirements, existing environment, and problem statements. If the case study has an All Information tab, note that the information displayed is identical to the information displayed on the subsequent tabs. When you are ready to answer a question, click the Question button to return to the question.
Overview
Litware, Inc. owns and operates 300 convenience stores across the US. The company sells a variety of packaged foods and drinks, as well as a variety of prepared foods, such as sandwiches and pizzas.
Litware has a loyalty club whereby members can get daily discounts on specific items by providing their membership number at checkout.
Litware employs business analysts who prefer to analyze data by using Microsoft Power BI, and data scientists who prefer analyzing data in Azure Databricks notebooks.
Requirements
Business Goals
Litware wants to create a new analytics environment in Azure to meet the following requirements:
✑ See inventory levels across the stores. Data must be updated as close to real time as possible.
✑ Execute ad hoc analytical queries on historical data to identify whether the loyalty club discounts increase sales of the discounted products.
✑ Every four hours, notify store employees about how many prepared food items to produce based on historical demand from the sales data.
Technical Requirements
Litware identifies the following technical requirements:
✑ Minimize the number of different Azure services needed to achieve the business goals.
✑ Use platform as a service (PaaS) offerings whenever possible and avoid having to provision virtual machines that must be managed by Litware.
✑ Ensure that the analytical data store is accessible only to the company’s on-premises network and Azure services.
✑ Use Azure Active Directory (Azure AD) authentication whenever possible.
✑ Use the principle of least privilege when designing security.
✑ Stage Inventory data in Azure Data Lake Storage Gen2 before loading the data into the analytical data store. Litware wants to remove transient data from Data Lake Storage once the data is no longer in use. Files that have a modified date that is older than 14 days must be removed.
✑ Limit the business analysts’ access to customer contact information, such as phone numbers, because this type of data is not analytically relevant.
✑ Ensure that you can quickly restore a copy of the analytical data store within one hour in the event of corruption or accidental deletion.
Planned Environment
Litware plans to implement the following environment:
✑ The application development team will create an Azure event hub to receive real-time sales data, including store number, date, time, product ID, customer loyalty number, price, and discount amount, from the point of sale (POS) system and output the data to data storage in Azure.
✑ Customer data, including name, contact information, and loyalty number, comes from Salesforce, a SaaS application, and can be imported into Azure once every eight hours. Row modified dates are not trusted in the source table.
✑ Product data, including product ID, name, and category, comes from Salesforce and can be imported into Azure once every eight hours. Row modified dates are not trusted in the source table.
✑ Daily inventory data comes from a Microsoft SQL server located on a private network.
✑ Litware currently has 5 TB of historical sales data and 100 GB of customer data. The company expects approximately 100 GB of new data per month for the next year.
✑ Litware will build a custom application named FoodPrep to provide store employees with the calculation results of how many prepared food items to produce every four hours.
✑ Litware does not plan to implement Azure ExpressRoute or a VPN between the on-premises network and Azure.
HOTSPOT
Which Azure Data Factory components should you recommend using together to import the daily inventory data from the SQL server to Azure Data Lake Storage? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.
What should you do to improve high availability of the real-time data processing solution?
- A . Deploy identical Azure Stream Analytics jobs to paired regions in Azure.
- B . Deploy a High Concurrency Databricks cluster.
- C . Deploy an Azure Stream Analytics job and use an Azure Automation runbook to check the status of the job and to start the job if it stops.
- D . Set Data Lake Storage to use geo-redundant storage (GRS).
What should you recommend to prevent users outside the Litware on-premises network from accessing the analytical data store?
- A . a server-level virtual network rule
- B . a database-level virtual network rule
- C . a database-level firewall IP rule
- D . a server-level firewall IP rule
What should you recommend using to secure sensitive customer contact information?
- A . data labels
- B . column-level security
- C . row-level security
- D . Transparent Data Encryption (TDE)
Topic 3, Mix Questions
You have an Azure Data Lake Storage account that has a virtual network service endpoint configured.
You plan to use Azure Data Factory to extract data from the Data Lake Storage account. The data will then be loaded to a data warehouse in Azure Synapse Analytics by using PolyBase.
Which authentication method should you use to access Data Lake Storage?
- A . shared access key authentication
- B . managed identity authentication
- C . account key authentication
- D . service principal authentication
HOTSPOT
You have an Azure subscription that contains the following resources:
✑ An Azure Active Directory (Azure AD) tenant that contains a security group named Group1
✑ An Azure Synapse Analytics SQL pool named Pool1
You need to control the access of Group1 to specific columns and rows in a table in Pool1.
Which Transact-SQL commands should you use? To answer, select the appropriate options in the answer area.
HOTSPOT
You need to implement an Azure Databricks cluster that automatically connects to Azure Data Lake Storage Gen2 by using Azure Active Directory (Azure AD) integration.
How should you configure the new cluster? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.
You have an Azure Synapse Analystics dedicated SQL pool that contains a table named Contacts.
Contacts contains a column named Phone.
You need to ensure that users in a specific role only see the last four digits of a phone number when querying the Phone column.
What should you include in the solution?
- A . a default value
- B . dynamic data masking
- C . row-level security (RLS)
- D . column encryption
- E . table partitions
HOTSPOT
You have an Azure Synapse Analytics dedicated SQL pool that contains the users shown in the following table.
User1 executes a query on the database, and the query returns the results shown in the following exhibit.
User1 is the only user who has access to the unmasked data.
Use the drop-down menus to select the answer choice that completes each statement based on the information presented in the graphic. NOTE: Each correct selection is worth one point.