Which of the following approaches can the data engineer take to identify the table that is dropping the records?

A data engineer has three tables in a Delta Live Tables (DLT) pipeline. They have configured the pipeline to drop invalid records at each table. They notice that some data is being dropped due to quality concerns at some point in the DLT pipeline. They would like to determine at which table in their pipeline the data is being dropped.

Which of the following approaches can the data engineer take to identify the table that is dropping the records?
A. They can set up separate expectations for each table when developing their DLT pipeline.
B. They cannot determine which table is dropping the records.
C. They can set up DLT to notify them via email when records are dropped.
D. They can navigate to the DLT pipeline page, click on each table, and view the data quality statistics.
E. They can navigate to the DLT pipeline page, click on the “Error” button, and review the present errors.

Answer: D

Explanation:

DLT records data quality metrics for each dataset in a pipeline, such as the number of records that pass or fail each expectation, the number of records that are dropped, and the number of records that are written to the target. These metrics are available from the DLT pipeline page: clicking a table shows its data quality statistics for the latest update or any previous update. Comparing these statistics across the three tables shows which table is dropping the records and which expectation is responsible.
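The same per-table comparison can be sketched outside the UI. The example below is a minimal illustration, not the DLT API: it assumes event-log rows whose `details` JSON holds `flow_progress.data_quality.expectations` entries with `name`, `dataset`, `passed_records`, and `failed_records` fields, which is the layout described in the pipeline monitoring documentation as far as we know. The table names, expectation names, and record counts are invented for the example.

```python
import json
from collections import defaultdict

# Hypothetical event-log rows with made-up numbers. Each row mimics the
# `details` JSON string attached to a pipeline event (assumed layout).
SAMPLE_EVENTS = [
    {"event_type": "flow_progress", "details": json.dumps({
        "flow_progress": {"data_quality": {"expectations": [
            {"name": "valid_id", "dataset": "bronze_orders",
             "passed_records": 1000, "failed_records": 0}]}}})},
    {"event_type": "flow_progress", "details": json.dumps({
        "flow_progress": {"data_quality": {"expectations": [
            {"name": "valid_amount", "dataset": "silver_orders",
             "passed_records": 958, "failed_records": 42}]}}})},
    {"event_type": "flow_progress", "details": json.dumps({
        "flow_progress": {"data_quality": {"expectations": [
            {"name": "valid_total", "dataset": "gold_summary",
             "passed_records": 958, "failed_records": 0}]}}})},
    # Events of other types carry no expectation metrics and are skipped.
    {"event_type": "update_progress", "details": json.dumps({"state": "COMPLETED"})},
]


def failed_records_by_dataset(events):
    """Sum failed_records per dataset across all flow_progress events."""
    totals = defaultdict(int)
    for event in events:
        if event.get("event_type") != "flow_progress":
            continue
        details = json.loads(event["details"])
        data_quality = details.get("flow_progress", {}).get("data_quality", {})
        for expectation in data_quality.get("expectations", []):
            totals[expectation["dataset"]] += expectation["failed_records"]
    return dict(totals)


totals = failed_records_by_dataset(SAMPLE_EVENTS)
# The table with the most failed records is the likely source of the drops.
worst = max(totals, key=totals.get)
print(totals)
print("Most failed records:", worst)
```

With drop-on-violation expectations, a failed record is a dropped record, so the table with a non-zero failure count is the one to inspect. In the UI, answer D gives the same information without any code.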

Reference: Monitor Delta Live Tables pipelines; Manage data quality with Delta Live Tables