Which of the following code blocks creates this SQL UDF?

A data engineer needs to apply custom logic to string column city in table stores for a specific use case. In order to apply this custom logic at scale, the data engineer wants to create a SQL user-defined function (UDF). Which of the following code blocks creates this SQL UDF?...

August 24, 2024

Which of the following tools can the data engineer use to solve this problem?

A data engineer is maintaining a data pipeline. Upon data ingestion, the data engineer notices that the quality of the source data is starting to decline. The data engineer would like to automate the process of monitoring the quality level. Which of the following tools can the data...

August 22, 2024

If the data engineer only wants the query to execute a micro-batch to process data every 5 seconds, which of the following lines of code should the data engineer use to fill in the blank?

A data engineer has configured a Structured Streaming job to read from a table, manipulate the data, and then perform a streaming write into a new table. The code block used by the data engineer is below: If the data engineer only wants the query to execute a micro-batch to...

August 21, 2024

Which of the following approaches can the data engineer use to minimize the total running time of the SQL endpoint used in the refresh schedule of their dashboard?

A data engineer wants to schedule their Databricks SQL dashboard to refresh once per day, but they only want the associated SQL endpoint to be running when it is necessary. Which of the following approaches can the data engineer use to minimize the total running time of the SQL endpoint...

August 20, 2024

Which of the following is an advantage of using Databricks Repos over the Databricks Notebooks versioning?

A data engineer needs to determine whether to use the built-in Databricks Notebooks versioning or version their project using Databricks Repos. Which of the following is an advantage of using Databricks Repos over the Databricks Notebooks versioning?
A. Databricks Repos automatically saves development progress
B. Databricks Repos supports the use...

August 19, 2024

Which of the following describes a scenario in which a data team will want to utilize cluster pools?

Which of the following describes a scenario in which a data team will want to utilize cluster pools?
A. An automated report needs to be refreshed as quickly as possible.
B. An automated report needs to be made reproducible.
C. An automated report needs to be tested to identify errors.
D...

August 18, 2024

Which of the following is hosted completely in the control plane of the classic Databricks architecture?

Which of the following is hosted completely in the control plane of the classic Databricks architecture?
A. Worker node
B. JDBC data source
C. Databricks web application
D. Databricks Filesystem
E. Driver node
Answer: C
Explanation: The Databricks web application is the user interface that allows you to create and...

August 18, 2024

Which of the following tools can the data engineer use to solve this problem?

A data engineer is designing a data pipeline. The source system generates files in a shared directory that is also used by other processes. As a result, the files should be kept as is and will accumulate in the directory. The data engineer needs to identify which files are new...

August 18, 2024

Which of the following benefits of using the Databricks Lakehouse Platform is provided by Delta Lake?

Which of the following benefits of using the Databricks Lakehouse Platform is provided by Delta Lake?
A. The ability to manipulate the same data using a variety of languages
B. The ability to collaborate in real time on a single notebook
C. The ability to set up alerts for query...

August 16, 2024

Which of the following approaches can the data engineer take to identify the table that is dropping the records?

A data engineer has three tables in a Delta Live Tables (DLT) pipeline. They have configured the pipeline to drop invalid records at each table. They notice that some data is being dropped due to quality concerns at some point in the DLT pipeline. They would like to determine at...

August 15, 2024