Which approach will ensure that this requirement is met?

The data architect has mandated that all tables in the Lakehouse should be configured as external Delta Lake tables. Which approach will ensure that this requirement is met?
A. Whenever a database is being created, make sure that the location keyword is used
B. When configuring an external data warehouse...

September 8, 2024

Assuming that this code produces logically correct results and the data in the source tables has been de-duplicated and validated, which statement describes what will occur when this code is executed?

The data engineering team maintains the following code: Assuming that this code produces logically correct results and the data in the source tables has been de-duplicated and validated, which statement describes what will occur when this code is executed?
A. A batch job will update the enriched_itemized_orders_by_account table, replacing only...

September 6, 2024

Which approach will allow this developer to review the current logic for this notebook?

A junior developer complains that the code in their notebook isn't producing the correct results in the development environment. A shared screenshot reveals that while they're using a notebook versioned with Databricks Repos, they're using a personal branch that contains old logic. The desired branch named dev-2.3.9 is not available...

September 5, 2024

Which statement describes the results of querying recent_orders?

A table is registered with the following code: Both users and orders are Delta Lake tables. Which statement describes the results of querying recent_orders?
A. All logic will execute at query time and return the result of joining the valid versions of the source tables at the time the query...

September 3, 2024

Which statement describes the execution and results of running the above query multiple times?

A junior data engineer seeks to leverage Delta Lake's Change Data Feed functionality to create a Type 1 table representing all of the values that have ever been valid for all rows in a bronze table created with the property delta.enableChangeDataFeed = true. They plan to execute the following code...

September 2, 2024

Which situation is causing increased duration of the overall job?

A Spark job is taking longer than expected. Using the Spark UI, a data engineer notes that the Min, Median, and Max Durations for tasks in a particular stage show the minimum and median time to complete a task as roughly the same, but the max duration for a task...

September 2, 2024

Which code snippet completes this function definition?

A nightly job ingests data into a Delta Lake table using the following code: The next step in the pipeline requires a function that returns an object that can be used to manipulate new records that have not yet been processed to the next table in the pipeline. Which code...

August 31, 2024

Which statement explains the cause of this failure?

The downstream consumers of a Delta Lake table have been complaining about data quality issues impacting performance in their applications. Specifically, they have complained that invalid latitude and longitude values in the activity_details table have been breaking their ability to use other geolocation processes. A junior engineer has written the...

August 31, 2024

When this query is executed, what will happen with new records that have the same event_id as an existing record?

A junior data engineer on your team has implemented the following code block. The view new_events contains a batch of records with the same schema as the events Delta table. The event_id field serves as a unique key for this table. When this query is executed, what will happen with...

August 31, 2024

Assuming that all data governance considerations are accounted for, which statement accurately informs this decision?

A small company based in the United States has recently contracted a consulting firm in India to implement several new data engineering pipelines to power artificial intelligence applications. All the company's data is stored in regional cloud storage in the United States. The workspace administrator at the company is uncertain...

August 30, 2024