Which statement regarding stream-static joins and static Delta tables is correct?
A. Each microbatch of a stream-static join will use the most recent version of the static Delta table as of each microbatch. B. Each microbatch of a stream-static join will use the most recent version of the static Delta...
Assuming that all data governance considerations are accounted for, which statement accurately informs this decision?
A small company based in the United States has recently contracted a consulting firm in India to implement several new data engineering pipelines to power artificial intelligence applications. All the company's data is stored in regional cloud storage in the United States. The workspace administrator at the company is uncertain...
If all users on the finance team are members of the finance group, which statement describes how the tx_sales table will be created?
An external object storage container has been mounted to the location /mnt/finance_eda_bucket. The following logic was executed to create a database for the finance team: After the database was successfully created and permissions configured, a member of the finance team runs the following code: If all users on the finance...
Assuming there are millions of user accounts and tens of thousands of records processed hourly, which implementation can be used to efficiently update the described account_current table as part of each hourly batch job?
An hourly batch job is configured to ingest data files from a cloud object storage container where each batch represents all records produced by the source system in a given hour. The batch job to process these records into the Lakehouse is sufficiently delayed to ensure no late-arriving data is...
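A minimal sketch of the general idea behind an incremental update of a current-state table: reduce the hourly batch to one most-recent record per user, then upsert only those users instead of rewriting millions of rows. This is an illustration in plain Python, not the exam's answer code; the field names `user_id` and `event_time` and both helper functions are hypothetical, and on Delta Lake the second step would typically be expressed as a `MERGE INTO` statement.

```python
def latest_per_user(batch):
    """Keep only the most recent record per user within one hourly batch.

    `user_id` and `event_time` are hypothetical field names.
    """
    latest = {}
    for rec in batch:
        current = latest.get(rec["user_id"])
        if current is None or rec["event_time"] > current["event_time"]:
            latest[rec["user_id"]] = rec
    return latest


def apply_hourly_batch(account_current, batch):
    """Upsert the deduplicated batch into the current-state mapping.

    Only users present in the batch are touched; all other rows stay as-is.
    """
    for user_id, rec in latest_per_user(batch).items():
        account_current[user_id] = rec
    return account_current


# Current state keyed by user_id (stands in for the account_current table).
account_current = {
    1: {"user_id": 1, "event_time": 10, "status": "active"},
    2: {"user_id": 2, "event_time": 11, "status": "active"},
}

# One hourly batch: user 1 appears twice, user 3 is new.
hourly_batch = [
    {"user_id": 1, "event_time": 20, "status": "suspended"},
    {"user_id": 1, "event_time": 25, "status": "active"},
    {"user_id": 3, "event_time": 22, "status": "active"},
]

apply_hourly_batch(account_current, hourly_batch)
```

The work done here scales with the size of the batch (tens of thousands of records) rather than the size of the table (millions of accounts), which is the property the question asks about.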
Which of the following accurately presents information about Delta Lake and Databricks that may impact their decision-making process?
A junior data engineer is working to implement logic for a Lakehouse table named silver_device_recordings. The source data contains 100 unique fields in a highly nested JSON structure. The silver_device_recordings table will be used downstream to power several production monitoring dashboards and a production model. At present, 45 of the...
Which command allows manual confirmation that these three requirements have been met?
The data governance team has instituted a requirement that all tables containing Personally Identifiable Information (PII) must be clearly annotated. This includes adding column comments, table comments, and setting the custom table property "contains_pii" = true. The following SQL DDL statement is executed to create a new table: Which command...
Assuming that all configurations and referenced resources are available, which statement describes the result of executing this workload three times?
A junior data engineer has configured a workload that posts the following JSON to the Databricks REST API endpoint 2.0/jobs/create. Assuming that all configurations and referenced resources are available, which statement describes the result of executing this workload three times? A. Three new jobs named "Ingest new data" will be...
Which approach will ensure that this requirement is met?
The data architect has mandated that all tables in the Lakehouse should be configured as external Delta Lake tables. Which approach will ensure that this requirement is met? A. Whenever a database is being created, make sure that the location keyword is used. B. When configuring an external data warehouse...
Which situation is causing increased duration of the overall job?
A Spark job is taking longer than expected. Using the Spark UI, a data engineer notes that the Min, Median, and Max Durations for tasks in a particular stage show the minimum and median time to complete a task as roughly the same, but the max duration for a task...
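The pattern described above (minimum and median task durations close together, with the maximum standing apart) can be checked mechanically once the per-task durations are copied out of the Spark UI. The following is a rough sketch in plain Python; the function name, the sample durations, and the `skew_ratio` threshold of 5 are all assumptions chosen for illustration, not values taken from the question.

```python
from statistics import median


def summarize_task_durations(durations_s, skew_ratio=5.0):
    """Summarize task durations and flag a long-tail straggler.

    `skew_ratio` is an arbitrary illustrative threshold: the stage is
    flagged when the slowest task takes at least that many times the
    median task duration.
    """
    lo = min(durations_s)
    mid = median(durations_s)
    hi = max(durations_s)
    return {
        "min": lo,
        "median": mid,
        "max": hi,
        "skewed": mid > 0 and hi / mid >= skew_ratio,
    }


# Hypothetical durations in seconds: seven similar tasks and one straggler.
durations = [2.0, 2.1, 2.0, 2.2, 2.1, 2.0, 2.1, 210.0]
summary = summarize_task_durations(durations)

# A stage whose tasks all take about the same time is not flagged.
balanced = summarize_task_durations([2.0, 2.1, 2.2, 2.0])
```

When one task dominates like this, the usual suspect is uneven data distribution across partitions, since that single partition holds far more rows than the rest.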
Assuming that user_id is a unique identifying key and that delete_requests contains all users that have requested deletion, which statement describes whether successfully executing the above logic guarantees that the records to be deleted are no longer accessible and why?
The data governance team is reviewing code used for deleting records for compliance with GDPR. They note the following logic is used to delete records from the Delta Lake table named users. Assuming that user_id is a unique identifying key and that delete_requests contains all users that have requested deletion,...