The data governance team is reviewing code used for deleting records for compliance with GDPR.
They note the following logic is used to delete records from the Delta Lake table named users.
Assuming that user_id is a unique identifying key and that delete_requests contains all users that have requested deletion, which statement describes whether successfully executing the above logic guarantees that the records to be deleted are no longer accessible and why?
A. Yes; Delta Lake ACID guarantees provide assurance that the delete command succeeded fully and permanently purged these records.
B. No; the Delta cache may return records from previous versions of the table until the cluster is restarted.
C. Yes; the Delta cache immediately updates to reflect the latest data files recorded to disk.
D. No; the Delta Lake delete command only provides ACID guarantees when combined with the merge into command.
E. No; files containing deleted records may still be accessible with time travel until a vacuum command is used to remove invalidated data files.
Answer: E
Explanation:
The code uses the DELETE FROM command to delete records from the users table that match a condition based on a join with the delete_requests table, which contains all users that have requested deletion. DELETE FROM removes records from a Delta Lake table by committing a new version of the table that does not contain them; the data files that held the deleted records are no longer part of the current version, but they remain in storage. This alone does not guarantee that the records are no longer accessible, because Delta Lake supports time travel, which allows querying previous versions of the table by timestamp or version number. Files containing deleted records may therefore still be accessible with time travel until a VACUUM command removes the invalidated data files from physical storage. Note that VACUUM only removes files older than the retention threshold, which defaults to 7 days.
Verified Reference: [Databricks Certified Data Engineer Professional], under “Delta Lake” section; Databricks Documentation, under “Delete from a table” section; Databricks Documentation, under “Remove files no longer referenced by a Delta table” section.
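The reasoning in the explanation above can be illustrated with a small toy model in plain Python. This is only a sketch and not the Delta Lake API: the class ToyVersionedTable, its methods, and the file names are hypothetical, and the model assumes a simple copy-on-write delete and a vacuum with no retention period (real VACUUM only removes files older than its retention threshold).

```python
class ToyVersionedTable:
    """Hypothetical stand-in for a versioned table; not the Delta Lake API."""

    def __init__(self, rows):
        # Physical storage: file name -> rows held in that file.
        self.storage = {"file_0": list(rows)}
        # Transaction log: entry i lists the files that make up version i.
        self.log = [["file_0"]]
        self._next_file_id = 1

    def read(self, version=None):
        """Return the rows of a version (latest if omitted), i.e. time travel."""
        files = self.log[-1] if version is None else self.log[version]
        missing = [name for name in files if name not in self.storage]
        if missing:
            raise FileNotFoundError(f"data files no longer in storage: {missing}")
        return [row for name in files for row in self.storage[name]]

    def delete(self, user_ids):
        """Copy-on-write delete: commit a new version, leave old files on disk."""
        doomed = set(user_ids)
        new_files = []
        for name in self.log[-1]:
            rows = self.storage[name]
            kept = [row for row in rows if row["user_id"] not in doomed]
            if len(kept) == len(rows):
                new_files.append(name)  # file unaffected, reuse it
            else:
                new_name = f"file_{self._next_file_id}"
                self._next_file_id += 1
                self.storage[new_name] = kept
                new_files.append(new_name)
        self.log.append(new_files)

    def vacuum(self):
        """Physically remove files the latest version no longer references."""
        live = set(self.log[-1])
        for name in list(self.storage):
            if name not in live:
                del self.storage[name]


users = ToyVersionedTable([{"user_id": 1}, {"user_id": 2}, {"user_id": 3}])
users.delete([2])  # user 2 requested deletion

# The current version no longer contains user 2 ...
current = [row["user_id"] for row in users.read()]
# ... but the previous version is still readable, including user 2.
before = [row["user_id"] for row in users.read(version=0)]

users.vacuum()
try:
    users.read(version=0)
    purged = False
except FileNotFoundError:
    # After vacuum, the old data file is gone, so time travel fails.
    purged = True
```

In this sketch, current holds user IDs 1 and 3 while before still holds 1, 2, and 3, which mirrors why option E is correct: the delete alone leaves the old data reachable, and only the vacuum step makes the earlier version unreadable.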