Assuming all delete logic is correctly implemented, which statement correctly addresses this concern?

The data engineering team has configured a job to process customer requests to be forgotten (have their data deleted). All user data that needs to be deleted is stored in Delta Lake tables using default table settings.

The team has decided to process all deletions from the previous week as a batch job at 1am each Sunday. The total duration of this job is less than one hour. Every Monday at 3am, a batch job executes a series of VACUUM commands on all Delta Lake tables throughout the organization.

The compliance officer has recently learned about Delta Lake’s time travel functionality. They are concerned that this might allow continued access to deleted data.

Assuming all delete logic is correctly implemented, which statement correctly addresses this concern?
A . Because the vacuum command permanently deletes all files containing deleted records, deleted records may be accessible with time travel for around 24 hours.
B . Because the default data retention threshold is 24 hours, data files containing deleted records will be retained until the vacuum job is run the following day.
C . Because Delta Lake time travel provides full access to the entire history of a table, deleted records can always be recreated by users with full admin privileges.
D . Because Delta Lake’s delete statements have ACID guarantees, deleted records will be permanently purged from all storage systems as soon as a delete job completes.
E . Because the default data retention threshold is 7 days, data files containing deleted records will be retained until the vacuum job is run 8 days later.

Answer: E

Explanation:

https://learn.microsoft.com/en-us/azure/databricks/delta/vacuum

Subscribe
Notify of
guest
0 Comments
Inline Feedbacks
View all comments