If the data engineer only wants the query to execute a micro-batch to process data every 5 seconds, which of the following lines of code should the data engineer use to fill in the blank?

A data engineer has configured a Structured Streaming job to read from a table, manipulate the data, and then perform a streaming write into a new table.

The cade block used by the data engineer is below:

If the data engineer only wants the query to execute a micro-batch to process data every 5 seconds, which of the following lines of code should the data engineer use to fill in the blank?
A . trigger("5 seconds")
B . trigger()
C . trigger(once="5 seconds")
D . trigger(processingTime="5 seconds")
E . trigger(continuous="5 seconds")

Answer: D

Explanation:

The processingTime option specifies a time-based trigger interval for fixed interval micro-batches. This means that the query will execute a micro-batch to process data every 5 seconds, regardless of how much data is available. This option is suitable for near-real time processing workloads that require low latency and consistent processing frequency. The other options are either invalid syntax (A, C), default behavior (B), or experimental feature (E).

Reference: Databricks Documentation – Configure Structured Streaming trigger intervals, Databricks Documentation – Trigger.

Subscribe
Notify of
guest
0 Comments
Inline Feedbacks
View all comments