When scheduling Structured Streaming jobs for production, which configuration automatically recovers from query failures and keeps costs low?

A. Cluster: New Job Cluster; Retries: Unlimited; Maximum Concurrent Runs: Unlimited
B. Cluster: New Job Cluster; Retries: None; Maximum Concurrent Runs: 1
C. Cluster: Existing All-Purpose Cluster; Retries: Unlimited; Maximum Concurrent Runs: 1
D. Cluster: New Job Cluster; Retries: Unlimited; Maximum Concurrent Runs: 1
E. Cluster: Existing All-Purpose Cluster; Retries: None; Maximum Concurrent Runs: 1

Answer: D

Explanation:

The configuration that automatically recovers from query failures and keeps costs low is to use a new job cluster, set retries to unlimited, and set maximum concurrent runs to 1.

This configuration has the following advantages:

A new job cluster is created when a job run starts and terminated when the run ends. Cluster resources are therefore used only while the job is running, so no idle costs are incurred, and job clusters are billed at the lower jobs compute rate rather than the all-purpose rate. Each run also starts from a clean state with the configuration and libraries defined for the job [1].

Setting retries to unlimited means that the job automatically restarts the query after any failure, such as a network issue, a node failure, or another transient error. This improves the reliability and availability of the streaming job. Combined with a checkpoint location, the restarted query resumes from where the failed one left off, which avoids data loss or inconsistency [2].

Setting maximum concurrent runs to 1 means that only one instance of the job can run at a time. Because a streaming query runs continuously, a second concurrent run would be a duplicate of the same query. Limiting runs to 1 prevents multiple queries from competing for the same resources or writing to the same output location, which can cause performance degradation or data corruption [3].
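As a rough illustration, the three settings above could be expressed as a Databricks Jobs API style settings payload. This is a minimal sketch, not a definitive job definition: the function name `build_streaming_job_settings`, the task key, the notebook path, the runtime version, the node type, the worker count, and the retry interval are all hypothetical placeholders. It assumes the Jobs API convention that `max_retries` set to `-1` means retry indefinitely.

```python
import json


def build_streaming_job_settings(name, notebook_path):
    """Return a job settings dict for a streaming job (illustrative values only)."""
    return {
        "name": name,
        # Only one run of the job may be active at a time.
        "max_concurrent_runs": 1,
        "tasks": [
            {
                "task_key": "stream",  # hypothetical task key
                "notebook_task": {"notebook_path": notebook_path},
                # A new job cluster is created for the run and terminated afterwards.
                "new_cluster": {
                    "spark_version": "13.3.x-scala2.12",  # placeholder runtime
                    "node_type_id": "i3.xlarge",  # placeholder node type
                    "num_workers": 2,  # placeholder size
                },
                # Assumption: -1 requests unlimited retries.
                "max_retries": -1,
                # Placeholder: wait one minute between retry attempts.
                "min_retry_interval_millis": 60000,
            }
        ],
    }


settings = build_streaming_job_settings("orders-stream", "/Jobs/orders_stream")
print(json.dumps(settings, indent=2))
```

The task uses `new_cluster` instead of an `existing_cluster_id`, which is what distinguishes a job cluster from an all-purpose cluster in this sketch.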

Therefore, this configuration is the best practice for scheduling Structured Streaming jobs for production, as it ensures that the job is resilient, efficient, and consistent.

Reference: [1] Job clusters; [2] Job retries; [3] Maximum concurrent runs
