Which statement characterizes the general programming model used by Spark Structured Streaming?
A . Structured Streaming leverages the parallel processing of GPUs to achieve highly parallel data throughput.
B . Structured Streaming is implemented as a messaging bus and is derived from Apache Kafka.
C . Structured Streaming uses specialized hardware and I/O streams to achieve sub-second latency for data transfer.
D . Structured Streaming models new data arriving in a data stream as new rows appended to an unbounded table.
E . Structured Streaming relies on a distributed network of nodes that hold incremental state values for cached stages.
Answer: D
Explanation:
Option D is correct because it characterizes the general programming model used by Spark Structured Streaming: a live data stream is treated as a table that is continuously appended to. This yields a stream processing model that closely resembles batch processing, so users can express their streaming computation with the same Dataset/DataFrame API they would use for static data. The Spark SQL engine runs the streaming query incrementally and continuously, updating the final result as streaming data keeps arriving.
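The sketch below illustrates this model under stated assumptions (a socket source on localhost:9999 and a console sink, both hypothetical choices for demonstration): new lines arriving on the stream are conceptually new rows appended to an unbounded input table, and the query is written with the ordinary DataFrame API.

```python
# Minimal sketch of the Structured Streaming programming model.
# Assumptions: a socket source on localhost:9999 and a console sink.
from pyspark.sql import SparkSession
from pyspark.sql.functions import explode, split

spark = SparkSession.builder.appName("StructuredStreamingModel").getOrCreate()

# Streaming DataFrame: conceptually an unbounded table with a single "value" column;
# each new line from the stream becomes a new row appended to that table.
lines = (
    spark.readStream
    .format("socket")
    .option("host", "localhost")
    .option("port", 9999)
    .load()
)

# Ordinary DataFrame transformations; Spark runs them incrementally as rows arrive.
word_counts = (
    lines.select(explode(split(lines.value, " ")).alias("word"))
    .groupBy("word")
    .count()
)

# The streaming query continuously updates its result table as new data is appended.
query = (
    word_counts.writeStream
    .outputMode("complete")
    .format("console")
    .start()
)

query.awaitTermination()
```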
Reference: Databricks Certified Data Engineer Professional exam guide, "Structured Streaming" section; Databricks Documentation, "Overview" section.