A data scientist has a Spark DataFrame spark_df. They want to create a new Spark DataFrame that contains only the rows from spark_df where the value in column price is greater than 0.
Which of the following code blocks will accomplish this task?
A. spark_df[spark_df["price"] > 0]
B. spark_df.filter(col("price") > 0)
C. SELECT * FROM spark_df WHERE price > 0
D. spark_df.loc[spark_df["price"] > 0, :]
E. spark_df.loc[:, spark_df["price"] > 0]
Answer: B
Explanation:
To filter rows of a Spark DataFrame by a condition, use the filter method (or its alias, where) with a column expression. The correct PySpark syntax is spark_df.filter(col("price") > 0), which returns a new DataFrame containing only the rows where the value in the "price" column is greater than 0. The col function, imported from pyspark.sql.functions, builds the column expression. Option C is a bare SQL statement, which would only run via spark.sql() against a registered temporary view; options D and E use the pandas .loc accessor, which Spark DataFrames do not provide; and option A mimics pandas-style boolean indexing rather than the canonical PySpark filter API.
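For illustration, below is a minimal runnable sketch of answer B. The SparkSession setup, the sample rows, and the positive_prices variable name are assumptions added for this example; they are not part of the original question.

# A minimal sketch of answer B, assuming a local SparkSession and a
# hypothetical two-column dataset; names here are illustrative only.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("filter-example").getOrCreate()

# Hypothetical sample data standing in for spark_df.
spark_df = spark.createDataFrame(
    [("widget", 19.99), ("sample", 0.0), ("gadget", 4.50)],
    schema=["item", "price"],
)

# Keep only the rows where price is greater than 0 (answer B).
positive_prices = spark_df.filter(col("price") > 0)
positive_prices.show()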
Reference: PySpark DataFrame API documentation (Filtering DataFrames).