Exam4Training

Which of the following describes the relationship between native Spark DataFrames and pandas API on Spark DataFrames?

A. pandas API on Spark DataFrames are single-node versions of Spark DataFrames with additional metadata
B. pandas API on Spark DataFrames are more performant than Spark DataFrames
C. pandas API on Spark DataFrames are made up of Spark DataFrames and additional metadata
D. pandas API on Spark DataFrames are less mutable versions of Spark DataFrames
E. pandas API on Spark DataFrames are unrelated to Spark DataFrames

Answer: C

Explanation:

Pandas API on Spark (previously known as Koalas) provides a pandas-like API on top of Apache Spark.

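It lets users run pandas-style operations on large datasets using Spark’s distributed compute capabilities. Internally, each pandas API on Spark DataFrame is backed by a Spark DataFrame plus additional metadata (for example, which columns act as the pandas-style index), so operations behave in a pandas-like manner while still executing on Spark. This is what option C describes. The other options do not hold: these DataFrames are distributed rather than single-node (A), they are not inherently faster than the Spark DataFrames they are built on (B), they are not defined by being less mutable (D), and they are directly related to Spark DataFrames (E).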
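The relationship can be illustrated with a minimal sketch that converts between the two representations. It assumes a working PySpark 3.2+ installation with pandas and PyArrow available; the function name `round_trip`, the column names, and the sample values are hypothetical and chosen only for illustration.

```python
def round_trip():
    """Convert pandas-on-Spark -> Spark -> pandas-on-Spark and report what we see."""
    # Imported inside the function so the sketch can be loaded
    # even where Spark is not installed (an assumption of this example).
    import pyspark.pandas as ps
    from pyspark.sql import DataFrame as SparkDataFrame

    # A small pandas-on-Spark DataFrame with hypothetical data.
    psdf = ps.DataFrame({"id": [1, 2, 3], "score": [0.5, 0.7, 0.9]})

    # to_spark() exposes the data as a native Spark DataFrame.
    # The pandas-style index is metadata; it is only kept as a column
    # when index_col is given.
    sdf = psdf.to_spark(index_col="row_id")

    # pandas_api() wraps a Spark DataFrame as a pandas-on-Spark DataFrame,
    # using the named column as the index again.
    psdf_again = sdf.pandas_api(index_col="row_id")

    return {
        "is_spark_df": isinstance(sdf, SparkDataFrame),
        "spark_columns": sdf.columns,
        "pandas_columns": list(psdf_again.columns),
        "index_name": psdf_again.index.name,
    }
```

In the Spark view, the index appears as an ordinary column, while the pandas-on-Spark view treats it as the index. That difference is the kind of metadata the pandas API on Spark layers on top of a Spark DataFrame.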

Reference

pandas API on Spark documentation:

https://spark.apache.org/docs/latest/api/python/user_guide/pandas_on_spark/index.html
