In scenarios with many small files, Spark starts a large number of tasks. When the SQL logic includes a shuffle operation, the number of hash buckets grows sharply, which seriously degrades performance. In FusionInsight, the ( ) operator is usually used in small-file scenarios to merge the partitions that the table's small files generate, reducing the number of partitions so that the shuffle does not produce too many hash buckets, which improves performance.
A . group by
B . coalesce
C . connect
D . join
Answer: B
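The operator that merges partitions without a full shuffle is coalesce (in Spark, `coalesce(numPartitions)` on an RDD or DataFrame). The block below is only a toy model in plain Python of what that merging looks like, assuming neighbouring partitions are simply concatenated; it is not Spark's actual grouping logic, which also considers data locality. The function name `coalesce_partitions` and the sample data are hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def coalesce_partitions(partitions: Sequence[List[T]], num_partitions: int) -> List[List[T]]:
    """Toy model: merge neighbouring partitions into at most num_partitions groups.

    Whole partitions are concatenated; records are never redistributed one by
    one, which is the reason a coalesce can avoid a full shuffle.
    """
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")

    n = len(partitions)
    # Like Spark's DataFrame.coalesce, asking for more partitions than exist
    # leaves the partition count unchanged.
    target = min(num_partitions, n)

    merged: List[List[T]] = []
    for i in range(target):
        # Contiguous slice of parent partitions assigned to output partition i.
        start = i * n // target
        end = (i + 1) * n // target
        group: List[T] = []
        for part in partitions[start:end]:
            group.extend(part)
        merged.append(group)
    return merged


# Hypothetical small-file scenario: 1000 tiny partitions, one row each.
small_file_partitions = [[f"row-{i}"] for i in range(1000)]
merged = coalesce_partitions(small_file_partitions, 10)
print(len(small_file_partitions), "partitions before,", len(merged), "after")
```

With fewer partitions feeding the shuffle, fewer map tasks each write their own set of shuffle buckets, which is the effect the question describes. group by and join are operations that themselves trigger a shuffle, so they do not reduce the partition count beforehand.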