Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!
Symptom
High disk and memory spill during shuffle.
Cause
Insufficient executor memory (you can monitor the spill metrics in the Spark UI).
Solution
-
Increase executor memory.
--executor-memory=4G
-
For jobs that …
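As a rough sketch of the first option above (increasing executor memory), the flag can be passed to `spark-submit` as shown here. The YARN master, the class name `com.example.MyJob`, and the jar `my-job.jar` are placeholders assumed for illustration and are not from these notes. The second command sets the same value through the `spark.executor.memory` configuration property.

```shell
# Hypothetical job: the master, class name, and jar path are placeholders.
spark-submit \
    --master yarn \
    --executor-memory=4G \
    --class com.example.MyJob \
    my-job.jar

# The same setting expressed as a Spark configuration property.
spark-submit \
    --master yarn \
    --conf spark.executor.memory=4G \
    --class com.example.MyJob \
    my-job.jar
```

The 4G value is only the example from the note above. A suitable size depends on the job, so it is worth checking the spill metrics in the Spark UI again after the change.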