Ben Chuanlong Du's Blog

It is never too late to learn.

Format a Disk on Linux

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

  1. Locate the right disk to operate on. A few commands might help you. For example, you can use the command ls /dev/sd* to list all hard drives and the command …

Spark Issue: High Disk and Memory Spill When Doing Shuffle

Symptom

High disk and memory spill when doing shuffle.

Cause

Insufficient executor memory (you can monitor the spill metrics in the Spark UI).

Solution

  1. Increase executor memory.

    --executor-memory=4G
    
  2. For jobs that …
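As a rough illustration of step 1, the sketch below assembles a spark-submit command with the executor memory passed as a flag. The application file my_job.py and the EXECUTOR_MEMORY variable are hypothetical names used only for this example, and 4G is just the value from the step above, not a recommendation for any particular job.

```shell
# Minimal sketch: build a spark-submit command with a configurable executor memory.
# EXECUTOR_MEMORY and my_job.py are hypothetical; adjust them to your own job.
EXECUTOR_MEMORY="${EXECUTOR_MEMORY:-4G}"

# --executor-memory sets the memory requested per executor.
CMD="spark-submit --executor-memory $EXECUTOR_MEMORY my_job.py"

# Print the command for review rather than running it.
echo "$CMD"
```

Printing the command first makes it easy to check the value before submitting; after rerunning the job, the spill metrics in the Spark UI should show whether the increase was enough.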