Ben Chuanlong Du's Blog

It is never too late to learn.

Spark Issue: IllegalArgumentException: System Memory Must Be At Least

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

Symptom

Exception in thread "main" java.lang.IllegalArgumentException: System memory 466092032 must be at least 471859200. Please increase heap size using the --driver-memory option or spark.driver.memory in Spark configuration …
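The threshold in the message can be reproduced with a little arithmetic. The sketch below assumes that Spark's unified memory manager reserves 300 MiB and requires the heap to be at least 1.5 times that amount. Those two constants are not stated in the note above, and the function names are hypothetical.

```python
# Assumption: Spark reserves 300 MiB of system memory and requires the JVM
# heap to be at least 1.5x that amount. Names here are illustrative only.
RESERVED_BYTES = 300 * 1024 * 1024
MIN_SYSTEM_BYTES = int(RESERVED_BYTES * 1.5)


def to_mib(n_bytes: int) -> float:
    """Convert a byte count to MiB."""
    return n_bytes / (1024 * 1024)


def shortfall_bytes(system_bytes: int) -> int:
    """Bytes missing before the minimum is met (0 if already enough)."""
    return max(0, MIN_SYSTEM_BYTES - system_bytes)


reported = 466092032  # "System memory" value from the error message

print(MIN_SYSTEM_BYTES)           # 471859200
print(to_mib(reported))           # 444.5
print(shortfall_bytes(reported))  # 5767168
```

Under these assumptions the heap is about 5.5 MiB short of the 450 MiB minimum, so a driver memory setting comfortably above that (for example `--driver-memory 1g`) should clear the check.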

Rust and Spark

The simplest way to use Rust with Spark is to leverage pandas_udf in PySpark. Inside the pandas UDF, you can call subprocess.run to run any shell command (such as a compiled Rust binary) and capture its output.

from pathlib …
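A minimal sketch of the idea, not the author's original code: each value is passed to an external command through subprocess.run, and the captured output becomes the result. The Python interpreter stands in for a compiled Rust binary here, and `run_command`, `process_values`, and `make_pandas_udf` are hypothetical names. The pandas UDF wrapper assumes pyspark, pandas, and pyarrow are installed, so it is only built on demand.

```python
import subprocess
import sys
from typing import Iterable, List


def run_command(args: List[str]) -> str:
    """Run an external command and return its stdout without the trailing newline."""
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout.strip()


def process_values(values: Iterable[str]) -> List[str]:
    """Call the external program once per value.

    Assumption: the Python interpreter is a stand-in for a Rust binary
    that takes one argument and prints one line.
    """
    script = "import sys; print(sys.argv[1].upper())"
    return [run_command([sys.executable, "-c", script, v]) for v in values]


def make_pandas_udf():
    """Wrap process_values as a pandas UDF (requires pyspark, pandas, pyarrow)."""
    import pandas as pd
    from pyspark.sql.functions import pandas_udf

    @pandas_udf("string")
    def process_udf(col: pd.Series) -> pd.Series:
        return pd.Series(process_values(col))

    return process_udf


print(process_values(["spark", "rust"]))  # ['SPARK', 'RUST']
```

Starting one process per value is slow for large columns. A real binary would ideally accept a whole batch on stdin so that each pandas UDF call launches it only once.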

Yarn for Spark

  1. List all Spark applications.

    yarn application -list
    
  2. Show status of a Spark application.

    yarn application -status application_1459542433815_0002
    
  3. View logs of a Spark application.

    yarn logs -applicationId application_1459542433815_0002
    
  4. Kill a Spark application …
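The first three commands can also be driven from a script. The helpers below are a hedged sketch with hypothetical names: they only build the argument lists for the commands shown above, and `run_yarn` returns None when no `yarn` executable is on the PATH.

```python
import shutil
import subprocess
from typing import List, Optional


def yarn_list_cmd() -> List[str]:
    """Arguments for listing applications."""
    return ["yarn", "application", "-list"]


def yarn_status_cmd(app_id: str) -> List[str]:
    """Arguments for showing the status of one application."""
    return ["yarn", "application", "-status", app_id]


def yarn_logs_cmd(app_id: str) -> List[str]:
    """Arguments for fetching the logs of one application."""
    return ["yarn", "logs", "-applicationId", app_id]


def run_yarn(cmd: List[str]) -> Optional[str]:
    """Run the command and return stdout, or None if yarn is not installed."""
    if shutil.which(cmd[0]) is None:
        return None
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


app_id = "application_1459542433815_0002"  # example ID from the list above
print(" ".join(yarn_logs_cmd(app_id)))
```

Passing the arguments as a list avoids shell quoting problems when application IDs come from user input.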

Spark Issue: Libc Not Found

Symptom

/lib64/libc.so.6: version `GLIBC_2.18' not found (required by ...)

Cause

The version of GLIBC required by the binary executable is not available on the Spark nodes.
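To confirm the mismatch, one can compare the glibc version on a node with the version named in the error. This is a rough sketch using the standard library's platform.libc_ver(); the helper names are hypothetical, and the required version "2.18" is taken from the symptom above.

```python
import platform
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '2.18' into (2, 18) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def glibc_is_sufficient(installed: str, required: str) -> bool:
    """True if the installed glibc is at least the required version."""
    return parse_version(installed) >= parse_version(required)


def local_glibc_version() -> Optional[str]:
    """glibc version of the current interpreter, or None if not glibc-based."""
    name, version = platform.libc_ver()
    return version if name == "glibc" and version else None


REQUIRED = "2.18"  # from the GLIBC_2.18 error message
installed = local_glibc_version()
if installed is None:
    print("glibc version could not be determined on this machine")
else:
    print(installed, glibc_is_sufficient(installed, REQUIRED))
```

This reports the version on the machine where it runs. To check worker nodes, the same function would have to be executed inside a Spark task.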

Solution

Recompile your …