A Comprehensive List of Common Issues in Spark Applications
Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!
List of Common Issues
Please refer to http://www.legendu.net/misc/tag/spark-issue.html for a comprehensive list of Spark issues with their (possible) causes and solutions.
Debugging Tips
Spark/Hadoop …
Spark Issue: IllegalArgumentException: System Memory Must Be At Least
Symptom
Exception in thread "main" java.lang.IllegalArgumentException: System memory 466092032 must be at least 471859200. Please increase heap size using the --driver-memory option or spark.driver.memory in Spark configuration …
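The two numbers in the message can be checked by hand. The sketch below assumes the threshold comes from Spark's unified memory manager reserving 300 MiB and requiring the JVM heap to be at least 1.5 times that. The constant and function names are illustrative, not Spark identifiers.

```python
# Assumption: Spark reserves 300 MiB of system memory and requires the heap
# to be at least 1.5x the reserved amount. Names here are illustrative only.
RESERVED_SYSTEM_MEMORY_BYTES = 300 * 1024 * 1024
MIN_SYSTEM_MEMORY_BYTES = int(RESERVED_SYSTEM_MEMORY_BYTES * 1.5)


def has_enough_memory(system_memory_bytes: int) -> bool:
    """Return True if the reported heap size meets the assumed minimum."""
    return system_memory_bytes >= MIN_SYSTEM_MEMORY_BYTES


reported = 466092032  # value taken from the error message above
shortfall = MIN_SYSTEM_MEMORY_BYTES - reported

print(MIN_SYSTEM_MEMORY_BYTES)      # 471859200, i.e. 450 MiB
print(has_enough_memory(reported))  # False
print(shortfall / (1024 * 1024))    # 5.5 (MiB missing)
```

Under that assumption the driver heap is only 5.5 MiB short, so any driver memory setting comfortably above 450 MiB should clear the check.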
Tips on Rustfmt
tab_spaces = 4
max_width = 90
chain_width = 70
newline_style = "unix"
use_field_init_shorthand = true
use_small_heuristics = "Max"
Rust and Spark
The simplest and best way to use Rust with Spark is to leverage pandas_udf in PySpark. Inside the pandas UDF, you can call subprocess.run to run any shell command (such as a compiled Rust executable) and capture its output.
from pathlib …
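As a minimal sketch of that idea (not the author's original code, which is truncated above), the helper below runs one shell command with subprocess.run and a factory wraps it as a Series-to-Series pandas UDF. The names run_shell and make_shell_udf are hypothetical, and the factory assumes pyspark, pandas, and pyarrow are installed on the driver and the executors.

```python
import subprocess


def run_shell(cmd: str) -> str:
    """Run one shell command and return its captured standard output."""
    proc = subprocess.run(
        cmd, shell=True, capture_output=True, text=True, check=True
    )
    return proc.stdout.strip()


def make_shell_udf():
    """Build a pandas UDF that runs each string in a column as a shell command.

    Imports are done lazily so that run_shell stays usable without Spark.
    """
    import pandas as pd
    from pyspark.sql.functions import pandas_udf

    @pandas_udf("string")
    def shell_udf(cmds: pd.Series) -> pd.Series:
        # Each element is a command line, e.g. a call to a compiled Rust binary.
        return cmds.map(run_shell)

    return shell_udf
```

With a DataFrame df holding commands in a (hypothetical) column named cmd, usage would look like df.withColumn("output", make_shell_udf()("cmd")). Starting one process per row is expensive, so batching several inputs into a single command may be worth considering.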
Yarn for Spark
- List all Spark applications.
    yarn application -list
- Show the status of a Spark application.
    yarn application -status application_1459542433815_0002
- View the logs of a Spark application.
    yarn logs -applicationId application_1459542433815_0002
- Kill a Spark application …
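The list, status, and logs commands above can also be driven from a script. The following is a rough sketch, assuming the yarn executable is on PATH. The helper names are made up for illustration, and the truncated kill command is deliberately left out.

```python
import subprocess
from typing import List


def yarn_list_cmd() -> List[str]:
    """Argument list for listing applications."""
    return ["yarn", "application", "-list"]


def yarn_status_cmd(app_id: str) -> List[str]:
    """Argument list for showing the status of one application."""
    return ["yarn", "application", "-status", app_id]


def yarn_logs_cmd(app_id: str) -> List[str]:
    """Argument list for fetching the logs of one application."""
    return ["yarn", "logs", "-applicationId", app_id]


def run(cmd: List[str]) -> str:
    """Run a command (requires yarn on PATH) and return its standard output."""
    return subprocess.run(
        cmd, capture_output=True, text=True, check=True
    ).stdout
```

For example, run(yarn_status_cmd("application_1459542433815_0002")) would return the status report as text on a machine with a configured YARN client.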