Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!
Show Error Messages Only
When you run Spark or PySpark in a Jupyter/Lab notebook, it is best to show ERROR messages only. Otherwise, Spark's INFO and WARN logging can flood the notebook output. You can set the log level to ERROR by calling setLogLevel("ERROR") on the SparkContext.
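A minimal sketch of that call in PySpark, assuming a SparkSession already exists in the notebook. The helper name show_errors_only is hypothetical and used only for illustration; setLogLevel is the SparkContext method that does the work.

```python
def show_errors_only(spark):
    """Restrict Spark logging to ERROR messages for an existing SparkSession.

    `spark` is assumed to be a pyspark.sql.SparkSession (or any object that
    exposes a `sparkContext` with a `setLogLevel` method).
    """
    spark.sparkContext.setLogLevel("ERROR")
    return spark
```

In a notebook this would typically be called right after the session is created, for example show_errors_only(SparkSession.builder.getOrCreate()).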
Make Date Show Time of a Specific Timezone in Linux
For example, to show the current Pacific time,
TZ=America/Los_Angeles date
Spark Issue: InvalidResourceRequestException
Symptoms
Final app status: FAILED, exitCode: 10, (reason: Uncaught exception: org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, requested virtual cores < 0, or requested virtual cores > max configured, requestedVirtualCores=16 …
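The condition named in the message can be sketched as a simple range check. This is only an illustration of the rule as the log states it, not YARN's actual code. The requested value of 16 comes from the log line above; the maximum of 8 is a hypothetical placeholder, since the real limit is cut off in the log (on YARN it is commonly governed by yarn.scheduler.maximum-allocation-vcores, and the request by spark.executor.cores).

```python
def vcores_request_is_valid(requested_vcores, max_vcores):
    """Mirror the rule in the error message: 0 <= requested <= max configured."""
    return 0 <= requested_vcores <= max_vcores


requested = 16   # requestedVirtualCores from the log line
assumed_max = 8  # hypothetical cluster limit; the real value is not shown in the log

# The original request exceeds the assumed limit, so it would be rejected.
print(vcores_request_is_valid(requested, assumed_max))  # False

# Capping the request at the limit makes it acceptable.
print(vcores_request_is_valid(min(requested, assumed_max), assumed_max))  # True
```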
Spark Configuration
A Comprehensive List of Common Issues in Spark Applications
List of Common Issues
Please refer to http://www.legendu.net/misc/tag/spark-issue.html for a comprehensive list of Spark issues with their possible causes and solutions.
Debugging Tips
Spark/Hadoop …
Spark Issue: IllegalArgumentException: System Memory Must Be At Least
Symptom
Exception in thread "main" java.lang.IllegalArgumentException: System memory 466092032 must be at least 471859200. Please increase heap size using the --driver-memory option or spark.driver.memory in Spark configuration …
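The threshold in the message is consistent with 1.5 times a 300 MiB reserved-memory figure, which is, to my understanding, how Spark's unified memory manager derives its minimum. The arithmetic below is a sketch under that assumption; the constant and function names are chosen for illustration and are not Spark's API.

```python
import math

# Assumed reserved system memory: 300 MiB, expressed in bytes.
RESERVED_SYSTEM_MEMORY_BYTES = 300 * 1024 * 1024


def min_system_memory(reserved_bytes=RESERVED_SYSTEM_MEMORY_BYTES):
    """Minimum system memory, taken as 1.5 times the reserved amount."""
    return math.ceil(reserved_bytes * 1.5)


reported = 466092032            # "System memory" value from the exception
required = min_system_memory()  # threshold from the exception
shortfall = required - reported

print(required)   # 471859200
print(shortfall)  # 5767168 bytes, i.e. 5.5 MiB short
```

Under this reading, the driver heap in the example is only 5.5 MiB below the minimum, so a modest increase of the driver memory setting is enough to clear the check.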