Ben Chuanlong Du's Blog

It is never too late to learn.

Spark Issue: Data Skew on Shuffle Phase

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

Symptom

org.apache.spark.shuffle.FetchFailedException: Too large frame: 2200180718
Caused by: java.lang.IllegalArgumentException: Too large frame: 2200289525
    at org.spark_project.guava.base.Preconditions.checkArgument(Preconditions.java:119)
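
To put the two numbers in the message into perspective, here is a small sketch that compares them against a 2 GiB ceiling. It assumes the limit being enforced is Java's Integer.MAX_VALUE (2147483647 bytes); the names MAX_FRAME_SIZE and frame_overflow are made up for this illustration and are not Spark APIs.

```python
# Assumed limit: Java's Integer.MAX_VALUE, i.e. just under 2 GiB.
MAX_FRAME_SIZE = 2**31 - 1


def frame_overflow(frame_size: int) -> int:
    """Return by how many bytes a frame exceeds the assumed limit (0 if it fits)."""
    return max(0, frame_size - MAX_FRAME_SIZE)


# The two frame sizes reported in the error message above.
for size in (2200180718, 2200289525):
    print(f"{size} bytes exceeds the limit by {frame_overflow(size)} bytes")
```

Under that assumption, both frames are roughly 50 MB over the ceiling, which is consistent with a single oversized shuffle block rather than a generally overloaded job.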

Reason

There …

Spark Issue: InvalidInputException for Some Hive Data Partitions

Symptom

15/12/29 17:22:27 ERROR yarn.ApplicationMaster: User class …

Spark Issue: a Master URL Must Be Set in Your Configuration

Error Message

Error initializing SparkContext: A master URL must be set in your configuration.

Possible Causes

The master of the Spark cluster is not specified.

Solutions

Add .master("yarn") into the following …
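
A minimal PySpark sketch of setting the master on the session builder, assuming the cluster is managed by YARN. The function name build_session and the application name "example_app" are placeholders invented for this example, not taken from the original code.

```python
def build_session(app_name="example_app", master="yarn"):
    """Create (or reuse) a SparkSession with the master URL set explicitly."""
    # Imported lazily so this file can be loaded where PySpark is not installed.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .master(master)      # the setting the error message complains about
        .appName(app_name)
        .getOrCreate()
    )
```

The master can also be supplied at submission time with spark-submit --master yarn, in which case it does not need to be hard-coded in the application.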

Spark Issue: Spark Application Submission Is Not Finished

Error Message

Application submission is not finished, submitted application application_1524215324275_0081 is still …

Spark Issue: Cannot Create a Path from An Empty String

Issue

java.lang.IllegalArgumentException: Can not create a Path from an empty string

Possible Causes

The error could come from a number of things:

  1. Parameters: check for ${param} in …
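
One way to catch this early is to validate path strings before handing them to Spark. The sketch below assumes the problem is either a path that ended up empty or a ${param} placeholder that was never substituted; check_path and PLACEHOLDER are hypothetical names used only for this illustration.

```python
import re

# Matches an unsubstituted placeholder such as ${param}.
PLACEHOLDER = re.compile(r"\$\{[^}]*\}")


def check_path(path):
    """Return a list of problems found in a path string (empty list if none)."""
    problems = []
    if path is None or not path.strip():
        problems.append("path is empty")
    else:
        match = PLACEHOLDER.search(path)
        if match:
            problems.append("unresolved placeholder: " + match.group(0))
    return problems


for candidate in ("", "/data/${param}/out", "/data/2018/out"):
    print(repr(candidate), check_path(candidate))
```

Running a check like this on every configured path before the job starts turns an opaque IllegalArgumentException into a message that names the offending parameter.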

Spark Issue: Duplicated Partitions

There seems to be an issue in Spark where it may fail to overwrite existing files even when the write mode is set to "overwrite".
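
For reference, this is the kind of write call the note refers to, sketched in PySpark. The Parquet output format and the helper name overwrite_parquet are assumptions made for the example; the original post does not say which format or code was involved.

```python
def overwrite_parquet(df, path):
    """Write a DataFrame to `path` as Parquet, asking Spark to replace existing output."""
    # mode("overwrite") requests that existing data at `path` be replaced.
    df.write.mode("overwrite").parquet(path)
```

If duplicated partitions show up after a call like this, listing the files under the output path before and after the write is a simple way to confirm whether the old files were actually removed.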