Ben Chuanlong Du's Blog

It is never too late to learn.

Spark Issue: AnalysisException: Cannot Resolve

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

Symptom

org.apache.spark.sql.AnalysisException: cannot resolve ...

Cause

A column name is misspelled, or the query refers to a column that does not exist in the DataFrame.
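A quick way to spot a misspelled name is to compare the names you asked for against the columns that actually exist. The sketch below is only an illustration in plain Python: `suggest_columns` and the sample column list are hypothetical, and with PySpark you would pass `df.columns` as the available names.

```python
import difflib


def suggest_columns(requested, available, cutoff=0.6):
    """Map each requested name missing from `available` to its closest matches."""
    available = list(available)
    suggestions = {}
    for name in requested:
        if name not in available:
            # get_close_matches returns the best candidates, most similar first.
            suggestions[name] = difflib.get_close_matches(
                name, available, n=3, cutoff=cutoff
            )
    return suggestions


# Hypothetical column list; with PySpark this would be `df.columns`.
columns = ["user_id", "order_date", "amount"]
missing = suggest_columns(["user_id", "ammount", "region"], columns)
print(missing)
```

A name that maps to an empty list has no close match, which usually means the column really is absent rather than misspelled.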

Solution

Correct the column name or …

Spark Issue: AnalysisException: Path Does Not Exist

Symptom

org.apache.spark.sql.AnalysisException: Path does not exist ...

Cause

A specified HDFS path does not exist.

Solution

Use the correct HDFS path.

Spark Issue: AccessControlException: Permission Denied

Symptom

org.apache.hadoop.security.AccessControlException: Permission denied ...

Cause

The user running the Spark application does not have permission to query a table or access an HDFS path.

Solution

Apply for access to …

Spark Issue: Container Killed by Yarn for Exceeding Memory Limits

Symptoms

Symptom 1

Container killed by YARN for exceeding memory limits.
22.0 GB of 19 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead or disabling yarn.nodemanager.vmem-check-enabled …
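To see what "boosting the memory overhead" changes, it helps to work out how large a container Spark asks YARN for. The sketch below assumes Spark's documented defaults (overhead is 10% of executor memory, with a 384 MiB minimum) and ignores YARN's own rounding of container sizes. The 16 GiB executor and the 4 GiB overhead are hypothetical values, not taken from the log above. Note that newer Spark versions name the setting `spark.executor.memoryOverhead`.

```python
MIN_OVERHEAD_MIB = 384   # documented minimum overhead
OVERHEAD_FACTOR = 0.10   # documented default fraction of executor memory


def default_overhead_mib(executor_memory_mib):
    """Overhead Spark would use when no explicit memoryOverhead is set."""
    return max(MIN_OVERHEAD_MIB, int(executor_memory_mib * OVERHEAD_FACTOR))


def container_request_mib(executor_memory_mib, overhead_mib=None):
    """Approximate container size: executor heap plus off-heap overhead."""
    if overhead_mib is None:
        overhead_mib = default_overhead_mib(executor_memory_mib)
    return executor_memory_mib + overhead_mib


# Hypothetical executor with 16 GiB of heap.
executor_mib = 16 * 1024
print(container_request_mib(executor_mib))        # default overhead
print(container_request_mib(executor_mib, 4096))  # explicit 4 GiB overhead
```

The container limit is the sum of both parts, so an executor whose off-heap usage exceeds the overhead share can be killed even when the heap itself is not full.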

Spark Issue: Data Skew on Shuffle Phase

Symptom

org.apache.spark.shuffle.FetchFailedException: Too large frame: 2200180718
Caused by: java.lang.IllegalArgumentException: Too large frame: 2200289525
    at org.spark_project.guava.base.Preconditions.checkArgument(Preconditions.java:119)
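Both frame sizes in the trace above are slightly larger than the biggest signed 32-bit integer (about 2 GiB). The sketch below just does that arithmetic. It assumes the limit being enforced is Java's `Integer.MAX_VALUE`, which is my understanding of the check rather than something stated in the log, and `exceeds_frame_limit` is a hypothetical helper.

```python
INT_MAX = 2**31 - 1  # Java's Integer.MAX_VALUE, assumed to be the frame limit


def exceeds_frame_limit(frame_size_bytes, limit=INT_MAX):
    """Return how many bytes the frame is over the limit (0 if it fits)."""
    return max(0, frame_size_bytes - limit)


# Frame sizes copied from the stack trace above.
for size in (2200180718, 2200289525):
    over = exceeds_frame_limit(size)
    print(f"{size} bytes is {over} bytes over the limit")
```

If that assumption holds, a single shuffle block has grown past 2 GiB, which is what you would expect when one key carries far more data than the others.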

Reason

There …