Ben Chuanlong Du's Blog

It is never too late to learn.

Spark Issue: AnalysisException: Found Duplicated Columns

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

Symptoms

pyspark.sql.utils.AnalysisException: Found duplicate column(s) when inserting into ...

Possible Causes

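As the error message says, the data being inserted contains duplicated column names. Check the Spark SQL code that produces it, e.g., a join or a SELECT that outputs the same column name more than once.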
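A quick way to see which names collide is to inspect the column list before writing. The snippet below is only a sketch: `find_duplicate_columns` is a hypothetical helper (not part of PySpark), and it compares names case-insensitively by default on the assumption that Spark runs with its default `spark.sql.caseSensitive=false`.

```python
from collections import Counter


def find_duplicate_columns(columns, case_sensitive=False):
    """Return the column names that occur more than once, in first-seen order.

    Hypothetical helper for illustration. Names are lowercased before
    comparison unless case_sensitive is True.
    """
    keys = [c if case_sensitive else c.lower() for c in columns]
    counts = Counter(keys)
    seen = set()
    duplicates = []
    for key in keys:
        if counts[key] > 1 and key not in seen:
            seen.add(key)
            duplicates.append(key)
    return duplicates


# With a real DataFrame `df`, pass `df.columns` instead of a literal list.
print(find_duplicate_columns(["id", "name", "ID", "amount", "name"]))
# → ['id', 'name']
```

Once the duplicated names are known, renaming or dropping one of each pair in the query removes the conflict.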

Possible Solutions

Fix …

Spark Issue: AnalysisException: Cannot Resolve

Symptom

org.apache.spark.sql.AnalysisException: cannot resolve ...

Cause

A column name is misspelled, or the code refers to a column that does not exist in the DataFrame.

Solution

Correct the column name or …
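When the column list is long, it can help to look up the closest existing names for the one Spark could not resolve. This is a minimal sketch using Python's standard `difflib`; `suggest_columns` is a hypothetical helper, and the example column names are made up for illustration.

```python
import difflib


def suggest_columns(name, columns, n=3):
    """Return up to n existing column names that closely match `name`.

    Hypothetical helper for illustration. Matching is case-insensitive;
    the original spelling of each column is returned.
    """
    by_lower = {c.lower(): c for c in columns}
    matches = difflib.get_close_matches(
        name.lower(), list(by_lower), n=n, cutoff=0.6
    )
    return [by_lower[m] for m in matches]


# With a real DataFrame `df`, pass `df.columns` as the second argument.
print(suggest_columns("user_nmae", ["user_id", "user_name", "amount"]))
```

The best match is listed first, so the first entry is usually the column that was intended.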

Spark Issue: AnalysisException: Path Does Not Exist

Symptom

org.apache.spark.sql.AnalysisException: Path does not exist ...

Cause

The HDFS path that the Spark job tries to read does not exist.

Solution

Use the correct HDFS path.
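To confirm a path before handing it to Spark, one option is to call the Hadoop CLI, whose `hdfs dfs -test -e <path>` command exits with status 0 when the path exists. The sketch below assumes the `hdfs` command is on the PATH of the machine running the check; `hdfs_path_exists` is a hypothetical helper, and the `run` parameter exists only so the command runner can be swapped out.

```python
import subprocess


def hdfs_path_exists(path, run=subprocess.run):
    """Return True if `hdfs dfs -test -e <path>` exits with status 0.

    Hypothetical helper for illustration. `run` defaults to
    subprocess.run and can be replaced, e.g. in tests.
    """
    result = run(["hdfs", "dfs", "-test", "-e", path])
    return result.returncode == 0


# Example usage (requires the `hdfs` CLI; the path is a placeholder):
# if not hdfs_path_exists("/path/to/input"):
#     raise FileNotFoundError("HDFS path does not exist")
```

Checking up front makes the failure explicit before the Spark job starts, instead of surfacing as an AnalysisException when the read is analyzed.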