Ben Chuanlong Du's Blog

It is never too late to learn.

Date and Time in Java and Scala

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

Use Joda-Time if you are on JDK 7 or earlier, and java.time if you are on JDK 8 or later.
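
As a rough illustration of the java.time side of that advice, here is a minimal sketch assuming JDK 8 or later. The class name `DateTimeSketch` and its helper methods are hypothetical names made up for this example; only the `java.time` calls themselves come from the JDK.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

// Hypothetical helper class, for illustration only.
final class DateTimeSketch {
    // Parse an ISO-8601 date (yyyy-MM-dd), add a number of days, and return it in the same format.
    static String plusDays(String isoDate, long days) {
        return LocalDate.parse(isoDate).plusDays(days).toString();
    }

    // Count whole days from start (inclusive) to end (exclusive).
    static long daysBetween(String start, String end) {
        return ChronoUnit.DAYS.between(LocalDate.parse(start), LocalDate.parse(end));
    }

    // Interpret a local date-time in one zone and express the same instant in another zone.
    static String toZone(String localDateTime, String fromZone, String toZone) {
        ZonedDateTime source = LocalDateTime.parse(localDateTime).atZone(ZoneId.of(fromZone));
        return source.withZoneSameInstant(ZoneId.of(toZone))
                     .format(DateTimeFormatter.ISO_LOCAL_DATE_TIME);
    }

    public static void main(String[] args) {
        System.out.println(plusDays("2015-12-29", 3));                           // 2016-01-01
        System.out.println(daysBetween("2015-12-29", "2016-01-01"));             // 3
        System.out.println(toZone("2015-12-29T17:22:27", "UTC", "Asia/Tokyo"));  // 2015-12-30T02:22:27
    }
}
```

The java.time types are immutable, so each call returns a new value rather than modifying its input. The same classes can be called directly from Scala.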

If you do prefer Scala libraries (when working in Scala), https://github …

Spark Issue: Data Skew on Shuffle Phase

Symptom

org.apache.spark.shuffle.FetchFailedException: Too large frame: 2200180718
Caused by: java.lang.IllegalArgumentException: Too large frame: 2200289525
    at org.spark_project.guava.base.Preconditions.checkArgument(Preconditions.java:119)

Reason

There …

Spark Issue: InvalidInputException for Some Hive Data Partitions

Symptom

15/12/29 17:22:27 ERROR yarn.ApplicationMaster: User class …

Tips on Selenium

Tips and Traps

  1. Selenium IDE is very useful. You can use it to record (test) actions and then export them as (testing) code in various programming languages (e.g., Python).

    [Screenshot: Selenium IDE menu]

    [Screenshot: Selenium IDE menu, export]

Examples …

Spark Issue: a Master URL Must Be Set in Your Configuration

Error Message

Error initializing SparkContext: A master URL must be set in your configuration.

Possible Causes

The master URL of the Spark cluster is not specified.

Solutions

Add .master("yarn") into the following …