Ben Chuanlong Du's Blog

It is never too late to learn.

Hadoop Filesystem Tips

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

Tips and Traps

  1. Never use the -skipTrash option unless you are absolutely sure of what you are doing, because files deleted with it bypass the trash and cannot be recovered from it. I made mistakes a couple of times in …

Hive SQL

  1. Hive is case-insensitive for both keywords and function names.

  2. You can use either double or single quotes for strings.

  3. Use = rather than == for equality comparison, although == also seems to work.

  4. use % rather …

Check Whether a File Exists in Spark

org.apache.hadoop.fs.FileSystem

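// Reuse the Hadoop configuration that the SparkContext (sc) already holds.
val conf = sc.hadoopConfiguration
// Get the FileSystem that this configuration points to (for example, HDFS).
val fs = org.apache.hadoop.fs.FileSystem.get(conf)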
val exists = fs.exists(new org.apache.hadoop.fs.Path("/path/on/hdfs …
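The same check can be tried outside of a Spark shell. The following is only a minimal sketch under a few assumptions: the Hadoop client libraries are on the classpath, the object name FileExistsSketch and the helper pathExists are made up for illustration, and the default location "/tmp" is a placeholder rather than a path from this post. With an empty Configuration, FileSystem.get usually resolves to the local file system; inside Spark you would pass sc.hadoopConfiguration instead.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Hypothetical object name, used only for this illustration.
object FileExistsSketch {

  // Returns true if `location` exists on the given file system.
  def pathExists(fs: FileSystem, location: String): Boolean =
    fs.exists(new Path(location))

  def main(args: Array[String]): Unit = {
    // An empty Configuration normally falls back to the local file system
    // unless a core-site.xml on the classpath sets fs.defaultFS.
    val conf = new Configuration()
    val fs = FileSystem.get(conf)

    // "/tmp" is a placeholder; pass the path to check as the first argument.
    val location = args.headOption.getOrElse("/tmp")
    println(s"$location exists: ${pathExists(fs, location)}")
  }
}
```

Taking the FileSystem as a parameter keeps the helper independent of where the configuration comes from, so the same function works with a SparkContext's configuration or with a plain local one.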