Ben Chuanlong Du's Blog

It is never too late to learn.

Hadoop Filesystem Tips

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

Tips and Traps

  1. Never use the -skipTrash option unless you are absolutely sure of what you are doing. I made mistakes a couple of times in …

Process Big Data Using Spark

General Tips

  1. Please refer to Spark SQL for tips specific to Spark SQL.

  2. It is almost always a good idea to filter out null values in the joining columns before joining …

General Tips on Programming

  1. Do NOT chase the latest versions of libraries/software/tools. Wait for some time for them to be tested thoroughly before adopting them.

  2. Follow Semantic Versioning if you release …
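
To make the Semantic Versioning tip concrete, below is a small Python sketch of the MAJOR.MINOR.PATCH scheme. The Version class and its parse and bump methods are hypothetical helpers written for this note, not part of any library, and the sketch deliberately ignores pre-release and build-metadata suffixes.

```python
from typing import NamedTuple


class Version(NamedTuple):
    """Core MAJOR.MINOR.PATCH version; no pre-release or build metadata."""

    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text: str) -> "Version":
        # Expects exactly three dot-separated integers, e.g. "1.4.2".
        major, minor, patch = (int(part) for part in text.split("."))
        return cls(major, minor, patch)

    def bump(self, change: str) -> "Version":
        # major: incompatible API change; minor: backward-compatible feature;
        # patch: backward-compatible bug fix. Lower fields reset to zero.
        if change == "major":
            return Version(self.major + 1, 0, 0)
        if change == "minor":
            return Version(self.major, self.minor + 1, 0)
        if change == "patch":
            return Version(self.major, self.minor, self.patch + 1)
        raise ValueError(f"unknown change type: {change!r}")

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"


current = Version.parse("1.4.2")
print(current.bump("minor"))  # a new backward-compatible feature
print(Version.parse("1.10.0") > Version.parse("1.9.3"))  # numeric, not string, order
```

Because the fields are compared as integers from left to right, 1.10.0 sorts after 1.9.3, which a plain string comparison would get wrong.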