Ben Chuanlong Du's Blog

It is never too late to learn.

Get Size of Tables on HDFS

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

The HDFS Way

You can use the hdfs dfs -du /path/to/table command or the hdfs dfs -count -q -v -h /path/to/table command to get the size of an HDFS path (for example, the directory that backs a table). However, both commands work only if you have direct access to HDFS on the cluster. If a Spark cluster exposes only JDBC/ODBC APIs, this method does not work.
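The command above can also be wrapped in a script. The following is a minimal sketch, not a definitive implementation: the helper names parse_du_output and table_size are made up for illustration, and the parser assumes that each output line of hdfs dfs -du starts with the size in bytes and ends with the path (the number of columns in between differs across Hadoop versions). Paths containing spaces are not handled.

```python
import subprocess


def parse_du_output(text):
    """Parse the text printed by `hdfs dfs -du` into {path: size_in_bytes}.

    Assumption: the first column is the size in bytes and the last column
    is the path. Lines that do not match this shape are skipped.
    """
    sizes = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2 or not fields[0].isdigit():
            continue
        sizes[fields[-1]] = int(fields[0])
    return sizes


def table_size(path):
    """Return the total size in bytes of an HDFS path (hypothetical helper).

    Requires the `hdfs` command-line client on the PATH and direct access
    to HDFS. The -s flag asks for a single summary line for the path.
    """
    result = subprocess.run(
        ["hdfs", "dfs", "-du", "-s", path],
        check=True,
        capture_output=True,
        text=True,
    )
    return sum(parse_du_output(result.stdout).values())
```

Calling table_size("/path/to/table") would then return the summarized size as an integer, provided the hdfs client is installed and configured for the cluster.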

Tips on NetworkX

Comparison of Collections in C++ and Java

Plain Old Array

  1. The length/size of an array is fixed by its declaration. Each element of the array is initialized to the default value (null for objects).

  2. Array in Java does …

Alternatives to Docker Containers

  1. LXD and Multipass are alternatives to Docker containers. Docker is more lightweight than LXD, which in turn is more lightweight than Multipass (Docker < LXD < Multipass).

  2. Neither Docker nor LXD requires a CPU which …

Tips on Omegaconf

  1. omegaconf can parse command-line options too. However, unlike argparse, it does not enforce any constraints (such as types or allowed values) on command-line options.
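To make the contrast concrete, here is a small sketch using only the standard library. The argparse part shows the kind of constraints meant above (a type and a set of allowed values). The parse_unchecked function is a hypothetical stand-in for constraint-free key=value parsing; it is not omegaconf's API and only illustrates that nothing is validated. The option names epochs and mode are made up for the example.

```python
import argparse

# argparse: every option declares constraints that are checked while parsing.
parser = argparse.ArgumentParser()
parser.add_argument("--epochs", type=int, required=True)
parser.add_argument("--mode", choices=["train", "eval"], default="train")


def parse_checked(argv):
    """Parse with argparse; invalid input makes argparse exit with an error."""
    return parser.parse_args(argv)


def parse_unchecked(argv):
    """Hypothetical stand-in for constraint-free key=value parsing.

    Any key and any value are accepted as-is; no type or choice is checked.
    """
    return dict(arg.split("=", 1) for arg in argv)
```

With these definitions, parse_checked(["--epochs", "three"]) is rejected by argparse, while parse_unchecked(["epochs=three"]) accepts the value without complaint, so any validation has to happen elsewhere in the program.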

Tips on Redash

Creating a new query runner (data source)

https://discuss.redash.io/t/creating-a-new-query-runner-data-source-in-redash/347