Ben Chuanlong Du's Blog

It is never too late to learn.

Directly Initialize a Hashmap in Java

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

In Java 9+, Map.of directly initializes an immutable Map with up to 10 key-value pairs. The result is not a HashMap; wrap it in new HashMap<>(...) if a mutable HashMap is needed.
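
A minimal sketch of this kind of direct initialization, assuming the Java 9+ factory method Map.of, whose overloads accept up to 10 key-value pairs. The class name DirectMapInit, the method names, and the sample keys are hypothetical and used only for illustration. Map.of returns an immutable Map rather than a HashMap, so the sketch also shows one way to copy it into a HashMap when mutation is needed.

```java
import java.util.HashMap;
import java.util.Map;

public class DirectMapInit {
    // Map.of (Java 9+) takes alternating keys and values and returns an immutable Map.
    public static Map<String, Integer> immutableMap() {
        return Map.of("one", 1, "two", 2, "three", 3);
    }

    // Copy the immutable map into a HashMap when entries must be added or removed later.
    public static Map<String, Integer> mutableMap() {
        return new HashMap<>(immutableMap());
    }

    public static void main(String[] args) {
        Map<String, Integer> fixed = immutableMap();
        System.out.println(fixed.get("two")); // prints 2

        Map<String, Integer> editable = mutableMap();
        editable.put("four", 4); // allowed, because this is a HashMap copy
        System.out.println(editable.size()); // prints 4
    }
}
```

Calling put on the map returned by immutableMap() throws UnsupportedOperationException, which is why the copy is needed before any modification.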

Get Size of Tables on HDFS

The HDFS Way

You can use hdfs dfs -du /path/to/table or hdfs dfs -count -q -v -h /path/to/table to get the size of an HDFS path (or table). However, this only works if you have direct access to HDFS. If a Spark cluster exposes only JDBC/ODBC APIs, this method does not work.

Read/Write Files/Tables in Spark

References

DataFrameReader APIs

DataFrameWriter APIs

https://spark.apache.org/docs/latest/sql-programming-guide.html#data-sources

Comments

  1. It is suggested that you specify a schema when reading text files. If a schema is not specified when reading text files, it is good practice to check the types of columns (as the types are inferred).

  2. Do NOT read data from and write data to the same path in Spark! Because Spark evaluates lazily, the path will likely be cleared before the data is read into Spark, which throws IO exceptions. The worst part is that your data on HDFS is removed and may not be recoverable.

Convert Math Formula and Table To LaTeX

R

  1. xtable{xtable}

    • Good for converting tables to LaTeX code.

  2. latex{Hmisc}

    • Convert R objects (not just tables) to LaTeX code.

Excel

MATLAB

Mathematica

  1. Type in the formula in Mathematica.

  2. Select the formula.

  3. Right-click the selection, and then select "Copy as" -> "LaTeX".

  4. You can also convert formulas to other …