Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!
Comments
There is no direct way to select all columns except a few from a table using SQL. However, this is easy to do with DataFrame APIs (pandas, Spark/PySpark, etc.).
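As a minimal sketch of the DataFrame approach, the pandas snippet below drops a few columns and keeps the rest. The table contents and column names are hypothetical and only serve as an illustration.

```python
import pandas as pd

# Hypothetical table; the column names are made up for illustration.
df = pd.DataFrame(
    {
        "id": [1, 2],
        "name": ["a", "b"],
        "created_at": ["2020-01-01", "2020-01-02"],
        "updated_at": ["2020-02-01", "2020-02-02"],
    }
)

# Columns we do NOT want in the result.
excluded = ["created_at", "updated_at"]

# Drop the unwanted columns and keep everything else.
selected = df.drop(columns=excluded)
print(list(selected.columns))

# PySpark DataFrames offer an analogous call: df.drop(*excluded)
```

`drop` returns a new DataFrame, so the original `df` is left unchanged.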
Get Size of Tables on HDFS
The HDFS Way
You can use the hdfs dfs -du /path/to/table command or hdfs dfs -count -q -v -h /path/to/table to get the size of an HDFS path (or table). However, this only works if the cluster supports HDFS. If a Spark cluster exposes only JDBC/ODBC APIs, this method does not work.
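The hdfs dfs -du command can also be driven from a script. The sketch below is one possible wrapper, not a definitive implementation: it assumes the hdfs client is on the PATH, that each output line starts with the size in bytes and ends with the path (which holds for both the two-column and three-column output layouts), and that paths contain no spaces. The function names parse_du_output and table_size are hypothetical.

```python
import subprocess


def parse_du_output(output: str) -> dict:
    """Map each path to its size in bytes, given text printed by `hdfs dfs -du`."""
    sizes = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        # Assumption: first field is the size in bytes, last field is the path.
        sizes[fields[-1]] = int(fields[0])
    return sizes


def table_size(path: str) -> int:
    """Total size in bytes of everything directly under `path` on HDFS."""
    result = subprocess.run(
        ["hdfs", "dfs", "-du", path],
        check=True,
        capture_output=True,
        text=True,
    )
    return sum(parse_du_output(result.stdout).values())
```

Calling table_size("/path/to/table") requires a working HDFS client, so it fails on clusters that expose only JDBC/ODBC, as noted above.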
Tips on NetworkX
Alternatives to Docker Containers
- LXD and Multipass are alternatives to Docker containers. Docker is more lightweight than LXD, which is more lightweight than Multipass (Docker < LXD < Multipass).
- Neither Docker nor LXD requires a CPU which …
Tips on Omegaconf
omegaconf can parse command-line options too. However, unlike argparse, it does not enforce any constraint on command-line options.
Tips on Redash
Creating a new query runner (data source)
https://discuss.redash.io/t/creating-a-new-query-runner-data-source-in-redash/347