Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!
The HDFS Way¶
You can use the hdfs dfs -du /path/to/table
command
or hdfs dfs -count -q -v -h /path/to/table
to get the size of an HDFS path (or table).
However,
this only works if the cluster supports HDFS.
If a Spark cluster exposes only JDBC/ODBC APIs,
this method does not work.
Access Control in Spark SQL
Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!
Grant Permission to Users
GRANT
priv_type [, priv_type ] ...
ON database_table_or_view_name
TO principal_specification [, principal_specification] ...
[WITH GRANT OPTION];
Examples:
GRANT SELECT ON table1 TO USER user1;
GRANT SELECT ON DATABASE db1 TO USER user1 …
SQL Translation Tools
Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!
The ORM library SQLAlchemy
can be leverage
SQL Translation Tools
Hands on SQLite3
Tips and Traps¶
Do NOT use SQLite3 on network filesystems (NFS, SAMBA, etc).
.schema
show create statement of a tableYou can force query to keep the original order of rows by applying
order by rowid
.SQLite3 supports full-text search by the FTS5 extension (since 3.9.0). It is suggested that you use the
porter
Use Column Alias in SQL
Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!
-
Logically any
SELECT
is processed in following order:- from
- where
- group by
- having
- olap functions
- qualify
- select
- sample
- order by
Besides the proprietary
QUALIFY/SAMPLE
every DBMS will do it exactly …
Volatile CTE and Subqueries in SQL
Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!
-
A volatile table persistents in the duration of the connection that creates it while a CTE is only accessible by the query following it. That is the scope of CTE is …