Tips and Traps
Bucketed columns are only supported in Hive tables at this time.
A Hive table can have both partition and bucket columns.
Suppose `t1` and `t2` are 2 bucketed tables with `b1` and `b2` buckets, respectively. For bucket optimization to kick in when joining them:
- The 2 tables must be bucketed on the same keys/columns.
- The join must be on the bucket keys/columns.
- `b1` is a multiple of `b2`, or `b2` is a multiple of `b1`.
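The three conditions above can be written down as a small check. This is only a sketch in plain Python, not Spark's actual planner logic: the function name `can_use_bucket_join` and its parameters are hypothetical, and it assumes that the order of bucket columns matters while the order of join keys does not.

```python
from typing import Sequence


def can_use_bucket_join(
    bucket_keys1: Sequence[str],
    b1: int,
    bucket_keys2: Sequence[str],
    b2: int,
    join_keys: Sequence[str],
) -> bool:
    """Return True if the three conditions from the notes above all hold.

    Hypothetical helper for illustration only; it does not query Spark or Hive.
    """
    # Condition 1: both tables are bucketed on the same keys/columns
    # (assumption: column order is significant).
    same_keys = list(bucket_keys1) == list(bucket_keys2)
    # Condition 2: the join is on the bucket keys/columns
    # (assumption: join key order is not significant).
    joins_on_keys = set(join_keys) == set(bucket_keys1)
    # Condition 3: one bucket count is a multiple of the other.
    multiple = b1 > 0 and b2 > 0 and (b1 % b2 == 0 or b2 % b1 == 0)
    return same_keys and joins_on_keys and multiple
```

For example, tables bucketed on `id` into 8 and 4 buckets pass the check when joined on `id`, while 8 and 3 buckets do not.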
A Text File Is Marked as a Binary File
Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!
The issue can be fixed by stripping null characters using the following command.
tr -d '\000' < filein > fileout
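The same cleanup can be sketched in Python for cases where `tr` is not available. The function name `strip_null_bytes` is hypothetical, and the sketch assumes the file is small enough to read into memory at once.

```python
from pathlib import Path


def strip_null_bytes(src: str, dst: str) -> int:
    """Copy src to dst without null bytes; return how many were removed.

    Hypothetical helper; reads the whole file into memory.
    """
    data = Path(src).read_bytes()
    # Drop every 0x00 byte, mirroring `tr -d '\000'`.
    cleaned = data.replace(b"\x00", b"")
    Path(dst).write_bytes(cleaned)
    return len(data) - len(cleaned)
```

Calling `strip_null_bytes("filein", "fileout")` mirrors the shell command above and also reports the number of bytes dropped.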
Sum Type in Rust
An enum is the preferred way to construct a sum type of several types (which do not implement the same trait).
The Rust crate either provides an enum Either (with variants Left …
Spark Issue: Pure Python Code Errors
This post collects some typical pure Python errors in PySpark applications.
Symptom 1
object has no attribute
Solution 1
Fix the attribute name.
Symptom 2
No such file or directory
Solution …
Spark Issue: TypeError WithReplacement
Symptoms
TypeError: withReplacement (optional), fraction (required) and seed (optional) should be a bool, float and number; however, got [ ].
Causes
An integer number (e.g., 1) is passed to the fraction parameter …
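Given the cause described above, one likely workaround is to make sure the value is a float before it reaches `sample`. The helper `as_fraction` below is hypothetical and written in plain Python so it can be tried without a Spark session; the commented usage assumes a PySpark DataFrame named `df`.

```python
def as_fraction(value) -> float:
    """Coerce a numeric sampling fraction (e.g. the int 1) to a float.

    Hypothetical helper; it only converts the type and does not validate the range.
    """
    return float(value)


# Assumed usage with an existing PySpark DataFrame `df`:
# sampled = df.sample(withReplacement=False, fraction=as_fraction(1))
```

Writing the literal as `1.0` instead of `1` should have the same effect when the value is hard-coded.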
Cross Compile Rust