Construct Simple Spark DataFrames Using Seq
Seq.toDF
toDF() provides a concise syntax for creating DataFrames and can be accessed after importing Spark implicits.
import spark.implicits._
SparkSession.createDataFrame
Parse Arguments in Bash
Reshape a pandas DataFrame
Reshape DataFrame
Logging in PySpark
Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!
- Excessive logging is better than no logging! This is generally true in distributed big data applications.
- Use loguru if it is available. If you have to use the logging module, be …
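As a minimal sketch of the "use loguru if it is available" advice, the snippet below tries to import loguru and falls back to the standard logging module when the package is missing. The logger name `pyspark_job`, the log format, and the helper `log_stage` are hypothetical and only for illustration; the original note does not prescribe them.

```python
import logging

try:
    # loguru is a third-party package and may not be installed everywhere.
    from loguru import logger
    USING_LOGURU = True
except ImportError:
    # Fallback: configure the standard library logger instead.
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s | %(levelname)s | %(name)s | %(message)s",
    )
    logger = logging.getLogger("pyspark_job")  # hypothetical logger name
    USING_LOGURU = False


def log_stage(stage: str, n_rows: int) -> str:
    """Log a progress message for a job stage and return it (hypothetical helper)."""
    message = f"Finished stage '{stage}' with {n_rows} rows"
    logger.info(message)  # same call works for loguru and logging
    return message


if __name__ == "__main__":
    log_stage("load", 3)
```

Both loggers expose an `info` method, so the calling code does not need to know which backend is in use.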
Markdown vs reStructuredText vs MyST for Documentation
Comparison
- Compared to Markdown, reStructuredText is more fully featured, much more standardized and uniform, and has built-in support for extensions. However, reStructuredText is also criticized for its complex and confusing syntax. The …