Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!
https://build-system.fman.io/
https://github.com/mherrmann/fbs-tutorial
To enable AQE (Adaptive Query Execution), set `spark.sql.adaptive.enabled` to `true`, either by passing `--conf spark.sql.adaptive.enabled=true` to `spark-submit` or by calling `spark.conf.set("spark.sql.adaptive.enabled", "true")` in Spark/PySpark code.
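Both ways of switching AQE on can be sketched in a few lines of Python. This is only an illustration: the helper names `spark_submit_command` and `enable_aqe` and the application file `app.py` are made up for this note. `enable_aqe` assumes it is handed an existing `SparkSession`.

```python
import shlex

AQE_KEY = "spark.sql.adaptive.enabled"


def spark_submit_command(app, conf=None):
    """Build a spark-submit argument list with one --conf flag per setting.

    Hypothetical helper for illustration; `app` is the application to submit.
    """
    args = ["spark-submit"]
    for key, value in (conf or {}).items():
        args += ["--conf", f"{key}={value}"]
    args.append(app)
    return args


def enable_aqe(spark):
    """Turn AQE on for an already running session via its runtime conf."""
    spark.conf.set(AQE_KEY, "true")


# "app.py" is a placeholder application name.
cmd = spark_submit_command("app.py", {AQE_KEY: "true"})
print(shlex.join(cmd))
```

In PySpark code, `enable_aqe(spark)` would be called right after the session is created, before running the queries that should benefit from AQE.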
Pandas UDFs are user-defined functions that Spark executes using Arrow to transfer data to pandas, which allows vectorized operations on the data.
A Pandas UDF is defined using `pandas_udf`.
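A minimal sketch of what such a UDF might look like, assuming pandas is installed and, for the Spark part, PySpark with PyArrow. The function `celsius_to_fahrenheit` and the wrapper `as_pandas_udf` are hypothetical names chosen for this example. The function works on a whole `pandas.Series` at a time, which is what makes the operation vectorized.

```python
import pandas as pd


def celsius_to_fahrenheit(c: pd.Series) -> pd.Series:
    """Vectorized conversion: operates on a whole batch of values at once."""
    return c * 9.0 / 5.0 + 32.0


def as_pandas_udf():
    """Wrap the function as a Pandas UDF (hypothetical helper).

    PySpark is imported lazily so the plain function above can be tried
    without Spark; calling this needs pyspark and pyarrow installed.
    """
    from pyspark.sql.functions import pandas_udf

    return pandas_udf(celsius_to_fahrenheit, returnType="double")


# The plain function can be checked on a pandas Series directly.
print(celsius_to_fahrenheit(pd.Series([0.0, 100.0])).tolist())
```

With a DataFrame `df` that has a numeric column `c` (also an assumption), the UDF would be applied as `df.withColumn("f", as_pandas_udf()("c"))`.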
A child process forked from a shell does not inherit the shell's variables by default. The `export` command marks a variable as an environment variable, so that it is passed on to every child process forked afterwards; a child process therefore inherits all marked variables.
export
unset
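The effect of `export` and `unset` can be observed with a small experiment, sketched here in Python by driving a POSIX `sh` through `subprocess`. The variable names `demo_plain` and `demo_marked` are made up for this note, and the sketch assumes that neither is already set in the environment.

```python
import subprocess

# The outer shell sets one plain variable and one exported variable,
# then starts child shells that try to read them.
script = """
demo_plain=1
export demo_marked=2
sh -c 'echo "demo_plain=[$demo_plain] demo_marked=[$demo_marked]"'
unset demo_marked
sh -c 'echo "after unset: demo_marked=[$demo_marked]"'
"""

proc = subprocess.run(
    ["sh", "-c", script], capture_output=True, text=True, check=True
)
lines = proc.stdout.strip().splitlines()
for line in lines:
    print(line)
```

The first child sees only the exported variable; after `unset`, later children no longer see it either.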
explainshell.com is a great place for learning shell.
Bash-it/bash-it is a great community-driven Bash framework.
It is suggested that you avoid writing complicated Bash scripts. IPython is a much better alternative.
Do NOT use `;` to delimit paths passed to a shell command because `;`