Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!
https://build-system.fman.io/
https://github.com/mherrmann/fbs-tutorial
To enable AQE,
you have to set spark.sql.adaptive.enabled
to true
(using `--conf spark.sql.adaptive.enabled=true`
with `spark-submit`,
or using `spark.conf.set("spark.sql.adaptive.enabled", "true")` in Spark/PySpark code).
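Below is a minimal sketch of both approaches, assuming Spark 3.x and PySpark; the application name is illustrative.

```python
# A minimal sketch: enabling AQE when building a SparkSession (assumes Spark 3.x).
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("aqe-demo")  # hypothetical application name
    .config("spark.sql.adaptive.enabled", "true")
    .getOrCreate()
)

# Or toggle it on an existing session at runtime:
spark.conf.set("spark.sql.adaptive.enabled", "true")
```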
Pandas UDFs are user-defined functions
that Spark executes using Apache Arrow
to transfer data and pandas to work with the data,
which allows vectorized operations.
A Pandas UDF is defined using the `pandas_udf` decorator.
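Below is a minimal sketch of a Series-to-Series Pandas UDF, assuming Spark 3.x (which supports type hints); the column name `x` and the function `plus_one` are illustrative.

```python
# A minimal sketch of a Series-to-Series Pandas UDF (assumes Spark 3.x and pyarrow installed).
import pandas as pd
from pyspark.sql import SparkSession
from pyspark.sql.functions import pandas_udf
from pyspark.sql.types import DoubleType

spark = SparkSession.builder.getOrCreate()


@pandas_udf(DoubleType())
def plus_one(s: pd.Series) -> pd.Series:
    # The whole batch arrives as a pandas Series, so the operation is vectorized.
    return s + 1.0


df = spark.createDataFrame([(1.0,), (2.0,), (3.0,)], ["x"])
df.select(plus_one("x").alias("x_plus_one")).show()
```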
https://github.com/apache/arrow
https://stackoverflow.com/questions/54582073/sharing-objects-across-workers-using-pyarrow
https://github.com/pytorch/pytorch/issues/13039
https://issues.apache.org/jira/browse/ARROW-5130
https://uwekorn.com/2019/09/15/how-we-build-apache-arrows-manylinux-wheels.html