AQE (Adaptive Query Execution)
To enable AQE, set `spark.sql.adaptive.enabled` to `true` (it is already enabled by default since Spark 3.2.0). You can do this either with `--conf spark.sql.adaptive.enabled=true` in spark-submit, or with `spark.conf.set("spark.sql.adaptive.enabled", "true")` in Spark/PySpark code.
Pandas UDFs
Pandas UDFs are user-defined functions that Spark executes using Apache Arrow to transfer the data and pandas to process it, which allows vectorized operations.
A Pandas UDF is defined using pandas_udf