Builtin Objects Python
Python has built-in functions and objects that users can use directly (no need to import). However, if you import another module that hides (shadows) a built-in function or object, the bare name no longer refers to the built-in. For example, sum is a built-in function in Python that can be used directly. However, if you import the PySpark SQL functions (from pyspark.sql.functions import *
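The shadowing effect can be sketched with the standard library alone. This is only an illustration under an assumption: it uses math.pow in place of the PySpark functions so that it runs without Spark installed, and the variable names are made up for the example. The built-in stays reachable through the builtins module even after the bare name has been rebound.

```python
import builtins
from math import pow  # rebinds the bare name `pow` to math.pow

# The bare name now points at math.pow, not the built-in.
is_shadowed = pow is not builtins.pow

# The built-in is still available through the `builtins` module.
builtin_result = builtins.pow(2, 3)   # integer arithmetic
shadowed_result = pow(2, 3)           # math.pow always returns a float

print(is_shadowed, builtin_result, shadowed_result)
```

The same pattern (builtins.sum, or importing the module under an alias instead of using a star import) should apply to any module that exports names clashing with built-ins.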
Python Virtual Environment
Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!
Cut and qcut in pandas DataFrame
Aggregation in pandas DataFrame
Comment
The order of elements within each group is preserved (it follows the original order).
groupby works exactly the same on the index if the index is named.
The order of columns in groupby matters if you want to unstack the results later.
groupby works on columns too, and it can group by a level of a MultiIndex.
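A minimal sketch of these points, assuming pandas is installed. The DataFrame and its column names (team, year, score) are invented for illustration, and grouping along the column axis is not shown.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "team": ["b", "a", "b", "a"],
        "year": [2021, 2021, 2020, 2020],
        "score": [1, 2, 3, 4],
    }
)

# Rows inside each group keep their original relative order.
b_scores = df.groupby("team")["score"].apply(list)["b"]

# Grouping by a named index behaves like grouping by a column.
by_column = df.groupby("team")["score"].sum()
by_index = df.set_index("team").groupby("team")["score"].sum()

# Grouping by one level of a MultiIndex.
multi = df.set_index(["team", "year"])
by_level = multi.groupby(level="year")["score"].sum()

# The key order decides the index levels; unstack() moves the last
# one ("year" here) into the columns.
wide = df.groupby(["team", "year"])["score"].sum().unstack()
print(wide)
```

Swapping the keys to ["year", "team"] would put team in the columns instead, which is why the order matters when the result is unstacked.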
Hands on the Python module dask
Installation
- You have to install the complete version of Dask (using the command pip3 install dask[complete]) if you need support for extended memory (for handling big data) and schedulers (for performance). The default installation (pip3 install dask) does not include those features out of the box.
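To see which variant is present in an environment, one option is to check whether the relevant top-level packages can be found. This is a rough sketch: the helper name is_installed is hypothetical, and it assumes that the distributed package (which provides the distributed scheduler) is a reasonable marker for the complete installation.

```python
from importlib.util import find_spec


def is_installed(name):
    """Return True if a top-level package can be located, without importing it."""
    return find_spec(name) is not None


# "dask" is the core package; "distributed" ships the distributed scheduler.
status = {name: is_installed(name) for name in ("dask", "distributed")}
print(status)
```

If dask is found but distributed is not, the environment most likely has only the default installation.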