Date and Time in Python pandas
Date/time utilities in the pandas module are more flexible and powerful than those in the datetime module.
Prefer the pandas date/time utilities when you work with a pandas DataFrame or Series.
pandas.to_datetime works on an iterable object and handles missing values and nanoseconds.
pandas.Series.dt.strftime
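As a minimal sketch of the points above, the following converts a list of strings with pandas.to_datetime and formats the result with pandas.Series.dt.strftime. The sample values and variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: timestamps with nanosecond precision and one missing value.
raw = ["2021-03-01 12:30:00.123456789", None, "2021-03-02 08:00:00.000000001"]

# to_datetime accepts an iterable; the missing value becomes NaT.
ts = pd.Series(pd.to_datetime(raw))

# Nanoseconds are kept, which datetime.datetime cannot represent.
first_nanosecond = ts.iloc[0].nanosecond

# dt.strftime formats every element; the NaT entry stays missing.
formatted = ts.dt.strftime("%Y-%m-%d")
print(formatted)
```

All three strings use the same layout here, so the sketch does not depend on how pandas treats mixed formats.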
Hands on pandas.Series in Python
pandas.Series.str
The attribute pandas.Series.str can only be used with a Series of str values. If you invoke it on a Series of non-string values, you will either encounter an AttributeError (Can only use .str accessor with string values, which use np.object_ dtype in pandas) or find that it yields a Series of NaNs. If you have control of the DataFrame, the preferred way is to cast the column to str.
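The behaviour described above can be reproduced with a small sketch. The Series contents are invented for illustration.

```python
import pandas as pd

# A purely numeric Series: accessing .str raises AttributeError.
numbers = pd.Series([10, 20, 30])
raised = False
try:
    numbers.str.len()
except AttributeError:
    raised = True

# An object Series mixing strings and numbers: .str works,
# but the non-string elements come back as NaN.
mixed = pd.Series(["a", 1, 2.5], dtype=object)
mixed_upper = mixed.str.upper()

# Casting to str first makes the accessor usable on every element.
lengths = numbers.astype(str).str.len()
print(lengths)
```

Casting with astype(str) also turns missing values into the text "nan", so it is worth handling missing values before the cast.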
Cut and qcut in pandas DataFrame
Aggregation in pandas DataFrame
Comment
The order of elements within each group is preserved (it follows the original order).
groupby works exactly the same on the index if the index is named.
The order of columns in groupby matters if you want to unstack the results later.
groupby works on columns too, and it can group by some level of a MultiIndex.
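A short sketch of these comments, using a made-up DataFrame whose column names (team, year, score) are purely illustrative:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "team": ["b", "a", "b", "a"],
        "year": [2020, 2020, 2021, 2021],
        "score": [3, 1, 4, 2],
    }
)

# Rows inside a group keep their original relative order.
scores_b = df.groupby("team").get_group("b")["score"].tolist()

# Grouping by a named index gives the same result as grouping by the column.
by_column = df.groupby("team")["score"].sum()
by_index = df.set_index("team").groupby("team")["score"].sum()

# The key order decides which key ends up as columns after unstack:
# unstack moves the last grouping key into the columns.
team_rows = df.groupby(["team", "year"])["score"].sum().unstack()
year_rows = df.groupby(["year", "team"])["score"].sum().unstack()

# Grouping by one level of a MultiIndex.
multi = df.set_index(["team", "year"])
by_year = multi.groupby(level="year")["score"].sum()
print(team_rows)
```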
Hands on the Python module dask
Installation
- You have to install the complete version of Dask (using the command pip3 install dask[complete]) if you need support for extended memory (for handling big data) and schedulers (for performance). The default installation (pip3 install dask) does not include those features out of the box.
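One way to see which kind of installation is present is to look for the optional packages from Python. This is only a sketch: the helper is_available is a hypothetical name, and it assumes that the distributed package (the scheduler) is among the extras pulled in by dask[complete].

```python
import importlib.util


def is_available(module_name: str) -> bool:
    """Return True if the module can be found without importing it fully."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package of a dotted name is missing.
        return False


# Assumption: "distributed" is installed only with the complete version.
for name in ("dask", "distributed"):
    status = "installed" if is_available(name) else "missing"
    print(f"{name}: {status}")
```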