Ben Chuanlong Du's Blog

It is never too late to learn.

My List of Python Modules

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

Awesome Python

Awesome Python Applications

Data Science

  1. pandas: data frame.

  2. scipy: scientific computing.

  3. numpy: multi-dimensional arrays, the foundation of pandas and deep learning packages.

  4. re: regular expressions.

File System

  1. shutil: copy, move …
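A minimal sketch of the two operations named above, using only the standard library. The file and directory names (`notes.txt`, `archive`) are made up for illustration, and everything happens inside a temporary directory so nothing on disk is touched.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # Hypothetical source file, created only for this example.
    src = root / "notes.txt"
    src.write_text("hello")

    # shutil.copy copies the file content and permission bits
    # and returns the destination path.
    copied = Path(shutil.copy(src, root / "notes_copy.txt"))

    # shutil.move relocates the file and returns the new path.
    archive = root / "archive"
    archive.mkdir()
    moved = Path(shutil.move(str(copied), str(archive / "notes_copy.txt")))

    # Record the results before the temporary directory is cleaned up.
    source_still_exists = src.exists()
    copy_exists_after_move = copied.exists()
    moved_text = moved.read_text()
```

`shutil.copy2` is the variant to reach for when file metadata such as modification times should be preserved as well.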

Python Modules for Date and Time

datetime
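As a quick reminder of what the standard `datetime` module covers on its own, here is a small sketch of date arithmetic, formatting, and parsing. The specific dates are arbitrary and chosen only for illustration.

```python
from datetime import datetime, timedelta, timezone

# A timezone-aware timestamp (UTC), picked arbitrarily.
start = datetime(2024, 1, 31, 9, 30, tzinfo=timezone.utc)

# timedelta handles fixed-length offsets such as days and hours.
later = start + timedelta(days=1, hours=2)

# ISO 8601 string for the aware datetime.
iso = later.isoformat()

# strptime parses a string with an explicit format; the result is naive.
parsed = datetime.strptime("2024-02-01 11:30", "%Y-%m-%d %H:%M")

# Subtracting two datetimes gives a timedelta.
elapsed_hours = (later - start).total_seconds() / 3600
```

`timedelta` has no notion of calendar months, which is the kind of gap the third-party packages listed on this page aim to fill.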

dateutil

Useful extensions to the standard Python datetime features

dateparser

Python parser for human-readable dates.

arrow

Better dates & times for Python.

monthdelta

Reshape Numpy Arrays

String Functions in Spark

Tips and Traps

  1. You can use the split function to split a delimited string into an array. It is suggested that you remove trailing separators before applying the split function; otherwise the result can end with an empty element. Please refer to the split section for a more detailed discussion.

  2. Some string functions (e.g., right) are available in the Spark SQL APIs but not in the Spark DataFrame APIs.
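The trailing-separator tip can be illustrated without a Spark session. The sketch below uses Python's built-in `str.split` as a stand-in, not Spark's `split` function, which is regex-based and whose handling of trailing empty strings should be checked against the Spark version in use. The helper `split_delimited` is a hypothetical name introduced only for this example.

```python
def split_delimited(text: str, sep: str = ",") -> list[str]:
    """Strip trailing separators, then split (sep is assumed non-empty)."""
    while text.endswith(sep):
        text = text[: -len(sep)]
    return text.split(sep)


raw = "a,b,c,"

# Splitting as-is leaves an empty string at the end of the list.
naive = raw.split(",")

# Stripping the trailing separator first avoids the empty element.
cleaned = split_delimited(raw)
```

Here `naive` ends with an empty string while `cleaned` does not, which is the difference the tip is about.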

Serialization and Caching in Python

functools.lru_cache

https://docs.python.org/3/library/functools.html#functools.lru_cache
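A minimal sketch of in-memory memoization with `functools.lru_cache`. The Fibonacci function is just a convenient illustration, and `maxsize=128` is an arbitrary choice.

```python
from functools import lru_cache


@lru_cache(maxsize=128)
def fib(n: int) -> int:
    """Naive recursive Fibonacci; the cache makes it linear in n."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)


result = fib(30)

# cache_info() reports hits, misses, maxsize and the current cache size.
info = fib.cache_info()

# cache_clear() is available when cached results need to be dropped:
# fib.cache_clear()
```

The cache lives in the process and keys on the call arguments, so the arguments must be hashable and nothing survives a restart.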

cachetools

https://cachetools.readthedocs.io/en/latest/

https://github.com/tkem/cachetools

diskcache sounds like a good option.

DiskCache …