Ben Chuanlong Du's Blog

It is never too late to learn.

String Functions in Spark

Tips and Traps

  1. You can use the split function to split a delimited string into an array. It is suggested that you remove trailing separators before applying the split function. Please refer to the split section for a more detailed discussion.

  2. Some string functions (e.g., right) are available in the Spark SQL APIs but not as Spark DataFrame APIs.
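The first tip can be illustrated with a small sketch. It uses plain Python string splitting as a stand-in, on the assumption that Spark's split likewise keeps an empty trailing element when the string ends with the separator. The helper names strip_trailing and split_clean are hypothetical and not part of any Spark API.

```python
from typing import List


def strip_trailing(s: str, sep: str) -> str:
    """Remove every trailing occurrence of sep from s (hypothetical helper)."""
    while sep and s.endswith(sep):
        s = s[: -len(sep)]
    return s


def split_clean(s: str, sep: str = ",") -> List[str]:
    """Strip trailing separators first, then split on sep."""
    return strip_trailing(s, sep).split(sep)


raw = "a,b,c,"
naive = raw.split(",")      # the trailing separator produces an empty last element
cleaned = split_clean(raw)  # no empty last element

print(naive)
print(cleaned)
```

In a Spark job the same idea would be applied to the column before calling split, for example by trimming the separator from the end of the string first.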

Serialization and Caching in Python

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

functools.lru_cache

https://docs.python.org/3/library/functools.html#functools.lru_cache
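A minimal sketch of how functools.lru_cache is typically used. The function slow_square and the counter call_count are hypothetical names introduced only to make cache hits and misses visible.

```python
import functools

call_count = 0  # counts how many times the function body actually runs


@functools.lru_cache(maxsize=128)
def slow_square(x: int) -> int:
    """Hypothetical expensive function; the body runs only on a cache miss."""
    global call_count
    call_count += 1
    return x * x


slow_square(4)  # cache miss: the body runs
slow_square(4)  # cache hit: the result is served from the cache

info = slow_square.cache_info()
print(info)
print(call_count)
```

Note that lru_cache keys on the call arguments, so all arguments must be hashable. cache_clear() resets the cache if the underlying data changes.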

cachetools

https://cachetools.readthedocs.io/en/latest/

https://github.com/tkem/cachetools

diskcache sounds like a good option!

DiskCache …

Tips on Vaex

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

Object Detection Using Deep Learning

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

Concepts

Image Classification

Image Localization

Image Classification: Predict the type or class of an object in an image. Input: An image with a single object, such as a photograph. Output: A …

The Case Statement and the when Function in Spark

Tips and Traps

  1. Watch out for NaNs ...; the behavior might not be what you expect ...

  2. None can be passed to otherwise, which yields null in the DataFrame.

Column aliases and positional columns can be used in GROUP BY in Spark SQL!

Notice that the function when behaves like if-else.