Ben Chuanlong Du's Blog

`ifelse` on Pandas Series

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

Series.apply + Lambda Function
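A minimal sketch of this approach, assuming a small numeric Series; the names `s` and `labels` and the "pos"/"neg" values are made up for illustration. The lambda runs once per element, so this is a Python-level loop rather than a vectorized operation.

```python
import pandas as pd

# Illustrative data; any Series works the same way.
s = pd.Series([1, -2, 3, -4])

# Evaluate the condition element by element with a conditional expression.
labels = s.apply(lambda x: "pos" if x > 0 else "neg")
print(labels.tolist())
```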

DataFrame.apply + Lambda Function

axis=1: apply the lambda function on each row
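A minimal sketch of a row-wise ifelse, assuming a hypothetical DataFrame with columns `x` and `y`. With axis=1 each row is passed to the lambda as a Series, so columns can be looked up by name.

```python
import pandas as pd

# Hypothetical two-column frame used only for illustration.
df = pd.DataFrame({"x": [1, 5, 3], "y": [4, 2, 3]})

# axis=1 passes one row at a time; pick the name of the strictly larger column.
df["larger"] = df.apply(lambda row: "x" if row["x"] > row["y"] else "y", axis=1)
print(df)
```

This is convenient when the condition involves several columns, but it is usually the slowest of the options on this page because a Series is built for every row.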

List Comprehension
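A minimal sketch of the same row-wise condition written as a list comprehension, again assuming hypothetical columns `x` and `y`. Zipping the columns avoids building a Series per row.

```python
import pandas as pd

# Hypothetical two-column frame used only for illustration.
df = pd.DataFrame({"x": [1, 5, 3], "y": [4, 2, 3]})

# Iterate over the two columns in lockstep and evaluate the condition per pair.
df["larger"] = ["x" if x > y else "y" for x, y in zip(df["x"], df["y"])]
print(df)
```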

numpy.where

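numpy.where is a vectorized ifelse: given a boolean condition, it picks the first value where the condition is true and the second value elsewhere.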
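A minimal sketch, assuming the same kind of illustrative Series as above. Note that numpy.where returns a numpy array rather than a Series, so the index is not carried over unless you wrap the result yourself.

```python
import numpy as np
import pandas as pd

# Illustrative data.
s = pd.Series([1, -2, 3, -4])

# The condition is evaluated for the whole Series at once.
labels = np.where(s > 0, "pos", "neg")

# The two branches can also be arrays or Series, not only scalars.
clipped = np.where(s > 0, s, 0)

# Wrap the result to get a Series aligned with the original index.
labels_series = pd.Series(labels, index=s.index)
print(labels, clipped)
```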

Numpy Arrays in Python

Tips and Traps

  1. The Pythonic way of checking whether a collection `coll` (string, list, set, dict, etc.) is non-empty is `if coll`. However, do NOT use `if arr` to check whether a numpy array is non-empty: the truth value of an array with more than one element is ambiguous and raises a ValueError. Instead, use `arr.size > 0`.
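The tip above can be illustrated with a short sketch; the array names are made up for the example. It shows the error raised by `if arr` on a multi-element array and the `size` check working for arrays of any shape.

```python
import numpy as np

arr = np.array([1, 2, 3])
empty_1d = np.array([])
empty_2d = np.empty((0, 3))  # zero rows, three columns

# Truth-testing a multi-element array is ambiguous and raises ValueError.
try:
    if arr:
        pass
    message = ""
except ValueError as err:
    message = str(err)
print(message)

# size is the total number of elements, so it works for any shape.
print(arr.size > 0, empty_1d.size > 0, empty_2d.size > 0)
```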

Broadcast Arrays in Numpy

Tips and Traps

  1. Broadcasting in numpy is essentially a way to "virtually" duplicate the data in an array so that it is "virtually" reshaped to be compatible with another array for a given operation; no actual copy of the data is made. Do not confuse it with the broadcast concept in Spark, which sends a full copy of a (small) DataFrame to each worker node for a broadcast join.
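A minimal sketch of the "virtual" duplication described above, using made-up arrays `a` and `b`. `np.broadcast_to` makes the stretched view explicit: it has the target shape, but a stride of 0 along the repeated axis, so it still points at the original data.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)  # shape (2, 3)
b = np.array([10, 20, 30])      # shape (3,)

# b is treated as if it were repeated for every row of a.
total = a + b

# The same stretching, made explicit as a read-only view of b.
b_view = np.broadcast_to(b, a.shape)
print(total)
print(b_view.shape, b_view.strides, np.shares_memory(b_view, b))
```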