Tips and Traps
The broadcast concept in numpy is essentially a way to "virtually" duplicate the data in a numpy array so that it is "virtually" reshaped to be compatible with another numpy array for a given operation. Do not confuse it with the broadcast concept in Spark, which sends a full copy of a (small) DataFrame to each worker node for a broadcast join.
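As a minimal sketch of the numpy side of this, the snippet below adds a 1-D array to a 2-D array and then uses `np.broadcast_to` to make the "virtual" duplication visible. The array names and values (`a`, `b`, `total`, `b_view`) are illustrative choices, not taken from the original text.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)  # shape (2, 3): [[0, 1, 2], [3, 4, 5]]
b = np.array([10, 20, 30])      # shape (3,)

# b is treated as if it had shape (2, 3); its data is not copied.
total = a + b

# np.broadcast_to returns a read-only view with the broadcast shape.
# The stride along the repeated axis is 0, so every "row" points at
# the same underlying memory as b.
b_view = np.broadcast_to(b, a.shape)

print(total)
print(b_view.shape, b_view.strides[0], np.shares_memory(b, b_view))
```

The zero stride is what makes the duplication "virtual": the rows of `b_view` are the same memory read repeatedly, which is also why the view is read-only.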