Ben Chuanlong Du's Blog

Handle Categorical Variables in LightGBM

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

LightGBM supports pandas columns of the category dtype. In fact, this is the suggested way of handling categorical columns in LightGBM.

data[feature] = pd.Series(data[feature], dtype="category")

A LightGBM model (which is a Booster object) records the categories of each categorical feature seen during training. At prediction time, these recorded categories are applied to the corresponding categorical columns of the input, so each value maps to the same internal code as it did in training, even if the input column lists its categories in a different order.
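To see why re-applying the training categories matters, here is a pandas-only illustration. It is not LightGBM's internal code, just a sketch of the same idea using `Series.cat.set_categories`; the values are made up. The same value can get different integer codes in different frames until the categories are aligned.

```python
import pandas as pd

# Categories as they might be recorded at training time.
train_col = pd.Series(["blue", "green", "red", "green"], dtype="category")
train_categories = list(train_col.cat.categories)  # ['blue', 'green', 'red']

# New data infers its own categories: ['blue', 'red', 'yellow'].
new_col = pd.Series(["red", "yellow", "blue"], dtype="category")
codes_before = list(new_col.cat.codes)  # [1, 2, 0]: "red" is 1 here, but 2 in training

# Align the new column to the training categories.
aligned = new_col.cat.set_categories(train_categories)
codes_after = list(aligned.cat.codes)  # [2, -1, 0]: "red" is 2 again; "yellow" is unseen
```

After alignment, "red" maps to the same code as in training, and the unseen value "yellow" becomes missing (code -1), which a model would then presumably handle as a missing value rather than as a wrong category.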