Handle Categorical Variables in LightGBM

Mar 26, 2022

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

LightGBM supports pandas columns of the category dtype. In fact, this is the suggested way of handling categorical columns in LightGBM.

data[feature] = pd.Series(data[feature], dtype="category")

A LightGBM model (which is a Booster object) records the categories of each categorical feature seen during training. At prediction time, this information is used to set the categories of each categorical feature in the input data, so that a given category maps to the same internal code as it did during training.
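To see why recording the categories matters, here is a minimal sketch in plain pandas. It is only an illustration of the idea, not LightGBM's actual internals, and the names `train`, `new` and `recorded_categories` are made up for this example. The integer code of a category depends on the full set of categories in the column, so new data has to be aligned to the training categories before the codes mean the same thing.

```python
import pandas as pd

# Training column: pandas sorts string categories, giving blue=0, green=1, red=2.
train = pd.Series(["red", "green", "blue", "green"], dtype="category")
recorded_categories = list(train.cat.categories)

# New data that happens to contain only two of the three categories.
new = pd.Series(["red", "green"], dtype="category")

# Without alignment the codes are computed from the new column alone,
# so "red" gets a different code than it had during training.
naive_codes = new.cat.codes.tolist()

# Aligning to the recorded categories restores the training-time codes.
aligned = new.cat.set_categories(recorded_categories)
aligned_codes = aligned.cat.codes.tolist()

# A value never seen in training becomes missing (code -1) after alignment.
unseen = pd.Series(["purple"], dtype="category").cat.set_categories(recorded_categories)
unseen_codes = unseen.cat.codes.tolist()

print(recorded_categories, naive_codes, aligned_codes, unseen_codes)
```

The practical upshot is that you should pass a pandas DataFrame with category columns at prediction time too, rather than encoding the categories to integers yourself.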

Tips on LightGBM

Dec 03, 2019

It is strongly suggested that you load data into a pandas DataFrame and give each categorical variable a dtype of "category".
```
df.cat_var = df.cat_var.astype …
```

LightGBM on GPU

Feb 04, 2020

https://pypi.org/project/lightgbm/#build-gpu-version

https://github.com/microsoft/LightGBM/blob/master/docs/Installation-Guide.rst#build-gpu-version

https://www.kaggle.com/vinhnguyen/gpu-acceleration-for-lightgbm

Microsoft's Example Dockerfile for GPU version of LightGBM …

Use LightGBM With Spark

Dec 05, 2019

https://github.com/Azure/mmlspark/blob/master/docs/lightgbm.md

MMLSpark seems to be the best option for training LightGBM models on a Spark cluster. Note that MMLSpark requires …

Ben Chuanlong Du's Blog

It is never too late to learn.
