Ben Chuanlong Du's Blog

It is never too late to learn.

Handling Categorical Variables in Machine Learning

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

Categorical variables are very common in machine learning projects. At a high level, there are two ways to handle a categorical variable.

  1. Drop a categorical variable if a categorical variable …

Tips on LightGBM

  1. It is strongly suggested that you load data into a pandas DataFrame and handle categorical variables by specifying a dtype of "category" for those categorical variables.

    df.cat_var = df.cat_var.astype …
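The dtype conversion described in the tip above can be sketched as follows. This is a minimal illustration using pandas only; the toy DataFrame and the column names (`city`, `plan`, `age`) are hypothetical and not taken from the original post.

```python
import pandas as pd

# Hypothetical toy data; the column names are illustrative assumptions.
df = pd.DataFrame({
    "city": ["NYC", "SF", "NYC", "LA"],
    "plan": ["free", "pro", "pro", "free"],
    "age": [31, 45, 27, 52],
})

# Columns we choose to treat as categorical (an assumption for this sketch).
cat_cols = ["city", "plan"]

# Convert each chosen column to the pandas "category" dtype.
for col in cat_cols:
    df[col] = df[col].astype("category")

# Numeric columns are left untouched.
print(df.dtypes)
```

When such a DataFrame is passed to the LightGBM Python package, the default `categorical_feature="auto"` setting should pick up the pandas categorical columns without further configuration; check the documentation of your installed version to confirm.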

Training Deep Neural Networks

Rules of Thumb for Training

https://arxiv.org/pdf/1206.5533.pdf

https://towardsdatascience.com/17-rules-of-thumb-for-building-a-neural-network-93356f9930af

https://hackernoon.com/rules-of-thumb-for-deep-learning-5a3b6d4b0138

https://stats.stackexchange.com/questions/181/how-to-choose-the-number-of-hidden-layers-and-nodes-in-a-feedforward-neural-netw

Batch size affects both …

Learning to Rank

https://www.kaggle.com/c/home-credit-default-risk/discussion/61613

https://studylib.net/doc/18339870/yetirank--everybody-lies

http://proceedings.mlr.press/v14/gulin11a/gulin11a.pdf

Model | Architecture | Ranking Category | SOTA | Comments | Paper
RankNet | NN | …

Visualization for AI Concepts

Tools for (approximately) visualizing the architectures of existing neural networks or for visualizing the training process (training/validation loss/accuracy, activations, etc.) are extremely helpful! TensorBoard is one of the best …