Tips on FeatureTools

Jan 18, 2020

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

Tool Review: Lessons learned from using FeatureTools to simplify the process of Feature Engineering

Predicting the rating a reviewer will give a restaurant using Featuretools and the nlp-primitives library

Natural Language …

Preparing Data for AI

Mar 17, 2020

General Tips

When you label individual images, it is better to use numerical labels (even though text labels are easier to understand) so that you can avoid mapping between numbers (use …

Loss Functions for Machine Learning Models

Mar 07, 2013

Tips and Traps

A loss function is usually non-negative (e.g., squared error, or cross entropy on discrete labels). If you get a negative loss when training a model with such a loss, there is most likely something wrong with the code. For example, maybe you …

Entropy

Apr 22, 2013

Entropy
Shannon Entropy
Cross Entropy
K-L divergence

Tips

The entropy concept was first introduced for discrete distributions (called Shannon entropy), which is defined as
$$H(X) = E …

Rule-based Image Processing

Apr 07, 2020

If you face a relatively simple image recognition problem that has not been studied before, so that no public data is available for it, it is probably less effort …

Handle Categorical Variables in LightGBM

Mar 26, 2022

LightGBM supports pandas columns of the category dtype. In fact, this is the suggested way of handling categorical columns in LightGBM.

data[feature] = data[feature].astype("category")

A LightGBM model (which is a Booster object) records categories of each categorical feature. This information is used to set categories of each categorical feature during prediction, which ensures that a LightGBM model can always handle categorical features correctly.
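Why recorded categories matter can be illustrated with pandas alone. The sketch below is not LightGBM's internal code; it only mimics the idea under the assumption that categories are stored at training time and reapplied at prediction time. The column name `color` and all data values are hypothetical.

```python
import pandas as pd

# Hypothetical training data with one categorical feature.
train = pd.DataFrame({"color": ["red", "green", "blue", "green"]})
train["color"] = train["color"].astype("category")

# Record the categories seen at training time (sorted: blue, green, red).
train_categories = train["color"].cat.categories

# Hypothetical prediction-time data: different values, including an unseen one.
new = pd.DataFrame({"color": ["green", "red", "purple"]})

# Naive conversion infers categories from the new data alone,
# so the integer codes no longer match the training codes.
naive_codes = new["color"].astype("category").cat.codes.tolist()

# Reusing the recorded categories keeps the codes consistent with training;
# values not seen in training become missing (code -1).
aligned_dtype = pd.CategoricalDtype(categories=train_categories)
aligned_codes = new["color"].astype(aligned_dtype).cat.codes.tolist()

print(naive_codes)    # [0, 2, 1]
print(aligned_codes)  # [1, 2, -1]
```

With the naive conversion, "green" gets code 0 in the new data but code 1 in the training data. Reapplying the training categories removes that mismatch.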

Ben Chuanlong Du's Blog

It is never too late to learn.
