Ben Chuanlong Du's Blog

It is never too late to learn.

Libraries for Gradient Boosting

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

XGBoost

https://xgboost.ai/

XGBoost Documentation

Speed Up XGBoost

https://machinelearningmastery.com/best-tune-multithreading-support-xgboost-python/

https://medium.com/data-design/xgboost-gpu-performance-on-low-end-gpu-vs-high-end-cpu-a7bc5fcd425b

xgboost GPU is fast. Very fast. As long as it fits in RAM and …

Ensemble Machine Learning Models

Prediction error reflects a trade-off between bias and variance. In statistics, we often talk about unbiased estimators (especially in linear regression). In that case we restrict the estimators/predictors to a (small) class, namely the linear unbiased ones, and find the optimal solution within this class (called BLUE, the best linear unbiased estimator, or BLUP, the best linear unbiased predictor).
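For squared-error loss the trade-off can be written out explicitly. The notation here is an assumption added for illustration and is not part of the original notes: the response is y = f(x) + ε with E[ε] = 0 and Var(ε) = σ², and the fitted predictor f̂ depends on a random training set that is independent of ε. Under those assumptions the expected prediction error at a point x splits into three terms:

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

An unbiased estimator makes the first term zero, but nothing in the decomposition says that this minimizes the sum; a slightly biased predictor with much lower variance can have smaller total error.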

Generally speaking …

Cross Validation in Machine Learning


Training and Testing Data Set

  • good when you have a large amount of data

  • usually 1/5 to 1/3 of the data is held out as the testing data set
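A simple hold-out split along these lines can be sketched in plain Python. The function name `holdout_split`, its parameters, and the toy data are hypothetical names chosen for illustration; in practice a library helper would probably be used instead. The sketch shuffles the rows with a fixed seed and holds out 1/5 of them, the low end of the range mentioned above.

```python
import random


def holdout_split(rows, test_fraction=0.2, seed=0):
    """Shuffle rows and hold out a fraction of them as the testing set.

    Hypothetical helper for illustration; returns (train, test).
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(rows)                  # copy so the input is not modified
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Toy data: 100 row identifiers, 1/5 held out for testing.
data = list(range(100))
train, test = holdout_split(data, test_fraction=0.2)
print(len(train), len(test))
```

Shuffling before the split matters when the rows are stored in some meaningful order (by date or by label, say); for time-ordered data a chronological split may be more appropriate than a random one.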

K-fold CV

  • suitable when …