Ben Chuanlong Du's Blog

It is never too late to learn.

Tips on TPU

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

References

https://cloud.google.com/tpu/docs/tutorials/resnet-alpha-py

https://cloud.google.com/tpu/docs/tutorials

https://towardsdatascience.com/running-pytorch-on-tpu-a-bag-of-tricks-b6d0130bddd4

Regularization in Machine Learning Models


Regularization adds a penalty term to the loss function of a machine learning model. The type of regularization depends on the type of penalty used (not the type of the objective function …
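As a minimal sketch of the idea (pure Python; the 1-D model, data, and function names below are made up for illustration), adding an L2 penalty λw² to the squared-error loss shrinks the fitted coefficient toward zero:

```python
def fit_ridge_1d(xs, ys, lam):
    """Minimize sum((y - w*x)^2) + lam * w^2 for the 1-D model y = w*x.

    Setting the derivative with respect to w to zero gives the closed
    form w = sum(x*y) / (sum(x^2) + lam). With lam = 0 this is ordinary
    least squares; a larger lam shrinks w toward zero.
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]  # roughly y = 2x

w_ols = fit_ridge_1d(xs, ys, lam=0.0)
w_ridge = fit_ridge_1d(xs, ys, lam=10.0)
print(w_ols, w_ridge)  # the penalized coefficient is smaller in magnitude
```

The same mechanism underlies ridge regression (L2 penalty) and the LASSO (L1 penalty); only the form of the penalty term changes.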

Optimization Method in Machine Learning


L-BFGS converges faster and finds better solutions on small datasets. However, Adam is very robust on relatively large datasets; it usually converges quickly and gives pretty good performance. SGD with momentum …
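To make the update rules concrete, here is a toy sketch (pure Python; the objective f(w) = (w − 3)² and all hyperparameters are made up for illustration) of plain gradient descent versus gradient descent with momentum:

```python
def grad(w):
    """Gradient of the toy objective f(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)


def sgd(w, lr=0.1, steps=200):
    """Plain gradient descent: w <- w - lr * grad(w)."""
    for _ in range(steps):
        w -= lr * grad(w)
    return w


def sgd_momentum(w, lr=0.1, mu=0.9, steps=200):
    """Gradient descent with momentum: the velocity v accumulates past
    gradients, which damps oscillation and accelerates progress along
    directions where gradients consistently agree."""
    v = 0.0
    for _ in range(steps):
        v = mu * v - lr * grad(w)
        w += v
    return w


print(sgd(0.0), sgd_momentum(0.0))  # both approach the minimizer w = 3
```

Adam extends this idea by also tracking a running estimate of the squared gradients and using it to scale the step size per parameter.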

Tips on XGBoost


  1. It is suggested that you use the sklearn wrapper classes XGBClassifier and XGBRegressor so that you can fully leverage other tools of the sklearn package.

  2. There are 2 types of boosters …

Libraries for Gradient Boosting


XGBoost

https://xgboost.ai/

XGBoost Documentation

Speedup XGBoost

https://machinelearningmastery.com/best-tune-multithreading-support-xgboost-python/

https://medium.com/data-design/xgboost-gpu-performance-on-low-end-gpu-vs-high-end-cpu-a7bc5fcd425b

XGBoost on GPU is fast. Very fast. As long as it fits in RAM and …

Ensemble Machine Learning Models

The prediction error is a trade-off between bias and variance. In statistics, we often talk about unbiased estimators (especially in linear regression). In that case we restrict the estimators/predictors to a (small) class and find the optimal solution within that class (called the BLUE or BLUP).
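A quick simulation of the variance side of the trade-off (pure Python; the "true" value, noise level, and ensemble size are made-up numbers): averaging many independent, unbiased but noisy estimates of the same quantity cuts the variance roughly in proportion to the number of estimates, which is the intuition behind ensembling methods such as bagging:

```python
import random

random.seed(0)


def noisy_estimate(truth=2.0, sd=1.0):
    """One high-variance, unbiased estimate of `truth`."""
    return random.gauss(truth, sd)


def ensemble_estimate(n_members, truth=2.0, sd=1.0):
    """Average n_members independent estimates (a toy ensemble)."""
    return sum(noisy_estimate(truth, sd) for _ in range(n_members)) / n_members


def empirical_variance(estimator, trials=2000):
    """Estimate the variance of an estimator by repeated sampling."""
    samples = [estimator() for _ in range(trials)]
    mean = sum(samples) / trials
    return sum((s - mean) ** 2 for s in samples) / trials


var_single = empirical_variance(noisy_estimate)
var_ensemble = empirical_variance(lambda: ensemble_estimate(25))
print(var_single, var_ensemble)  # the ensemble variance is far smaller
```

Real ensembles (bagging, random forests) use correlated rather than independent members, so the variance reduction is smaller than this idealized 1/n factor, but the mechanism is the same.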

Generally speaking …