Ben Chuanlong Du's Blog

It is never too late to learn.

Make Your Model Training Reproducible in PyTorch

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

The PyTorch documentation page on Reproducibility gives detailed instructions for making model training reproducible. In short, you need to seed every random number generator your training script uses, as in the following code.

torch.manual_seed(args.seed)
np.random.seed(args.seed …
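
A minimal sketch of a seeding helper along these lines, assuming a single-process training script. The function name `seed_everything` and the seed value are illustrative and not taken from the PyTorch docs. The cuDNN flags are the ones described in the Reproducibility notes, and they may slow training down.

```python
import random

import numpy as np
import torch


def seed_everything(seed: int) -> None:
    """Seed the common sources of randomness in a PyTorch training script."""
    random.seed(seed)                 # Python's built-in RNG
    np.random.seed(seed)              # NumPy's global RNG
    torch.manual_seed(seed)           # PyTorch RNG on CPU (and CUDA devices)
    torch.cuda.manual_seed_all(seed)  # explicit; silently ignored without CUDA
    # Prefer deterministic cuDNN algorithms over auto-tuned (faster) ones.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False


# Re-seeding with the same value should reproduce the same random draws.
seed_everything(42)
first = torch.rand(3)
seed_everything(42)
second = torch.rand(3)
print(torch.equal(first, second))
```

Seeding alone does not guarantee identical results across different hardware, platforms, or PyTorch releases, so treat this as a starting point rather than a complete recipe.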

Use XGBoost With Spark

Leaf-wise tree growth (grow_policy="lossguide") is not supported in distributed training, which makes XGBoost4J on Spark much slower than LightGBM on Spark, since LightGBM grows trees leaf-wise by default.
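
For reference, the two growth policies are selected through ordinary XGBoost parameters. The sketch below contrasts them as plain parameter dictionaries. The variable names and the specific values (depth 6, 31 leaves) are illustrative assumptions, not tuned settings or values from this post.

```python
# Level-wise growth: every node at a given depth is split before going deeper.
depthwise_params = {
    "tree_method": "hist",
    "grow_policy": "depthwise",
    "max_depth": 6,        # illustrative value
}

# Leaf-wise growth: always split the leaf with the largest loss reduction.
lossguide_params = {
    "tree_method": "hist",
    "grow_policy": "lossguide",
    "max_leaves": 31,      # illustrative value; caps tree size by leaf count
    "max_depth": 0,        # 0 means no depth limit
}

print(depthwise_params["grow_policy"], lossguide_params["grow_policy"])
```

Dictionaries like these would typically be passed to a single-machine training call. Whether the leaf-wise setting is honored in a distributed Spark job is exactly the limitation noted above.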

XGBoost with Spark

https://towardsdatascience.com/build-xgboost-lightgbm-models-on-large-datasets-what-are-the-possible-solutions-bf882da2c27d

https://xgboost …