Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!
The split-by-leaf mode (`grow_policy="lossguide"`) is not supported in distributed training,
which makes XGBoost4J on Spark much slower than LightGBM on Spark.
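
For contrast, here is a minimal sketch of how split-by-leaf growth would typically be requested in single-machine XGBoost from Python. The helper name `lossguide_params` and the default of 31 leaves are my own illustrative choices; `tree_method`, `grow_policy`, `max_leaves`, and `max_depth` are standard XGBoost parameters. The training call is left as a comment because it assumes a hypothetical `dtrain` dataset.

```python
def lossguide_params(max_leaves=31):
    """Build XGBoost parameters for leaf-wise (loss-guided) tree growth.

    The helper name and the default leaf count are illustrative assumptions.
    """
    return {
        "tree_method": "hist",       # histogram-based method, which honors grow_policy
        "grow_policy": "lossguide",  # split the leaf with the largest loss reduction first
        "max_leaves": max_leaves,    # bound tree size by number of leaves
        "max_depth": 0,              # 0 means no depth limit
    }


params = lossguide_params()
print(params)

# With xgboost installed, and `dtrain` a hypothetical xgboost.DMatrix:
#   import xgboost as xgb
#   booster = xgb.train(params, dtrain, num_boost_round=100)
```

Bounding the tree by `max_leaves` instead of `max_depth` is what makes this setup comparable to LightGBM's leaf-wise default.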
XGBoost with Spark
https://towardsdatascience.com/build-xgboost-lightgbm-models-on-large-datasets-what-are-the-possible-solutions-bf882da2c27d
https://xgboost …