Ben Chuanlong Du's Blog

It is never too late to learn.

Use XGBoost With Spark

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

The leaf-wise tree growth mode (grow_policy="lossguide") is not supported in distributed training, which makes XGBoost4J on Spark much slower than LightGBM on Spark.

XGBoost with Spark

https://towardsdatascience.com/build-xgboost-lightgbm-models-on-large-datasets-what-are-the-possible-solutions-bf882da2c27d

https://xgboost …

Convert a Tensor to a Numpy Array or List in PyTorch

Tips

There are multiple ways to convert a Tensor to a NumPy array in PyTorch. First, you can call the method Tensor.numpy. This requires the tensor to be on the CPU and not to require gradients.

my_tensor.numpy()

Second, you can use the function numpy.array.

import numpy as np
np.array(my_tensor)

It is suggested that you use the function numpy.array to convert a Tensor to a NumPy array because it is more generic: it also converts other objects (e.g., PIL.Image) that might not have a method named numpy.
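The two approaches differ in one practical way. As a minimal sketch (the tensor values and variable names are made up for illustration), Tensor.numpy returns an array that shares memory with a CPU tensor, while numpy.array copies the data by default. Tensor.tolist covers the list case.

```python
import numpy as np
import torch

# Illustrative data; any CPU tensor behaves the same way.
my_tensor = torch.tensor([[1.0, 2.0], [3.0, 4.0]])

# Tensor.numpy() returns an array that shares memory with the CPU tensor.
shared = my_tensor.numpy()

# numpy.array() copies the data by default, so the result is independent.
copied = np.array(my_tensor)

# Tensor.tolist() returns nested Python lists of plain Python numbers.
as_list = my_tensor.tolist()

# An in-place change to the tensor is visible in `shared` but not in `copied`.
my_tensor[0, 0] = 100.0
print(shared[0, 0])  # 100.0
print(copied[0, 0])  # 1.0
print(as_list)       # [[1.0, 2.0], [3.0, 4.0]]

# A tensor that tracks gradients has to be detached before conversion;
# .cpu() is a no-op here but is needed for tensors living on a GPU.
grad_tensor = torch.ones(2, requires_grad=True)
grad_array = grad_tensor.detach().cpu().numpy()
```

If an independent array is needed, numpy.array is the safer choice. If avoiding a copy matters, Tensor.numpy is cheaper, provided later in-place edits to the tensor are acceptable.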