Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!
XGBoost
http://www.legendu.net/misc/blog/use-xgboost-with-spark/
LightGBM
http://www.legendu.net/misc/blog/use-lightgbm-with-spark/
BigDL
MMLSpark
Ray
You can run Ray on top of Spark via analytics-zoo, which enables you to run any Python machine learning library in a distributed fashion. However, I'm not sure whether this is a good idea.
yahoo/TensorFlowOnSpark
PyTorch
H2O
https://github.com/h2oai/sparkling-water
http://docs.h2o.ai/sparkling-water/2.2/latest-stable/doc/pysparkling.html
http://h2o-release.s3.amazonaws.com/h2o/master/4273/docs-website/h2o-docs/faq/sparkling-water.html
https://docs.databricks.com/_static/notebooks/h2o-sparkling-water-python.html
SystemML
elephas
Distributed training of Keras models on Spark.
References
https://towardsdatascience.com/deep-learning-with-apache-spark-part-1-6d397c16abd