Ben Chuanlong Du's Blog

It is never too late to learn.

Optimization Methods in Machine Learning

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

L-BFGS tends to converge faster and reach better solutions on small datasets. Adam, however, is very robust on relatively large datasets: it usually converges quickly and gives pretty good performance. SGD with momentum …
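To make the names above a bit more concrete, here is a minimal sketch of the update rules behind SGD with momentum and Adam, written in plain Python. It minimizes a toy one-dimensional objective f(w) = (w - 3)^2, so it only illustrates how each step is computed and says nothing about the dataset-size behaviour noted above. The function names (`grad`, `sgd_momentum`, `adam`), the toy objective, and the hyperparameter values are all assumptions chosen for illustration, not taken from any particular library. L-BFGS is left out because it needs a line search and a gradient history, which does not fit in a short sketch.

```python
import math


def grad(w):
    """Gradient of the toy objective f(w) = (w - 3)^2 (an illustrative assumption)."""
    return 2.0 * (w - 3.0)


def sgd_momentum(w, steps=500, lr=0.1, momentum=0.9):
    """Classical (heavy-ball) momentum: accumulate a velocity, then step along it."""
    velocity = 0.0
    for _ in range(steps):
        velocity = momentum * velocity - lr * grad(w)
        w += velocity
    return w


def adam(w, steps=1000, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam: scale the step by bias-corrected first and second moment estimates."""
    m = 0.0  # running mean of gradients (first moment)
    v = 0.0  # running mean of squared gradients (second moment)
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        m_hat = m / (1.0 - beta1 ** t)  # bias correction for the zero initialization
        v_hat = v / (1.0 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w


# Both runs start from w = 0 and should end up close to the minimizer w = 3.
w_sgd = sgd_momentum(0.0)
w_adam = adam(0.0)
print(f"SGD with momentum: {w_sgd:.4f}")
print(f"Adam:              {w_adam:.4f}")
```

In real projects one would normally use a library optimizer rather than hand-written loops like these. The point of the sketch is only that Adam divides each step by a running estimate of the gradient magnitude, while SGD with momentum uses a single fixed learning rate.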