Ben Chuanlong Du's Blog

It is never too late to learn.

Cross Validation in Machine Learning

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

Training and Testing Data Set

  • good when you have a large amount of data

  • usually 1/5 to 1/3 of the data is held out as the testing data set.
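A minimal sketch of such a hold-out split, using only the Python standard library. The function name `train_test_split`, its `test_fraction` and `seed` parameters, and the toy data are assumptions made for illustration; they are not taken from these notes.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle rows and hold out a fraction of them as the testing data set."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(rows)                      # copy so the input is not modified
    random.Random(seed).shuffle(shuffled)      # seeded shuffle for reproducibility
    n_test = max(1, round(len(shuffled) * test_fraction))
    # The first n_test shuffled rows become the testing set, the rest the training set.
    return shuffled[n_test:], shuffled[:n_test]


# Toy data (hypothetical): 100 observations, 1/4 held out for testing.
data = list(range(100))
train, test = train_test_split(data, test_fraction=0.25)
print(len(train), len(test))
```

Shuffling before splitting matters when the rows are stored in some meaningful order (for example by date or by class), since otherwise the testing set may not resemble the training set.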

K-fold CV

  • suitable when …
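For reference, the mechanics of K-fold CV can be sketched as follows: the data are shuffled and cut into K folds, and each fold serves once as the testing set while the remaining K-1 folds are used for training. The helper name `k_fold_indices` and its parameters are hypothetical, chosen only for this illustration.

```python
import random


def k_fold_indices(n, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for K-fold cross validation."""
    if not 2 <= k <= n:
        raise ValueError("k must be between 2 and the number of observations")
    indices = list(range(n))
    random.Random(seed).shuffle(indices)       # seeded shuffle for reproducibility
    # Deal the shuffled indices into k folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        # Training indices are all folds except the i-th one.
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx


# 10 observations, 5 folds: each split trains on 8 and tests on 2.
splits = list(k_fold_indices(10, k=5))
for train_idx, test_idx in splits:
    print(len(train_idx), len(test_idx))
```

Every observation lands in exactly one testing fold, so the K testing errors can be averaged into a single estimate that uses all of the data.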

Regression Classification ANOVA

Regression refers to problems where the response (output) variable is continuous, while classification refers to problems where the response (output) variable is discrete.

Generally speaking, fitting regression to classification problems is …