Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!
Training and Testing Data Set
-
Good when you have a large amount of data.
-
Usually 1/5 to 1/3 of the data is held out as the testing data set.
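A minimal sketch of such a hold-out split, using only the Python standard library. The function name `train_test_split`, the `test_fraction` parameter, and the fixed `seed` are illustrative assumptions, not something these notes prescribe.

```python
import random


def train_test_split(data, test_fraction=0.25, seed=0):
    """Shuffle a copy of `data` and hold out `test_fraction` of it for testing."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(data)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    # The first n_test shuffled items become the test set, the rest the training set.
    return shuffled[n_test:], shuffled[:n_test]


# Hold out 1/5 of 100 samples.
train, test = train_test_split(range(100), test_fraction=0.2)
print(len(train), len(test))  # 80 20
```

Shuffling before splitting matters when the data is stored in some order (for example by date or by class); for time series a chronological split is usually the safer choice.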
K-fold CV
-
Suitable when you have a medium amount of data.
-
K=10 is popular
-
Computationally expensive: the model is fitted K times.
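The K-fold procedure can be sketched as an index generator: the data is cut into K folds, and each fold serves as the test set once while the other K-1 folds are used for training. The name `k_fold_indices` and the round-robin fold assignment are assumptions made for this illustration.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal indices round-robin so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx


splits = list(k_fold_indices(n_samples=25, k=10))
print(len(splits))  # 10
```

Each of the K splits requires one model fit, which is where the extra cost relative to a single train/test split comes from.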
Leave-k-out CV
-
Use this approach only when you have a very limited amount of data.
-
Leave-1-out is a special case of K-fold CV, with K equal to the number of observations.
-
Computationally very expensive.
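A rough sketch of why the cost grows so quickly: leaving k out of n observations produces one split per k-element subset, so there are "n choose k" model fits. The helper name `leave_k_out_indices` is hypothetical.

```python
from itertools import combinations
from math import comb


def leave_k_out_indices(n_samples, k=1):
    """Yield (train_idx, test_idx) for every way of holding out k samples."""
    if not 1 <= k < n_samples:
        raise ValueError("k must satisfy 1 <= k < n_samples")
    all_idx = range(n_samples)
    for test_idx in combinations(all_idx, k):
        held_out = set(test_idx)
        train_idx = [i for i in all_idx if i not in held_out]
        yield train_idx, list(test_idx)


n = 6
loo = list(leave_k_out_indices(n, k=1))  # leave-1-out: one split per observation
l2o = list(leave_k_out_indices(n, k=2))  # leave-2-out: one split per pair
print(len(loo), len(l2o))  # 6 15
print(len(l2o) == comb(n, 2))  # True
```

With leave-1-out the number of fits equals the number of observations; for larger k the count grows combinatorially, which is why the method is reserved for very small data sets.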
Some Rules:
-
With at least 10 times as many observations as parameters, you are probably in good shape.
-
With at least 20 times as many observations as parameters, you are usually perfectly fine.
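These rules of thumb can be written as a quick check on the ratio of observations to parameters. The function name `sample_size_rating` and the wording of the returned labels are illustrative assumptions; the thresholds 10 and 20 are the ones from the rules above.

```python
def sample_size_rating(n_samples, n_params):
    """Rate a data set by its ratio of observations to model parameters."""
    if n_params <= 0:
        raise ValueError("n_params must be positive")
    ratio = n_samples / n_params
    if ratio >= 20:
        return "usually fine"
    if ratio >= 10:
        return "probably in good shape"
    return "below the rule of thumb"


print(sample_size_rating(n_samples=150, n_params=10))  # probably in good shape
```

This is only a heuristic: how much data is really needed also depends on noise level and model complexity.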