Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!
Optuna is a good framework for hyperparameter tuning in machine learning.
Natural Language Processing Using NLTK
nltk.util.ngrams nltk.bigrams nltk.PorterStemmer
from nltk.util import ngrams
sentence = 'this is a foo bar sentences and i want to ngramize it'
n = 6
sixgrams = ngrams(sentence.split …
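As a rough illustration of what an n-gram over words is, here is a plain-Python sketch that runs without NLTK installed. The helper name `word_ngrams` is hypothetical and is not part of NLTK; NLTK's own `nltk.util.ngrams(sequence, n)` returns an iterator over tuples of the same kind.

```python
def word_ngrams(tokens, n):
    """Return every contiguous window of n tokens as a tuple."""
    # Build n staggered views of the token list; zip stops at the shortest one,
    # so only complete windows are produced.
    return list(zip(*(tokens[i:] for i in range(n))))


sentence = 'this is a foo bar sentences and i want to ngramize it'
tokens = sentence.split()   # whitespace tokenisation, as in the snippet above
sixgrams = word_ngrams(tokens, 6)

for gram in sixgrams:
    print(gram)
```

The sentence has 12 tokens, so there are 12 - 6 + 1 = 7 six-grams; asking for windows longer than the sentence yields an empty list.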
Keywords Extracting from Text
Word Stemming
- Existing stemming methods such as nltk.PorterStemmer, etc.
- didn't -> did not, there's -> there is, etc.; Mr. -> Mister, Mrs. -> ..., Ms. -> ...
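The expansion rules listed above could be applied with a simple lookup table before stemming. The following is only a minimal sketch: the names `CONTRACTIONS`, `ABBREVIATIONS`, `expand` and `normalize` are hypothetical, the tables contain just the mappings spelled out in these notes (the entries for Mrs. and Ms. are left out because the notes leave them open), and matching is case-sensitive.

```python
import re

# Hypothetical, deliberately tiny lookup tables; a real list would be much longer.
CONTRACTIONS = {
    "didn't": "did not",
    "there's": "there is",
}
ABBREVIATIONS = {
    "Mr.": "Mister",
}


def expand(text, table):
    """Replace every key of `table` that occurs in `text` with its expansion."""
    # Try longer keys first so a long entry is never shadowed by a shorter one.
    keys = sorted(table, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: table[m.group(0)], text)


def normalize(text):
    """Expand contractions first, then abbreviations."""
    return expand(expand(text, CONTRACTIONS), ABBREVIATIONS)


print(normalize("Mr. Smith didn't say there's a problem."))
```

A table-driven approach keeps the rules easy to extend, but it cannot resolve ambiguous forms from context, so such cases would need extra handling.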
Other things
- It seems that it is hard to get …
Clustering Algorithms in Machine Learning
Centroid-based Clustering
- K-means Clustering
- K-medians Clustering
- K-medoids Clustering
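To make the centroid-based idea concrete, here is a minimal pure-Python sketch of k-means using the standard Lloyd iteration (assign each point to its nearest centroid, then move each centroid to the mean of its members). The function name `kmeans`, its parameters and the toy data are assumptions made for illustration; the notes themselves do not prescribe an implementation. Replacing the mean with a per-dimension median would give a k-medians-style variant.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Lloyd's algorithm on a list of equal-length numeric tuples.

    Assumes iterations >= 1 and k <= len(points).
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid
        # (squared Euclidean distance).
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members;
        # an empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(dim) / len(members) for dim in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if new_centroids == centroids:  # assignments can no longer change
            break
        centroids = new_centroids
    return centroids, clusters


# Toy data: two visibly separated groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.0),
          (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, k=2)
print(centroids)
print(clusters)
```

On this toy data the two groups are far enough apart that the iteration separates them regardless of which two points are drawn as the initial centroids; on real data the result generally depends on the initialisation.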
Hierarchical Clustering
- Agglomerative Hierarchical Clustering
- Divisive Hierarchical Clustering
Partitional Clustering
Regression Classification ANOVA
Regression refers to problems where the response (output) variable is continuous, while classification refers to problems where the response (output) variable is discrete.
Generally speaking, fitting regression to classification problems is …