Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!
The picture comes from Machine Learning Algorithms Mindmap.
Feature Engineering
Handling Categorical Variables in Machine Learning
Regularization in Machine Learning Models
Ensemble
Frameworks
Libraries for Gradient Boosting
Big-data (Spark) Friendly Frameworks
https://mmlspark.blob.core.windows.net/website/index.html
AutoML
Questions
Random Forest
- Are discrete variables easier to handle than continuous variables in a random forest? Is there any advantage to discretizing variables? The essential question is how categorical variables are handled in RF: does RF use categorical variables directly, or does it have to convert them to numerical values somehow? (See the first sketch after this list.)
- Random forest has a way to impute missing values. What if I treat missing values in categorical predictors as a new class? It sounds like a good ... (see the second sketch after this list).
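On the first question: scikit-learn's RandomForestClassifier cannot split on string categories directly, so categorical predictors have to be encoded as numbers first (ordinal codes or one-hot dummies), whereas implementations such as R's randomForest, LightGBM, and H2O can handle categorical splits natively. A minimal sketch, using a made-up toy DataFrame:

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import OrdinalEncoder

# Toy data; the column names and values are made up for illustration.
df = pd.DataFrame(
    {
        "color": ["red", "green", "blue", "green", "red", "blue"],
        "size": [1.0, 2.5, 3.0, 2.0, 1.5, 3.5],
        "label": [0, 1, 1, 0, 0, 1],
    }
)

# scikit-learn trees only accept numeric features, so encode the category first.
# Integer (ordinal) codes are usually fine for tree models; one-hot encoding
# (pd.get_dummies) is a safer default when categories have no order and their
# number is small.
X = df[["color", "size"]].copy()
X["color"] = OrdinalEncoder().fit_transform(X[["color"]]).ravel()

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X, df["label"])
print(clf.predict(X))
```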
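On the second question: treating missing values in a categorical predictor as an extra category is a common trick, and it lets the trees learn whether missingness itself is predictive. A rough sketch of the idea; the placeholder label `__missing__` and the toy data are made up:

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical categorical column with missing values.
df = pd.DataFrame(
    {
        "city": ["NYC", None, "LA", "SF", None, "LA"],
        "label": [1, 0, 1, 0, 0, 1],
    }
)

# Instead of imputing a "real" city, make missingness its own category so that
# the trees can split on it directly.
df["city"] = df["city"].fillna("__missing__")

X = OrdinalEncoder().fit_transform(df[["city"]])
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, df["label"])
print(clf.predict(X))
```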
Imputation
- Mean, median, etc. (see the first sketch after this list).
- SVD imputation: approximate the high-dimensional data matrix with a low-dimensional (low-rank) reconstruction and use it to fill in the missing entries (see the second sketch after this list).
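A minimal sketch of mean/median imputation with scikit-learn's SimpleImputer (the data here is made up):

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Made-up matrix with missing entries.
X = np.array([[1.0, 2.0], [np.nan, 3.0], [7.0, np.nan]])

# Replace each missing entry with its column median ("mean" and
# "most_frequent" are the other common strategies).
imputer = SimpleImputer(strategy="median")
print(imputer.fit_transform(X))
```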
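For the SVD idea, one simple scheme is iterative low-rank imputation: initialize the missing entries with column means, then repeatedly replace them with the corresponding entries of a truncated-SVD reconstruction. A rough sketch; the function name, rank, and iteration count are illustrative choices, not a specific library API:

```python
import numpy as np


def svd_impute(X, rank=1, n_iter=50):
    """Fill missing entries of X using a rank-`rank` SVD approximation.

    Missing entries are initialized with column means, then repeatedly
    replaced by the corresponding entries of the truncated-SVD
    reconstruction of the current filled-in matrix.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    filled = np.where(missing, np.nanmean(X, axis=0), X)
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank, :]
        filled[missing] = low_rank[missing]
    return filled


# Rows of this made-up matrix are multiples of [1, 2, 3], so a rank-1
# reconstruction recovers the missing entries well.
X = np.array(
    [
        [1.0, 2.0, 3.0],
        [2.0, np.nan, 6.0],
        [3.0, 6.0, np.nan],
        [np.nan, 8.0, 12.0],
    ]
)
print(svd_impute(X, rank=1))
```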
Tips on Kaggle
Machine Learning Resources
AI Tools
https://openai.com/blog/dall-e/
References
- https://github.com/academic/awesome-datascience
- Essential Cheat Sheets for Machine Learning and Deep Learning Engineers
- https://rushter.com/dsreader/