|
|
|
## Machine Learning
|
|
|
|
|
|
|
|
### Machine Learning: The Basics
|
|
|
|
|
|
|
|
Topics to review so you don't get weeded out.
|
|
|
|
* Supervised learning
|
|
|
|
* Unsupervised learning
|
|
|
|
* Semi-supervised learning
|
|
|
|
* Modeling business decisions usually relies on supervised and unsupervised learning.
|
|
|
|
* Classification and regression are the most common machine learning tasks.
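
To make the split concrete, here is a minimal pure-Python sketch (toy data invented for illustration): supervised learning fits a mapping from labeled pairs, while unsupervised learning finds structure without labels.

```python
# Supervised: labels are given; learn a rule that maps x to y.
# Here we recover the slope of y = 2x by least squares.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
slope = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# Unsupervised: no labels; find structure, e.g. assign each point
# to the nearer of two fixed centroids.
points = [0.1, 0.2, 9.8, 9.9]
centroids = [0.0, 10.0]
labels = [min(range(2), key=lambda k: abs(p - centroids[k])) for p in points]

print(slope)   # 2.0
print(labels)  # [0, 0, 1, 1]
```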
|
|
|
|
|
|
|
|
### Machine Learning: The Full Topics List
|
|
|
|
|
|
|
|
A list of topics focusing on the theoretical components:
|
|
|
|
|
|
|
|
* Supervised Learning
|
|
|
|
* VC Dimension
|
|
|
|
* PAC Learning
|
|
|
|
* Noise
|
|
|
|
* Regression
|
|
|
|
* Model Selection
|
|
|
|
* Dimensions of a Supervised Learning Algorithm
|
|
|
|
|
|
|
|
* Bayesian Decision Theory
|
|
|
|
* Classification
|
|
|
|
* Losses and Risks
|
|
|
|
* Discriminant Functions
|
|
|
|
* Utility Theory
|
|
|
|
* Association Rules
|
|
|
|
|
|
|
|
* Parametric Models
|
|
|
|
* Maximum Likelihood Estimation
|
|
|
|
* Building Estimators
|
|
|
|
* Bayes Estimator
|
|
|
|
* Parametric Classification
|
|
|
|
* Regression
|
|
|
|
* Tuning Model Complexity
|
|
|
|
* Model Selection
|
|
|
|
|
|
|
|
* Multivariate Methods
|
|
|
|
* Multivariate Data
|
|
|
|
* Parameter Estimation
|
|
|
|
* Missing Values
|
|
|
|
* Multivariate Normal
|
|
|
|
* Multivariate Classification
|
|
|
|
* Tuning Complexity
|
|
|
|
* Discrete Features
|
|
|
|
* Multivariate Regression
|
|
|
|
|
|
|
|
* Dimensionality Reduction
|
|
|
|
* Subset Selection
|
|
|
|
* PCA
|
|
|
|
* Factor Analysis
|
|
|
|
* Multidimensional Scaling
|
|
|
|
* LDA
|
|
|
|
* Isomap
|
|
|
|
* Local Linear Embedding
|
|
|
|
|
|
|
|
* Clustering
|
|
|
|
* Mixture Densities
|
|
|
|
* k-Means
|
|
|
|
* Expectation Maximization Algorithm
|
|
|
|
* Mixtures of Latent Variables
|
|
|
|
* Supervised Learning after Clustering
|
|
|
|
* Hierarchical Clustering
|
|
|
|
* Number of Clusters
|
|
|
|
|
|
|
|
* Nonparametric Methods
|
|
|
|
* Decision Trees
|
|
|
|
* Linear Discrimination
|
|
|
|
* Multilayer Perceptrons
|
|
|
|
* Local Models
|
|
|
|
* Kernel Machines
|
|
|
|
* Bayesian Estimation
|
|
|
|
* Hidden Markov Models
|
|
|
|
* Graphical Models
|
|
|
|
* Combining Multiple Learners
|
|
|
|
* Reinforcement Learning
|
|
|
|
* Design and Analysis of Machine Learning Experiments
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
A long and full list of types of models under each sub-heading:
|
|
|
|
|
|
|
|
* Regression: **Model the relationship between variables, iteratively refining the fit with an error measure.**
|
|
|
|
* Linear Regression
|
|
|
|
* Logistic Regression
|
|
|
|
* OLS (Ordinary Least Squares) Regression
|
|
|
|
* Stepwise Regression
|
|
|
|
* MARS (Multivariate Adaptive Regression Splines)
|
|
|
|
* LOESS (Locally Estimated Scatterplot Smoothing)
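
As a concrete sketch of the family, OLS has a closed-form fit via the normal equations; this toy example (made-up, noise-free data) recovers y = 3x + 1 exactly.

```python
import numpy as np

# Toy data generated from y = 3x + 1 (no noise, so the fit is exact).
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 3.0 * x + 1.0

# Design matrix with an intercept column, then solve the normal
# equations X^T X beta = X^T y via least squares.
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

print(beta)  # intercept ~ 1, slope ~ 3
```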
|
|
|
|
|
|
|
|
* Instance Based: **Build up a database of examples and compare new data against it; winner-take-all or memory-based learning.**
|
|
|
|
* k-Nearest Neighbor
|
|
|
|
* Learning Vector Quantization
|
|
|
|
* Self-Organizing Map
|
|
|
|
* Locally Weighted Learning
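
A minimal k-nearest-neighbor sketch (toy 2-D data; `knn_predict` is a name made up for this example): classify a new point by majority vote among the k closest stored examples.

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote of its k nearest neighbours.
    train: list of (feature_vector, label) pairs."""
    dist = lambda a, b: sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    nearest = sorted(train, key=lambda tl: dist(tl[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

train = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
         ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b")]
print(knn_predict(train, (0.2, 0.3)))  # a
print(knn_predict(train, (5.1, 5.2)))  # b
```

Note there is no training step at all: the "model" is the stored data, which is what "memory-based" means.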
|
|
|
|
|
|
|
|
* Regularization: **Extension made to other methods, penalizes model complexity, favors simpler and more generalizable models.**
|
|
|
|
* Ridge Regression
|
|
|
|
* LASSO (Least Absolute Shrinkage and Selection Operator)
|
|
|
|
* Elastic Net
|
|
|
|
* LARS (Least Angle Regression)
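
A sketch of how the penalty works, using ridge regression's closed form (toy data; the `ridge` helper is invented here): the larger the penalty `lam`, the more the weights shrink toward zero.

```python
import numpy as np

def ridge(X, y, lam=1.0):
    """Closed-form ridge regression: (X^T X + lam * I)^{-1} X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 1.0, 2.0])

w_ols = ridge(X, y, lam=0.0)     # no penalty: exact fit, weights [1, 1]
w_ridge = ridge(X, y, lam=10.0)  # heavy penalty shrinks weights toward 0

print(w_ols, w_ridge)
```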
|
|
|
|
|
|
|
|
* Decision Tree: **Construct a model of decisions made on actual values of attributes in the data.**
|
|
|
|
* Classification and Regression Tree
|
|
|
|
* CHAID (Chi-Squared Automatic Interaction Detection)
|
|
|
|
* Conditional Decision Trees
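
A minimal sketch of the idea behind tree induction, reduced to a depth-1 tree (a decision stump) on a single made-up feature: pick the split on an attribute value that minimizes classification error. Real tree learners apply this search recursively, usually with impurity measures like Gini or entropy rather than raw error.

```python
def best_stump(xs, ys):
    """Find the threshold on a 1-D feature that best separates two classes.
    Returns (threshold, error_count) for the rule: predict 1 if x >= t."""
    best = (None, len(ys) + 1)
    for t in sorted(set(xs)):
        preds = [1 if x >= t else 0 for x in xs]
        errors = sum(p != y for p, y in zip(preds, ys))
        if errors < best[1]:
            best = (t, errors)
    return best

xs = [1, 2, 3, 10, 11, 12]
ys = [0, 0, 0, 1, 1, 1]
print(best_stump(xs, ys))  # (10, 0): "x >= 10" classifies perfectly
```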
|
|
|
|
|
|
|
|
* Bayesian: **Methods explicitly applying Bayes' Theorem for classification and regression problems.**
|
|
|
|
* Naive Bayes
|
|
|
|
* Gaussian Naive Bayes
|
|
|
|
* Multinomial Naive Bayes
|
|
|
|
* Bayesian Network
|
|
|
|
* BBN (Bayesian Belief Network)
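
A categorical naive Bayes sketch (toy data and function names invented here; a simple add-one smoothing variant is used): multiply the class prior by per-feature likelihoods, assuming features are conditionally independent given the class.

```python
from collections import Counter, defaultdict

def train_nb(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    feat_counts = defaultdict(Counter)  # (feature_index, class) -> value counts
    for row, c in zip(rows, labels):
        for i, v in enumerate(row):
            feat_counts[(i, c)][v] += 1
    return class_counts, feat_counts

def predict_nb(model, row):
    """Pick the class maximizing prior * product of smoothed likelihoods."""
    class_counts, feat_counts = model
    total = sum(class_counts.values())
    best_c, best_p = None, -1.0
    for c, cc in class_counts.items():
        p = cc / total
        for i, v in enumerate(row):
            counts = feat_counts[(i, c)]
            p *= (counts[v] + 1) / (cc + len(counts) + 1)  # add-one smoothing
        if p > best_p:
            best_c, best_p = c, p
    return best_c

rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"), ("rainy", "cold")]
labels = ["play", "play", "stay", "stay"]
model = train_nb(rows, labels)
print(predict_nb(model, ("sunny", "mild")))  # play
```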
|
|
|
|
|
|
|
|
* Clustering: **Centroid-based and hierarchical modeling approaches; groups of maximum commonality.**
|
|
|
|
* k-Means
|
|
|
|
* k-Medians
|
|
|
|
* Expectation Maximization
|
|
|
|
* Hierarchical Clustering
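
A sketch of k-means (Lloyd's algorithm) on toy 2-D data, assuming NumPy is available: alternate between assigning points to the nearest centroid and moving centroids to their cluster means.

```python
import numpy as np

def kmeans(points, k, iters=10, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        d = np.linalg.norm(points[:, None] - centroids[None, :], axis=2)
        labels = d.argmin(axis=1)
        # Move each centroid to the mean of its assigned points.
        centroids = np.array([points[labels == j].mean(axis=0) for j in range(k)])
    return labels, centroids

pts = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.1, 4.9]])
labels, cents = kmeans(pts, k=2)
print(labels)  # first two points share one label, last two the other
```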
|
|
|
|
|
|
|
|
* Association Rule Algorithms: **Extract rules that best explain relationships between variables in data.**
|
|
|
|
* Apriori Algorithm
|
|
|
|
* Eclat Algorithm
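
A brute-force sketch of the underlying quantities (toy transactions invented here): support counts how often an itemset occurs, and confidence measures how often a rule's consequent follows its antecedent. Apriori's contribution is pruning this search using the fact that a set can only be frequent if all of its subsets are frequent.

```python
from itertools import combinations

transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset):
    """Fraction of transactions containing every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)

# Brute-force frequent itemsets of size 1 and 2 at minimum support 0.5.
items = sorted(set().union(*transactions))
frequent = [set(c) for r in (1, 2) for c in combinations(items, r)
            if support(set(c)) >= 0.5]

# Confidence of the rule {bread} -> {milk}: P(milk | bread).
confidence = support({"bread", "milk"}) / support({"bread"})
print(len(frequent), confidence)
```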
|
|
|
|
|
|
|
|
* Neural Networks: **Inspired by the structure and function of biological neural networks; used for regression and classification problems.**
|
|
|
|
* Radial Basis Function Network (RBFN)
|
|
|
|
* Perceptron
|
|
|
|
* Back-Propagation
|
|
|
|
* Hopfield Network
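
A minimal perceptron sketch (learning logical AND on a toy encoding with targets in {-1, +1}; the helper name is invented here): update the weights only when the current prediction is wrong.

```python
def train_perceptron(data, epochs=10, lr=1.0):
    """Rosenblatt's perceptron rule on linearly separable data.
    data: list of (inputs, target) pairs with targets in {-1, +1}."""
    n = len(data[0][0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, t in data:
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1
            if pred != t:  # update only on mistakes
                w = [wi + lr * t * xi for wi, xi in zip(w, x)]
                b += lr * t
    return w, b

# Learn logical AND (with -1 as "false").
data = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), 1)]
w, b = train_perceptron(data)
preds = [1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1
         for x, _ in data]
print(preds)  # [-1, -1, -1, 1]
```

Back-propagation generalizes this idea to multilayer networks by propagating the error gradient through the layers.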
|
|
|
|
|
|
|
|
* Deep Learning: **Neural networks that exploit cheap and abundant computational power; often semi-supervised, trained on lots of data.**
|
|
|
|
* Convolutional Neural Network (CNN)
|
|
|
|
* Recurrent Neural Network (RNN)
|
|
|
|
* Long-Short-Term Memory Network (LSTM)
|
|
|
|
* Deep Boltzmann Machine (DBM)
|
|
|
|
* Deep Belief Network (DBN)
|
|
|
|
* Stacked Auto-Encoders
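
The workhorse of CNNs is the convolution; a "valid" 1-D version can be sketched in a few lines of NumPy (toy signal and kernel invented here).

```python
import numpy as np

def conv1d(signal, kernel):
    """'Valid' 1-D convolution (really cross-correlation, as in most
    deep learning libraries): slide the kernel and take dot products."""
    n = len(signal) - len(kernel) + 1
    return np.array([signal[i:i + len(kernel)] @ kernel for i in range(n)])

x = np.array([0.0, 0.0, 1.0, 1.0, 1.0, 0.0])
edge = np.array([-1.0, 1.0])  # responds to upward steps in the signal
print(conv1d(x, edge))  # [ 0.  1.  0.  0. -1.]
```

In a real CNN the kernel weights are not hand-picked like `edge` above; they are learned by gradient descent.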
|
|
|
|
|
|
|
|
* Dimensionality Reduction: **Find inherent structure in data, in an unsupervised manner, to describe data using less information.**
|
|
|
|
* PCA
|
|
|
|
* t-SNE
|
|
|
|
* PLS (Partial Least Squares Regression)
|
|
|
|
* Sammon Mapping
|
|
|
|
* Multidimensional Scaling
|
|
|
|
* Projection Pursuit
|
|
|
|
* Principal Component Regression
|
|
|
|
* Partial Least Squares Discriminant Analysis
|
|
|
|
* Mixture Discriminant Analysis
|
|
|
|
* Quadratic Discriminant Analysis
|
|
|
|
* Regularized Discriminant Analysis
|
|
|
|
* Linear Discriminant Analysis
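
A PCA sketch via the SVD of the centered data matrix (toy, nearly collinear points; the `pca` helper is a name invented for this example): projecting onto the top principal component keeps most of the variance while halving the dimensionality.

```python
import numpy as np

def pca(X, n_components):
    """Project centered data onto its top principal components via SVD."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T  # scores in the reduced space

# Points lying (almost) on the line y = x: one component captures them.
X = np.array([[0.0, 0.1], [1.0, 0.9], [2.0, 2.1], [3.0, 2.9]])
Z = pca(X, 1)
print(Z.shape)  # (4, 1)
```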
|
|
|
|
|
|
|
|
* Ensemble: **Models composed of multiple weaker models, independently trained, that provide a combined prediction.**
|
|
|
|
* Random Forest
|
|
|
|
* Gradient Boosting Machines (GBM)
|
|
|
|
* Boosting
|
|
|
|
* Bootstrapped Aggregation (Bagging)
|
|
|
|
* AdaBoost
|
|
|
|
* Stacked Generalization (Blending)
|
|
|
|
* Gradient Boosted Regression Trees
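
A bagging sketch (toy 1-D data; the 1-nearest-neighbour base learner and function names are invented for this example): train each base learner on a bootstrap resample of the data, then combine their predictions by majority vote.

```python
import random
from collections import Counter

def nn_label(sample, query):
    """1-nearest-neighbour base learner: label of the closest point."""
    return min(sample, key=lambda xy: abs(xy[0] - query))[1]

def bagged_predict(train, query, n_models=15, seed=0):
    """Bootstrap Aggregation: fit each base learner on a resample
    of the data, then combine predictions by majority vote."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_models):
        sample = [rng.choice(train) for _ in train]  # bootstrap resample
        votes[nn_label(sample, query)] += 1
    return votes.most_common(1)[0][0]

train = [(0.0, "a"), (0.5, "a"), (1.0, "a"), (9.0, "b"), (9.5, "b"), (10.0, "b")]
print(bagged_predict(train, 0.7))  # a
print(bagged_predict(train, 9.2))  # b
```

Random Forest follows the same recipe with decision trees as base learners, plus random feature subsampling at each split; boosting differs in that its learners are trained sequentially, each focusing on the previous learners' mistakes.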
|
|
|
|
|