## Machine Learning

### Machine Learning: The Basics

Topics to review so you don't get weeded out.

* Supervised learning
* Unsupervised learning
* Semi-supervised learning
* Modeling business decisions usually uses supervised and unsupervised learning.
* Classification and regression are the most commonly seen machine learning models.

### Machine Learning: The Full Topics List

List of topics focusing on theoretical components:

* Supervised Learning
    * VC Dimension
    * PAC Learning
    * Noise
    * Regression
    * Model Selection
    * Dimensions
* Bayesian Decision Theory
    * Classification
    * Losses and Risks
    * Discriminant Functions
    * Utility Theory
    * Association Rules
* Parametric Models
    * Maximum Likelihood Estimation
    * Building Estimators
    * Bayes Estimator
    * Parametric Classification
    * Regression
    * Tuning Model Complexity
    * Model Selection
* Multivariate Methods
    * Multivariate Data
    * Parameter Estimation
    * Missing Values
    * Multivariate Normal
    * Multivariate Classification
    * Tuning Complexity
    * Discrete Features
    * Multivariate Regression
* Dimensionality Reduction
    * Subset Selection
    * PCA
    * Factor Analysis
    * Multidimensional Scaling
    * LDA
    * Isomap
    * Locally Linear Embedding
* Clustering
    * Mixture Densities
    * k-Means
    * Expectation Maximization Algorithm
    * Mixtures of Latent Variables
    * Supervised Learning after Clustering
    * Hierarchical Clustering
    * Number of Clusters
* Nonparametric Methods
* Decision Trees
* Linear Discrimination
* Multilayer Perceptrons
* Local Models
* Kernel Machines
* Bayesian Estimation
* Hidden Markov Models
* Graphical Models
* Combining Multiple Learners
* Reinforcement Learning
* Design and Analysis of Machine Learning Experiments

A long and full list of types of models under each sub-heading:

* Regression: **Modeling relationship between variables, iteratively refined using an error measure.**
    * Linear Regression
    * Logistic Regression
    * OLS (Ordinary Least Squares) Regression
    * Stepwise Regression
    * MARS (Multivariate Adaptive Regression Splines)
    * LOESS (Locally Estimated Scatterplot Smoothing)
* Instance Based: **Build up database of data, compare new data to database; winner-take-all or memory-based learning.**
    * k-Nearest Neighbor
    * Learning Vector Quantization
    * Self-Organizing Map
    * Locally Weighted Learning
* Regularization: **Extension made to other methods, penalizes model complexity, favors simpler and more generalizable models.**
    * Ridge Regression
    * LASSO (Least Absolute Shrinkage and Selection Operator)
    * Elastic Net
    * LARS (Least Angle Regression)
* Decision Tree: **Construct a model of decisions made on actual values of attributes in the data.**
    * Classification and Regression Tree (CART)
    * CHAID (Chi-Squared Automatic Interaction Detection)
    * Conditional Decision Trees
* Bayesian: **Methods explicitly applying Bayes' Theorem for classification and regression problems.**
    * Naive Bayes
    * Gaussian Naive Bayes
    * Multinomial Naive Bayes
    * Bayesian Network
    * BBN (Bayesian Belief Network)
* Clustering: **Centroid-based and hierarchical modeling approaches; groups of maximum commonality.**
    * k-Means
    * k-Medians
    * Expectation Maximization
    * Hierarchical Clustering
* Association Rule Algorithms: **Extract rules that best explain relationships between variables in data.**
    * Apriori algorithm
    * Eclat algorithm
* Neural Networks: **Inspired by structure and function of biological neural networks, used for regression and classification problems.**
    * Radial Basis Function Network (RBFN)
    * Perceptron
    * Back-Propagation
    * Hopfield Network
* Deep Learning: **Neural networks that exploit cheap and abundant computational power; semi-supervised, lots of data.**
    * Convolutional Neural Network (CNN)
    * Recurrent Neural Network (RNN)
    * Long Short-Term Memory Network (LSTM)
    * Deep Boltzmann Machine (DBM)
    * Deep Belief Network (DBN)
    * Stacked Auto-Encoders
* Dimensionality Reduction: **Find inherent structure in data, in an unsupervised manner, to describe data using less information.**
    * PCA
    * t-SNE
    * PLS (Partial Least Squares Regression)
    * Sammon Mapping
    * Multidimensional Scaling
    * Projection Pursuit
    * Principal Component Regression
    * Partial Least Squares Discriminant Analysis
    * Mixture Discriminant Analysis
    * Quadratic Discriminant Analysis
    * Regularized Discriminant Analysis
    * Linear Discriminant Analysis
* Ensemble: **Models composed of multiple weaker models, independently trained, that provide a combined prediction.**
    * Random Forest
    * Gradient Boosting Machines (GBM)
    * Boosting
    * Bootstrapped Aggregation (Bagging)
    * AdaBoost
    * Stacked Generalization (Blending)
    * Gradient Boosted Regression Trees
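
As a rough illustration of the ensemble idea in the list above (multiple weaker models, independently trained, combined into one prediction), here is a minimal sketch of bootstrapped aggregation (bagging) in plain Python. The weak model is a one-feature threshold classifier ("decision stump"); the function names, the toy data, and the choice of 25 models are all assumptions made up for this example, not something prescribed by these notes.

```python
import random
from collections import Counter


def fit_stump(points):
    """Fit a one-feature threshold classifier (a deliberately weak model).

    points: list of (x, label) pairs with label in {0, 1}.
    Returns (threshold, label_above) with the fewest training errors,
    where the stump predicts label_above when x > threshold.
    """
    best = None
    for threshold, _ in points:
        for label_above in (0, 1):
            errors = sum(
                (label_above if x > threshold else 1 - label_above) != y
                for x, y in points
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, label_above)
    return best[1], best[2]


def predict_stump(stump, x):
    threshold, label_above = stump
    return label_above if x > threshold else 1 - label_above


def fit_bagging(points, n_models=25, seed=0):
    """Train each stump independently on its own bootstrap sample."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap: draw len(points) examples with replacement.
        sample = [rng.choice(points) for _ in points]
        models.append(fit_stump(sample))
    return models


def predict_bagging(models, x):
    """Combine the individual predictions by majority vote."""
    votes = Counter(predict_stump(m, x) for m in models)
    return votes.most_common(1)[0][0]


# Toy data, invented for illustration: the label is 1 when x is large.
data = [(0.5, 0), (1.0, 0), (1.5, 0), (2.0, 0),
        (3.0, 1), (3.5, 1), (4.0, 1), (4.5, 1)]
ensemble = fit_bagging(data)
print(predict_bagging(ensemble, 0.8), predict_bagging(ensemble, 4.2))
```

Random Forest follows the same pattern with full decision trees and random feature subsets in place of stumps. Boosting methods such as AdaBoost differ in that each model is trained in sequence on the errors of the previous ones rather than independently.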