# Computer Science Study Plan

## Machine Learning
### Machine Learning: The Basics
Topics to review so you don't get weeded out.
* Supervised learning
* Unsupervised learning
* Semi-supervised learning
* Modeling business decisions usually relies on supervised and unsupervised learning.
* Classification and regression are the most common machine learning tasks (see the sketch after this list).
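
A quick way to see the classification/regression split in code. This is a minimal sketch, assuming scikit-learn is available; the toy data and model choices are mine, not part of the plan.

```python
# Minimal sketch: a classifier vs. a regressor on toy supervised data (scikit-learn assumed).
from sklearn.linear_model import LinearRegression, LogisticRegression

X = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]   # one feature per example
y_class = [0, 0, 0, 1, 1, 1]                     # discrete labels -> classification
y_value = [1.1, 1.9, 3.2, 3.9, 5.1, 6.2]         # continuous targets -> regression

clf = LogisticRegression().fit(X, y_class)
reg = LinearRegression().fit(X, y_value)

print(clf.predict([[2.5]]))  # predicted class label
print(reg.predict([[2.5]]))  # predicted continuous value
```
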
### Machine Learning: The Full Topics List
List of topics focusing on theoretical components:
* Supervised Learning
* VC Dimension
* PAC Learning
* Noise
* Regression
* Model Selection
* Dimensions
* Bayesian Decision Theory
* Classification
* Losses and Risks
* Discriminant Functions
* Utility Theory
* Association Rules
* Parametric Models
* Maximum Likelihood Estimation
* Building Estimators
* Bayes Estimator
* Parametric Classification
* Regression
* Tuning Model Complexity
* Model Selection
* Multivariate Methods
* Multivariate Data
* Parameter Estimation
* Missing Values
* Multivariate Normal
* Multivariate Classification
* Tuning Complexity
* Discrete Features
* Multivariate Regression
* Dimensionality Reduction
* Subset Selection
* PCA
* Factor Analysis
* Multidimensional Scaling
* LDA
* Isomap
* Local Linear Embedding
* Clustering
* Mixture Densities
* k-Means
* Expectation Maximization Algorithm
* Mixtures of Latent Variables
* Supervised Learning after Clustering
* Hierarchical Clustering
* Number of Clusters
* Nonparametric Methods
* Decision Trees
* Linear discrimination
* Multilayer Perceptrons
* Local Models
* Kernel Machines
* Bayesian Estimation
* Hidden Markov Models
* Graphical Models
* Combining Multiple Learners
* Reinforcement Learning
* Design and Analysis of Machine Learning Experiments
A longer list of model types, grouped by family:
* Regression: **Modeling the relationship between variables, iteratively refined using an error measure.** (sketch below)
* Linear Regression
* Logistic Regression
* OLS (Ordinary Least Squares) Regression
* Stepwise Regression
* MARS (Multivariate Adaptive Regression Splines)
* LOESS (Locally Estimated Scatterplot Smoothing)
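
A minimal sketch of the "iteratively refined using an error measure" idea for the regression family above: linear regression fit by gradient descent on the squared error. NumPy and the synthetic data are my assumptions, not part of the plan.

```python
# Linear regression fit by gradient descent on the squared-error loss (NumPy assumed).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))                  # 100 examples, 2 features
true_w, true_b = np.array([2.0, -1.0]), 0.5
y = X @ true_w + true_b + 0.1 * rng.normal(size=100)

w, b, lr = np.zeros(2), 0.0, 0.1
for _ in range(500):
    err = X @ w + b - y                        # error measure: residuals
    w -= lr * (X.T @ err) / len(y)             # gradient of mean squared error
    b -= lr * err.mean()

print(w, b)  # should approach [2.0, -1.0] and 0.5
```
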
* Instance Based: **Build up a database of examples, compare new data against it; winner-take-all or memory-based learning.** (sketch below)
* k-Nearest Neighbor
* Learning Vector Quantization
* Self-Organizing Map
* Locally Weighted Learning
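
For the instance-based family above, a small k-nearest-neighbor sketch: keep the training data, measure distances from a new point to every stored example, and let the k closest vote. NumPy and the toy points are my assumptions.

```python
# k-nearest-neighbor classification by majority vote (NumPy assumed).
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Distance to every stored example, then vote among the k closest."""
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dists)[:k]
    return np.bincount(y_train[nearest]).argmax()

X_train = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(X_train, y_train, np.array([0.95, 0.9])))  # -> 1
```
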
* Regularization: **Extension made to other methods, penalizes model complexity, favors simpler and more generalizable models.** (sketch below)
* Ridge Regression
* LASSO (Least Absolute Shrinkage and Selection Operator)
* Elastic Net
* LARS (Least Angle Regression)
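
For the regularization family, a small ridge-regression sketch: ordinary least squares plus an L2 penalty that shrinks the weights toward zero. NumPy, the synthetic data, and the penalty values are illustrative assumptions.

```python
# Ridge regression: least squares plus an L2 penalty on the weights (NumPy assumed).
import numpy as np

def ridge_fit(X, y, lam=1.0):
    """Closed form: w = (X^T X + lam * I)^-1 X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
y = X @ np.array([1.0, 0.0, -2.0, 0.0, 0.5]) + 0.1 * rng.normal(size=50)

print(ridge_fit(X, y, lam=0.0))   # ordinary least squares
print(ridge_fit(X, y, lam=10.0))  # larger penalty shrinks the weights toward zero
```
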
* Decision Tree: **Construct a model of decisions made on actual values of attributes in the data.** (sketch below)
* Classification and Regression Tree
* CHAID (Chi-Squared Automatic Interaction Detection)
* Conditional Decision Trees
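
A short decision-tree sketch for the family above, assuming scikit-learn and its bundled iris dataset; the depth limit and train/test split are arbitrary illustrative choices.

```python
# A small decision tree on the iris data (scikit-learn assumed).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

tree = DecisionTreeClassifier(max_depth=3, random_state=0)  # depth limit controls complexity
tree.fit(X_tr, y_tr)
print(tree.score(X_te, y_te))  # accuracy on held-out data
```
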
* Bayesian: **Methods explicitly applying Bayes' Theorem for classification and regression problems.** (sketch below)
* Naive Bayes
* Gaussian Naive Bayes
* Multinomial Naive Bayes
* Bayesian Network
* BBN (Bayesian Belief Network)
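
For the Bayesian family, a hand-rolled Gaussian naive Bayes sketch: apply Bayes' theorem with independent per-feature normal densities per class. NumPy and the toy data are my assumptions.

```python
# Gaussian naive Bayes: Bayes' theorem with per-class, per-feature normal densities (NumPy assumed).
import numpy as np

def gnb_fit(X, y):
    stats = {}
    for c in np.unique(y):
        Xc = X[y == c]
        stats[c] = (np.log(len(Xc) / len(X)),   # log prior
                    Xc.mean(axis=0),
                    Xc.var(axis=0) + 1e-9)      # small constant avoids division by zero
    return stats

def gnb_predict(stats, x):
    def log_post(prior, mean, var):
        return prior - 0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var)
    return max(stats, key=lambda c: log_post(*stats[c]))

X = np.array([[1.0, 2.0], [1.2, 1.8], [4.0, 5.0], [3.8, 5.2]])
y = np.array([0, 0, 1, 1])
print(gnb_predict(gnb_fit(X, y), np.array([4.1, 4.9])))  # -> 1
```
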
* Clustering: **Centroid-based and hierarchical modeling approaches; groups of maximum commonality.** (sketch below)
* k-Means
* k-Medians
* Expectation Maximization
* Hierarchical Clustering
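
A minimal k-means sketch (Lloyd's algorithm) for the clustering family: alternate between assigning points to the nearest centroid and moving each centroid to the mean of its cluster. NumPy and the toy points are assumptions.

```python
# k-means: assign points to the nearest centroid, then recompute centroids (NumPy assumed).
import numpy as np

def kmeans(X, k=2, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assignment step: index of the closest centroid for every point.
        labels = np.argmin(np.linalg.norm(X[:, None] - centroids[None, :], axis=2), axis=1)
        # Update step: each centroid moves to the mean of its members.
        centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return labels, centroids

X = np.array([[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9]])
print(kmeans(X, k=2))
```
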
* Association Rule Algorithms: **Extract rules that best explain relationships between variables in data.** (sketch below)
* Apriori algorithm
* Eclat algorithm
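
For association rules, a tiny support/confidence calculation on made-up transactions, standard library only. A real Apriori or Eclat implementation would also search and prune candidate itemsets, which this sketch skips.

```python
# Support and confidence for a single candidate rule over toy market-basket data.
transactions = [
    {"milk", "bread", "butter"},
    {"milk", "bread"},
    {"bread", "butter"},
    {"milk", "butter"},
]

def support(itemset):
    """Fraction of transactions containing every item in the set."""
    return sum(itemset <= t for t in transactions) / len(transactions)

# Rule {milk} -> {bread}: how often the consequent appears given the antecedent.
antecedent, consequent = {"milk"}, {"bread"}
conf = support(antecedent | consequent) / support(antecedent)
print(support(antecedent | consequent), conf)
```
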
* Neural Networks: **Inspired by the structure and function of biological neural networks; used for regression and classification problems.** (sketch below)
* Radial Basis Function Network (RBFN)
* Perceptron
* Back-Propagation
* Hopfield Network
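
A bare perceptron sketch for the neural-network family: the classic error-driven weight update on a linearly separable toy problem. NumPy and the data are my assumptions.

```python
# Perceptron learning rule: nudge the weights whenever an example is misclassified (NumPy assumed).
import numpy as np

X = np.array([[2.0, 1.0], [1.5, 2.0], [-1.0, -1.5], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])          # labels in {-1, +1}

w, b, lr = np.zeros(2), 0.0, 0.1
for _ in range(20):                   # a few passes over the data
    for xi, yi in zip(X, y):
        if yi * (w @ xi + b) <= 0:    # misclassified (or on the boundary)
            w += lr * yi * xi         # move the separating hyperplane toward the example
            b += lr * yi

print(np.sign(X @ w + b))             # should reproduce y
```
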
* Deep Learning: **Neural networks that exploit cheap and abundant computational power; semi-supervised, lots of data.** (sketch below)
* Convolutional Neural Network (CNN)
* Recurrent Neural Network (RNN)
* Long Short-Term Memory Network (LSTM)
* Deep Boltzmann Machine (DBM)
* Deep Belief Network (DBN)
* Stacked Auto-Encoders
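
A minimal convolutional-network definition for the deep-learning family, assuming PyTorch; the layer sizes and the 28x28 single-channel input shape are arbitrary illustrative choices, not anything prescribed by the plan.

```python
# A tiny CNN: convolution -> pooling -> fully connected classifier (PyTorch assumed).
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.conv = nn.Conv2d(1, 8, kernel_size=3, padding=1)  # 1 -> 8 feature maps, same spatial size
        self.pool = nn.MaxPool2d(2)                            # halve the spatial dimensions
        self.fc = nn.Linear(8 * 14 * 14, num_classes)

    def forward(self, x):
        x = self.pool(torch.relu(self.conv(x)))
        return self.fc(x.flatten(start_dim=1))

model = TinyCNN()
out = model(torch.randn(4, 1, 28, 28))  # batch of 4 fake 28x28 grayscale images
print(out.shape)                        # torch.Size([4, 10])
```
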
* Dimensionality Reduction: **Find inherent structure in data, in an unsupervised manner, to describe data using less information.** (sketch below)
* PCA
* t-SNE
* PLS (Partial Least Squares Regression)
* Sammon Mapping
* Multidimensional Scaling
* Projection Pursuit
* Principal Component Regression
* Partial Least Squares Discriminant Analysis
* Mixture Discriminant Analysis
* Quadratic Discriminant Analysis
* Regularized Discriminant Analysis
* Linear Discriminant Analysis
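
A PCA sketch for the dimensionality-reduction family: center the data and take the top right-singular vectors as the directions of greatest variance. NumPy and the synthetic correlated data are assumptions.

```python
# PCA via the singular value decomposition of centered data (NumPy assumed).
import numpy as np

def pca(X, n_components=2):
    Xc = X - X.mean(axis=0)                      # center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]               # directions of greatest variance
    return Xc @ components.T, components         # projected data, principal axes

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[:, 1] = 2.0 * X[:, 0] + 0.05 * rng.normal(size=200)   # make two features strongly correlated

Z, axes = pca(X, n_components=2)
print(Z.shape, axes.shape)  # (200, 2) (2, 5)
```
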
* Ensemble: **Models composed of multiple weaker models, independently trained, that provide a combined prediction.** (sketch below)
* Random Forest
* Gradient Boosting Machines (GBM)
* Boosting
* Bootstrapped Aggregation (Bagging)
* AdaBoost
* Stacked Generalization (Blending)
* Gradient Boosted Regression Trees
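
An ensemble sketch comparing a single decision tree against a random forest of bootstrapped trees, assuming scikit-learn and its iris dataset; the hyperparameters are illustrative.

```python
# One weak learner vs. an ensemble of bootstrapped trees (scikit-learn assumed).
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

single = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)  # many trees, combined vote

print(single.score(X_te, y_te), forest.score(X_te, y_te))  # held-out accuracy of each
```
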