Machine Learning

Machine Learning: The Basics

Topics to review so you don't get weeded out.

  • Supervised learning
  • Unsupervised learning
  • Semi-supervised learning
  • Modeling business decisions typically relies on supervised and unsupervised learning.
  • Classification and regression are the most common types of machine learning models.

Machine Learning: The Full Topics List

List of topics focusing on theoretical components:

  • Supervised Learning

    • VC Dimension
    • PAC Learning
    • Noise
    • Regression
    • Model Selection
    • Dimensions
  • Bayesian Decision Theory

    • Classification
    • Losses and Risks
    • Discriminant Functions
    • Utility Theory
    • Association Rules
  • Parametric Models

    • Maximum Likelihood Estimation
    • Building Estimators
    • Bayes Estimator
    • Parametric Classification
    • Regression
    • Tuning Model Complexity
    • Model Selection
  • Multivariate Methods

    • Multivariate Data
    • Parameter Estimation
    • Missing Values
    • Multivariate Normal
    • Multivariate Classification
    • Tuning Complexity
    • Discrete Features
    • Multivariate Regression
  • Dimensionality Reduction

    • Subset Selection
    • PCA
    • Factor Analysis
    • Multidimensional Scaling
    • LDA
    • Isomap
    • Local Linear Embedding
  • Clustering

    • Mixture Densities
    • k-Means
    • Expectation Maximization Algorithm
    • Mixtures of Latent Variables
    • Supervised Learning after Clustering
    • Hierarchical Clustering
    • Number of Clusters
  • Nonparametric Methods

  • Decision Trees

  • Linear discrimination

  • Multilayer Perceptrons

  • Local Models

  • Kernel Machines

  • Bayesian Estimation

  • Hidden Markov Models

  • Graphical Models

  • Combining Multiple Learners

  • Reinforcement Learning

  • Design and Analysis of Machine Learning Experiments

A fuller list of model types under each sub-heading:

  • Regression: Modeling the relationship between variables, iteratively refined using an error measure.

    • Linear Regression
    • Logistic Regression
    • OLS (Ordinary Least Squares) Regression
    • Stepwise Regression
    • MARS (Multivariate Adaptive Regression Splines)
    • LOESS (Locally Estimated Scatterplot Smoothing)
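
    As a concrete starting point for the regression entries above, here is a minimal sketch of the two most common members of this family. It assumes scikit-learn and NumPy are installed; the synthetic data and coefficients are illustrative only.

    ```python
    import numpy as np
    from sklearn.linear_model import LinearRegression, LogisticRegression

    rng = np.random.default_rng(0)

    # Ordinary least squares: recover y = 3x + 1 from noisy samples.
    X = rng.uniform(-1, 1, size=(200, 1))
    y = 3 * X[:, 0] + 1 + rng.normal(scale=0.1, size=200)
    ols = LinearRegression().fit(X, y)
    print("coef ~3, intercept ~1:", ols.coef_, ols.intercept_)

    # Logistic regression: classify points by the sign of the same feature.
    labels = (X[:, 0] > 0).astype(int)
    clf = LogisticRegression().fit(X, labels)
    print("training accuracy:", clf.score(X, labels))
    ```
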
  • Instance Based: Build up a database of training examples and compare new data against it; also called winner-take-all or memory-based learning.

    • k-Nearest Neighbor
    • Learning Vector Quantization
    • Self-Organizing Map
    • Locally Weighted Learning
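
    The k-nearest-neighbor sketch below illustrates the memory-based idea: no model is fit beyond storing the training set, and prediction is a vote among the closest stored points. It assumes scikit-learn is installed; the iris data and k=5 are arbitrary choices.

    ```python
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # "Training" just stores the examples; prediction compares a new point
    # against the stored database and takes a majority vote of its 5 nearest.
    knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
    print("test accuracy:", knn.score(X_test, y_test))
    ```
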
  • Regularization: An extension of other methods that penalizes model complexity, favoring simpler and more generalizable models.

    • Ridge Regression
    • LASSO (Least Absolute Shrinkage and Selection Operator)
    • Elastic Net
    • LARS (Least Angle Regression)
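
    A small sketch of how regularization penalizes complexity, assuming scikit-learn and NumPy: both models fit the same noisy data, but the L1 penalty in Lasso drives irrelevant coefficients toward exactly zero while Ridge only shrinks them. The data and alpha values are illustrative.

    ```python
    import numpy as np
    from sklearn.linear_model import Ridge, Lasso

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 10))
    # Only the first two of the ten features actually matter.
    y = 2 * X[:, 0] - 3 * X[:, 1] + rng.normal(scale=0.5, size=100)

    ridge = Ridge(alpha=1.0).fit(X, y)   # L2 penalty: shrinks all coefficients
    lasso = Lasso(alpha=0.1).fit(X, y)   # L1 penalty: zeroes out irrelevant ones

    print("ridge coefficients:", np.round(ridge.coef_, 2))
    print("lasso coefficients:", np.round(lasso.coef_, 2))
    ```
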
  • Decision Tree: Construct a model of decisions made on actual values of attributes in the data.

    • Classification and Regression Tree
    • CHAID (Chi-Squared Automatic Interaction Detection)
    • Conditional Decision Trees
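
    A minimal CART-style example, assuming scikit-learn: the tree learns axis-aligned splits on actual attribute values, and export_text prints the decisions it made. The dataset and depth limit are illustrative.

    ```python
    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier, export_text

    data = load_iris()

    # CART: recursively split on the attribute/threshold that best separates classes.
    tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(data.data, data.target)
    print(export_text(tree, feature_names=list(data.feature_names)))
    ```
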
  • Bayesian: Methods explicitly applying Bayes' Theorem for classification and regression problems.

    • Naive Bayes
    • Gaussian Naive Bayes
    • Multinomial Naive Bayes
    • Bayesian Network
    • BBN (Bayesian Belief Network)
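
    A Gaussian naive Bayes sketch, assuming scikit-learn: the model applies Bayes' theorem with per-class Gaussian likelihoods and a conditional-independence assumption over features. The iris data is an arbitrary choice.

    ```python
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import GaussianNB

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Each feature gets a per-class Gaussian; prediction multiplies likelihoods
    # by class priors (Bayes' theorem) and picks the most probable class.
    nb = GaussianNB().fit(X_train, y_train)
    print("test accuracy:", nb.score(X_test, y_test))
    print("posterior for first test point:", nb.predict_proba(X_test[:1]))
    ```
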
  • Clustering: Centroid-based and hierarchical modeling approaches that organize data into groups of maximum commonality.

    • k-Means
    • k-Medians
    • Expectation Maximization
    • Hierarchical Clustering
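
    A centroid-based clustering sketch, assuming scikit-learn: k-means alternates between assigning points to the nearest centroid and recomputing centroids, the hard-assignment analogue of expectation maximization. The three synthetic blobs are illustrative.

    ```python
    from sklearn.cluster import KMeans
    from sklearn.datasets import make_blobs

    X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

    # Alternate: assign each point to its nearest centroid, then move each
    # centroid to the mean of its assigned points, until assignments stabilize.
    km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
    print("centroids:\n", km.cluster_centers_)
    print("first ten assignments:", km.labels_[:10])
    ```
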
  • Association Rule Algorithms: Extract rules that best explain relationships between variables in data.

    • Apriori algorithm
    • Eclat algorithm
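
    Neither Apriori nor Eclat ships with scikit-learn, so the sketch below only computes the two quantities both algorithms are built on, support and confidence, for a single candidate rule over a toy transaction list. The transactions and the rule are made up for illustration.

    ```python
    transactions = [
        {"bread", "milk"},
        {"bread", "butter"},
        {"bread", "milk", "butter"},
        {"milk", "butter"},
        {"bread", "milk", "butter", "eggs"},
    ]

    def support(itemset):
        """Fraction of transactions containing every item in the itemset."""
        return sum(itemset <= t for t in transactions) / len(transactions)

    # Candidate rule: {bread, milk} -> {butter}
    antecedent, consequent = {"bread", "milk"}, {"butter"}
    confidence = support(antecedent | consequent) / support(antecedent)
    print("support:", support(antecedent | consequent))
    print("confidence:", confidence)
    ```
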
  • Neural Networks: Inspired by the structure and function of biological neural networks; used for regression and classification problems.

    • Radial Basis Function Network (RBFN)
    • Perceptron
    • Back-Propagation
    • Hopfield Network
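
    A small feed-forward network trained with back-propagation, assuming scikit-learn; the moons dataset and layer size are illustrative choices. (RBF and Hopfield networks are not part of scikit-learn and are omitted here.)

    ```python
    from sklearn.datasets import make_moons
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    X, y = make_moons(n_samples=500, noise=0.2, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # One hidden layer of 16 units; weights are learned by back-propagating
    # the classification error through the network.
    mlp = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
    mlp.fit(X_train, y_train)
    print("test accuracy:", mlp.score(X_test, y_test))
    ```
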
  • Deep Learning: Neural networks that exploit cheap and abundant computation; typically semi-supervised and trained on large amounts of data.

    • Convolutional Neural Network (CNN)
    • Recurrent Neural Network (RNN)
    • Long Short-Term Memory Network (LSTM)
    • Deep Boltzmann Machine (DBM)
    • Deep Belief Network (DBN)
    • Stacked Auto-Encoders
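
    As one concrete instance of this family, here is a minimal convolutional network written with PyTorch (an assumption; any deep learning framework works), showing only the layer stack and a single forward pass on random image-shaped input rather than a full training loop.

    ```python
    import torch
    from torch import nn

    # A tiny CNN for 28x28 single-channel images (MNIST-sized input).
    model = nn.Sequential(
        nn.Conv2d(1, 8, kernel_size=3),   # 1x28x28 -> 8x26x26
        nn.ReLU(),
        nn.MaxPool2d(2),                  # 8x26x26 -> 8x13x13
        nn.Flatten(),
        nn.Linear(8 * 13 * 13, 10),       # 10 class scores
    )

    x = torch.randn(4, 1, 28, 28)         # a random batch of 4 "images"
    print(model(x).shape)                 # torch.Size([4, 10])
    ```
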
  • Dimensionality Reduction: Find inherent structure in data, in an unsupervised manner, to describe data using less information.

    • PCA
    • t-SNE
    • PLS (Partial Least Squares Regression)
    • Sammon Mapping
    • Multidimensional Scaling
    • Projection Pursuit
    • Principal Component Regression
    • Partial Least Squares Discriminant Analysis
    • Mixture Discriminant Analysis
    • Quadratic Discriminant Analysis
    • Regularized Discriminant Analysis
    • Linear Discriminant Analysis
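
    A PCA sketch, assuming scikit-learn: the four iris features are projected onto the two directions of maximum variance, an unsupervised way of describing the data with less information, as noted above.

    ```python
    from sklearn.datasets import load_iris
    from sklearn.decomposition import PCA

    X, _ = load_iris(return_X_y=True)

    # Project the 4 original features onto the 2 directions of maximum variance.
    pca = PCA(n_components=2)
    X_2d = pca.fit_transform(X)
    print("reduced shape:", X_2d.shape)
    print("variance explained:", pca.explained_variance_ratio_)
    ```
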
  • Ensemble: Models composed of multiple weaker models, independently trained, that provide a combined prediction.

    • Random Forest
    • Gradient Boosting Machines (GBM)
    • Boosting
    • Bootstrapped Aggregation (Bagging)
    • AdaBoost
    • Stacked Generalization (Blending)
    • Gradient Boosted Regression Trees
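
    An ensemble sketch, assuming scikit-learn: a random forest averages many independently trained (bagged) decision trees and typically beats any single tree. The synthetic dataset and forest size are illustrative.

    ```python
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    single = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
    forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

    print("single tree accuracy:", single.score(X_test, y_test))
    print("random forest accuracy:", forest.score(X_test, y_test))
    ```
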