
Fill in full machine learning topics list

master
Charles Reid 5 years ago
parent
commit
25c2f377fd
  1. 120
      README.md

@@ -10,11 +10,11 @@ software engineering and machine learning interviews and jobs.
Topics to review so you don't get weeded out.
[Five essential screening questions](https://sites.google.com/site/steveyegge2/five-essential-phone-screen-questions):
* Coding - writing simple code with correct syntax (C, C++, Java).
* Object Oriented Design - basic concepts, class models, patterns.
* Scripting and Regular Expressions - know your Unix tooling.
* Data Structures - demonstrate basic knowledge of common data structures.
* Bits and Bytes - know about bits, bytes, and binary numbers.
Things you absolutely, positively **must** know:
* Algorithm complexity
@@ -137,6 +137,116 @@ A much longer and fuller list of topics:
## Machine Learning
### Machine Learning: The Basics
Topics to review so you don't get weeded out.
* Supervised learning
* Unsupervised learning
* Semi-supervised learning
* Modeling business decisions usually uses supervised and unsupervised learning.
* Classification and regression are the most commonly seen machine learning models.
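
To make the supervised/unsupervised distinction above concrete, here is a minimal sketch in plain Python. It is an illustration rather than part of the original list: the function names `fit_line` and `two_means` are hypothetical, and the one-dimensional data and the choice of two clusters are assumptions made to keep the example short. The supervised function learns from labeled `(x, y)` pairs; the unsupervised one only sees unlabeled values and groups them.

```python
from statistics import mean


def fit_line(xs, ys):
    """Supervised: ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def two_means(values, iterations=10):
    """Unsupervised: 1-D k-means with k=2, seeded at the min and max value."""
    centers = [min(values), max(values)]
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            # Assign each value to whichever center is currently closer.
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers


if __name__ == "__main__":
    print(fit_line([0, 1, 2, 3], [1, 3, 5, 7]))
    print(two_means([1.0, 1.2, 0.8, 9.0, 9.5, 10.0]))
```

In practice a library such as scikit-learn would usually be used for both tasks; the hand-rolled versions are only meant to show what "labels" buy you.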
### Machine Learning: The Full Topics List
A longer, fuller list of topics:
* Regression
* **Models the relationship between variables, iteratively refined using an error measure.**
* Linear Regression
* Logistic Regression
* OLS (Ordinary Least Squares) Regression
* Stepwise Regression
* MARS (Multivariate Adaptive Regression Splines)
* LOESS (Locally Estimated Scatterplot Smoothing)
* Instance Based
* **Build up a database of example data and compare new data to it; also called winner-take-all or memory-based learning.**
* k-Nearest Neighbor
* Learning Vector Quantization
* Self-Organizing Map
* Locally Weighted Learning
* Regularization
* **Extension made to other methods, penalizes model complexity, favors simpler and more generalizable models.**
* Ridge Regression
* LASSO (Least Absolute Shrinkage and Selection Operator)
* Elastic Net
* LARS (Least Angle Regression)
* Decision Tree
* **Construct a model of decisions made on actual values of attributes in the data.**
* Classification and Regression Tree
* CHAID (Chi-Squared Automatic Interaction Detection)
* Conditional Decision Trees
* Bayesian
* **Methods explicitly applying Bayes' Theorem for classification and regression problems.**
* Naive Bayes
* Gaussian Naive Bayes
* Multinomial Naive Bayes
* Bayesian Network
* BBN (Bayesian Belief Network)
* Clustering
* **Centroid-based and hierarchical modeling approaches; groups of maximum commonality.**
* k-Means
* k-Medians
* Expectation Maximization
* Hierarchical Clustering
* Association Rule Algorithms
* **Extract rules that best explain relationships between variables in data.**
* Apriori algorithm
* Eclat algorithm
* Neural Networks
* **Inspired by the structure and function of biological neural networks; used for regression and classification problems.**
* Radial Basis Function Network (RBFN)
* Perceptron
* Back-Propagation
* Hopfield Network
* Deep Learning
* **Neural networks that exploit cheap and abundant computational power; semi-supervised, lots of data.**
* Convolutional Neural Network (CNN)
* Recurrent Neural Network (RNN)
* Long Short-Term Memory Network (LSTM)
* Deep Boltzmann Machine (DBM)
* Deep Belief Network (DBN)
* Stacked Auto-Encoders
* Dimensionality Reduction
* **Find inherent structure in data, in an unsupervised manner, to describe data using less information.**
* PCA
* t-SNE
* PLS (Partial Least Squares Regression)
* Sammon Mapping
* Multidimensional Scaling
* Projection Pursuit
* Principal Component Regression
* Partial Least Squares Discriminant Analysis
* Mixture Discriminant Analysis
* Quadratic Discriminant Analysis
* Regularized Discriminant Analysis
* Linear Discriminant Analysis
* Ensemble
* **Models composed of multiple weaker models, independently trained, that provide a combined prediction.**
* Random Forest
* Gradient Boosting Machines (GBM)
* Boosting
* Bootstrapped Aggregation (Bagging)
* AdaBoost
* Stacked Generalization (Blending)
* Gradient Boosted Regression Trees
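
As one concrete instance from the list above, the k-Nearest Neighbor entry under Instance Based can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: `knn_predict` is a hypothetical name, the training set is a plain list of `(point, label)` pairs, distance is Euclidean, and ties are broken by whichever label `Counter` encounters first.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query.

    train is a list of (point, label) pairs, where point is a tuple of numbers.
    """
    # "Memory-based": no model is fit; the stored examples are the model.
    nearest = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    examples = [
        ((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
        ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b"),
    ]
    print(knn_predict(examples, (0.5, 0.5)))
    print(knn_predict(examples, (5.5, 5.5)))
```

Sorting every training point for each query costs O(n log n); that is fine for a sketch, while larger datasets typically use a spatial index such as a k-d tree.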
