

@@ -10,11 +10,11 @@ software engineering and machine learning interviews and jobs.



Topics to review so you don't get weeded out. 







[Five essential screening questions](https://sites.google.com/site/steveyegge2/fiveessentialphonescreenquestions): 



* Coding - writing simple code with correct syntax (C, C++, Java).



* Object Oriented Design - basic concepts, class models, patterns.



* Scripting and Regular Expressions - know your Unix tooling.



* Data Structures - demonstrate basic knowledge of common data structures.



* Bits and Bytes - know about bits, bytes, and binary numbers.










Things you absolutely, positively **must** know: 



* Algorithm complexity 


@@ -137,6 +137,116 @@ A much longer and fuller list of topics:







## Machine Learning 







### Machine Learning: The Basics 







Topics to review so you don't get weeded out. 



* Supervised learning 



* Unsupervised learning 



* Semi-supervised learning



* Modeling business decisions usually uses supervised and unsupervised learning. 



* Classification and regression are the most commonly seen machine learning tasks.







### Machine Learning: The Full Topics List 







A longer, fuller list of topics: 







* Regression 



* **Modeling the relationship between variables, iteratively refined using an error measure.**



* Linear Regression 



* Logistic Regression 



* OLS (Ordinary Least Squares) Regression 



* Stepwise Regression 



* MARS (Multivariate Adaptive Regression Splines) 



* LOESS (Locally Estimated Scatterplot Smoothing) 
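
As a rough illustration of "iteratively refined using an error measure", the sketch below fits a straight line by gradient descent on mean squared error. The function name `fit_line`, the learning rate, the step count, and the toy data are all assumptions made for this example, not part of the original list.

```python
def fit_line(xs, ys, lr=0.05, steps=2000):
    """Fit y ~ w*x + b by gradient descent on mean squared error (MSE)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Residuals of the current model on every training point.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of MSE with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Step against the gradient to reduce the error.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Illustrative data generated from y = 2x + 1 (hypothetical).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))
```

Ordinary least squares reaches the same line in closed form; the iterative version is shown only because it makes the role of the error measure explicit.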







* Instance Based 



* **Build up a database of example data, compare new data to the database; also called winner-take-all or memory-based learning.**



* k-Nearest Neighbor



* Learning Vector Quantization



* Self-Organizing Map



* Locally Weighted Learning
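
A minimal sketch of the memory-based idea, assuming a k-nearest-neighbor classifier with Euclidean distance and majority voting. The name `knn_predict` and the toy points are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """train is a list of (features, label); return the majority label of the k nearest."""
    # The "database" is the training set itself; prediction is a lookup plus a vote.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Illustrative data: two well-separated groups (hypothetical).
train = [
    ((1, 1), "a"), ((1, 2), "a"), ((2, 1), "a"),
    ((8, 8), "b"), ((9, 8), "b"), ((8, 9), "b"),
]
print(knn_predict(train, (2, 2)))
print(knn_predict(train, (7, 7)))
```

Sorting the whole training set is fine for a sketch; larger data sets usually call for a spatial index instead.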







* Regularization 



* **Extension made to other methods, penalizes model complexity, favors simpler and more generalizable models.** 



* Ridge Regression 



* LASSO (Least Absolute Shrinkage and Selection Operator) 



* Elastic Net 



* LARS (Least Angle Regression) 
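
To make "penalizes model complexity" concrete, here is a sketch of ridge regression in the simplest possible setting: one feature and no intercept, where minimizing `sum((y - w*x)**2) + lam * w**2` has the closed form `w = sum(x*y) / (sum(x*x) + lam)`. The name `ridge_slope` and the data are assumptions for illustration.

```python
def ridge_slope(xs, ys, lam):
    """Closed-form ridge solution for y ~ w*x with an L2 penalty lam * w**2."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    # A larger lam grows the denominator and shrinks the coefficient toward zero.
    return sxy / (sxx + lam)


# Illustrative data generated from y = 2x (hypothetical).
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
for lam in (0.0, 14.0, 100.0):
    print(lam, ridge_slope(xs, ys, lam))
```

With `lam = 0` this reduces to ordinary least squares; increasing `lam` trades training fit for a smaller coefficient.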







* Decision Tree 



* **Construct a model of decisions made on actual values of attributes in the data.** 



* Classification and Regression Tree 



* CHAID (Chi-Squared Automatic Interaction Detection)



* Conditional Decision Trees 
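
A decision tree is grown by repeatedly choosing a split on an attribute value. The sketch below shows one such choice for a single numeric attribute, scoring candidate thresholds by weighted Gini impurity (the criterion commonly associated with CART). The helper names `gini` and `best_split` and the data are hypothetical.

```python
def gini(labels):
    """Gini impurity: 0.0 for a pure group, larger for mixed groups."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(rows):
    """rows is a list of (value, label); return (threshold, weighted_gini) or None."""
    best = None
    values = sorted({v for v, _ in rows})
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [label for v, label in rows if v <= threshold]
        right = [label for v, label in rows if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
        if best is None or score < best[1]:
            best = (threshold, score)
    return best


# Illustrative data (hypothetical).
rows = [(1, "no"), (2, "no"), (3, "no"), (7, "yes"), (8, "yes"), (9, "yes")]
print(best_split(rows))
```

A full tree would apply `best_split` recursively to the left and right groups until a stopping rule is met.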







* Bayesian 



* **Methods explicitly applying Bayes' Theorem for classification and regression problems.** 



* Naive Bayes 



* Gaussian Naive Bayes 



* Multinomial Naive Bayes 



* Bayesian Network



* BBN (Bayesian Belief Network) 
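
The methods above all rest on Bayes' Theorem, P(H | E) = P(E | H) P(H) / P(E). A small sketch for a binary hypothesis follows; the function name `posterior` and the numbers are illustrative assumptions.

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(H | E) for a binary hypothesis H given evidence E.

    prior               = P(H)
    likelihood          = P(E | H)
    false_positive_rate = P(E | not H)
    """
    # Total probability of observing the evidence.
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


# Illustrative numbers: a rare condition and a fairly accurate test (hypothetical).
p = posterior(prior=0.01, likelihood=0.9, false_positive_rate=0.05)
print(round(p, 4))
```

Naive Bayes applies the same rule per class, with the extra assumption that features are conditionally independent given the class.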







* Clustering 



* **Centroid-based and hierarchical modeling approaches; groups of maximum commonality.**



* k-Means



* k-Medians



* Expectation Maximization 



* Hierarchical Clustering 
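
As a sketch of the centroid-based approach, the code below runs Lloyd's algorithm for k-means: assign each point to its nearest centroid, then move each centroid to the mean of its points. The starting centroids are passed in explicitly to keep the example deterministic; that choice, the name `kmeans`, and the data are assumptions.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm: alternate the assignment step and the centroid update."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assignment step: attach each point to its nearest centroid.
            idx = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[idx].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters


# Illustrative data: two well-separated groups (hypothetical).
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(0, 0), (10, 10)])
print(centroids)
```

In practice the initial centroids are usually chosen at random or with a seeding scheme, and the result can depend on that choice.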







* Association Rule Algorithms 



* **Extract rules that best explain relationships between variables in data.** 



* Apriori algorithm 



* Eclat algorithm 
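
Association rule miners such as Apriori and Eclat search for itemsets and rules that score well on measures like support and confidence. The sketch below computes only those two measures, not the search itself; the function names and the shopping-basket data are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Illustrative transactions (hypothetical).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
print(support(transactions, {"bread", "milk"}))
print(confidence(transactions, {"bread"}, {"milk"}))
```

Note that `confidence` divides by the antecedent's support, so it assumes the antecedent appears in at least one transaction.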







* Neural Networks 



* **Inspired by the structure and function of biological neural networks, used for regression and classification problems.**



* Radial Basis Function Network (RBFN) 



* Perceptron 



* Back-Propagation



* Hopfield Network 
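
The perceptron is the simplest unit in this family: a weighted sum followed by a threshold, trained by nudging the weights whenever a prediction is wrong. The sketch below learns the logical AND function; the function names, epoch count, and learning rate are assumptions for illustration.

```python
def train_perceptron(samples, epochs=20, lr=1.0):
    """Learn weights for a single threshold unit with the perceptron rule."""
    w = [0.0] * len(samples[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, target in samples:
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            predicted = 1 if activation > 0 else 0
            error = target - predicted  # -1, 0, or +1
            # The update is a no-op when the prediction is already correct.
            w = [wi + lr * error * xi for wi, xi in zip(w, x)]
            b += lr * error
    return w, b


def predict(w, b, x):
    """Threshold unit: output 1 when the weighted sum plus bias is positive."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0


# Logical AND is linearly separable, so the perceptron rule converges on it.
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(and_gate)
print([predict(w, b, x) for x, _ in and_gate])
```

A single perceptron cannot represent functions that are not linearly separable, such as XOR; multi-layer networks trained with back-propagation address that limit.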







* Deep Learning 



* **Neural networks that exploit cheap and abundant computational power; semi-supervised, lots of data.**



* Convolutional Neural Network (CNN) 



* Recurrent Neural Network (RNN) 



* Long Short-Term Memory Network (LSTM)



* Deep Boltzmann Machine (DBM) 



* Deep Belief Network (DBN) 



* Stacked Auto-Encoders







* Dimensionality Reduction 



* **Find inherent structure in data, in an unsupervised manner, to describe data using less information.** 



* PCA 



* t-SNE



* PLS (Partial Least Squares Regression) 



* Sammon Mapping 



* Multidimensional Scaling 



* Projection Pursuit 



* Principal Component Regression 



* Partial Least Squares Discriminant Analysis 



* Mixture Discriminant Analysis 



* Quadratic Discriminant Analysis 



* Regularized Discriminant Analysis 



* Linear Discriminant Analysis 
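
As one example of describing data with less information, the sketch below finds the first principal component (the direction of greatest variance) by power iteration on the sample covariance matrix. This is a simplified stand-in for a full PCA; the name `first_principal_component`, the fixed start vector, and the data are assumptions, and the method as written fails if the start vector happens to be orthogonal to the leading component.

```python
import math


def first_principal_component(points, iterations=100):
    """Unit vector along the direction of greatest variance, via power iteration."""
    n = len(points)
    dims = len(points[0])
    # Center the data so the covariance is measured around the mean.
    means = [sum(p[d] for p in points) / n for d in range(dims)]
    centered = [[p[d] - means[d] for d in range(dims)] for p in points]
    # Sample covariance matrix.
    cov = [
        [sum(row[i] * row[j] for row in centered) / (n - 1) for j in range(dims)]
        for i in range(dims)
    ]
    # Repeatedly multiply by the covariance matrix and renormalize.
    v = [1.0] + [0.0] * (dims - 1)
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(dims)) for i in range(dims)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Illustrative data lying exactly on the line y = x (hypothetical).
points = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0)]
component = first_principal_component(points)
print(component)
```

Projecting each centered point onto this vector reduces the two-dimensional data to a single coordinate with no loss here, because the points lie on one line.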







* Ensemble 



* **Models composed of multiple weaker models, independently trained, that provide a combined prediction.** 



* Random Forest 



* Gradient Boosting Machines (GBM) 



* Boosting 



* Bootstrapped Aggregation (Bagging) 



* AdaBoost 



* Stacked Generalization (Blending) 



* Gradient Boosted Regression Trees 
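
A rough sketch of bagging, one of the ensemble methods listed above: each weak model is trained independently on a bootstrap sample, and the ensemble predicts by majority vote. The weak learner here is a hypothetical one-dimensional threshold rule that assumes class 1 has the larger values, and bootstrap samples missing a class are redrawn as a simplification; all names and data are illustrative.

```python
import random
from collections import Counter


def train_stump(sample):
    """Weak learner: threshold halfway between the two class means.

    Assumes labels are 0 and 1 and that class 1 tends to have larger x.
    """
    xs0 = [x for x, label in sample if label == 0]
    xs1 = [x for x, label in sample if label == 1]
    threshold = (sum(xs0) / len(xs0) + sum(xs1) / len(xs1)) / 2
    return lambda x: 1 if x > threshold else 0


def bagging(data, n_models=25, seed=0):
    """Train n_models stumps on bootstrap samples; predict by majority vote."""
    rng = random.Random(seed)
    models = []
    while len(models) < n_models:
        # Bootstrap sample: draw len(data) items with replacement.
        sample = [rng.choice(data) for _ in data]
        # Simplification: redraw if one class is missing from the sample.
        if len({label for _, label in sample}) < 2:
            continue
        models.append(train_stump(sample))

    def predict(x):
        votes = Counter(model(x) for model in models)
        return votes.most_common(1)[0][0]

    return predict


# Illustrative data: class 0 on 0..4, class 1 on 6..10 (hypothetical).
data = [(float(x), 0) for x in range(5)] + [(float(x), 1) for x in range(6, 11)]
predict = bagging(data)
print(predict(1.0), predict(9.0))
```

Random Forest follows the same pattern with decision trees as the weak learners plus random feature selection, while boosting methods train their models sequentially rather than independently.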














































