Study Plan
This repository contains checklists to prepare for software engineering and machine learning interviews and jobs.
Software Engineering
Software Engineering: The Basics
Topics to review so you don't get weeded out.
Five essential screening questions:
 Coding: writing simple code with correct syntax (C, C++, Java).
 Object-Oriented Design: basic concepts, class models, patterns.
 Scripting and Regular Expressions: know your Unix tooling.
 Data Structures: demonstrate basic knowledge of common data structures.
 Bits and Bytes: know about bits, bytes, and binary numbers.
Things you absolutely, positively must know:
 Algorithm complexity
 Sorting: know how to sort, and know at least two O(n log n) sort methods (merge sort, and quicksort in the average case).
 Hashtables: the most useful data structure known to humankind.
 Trees: this is basic stuff, BFS/DFS, so learn it.
 Graphs: twice as important as you think they are.
 Other Data Structures: fill up your brain with other data structures.
 Math: discrete math, combinatorics, probability.
 Systems: operating system level, concurrency, threads, processing, memory.
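
As one concrete instance of the sorting item above, here is a minimal merge sort sketch in Python. The function name `merge_sort` is chosen for illustration and is not part of this repository; treat it as a starting point for practice rather than a reference implementation.

```python
def merge_sort(items):
    """Return a new sorted list using top-down merge sort, O(n log n)."""
    if len(items) <= 1:
        return list(items)

    # Split in half and sort each half recursively.
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])

    # Merge the two sorted halves; "<=" keeps equal items in order (stable).
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged
```

Calling `merge_sort([5, 2, 9, 1, 5, 6])` returns a new list and leaves the input untouched.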
Software Engineering: The Full Topics List
A much longer and fuller list of topics:

Algorithm complexity

Data structures
 Arrays
 Linked lists
 Stacks
 Queues
 Hash tables
 Trees
 Binary search trees
 Heap trees
 Priority queues
 Balanced search trees
 Tree traversal: preorder, inorder, postorder, BFS, DFS
 Graphs
 Directed
 Undirected
 Adjacency matrix
 Adjacency list
 BFS, DFS
 Built-in Data Structures
 Java Collections
 C++ Standard Library
 Sets
 Disjoint Sets
 Union Find
 Advanced Tree Structures
 Red-Black Trees
 Splay Trees
 AVL Trees
 k-d Trees
 Van Emde Boas Trees
 N-ary, K-ary, M-ary Trees
 Balanced Search Trees
 2-3 Trees, 2-4 Trees
 Augmented Data Structures

Algorithms
 NP, NP-Complete, Approximation Algorithms
 Searching
 Sequential search
 Binary search
 Sorting
 Selection
 Insertion
 Heapsort
 Quicksort
 Merge sort
 String algorithms
 String search methods
 String manipulation methods
 Recursion
 Dynamic programming
 Computational Geometry
 Convex Hull

Object Oriented Programming
 Design patterns

Bits and Bytes

Mathematics
 Combinatorics
 Probability
 Linear Algebra
 FFT
 Bloom Filter
 HyperLogLog

Crypto and Security
 Information Theory
 Parity and Hamming Code
 Entropy
 Hash Attacks

Unix
 Kernel Basics
 Command Line Tools
 Emacs/Vim

Systems Level Programming
 Processing and threads
 Caching
 Memory
 System routines
 Messaging Systems
 Serialization
 Queue Systems

Scaling
 Parallel Programming
 Systems Design
 Scalability
 Data Handling

Supplemental topics
 Unicode
 Endianness
 Networking
 Compilers
 Compression
 Garbage Collection
Machine Learning
Machine Learning: The Basics
Topics to review so you don't get weeded out.
 Supervised learning
 Unsupervised learning
 Semi-supervised learning
 Modeling business decisions usually uses supervised and unsupervised learning.
 Classification and regression are the most commonly seen machine learning models.
Machine Learning: The Full Topics List
A longer, fuller list of topics:

Regression
 Models the relationship between variables, iteratively refined using an error measure.
 Linear Regression
 Logistic Regression
 OLS (Ordinary Least Squares) Regression
 Stepwise Regression
 MARS (Multivariate Adaptive Regression Splines)
 LOESS (Locally Estimated Scatterplot Smoothing)
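
To make "iteratively refined using an error measure" concrete, the following is a minimal sketch of fitting a line by gradient descent on mean squared error. The function name `fit_line`, the learning rate, and the epoch count are assumptions chosen for this example, not values from this repository.

```python
def fit_line(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y ~ slope * x + intercept by gradient descent on mean squared error."""
    slope, intercept = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Error of the current model on every training point.
        errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to each parameter.
        grad_slope = (2 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_intercept = (2 / n) * sum(errors)
        # Step against the gradient to reduce the error.
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


# Points that lie exactly on y = 2x + 1.
slope, intercept = fit_line([0, 1, 2, 3, 4], [1, 3, 5, 7, 9])
```

On this toy data the fitted parameters approach a slope of 2 and an intercept of 1. A learning rate that is too large for the data's scale would make the updates diverge, which is worth trying as an exercise.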

Instance Based
 Build up a database of example data and compare new data to it; also called winner-take-all or memory-based learning.
 k-Nearest Neighbors
 Learning Vector Quantization
 Self-Organizing Map
 Locally Weighted Learning
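
The "compare new data to the database" idea can be sketched with a small k-nearest-neighbors classifier. The function name `knn_predict` and the toy points are made up for this example; `math.dist` requires Python 3.8 or later.

```python
import math
from collections import Counter


def knn_predict(training, query, k=3):
    """Label `query` by majority vote among its k nearest training points.

    `training` is a list of (point, label) pairs, where each point is a
    tuple of numbers with the same length as `query`.
    """
    # Sort the stored examples by Euclidean distance to the query.
    by_distance = sorted(training, key=lambda pair: math.dist(pair[0], query))
    # Count the labels of the k closest examples and return the most common.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Toy data set: two well-separated groups.
training = [
    ((1, 1), "a"), ((1, 2), "a"), ((2, 1), "a"),
    ((8, 8), "b"), ((8, 9), "b"), ((9, 8), "b"),
]
```

Here `knn_predict(training, (2, 2))` votes among the three points near the origin, while `knn_predict(training, (7, 7))` votes among the other group. There is no training step: all the work happens at query time.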

Regularization
 Extension made to other methods, penalizes model complexity, favors simpler and more generalizable models.
 Ridge Regression
 LASSO (Least Absolute Shrinkage and Selection Operator)
 Elastic Net
 LARS (Least Angle Regression)
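
As a small illustration of how a penalty favors simpler models, here is a sketch of ridge regression in the simplest possible setting: one feature and no intercept, where the closed-form solution is a single division. The function name `ridge_slope` and the data are assumptions for this example.

```python
def ridge_slope(xs, ys, penalty):
    """Closed-form ridge fit for y ~ w * x (one feature, no intercept).

    Minimizes sum((y - w * x)^2) + penalty * w^2, whose solution is
    w = sum(x * y) / (sum(x * x) + penalty).
    """
    numerator = sum(x * y for x, y in zip(xs, ys))
    denominator = sum(x * x for x in xs) + penalty
    return numerator / denominator


xs, ys = [1, 2, 3], [2, 4, 6]          # points on y = 2x
unpenalized = ridge_slope(xs, ys, 0)   # 28 / 14 = 2.0, ordinary least squares
shrunk = ridge_slope(xs, ys, 14)       # 28 / 28 = 1.0, weight pulled toward zero
```

Increasing `penalty` shrinks the weight toward zero, trading some fit on the training data for a simpler model.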

Decision Tree
 Construct a model of decisions made on actual values of attributes in the data.
 Classification and Regression Tree
 CHAID (Chi-Squared Automatic Interaction Detection)
 Conditional Decision Trees

Bayesian
 Methods explicitly applying Bayes' Theorem for classification and regression problems.
 Naive Bayes
 Gaussian Naive Bayes
 Multinomial Naive Bayes
 Bayesian Network
 BBN (Bayesian Belief Network)

Clustering
 Centroid-based and hierarchical modeling approaches; groups of maximum commonality.
 k-Means
 k-Medians
 Expectation Maximization
 Hierarchical Clustering
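
The centroid-based approach can be sketched with k-means on one-dimensional data. This is a minimal version of Lloyd's algorithm with caller-supplied starting centroids; the function name `k_means_1d` and the fixed iteration count are choices made for this example.

```python
def k_means_1d(values, centroids, iterations=10):
    """Cluster numbers around centroids by alternating assign and update steps."""
    centroids = list(centroids)
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for value in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(value - centroids[i]))
            clusters[nearest].append(value)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids


centers = k_means_1d([1, 2, 3, 10, 11, 12], centroids=[0, 5])
```

On this data the centroids settle at the means of the two groups, 2.0 and 11.0. Real implementations usually pick starting centroids randomly and stop when assignments no longer change.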

Association Rule Algorithms
 Extract rules that best explain relationships between variables in data.
 Apriori algorithm
 Eclat algorithm

Neural Networks
 Inspired by the structure and function of biological neural networks; used for regression and classification problems.
 Radial Basis Function Network (RBFN)
 Perceptron
 Back-Propagation
 Hopfield Network
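
The perceptron is the simplest item on this list to implement by hand. Below is a minimal sketch of the classic perceptron learning rule trained on the logical AND function; the function names, epoch count, and learning rate are assumptions for this example.

```python
def predict(weights, bias, inputs):
    """Step activation: output 1 if the weighted sum is positive, else 0."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(samples, epochs=20, learning_rate=1):
    """Train on (inputs, target) pairs using the perceptron update rule."""
    weights = [0] * len(samples[0][0])
    bias = 0
    for _ in range(epochs):
        for inputs, target in samples:
            # error is 0 when correct, +1 or -1 when the prediction is wrong.
            error = target - predict(weights, bias, inputs)
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Logical AND is linearly separable, so the rule converges.
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_gate)
```

After training, `predict(weights, bias, (1, 1))` returns 1 and the other three inputs return 0. A single perceptron cannot learn XOR, which is a useful follow-up exercise before moving on to back-propagation.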

Deep Learning
 Neural networks that exploit cheap and abundant computational power; semi-supervised, lots of data.
 Convolutional Neural Network (CNN)
 Recurrent Neural Network (RNN)
 Long Short-Term Memory Network (LSTM)
 Deep Boltzmann Machine (DBM)
 Deep Belief Network (DBN)
 Stacked Auto-Encoders

Dimensionality Reduction
 Find inherent structure in data, in an unsupervised manner, to describe data using less information.
 PCA
 t-SNE
 PLS (Partial Least Squares Regression)
 Sammon Mapping
 Multidimensional Scaling
 Projection Pursuit
 Principal Component Regression
 Partial Least Squares Discriminant Analysis
 Mixture Discriminant Analysis
 Quadratic Discriminant Analysis
 Regularized Discriminant Analysis
 Linear Discriminant Analysis

Ensemble
 Models composed of multiple weaker models, independently trained, that provide a combined prediction.
 Random Forest
 Gradient Boosting Machines (GBM)
 Boosting
 Bootstrapped Aggregation (Bagging)
 AdaBoost
 Stacked Generalization (Blending)
 Gradient Boosted Regression Trees
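
The "combined prediction" part of an ensemble can be sketched with a majority vote. The three classifiers below are hand-written rules standing in for independently trained weak models; the feature names (`links`, `caps_ratio`, `length`) and thresholds are hypothetical and exist only for this example.

```python
from collections import Counter


def majority_vote(models, sample):
    """Return the label predicted by the largest number of models."""
    votes = Counter(model(sample) for model in models)
    return votes.most_common(1)[0][0]


# Hypothetical weak classifiers, each looking at a single feature.
models = [
    lambda s: "spam" if s["links"] > 3 else "ham",
    lambda s: "spam" if s["caps_ratio"] > 0.5 else "ham",
    lambda s: "spam" if s["length"] < 20 else "ham",
]

message = {"links": 5, "caps_ratio": 0.7, "length": 100}
label = majority_vote(models, message)  # two of three models vote "spam"
```

Bagging and random forests follow the same pattern, except that each model is trained on a different bootstrap sample of the data instead of being written by hand.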
Daily Plan
Each day:
 Pick one subject from the list.
 Watch videos on the topic.
 Implement the concept in Java or Python.
 Optionally, implement in C (and/or in C++, with or without the stdlib).
 Write tests to ensure code is correct.
 Practice until you are sick of it.
 Work within limited constraints (think interviews).
 Know the built-in types.
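
As a sketch of what one day's "implement, then test" step might look like, here is binary search with a few plain assertions covering the usual edge cases. The function name and the chosen cases are illustrative only.

```python
def binary_search(sorted_items, target):
    """Return the index of target in sorted_items, or -1 if it is absent."""
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            return mid
        if sorted_items[mid] < target:
            low = mid + 1   # target can only be in the right half
        else:
            high = mid - 1  # target can only be in the left half
    return -1


# Tests: empty input, first and last positions, and a missing value.
assert binary_search([], 3) == -1
assert binary_search([1, 3, 5, 7], 1) == 0
assert binary_search([1, 3, 5, 7], 7) == 3
assert binary_search([1, 3, 5, 7], 4) == -1
```

Writing the edge-case assertions before the implementation is a good habit to rehearse, since interviewers often ask which inputs you would test.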
Code:
Practice writing out on a whiteboard and/or on paper, before implementing on computer. Get a big drawing pad from the art store.