This repository contains checklists to prepare for software engineering and machine learning interviews and jobs.
Software Engineering: The Basics
Topics to review so you don't get weeded out.
- Coding - writing simple code with correct syntax (C, C++, Java).
- Object Oriented Design - basic concepts, class models, patterns.
- Scripting and Regular Expressions - know your Unix tooling.
- Data Structures - demonstrate basic knowledge of common data structures.
- Bits and Bytes - know about bits, bytes, and binary numbers.
Things you absolutely, positively must know:
- Algorithm complexity
- Sorting - know how to sort, know at least 2 O(n log n) sort methods (merge sort and quicksort)
- Hashtables - the most useful data structure known to humankind.
- Trees - this is basic stuff, BFS/DFS, so learn it.
- Graphs - twice as important as you think they are.
- Other Data Structures - fill up your brain with other data structures.
- Math - discrete math, combinatorics, probability.
- Systems - operating system level, concurrency, threads, processing, memory.
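As a rough illustration of the sorting item above, here is a minimal sketch of the two sorts it names. Both functions return new lists rather than sorting in place, which is a simplification chosen for readability. Merge sort is O(n log n) in every case; this quicksort is O(n log n) on average but can degrade to O(n^2) on unlucky pivots.

```python
def merge_sort(items):
    """Return a new sorted list using top-down merge sort."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])
    # Merge the two sorted halves into one sorted list.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:  # "<=" keeps equal elements in order (stable)
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def quicksort(items):
    """Return a new sorted list using a simple, not in-place, quicksort."""
    if len(items) <= 1:
        return list(items)
    pivot = items[len(items) // 2]  # middle element as pivot (one common choice)
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    return quicksort(smaller) + equal + quicksort(larger)
```

In an interview you may be asked for the in-place partitioning version of quicksort, so it is worth practicing that variant as well.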
Software Engineering: The Full Topics List
A much longer and fuller list of topics:
- Linked lists
- Hash tables
- Binary search trees
- Heap trees
- Priority queues
- Balanced search trees
- Tree traversal: preorder, inorder, postorder, BFS, DFS
- Adjacency matrix
- Adjacency list
- BFS, DFS
- Built-In Data Structures
- Java Collections
- C++ Standard Library
- Disjoint Sets
- Union Find
- Advanced Tree Structures
- Red-Black Trees
- Splay Trees
- AVL Trees
- k-D Trees
- Van Emde Boas Trees
- N-ary, K-ary, M-ary Trees
- Balanced Search Trees
- 2-3 Trees, 2-4 Trees
- Augmented Data Structures
- NP, NP-Complete, Approximation Algorithms
- Sequential search
- Binary search
- Merge sort
- String algorithms
- String search methods
- String manipulation methods
- Dynamic programming
- Computational Geometry
- Convex Hull
Object Oriented Programming
- Design patterns
Bits and Bytes
- Linear Algebra
- Bloom Filter
Crypto and Security
- Information Theory
- Parity and Hamming Code
- Hash Attacks
- Kernel Basics
- Command Line Tools
Systems Level Programming
- Processing and threads
- System routines
- Messaging Systems
- Queue Systems
- Parallel Programming
- Systems Design
- Data Handling
- Garbage Collection
Machine Learning: The Basics
Topics to review so you don't get weeded out.
- Supervised learning
- Unsupervised learning
- Semi-supervised learning
- Modeling business decisions usually uses supervised and unsupervised learning.
- Classification and regression are the most commonly seen machine learning models.
Machine Learning: The Full Topics List
A longer, fuller list of topics:
- Modeling the relationship between variables, iteratively refined using an error measure.
- Linear Regression
- Logistic Regression
- OLS (Ordinary Least Squares) Regression
- Stepwise Regression
- MARS (Multivariate Adaptive Regression Splines)
- LOESS (Locally Estimated Scatterplot Smoothing)
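To make "iteratively refined using an error measure" concrete, here is a minimal sketch of simple linear regression fitted by gradient descent on mean squared error. The function name `fit_line`, the learning rate, and the number of epochs are illustrative assumptions, not values from this checklist.

```python
def fit_line(xs, ys, learning_rate=0.01, epochs=5000):
    """Fit y = w * x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Error of the current model on every training point.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Step against the gradient to reduce the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1.
w, b = fit_line([0, 1, 2, 3, 4], [1, 3, 5, 7, 9])
print(round(w, 3), round(b, 3))
```

Ordinary least squares solves the same problem in closed form; the iterative version is shown only because it mirrors the description above.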
- Build up a database of example data and compare new data to the database; also called winner-take-all or memory-based learning.
- k-Nearest Neighbor
- Learning Vector Quantization
- Self-Organizing Map
- Locally Weighted Learning
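The "compare new data to database" idea can be sketched with a small k-Nearest Neighbor classifier. This is a minimal version that assumes Euclidean distance and majority voting; the name `knn_predict` and the toy data are made up for illustration.

```python
import math
from collections import Counter


def knn_predict(training, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `training` is a list of (point, label) pairs, where each point is a
    tuple of numbers with the same length as `query`.
    """
    # Sort stored examples by Euclidean distance to the query, keep the k closest.
    neighbors = sorted(training, key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


training = [
    ((1, 1), "a"), ((1, 2), "a"), ((2, 1), "a"),
    ((8, 8), "b"), ((9, 8), "b"), ((8, 9), "b"),
]
print(knn_predict(training, (2, 2)))
print(knn_predict(training, (7, 7)))
```

There is no training step: the "model" is the stored data itself, which is why these methods are called memory-based.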
- An extension to other methods (typically regression) that penalizes model complexity and favors simpler, more generalizable models.
- Ridge Regression
- LASSO (Least Absolute Shrinkage and Selection Operator)
- Elastic Net
- LARS (Least Angle Regression)
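As a small worked example of penalizing complexity, consider ridge regression in its simplest form: one feature and no intercept. Minimizing sum((y - w*x)^2) + lam * w^2 gives the closed form w = sum(x*y) / (sum(x*x) + lam). The sketch below uses that formula; the name `ridge_slope` and the toy data are illustrative assumptions.

```python
def ridge_slope(xs, ys, lam):
    """Slope of the one-feature, no-intercept ridge regression fit.

    Minimizes sum((y - w * x) ** 2) + lam * w ** 2 over w.
    """
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    return sum_xy / (sum_xx + lam)


xs, ys = [1, 2, 3], [2, 4, 6]       # data lies exactly on y = 2x
print(ridge_slope(xs, ys, 0.0))     # no penalty: ordinary least squares slope
print(ridge_slope(xs, ys, 14.0))    # larger penalty shrinks the slope toward 0
```

With `lam = 0` the fit recovers the slope 2.0; with `lam = 14` the slope is shrunk to 1.0, showing how the penalty trades fit for a smaller coefficient.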
- Construct a model of decisions based on the actual values of attributes in the data.
- Classification and Regression Tree
- CHAID (Chi-Squared Automatic Interaction Detection)
- Conditional Decision Trees
- Methods explicitly applying Bayes' Theorem for classification and regression problems.
- Naive Bayes
- Gaussian Naive Bayes
- Multinomial Naive Bayes
- Bayesian Network
- BBN (Bayesian Belief Network)
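All of the methods above rest on Bayes' Theorem, so it helps to be able to compute a posterior by hand. Here is a minimal sketch for a two-class problem; the function name `posterior` and the numbers in the example are hypothetical.

```python
def posterior(prior, likelihood, likelihood_given_not):
    """P(class | evidence) for a two-class problem via Bayes' Theorem.

    prior:                P(class)
    likelihood:           P(evidence | class)
    likelihood_given_not: P(evidence | not class)
    """
    evidence = likelihood * prior + likelihood_given_not * (1.0 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: 1% of messages are spam, a keyword appears in 90% of
# spam and in 5% of non-spam. How likely is a message with the keyword spam?
print(round(posterior(0.01, 0.90, 0.05), 4))
```

Naive Bayes applies the same rule with many pieces of evidence, under the assumption that the features are independent given the class.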
- Centroid-based and hierarchical modeling approaches that organize data into groups of maximum commonality.
- Expectation Maximization
- Hierarchical Clustering
Association Rule Algorithms
- Extract rules that best explain relationships between variables in data.
- Apriori algorithm
- Eclat algorithm
- Inspired by the structure and function of biological neural networks; used for regression and classification problems.
- Radial Basis Function Network (RBFN)
- Hopfield Network
- Neural networks that exploit cheap and abundant computational power; often semi-supervised and trained on large amounts of data.
- Convolutional Neural Network (CNN)
- Recurrent Neural Network (RNN)
- Long Short-Term Memory Network (LSTM)
- Deep Boltzmann Machine (DBM)
- Deep Belief Network (DBN)
- Stacked Auto-Encoders
- Find inherent structure in data, in an unsupervised manner, to describe data using less information.
- PLS (Partial Least Squares Regression)
- Sammon Mapping
- Multidimensional Scaling
- Projection Pursuit
- Principal Component Regression
- Partial Least Squares Discriminant Analysis
- Mixture Discriminant Analysis
- Quadratic Discriminant Analysis
- Regularized Discriminant Analysis
- Linear Discriminant Analysis
- Models composed of multiple weaker models, independently trained, that provide a combined prediction.
- Random Forest
- Gradient Boosting Machines (GBM)
- Bootstrapped Aggregation (Bagging)
- Stacked Generalization (Blending)
- Gradient Boosted Regression Trees
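The idea of combining independently trained weak models can be sketched with bagging. This is a toy version under stated assumptions: one-dimensional inputs, labels 0 and 1 with class 1 lying above class 0, and a very simple threshold classifier ("stump") as the weak model. All function names are illustrative.

```python
import random
from collections import Counter


def train_stump(sample):
    """Train a one-threshold classifier on (x, label) pairs with labels 0/1."""
    zeros = [x for x, label in sample if label == 0]
    ones = [x for x, label in sample if label == 1]
    if not zeros or not ones:
        # Degenerate bootstrap sample: always predict the only class seen.
        only = 0 if zeros else 1
        return lambda x: only
    # Threshold halfway between the two class means (assumes class 1 > class 0).
    threshold = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
    return lambda x: 1 if x > threshold else 0


def bagging_fit(data, n_models=25, seed=0):
    """Train n_models stumps, each on its own bootstrap sample of the data."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap sample: draw len(data) points with replacement.
        sample = [rng.choice(data) for _ in data]
        models.append(train_stump(sample))
    return models


def bagging_predict(models, x):
    """Combine the models' predictions by majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


data = [(1, 0), (2, 0), (3, 0), (7, 1), (8, 1), (9, 1)]
models = bagging_fit(data)
print(bagging_predict(models, 2), bagging_predict(models, 8))
```

Random Forest follows the same pattern with decision trees as the weak models, plus random feature selection at each split.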
- Pick one subject from the list.
- Watch videos on the topic.
- Implement the concept in Java or Python.
- Optionally, implement in C (and/or in C++, with or without the stdlib).
- Write tests to ensure code is correct.
- Practice until you are sick of it.
- Work within limited constraints (think interviews).
- Know the built-in types.
Practice writing code out on a whiteboard or on paper before implementing it on a computer. Get a big drawing pad from the art store.
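As one possible shape for the "implement the concept, then write tests" steps, here is a minimal sketch using binary search from the topics list. The function and test names are illustrative, and plain `assert` statements stand in for a test framework.

```python
def binary_search(sorted_items, target):
    """Return the index of target in sorted_items, or -1 if it is absent."""
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid
        if sorted_items[mid] < target:
            lo = mid + 1   # target can only be in the right half
        else:
            hi = mid - 1   # target can only be in the left half
    return -1


def test_binary_search():
    data = [1, 3, 5, 7, 9, 11]
    assert binary_search(data, 1) == 0       # first element
    assert binary_search(data, 11) == 5      # last element
    assert binary_search(data, 7) == 3       # middle element
    assert binary_search(data, 4) == -1      # missing value
    assert binary_search([], 4) == -1        # empty input


test_binary_search()
```

Covering the edges (first, last, missing, empty) is the habit worth building, since those are the cases interviewers tend to probe.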