With this article I want to go a little bit further than widespread memos “X most common data structures for interviews”. I am not only interested in information about asymptotic complexity. The real value – understanding whether particular algorithms or data structure can be beneficial in particular case: real life tasks rarely given with hints like “Use directed graphs, Luke!” or which comparison function should you implement to have your favorite sort algorithm work accordingly to upper bound complexity from textbooks. On this page you can find not just theoretical high level summary but also few facts that I am considering important, practical or just afraid to forget.
Sorting
Few considerations worth to notice before speculating about “Best” sorting algorithms:
 How much data we are dealing with?
 Dozens numbers or billions?
 What is nature of data – i.e. what is cost of element comparison?
 Integer\strings\complex objects? For example it may involve disks seeks and we should aim for algorithms that minimize comparison or even stick with noncomparison based algorithms?
 What is the cost of elements transfer?
 Should we aim to algorithms with minimum number of swaps/data copying
 What is current state of data? Does it partially sorted?
 In case of partially sorted data some algorithms may touch they upper bounds but adaptive class of algorithms can deal with this input in efficient manner
 Are data allow random access?
 For example: linked list vs arrays. Classical quicksort on linked list always be inefficient.
 Are data fit to the memory?
 Otherwise we need modifications of algorithms that support external sorting.
 Are we are dealing with memory constrained environment?
 Can allow auxiliary space or have to stick with inplace algorithms?
 Is there possible duplicative values in array to be sorted?
 Do we need stable sorting – i.e. should they keep their initial order from input array?
Comparative based algorithms – i.e. we compare the whole values of input array (as opposite to distinct bits of those values). Any comparative based sorting algorithms lower bound worst case complexity – N log (N).
Comparative based algorithms – out of our league – not used in reality, but sometimes can be asked during interview
Algorithm  Time complexity  Space complexity  Comments  

Best  Average  Worst  Auxiliary  
Bubble sort  O ( N )  O ( N ^ 2 )  O ( N ^ 2 )  O ( 1 ) 
InPlace. 
Select sort  O ( N ^ 2 )  O ( N ^ 2 )  O ( N ^ 2 )  O ( 1 )  At ith iteration find index MIN of the smallest remaining entry and swap it ith element with MIN. N Exchange. (Stable for list.) InPlace. Nonstable. 
Practical comparative based algorithms
Algorithm  Time complexity  Space complexity  Comments  

Best  Average  Worst  Auxiliary  
Quicksort  O (N Log N)  O (N Log N)  O ( N ^ 2)  O ( Log N )  O( Log N ) probabilistic guarantee. Pick a pivot, partition array in order to group all elems less than pivot in the left subarray, and those which are bigger – to the right.Recursively apply. Pivot may be chosen randomly but there are better strategies like median of medians. InPlace. Stable. 
3Way Quicksort  O (K * N)  ?  O ( (2 ^ k) * N)  O ( Log N )  Used in case of duplicate keys. Pick a pivot element and partitions elements into three sets: less, equal or greater than pivot. There are variants with two pivots. InPlace. Stable. 
Mergesort  O (N Log N)  O (N Log N)  O (N Log N)  O ( N )
NOTE: For linked lists: 
Divide array into two subarrays and sort them recursively and merge in sorted order. In simplest version – till subarray have size 1, in more optimised – till it reach size of threshold (1020 elements) and another sorting algorithms can by applied (insertion sort for example). Easy modification for parallel and external sorting (aka notfit in memory). Less comparison than quicksort. More element transfer than quicksort. Not Inplace. Stable. 
Heapsort  O (N Log N)  O (N Log N)  O (N Log N)  O ( 1 )  Construct a max heap mapping array’s indexes to binary search tree. After constructing recursively delete root from tree until it is not empty. InPlace. Nonstable. 
Insertion sort  O ( N )  O ( N ^ 2)  O ( N ^ 2 )  O ( 1 )  At ith iteration swap ith element with each larger entry to its left. Useful for partially sorted array or small arrays – less than 10 elements. Adaptive. InPlace. Stable. 
Shell sort  O ( N )  ?  O (N ^ (3 / 2)) O(N * (Log N) ^ 2) 
O ( 1 )  Like insertion sort but we rely on Dorder sort, where D – gap between sorted entries. All estimations depend on choosing gap sequence. Used in embedded devices due to nonrecursive nature – no need for support deep and memory hungry call stack). Adaptive. Inplace. Nonstable. 
Timsort  O ( N )  O (N Log N)  O (N Log N)  O ( N )  Based on Merge sort and Insert sort: tries to find a consecutive sequence of ordered elements and merge them to decrease necessary processing to final results. Used in Java and Python as default sorting algorithm. Adaptive. Not Inplace. Stable. 
Introsort  O ( N )  O (N Log N)  O (N Log N)  O ( N )  Hybrid between quicksort & heap sort. Used in c++ as default implementation of sorting. Adaptive. Not Inplace. Nonstable. 
Treesort  O (N Log N)  O (N Log N)  O (N Log N)  O ( N )  Build binary search tree from elements and traverse it inorder. Variations with self balancing binary search tree or Splay tree (Adaptive). Estimations based on usage of balanced trees. Not Inplace. Stable. 
Cubesort  ?  ?  O (N Log N)  O ( N )  Parallel sorting algorithm Not Inplace. Stable. 
Non–comparison based sorting algorithms. For such kind of algorithms we do not compare the whole keys (values) but using individual bits of values (for example characters for strings or digits for numbers) – it allows to achieve linear time O(N) complexity. Price for it – necessity to tune it for every new type.
Algorithm  Time complexity  Space complexity  Comments  

Best  Average  Worst  Auxiliary  
Radix sort  O ( N * K )  O ( N * K )  O ( N * K )  O ( N + K )  Examines individual bits of keys. K – max number of digits in array’s elements. Applicable for integer numbers, string and float numbers in special formatting. Worth to use in situation when N >> K and K is fixed – famous question about sorting of million of 32bit numbers. MSD vs LSD – most significant vs least significant digits. InPlace. Stable. 
Bucket sort  O ( N + K )  O ( N + K )  O ( N ^ 2 )  O ( N * K )  Partitioning an array into a number of buckets. Each bucket is sorted either by recursive application of bucket sort or using other algorithms. Can be used as external sorting algorithms. Asymptotic given for cases when K ~ N. Applicable in case data evenly distributed over a range. 
Counting sort  O ( N + K )  O ( N + K )  O ( N + K )  O ( N + K )  Count number of keys related to item, use this count to determine position of elements. Asymptotic given for cases when K ~ N. Applicable in case N ~ K. Stable. 
Nth biggest\smallest element
just a variations of partial sort:
 QuickSelect – worst case O(n^2) best case and average – O(n) – as quicksort with pivot based partitioning but move in one direction only
 Median of medians – improve worst case of quickselect to O(n) – partition input to group of several(5) elements, find median within every group, and return median of n/5 medians
 Introselect – back up nth_element at C++ STL – start as classical quick select and switch on median of median method in case depth of recursion is too deep and at the same time sizes of subpartitions observed so far not leading to halving.
Hashing
Family of hashing functions: H(X) = ((a * x + b) mod p) mod N) a,b < p, a!= 0, p – prime number used to reduce number of collisions (it depend on your keys distribution).
Two main approaches: separate chaining vs open addressing (linear probing) with double hashing.
 Separate chaining – array of lists, at the very worst case – O(N) to retrieve record!
 Open addressing – circular buffer – i.e. in case of collisions try to insert record to next unoccupied cell – which lead to poor performance in case of bad choice of hashing function. Load factor – how it will perform if 25\50\80 % of cells of array will be occupied. Used for hashtable implementation in python.
Separate chaining advantages:
 Easy implementation of delete method
 Performance degrades gracefully
 Clustering is less sensitive to poorly designed hash functions
Linear probing advantages:
 Less wasted space (pointers)
 Better cache performance
There are various use cases & requirements for hashing that affect characteristics of good hash function:
 Quickly insert\return records with mapping Key to Value – cheap cost of computing hash and low number of collisions
 Find similar or close entries – need to maximise collisions if keys meet some criteria – locality sensitive caching (for example GeoHash) with rolling hash to speed up computations, another famous example: Rabin Karp string search algorithm.
 Make it difficult to find key corresponding to particular hash value – one way function – Cryptography related hashing – fixed size output for any input, no collisions and at the same time almost impossible (time and computation consuming wise) to find input that produce particular hash
 Cornerstone of more advanced data structures: bloom filter, merkle tree, distributed hash table.
 Interesting overview of details of hash implementation in various languages can be found here.
Heaps
Root always contains element that satisfy heap property: minheap or maxheap. Elements not sorted but partially ordered. Heaps may be constructed in O(N) using Floyd algorithm. Usually used as underneath structure for priority queues. Leftist heap as example of pure functional data structure.
Heap’s underneath datastructure  Time complexity of operations  Comments  

Find Max  Extract Max  Increase Key  Insert  Delete  Merge  
Linked List (unsorted)  O ( N )  O ( N )  O ( 1 )  O ( 1 )  O ( 1 )  O ( 1 )  
Array (sorted)  O ( 1 )  O ( 1 )  O ( N )  O ( N )  O ( 1 )  O ( N + M )  
Binary Heap  O ( 1 )  O (Log N)  O (Log N)  O (Log N)  O (Log N)  O (N + M)  Heapify: O(N)
Binary Heap (array representation) – ordered complete binary tree – parent’s key no smaller than children’s key. Complete binary tree. 
Binomial Heap  O (Log N)  O (Log N)  O (Log N)  O (Log N)  O (Log N)  O (Log N)  A binomial heap of order K has a root whose children are roots of binomial trees of orders K1, K2, … , 2, 1, 0 – in this order. Each binomial tree is a heap obeys the minimumheap property: the key of a node is greater than or equal to the key of its parent. There can only be either one or zero binomial trees for each order, including zero order. Support quick merging of two heaps. 
Fibonacci Heap  O ( 1 )  O (Log N)  O ( 1 )  O ( 1 )  O (Log N)  O ( 1 )  Collections of trees. Used for priority queue. Degree of nodes (degree means the number of children) are kept low; every node has a degree at most O(Log N) and the size of a subtree rooted in a node of degree K is at least F(K) + 2, where F(K) is Kth Fibonacci number. Amortised constant factor affect practical usage. High memory consumption. 
Tree
Full tree – every node except leafs have nonnull children. Complete tree – full tree that have all nodes at the last level are lean left.
Tree traversal:
 Preorder – via dfs
 Inorder – via dfs
 Postorder – via dfs
 Levelorder – via bfs
Search data structures
 Size of element (read as cost of access\copy\compare operations)
 Volume of data: can it fit to RAM?
 Usage patterns:
 evenly mixed read & writes or skew for particular operations?
 access single key or range?
 access to value using the whole key or interested to find all values where even parts of key matches?
Data structure  Average time complexity  Worst time complexity  Auxiliary Space  Comments  

Indexing  Searching  Insertion  Deletion  Indexing  Searching  Insertion  Deletion  
Arrays  O ( 1 )  O ( N )  O ( N )  O ( N )  O ( 1 )  O ( N )  O ( N )  O ( N )  O ( N )  Cache friendly, low memory overhead. Preallocated vs dynamic. 
Linked list  O ( N )  O ( N )  O ( 1 )  O ( 1 )  O ( N )  O ( N )  O ( 1 )  O ( 1 )  O ( N )  Size of data vs size of pointer(s). One directional vs bidirectional vs circular. Sequential access vs random. Internal or external storage (i.e. node contains only reference to actual data) Foundation of stack, queue and dequeue. 
Skip list  O ( Log N )  O ( Log N )  O ( Log N )  O ( Log N )  O ( N )  O ( N )  O ( N )  O ( N )  O ( N Log N)  Allow fast search using additional references hierarchy: next, the next after next, every 4th, ith level – should contain 2 ^ i elements, but it used probabilistics skips – i.e. at ith level, next reference may point to 2^i element or further. Alternative for hash table and balanced tree – theoretical worst case performance can be worse, but in practice usually faster – depend on nature of data and cost of comparison operator. Foundation of various lockfree implementations for dictionaries and priority queues. Used in Redis, LevelDb and Lucene. 
Hash table (symbol table, associative array)  ?  O ( 1 )  O ( 1 )  O ( 1 )  ?  O ( N )  O ( N )  O ( N )  O ( N )  Not best in case of small number of keys in comparison to direct mapping key > array index. In practice, particular operation can be slow because of array resizing and cache unfriendly nature may lead to poor performance in comparison even with O(N) brute force lookup of simple arrays. C++: unordered_map, multimap Python: dict Java: HashMap 
Binary Search Tree  O ( Log N )  O ( Log N )  O ( Log N )  O ( Log N )  O ( N )  O ( N )  O ( N )  O ( N )  O ( N )  Complete if it perfectly balanced, except for a bottom level. In worst case (depend on insertion order) it can be Nheight. Constructing as quick sort. Application of problem: 1Drange search. 
Cartesian Tree  ?  O ( Log N )  O ( Log N )  O ( Log N )  ?  O ( N )  O ( N )  O ( N )  O ( N )  It is heapordered and symmetric (inorder) traversal of the tree returns the original sequence. Used for RMQ (range minimum query) kind of tasks. 
BTree  O ( Log N )  O ( Log N )  O ( Log N )  O ( Log N )  O ( Log N )  O ( Log N )  O ( Log N )  O ( Log N )  O ( N ) 
Not just two child anymore => heigh of tree lower => lookup faster. Various modifications.

23 Tree  O (C * Lg(N))  O (C * Lg(N))  O (C * Lg(N))  O (C * Lg(N))  O (C * Lg(N))  O (C * Lg(N))  O (C * Lg(N))  O (C * Lg(N))  ?  Maintain symmetric order and perfect balance. 
RedBlack Tree  O ( Log N )  O ( Log N )  O ( Log N )  O ( Log N )  O ( Log N )  O ( Log N )  O ( Log N )  O ( Log N )  O( N ) 
Balance is preserved via painting different nodes in Red\Black colours to satisfy some predefined properties. Rearranging and recolouring can be performed efficiently. For example: no node has 2 red links connected to it. Red links lean left. Every path from root to null link has the same number of black links – perfect black balance. C++: set and map. Java: TreeMap and TreeSet (and in some situation HashMap fallback on their usage as well!) 
Splay Tree  O ( Log N )  O ( Log N )  O ( Log N )  O ( Log N )  O ( Log N )  O ( Log N )  O ( Log N )  O ( Log N )  O( N ) 
Splay Tree is a selfadjusting binary search tree with additional property that recently accessed elements are quick to access again. Applications: cache, garbage collectors, memory allocators. 
AVL Tree  O ( Log N )  O ( Log N )  O ( Log N )  O ( Log N )  O ( Log N )  O ( Log N )  O ( Log N )  O ( Log N )  O( N )  Balance in AVL Tree – the heights of two child subtrees of any node differ by at most one. AVL Tree better for lookup intensive applications than RedBlack Trees 
Kd Tree  ?  O ( Log N )  O ( Log N )  O ( Log N )  O(N Log N)  O ( N )  O ( N )  O ( N )  O( N ) 
Space partition Binary Tree. Used for range search and nearest neighbour search. Constructing N Log N. Not suitable for high dimensional space – better to use when N >> 2^k. Octree for 3D space. 
More advanced structures worth to be aware of:
 Trie (Prefix Tree) and its variations, mainly Radix Tree for lookup intensive applications. Worst case – O(M), where M is the length of string to search. DAFSA – can be a great alternative for cases with limited memory.
 Generalized Suffix Tree – search efficiently with mismatches, applications for various string related tasks: longest common substrings, longest repeated substrings, longest palindromic substring,
 Fenwick Tree (Binary indexed tree)– common use case: modify element of array and return result of some inverse operations for range of elements (for example partial sum but not max). Simpler and faster in comparison with segment trees. O(N) build time.
 Segment Tree – classical approach for RMQ kind of tasks, operation not necessary to be inverse. Double memory requirement in comparison to fenwick tree. Constant factor affect speed in practice, especially in cases of multiple dimensions.
 Implicit Treap – sequence of pairs, heap ordered by 1st entry, BST ordered by the 2nd. If we use index of element in array as first entry – than we can
 insert\remove element at any position
 move subset of elements in any position
 computation of some function for elements within some interval [i, j] – sum, min – including range updates using lazy propagations
 foundation for Rope implementation – to have similar asymptotic for very long strings
 and all of the above in O(Log N) – with huge constant factor though
 vEB Tree – ordered dictionary or associative array much faster (exponentially) than BST, memory overhead
 HeavyLight Decomposition: split tree on multiple notcrossing paths. Edge from A to B is considered heavy if number of vertexes reachable from B is at least half as big as number of vertexes reachable from A. HeavyLight Decomposition – union of light vertexes from bottom of tree to the root. Such decomposition allows us to serve queries like “Get some metric on the path between A and B” – for example sum or max. Not needed in case we do not have update queries. Usually used with segment tree and LCA.
Union Find (Disjoint set)
Approaches: colouring, trees on top of arrays using ranks\randomisation\path compression & merging either randomly or using ranks of trees. Additional information can be stored with leaders of set.
Applications: offline LCA, offline RMQ, dynamic connectivity variables dependencies during program analysis, variations of percolation problem – connections in social network, fluids or electricity flow.
Graph
Representation: incidence matrix vs adjacency list. Generally adjacency list is preferable for sparse graph, but there are variation of incidence matrix based on associative array where pair of vertexes used as a key.
Possible variations:
 Complete
 Directed
 Weighted
 Planar
 Bipartite
Cut – set of vertexes from graph, if they will be removed – graph become disconnected. Size of min cut – connectivity or Kconnectivity.
Graph related tasks
 Find cycles in graph – union find (complexity – Akkerman function from number of vertexes) or DFS – proportional to number of vertexes
 Determine whether graph is bipartite – dfs
 Eulerian path – route visiting every edge exactly once (exist if all vertexes has even degree and it is connected) – search of such path can be done in linear time O(E)
 Hamiltonian path – route including every vertex exactly once (as in Travelling salesman problem)
 Comparison of graph – graph’s isomorphism (still open question N or NP)
 Graph representation in such way that there are no edge crossing, except in vertexes (upward planar graph) – there are some special cases where such checks can be done in O(N) but generally it is NPhard problem.
 Minimum Spanning Tree (MST) – application is design of network (For example to avoid cycle of packets in Ethernet) or clustering, with custom distance function.
For directed graphs:
 Topological sorting – tasks with ordering limitations, schedule related tasks (DFS) – create schedule in case we do not have cycle – redraw graph in such way that all path follow from bottom to top – reverse dfs postorder (additional stack) = result
 path finding – Dijkstra, if weights are negative – BellmanFord or its optimised version – Shortest Path Faster Algorithm (SPFA)
 Shortest path for multiple sources – dense graph with negative weights – FloydWarshall or not dense – Johnson’s algorithms or Dijkstra
 Search for cycle – DFS – detection of deadlock in concurrent systems or topological sort
 Maximum flow – Ford–Fulkerson algorithm , Bipartite matching problem, baseball elimination
 Dynamic connectivity
 2SAT – satisfy system of constraints of boolean variables – scheduling, medicine testing. Can be represented as directed implication graph. Checking whether two vertexes are within the same connected component allow to answer question whether we can satisfy all conditions.
Substring search
 KnuthMorrisPratt (KMP) – based on DFA (deterministic finite state automaton) that have to be precomputed for pattern. O(R + N) where R – length of pattern and N – length of text. O(RN) space.
 BoyerMoore – scan from the end of pattern and in case of mismatch skip up to R characters – worst case is as brute force O (R * N) but on average considered the most efficient algorithms – O (N / R). O(R) space.
 RabinKarp – compute sliding hash for every substring of length R for hashing can use modular hashing with some prime number. For efficiency sake Horner’s method can be utilised to reuse computation from previous steps – O(R). Hash match – still can be a collisions so there are two variations in case of hash matches. 2D search or search for multiple patterns using precomputed table of hashes. O(1) space.
 AhoCorasick – in case we are dealing with multiple patterns and text where we search them are different. Build trie for all words, extend it to special finite state machine without backtracking – for failed patterns using suffix links. Used in grep – F flag.
 Zfunctions: Given string A, consider array of all possible suffixes: A[len(A)], A[len(A) – 1], … , A[1]. Compare symbols from the start of string and every suffix – number of matched symbols – represent particular component of Zfunction. For substring search: concatenate pattern, unique delimiter and original string and check if values of Zfunction equal to length of pattern.