
Elements of programming interviews pdf

Wednesday, May 15, 2019

Elements of Programming Interviews: The Insiders' Guide, by Adnan Aziz, Tsung-Hsien Lee, and Amit Prakash. This document is a sampling of our book, Elements of Programming Interviews (EPI). Its purpose is to provide examples of EPI's organization, content, style. A Python version of EPI is also available.

Language: English, Spanish, Portuguese
Country: Dominica
Genre: Personal Growth
Pages: 760
Published (Last): 25.05.2016
ISBN: 329-2-29897-333-3
ePub File Size: 22.41 MB
PDF File Size: 20.50 MB
Distribution: Free* [*Registration Required]
Downloads: 23943
Uploaded by: HERMAN



Search for Elements of Programming Interview in Python, or use the short link bit. Before you buy this book, please first head over to our sample page - elementsofprogramminginterviews. Complete programs are available at epibook. Since different candidates have different time constraints, EPI includes a study guide with several scenarios, ranging from a weekend hackathon to semester-long preparation, with a recommended subset of problems for each scenario.

All problems are classified in terms of their difficulty level and include many variants to help you apply what you have learned more widely. All problems include hints for readers who get stuck. This simulates what you will face in the real interview.

The version being sold by Amazon itself is always current. Some resellers may have older versions, especially if they sell used copies. EPI emphasizes problems that stem from real-world applications and can be coded up in a reasonable time, and is a wonderful complement to a traditional computer science algorithms and data structures course.

More generally, for algorithms enthusiasts, EPI offers endless hours of entertainment while simultaneously learning neat coding tricks. Wanted to work at an exciting futuristic company?


Struggled with an interview problem that could have been solved in 15 minutes? Wished you could study real-world computing problems? From the Back Cover The core of EPI is a collection of problems with detailed solutions, including over figures and tested programs. The problems are challenging, well-motivated, and accessible. They are representative of the questions asked at interviews at the most exciting companies. The book begins with a summary of patterns for data structure, algorithms, and problem solving that will help you solve the most challenging interview problems.

This is followed by chapters on basic and advanced data structures, algorithm design, concurrency, system design, probability and discrete mathematics.

Know array and linked list implementations.

Binary trees: Use for representing hierarchical data.

Heaps: Key benefit: O(1) lookup find-max, O(log n) insertion, and O(log n) deletion of max. Node and array representations. Min-heap variant.

Hash tables: Key benefit: O(1) insertions, deletions and lookups. Key disadvantages: Understand implementation using array of buckets and collision chains. Know hash functions for integers, strings, objects. Understand importance of equals function. Variants such as Bloom filters.

Binary search trees: Key benefit: O(log n) insertions, deletions, lookups, find-min, find-max, successor, predecessor when tree is balanced. Understand node fields, pointer implementation. Be familiar with notion of balance, and operations maintaining balance. Know how to augment a binary search tree.

Primitive types

You should be comfortable with the basic types (chars, integers, doubles, etc.).

For example, Java has no unsigned integers, and the integer width is compiler- and machine-dependent in C. Problem Solving Patterns A common problem related to basic types is computing the number of bits set to 1 in an integer-valued variable x.

To solve this problem you need to know how to manipulate individual bits in an integer. One straightforward approach is to iteratively test individual bits using an unsigned integer variable m initialized to 1. Iteratively identify bits of x that are set to 1 by examining the bitwise AND of m with x, shifting m left one bit at a time. The overall complexity is O(n), where n is the number of bits in the integer.

The variable y is 1 at exactly the lowest set bit of x; all other bits in y are 0. The time complexity is O(s), where s is the number of bits set to 1 in x. In practice, if the computation is done repeatedly, the most efficient approach would be to create a lookup table. In this case, we could use a 65536-entry integer-valued array P, such that P[i] is the number of bits set to 1 in i. If x is 64 bits, the result can be computed by decomposing x into 4 disjoint 16-bit words, h3, h2, h1, and h0.
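The two loop-based approaches can be sketched as follows. This is a minimal illustration, not the book's listing: the function names and the default width of 64 bits are assumptions, and x is assumed to be non-negative. The expression `x & ~(x - 1)` is one common way to obtain a value y that is 1 only at the lowest set bit of x.

```python
def count_bits_naive(x: int, width: int = 64) -> int:
    """Test each of `width` bit positions with a mask m that starts at 1."""
    count = 0
    m = 1
    for _ in range(width):
        if x & m:          # bitwise AND of m with x is nonzero iff that bit is set
            count += 1
        m <<= 1            # shift the mask left one bit at a time
    return count


def count_bits_sparse(x: int) -> int:
    """Clear the lowest set bit on each iteration: O(s) for s set bits."""
    count = 0
    while x:
        y = x & ~(x - 1)   # y has a 1 exactly at the lowest set bit of x
        x ^= y             # remove that bit from x
        count += 1
    return count
```

The second function does one iteration per set bit, so it is faster when x has few bits set.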

The 16-bit words are computed using bitmasks and shifting.

Array lookup and insertion are fast, making arrays suitable for a variety of applications. Reading past the last element of an array is a common error, invariably with catastrophic consequences. The following problem arises when optimizing quicksort: The key to the solution is to maintain two regions on opposite sides of the array that meet the requirements, and expand these regions one element at a time.

Strings

A string can be viewed as a special kind of array, namely one made out of characters. We treat strings separately from arrays because certain operations which are commonly applied to strings, for example, comparison, joining, splitting, searching for substrings, replacing one string by another, parsing, etc.

Our solution to the look-and-say problem illustrates operations on strings. The look-and-say sequence begins with 1; the subsequent integers describe the digits appearing in the previous number in the sequence. The first eight integers in the look-and-say sequence are ⟨1, 11, 21, 1211, 111221, 312211, 13112221, 1113213211⟩.

The look-and-say problem entails computing the n-th integer in this sequence. Although the problem is cast in terms of integers, the string representation is far more convenient for counting digits.
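A short sketch of the string-based approach, assuming n is 1-indexed (so n = 1 gives "1"). The function name is hypothetical, and `itertools.groupby` is used here as a convenient way to count runs of equal digits; the book's own solution may be organized differently.

```python
from itertools import groupby


def look_and_say(n: int) -> str:
    """Return the n-th term (1-indexed) of the look-and-say sequence as a string."""
    s = "1"
    for _ in range(n - 1):
        # Each run of identical digits d of length k is described as "k" followed by d.
        s = "".join(str(len(list(run))) + digit for digit, run in groupby(s))
    return s
```

For instance, the term after "1211" reads as "one 1, one 2, two 1s", which is "111221".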

Lists

An abstract data type (ADT) is a mathematical model for a class of data structures that have similar functionality. Strictly speaking, a list is an ADT, and not a data structure. It implements an ordered collection of values, which may include repetitions. In the context of this book we view a list as a sequence of nodes where each node has a link to the next node in the sequence.

In a doubly linked list each node also has a link to the prior node. A list is similar to an array in that it contains objects in a linear order. The key differences are that inserting and deleting elements in a list has time complexity O(1).

On the other hand, obtaining the k-th element in a list is expensive, having O(n) time complexity. Lists are usually building blocks of more complex data structures. However, they can be the subject of tricky problems in their own right, as illustrated by the following: Given a singly linked list ⟨l0, l1, l2, ...⟩. Suppose you were asked to write a function that computes the zip of a list, with the constraint that it uses O(1) space.
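The definition of zip is not fully preserved in this copy, so the sketch below assumes the ordering shown in the book's figure for a five-node list: ⟨l0, l4, l1, l3, l2⟩, that is, nodes taken alternately from the front and the back. Under that assumption, one O(1)-space approach is to split the list in the middle, reverse the second half in place, and interleave the two halves. The `Node` class and function name are hypothetical.

```python
class Node:
    def __init__(self, data, next=None):
        self.data = data
        self.next = next


def zip_list(head):
    """Reorder l0, l1, ..., l(n-1) as l0, l(n-1), l1, l(n-2), ... reusing the nodes."""
    if head is None or head.next is None:
        return head
    # Find the last node of the first half with slow/fast pointers.
    slow = fast = head
    while fast.next is not None and fast.next.next is not None:
        slow = slow.next
        fast = fast.next.next
    # Detach and reverse the second half in place.
    second = slow.next
    slow.next = None
    prev = None
    while second is not None:
        nxt = second.next
        second.next = prev
        prev = second
        second = nxt
    # Interleave the first half with the reversed second half.
    first, second = head, prev
    while second is not None:
        first_next, second_next = first.next, second.next
        first.next = second
        second.next = first_next
        first, second = first_next, second_next
    return head
```

Only a fixed number of pointers is used, and no nodes are allocated.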

The operation of this function is illustrated in Figure 4.

[Figure 4. Zipping a list. (a) List before zipping: L = ⟨l0, l1, l2, l3, l4⟩. The number in hex below each node represents its address in memory. (b) List after zipping: L = ⟨l0, l4, l1, l3, l2⟩. Note that nodes are reused; no memory has been allocated.]

Stacks and queues

Stacks support last-in, first-out semantics for inserts and deletes, whereas queues are first-in, first-out. Both are ADTs, and are commonly implemented using linked lists or arrays. Similar to lists, stacks and queues are usually building blocks in a solution to a complex problem, but can make for interesting problems in their own right.

As an example consider the problem of evaluating Reverse Polish notation expressions. A stack is ideal for this purpose: operands are pushed on the stack, and popped as operators are processed, with intermediate results being pushed back onto the stack.
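The stack discipline described above can be sketched as follows. The input format is an assumption (whitespace-separated tokens, integer operands, the four operators `+ - * /`, with `/` treated as floor division); the book's exact format is not preserved in this copy.

```python
import operator

# Assumed operator set; "/" is floor division on integers in this sketch.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.floordiv,
}


def eval_rpn(expression: str) -> int:
    """Evaluate a whitespace-separated Reverse Polish notation expression."""
    stack = []
    for token in expression.split():
        if token in OPS:
            right = stack.pop()          # the most recent operand is the right-hand side
            left = stack.pop()
            stack.append(OPS[token](left, right))
        else:
            stack.append(int(token))     # operands are pushed on the stack
    return stack.pop()
```

For example, `eval_rpn("3 4 + 2 * 1 +")` computes (3 + 4) * 2 + 1.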

Binary trees

A binary tree is a data structure that is used to represent hierarchical relationships. Binary trees are the subject of a later chapter. Binary trees most commonly occur in the context of binary search trees, wherein keys are stored in a sorted fashion. However, there are many other applications of binary trees. Consider a set of resources organized as nodes in a binary tree. Processes need to lock resource nodes. A node may be locked if and only if none of its descendants and ancestors are locked.
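A minimal sketch of this locking rule, assuming each node stores a parent link and a count of locked descendants. The class and method names (`LockNode`, `is_locked`, `lock`, `unlock`) are hypothetical, and the descendant counter is one possible bookkeeping choice rather than the book's stated design. With it, checking the lock state is O(1), and locking or unlocking walks only the ancestors, which is O(h) for a tree of height h.

```python
class LockNode:
    def __init__(self, parent=None):
        self.parent = parent
        self.locked = False
        self.locked_descendants = 0   # number of locked nodes strictly below this one

    def is_locked(self) -> bool:
        return self.locked

    def _ancestors(self):
        node = self.parent
        while node is not None:
            yield node
            node = node.parent

    def lock(self) -> bool:
        """Lock this node if no descendant and no ancestor is locked."""
        if self.locked or self.locked_descendants > 0:
            return False
        if any(a.locked for a in self._ancestors()):
            return False
        self.locked = True
        for a in self._ancestors():
            a.locked_descendants += 1
        return True

    def unlock(self) -> bool:
        """Unlock this node and update the counters on its ancestors."""
        if not self.locked:
            return False
        self.locked = False
        for a in self._ancestors():
            a.locked_descendants -= 1
        return True
```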

Your task is to design and implement an application programming interface (API) for locking. Naively implemented, the time complexity for these methods is O(n), where n is the number of nodes. However, these can be made to run in time O(1), O(h), and O(h), respectively, where h is the height of the tree, if nodes have a parent field.

Heaps

A heap is a data structure based on a binary tree.

It efficiently implements an ADT called a priority queue. A priority queue resembles a queue, with one difference: Each trade appears as a separate line containing information about that trade. Lines begin with an integer-valued timestamp, and lines within a file are sorted in increasing order of timestamp. Suppose you were asked to design an algorithm that combines the set of files into a single file R in which trades are sorted by timestamp. This problem can be solved by a multistage merge process, but there is a trivial solution based on a min-heap data structure.
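A sketch of the min-heap merge, using Python's `heapq` module. For illustration, each "file" is modeled as an in-memory list of `(timestamp, payload)` tuples already sorted by timestamp; real files would be read line by line instead. The function name and data layout are assumptions.

```python
import heapq


def merge_trade_files(files):
    """Merge lists of (timestamp, payload) trades, each sorted by timestamp."""
    # Heap entries are (timestamp, file index, position within that file).
    heap = [(trades[0][0], idx, 0) for idx, trades in enumerate(files) if trades]
    heapq.heapify(heap)
    merged = []
    while heap:
        _, idx, pos = heapq.heappop(heap)        # earliest pending trade overall
        merged.append(files[idx][pos])
        if pos + 1 < len(files[idx]):            # push that file's next trade, if any
            heapq.heappush(heap, (files[idx][pos + 1][0], idx, pos + 1))
    return merged
```

The heap never holds more than one entry per file, so each of the N trades costs O(log k) for k files.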

Entries are trade-file pairs and are ordered by the timestamp of the trade. Initially, the min-heap contains the first trade from each file.

Hash tables

A hash table is a data structure used to store keys, optionally with corresponding values. Inserts, deletes and lookups run in O(1) time on average. One caveat is that these operations require a good hash function: a mapping from the set of all possible keys to the integers which is similar to a uniform random assignment.

Another caveat is that if the number of keys that is to be stored is not known in advance then the hash table needs to be periodically resized, which, depending on how the resizing is implemented, can lead to some updates having O(n) complexity.

Suppose you were asked to write a function which takes a string s as input, and returns true if the characters in s can be permuted to form a string that is palindromic. Working through examples, you should see that such a permutation exists if and only if each character appears an even number of times, with possibly a single exception, since this allows for pairing characters in the first and second halves.

A hash table makes performing this test trivial. We build a hash table H whose keys are characters, and corresponding values are the number of occurrences for that character.

The hash table H is created with a single pass over the string. After computing the number of occurrences, we iterate over the key-value pairs in H.
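The counting test can be sketched with `collections.Counter`, which plays the role of the hash table H (characters as keys, occurrence counts as values). The function name is hypothetical.

```python
from collections import Counter


def can_form_palindrome(s: str) -> bool:
    """True if the characters of s can be permuted into a palindrome."""
    counts = Counter(s)                                   # built in a single pass over s
    odd = sum(1 for c in counts.values() if c % 2 == 1)   # characters with an odd count
    return odd <= 1
```

For example, "edified" has counts e:2, d:2, i:2, f:1, so exactly one character has an odd count.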

If more than one character has an odd count, we return false; otherwise, we return true. Suppose you were asked to write an application that compares n programs for plagiarism. Specifically, your application is to break every program into overlapping character strings, each of length , and report on the number of strings that appear in each pair of programs.

A hash table can be used to perform this check very efficiently if the right hash function is used.

Binary search trees

Binary search trees (BSTs) are used to store objects that are comparable. BSTs are the subject of a later chapter. The underlying idea is to organize the objects in a binary tree in which the nodes satisfy the BST property. Insertion and deletion can be implemented so that the height of the BST is O(log n), leading to fast O(log n) lookup and update times.

AVL trees and red-black trees are BST implementations that support this form of insertion and deletion. BSTs are a workhorse of data structures and can be used to solve almost every data structures problem reasonably efficiently.

It is common to augment the BST to make it possible to manipulate more complicated data, e. As an example application of BSTs, consider the following problem.

You are given a set of line segments. Each segment is a closed interval [li , ri ] of the x-axis, a color, and a height. For simplicity assume no two segments whose intervals overlap have the same height. When the x-axis is viewed from above the color at point x on the x-axis is the color of the highest segment that includes x. If no segment contains x, the color is blank. You are to implement a function that computes the sequence of colors as seen from the top.

The key idea is to sort the endpoints of the line segments and do a sweep from left-to-right. As we do the sweep, we maintain a list of line segments that intersect the current position as well as the highest line and its color. Algorithm design patterns An algorithm is a step-by-step procedure for performing a calculation. We classify common algorithm design patterns in Table 4. Roughly speaking, each pattern corresponds to a design methodology. An algorithm may use a combination of patterns.

Table 4. Algorithm design patterns.

Technique: Key points

Sorting: Uncover some structure by sorting the input.

Recursion: If the structure of the input is defined in a recursive manner, design a recursive algorithm that follows the input definition.

Divide-and-conquer: Divide the problem into two or more smaller independent subproblems and solve the original problem using solutions to the subproblems.

Dynamic programming: Compute solutions for smaller instances of a given problem and use these solutions to construct a solution to the problem. Cache for performance.

Greedy algorithms: Compute a solution in stages, making choices that are locally optimum at each step; these choices are never undone.

Sorting Certain problems become easier to understand, as well as solve, when the input is sorted. The solution to the calendar rendering problem entails taking a set of intervals and computing the maximum number of intervals whose intersection is nonempty. However, once the interval endpoints have been sorted, it is easy to see that a point of maximum overlap can be determined by a linear time iteration through the endpoints.
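A sketch of the endpoint sweep, assuming intervals are closed (so two intervals that merely touch count as overlapping) and are given as `(start, end)` pairs. The function name and the tie-breaking rule are assumptions made for this illustration.

```python
def max_overlap(intervals):
    """Maximum number of closed intervals that share a common point."""
    events = []
    for start, end in intervals:
        events.append((start, 0))   # 0 sorts before 1, so at equal coordinates
        events.append((end, 1))     # starts are processed before ends
    events.sort()
    best = current = 0
    for _, kind in events:          # single linear pass over the sorted endpoints
        if kind == 0:
            current += 1
            best = max(best, current)
        else:
            current -= 1
    return best
```

The sort dominates the cost; the sweep itself is linear in the number of endpoints.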

Often it is not obvious what to sort on—for example, we could have sorted the intervals on starting points rather than endpoints. This sort sequence, which in some respects is more natural, does not work.

However, some experimentation with it will, in all likelihood, lead to the correct criterion.


Sorting is not appropriate when an O(n) or better algorithm is possible. Another good example of a problem where a total ordering is not required is the problem of rearranging elements in an array described earlier. Furthermore, sorting can obfuscate the problem.

Recursion

A recursive function consists of base cases and calls to the same function with different arguments.

A recursive algorithm is often appropriate when the input is expressed using recursive rules, such as a computer grammar. More generally, searching, enumeration, divide-and-conquer, and decomposing a complex problem into a set of similar smaller instances are all scenarios where recursion may be suitable. String matching exemplifies the use of recursion.

This problem can be solved by checking a number of cases based on the first one or two characters of the matching expression, and recursively matching the rest of the string. Divide-and-conquer A divide-and-conquer algorithm works by decomposing a problem into two or more smaller independent subproblems until it gets to instances that are simple enough to be solved directly; the results from the subproblems are then combined. More details and examples are given in Chapter 18; we illustrate the basic idea below.

A triomino is formed by joining three unit-sized squares in an L-shape.

[Figure: Mutilated chessboards.]

Divide-and-conquer is a good strategy for this problem. However, you will quickly see that this line of reasoning does not lead you anywhere. Now we can apply divide-and-conquer. Hence, a placement exists for any n that is a power of 2.

Divide-and-conquer is usually implemented using recursion. However, the two concepts are not synonymous. Recursion is more general—subproblems do not have to be of the same form.

In addition to divide-and-conquer, we used the generalization principle above. The idea behind generalization is to find a problem that subsumes the given problem and is easier to solve.

Other examples of divide-and-conquer include counting the number of pairs of elements in an array that are out of sorted order and computing the closest pair of points in a set of points in the plane.

A key aspect of dynamic programming (DP) is maintaining a cache of solutions to subinstances. DP can be implemented recursively (in which case the cache is typically a dynamic data structure such as a hash table or a BST), or iteratively (in which case the cache is usually a one- or multi-dimensional array).

It is most natural to design a DP algorithm using recursion. Usually, but not always, it is more efficient to implement it using iteration.

As an example of the power of DP, consider the problem of determining the number of combinations of 2, 3, and 7 point plays that can generate a given score. Let C(s) be the number of combinations that can generate a score of s. The recursion ends at small scores. This phenomenon results in the run time increasing exponentially with the size of the input. The solution is to store previously computed values of C in an array.

Greedy algorithms

A greedy algorithm is one which makes decisions that are locally optimum and never changes them.

This strategy does not always yield the optimum solution. Furthermore, there may be multiple greedy algorithms for a given problem, and only some of them are optimum.

For example, consider 2n cities on a line, half of which are white, and the other half are black. We want to map white to black cities in a one-to-one fashion so that the total length of the road sections required to connect paired cities is minimized. Multiple pairs of cities may share a single section of road, e. The most straightforward greedy algorithm for this problem is to scan through the white cities, and, for each white city, pair it with the closest unpaired black city.

This algorithm leads to suboptimum results. Consider the case where white cities are at 0 and at 3 and black cities are at 2 and at 5. If the straightforward greedy algorithm processes the white city at 3 first, it pairs it with 2, forcing the cities at 0 and 5 to pair up, leading to a road length of 5, whereas the pairing of cities at 0 and 2, and 3 and 5 leads to a road length of 4.

However, a slightly more sophisticated greedy algorithm does lead to optimum results: More succinctly, let W and B be the arrays of white and black city coordinates.

Sort W and B, and pair W[i] with B[i]. We can prove this leads to an optimum pairing by induction. The idea is that the pairing for the first city must be optimum, since if it were to be paired with any other city, we could always change its pairing to be with the nearest black city without adding any road. Chapter 18 contains a number of problems whose solutions employ greedy algorithms.
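The sorted pairing and the road-length measure can be sketched as below. The function names are hypothetical. Because pairs may share road sections, the total road length is computed here as the length of the union of the intervals between paired cities, which is this sketch's reading of the problem statement.

```python
def pair_cities(white, black):
    """Pair the i-th smallest white coordinate with the i-th smallest black one."""
    return list(zip(sorted(white), sorted(black)))


def road_length(pairs):
    """Total length of road needed, counting shared sections only once."""
    segments = sorted((min(a, b), max(a, b)) for a, b in pairs)
    total = 0
    covered_to = float("-inf")        # right end of the road built so far
    for lo, hi in segments:
        lo = max(lo, covered_to)      # skip the part that is already covered
        if hi > lo:
            total += hi - lo
            covered_to = hi
    return total
```

With white cities at 0 and 3 and black cities at 2 and 5, the sorted pairing gives (0, 2) and (3, 5) with road length 4, while the pairing (3, 2) and (0, 5) needs road length 5.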

Several problems in other chapters also use a greedy algorithm as a key subroutine.

Invariants

One common approach to designing an efficient algorithm is to use invariants. Briefly, an invariant is a condition that is true during execution of a program. This condition may be on the values of the variables of the program, or on the control logic. A well-chosen invariant can be used to rule out potential solutions that are suboptimal or dominated by other solutions. An invariant can also be used to analyze a given algorithm. Here our focus is on designing algorithms with invariants, not analyzing them.

As an example, consider the 2-sum problem. We are given an array A of sorted integers, and a target value K. The brute-force algorithm for the 2-sum problem consists of a pair of nested for loops. Its complexity is O(n^2), where n is the length of A.

While reducing time complexity to O(n), this approach requires O(n) additional storage for H. Therefore, we can increment i, and preserve the invariant. At each step, we increment or decrement i or j. Since there are at most n steps, and each takes O(1) time, the time complexity is O(n). Correctness follows from the fact that the invariant never discards a value for i or j which could possibly be the index of an element which sums with another element to K.
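A sketch of the invariant-based search. The invariant maintained is that if any pair sums to K, both of its indices lie in the range [i, j]. Two assumptions are made here because the full problem statement is not preserved in this copy: the function only reports whether a pair exists, and the two entries must be at distinct indices.

```python
def has_two_sum(A, K):
    """A is sorted in increasing order; True if two distinct entries sum to K."""
    i, j = 0, len(A) - 1
    while i < j:
        total = A[i] + A[j]
        if total == K:
            return True
        if total < K:
            i += 1    # A[i] is too small to pair with anything at or below index j
        else:
            j -= 1    # A[j] is too large to pair with anything at or above index i
    return False
```

Each step moves i or j by one, so there are at most n steps and no extra storage is needed.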

Identifying the right invariant is an art. Usually, it is arrived at by studying concrete examples and then making an educated guess. Often the first invariant is too strong, i.

The efficient frontier can be viewed as an invariant. For example, suppose we need to implement a stack that supports the max method, which is defined to return the largest value stored in the stack. We can associate with each entry in the stack the largest value stored at or below that entry. This makes returning the largest value in the stack trivial. The invariant is that the value associated with each entry is the largest value stored at or below that entry.
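A minimal sketch of such a stack, storing each entry as a `(value, max_at_or_below)` pair. The class name and the choice of a Python list as the backing store are assumptions; calling `pop` or `max` on an empty stack raises `IndexError` in this sketch.

```python
class MaxStack:
    def __init__(self):
        self._entries = []    # each entry is (value, largest value at or below it)

    def push(self, v):
        # Compare v with the largest value stored before the push.
        m = max(v, self._entries[-1][1]) if self._entries else v
        self._entries.append((v, m))

    def pop(self):
        # Removing the top entry leaves every remaining pair valid.
        return self._entries.pop()[0]

    def max(self):
        return self._entries[-1][1]
```

All three operations take O(1) time, at the cost of one extra stored value per entry.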

The invariant certainly continues to hold after a pop. To ensure the invariant holds after a push, we compare the value v being pushed with the largest value m stored in the stack prior to the push (which is the value associated with the entry currently at the top of the stack), and associate the entry being pushed with the larger of v and m.

Abstract analysis patterns

The mathematician George Polya wrote a book, How to Solve It, that describes a number of heuristics for problem solving. Inspired by this work we present some heuristics, summarized in Table 4.

Abstract analysis techniques.

Analysis principle: Key points

Concrete examples: Manually solve concrete instances of the problem and then build a general solution.

Iterative refinement: Most problems can be solved using a brute-force approach. Find such a solution and improve upon it.

Reduction: Use a well-known solution to some other problem as a subroutine.

Graph modeling: Describe the problem using a graph and solve it using an existing algorithm.

Concrete examples Problems that seem difficult to solve in the abstract can become much more tractable when you examine concrete instances. Specifically, the following types of inputs can offer tremendous insight: Problems 5. Consider the following problem.

Five hundred closed doors along a corridor are numbered from 1 to 500. A person walks through the corridor and opens each door. Another person walks through the corridor and closes every alternate door.

Continuing in this manner, the i-th person comes and toggles the state (open or closed) of every i-th door starting from Door i. You must determine exactly how many doors are open after the 500th person has walked through the corridor. It is difficult to solve this problem using an abstract approach. However, if you try the same problem with 1, 2, 3, 4, 10, and 20 doors, it takes a short time to see that the doors that remain open are 1, 4, 9, 16, ... The 10 doors case is illustrated in Figure 4. Now the pattern is obvious: the doors that remain open are those corresponding to the perfect squares.

[Figure 4. Progressive updates to 10 doors.]

Case analysis

In case analysis, a problem is divided into a number of separate cases, and analyzing each such case individually suffices to solve the initial problem.


Cases do not have to be mutually exclusive; however, they must be exhaustive, that is, cover all possibilities. These cases are individually easy to prove, and are exhaustive. Case analysis is commonly used in mathematics and games of strategy. Here we consider an application of case analysis to algorithm design.


Your task is to identify the largest, second-largest, and third-largest integers in S using SORT5 to compare and sort subsets of S; furthermore, you must minimize the number of calls to SORT5.

If all we had to compute was the largest integer in the set, the optimum approach would be to form five disjoint subsets S1 ,. This takes six calls to SORT5 but leaves ambiguity about the second and third largest integers. It may seem like many additional calls to SORT5 are still needed. Other terms are exhaustive search and generate-and-test.

Often this algorithm can be refined to one that is faster. At the very least it may offer hints into the nature of the problem. One straightforward solution is to sort A and interleave the bottom and top halves of the sorted array. Both these approaches have the same time complexity as sorting, namely O(n log n). You will soon realize that it is not necessary to sort A to achieve the desired configuration; you could simply rearrange the elements around the median, and then perform the interleaving.

Median finding can be performed in time O(n), which is the overall time complexity of this approach. Finally, you may notice that the desired ordering is very local, and realize that it is not necessary to find the median.

In code: However, it is much easier to implement and operates in an online fashion, i. As another example of iterative refinement, consider the problem of string search: Since s can occur at any offset in t, the brute-force solution is to test for a match at every offset.

This algorithm is perfectly correct; its time complexity is O(nm), where n and m are the lengths of s and t. After trying some examples you may see that there are several ways to improve the time complexity of the brute-force algorithm.

As an example, if the character t[i] is not present in s you can advance the matching by n characters.

Furthermore, this skipping works better if we match the search string from its end and work backwards. These refinements will make the algorithm very fast (linear time) on random text and search strings; however, the worst-case complexity remains O(nm). You can make the additional observation that a partial match of s that does not result in a full match implies other offsets that cannot lead to full matches.

As another example, the brute-force solution to computing the maximum subarray sum for an integer array of length n is to compute the sum of all subarrays, which has O(n^3) time complexity. This can be improved to O(n^2) by precomputing the sums of all the prefixes of the given array; this allows the sum of a subarray to be computed in O(1) time. The natural divide-and-conquer algorithm has an O(n log n) time complexity. Finally, one can observe that a maximum subarray must end at one of n indices, and the maximum subarray sum for a subarray ending at index i can be computed from previous maximum subarray sums, which leads to an O(n) algorithm.
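The final O(n) observation can be sketched as follows. This assumes the array is non-empty and that the subarray must contain at least one element; whether the empty subarray is allowed is not stated in this copy. The function name is hypothetical.

```python
def max_subarray_sum(A):
    """Largest sum over all non-empty contiguous subarrays of A."""
    best = ending_here = A[0]
    for x in A[1:]:
        # The best subarray ending at this index either extends the best one
        # ending at the previous index or starts fresh at x.
        ending_here = max(x, ending_here + x)
        best = max(best, ending_here)
    return best
```

Only the previous "ending here" value is needed at each step, so the extra space is O(1).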

Reduction

Consider the problem of determining if one string is a rotation of the other.

A natural approach may be to rotate the first string by every possible offset and then compare it with the second string. This algorithm would have quadratic time complexity. You may notice that this problem is quite similar to string search, which can be done in linear time, albeit using a somewhat complex algorithm. Therefore, it is natural to try to reduce this problem to string search. Indeed, if we concatenate the second string with itself and search for the first string in the resulting string, we will find a match iff the two original strings are rotations of each other.
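The reduction fits in a few lines. The sketch adds a length check, which the reduction needs so that a proper substring is not mistaken for a rotation. Python's `in` operator stands in for the string search subroutine here; its running time depends on the interpreter's implementation and is not guaranteed to be linear in the worst case.

```python
def is_rotation(a: str, b: str) -> bool:
    """True if a can be obtained by rotating b."""
    # Search for a inside b concatenated with itself.
    return len(a) == len(b) and a in b + b
```

For example, "car" is a rotation of "arc" because "car" occurs in "arcarc".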

This reduction yields a linear time algorithm for our problem. Usually, you try to reduce the given problem to an easier problem. Sometimes, however, you need to reduce a problem known to be difficult to the given problem. This shows that the given problem is difficult, which justifies heuristics and approximate solutions.

Graph modeling

Drawing pictures is a great way to brainstorm for a potential solution.

If the relationships in a given problem can be represented using a graph, quite often the problem can be reduced to a well-known graph problem. For example, suppose you are given a set of exchange rates among currencies and you want to determine if an arbitrage exists.

[Table 4. Exchange rates for seven major currencies. An arbitrage is possible for this set of exchange rates.]

If we can find a cycle in the graph with a positive weight, we would have found such a series of exchanges. Such a cycle can be found using the Bellman-Ford algorithm.
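A sketch of arbitrage detection with Bellman-Ford. The edge weighting is an assumption, since the construction of the graph is not preserved in this copy: each exchange rate r becomes an edge of weight -log(r), so a cycle whose rates multiply to more than 1 (positive total log weight) becomes a negative-weight cycle, which is what Bellman-Ford detects. The rates in the usage note are made up for illustration and are not the book's table.

```python
import math


def has_arbitrage(rates, eps=1e-12):
    """rates maps (source, target) currency pairs to exchange rates."""
    currencies = {c for pair in rates for c in pair}
    edges = [(u, v, -math.log(r)) for (u, v), r in rates.items()]
    # Starting every distance at 0 acts like a virtual source connected to
    # every currency, so cycles anywhere in the graph are reachable.
    dist = {c: 0.0 for c in currencies}
    for _ in range(len(currencies) - 1):
        for u, v, w in edges:
            if dist[u] + w < dist[v] - eps:
                dist[v] = dist[u] + w
    # If any edge can still be relaxed, a negative-weight cycle exists.
    return any(dist[u] + w < dist[v] - eps for u, v, w in edges)
```

With the hypothetical rates USD to EUR at 0.9 and EUR to USD at 1.2, a round trip multiplies money by 1.08, so the function reports an arbitrage. The small `eps` guards against floating-point noise.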

The solutions to the problems of painting a Boolean matrix

System design patterns

Sometimes, you will be asked how to go about creating a set of services or a larger system on top of an algorithm that you have designed.

We summarize patterns that are useful for designing systems in Table 4.

Table 4. System design patterns.

Design principle: Key points

Decomposition: Split the functionality, architecture, and code into manageable, reusable components.

Parallelism: Decompose the problem into subproblems that can be solved independently on different machines.

Caching: Store computation and later look it up to save work.

Decomposition

Good decompositions are critical to successfully solving system-level design problems. Functionality, architecture, and code all benefit from decomposition.

For example, in our solution to designing a system for online advertising, we decompose the goals into categories based on the stakeholders. We decompose the architecture itself into a front-end and a back-end. The front-end is divided into user management, web page design, reporting functionality, etc.

The back-end is made up of middleware, storage, database, cron services, and algorithms for ranking ads. Decomposing code is a hallmark of object-oriented programming. The subject of design patterns is concerned with finding good ways to achieve code-reuse. Broadly speaking, design patterns are grouped into creational, structural, and behavioral patterns. Many specific patterns are very natural: strategy objects, adapters, builders, etc. Freeman et al.

Parallelism

In the context of interview questions parallelism is useful when dealing with scale. Efficiency is typically measured in terms of central processing unit (CPU) time, random access memory (RAM), network bandwidth, number of memory and database accesses, etc.

Consider the problem of sorting a petascale integer array. If we know the distribution of the numbers, the best approach would be to define equal-sized ranges of integers and send one range to one machine for sorting. The sorted numbers would just need to be concatenated in the correct order.

If the distribution is not known then we can send equal-sized arbitrary subsets to each machine and then merge the sorted results.

Caching

Caching is a great tool whenever computations are repeated. For example, the central idea behind dynamic programming is caching results from intermediate computations.

Caching is also extremely useful when implementing a service that is expected to respond to many requests over time, and many requests are repeated. Workloads on web services exhibit this property.

Complexity Analysis

The run time of an algorithm depends on the size of its input. One common approach to capture the run time dependency is by expressing asymptotic bounds on the worst-case run time as a function of the input size.

The big-O notation indicates an upper bound on running time. As an example, searching an unsorted array of integers of length n, for a given integer, has an asymptotic complexity of O(n) since in the worst-case, the given integer may not be present. Consider a primality test that tries all numbers from 2 to the square root of the input number n.
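For concreteness, a trial-division test of this kind might look like the sketch below. The function name is hypothetical, and `math.isqrt` (Python 3.8+) is used to get the integer square root.

```python
import math


def is_prime(n: int) -> bool:
    """Trial division by every integer from 2 up to the square root of n."""
    if n < 2:
        return False
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return False    # found a divisor, so n is composite
    return True
```

The loop exits on its first iteration when n is even and greater than 2, and runs about sqrt(n) iterations when n is prime.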

What is its complexity? In the best case, n is divisible by 2. Generally speaking, if an algorithm has a run time that is a polynomial, i.

Notable exceptions exist. For example, the simplex algorithm for linear programming is not polynomial but works very well in practice. On the other hand, the AKS primality testing algorithm has polynomial run time but the degree of the polynomial is too high for it to be competitive with randomized algorithms for primality testing.

Complexity theory is applied in a similar manner when analyzing the space requirements of an algorithm. Usually, the space needed to read in an instance is not included; otherwise, every algorithm would have O(n) space complexity.

Several of our problems call for an algorithm that uses O(1) space. Conceptually, the memory used by such an algorithm should not depend on the size of the input instance. Specifically, it should be possible to implement the algorithm without dynamic memory allocation, whether explicit or indirect.

Furthermore, the maximum depth of the function call stack should also be a constant, independent of the input. The standard algorithm for depth-first search of a graph is an example of an algorithm that does not perform any dynamic allocation, but uses the function call stack for implicit storage; its space complexity is not O(1).

A streaming algorithm is one in which the input is presented as a sequence of items and is examined in only a few passes (typically just one). These algorithms have limited memory available to them (much less than the input size) and also limited processing time per item. Algorithms for computing summary statistics on log file data often fall into this category.
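As an illustration of a one-pass summary statistic, the sketch below computes the count, mean, and population variance of a stream using Welford's update. This particular technique is chosen for the example and is not taken from the book. It keeps three numbers of state regardless of the input length and does constant work per item.

```python
def streaming_mean_variance(items):
    """Single pass over an iterable of numbers; returns (count, mean, variance)."""
    count = 0
    mean = 0.0
    m2 = 0.0                       # running sum of squared deviations from the mean
    for x in items:
        count += 1
        delta = x - mean
        mean += delta / count      # update the running mean
        m2 += delta * (x - mean)   # uses both the old and the new mean
    variance = m2 / count if count else 0.0   # population variance
    return count, mean, variance
```

The input can be any iterator, such as a generator that parses a log file line by line, so the full data never has to fit in memory.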

As a rule, algorithms should be designed with the goal of reducing the worst-case complexity rather than average-case complexity for several reasons: