Interesting CS Interview Questions #2

This post is a continuation of my interesting CS interview questions series. The last two problems discussed in this post lean more towards logic and probability rather than a CS concept.

Consider the Lock and Unlock operation defined on an m-ary tree as follows:

Lock Node X => If (no node present in the subtree rooted at X (including X) is locked) AND (no direct ancestor of X is locked), then lock X

Unlock Node X => Assumes that it is called only on a node that has been previously locked. Simply unlocks node X.

Design an efficient data structure and algorithms to perform the required lock and unlock operations.

The first solution that struck me was to simply perform a complete tree traversal and determine whether the required conditions for locking are met. This has complexity O(N), where N is the number of nodes. Is this the best that can be done?

On further thought, since we are free to design the data structure that suits us best, we can improve the efficiency. Suppose each node stores a pointer to its parent. Then determining whether any direct ancestor is locked is simply a loop that climbs the tree via the parent pointers. How do we check whether any node in the subtree is already locked? Another traversal is possible, but it would not improve our time complexity. Instead, observe that when a node is locked, all of its direct ancestors need to know that a descendant is locked. This suggests keeping a flag in each node which is set if a descendant is locked. Now the lock operation can check for locked descendants in constant time. After locking successfully, it traverses back up to the root via the parent pointers and marks each ancestor as having a locked descendant.

How will unlock happen then? We just unlock the given node, but we cannot simply clear the locked-descendant flags in the ancestors. Why? Because other descendants may still be locked! This seems to complicate things, but not really. Instead of a flag, we store an integer in each node that counts the number of locked descendants. While unlocking, we simply walk up to the root and decrement these counts.

Now, what is the time complexity? It is proportional to the height of the tree, because we only look at the path from the node to the root (via the parent pointers). For a balanced tree this is logarithmic in the number of nodes.
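The scheme above can be sketched in C as follows (the struct layout and all names are my own invention; child pointers are omitted since lock and unlock only need the parent pointer and the counter):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node layout for the lock/unlock problem. */
typedef struct Node {
    struct Node *parent;
    bool locked;            /* is this node itself locked? */
    int lockedDescendants;  /* number of locked nodes strictly below this node */
} Node;

/* Returns true on success. Costs O(h), where h is the height of the tree. */
bool lock(Node *x) {
    if (x->locked || x->lockedDescendants > 0)
        return false;                          /* x or something below it is locked */
    for (Node *p = x->parent; p != NULL; p = p->parent)
        if (p->locked)
            return false;                      /* a direct ancestor is locked */
    x->locked = true;
    for (Node *p = x->parent; p != NULL; p = p->parent)
        p->lockedDescendants++;                /* tell every ancestor */
    return true;
}

/* Assumes x was previously locked. Also O(h). */
void unlock(Node *x) {
    x->locked = false;
    for (Node *p = x->parent; p != NULL; p = p->parent)
        p->lockedDescendants--;
}
```

Note how the descendant counter lets unlock avoid re-scanning the subtree: both operations only walk the path to the root.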

This one is like a puzzle. A table has 100 coins, 30 of them with Heads facing up. There is another table with no coins. While blindfolded, what will you do so that the number of coins with Heads facing up is equal on each table? You cannot determine whether a coin is facing Heads up by touching it. You are basically allowed to flip coins and move them from one table to the other.

This problem was pretty challenging for me. Initially the number of Heads on the two tables is 30 and 0. After experimenting with a smaller number of coins, one notices (or at least is supposed to notice) that moving one coin from the first table to the second and flipping it causes the difference in the number of Heads between the tables to decrease by 1. Why? Suppose the coin was a Head on the first table. On flipping and moving it to the other table, the number of Heads on the first table decreases by 1, while the number of Heads on the second table remains the same (the coin is added to its Tails count). If the coin was a Tail, then it increases the Heads count on the second table without affecting that of the first. So the solution is to repeat this process (choose some coin, flip it, and move it to the other table) 30 times. Since the initial difference in the number of Heads is 30, after this process the difference becomes 0. Thus, the number of Heads is equal on each table.
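We can sanity-check this argument with a tiny simulation in C (coins are just an array of 0/1 ints; the function and its names are mine, purely for illustration). Since the argument works no matter which coins are picked, the simulation simply moves the first h coins:

```c
/* Table 1 has n coins, exactly h of them heads (coins[i] == 1 means heads).
   Move the first h coins to table 2, flipping each as it moves.
   Returns 1 if the two tables end up with equal head counts. */
int heads_equal_after_move(const int coins[], int n, int h) {
    int heads1 = 0, heads2 = 0;
    for (int i = h; i < n; i++)
        heads1 += coins[i];        /* heads left on table 1 */
    for (int i = 0; i < h; i++)
        heads2 += 1 - coins[i];    /* moved coins are flipped */
    return heads1 == heads2;
}
```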

Given a long stream of numbers, how will you choose a single number, such that it is uniformly random among the numbers so far seen?

Although one can store all the numbers and then choose one at random using standard random functions (like rand() in C), we would like a method that is space efficient. There does exist a method that requires only constant space. Any idea?

Say we have reached the N-th number in the sequence, and suppose we already have the solution for the first (N-1) numbers: call it X. Then X was chosen with probability 1/(N-1) from the first (N-1) numbers. Since the N-th number should be the chosen one with probability 1/N, we keep it with that probability; otherwise X remains the chosen number. We can verify that X is still the chosen number with probability (1/(N-1))*((N-1)/N) = 1/N. Thus, we have the solution: for the i-th number in the sequence, choose it as the random number with probability 1/i; otherwise, do not modify the previously chosen random number.
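This is the classic reservoir-sampling idea with a reservoir of size one. A minimal C sketch (the struct and names are my own) might look like:

```c
#include <stdlib.h>

/* Reservoir of size one: after feeding n numbers, `chosen` holds a
   uniformly random element of the numbers seen so far. */
typedef struct {
    long chosen;   /* the currently selected number */
    long n;        /* how many numbers have been seen */
} Reservoir;

void reservoir_feed(Reservoir *r, long x) {
    r->n++;
    /* keep x with probability 1/n (modulo bias is negligible for small n) */
    if (rand() % r->n == 0)
        r->chosen = x;
}
```

Only two words of state are kept, regardless of how long the stream is.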

Interesting CS Interview Questions #1

I’ve been through a lot of CS interviews recently and so I have come across many interesting problems. I find many of them worth sharing. So I am writing this short series of posts on this topic. All of these interviews were for a job post of Software Development Engineer. If you happen to be reading this, as a sort of preparation for an interview, I suggest you attempt to solve the problem yourself and then see how you like my solution (please correct me for any mistakes!). Most interviewers also ask for code. I am not posting that here, since after getting the solution, coding it should be fairly straightforward with a little practice.

Given a binary tree, how will you convert it into a doubly-linked circular list in-place? That is, the left-child and right-child pointers of a node in the tree should be considered as left-node and right-node pointers in the doubly-linked list.

The key here is that only a constant amount of extra space is allowed as the conversion should be performed in-place. That is, we cannot save the data in another, more convenient data structure and then perform the conversion to a linked list.

An important tip is to think of recursive algorithms when faced with problems about trees and especially binary trees.

Using this tip, we think of solving this problem recursively. When we are given the binary tree, what we are actually given is the head pointer of the tree. Thus, we have the head node, say head. We also immediately have the head nodes of the left and right subtrees of this node (head->left and head->right, using C-like notation). Now, we can simply solve the problem recursively. Thus, we can get the following pseudo-code:

getDLL(head):
[Takes the head pointer of a tree, and returns a circular DLL as required above containing the nodes in the given tree.]

  1. save = head
  2. leftList = getDLL(head->left)
  3. rightList = getDLL(head->right)
  4. Now, combine the above, in-order: (leftList, save, rightList). Don’t forget to make it circular by joining the first and the last nodes.

Note that I have not handled the base cases; I am just illustrating the idea. Clearly, the base cases to consider are when either of the sub-trees is empty. If so, just return whatever list remains, without adding any null nodes.
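For completeness, here is one way the full algorithm, base cases included, might look in C (the node type and the helper are my own naming):

```c
#include <stddef.h>

typedef struct TNode {
    int val;
    struct TNode *left, *right;  /* tree children; reused as DLL prev/next */
} TNode;

/* Joins two circular DLLs (either may be NULL) and returns the head.
   In a circular DLL, head->left is the last node. */
static TNode *joinLists(TNode *a, TNode *b) {
    if (a == NULL) return b;
    if (b == NULL) return a;
    TNode *aLast = a->left, *bLast = b->left;
    aLast->right = b; b->left = aLast;   /* stitch end of a to start of b */
    bLast->right = a; a->left = bLast;   /* close the circle */
    return a;
}

/* Converts the tree rooted at head into a circular DLL, in-place. */
TNode *getDLL(TNode *head) {
    if (head == NULL) return NULL;
    TNode *leftList = getDLL(head->left);
    TNode *rightList = getDLL(head->right);
    head->left = head->right = head;     /* head alone is a 1-node circle */
    return joinLists(joinLists(leftList, head), rightList);
}
```

The nodes end up in in-order sequence, which matches combining (leftList, save, rightList) in the pseudo-code above.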

Ok, now that you’ve understood the solution, don’t think it’s over. Analyse the complexity. Time complexity is linear in the number of nodes in the tree, i.e. O(N), where N is the number of nodes. Why? In each recursive call, the amount of work done other than the recursive calls is constant. [If you are not convinced, complete the code and see for yourself.] Also, the number of recursive calls equals the number of nodes. Thus, the result. What is the space complexity? It is not constant, because recursion uses the stack. How much space is that?! The maximum size of the stack is proportional to the height of the tree (why??), which is O(log N) for a balanced tree but O(N) in the worst case. I don’t yet know a solution that eliminates this space requirement.

It is known that when function calls are made, activation records are created and pushed on top of a stack, which is managed by the operating system. In some systems the stack grows upwards and on some other systems the stack grows downwards. How will you find out if a stack grows upwards or downwards for a program on a particular computer?

This problem is actually pretty easy if you think about it. You are being asked to find if the address of elements of the activation record increase or decrease with nested function calls. The solution is to write a simple C program that saves the address of a local variable (as it is stored on the stack) in a global variable and then this information can be used in the main function, to determine if the stack grows up or down:

#include <stdio.h>
#include <stdint.h>

uintptr_t a, b;   /* uintptr_t, so an address fits; plain int may not */

void f(void) {
    int local;
    a = (uintptr_t)&local;   /* address of a variable in f's frame */
}

int main(void) {
    int local;
    b = (uintptr_t)&local;   /* address of a variable in main's frame */
    f();
    if (a < b)
        printf("Stack grows down\n");  /* the deeper frame is at a lower address */
    else
        printf("Stack grows up\n");
    return 0;
}

Consider the following code snippets:

for(int i=0;i<100;i++) for(int j=0;j<1000;j++) ;

and,

for(int j=0;j<1000;j++) for(int i=0;i<100;i++) ;

Disregarding all compiler and processor level optimizations, which of these two codes runs faster? Why?

It seems like a nasty puzzle, but the answer is actually pretty simple. We note that all constituent assembly-level instructions are of the types: zero assignment, increment, comparison and jump. Let us count them. The first snippet has 1+100 = 101 zero assignments, 100+100*1000 = 100,100 increments, and 101+100*1001 = 100,201 comparison-and-jump pairs. The second snippet has 1+1000 = 1001 zero assignments, 1000+1000*100 = 101,000 increments, and 1001+1000*101 = 102,001 comparison-and-jump pairs. Clearly, the first one is faster.
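One way to double-check this counting is to instrument the loops with counters. A quick, purely illustrative C sketch (it counts comparisons only, since each comparison is paired with one jump):

```c
/* Operation counters for a nested loop of the form
   for(i=0;i<outer;i++) for(j=0;j<inner;j++) ; */
long zeros, incs, cmps;

void count_snippet(int outer, int inner) {
    zeros = incs = cmps = 0;
    zeros++;                                       /* i = 0 */
    for (int i = 0; cmps++, i < outer; incs++, i++) {
        zeros++;                                   /* j = 0 */
        for (int j = 0; cmps++, j < inner; incs++, j++)
            ;                                      /* empty body */
    }
}
```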

An interesting easy problem

I haven’t been writing for a pretty long time now 🙂 Anyways.

Someone asked me to solve this problem:

Given an array A of integers of length N, find a sub-array whose sum is a multiple of N.

To solve it in a brute-force manner could be a reasonable exercise for beginners in programming: just consider all possible sub-arrays and check if their sum is a multiple of N. This is O(N^3). If you use a cumulative sums array (i.e., S[i] = ∑A[k] where k goes from 1 to i, with S[0] = 0), the solution is slightly smarter and runs in O(N^2): just choose all possible (a,b) such that 0 <= a < b <= N and see if S[b]-S[a] is a multiple of N (defining S[0] = 0 takes care of sub-arrays that start at the first element).

Now we come to the more interesting linear solution. Use the same cumulative sums array as above, but reduce it modulo N: S[i] = (∑A[k]) mod N, or in C parlance, S[i] = (∑A[k]) % N. Clearly, if any of these values is 0, we are done: if S[i] = 0, the array A[1], A[2], …, A[i] constitutes a sub-array with the required property. Also, notice that if any remainder occurs twice, i.e. S[i] = S[j] for i < j, then A[i+1], A[i+2], …, A[j] constitutes a sub-array with the required property. With these observations the linear solution should be obvious: just use another array of size N to keep track of where each remainder was first seen (i.e. let R[r] be the index in the array S where remainder r occurred), and output a solution when a particular remainder is seen twice, or when the remainder 0 is seen.
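A C sketch of this linear algorithm (1-based sub-array indices, function and variable names my own) might look like:

```c
#include <stdlib.h>

/* Finds a sub-array of A (0-indexed, length n) whose sum is a multiple of n.
   Reports it via *from and *to as 1-based inclusive positions, i.e. the
   sub-array is A[*from-1 .. *to-1]. */
void find_subarray(const int A[], int n, int *from, int *to) {
    /* R[r] = index i of the first prefix sum with remainder r, or -1 */
    int *R = malloc((size_t)n * sizeof *R);
    for (int r = 0; r < n; r++) R[r] = -1;
    long s = 0;
    for (int i = 1; i <= n; i++) {
        s += A[i - 1];
        int r = (int)((s % n + n) % n);   /* normalize negative inputs */
        if (r == 0) { *from = 1; *to = i; free(R); return; }
        if (R[r] != -1) { *from = R[r] + 1; *to = i; free(R); return; }
        R[r] = i;
    }
    free(R);   /* unreachable: by pigeonhole a solution always exists */
}
```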

Some of you may also have stumbled onto another realization:

A solution to this problem always exists!

It’s not such a big deal anyway: if none of the remainders occurs twice in the cumulative sums array, then since there are N cumulative sums and only N possible remainders (0 through N-1), every remainder must appear exactly once, including 0. Thus, the result!