Wednesday, July 8, 2020

Lowest Common Ancestor

Lowest Common Ancestor Tree problems are so popular recently that weve seen so many candidates have been asked about it by companies like Google, Facebook, Microsoft and so on. On second thought, this makes a lot of sense. Tree is one of the most useful and fundamental data structures in real products. For instance, tree structure is widely used in machine learning like decision trees. Whats more, tree related interview questions can cover a lot of topics like iteration and recursion. This week, Im going to talk about LCA (Lowest Common Ancestor) problem with some extensions. Again, the answer itself is not important and what this post focuses on is how to come up with the right idea and how to analyze the problem from scratch. Ill cover topics including recursion, big-O analysis, iteration and so on. Question Given a binary tree and two nodes, how to find the common ancestor of the two nodes? In the above example, for nodes 5 and 4, the lowest common ancestor is 1. And for nodes 6 and 2, the ancestor is 0. The question has been asked by both Google and Microsoft recently. Analysis Think about the problem by yourself before reading the rest of the post. Here we go. IMHO, tree problems are relatively easier compared to other data structures in coding interviews. The reason is that there are clear patterns to solve tree problems and they are related to the basic concepts. Let me explain this in detail. If you have read our previous post Deepest Node In a Tree, you should know that generally there are two basic concepts about tree problems: Traverse. You should be very familiar with traversing a tree. Things like in-order traversal, BFS should come to your mind immediately. Recursion. Since its very easy to divide a tree problem to subproblems (left and right sub-trees), recursion is one of the most common techniques. Lets see how these two concepts can be used to solve LCA. Traverse It should be already clear to you if you know traversal can be used. Lets say for nodes A and B, it should be very easy to get the path from the root to the nodes. Ill skip the discussion about how to get the path as it should be easy for you. Once you have two paths root to A and root to B, you can just iterate over the two paths simultaneously and the last common node is the lowest common ancestor. Some people find it hard to think about getting the path, that is because they are not familiar with traversal. Once tree traversal has become your basic tools, everything comes naturally. What is the complexity of the algorithm? To get two paths, we need to traverse the tree twice. Finding the common node also requires one iteration. So the time complexity is O(N). Space complexity is also O(N) as we need extra space to store the paths. Recursion The disadvantage of the previous solution is that it needs to iterate 3 times with extra space. Lets see if we can make it better. To use recursion, you need to figure out two things: 1. What is the end point? 2. How to combine sub-problem solutions to solve the bigger problem? First, we can get LCA from both left tree and right tree (if exists). If the either of A or B is the root node, then the root is the LCA and we just return the root, which is the end point of the recursion. As we keep divide the tree into sub-trees, eventually, well hit either A and B. To combine sub-problem solutions, if LCA(left tree) returns a node, we know that both A and B locate in left tree and the returned node is the final result. If both LCA(left) and LCA(right) return non-empty nodes, it means A and B are in left and right tree respectively. In this case, the root node is the lowest common node. You might be confused why its possible for both left and right return non-empty nodes. This is because we assume that A and B must exist in the tree so that in any of the subproblem whose root is A or B, we just return the root. Since we traverse all nodes at most once without external space, both time and space complexity is O(N). Follow-up What if each node has a parent pointer that points to its parent? In fact, this is an even simpler problem. With the parent pointer, you can get the path from A/B to the root. Then we can easily get the LCA as the first solution. However, the time complexity here is O(h) where h is the height of the tree. This is because to get the path to the root, you dont need to traverse the whole tree as before. Similarly, the space complexity is also O(h). Takeaways I believe that with more posts about tree interview questions, you are more familiar with this data structure. Basic concepts like traversal. As you can see, once you are familiar with these basic techniques, its quite easy to come up with the right idea. If you find yourself confused about particular concepts, do go back and review the textbook. Recursion. Weve been using recursion to solve tree problems for so many times.

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.