Preparation for Interviews

Commuication, code stype, runnable code, good test cases in a given time are more important than coming up with a very optimised solution.

  • Code style: Use meaningful variable names, use comments to explain the code, use functions to break down the problem.
  • Runnable code: Make sure the code can be run, use a simple test case to demonstrate the code works. You should have knowledge of the edge cases, and possible exceptions.
  • Communication: Explain the code, explain the logic behind the code, explain the time and space complexity of the code. It is always good to well understand the code and explain your idea before the implementation.
  • Test cases: Use simple test cases to demonstrate the code works, and explain the edge cases. It is always good to have a few test cases in mind before the interview.

String

Find the first occurrence of a substring in a string.

def strStr(haystack: str, needle: str) -> int:
    """ Return the index of first occurrence of needle in haystack, or -1 if not found
        This is the implementation of find() function in Python.
        time O(n*m) memory O(1)
    """
    n_needle = len(needle)
    # The last start index of needle is len(haystack) - n_needle
    for i in range(len(haystack) - n_needle + 1):
        match = True
        for j, v in enumerate(needle):
            if haystack[i+j] != v:
                match = False
                break
        if match:
            return i
    return -1

Recursive

When I say recusive, is the idea of recursion, not the function which call itself. To find all subsets of a set, we start we a list contain only on empty set. Each time we go through all elements in the list and add the current element to each subset in the list, then we add the new subsets to the list. This is a recursive idea, but it is implemented with a for loop. So the lenght list of subsets is doubled each time we add a new element to the set.

def subsets(nums: list[int]) -> list[list[int]]:
    """ https://leetcode.com/problems/subsets
        Given an integer array nums of unique elements,
        return all possible subsets (the power set).
        there is only for loop, but it using recursive idea,
        each time we update the subsets and reuse it.
    """
    subsets = [[]]
    for e in nums:
        new_subsets = []
        for s in subsets:
            new_s = s.copy()
            new_s.append(e)
            new_subsets.append(new_s)
        subsets += new_subsets
    return subsets

For the subsets of a set with duplicated elements, we can use a counter to count the number of each element in the set. Then we can generate all subsets by adding the current element to each subset in the list. The combination of several same elements is treated as a new element, and we can add it to the list of subsets the same time with the base element to not generate duplicate subset.

def subset2(nums: list[int]) -> list[list[int]]:
    """ https://leetcode.com/problems/subsets-ii
        This time, there are duplicated elements.
    """
    subsets = [[]]
    counts = Counter(nums)
    for e, count in counts.items():
        new_subsets = []
        for s in subsets:
            for c in range(count):
                new_s = s.copy()
                new_s += [e] * (c + 1)
                new_subsets.append(new_s)
        subsets += new_subsets
    return subsets

Binary Search

Binary search is a very common algorithm to find an element in a sorted array. The idea is to divide the array into two halves, and check if the element is in the left half or the right half. If it is in the left half, we continue to search in the left half, otherwise we search in the right half. This process continues until we find the element or the array is empty.

Usally in the following cases we use binary search:

  • Find a log(n) algorithm or optimise a O(n) algorithm.
  • Solve a problem in a sorted array or rotated sorted array.

There are two common ways to implement binary search:

  1. Iterative: We use a while loop to search for the element, and update the left and right pointers based on the comparison of the middle element and the target element.
  2. Recursive: We use a recursive function to search for the element, and return the index of the element if found, otherwise return -1. But this not a good idea for production code, as it can cause stack overflow for large arrays.

There are some common problems that can be solved with binary search:

  • Find any occurrence of an element in a sorted array.
  • Find the first occurrence of an element in a sorted array.
    • This is the behavior of the bisect_left function in Python. Get the position to insert a new element in a sorted array, such that the order of the array is preserved. If the element is already present, the insertion point will be before (to the left of) any existing entries. Bisect show the position where the new element should stay after the insertion.
  • Find the last occurrence of an element in a sorted array.
    • This is the behavior of the bisect_right function in Python. Get the position to insert a new element in a sorted array, such that the order of the array is preserved. If the element is already present, the insertion point will be after (to the right of) any existing entries.
def binary_search_first(nums:list[int], target:int):
    """ get the first occurence position of target (bisect.bisect_left)
        
    """
    if nums is None or len(nums) == 0:
        return -1
    left, right = 0, len(nums) -1
    while left + 1 < right:
        mid = left + (right - left)//2
        if nums[mid] == target:
            right = mid # for ,any position case, you can just return
        elif nums[mid] < target:
            left = mid # (+1, -1 may not work for case when first that is bigger than target... )
        else:
            right = mid
    if nums[left] == target:
        return left
    if nums[right] == target:
        return right
    return -1


def binary_search_last(nums:list[int], target:int):
    """ get the last occurence position of target (bisect.bisect_right)"""
    n = len(nums)
    if nums is None or n == 0:
        return -1
    left, right = 0, n -1
    while left + 1 < right:
        mid = left + (right - left)//2
        if nums[mid] == target:
            left = mid
        elif nums[mid] < target:
            left = mid
        else:
            right = mid
    if nums[left] == target:
        return left
    if nums[right] == target:
        return right
    return -1

There are three thing to mention here:

  1. The (left + 1 < right ) condition is used to avoid infinite loop, as we need to narrow down the search space. We break the loop when there are only two elements left in the search space.
  2. The mid = left + (right - left)//2 is used to avoid overflow, as mid = (left + right)//2 can cause overflow for large arrays.
  3. The if nums[mid] == target: condition is used to check if the middle element is the target element, and we update the left or right pointer based on the comparison of the middle element and the target element. If we want to find the first occurrence of the target element, we update the right pointer to mid, otherwise we update the left pointer to mid. If we want any occurrence of the target element, we can just return the mid index.

Get the square root of a number

Find the integer sqrt root of a number using binary search. This is the problem of finding the last occurrence x such that x*x <= n. The idea is to use binary search to find the integer square root of a number. We can use the same approach as above, but we need to check if mid*mid <= n instead of nums[mid] == target.

def sqrt(x:int) -> int:
    """ compute and return the square root of x
    idea: **Last** number i that i**2 <= x
    """
    left, right = 1, x
    while left + 1 < right:
        mid = left + (right - left)//2 
        sq_mid = mid ** 2
        if sq_mid == x:
            left = mid
        elif sq_mid < x:
            left = mid
        else:
            right = mid
        
    if right ** 2 <= x:
        return right
    return left

search-a-2d-matrix

This is a binary search if we transform the 2D matrix into a 1D array. row, col = divmod(mid, n) is used to find the original row and column. log(M) + log(N) = log(MN) = log(MN) is the time complexity of this algorithm, where M is the number of rows and N is the number of columns in the matrix. The space complexity is O(1) as we only use a few variables to store the left, right, mid, row, and col indices.

def search_matrix(matrix:list[list[int]], target:int) -> bool:
    """ https://leetcode.com/problems/search-a-2d-matrix
    Each row is sorted in non-decreasing order.
    The first integer of each row is greater
    than the last integer of the previous row.
    """
    if len(matrix) == 0 and len(matrix[0]) == 0:
        return False
    if len(matrix) == 1 and len(matrix[0]) == 1:
        return matrix[0][0] == target
    m, n = len(matrix), len(matrix[0])
    left, right = 0, m * n - 1
    while left + 1 < right:
        mid = left + (right - left)//2
        row, col = divmod(mid, n)
        if matrix[row][col] == target:
            return True
        elif matrix[row][col] < target:
            left = mid
        else:
            right = mid
    row1, col1 = divmod(left, n)
    row2, col2 = divmod(right, n)
    if matrix[row1][col1] == target or matrix[row2][col2] == target:
        return True
    return False

Or we can do two binary searches, one for the row and one for the column. But this is not a good idea as it is not efficient.

def search_matrix_method2(matrix: list[list[int]], target: int) -> bool:
    """
    Each row is sorted in non-decreasing order.
    The first integer of each row is greater than the last integer of the previous row.
    """
    if not matrix or not matrix[0]:
        return False
    m, n = len(matrix), len(matrix[0])
    starts = [row[0] for row in matrix]
    start_pos = bisect.bisect_right(starts, target) - 1
    if start_pos < 0:
        return False
    # Binary search in the identified row
    row = matrix[start_pos]
    col_pos = bisect.bisect_left(row, target)
    return col_pos < n and row[col_pos] == target

search in rotated sorted array

It is a binary search, but we need to handle the case when the array is rotated. The idea is to find the pivot index where the array is rotated, and then use binary search to find the target element in the left or right half of the array based on the pivot index. The pivot index is the index where the array is rotated, and it can be found by checking if the middle element is greater than the rightmost element or not.

def search_in_rotated_sorted_array(nums: list[int], target: int):
    """ https://leetcode.com/problems/search-in-rotated-sorted-array/
        [4,5,6,7,0,1,2] the sorted array is rotated at pivot index k = 3
        [k, k+1,...,n-1]
        All values of nums are unique.
        nums is an ascending array that is possibly rotated.

        For a mid, there can be four cases for target.
        In the first halp of not and bigger or smaller than mid
        So we need to discuss these four cases in the implementation
    """
    if not nums:
        return -1
    left, right = 0, len(nums) - 1
    while left + 1 < right:
        mid = left + (right - left) // 2
        if nums[mid] == target:
            return mid
        # Left half is sorted
        if nums[left] <= nums[mid]:
            if nums[left] <= target < nums[mid]:
                right = mid
            else:
                left = mid
        # Right half is sorted
        else:
            if nums[mid] < target <= nums[right]:
                left = mid
            else:
                right = mid
    if nums[left] == target:
        return left
    if nums[right] == target:
        return right
    return -1

Array and Numbers

Merge Sorted Array

Two merge two ordered array with one array has empty place. We should think of filling the array from the back to the front, so that we can avoid overwriting the elements in the first array and move all elements after the first element. This is a common problem in interview preparation, and it is a good practice to implement it in-place.

def merge_sorted_array(nums1:list[int], nums2:list[int]):
    """ Merge two sorted array nums1 and nums2 into a new sorted interger array
        Merge Sorted Array
        https://leetcode.com/problems/merge-sorted-array
        Do not return anything, modify nums1 in-place instead. Fill from back to front.
    """
    m, n = len(nums1), len(nums2)
    i, j, cur = m-1, n-1, m+n-1
    while(i>=0 or j>=0):
        if len(nums1)==0 or i<0:
            nums1[cur] = nums2[j]
            j-=1
        elif len(nums2)==0 or j<0:
            nums1[cur] = nums1[i]
            i-=1
        elif nums1[i] > nums2[j]:
            nums1[cur] = nums1[i]
            i-=1
        else:
            nums1[cur] = nums2[j]
            j-=1
        cur-=1

Hash Set and Hash Map

A hash set is a data structure that stores unique elements, and a hash map is a data structure that stores key-value pairs. Both of them are implemented with a hash function to compute the index of the element in the array. The time complexity of the operations in a hash set and a hash map is O(1) on average, but it can be O(n) in the worst case when there are many collisions.

Longest Repeating Character Replacement

You are given a string s consisting of only uppercase english characters and an integer k. You can choose up to k characters of the string and replace them with any other uppercase English character. After performing at most k replacements, return the length of the longest substring which contains only one distinct character.

The idea is to use a sliding window to find the longest substring with at most k replacements. We keep track of the counts of each character in the current window, and the maximum count of any character in the window. The length of the current window minus the maximum count gives us the number of characters that need to be replaced to make all characters in the window the same. If this number is less than or equal to k, we can update the maximum length of the substring. Otherwise, we move the left pointer of the window to shrink the window until we have at most k replacements.

from collections import defaultdict
class Solution:
    def characterReplacement(self, s: str, k: int) -> int:
        l, r, n = 0, 0, len(s)
        max_repeat = 1
        counts = defaultdict(int)
        while r < n:
            counts[s[r]] += 1
            cur_len = r - l + 1
            missed_char = cur_len - max(counts.values())
            if missed_char <= k:
                max_repeat = max(max_repeat, cur_len)
            else:
                counts[s[l]] -= 1
                l += 1
            r += 1
        return max_repeat

Longest Consecutive Sequence

Given an array of integers nums, return the length of the longest consecutive sequence of elements that can be formed.

The idea is to use a hash set to store the elements in the array, and then iterate through the array to find the longest consecutive sequence. For each element, we check if it is the start of a sequence (i.e., if the previous element is not in the set), and then we keep checking for the next elements in the sequence until we find a break in the sequence. We keep track of the longest sequence length and return it at the end.

class Solution:
    def longestConsecutive(self, nums: List[int]) -> int:
        num_set = set(nums)
        longest = 0
        for num in nums:
            if not num - 1 in num_set:
                # in this case num is a start of a sequence
                n = 1
                while num + 1 in num_set:
                    num += 1
                    n += 1
                longest = max(longest, n)
        return longest

Stack and Queue

Stack and queue are two common data structures that are used in many algorithms. A stack is a LIFO (Last In First Out) data structure, and a queue is a FIFO (First In First Out) data structure. We can use a stack to implement a queue, and we can use a queue to implement a stack. The idea is to use two stacks to implement a queue, and use two queues to implement a stack.

Min Stack

A min stack is a stack that supports push, pop, top, and retrieving the minimum element in constant time O(1). The idea is to use an auxiliary stack to keep track of the minimum element.

The min_stack is used to keep track of the minimum element in the stack. When we push an element onto the stack, we also push it onto the min_stack if it is smaller than or equal to the current minimum element. When we pop an element from the stack, we also pop it from the min_stack if it is equal to the current minimum element. The top of the min_stack will always be the minimum element in the stack.

The idea is that we care only about the minimum element. Poping non minimum elements does not change the getMin result.

class MinStack:

    def __init__(self):
        self.stack = []
        self.min_stack = []

    def push(self, val: int) -> None:
        self.stack.append(val)
        if len(self.min_stack) == 0 or val <= self.min_stack[-1]:
            self.min_stack.append(val)

    def pop(self) -> None:
        val = self.stack.pop()
        if self.min_stack[-1] == val:
            self.min_stack.pop()

    def top(self) -> int:
        return self.stack[-1]

    def getMin(self) -> int:
        return self.min_stack[-1]

Evaluate Reverse Polish Notation

Reverse Polish Notation is a mathematical notation in which every operator follows all of its operands. It is also known as postfix notation. The idea is to use a stack to evaluate the expression.

We go through each token in the expression, if it is a number, we push it onto the stack. If it is an operator, we pop the top two elements from the stack, apply the operator to them, and push the result back onto the stack. At the end of the expression, the stack will contain only one element, which is the result of the expression.

The time complexity of this algorithm is O(n), where n is the number of tokens in the expression, and the space complexity is O(n) in the worst case, when all tokens are numbers.

def evalRPN(self, tokens: List[str]) -> int:
    stack = []
    operators = {"+", "-", "*", "/"}
    for c in tokens:
        if not c in operators:
            stack.append(int(c))
        else:
            b = stack.pop()
            a = stack.pop()
            match c:
                case "+":
                    stack.append(a + b)
                case "-":
                    stack.append(a - b)
                case "*":
                    stack.append(a * b)
                case "/":
                    stack.append(int(a / b))
    return stack.pop()

Daily Temperatures

You are given an array of integers temperatures where temperatures[i] represents the daily temperatures on the ith day. Return an array result where result[i] is the number of days after the ith day before a warmer temperature appears on a future day. If there is no day in the future where a warmer temperature will appear for the ith day, set result[i] to 0 instead.

The idea is to use a stack to keep track of the indices of the temperatures. We go through the temperatures from the end to the beginning, and for each temperature, we pop the indices from the stack until we find a temperature that is greater than the current temperature. The difference between the current index and the index of the next warmer temperature is the number of days we need to wait for a warmer temperature. If there is no warmer temperature, we set the result to 0.

from collections import deque

class Solution:
    def dailyTemperatures(self, temperatures: List[int]) -> List[int]:
        n = len(temperatures)
        stack = deque([(n - 1, temperatures[-1])])   # (index, temperature)
        result = [0] * n
        for i in range(n - 2, -1, -1):
            t = temperatures[i]
            while stack and t >= stack[-1][1]:
                stack.pop()
            if stack:
                result[i] = stack[-1][0] - i
            stack.append((i, t))
        return result

Greedy Algorithm

The greedy algorithm is an algorithmic paradigm that builds up a solution piece by piece, always choosing the next piece that offers the most immediate benefit. The idea is to make a locally optimal choice at each step with the hope of finding a global optimum.

Container With Most Water

You are given an integer array heights where heights[i] represents the height of the ith bar. You may choose any two bars to form a container. Return the maximum amount of water a container can store.

The idea is to use two pointers, one at the beginning and one at the end of the array. We calculate the area of the container formed by the two bars, and then we move the pointer that points to the shorter bar towards the other pointer. We repeat this process until the two pointers meet. The maximum area found during this process is the answer. The reason we move the pointer that points to the shorter bar is that the area is limited by the shorter bar, so moving the taller bar will not increase the area. By moving the shorter bar, we have a chance to find a taller bar that can increase the area.

class Solution:
    def maxArea(self, heights: List[int]) -> int:
        ans = 0

        l = 0
        r = len(heights)-1

        while l<r:
            area = min(heights[l], heights[r])*(r-l)

            if heights[l]<heights[r]:
                l+=1
            else:
                r-=1
            
            ans = max(ans, area)

        return ans

Dynamic Programming

Dynamic is more a way of thinking than a specific algorithm. It is used to solve problems that can be broken down into smaller subproblems, and the solution to the problem can be built from the solutions to the subproblems. The key idea is to store the solutions to the subproblems in a table, so that we can reuse them later.