Leetcode Fighter: Array & Hashing Finalized

Created on: 20 Jan 25 12:00 +0700 by Son Nguyen Hoang in English

Documents and study note for classic leetcode array & hashing problem

alt text

Finally I solved all other problem in “Array & Hashing” section of Neetcode. Here is a quick breakdown of solution. Begin with “Encode & Decode” string.

Encode & Decode String

Design an algorithm to encode a list of strings to a single string. The encoded string is then decoded back to the original list of strings.

Please implement encode and decode

Analysis & Solution

The first approach would be to divide the string by a special character. However, given that the prerequisite clearly states that each string can be anything in UTF-8. No special character is safe to use. Then, I thought about using a hashmap that store every character, their frequency and position. But then it no helpful when decoding the long hashmap into seperate string. I got lost in the mindset to transform the string into something else, pure numeric or some shit like that.

However, the solution is much more elegant. Instead of using a special character, using two characters prove to be much more reliable option.One is special character and another is a number indicate which character belong to the string.


func encode_string(input_array_string []string) string {
	encoded_string := ""
	for _, str := range input_array_string {
		string_length := len(str)

		encoded_string += strconv.Itoa(string_length) + "#" + str
	}
	return encoded_string
}

func decode_string(input_string string) []string {
	original_string := []string{}

	current_string := ""
	string_length := 0
	string_counter := ""

	for _, char := range input_string {
		if string_length == 0 {

			string_counter += string(char)
			if char == '#' {
				string_counter = string_counter[:len(string_counter)-1]
				if val, err := strconv.Atoi(string_counter); err != nil {
					print(err)
				} else {
					string_length += val
					string_counter = "" // refresh
					if string_length == 0 {
						original_string = append(original_string, "")

					}
				}

			}
		} else {

			current_string += string(char)
			string_length--

			if string_length == 0 {
				original_string = append(original_string, current_string)
				current_string = "" //refresh
			}
			continue
		}

	}
	return original_string
}

The good thing I learn when solving this problem is actual studying on UTF-8 vs Unicode and ANSCII code. I would quickly brief it here: The first, well-known and most simply to understand is ANSCII (used & invented in America). The thing is that ANSCII cannot support other languages (Japanese, Mandarin, etc ) simply because they couldn’t support enough character. Hence, Unicode is invented with larger support for charset. The similarity of the two is that for each character, they assign a number for it.

The thing is if using pure Unicode, simple English alphabet would take much more memory to store. Also, it would sometime cause error in older system. Why? Old system has some rule, such as 8 zeroes set to End of file. This work reseasonably well in ANSCII code but cannot work in pure Unicode. A simple A in Unicode can have so a list of zeroes that unwanted treated as End of File in old system.

Thus UTF-8 is a encryption way to work arround. UTF-8 use 8 bit to create Unicode character. The details of this algorithm is provided in a Computerphile video: https://www.youtube.com/watch?v=MijmeoH9LT4

Products of Array Except Self

Given an integer array nums, return an array output where output[i] is the product of all the elements of nums except nums[i].

Each product is guaranteed to fit in a 32-bit integer.

Follow-up: Could you solve it in O(n) time without using the division operation?

Analysis

alt text

The smartest way to solve is to create 2 extra array: P and Q as above. A fast example could be

Original_Array = [3,4,22,1]

P = [1, 3 , 12 , 242]
Q = [88 , 22, 1, 1]

So the result array is P x Q

Result = [88 , 66 , 12 , 242]

Otherwise, we can simply stick to solution that implement division: 1. Calculate total multiplication of all numbers 2. Make a new array, initialize each member to the total / the correspoinding number in the original array. Note that you must handle the zero case (to avoid division by zero problem)

Solution

func productExceptSelf_Solution2(nums []int) []int {
	totalProduct := 1
	lenght := len(nums)
	result1 := make([]int, lenght)
	result2 := make([]int, lenght)
	result3 := make([]int, lenght)

	for i := 0; i < lenght; i++ {
		if i == 0 {
			result1[i] = 1
			continue
		}
		result1[i] = totalProduct * nums[i-1]
		totalProduct = totalProduct * nums[i-1]
	}
	totalProduct = 1
	for i := lenght - 1; i >= 0; i-- {
		if i == lenght-1 {
			result2[i] = 1
			continue
		}
		result2[i] = totalProduct * nums[i+1]
		totalProduct = totalProduct * nums[i+1]

	}

	for i := 0; i < lenght; i++ {
		result3[i] = result1[i] * result2[i]
	}
	return result3
}

Valid Sudoku

You are given a a 9 x 9 Sudoku board board. A Sudoku board is valid if the following rules are followed:

Each row must contain the digits 1-9 without duplicates.
Each column must contain the digits 1-9 without duplicates.
Each of the nine 3 x 3 sub-boxes of the grid must contain the digits 1-9 without duplicates.
Return true if the Sudoku board is valid, otherwise return false

Note: A board does not need to be full or be solvable to be valid.

Analysis & Solution

My first solution is simple

Use hashmap to check for all invalidity for 9 rows (this would need a loop of 9 times)
Use hashmap to check for all invalidity for 9 columns (this would need a loop of 9 times)
Smartly loop to check for invalidity for each square. The logic is still running a loop for each row, but now check for each 3 unit per row. For each unit, check the nine numer that constitute the square.

alt text

For example, first with i = 0, j = 0, checking all number create with 0 <= i <= 2. Next we move to i = 3, j = 0, so we check all number created with 3 <= i <= 5 and 0 <= j <= 2. Refer to below solution.

func parsed(input []string) []int {
	array := make([]int, len(input))
	for i := 0; i < len(input); i++ {
		array[i], _ = strconv.Atoi(input[i])
	}
	return array
}
func is_valid_sudoku(input [][]string) bool {

	parsed_input := make([][]int, len(input))
	for i := 0; i < len(input); i++ {
		parsed_input[i] = parsed(input[i])
	}
	hashmap := make(map[int]bool)
	// Check all row
	for i := 0; i < len(parsed_input); i++ {
		hashmap = make(map[int]bool)

		for j := 0; j < len(parsed_input[i]); j++ {
			if parsed_input[i][j] == 0 {
				continue
			}
			if hashmap[parsed_input[i][j]] == true {
				return false
			}
			hashmap[parsed_input[i][j]] = true
		}
	}

	//Check all column
	for i := 0; i < len(parsed_input); i++ {
		hashmap = make(map[int]bool)

		for j := 0; j < len(parsed_input[i]); j++ {
			if parsed_input[j][i] == 0 {
				continue
			}
			if hashmap[parsed_input[j][i]] == true {
				return false
			}
			hashmap[parsed_input[j][i]] = true
		}
	}
	// Check each square
	count := 0
	for i, j := 0, 0; i < len(parsed_input) && j < len(parsed_input); {
		hashmap = make(map[int]bool)

		for sqrI := 0; sqrI < 3; sqrI++ {
			for sqrJ := 0; sqrJ < 3; sqrJ++ {
				if parsed_input[i+sqrI][sqrJ+j] == 0 {
					continue
				}
				if hashmap[parsed_input[i+sqrI][sqrJ+j]] == true {
					return false
				} else {
					hashmap[parsed_input[i+sqrI][sqrJ+j]] = true
				}
			}
		}
		i += 3
		count++
		if count == 3 {
			count = 0
			j += 3
			i = 0
		}
	}

	return true
}

However, it turns out that we can conveninent save loop by checking row & colums & square inside on big loop. The optimal solution has some big change:

instead of create 1 hashmap to keep track of validity of each row/colume (then refresh after finished) we use a dictionary of 9 values. 1 Dictionary for row, 1 for colums, 1 for square.
Smartly indexing each square by using row & colume number.

Refer to below for further investigation.

func is_valid_sudoku_optimal(input [][]string) bool {

	parsed_input := make([][]int, len(input))
	for i := 0; i < len(input); i++ {
		parsed_input[i] = parsed(input[i])
	}
	hashmap_row := make([]map[int]bool, 9)
	hashmap_col := make([]map[int]bool, 9)
	hashmap_sqr := make([]map[int]bool, 9)

	// Check all row
	for i := 0; i < len(parsed_input); i++ {

		for j := 0; j < len(parsed_input[i]); j++ {

			if parsed_input[i][j] != 0 && hashmap_row[i][parsed_input[i][j]] == true {
				return false
			}

			if parsed_input[j][i] != 0 && hashmap_col[j][parsed_input[j][i]] == true {
				return false
			}

			sqr_index := int(i/3)*3 + int(j/3)
			if hashmap_sqr[sqr_index][parsed_input[i][j]] == true {

				return false

			}

			if hashmap_sqr[sqr_index] == nil {
				hashmap_sqr[sqr_index] = make(map[int]bool)
			}
			if hashmap_row[i] == nil {
				hashmap_row[i] = make(map[int]bool)
			}
			if hashmap_col[j] == nil {
				hashmap_col[j] = make(map[int]bool)
			}
			if parsed_input[i][j] != 0 {
				hashmap_sqr[sqr_index][parsed_input[i][j]] = true

			}
			if parsed_input[i][j] != 0 {
				hashmap_row[i][parsed_input[i][j]] = true
			}
			if parsed_input[j][i] != 0 {
				hashmap_col[j][parsed_input[j][i]] = true
			}

		}
	}

	return true
}

Longest Consecutive Sequence

Given an array of integers nums, return the length of the longest consecutive sequence of elements that can be formed.

A consecutive sequence is a sequence of elements in which each element is exactly 1 greater than the previous element. The elements do not have to be consecutive in the original array.

You must write an algorithm that runs in O(n) time.

Analysis

The recommended solution is to:

Create a hashmap (to check duplicate of all number in the array).
Start looping from index 0. Keep track length
Smartly check if hashmap[array[i-1]] == false (which imply that i is the beginning index of a consecutive series in the original array). Once found, run a while loop and search for the existence of following, consecutive number from i.

It is agreed that we are having a nested loop here. However, it’s certaint that each number only be processed one. Thus we are certain that the algorithm is O(n). This is probably the most important lession i got from this problem. Having a nested loop doesn’t 100% meant the algorithm work at O(n^2) or O(n^3). In fact, number of nested looping is a heuristic method to evaluate the O(n) but never 100% rely on it.

The solution is like this:

func longestConsecutive(nums []int) int {

	hash := make(map[int]bool)
	for i := 0; i < len(nums); i++ {
		hash[nums[i]] = true
	}
	greatestConsCount := 0
	for i := 0; i < len(nums); i++ {
		if hash[nums[i]-1] == false {
			lenght := 1
			for hash[nums[i]+lenght] == true {
				lenght++

			}
			if lenght > greatestConsCount {
				greatestConsCount = lenght
			}
		}
	}
	return greatestConsCount
}

However, Neetcode provide a more crazier solution: OnePass and Hashmap, that doesn’t require a nested loop:

func longestConsecutive(nums []int) int {
    mp := make(map[int]int)
    res := 0

    for _, num := range nums {
        if mp[num] == 0 {
            left := mp[num - 1]
            right := mp[num + 1]
            sum := left + right + 1
            mp[num] = sum
            mp[num - left] = sum
            mp[num + right] = sum
            if sum > res {
                res = sum
            }
        }
    }
    return res
}

How the heck? Yes, briliantly, I begun to decode and understood this solution. In brief, it’s all about cleverly manipulating number. The core idea is looping and keep track of importance data. Then combine them in a smart way.

A more elegant way!

The first step (also my original first attemp) is to create a dictionary (hashmap) with key assigned to each number, value of each key will have special values. For example, what happend if you can store “current length” to this key-value?

Hypothetically, if my input is [11,99,12,13]. If somehow loop to 2, and I know that 11 is the beginning of a number series. I can easily know that [12] is the next number from 11 then assign length 2 to hashmap[12] -> 2. Then, onced loop to [13], if I know that there is previous number of [13] exist (which is 13 - 1 = 12, then we can check hashmap[12] != 0 to quickly verify this) then I would put hashmap[13] -> (2 + 1) or hashmap[13] -> 3

This model suggest that we must somehow store values of length and manipulate them to achieve what we want. This is my first attempt, albeit I hadn’t followed it to the end. Following this logic & some prequisite, we can safely deduce some key points to be taken accounted:

No matter of how many possible series exist in the array. There is never the case of one number belong to more than one series. For example: [11,99,12,13,10], [10] cannot create its own series of [10,11] only, it must belong to the series of [10,11,12,13] instead. (1)
What happen if the input is [11,99,12,13,10]? That meant when looping, there must be a way to update the series’s first_number_in_the_series. So that when encounter number [10], although [10] is not the last, next value from [13], it must replace [11] as the first_number_in_the_series. (2)
What happen if the input is [11,99,10,9,8,12]? Similarly, we must have someway to keep track the last_number_in_the_series.
Also, there is issue when merging: e.g [10,11, 13,14,15,12], which link two series into one (take notice the number [12])

The first solution of mine work like below:

func longestConsecutiveOnePassB(nums []int) int {

	hash := make(map[int]int)

	greatestConsCount := 0
	for _, num := range nums {
		if hash[num] == 0 {
			next := hash[num+1]
			previous := hash[num-1]

			if previous == 0 && next == 0 {
				hash[num] = 1
			}

			if next != 0 && previous == 0 {
				// if num is first number of a series, num has no "follower"

				// This meant "next" is the prev first number
				prev_series_length := hash[num+1] // this also MUST containt length of the consecutive

				// this meant we can accessed

				last_value_in_series := num + prev_series_length
				hash[last_value_in_series] = hash[last_value_in_series] + 1
				hash[num] = hash[last_value_in_series] // Also update "new" last index to the length of series
			}

			if previous != 0 && next == 0 {
				// if num is new leading number of the series, with no "follower"
				last_leading_value := hash[num-1] // this also MUST containt length of the consecutive
				hash[num] = last_leading_value + 1
				hash[num-last_leading_value] = last_leading_value + 1 // this supposed to be the "first" value in the series ("lower-bound")

			}

			if previous != 0 && next != 0 {
				last_leading_value := hash[num-1]  // the leading of a 1st series (upper bound)
				first_leading_value := hash[num+1] // the lower bound of a series

				new_series_length := last_leading_value + first_leading_value + 1
				hash[num] = hash[new_series_length]
				hash[num-last_leading_value] = new_series_length
				hash[num+first_leading_value] = new_series_length
			}

		}
	}
	for _, num := range hash {
		if num > greatestConsCount {
			greatestConsCount = num
		}
	}
	return greatestConsCount
}

Basically, from current index i,. What we calculate is to find array[i+1] and array[i-1]. It’s self-envidence that there are four possible scenarios:

Scenario A. We cannot find array[i-1] and array[i+1]: Which indicate that i is belong to a new series X. Thus array[i] is last_number_in_the_series and first_number_in_the_series. Also, hashmap[array[i]] = 1 (again, this store lenght of X).

Scenario B. We find array[i-1] but not array[i+1]: This meant array[i] belong to a already-formed series and it is the new last_number_in_the_series of that series. Update length of the series and assign new length to array[i] normally then.

Scenario C. We find array[i+1] but not array[i-1]: This meant array[i] belong to a already-formed series and it is the new first_number_in_the_series of that series. Update length of the series and assign new length to array[i] normally then.

Scenario D. We find array[i+1] and array[i-1]: This meant array[i] is a bridge that merge two priorly-unrelated series. This is very interesting! This also implies that array[i+1] is first_number_in_the_series or a series X, while array[i-1] is last_number_in_the_series or a series Y (before the merging happens).

We denote L(X) as current length of X, Last(X) is current last number in the series X, same with First(X) is current first number in the series X. From (B) and (C), after careful consideration, you would understand that for every series X,Y, the current length of that series must be kept track in Last(X) and First(X) as well. We also we use this to perform merging!

The most important benefit of having Last(X) and First(X) is that they are interchangeable and can be used to quickly indexing last/first number in the series. For example if series X has Last(X) = 15 and hashmap[Last(X)] is 10, that meant X has 10 length (current length). Because last number in X is 15 then this meant the First(X) is 15 - 10 = 5. First(X) is 5!!!

From above reasonale, we had successfully create the one-pass solution. However, let’s optimize this a bit!

Optimize #1:

func longestConsecutiveOnePassB(nums []int) int {

	/// ...
	for _, num := range nums {
		if hash[num] == 0 {
			next := hash[num+1]
			previous := hash[num-1]

			//if previous == 0 && next == 0 { // Optimize this
			//	hash[num] = 1
			//}

			/// ... (Same as above)

			if (previous != 0 && next != 0) || (previous == 0 && next == 0) {
				last_leading_value := hash[num-1]  // the leading of a 1st series (upper bound)
				first_leading_value := hash[num+1] // the lower bound of a series

				new_series_length := last_leading_value + first_leading_value + 1 // Same
				hash[num] = new_series_length
				hash[num-last_leading_value] = new_series_length
				hash[num+first_leading_value] = new_series_length
			}

		}
	}
			/// ... (Same as above)

}

This one is simply to understand. We skip the first check easily because if next & previous is 0, the new length of series is still 1. Not much to be discussed here. However, I did a bit renaming for better consistent:

func longestConsecutiveOnePassB(nums []int) int {

	hash := make(map[int]int)

	greatestConsCount := 0
	for _, num := range nums {
		if hash[num] == 0 {
			next := hash[num+1]
			previous := hash[num-1]

			if next != 0 && previous == 0 {
				prev_series_length := hash[num+1]
				new_series_length := hash[num+prev_series_length] + 1
				hash[num] = new_series_length 
			}

			if previous != 0 && next == 0 {
				next_series_length := hash[num-1]
				new_series_length := next_series_length + 1 
				hash[num] = new_series_length

			}

			if (previous != 0 && next != 0) || (previous == 0 && next == 0) {
				prev_series_length := hash[num-1] // the leading of a 1st series (upper bound)
				next_series_length := hash[num+1] // the lower bound of a series

				new_series_length := prev_series_length + next_series_length + 1
				hash[num] = new_series_length
				hash[num-prev_series_length] = new_series_length
				hash[num+next_series_length] = new_series_length
			}

		}
	}
	for _, num := range hash {
		if num > greatestConsCount {
			greatestConsCount = num
		}
	}
	return greatestConsCount
}

Optimize #2:

Look at this

/// Scenario C
if next != 0 && previous == 0 {
	prev_series_length := hash[num+1]
	new_series_length := hash[num+prev_series_length] + 1
	hash[num] = new_series_length 
}

In this scenario, previous == 0 then hash[num-1] = 0, which meant prev_series_length (if calculated) will also be 0 and cause no effect when calculating new_series_length! This also true in Scenario B. Thus we can safely optimize case B and C into D!

func longestConsecutiveOnePassB(nums []int) int {

	hash := make(map[int]int)

	greatestConsCount := 0
	for _, num := range nums {
		if hash[num] == 0 {

			prev_series_length := hash[num-1] // the leading of a 1st series (upper bound)
			next_series_length := hash[num+1] // the lower bound of a series

			new_series_length := prev_series_length + next_series_length + 1
			hash[num] = new_series_length
			hash[num-prev_series_length] = new_series_length
			hash[num+next_series_length] = new_series_length

		}
	}
	for _, num := range hash {
		if num > greatestConsCount {
			greatestConsCount = num
		}
	}
	return greatestConsCount
}

Which is one-to-one to Neetcode’solution:

func longestConsecutive(nums []int) int {
    mp := make(map[int]int)
    res := 0

    for _, num := range nums {
        if mp[num] == 0 {
            left := mp[num - 1]
            right := mp[num + 1]
            sum := left + right + 1
            mp[num] = sum
            mp[num - left] = sum
            mp[num + right] = sum
            if sum > res {
                res = sum
            }
        }
    }
    return res
}

Yes, you can put the comparison into the first loop, however, i intentionally seperate the two for better clarity.

This is the end of Array & Hasing seciton in Neetcode’s recommended path. Albeit, never I found indexing become such interesting and clever like this.