ZigZag Conversion

Problem Statement

You are given a string s and an integer num. Instead of writing out the string horizontally, we display it in a manner such that there are num number of rows and the characters in the string s follow a zigzag pattern. Below is an example for when s = ”COMPUTERSCIENCE” and num = 4:

C			E			N
O		T	R		E	C
M	U		S	I		E
P			C

In the end, we want to output the string that results from reading the pattern from left to right and top to bottom. In this example, it would be ”CENOTRECMUSIEPC”. Come up with an efficient solution that correctly manipulates the input string. Please also provide runtime and space complexity analysis. Source: LeetCode ZigZag Conversion

Starter Code

def zigzag(s, num):
"""
input s: a string
input num: an integer
output: a string after manipulating the input according to the instructions
"""
# your solution here

Test Cases

Test 1

Input: “EECS”, 1
Output: “EECS”

Test 2

Input: “CS”, 3
Output: “CS”

Test 3

Input: “COMPUTERSCIENCE”, 3
Output: “CUSNOPTRCECMEIE”

Hint

Hint 1 (Click to reveal!)

How to approach this problem?

Since in the end we read the characters by row (from top to bottom), can you think of a way to put the characters in the original string into their respective rows (in the correct order)?

Hint 2 (Click to reveal!)

Identifying a Pattern

Did you notice any special relationship between the indices of the characters and their respective row numbers?

Solution

Click Here!

Algorithm

Let’s first walk through the example of when s = ”COMPUTERSCIENCE” and num = 4. We first write out the pattern, along with the indices of each character:

C 00			E 06			N 12
O 01		T 05	R 07		E 11	C 13
M 02	U 04		S 08	I 10		E 14
P 03			C 09

As hint 1 suggested, we want to be able to put the characters in the original string into their respective rows in the correct order. Before doing so, we can try to group characters together in the following way: a group consists of a column and the diagonal following it (excluding the bottom most and the top most items of that diagonal). Thus, there are \(2 * num - 2\) characters in each group. In the example above, there are 6 characters in each group (and the groups are “COMPUT”, “URSCIE”, and “NCE”). Now, we notice a very interesting pattern: if you divide the index of each item in row i by the number of items in each group, the remainder is either i or -i (which is the same as the number of items in a group minus i). For example, if we take row 2 for example, we see:

M 02	U 04		S 08	I 10		E 14

The index for ’M’ is 2, and when it divides 6 (the number of items in each group), the remainder is 2. The index for ’U’ is 4, and when it divides 6 (the number of items in each group), the remainder is 4 (which is 6-2). The index for ’S’ is 8, and when it divides 6 (the number of items in each group), the remainder is 2. The index for ’I’ is 10, and when it divides 6 (the number of items in each group), the remainder is 4 (which is 6-2). The index for ’E’ is 14, and when it divides 6 (the number of items in each group), the remainder is 2. (Note: you can check this for other rows as well!)

Now, we can just code it up: we first create a list for each row, and then iterate through the string and add characters to their respective rows according to the pattern we discovered above!

Code

def zigzag(s, num):
    if num == 1: # if there is one row
        return s # we return the original string
    if len(s) <= num: # if there are less characters then the number of rows
        return s # we also return the original string
    rows_lst = [] # this helps store the characters in each of the rows
    for _ in range(num):
        rows_lst.append([])
    grp = 2 * num - 2 # declares the number of characters in each group
    for i in range(len(s)):
        rmdr = i % grp # computes the remainder when the index divides grp
        if rmdr <= grp / 2: # this is the case when remainder is the row number
            rows_lst[rmdr].append(s[i])
        else: # this is the case when the remainder is group size - row number
            rows_lst[grp-rmdr].append(s[i])
    for i in range(len(rows_lst)):
        rows_lst[i] = "".join(rows_lst[i]) # convert each row into a string
    return "".join(rows_lst) # join the strings from each row

Runtime Analysis

The runtime of this algorithm depends solely on the length of the input string s since we scan it once to place them in their respective row list. Thus, this solution is O(n) where n is the length of the input string s.

Space Complexity

O(n) where n is the length of the input string s because we created a list of lists (representing each row) and the total number of elements the interior lists equal to the number of characters in the input string.

Follow-up Question

How would you parallelize this problem? Keep in mind the number of machines you have may be less than the number of subproblems you may have, in this case, make sure you find an acceptable way of dividing the work equally amongst the machines.

Follow-up Solution

Solution for the Follow-up Question

Earlier, we brought up the idea of a “group”, which is essentially a column and the diagonal following it in the display (the diagonal does not include the top most item and does not double count the bottom most item). We can parallelize the above solution by assigning an equal number of groups to the n machines we have in order (or close to an equal number with the exception of the last machine). For example, if we have 8 groups and 3 machines, we will assign the first 3 groups to the first machine, next 3 groups to the second machine, and the last 2 groups to the final machine. On each of the machines, we complete the process stated above except for the very last step of joining the strings that result from individual rows. On the n machines, we will have n different list of strings (where each string represents the combined result from each row), and we wish to combine them. So for each row in all the n lists, we concatenate the strings, which would result in one list with num strings. Lastly we join the strings into one big output. Below is an example for when s = ”COMPUTERSCIENCE”, num = 3, and we wish to divide the work amongst 3 machines:

Originally the zigzag would look something like this:

C			E			N
O		T	R		E	C
M	U		S	I		E
P			C

We will divide the work amongst the machines in the following way:

C			E			N
O		T	R		E	C
M	U		S	I		E
P			C
1	1	1	2	2	2	3

The first machine will do computational work for the string “COMPUT”, the second machine will do computational work for the string “ERSCIE”, and the last machine will do computational work for the string “NCE”.

We will retrieve the fellow lists as outputs from each machine:

Machine 1 Output: [‘C’, ‘OT’, ‘MU’, ‘P’]
Machine 2 Output: [‘E’, ‘RE’, ‘SI’, ‘C’]
Machine 3 Output: [‘N’, ‘C’, ‘E’]

We then combine the three lists into one [‘CEN’, ‘OTREC’, ‘MUSIE’, ‘PC’], and then output the concatenation which is ‘CENOTRECMUSIEPC’