30. Substring with Concatenation of All Words

Substring with Concatenation of All Words

Substring with Concatenation of All Words

Hello, devs 👋! Today we will discuss a hard problem which is a very popular problem in coding interviews.


You are given a string s and an array of strings words. All the strings  words are of the same length.

concatenated substring in s is a substring that contains all the strings of any permutation of words concatenated.

  • For example, if words = ["ab","cd","ef"], then "abcdef""abefcd""cdabef""cdefab""efabcd", and "efcdab" are all concatenated strings. "acdbef" is not a concatenated substring because it is not the concatenation of any permutation of words.

Return the starting indices of all the concatenated substrings in s. You can return the answer in any order.

 

Example 1:

Input: s = "barfoothefoobarman", words = ["foo","bar"]
Output: [0,9]

Explanation: Since words.length == 2 and words[i].length == 3, the concatenated substring has to be of length 6.
The substring starting at 0 is "barfoo". It is the concatenation of ["bar","foo"] which is a permutation of words.
The substring starting at 9 is "foobar". It is the concatenation of ["foo","bar"] which is a permutation of words.
The output order does not matter. Returning [9,0] is fine too.


Example 2:

Input: s = "wordgoodgoodgoodbestword", words = ["word","good","best","word"]
Output: []

Explanation: Since words.length == 4 and words[i].length == 4, the concatenated substring has to be of length 16.
There is no substring of length 16 is s that is equal to the concatenation of any permutation of words.
We return an empty array.


Example 3:

Input: s = "barfoofoobarthefoobarman", words = ["bar","foo","the"]
Output: [6,9,12]

Explanation: Since words.length == 3 and words[i].length == 3, the concatenated substring has to be of length 9.
The substring starting at 6 is "foobarthe". It is the concatenation of ["foo","bar","the"] which is a permutation of words.
The substring starting at 9 is "barthefoo". It is the concatenation of ["bar","the","foo"] which is a permutation of words.
The substring starting at 12 is "thefoobar". It is the concatenation of ["the","foo","bar"] which is a permutation of words.

 

Constraints:

  • 1 <= s.length <= 104
  • 1 <= words.length <= 5000
  • 1 <= words[i].length <= 30
  • s and words[i] consist of lowercase English letters.

Approach) Brute Force

Idea:

We will be given a string s and an array of strings words. The ask is to combine every element of the words array, in any order to make a long string and check if a such string exists in the given string s. If it does, then we need to return the starting index of the combined string in the s.

For e.g., if s = barfoothefoobarman and words = ["foo","bar"], we have two concatenated permutations of strings in the words array which are foobar and barfoo.

Now, we need to search both these permutations in s and return their starting indices. In our case, we will return the indices list containing 0 and 9.

It is important to note that all the strings in the words array will be present in the combined string to be searched in s. If any string is not present in s, then we will return an empty list.

Also, note that every string in words is of the same length.


Algorithm

If we look at the problem, the first constraint we find is that all the strings in words the array should be included in the combination. Even if a string is repeated in the array, it should be considered in the combination that many times.

For e.g., if words == ["foo", "foo"], then we need to search foofoo in the s. Thus, we can use Map to store the count of each string in the words.

Since every string in words is of equal length, the length of the string combination we need to search in words will be -

// Length of each word
wordLength = words[0].length()
// Total length
wordArrayLength = wordLength * words.length

Now, we need to search for a string of length wordArrayLength in the s. After getting the substring string, we will check if all the strings present in the words are present in it. If they are, we will add the starting index, otherwise we will skip.

Below are steps -

  • Store the count of each string in words in a map, say wordCount.
  • Loop through the s and in each iteration, do the following -
  • Get the substring of length wordArrayLength.
  • Break this substring into further substrings of length wordLength.
  • Store the count of substrings fount in #4 into another map.
  • At last, check if both the maps are equal. If they are, then add the current index in the resultant list.

The most important part in the above algorithm is to understand why we are comparing maps 🤔? We are doing it because if the combined string lies in the s, then counts of all the strings in words will be equal to the count of all the partitions in the substring of length wordArryaLength.

Code:

var findSubstring = function (s, words) {
    // Resultant list
    const indices = [];
    // Base conditions
    if (s === null || s.length === 0 || words === null || words.length == 0) {
        return indices;
    }
    // Store the words and their counts in a hash map
    const wordCount = words.reduce((a, b) => {
        a[b] = (a[b] + 1) || 1;
        return a;
    }, {});
    // Length of each word in the words array`
    const wordLength = words[0].length;
    // Length of all the words combined in the array
    const wordArrayLength = wordLength * words.length;
    // Loop for the entire string
    for (let i = 0; i <= s.length - wordArrayLength; i++) {
        // Get the substring of length equal to wordArrayLength
        let current = s.substring(i, i + wordArrayLength);
        // Map to store each word of the substring
        const wordMap = {};
        // Index to loop through the words array
        let index = 0;
        // Index to get each word in the current
        let j = 0;
        // Loop through each word of the words array
        while (index < words.length) {
            // Divide the current string into strings of length of
            // each word in the array
            const part = current.substring(j, j + wordLength);
            // Put this string into the wordMap
            wordMap[part] = (wordMap[part] + 1) || 1;
            // Update j and index
            j += wordLength;
            index++;
        }
        // At this point compare the maps
        let a = JSON.stringify(Object.entries(wordCount).sort());
        let b = JSON.stringify(Object.entries(wordMap).sort());
        if (a === b) {
            indices.push(i);
        }
    }
    return indices;  
};

Approach) Recursion

The basic premise for this solution is that we loop through the original string, and for each iteration, check if all of the target words can be found from this point on in the string. To achieve the more complicated latter half of this, we will use recursion.

Code:

var findSubstring = function (s, words) {
    // Initialise an array to store our answers in
    let answers = [];

    // Calculate the total length of the words
    const totalLengthOfWords = words.reduce(
        (total, word) => (total += word.length),
        0
    );

    // Loop through the string, until there is insufficient space left to find all words
    for (let i = 0; i <= s.length - totalLengthOfWords; i++) {
        // If the string from this point contains all target words, store the starting position
        if (doesStringContainAllWords(s.substring(i), words.slice())) {
            answers.push(i);
        }
    }

    return answers;
};

function doesStringContainAllWords(string, words) {
    // If all words have been found
    if (!words.length) return true;

    // Loop through all words
    for (let i = 0; i < words.length; i++) {
        // Store the length of the target word (as it may be spliced)
        const targetWordLength = words[i].length;

        // Check if the word in question matches is found at the start of the string
        if (string.substring(0, targetWordLength) === words[i]) {
            // Remove the found word from the words array
            words.splice(i, 1);

            // Look for the remaining words in the rest of the string
            return doesStringContainAllWords(
                string.substring(targetWordLength),
                words
            );
        }
    }

    // If no words were found in the current string
    return false;
}

Time Complexity

Since we are scanning every string in words and every character in s, the time complexity will be O(mn), where m => length of words and n => length of s


Space Complexity

We are using two maps to store the contents of words and partitions of substrings of s therefore, the space complexity will be O(m + n).


Conclusion


That’s all folks! In this post, we solved LeetCode problem #30. Substring with Concatenation of All Words

I hope you have enjoyed this post. Feel free to share your thoughts on this.

You can find the complete source code on my GitHub repository. If you like what you learn. feel free to fork 🔪 and star ⭐ it.


In this blog, I have tried to collect & present the most important points to consider when improving Data structure and logic, feel free to add, edit, comment, or ask. For more information please reach me here
Happy coding!


Comments

Popular Post