Skip to content

Latest commit

 

History

History
 
 

727

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
 
 

Given strings s1 and s2, return the minimum contiguous substring part of s1, so that s2 is a subsequence of the part.

If there is no such window in s1 that covers all characters in s2, return the empty string "". If there are multiple such minimum-length windows, return the one with the left-most starting index.

 

Example 1:

Input: s1 = "abcdebdde", s2 = "bde"
Output: "bcde"
Explanation: 
"bcde" is the answer because it occurs before "bdde" which has the same length.
"deb" is not a smaller window because the elements of s2 in the window must occur in order.

Example 2:

Input: s1 = "jmeqksfrsdcmsiwvaovztaqenprpvnbstl", s2 = "u"
Output: ""

 

Constraints:

  • 1 <= s1.length <= 2 * 104
  • 1 <= s2.length <= 100
  • s1 and s2 consist of lowercase English letters.

Companies: Google, Snapchat, eBay

Related Topics:
String, Dynamic Programming, Sliding Window

Similar Questions:

Solution 1. DP

Let dp[i+1][j+1] be the number of matched letters between s[0..i] and t[0..j], start[i+1][j+1] be the starting index of the minimum window corresponding to dp[i+1][j+1].

Initially dp[i+1][j+1] = 0, start[i+1][j+1] = -1.

If s[i] == t[j], dp[i+1][j+1] = 1 + dp[i][j]. start[i+1][j+1] = j == 0 ? i : start[i][j].

If s[i] != t[j], dp[i+1][j+1] = max(dp[i][j+1], dp[i][j]). If dp[i+1][j+1] == dp[i][j+1], start[i+1][j+1] = start[i][j + 1]. Additionally, if dp[i+1][j+1] = dp[i][j], start[i+1][j+1] = max(start[i+1][j+1], start[i][j])

// OJ: https://leetcode.com/problems/minimum-window-subsequence
// Author: github.com/lzl124631x
// Time: O(MN)
// Space: O(MN)
class Solution {
public:
    string minWindow(string s, string t) {
         int M = s.size(), N = t.size();
         vector<vector<int>> dp(M + 1, vector<int>(N + 1)), start(M + 1, vector<int>(N + 1, -1));
         for (int i = 0; i < M; ++i) {
             for (int j = 0; j < N; ++j) {
                 if (s[i] == t[j]) {
                     dp[i + 1][j + 1] = 1 + dp[i][j];
                     if (j == 0) {
                         start[i + 1][j + 1] = i;
                     } else {
                         start[i + 1][j + 1] = start[i][j];
                     }
                 } else {
                     dp[i + 1][j + 1] = max(dp[i][j + 1], dp[i][j]);
                     if (dp[i + 1][j + 1] == dp[i][j + 1]) {
                         start[i + 1][j + 1] = start[i][j + 1];
                     }
                     if (dp[i + 1][j + 1] == dp[i][j]) {
                         start[i + 1][j + 1] = max(start[i + 1][j + 1], start[i][j]);
                     }
                 }
             }
         }
         int minLen = INT_MAX, st = -1;
         for (int i = 0; i < M; ++i) {
             if (dp[i + 1][N] == N && i - start[i + 1][N] + 1 < minLen) {
                st = start[i + 1][N];
                minLen = i - start[i + 1][N] + 1;
             }
         }
         return minLen == INT_MAX ? "" : s.substr(st, minLen);
    }
};

Since dp[i + 1][j + 1] only depends on dp[i][j] and dp[i + 1][j], we can reduce the space complexity from O(MN) to O(N).

// OJ: https://leetcode.com/problems/minimum-window-subsequence
// Author: github.com/lzl124631x
// Time: O(MN)
// Space: O(N)
class Solution {
public:
    string minWindow(string s, string t) {
         int M = s.size(), N = t.size(), minLen = INT_MAX, st = -1;
         vector<int> dp(N + 1), start(N + 1, -1);
         for (int i = 0; i < M; ++i) {
             for (int j = N - 1; j >= 0; --j) {
                 if (s[i] == t[j]) {
                     dp[j + 1] = 1 + dp[j];
                     start[j + 1] = j == 0 ? i : start[j];
                 } else {
                     dp[j + 1] = max(dp[j + 1], dp[j]);
                     if (dp[j + 1] == dp[j]) start[j + 1] = max(start[j + 1], start[j]);
                 }
             }
             if (dp[N] == N && i - start[N] + 1 < minLen) {
                 st = start[N];
                 minLen = i - start[N] + 1;
             }
         }
         return minLen == INT_MAX ? "" : s.substr(st, minLen);
    }
};

Solution 2. Greedy

Intuition

Let start and end denote the left-most and right-most indices of a given window/substring in s1.

We can try all possible values of start, and for each of them, find the smallest value of end such that s2 is a subsequence of s1[start..end]. Let m denote the length of s2. We want to find a sequence of indices i0,i1,...,im−1i_0, i_1, ..., i_{m-1}i0,i1,...,im1 such that start−1<i0<i1<...<im−1\text{start} - 1 < i_0 < i_1 < ... < i_{m-1}start1<i0<i1<...<im1 and s1[i0]=s2[0],s1[i1]=s2[1],...,s1[im−1]=s2[m−1]s_1[i_0] = s_2[0], s_1[i_1] = s_2[1], ..., s_1[i_{m-1}] = s_2[m-1]s1[i0]=s2[0],s1[i1]=s2[1],...,s1[im1]=s2[m1]. We have im−1=endi_{m-1} = \text{end}im1=end and want this value to be the smallest possible.

To find the index sequence, we greedily find indices one by one. First, find the smallest i0i_0i0 such that start−1<i0\text{start} - 1 < i_0start1<i0 and s1[i0]=s2[0]s_1[i_0] = s_2[0]s1[i0]=s2[0]. Then find the smallest i1i_1i1 such that i0<i1i_0 < i_1i0<i1 and s1[i1]=s2[1]s_1[i_1] = s_2[1]s1[i1]=s2[1]. Continue the process until we find the whole sequence or cannot find a valid index.

For example, s1 = "abcdebdde", s2 = "bde".

  • Let start = 0. The smallest i0≥0i_0 \ge 0i00 such that s1[i0]s_1[i_0]s1[i0] = 'b' is 1. So we have i0=1i_0 = 1i0=1. Then we need to find the smallest i1>1i_1 > 1i1>1 such that s1[i1]=s_1[i_1] =s1[i1]= 'd'. We find that i1=3i_1 = 3i1=3. Finally, we find i2=4i_2 = 4i2=4 (s1[4]=s_1[4] =s1[4]= 'e'). Here end=i2=4\text{end} = i_2 = 4end=i2=4.
  • When start = 5, we have i0=5i_0 = 5i0=5 (s1[5]=s2[0]=s_1[5] = s_2[0] =s1[5]=s2[0]= 'b'), i1=6i_1 = 6i1=6, i2=8i_2 = 8i2=8.
  • For start = 6, no valid sequence exists.

For each letter c, we will keep an increasing array indices[c] of all positions where c occurs in the string s1. For example, for s1 = "abcdebdde", indices['d'] = [3, 6, 7], because s1[3] = s1[6] = s1[7] = 'd'.

Finding the smallest i0≥starti_0 \ge \text{start}i0start such that s1[i0]=s2[0]s_1[i_0] = s_2[0]s1[i0]=s2[0] is equivalent to finding the smallest element greater than or equal to start\text{start}start in the array indices[s2[0]]\text{indices}[s_2[0]]indices[s2[0]]. Let ind0\text{ind}_0ind0 denote the index of the element i0i_0i0 in this array: indices[s2[0]][ind0]=i0\text{indices}[s_2[0]][\text{ind}_0] = i_0indices[s2[0]][ind0]=i0.

Since the array indices[s2[0]]\text{indices}[s_2[0]]indices[s2[0]] is increasing, the above is equivalent to finding the smallest ind0\text{ind}_0ind0 such that indices[s2[0]][ind0]>start−1\text{indices}[s_2[0]][\text{ind}_0] > \text{start} - 1indices[s2[0]][ind0]>start1.

Similarly, we find the smallest i1>i0i_1 > i_0i1>i0 in the array indices[s2[1]]\text{indices}[s_2[1]]indices[s2[1]] by finding the smallest ind1\text{ind}_1ind1 such that indices[s2[1]][ind1]>i0\text{indices}[s_2[1]][\text{ind}_1] > i_0indices[s2[1]][ind1]>i0.

Then we do the same for finding i2i_2i2, i3i_3i3, ..., im−1i_{m-1}im1.

One can notice that when start\text{start}start increases, so do the indices i0i_0i0, i1i_1i1, ..., im−1i_{m-1}im1 and ind0\text{ind}_0ind0, ind1\text{ind}_1ind1, ..., indm−1\text{ind}_{m-1}indm1.

We use an array to track the values of ind\text{ind}ind and initially set ind0=ind1=...=indm−1=0\text{ind}_0 = \text{ind}_1 = ... = \text{ind}_{m-1} = 0ind0=ind1=...=indm1=0. During the process, we increase ind\text{ind}ind when necessary.

Initially, we set ind0=ind1=...=indm−1=0\text{ind}_0 = \text{ind}_1 = ... = \text{ind}_{m-1} = 0ind0=ind1=...=indm1=0. During the process, we increase ind\text{ind}ind when necessary.

For example, let's say that for a given start\text{start}start, we want to know i0i_0i0. We can use a variable prev\text{prev}prev that tracks the previous index. We initialize prev\text{prev}prev to start−1\text{start}-1start1 and increment ind0\text{ind}_0ind0 until indices[s2[0]][ind0]>prev\text{indices}[s_2[0]][\text{ind}_0] > \text{prev}indices[s2[0]][ind0]>prev. Then we set prev\text{prev}prev to i0i_0i0 and increment ind1\text{ind}_1ind1 until indices[s2[1]][ind1]>prev\text{indices}[s_2[1]][\text{ind}_1] > \text{prev}indices[s2[1]][ind1]>prev. We continue the process for finding i2i_2i2, i3i_3i3, ..., im−1i_{m-1}im1.

Algorithm

  1. Let n be the length of s1 and m be the length of s2.
  2. Initialize the answer with an empty string.
  3. For each letter c, find all its occurrences in the string s1 and add them to indices[c].
  4. Initialize an array ind of size m with zeros.
  5. Iterate start from 0 to n - 1.
    • Initialize prev with start - 1.
    • Iterate j from 0 to m - 1. This variable is iterating over the characters of s2.
      • Let curIndices be indices[s2[j]]. These are all the indices where the current character in s2 appears in s1.
      • While ind[j] < len(curIndices) and curIndices[ind[j]] <= prev
        • Increment ind[j].
      • If ind[j] = len(curIndices) (we could not find an element greater than last), there is no valid window starting at start, we can immediately return the answer, since all future values of start will be greater and there will also be no valid window.
      • Set prev to curIndices[ind[j]].
    • At this point, prev = end. Our candidate string is s1[start..prev]. If answer is empty or this candidate has a shorter length, update answer with the candidate.
  6. Return the answer.

Complexity Analysis

Given nnn as the length of s1 and mmm as the length of s2,

  • Time complexity: O(n⋅m)O(n \cdot m)O(nm).

We have n iterations of the outer for loop (start from 0 to n - 1) and m iterations of the inner for loop (j from 0 to m - 1). There also is a while loop inside that increments ind[j]. Thus the total complexity is O(n⋅m)O(n \cdot m)O(nm) + (the total number of iterations of the while loop).

Since each iteration of the while loop increments ind[j], the number of iterations of this loop equals the number of increments of ind[j].

The initial value of ind[j] is 0, and the final one does not exceed the length of indices[s2[j]] <= n. Thus there will be no more than n increments of each ind[j]. Since there are m elements ind[j], the total number of increments does not exceed n⋅mn \cdot mnm.

  • Space complexity: O(n)O(n)O(n).

We keep indices[c] for each letter c. We store each index from 0 to n - 1 exactly once in indices[s1[i]]. Thus the total size of all indices[c] is n.

Also, we store the array ind with m <= n elements.

The total space complexity is O(n)O(n)O(n).

class Solution {
public:
    string minWindow(string s1, string s2) {
        int n = s1.size(), m = s2.size();
        string answer;
        unordered_map<char, vector<int>> indices;
        for (int i = 0; i < n; i++) {
            indices[s1[i]].push_back(i);
        }
        vector<int> ind(m);
        for (int start = 0; start < n; start++) {
            int prev = start - 1;
            bool notFound = false;
            for (int j = 0; j < m; j++) {
                if (!indices.count(s2[j])) {
                    return "";
                }
                const vector<int>& curIndices = indices[s2[j]];
                while (ind[j] < curIndices.size() && curIndices[ind[j]] <= prev) {
                    ind[j]++;
                }
                if (ind[j] == curIndices.size()) {
                    return answer;
                }
                prev = curIndices[ind[j]];
            }
            if (answer == "" || prev - start + 1 < answer.size()) {
                answer = s1.substr(start, prev - start + 1);
            }
        }
        return answer;
    }
};