Given strings s1
and s2
, return the minimum contiguous substring part of s1
, so that s2
is a subsequence of the part.
If there is no such window in s1
that covers all characters in s2
, return the empty string ""
. If there are multiple such minimum-length windows, return the one with the left-most starting index.
Example 1:
Input: s1 = "abcdebdde", s2 = "bde" Output: "bcde" Explanation: "bcde" is the answer because it occurs before "bdde" which has the same length. "deb" is not a smaller window because the elements of s2 in the window must occur in order.
Example 2:
Input: s1 = "jmeqksfrsdcmsiwvaovztaqenprpvnbstl", s2 = "u" Output: ""
Constraints:
1 <= s1.length <= 2 * 104
1 <= s2.length <= 100
s1
ands2
consist of lowercase English letters.
Companies: Google, Snapchat, eBay
Related Topics:
String, Dynamic Programming, Sliding Window
Similar Questions:
Let dp[i+1][j+1]
be the number of matched letters between s[0..i]
and t[0..j]
, start[i+1][j+1]
be the starting index of the minimum window corresponding to dp[i+1][j+1]
.
Initially dp[i+1][j+1] = 0
, start[i+1][j+1] = -1
.
If s[i] == t[j]
, dp[i+1][j+1] = 1 + dp[i][j]
. start[i+1][j+1] = j == 0 ? i : start[i][j]
.
If s[i] != t[j]
, dp[i+1][j+1] = max(dp[i][j+1], dp[i][j])
. If dp[i+1][j+1] == dp[i][j+1]
, start[i+1][j+1] = start[i][j + 1]
. Additionally, if dp[i+1][j+1] = dp[i][j]
, start[i+1][j+1] = max(start[i+1][j+1], start[i][j])
// OJ: https://leetcode.com/problems/minimum-window-subsequence
// Author: github.com/lzl124631x
// Time: O(MN)
// Space: O(MN)
class Solution {
public:
string minWindow(string s, string t) {
int M = s.size(), N = t.size();
vector<vector<int>> dp(M + 1, vector<int>(N + 1)), start(M + 1, vector<int>(N + 1, -1));
for (int i = 0; i < M; ++i) {
for (int j = 0; j < N; ++j) {
if (s[i] == t[j]) {
dp[i + 1][j + 1] = 1 + dp[i][j];
if (j == 0) {
start[i + 1][j + 1] = i;
} else {
start[i + 1][j + 1] = start[i][j];
}
} else {
dp[i + 1][j + 1] = max(dp[i][j + 1], dp[i][j]);
if (dp[i + 1][j + 1] == dp[i][j + 1]) {
start[i + 1][j + 1] = start[i][j + 1];
}
if (dp[i + 1][j + 1] == dp[i][j]) {
start[i + 1][j + 1] = max(start[i + 1][j + 1], start[i][j]);
}
}
}
}
int minLen = INT_MAX, st = -1;
for (int i = 0; i < M; ++i) {
if (dp[i + 1][N] == N && i - start[i + 1][N] + 1 < minLen) {
st = start[i + 1][N];
minLen = i - start[i + 1][N] + 1;
}
}
return minLen == INT_MAX ? "" : s.substr(st, minLen);
}
};
Since dp[i + 1][j + 1]
only depends on dp[i][j]
and dp[i + 1][j]
, we can reduce the space complexity from O(MN)
to O(N)
.
// OJ: https://leetcode.com/problems/minimum-window-subsequence
// Author: github.com/lzl124631x
// Time: O(MN)
// Space: O(N)
class Solution {
public:
string minWindow(string s, string t) {
int M = s.size(), N = t.size(), minLen = INT_MAX, st = -1;
vector<int> dp(N + 1), start(N + 1, -1);
for (int i = 0; i < M; ++i) {
for (int j = N - 1; j >= 0; --j) {
if (s[i] == t[j]) {
dp[j + 1] = 1 + dp[j];
start[j + 1] = j == 0 ? i : start[j];
} else {
dp[j + 1] = max(dp[j + 1], dp[j]);
if (dp[j + 1] == dp[j]) start[j + 1] = max(start[j + 1], start[j]);
}
}
if (dp[N] == N && i - start[N] + 1 < minLen) {
st = start[N];
minLen = i - start[N] + 1;
}
}
return minLen == INT_MAX ? "" : s.substr(st, minLen);
}
};
Let start
and end
denote the left-most and right-most indices of a given window/substring in s1
.
We can try all possible values of start
, and for each of them, find the smallest value of end
such that s2
is a subsequence of s1[start..end]
. Let m
denote the length of s2
. We want to find a sequence of indices i0,i1,...,im−1i_0, i_1, ..., i_{m-1}i0,i1,...,im−1 such that start−1<i0<i1<...<im−1\text{start} - 1 < i_0 < i_1 < ... < i_{m-1}start−1<i0<i1<...<im−1 and s1[i0]=s2[0],s1[i1]=s2[1],...,s1[im−1]=s2[m−1]s_1[i_0] = s_2[0], s_1[i_1] = s_2[1], ..., s_1[i_{m-1}] = s_2[m-1]s1[i0]=s2[0],s1[i1]=s2[1],...,s1[im−1]=s2[m−1]. We have im−1=endi_{m-1} = \text{end}im−1=end and want this value to be the smallest possible.
To find the index sequence, we greedily find indices one by one. First, find the smallest i0i_0i0 such that start−1<i0\text{start} - 1 < i_0start−1<i0 and s1[i0]=s2[0]s_1[i_0] = s_2[0]s1[i0]=s2[0]. Then find the smallest i1i_1i1 such that i0<i1i_0 < i_1i0<i1 and s1[i1]=s2[1]s_1[i_1] = s_2[1]s1[i1]=s2[1]. Continue the process until we find the whole sequence or cannot find a valid index.
For example, s1 = "abcdebdde"
, s2 = "bde"
.
- Let
start = 0
. The smallest i0≥0i_0 \ge 0i0≥0 such that s1[i0]s_1[i_0]s1[i0] ='b'
is1
. So we have i0=1i_0 = 1i0=1. Then we need to find the smallest i1>1i_1 > 1i1>1 such that s1[i1]=s_1[i_1] =s1[i1]='d'
. We find that i1=3i_1 = 3i1=3. Finally, we find i2=4i_2 = 4i2=4 (s1[4]=s_1[4] =s1[4]='e'
). Here end=i2=4\text{end} = i_2 = 4end=i2=4. - When
start = 5
, we have i0=5i_0 = 5i0=5 (s1[5]=s2[0]=s_1[5] = s_2[0] =s1[5]=s2[0]='b'
), i1=6i_1 = 6i1=6, i2=8i_2 = 8i2=8. - For
start = 6
, no valid sequence exists.
For each letter c
, we will keep an increasing array indices[c]
of all positions where c
occurs in the string s1
. For example, for s1 = "abcdebdde"
, indices['d'] = [3, 6, 7]
, because s1[3] = s1[6] = s1[7] = 'd'
.
Finding the smallest i0≥starti_0 \ge \text{start}i0≥start such that s1[i0]=s2[0]s_1[i_0] = s_2[0]s1[i0]=s2[0] is equivalent to finding the smallest element greater than or equal to start\text{start}start in the array indices[s2[0]]\text{indices}[s_2[0]]indices[s2[0]]. Let ind0\text{ind}_0ind0 denote the index of the element i0i_0i0 in this array: indices[s2[0]][ind0]=i0\text{indices}[s_2[0]][\text{ind}_0] = i_0indices[s2[0]][ind0]=i0.
Since the array indices[s2[0]]\text{indices}[s_2[0]]indices[s2[0]] is increasing, the above is equivalent to finding the smallest ind0\text{ind}_0ind0 such that indices[s2[0]][ind0]>start−1\text{indices}[s_2[0]][\text{ind}_0] > \text{start} - 1indices[s2[0]][ind0]>start−1.
Similarly, we find the smallest i1>i0i_1 > i_0i1>i0 in the array indices[s2[1]]\text{indices}[s_2[1]]indices[s2[1]] by finding the smallest ind1\text{ind}_1ind1 such that indices[s2[1]][ind1]>i0\text{indices}[s_2[1]][\text{ind}_1] > i_0indices[s2[1]][ind1]>i0.
Then we do the same for finding i2i_2i2, i3i_3i3, ..., im−1i_{m-1}im−1.
One can notice that when start\text{start}start increases, so do the indices i0i_0i0, i1i_1i1, ..., im−1i_{m-1}im−1 and ind0\text{ind}_0ind0, ind1\text{ind}_1ind1, ..., indm−1\text{ind}_{m-1}indm−1.
We use an array to track the values of ind\text{ind}ind and initially set ind0=ind1=...=indm−1=0\text{ind}_0 = \text{ind}_1 = ... = \text{ind}_{m-1} = 0ind0=ind1=...=indm−1=0. During the process, we increase ind\text{ind}ind when necessary.
Initially, we set ind0=ind1=...=indm−1=0\text{ind}_0 = \text{ind}_1 = ... = \text{ind}_{m-1} = 0ind0=ind1=...=indm−1=0. During the process, we increase ind\text{ind}ind when necessary.
For example, let's say that for a given start\text{start}start, we want to know i0i_0i0. We can use a variable prev\text{prev}prev that tracks the previous index. We initialize prev\text{prev}prev to start−1\text{start}-1start−1 and increment ind0\text{ind}_0ind0 until indices[s2[0]][ind0]>prev\text{indices}[s_2[0]][\text{ind}_0] > \text{prev}indices[s2[0]][ind0]>prev. Then we set prev\text{prev}prev to i0i_0i0 and increment ind1\text{ind}_1ind1 until indices[s2[1]][ind1]>prev\text{indices}[s_2[1]][\text{ind}_1] > \text{prev}indices[s2[1]][ind1]>prev. We continue the process for finding i2i_2i2, i3i_3i3, ..., im−1i_{m-1}im−1.
- Let
n
be the length ofs1
andm
be the length ofs2
. - Initialize the
answer
with an empty string. - For each letter
c
, find all its occurrences in the strings1
and add them toindices[c]
. - Initialize an array
ind
of sizem
with zeros. - Iterate
start
from0
ton - 1
.- Initialize
prev
withstart - 1
. - Iterate
j
from0
tom - 1
. This variable is iterating over the characters ofs2
.- Let
curIndices
beindices[s2[j]]
. These are all the indices where the current character ins2
appears ins1
. - While
ind[j] < len(curIndices)
andcurIndices[ind[j]] <= prev
- Increment
ind[j]
.
- Increment
- If
ind[j] = len(curIndices)
(we could not find an element greater thanlast
), there is no valid window starting atstart
, we can immediately return the answer, since all future values ofstart
will be greater and there will also be no valid window. - Set
prev
tocurIndices[ind[j]]
.
- Let
- At this point,
prev = end
. Our candidate string iss1[start..prev]
. Ifanswer
is empty or this candidate has a shorter length, updateanswer
with the candidate.
- Initialize
- Return the
answer
.
Given nnn as the length of s1
and mmm as the length of s2
,
- Time complexity: O(n⋅m)O(n \cdot m)O(n⋅m).
We have n
iterations of the outer for loop (start
from 0
to n - 1
) and m
iterations of the inner for loop (j
from 0
to m - 1
). There also is a while loop inside that increments ind[j]
. Thus the total complexity is O(n⋅m)O(n \cdot m)O(n⋅m) + (the total number of iterations of the while loop).
Since each iteration of the while loop increments ind[j]
, the number of iterations of this loop equals the number of increments of ind[j]
.
The initial value of ind[j]
is 0
, and the final one does not exceed the length of indices[s2[j]] <= n
. Thus there will be no more than n
increments of each ind[j]
. Since there are m
elements ind[j]
, the total number of increments does not exceed n⋅mn \cdot mn⋅m.
- Space complexity: O(n)O(n)O(n).
We keep indices[c]
for each letter c
. We store each index from 0
to n - 1
exactly once in indices[s1[i]]
. Thus the total size of all indices[c]
is n
.
Also, we store the array ind
with m <= n
elements.
The total space complexity is O(n)O(n)O(n).
class Solution {
public:
string minWindow(string s1, string s2) {
int n = s1.size(), m = s2.size();
string answer;
unordered_map<char, vector<int>> indices;
for (int i = 0; i < n; i++) {
indices[s1[i]].push_back(i);
}
vector<int> ind(m);
for (int start = 0; start < n; start++) {
int prev = start - 1;
bool notFound = false;
for (int j = 0; j < m; j++) {
if (!indices.count(s2[j])) {
return "";
}
const vector<int>& curIndices = indices[s2[j]];
while (ind[j] < curIndices.size() && curIndices[ind[j]] <= prev) {
ind[j]++;
}
if (ind[j] == curIndices.size()) {
return answer;
}
prev = curIndices[ind[j]];
}
if (answer == "" || prev - start + 1 < answer.size()) {
answer = s1.substr(start, prev - start + 1);
}
}
return answer;
}
};