Binary Search - Hard Level - Question 2


Leetcode 727 Minimum Window Subsequence

Given strings S and T, find the minimum (contiguous) substring W of S, so that T is a subsequence of W.

If there is no such window in S that covers all characters in T, return the empty string "". If there are multiple such minimum-length windows, return the one with the left-most starting index.

Note:

All the strings in the input will only contain lowercase letters.

The length of S will be in the range [1, 20000].

The length of T will be in the range [1, 100].


Analysis:


A natural first step is to ask how to determine whether one string is a subsequence of another.

We can use the two-pointer method: one pointer walks through the first string, and the other points at the next char of the second string (the one to be matched). Whenever the two chars are equal, the second pointer advances. If the second pointer reaches the end of the second string, the second string is a subsequence of the first.
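The check described above can be sketched as follows. This is a minimal illustration rather than part of the solution code; the function name isSubsequence is a hypothetical helper chosen for this example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns true if t is a subsequence of s.
bool isSubsequence(const std::string& s, const std::string& t) {
    std::size_t j = 0;  // index of the next char of t still to be matched
    for (std::size_t i = 0; i < s.size() && j < t.size(); ++i) {
        if (s[i] == t[j]) ++j;  // matched t[j]; advance the second pointer
    }
    return j == t.size();  // true only if every char of t was matched in order
}
```

For example, isSubsequence("abcdebdde", "bde") is true, while isSubsequence("abc", "acb") is false because the order of chars matters.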

Let M be the length of S and N be the length of T. A single two-pointer check is linear, O(M). But this question asks for the minimum-length valid substring. If we repeat the same search starting at each char of S, the time complexity becomes O(M*M), which is about 4*10^8 steps for M = 20000 and is too large.

One improvement is to record the indexes of each char.
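As a small sketch of this preprocessing step, the function below builds one sorted position list per lowercase letter. The name buildIndex is hypothetical; the table name ids matches the one used in the code later in this post.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: ids[c] holds, in increasing order, every index i
// with s[i] == 'a' + c. Assumes s contains only lowercase letters.
std::vector<std::vector<int>> buildIndex(const std::string& s) {
    std::vector<std::vector<int>> ids(26);
    for (int i = 0; i < static_cast<int>(s.size()); ++i) {
        ids[s[i] - 'a'].push_back(i);  // indexes are appended in sorted order
    }
    return ids;
}
```

For S = "abcdebdde", this gives positions {1, 5} for 'b', {3, 6, 7} for 'd', and {4, 8} for 'e'. Because S is scanned left to right, each list is sorted without any extra work.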

Once we have the indexes of each char, we can start from either the first char of T or the last char of T. Suppose we start from the last char of T and take its last occurrence in S. We then use a binary search to find the occurrence of the second-to-last char of T that is closest to, and strictly before, that position. If it exists, we move on to the third-to-last char of T, and so on until all chars are found. This greedy procedure finds the shortest substring of S that contains T as a subsequence and ends at the chosen occurrence of the last char of T (the last char of T may appear more than once in S). The time complexity of one such pass is O(N*logM).

Why does binary search work?

1. The indexes of each char are sorted;

2. We always want the nearest preceding occurrence of the next char.
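The two points above combine into a single lookup: given the sorted positions of a char and a bound, find the largest position strictly below the bound. A minimal sketch, assuming a hypothetical helper name closestBefore and using std::lower_bound as the code in this post does:

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: largest element of the sorted list pos that is
// strictly less than bound, or -1 if no such element exists.
int closestBefore(const std::vector<int>& pos, int bound) {
    // lower_bound returns the first element >= bound,
    // so the element just before it is the largest one < bound.
    auto it = std::lower_bound(pos.begin(), pos.end(), bound);
    if (it == pos.begin()) return -1;  // nothing lies before bound
    return *(it - 1);
}
```

With the positions {3, 6, 7} of 'd' in "abcdebdde", closestBefore({3, 6, 7}, 8) returns 7, and closestBefore({3, 6, 7}, 3) returns -1 because no 'd' occurs before index 3.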

To find the globally shortest substring, we need to repeat this pass from every occurrence of the last char of T in S. So the total time complexity is O(K*N*logM), where K is the number of occurrences of the last char of T in S. Since K <= M, this is O(M*N*logM) in the worst case.


See the code below:


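#include <algorithm>
#include <string>
#include <vector>
using namespace std;

string findShortestStr(string s, string t) {
	int m = s.size(), n = t.size(), start = -1, len = 0;
	// ids[c] holds the sorted positions of char 'a' + c in s
	vector<vector<int>> ids(26);
	for(int i=0; i<m; ++i) {
		ids[s[i]-'a'].push_back(i);
	}
	// try every occurrence of the last char of t as the window end
	for(auto &id : ids[t[n-1]-'a']) {
		int idx = id;	// position matched for the current char of t
		bool found = true;
		// walk t backwards, matching each char as far right as possible
		for(int i=n-2; i>=0; --i) {
			int x = t[i] - 'a';
			// p = number of positions of t[i] that are strictly less than idx
			int p = lower_bound(ids[x].begin(), ids[x].end(), idx) - ids[x].begin();
			if(p==0) {	// t[i] does not occur before idx
				found = false;
				break;
			}
			idx = ids[x][p - 1];	// closest occurrence before idx
		}
		if(found) {
			// strict '>' keeps the left-most window among equal lengths
			if(len == 0 || len > id - idx + 1) {
				start = idx;
				len = id - idx + 1;
			}
		}
	}
	return start == -1 ? "" : s.substr(start, len);
}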

Note:

This problem can also be solved with other methods, such as dynamic programming (DP) or a pure two-pointer method.
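For comparison, here is a minimal sketch of one common form of the pure two-pointer method: a forward scan finds a window end, and a backward scan then tightens the window start. This is not the code from this post, and the function name minWindowTwoPointer is hypothetical. Its worst-case running time is O(M*N).

```cpp
#include <string>

// Hypothetical alternative: forward scan to find an end, backward scan to shrink.
std::string minWindowTwoPointer(const std::string& s, const std::string& t) {
    int m = static_cast<int>(s.size()), n = static_cast<int>(t.size());
    if (n == 0) return "";
    int bestStart = -1, bestLen = 0;
    int i = 0;
    while (i < m) {
        // Forward pass: match t from left to right, starting at i.
        int j = 0;
        while (i < m) {
            if (s[i] == t[j] && ++j == n) break;  // i now sits on the match of t[n-1]
            ++i;
        }
        if (i == m) break;  // t cannot be completed any more
        int end = i;
        // Backward pass: match t from right to left to find the latest start.
        j = n - 1;
        while (j >= 0) {
            if (s[i] == t[j]) --j;
            --i;
        }
        ++i;  // i is now the start of the tightest window ending at end
        if (bestStart == -1 || end - i + 1 < bestLen) {
            bestStart = i;
            bestLen = end - i + 1;
        }
        ++i;  // resume the search just after this window's start
    }
    return bestStart == -1 ? "" : s.substr(bestStart, bestLen);
}
```

For S = "abcdebdde" and T = "bde", the scan finds the windows "bcde" and "bdde", both of length 4, and returns the left-most one, "bcde".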

