Rolling Hash - Question 2

1316. Distinct Echo Substrings

Return the number of distinct non-empty substrings of text that can be written as the concatenation of some string with itself (i.e. it can be written as a + a where a is some string).

Constraints:

1 <= text.length <= 2000

text has only lowercase English letters.


Analysis:


There are O(N^2) substrings in total, and checking whether a single substring has the form a + a takes O(N) time. So checking every substring one by one costs O(N^3) overall.

The largest test case can have N = 2000, so N^3 is about 8 × 10^9 (roughly 10^10), which is too large.
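For reference, the brute-force idea described above could be sketched as follows. This is only a baseline for checking faster solutions on small inputs; the name bruteForceEchoCount is a hypothetical helper, not part of the original solution.

```cpp
#include <string>
#include <unordered_set>

// Brute-force baseline: try every even-length substring and compare its halves.
// bruteForceEchoCount is a hypothetical helper name used only for illustration.
int bruteForceEchoCount(const std::string& text) {
    std::unordered_set<std::string> seen;  // distinct echo substrings found so far
    const int n = static_cast<int>(text.size());
    for (int start = 0; start < n; ++start) {
        // half is the length of "a" in a + a
        for (int half = 1; start + 2 * half <= n; ++half) {
            // O(half) comparison of text[start, start+half) with the next half
            if (text.compare(start, half, text, start + half, half) == 0) {
                seen.insert(text.substr(start, 2 * half));
            }
        }
    }
    return static_cast<int>(seen.size());
}
```

For example, bruteForceEchoCount("abcabcabc") gives 3, counting "abcabc", "bcabca" and "cabcab".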

We need a faster approach.

One way is to use a rolling hash. For a fixed half-length, we scan the string from left to right once with two pointers.

The first pointer marks the end of the first half; the second one marks the end of the second half. If the two halves give the same hash value, we treat them as the same string (this ignores the small chance of a hash collision).

To avoid counting the same substring more than once, we store the hash values in a set.
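The window update behind the rolling hash can be looked at in isolation. The sketch below assumes the same base (199), modulus (1e9+7) and letter mapping ('a' -> 1) as the solution in this post; directHash and windowHashes are hypothetical helper names introduced only to show that the O(1) update agrees with hashing each window from scratch.

```cpp
#include <string>
#include <vector>

const long long kBase = 199, kMod = 1000000007LL;

// Hash computed from scratch: s[0]*p^(m-1) + ... + s[m-1], with 'a' mapped to 1.
long long directHash(const std::string& s) {
    long long h = 0;
    for (char c : s) h = (h * kBase + (c - 'a' + 1)) % kMod;
    return h;
}

// Hashes of every window s[t .. t+len-1]; each one is derived from the
// previous window in O(1) instead of being recomputed.
std::vector<long long> windowHashes(const std::string& s, int len) {
    std::vector<long long> out;
    const int n = static_cast<int>(s.size());
    if (len <= 0 || len > n) return out;
    long long r = 1;  // r = p^len, the weight of the character leaving the window
    for (int t = 0; t < len; ++t) r = r * kBase % kMod;
    long long h = 0;
    for (int j = 0; j < n; ++j) {
        h = (h * kBase + (s[j] - 'a' + 1)) % kMod;  // append s[j]
        if (j >= len) {                             // drop s[j-len]
            h = (h + kMod - (s[j - len] - 'a' + 1) * r % kMod) % kMod;
        }
        if (j >= len - 1) out.push_back(h);         // the window is full
    }
    return out;
}
```

With s = "abcabc" and len = 3, the first and last windows are both "abc", so windowHashes returns equal values at indices 0 and 3, and both match directHash("abc").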

See the code below:


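class Solution {
public:
    int distinctEchoSubstrings(string text) {
        // Hashes of the distinct halves "a" found so far.
        unordered_set<long long> st;
        int n = text.size();
        const long long p = 199, mod = 1e9+7;
        // i is the length of one half.
        for(int i=1; i<=n/2; ++i) {
            // h1, h2: hashes of the windows ending at j and k; r becomes p^i.
            long long h1 = 0, h2 = 0, r = 1;
            for(int j=0, k=i; k<n; ++j, ++k) {
                if(j<i) r = r*p%mod;
                // Append text[j] and text[k] to the two windows.
                h1 = (h1*p%mod + text[j]-'a'+1)%mod;
                h2 = (h2*p%mod + text[k]-'a'+1)%mod;
                if(j>=i) {
                    // Remove the character that slid out of each window.
                    h1 = (h1 + mod - (text[j-i]-'a'+1)*r%mod)%mod;
                    h2 = (h2 + mod - (text[k-i]-'a'+1)*r%mod)%mod;
                }
                // Both windows are full once j >= i-1.
                // Equal hashes are treated as equal halves (no collision check).
                if(j>=i-1 && h1 == h2) {
                    st.insert(h1);
                }
            }
        }
        return st.size();
    }
};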
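Equal hash values do not strictly guarantee equal strings, and halves of different lengths could in principle share a hash. A more defensive variant, sketched below under the same base and modulus, uses the hash only as a cheap filter, confirms each match with a direct comparison, and stores the halves themselves. The name distinctEchoVerified is hypothetical, and this is one possible hardening rather than the original solution.

```cpp
#include <string>
#include <unordered_set>

// Rolling hash as a filter, followed by an explicit comparison of the two halves.
// distinctEchoVerified is a hypothetical name used only for illustration.
int distinctEchoVerified(const std::string& text) {
    const long long p = 199, mod = 1000000007LL;
    const int n = static_cast<int>(text.size());
    std::unordered_set<std::string> halves;  // distinct halves "a" of confirmed echoes
    for (int i = 1; i <= n / 2; ++i) {       // i is the length of one half
        long long h1 = 0, h2 = 0, r = 1;     // r becomes p^i
        for (int j = 0, k = i; k < n; ++j, ++k) {
            if (j < i) r = r * p % mod;
            h1 = (h1 * p % mod + text[j] - 'a' + 1) % mod;
            h2 = (h2 * p % mod + text[k] - 'a' + 1) % mod;
            if (j >= i) {
                h1 = (h1 + mod - (text[j - i] - 'a' + 1) * r % mod) % mod;
                h2 = (h2 + mod - (text[k - i] - 'a' + 1) * r % mod) % mod;
            }
            // Only pay for the O(i) comparison when the hashes already agree.
            if (j >= i - 1 && h1 == h2 &&
                text.compare(j - i + 1, i, text, k - i + 1, i) == 0) {
                halves.insert(text.substr(j - i + 1, i));
            }
        }
    }
    return static_cast<int>(halves.size());
}
```

The trade-off is extra work per match: every hash hit now costs an O(i) comparison and a substring copy, so inputs with many matches (such as a long run of one letter) run slower than the hash-only version.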


If we do not want to use a rolling hash, we can instead keep a counter of how many consecutive positions have matched between the two pointers.

If the counter equals the half-length, we have found one valid answer; before moving on to the next element, we decrease the counter by 1, because sliding the window one step drops the oldest matched pair.

If the two pointers point to different characters, we reset the counter to 0.

If the two pointers point to equal characters, we increase the counter by 1.


See the code below:

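class Solution {
public:
    int distinctEchoSubstrings(string text) {
        // Distinct halves "a" of the echo substrings found so far.
        unordered_set<string> st;
        int n = text.size();
        // i is the length of one half.
        for(int i=1; i<=n/2; ++i) {
            // ct: number of consecutive positions with text[j] == text[k].
            int ct = 0;
            for(int j=0, k=i; k<n; ++j, ++k) {
                if(text[j] == text[k]) ++ct;
                else ct = 0;
                if(ct == i) {
                    // text[j-i+1..j] equals text[j+1..k], so we record the half.
                    st.insert(text.substr(j-i+1, i));
                    // The oldest matched pair leaves the window on the next step.
                    --ct;
                }
            }
        }
        return st.size();
    }
};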
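When debugging, it can help to see the echo substrings themselves rather than only their count. The sketch below reuses the counting idea but collects the full substrings a + a in sorted order; listEchoSubstrings is a hypothetical helper name, not part of the original solution.

```cpp
#include <set>
#include <string>
#include <vector>

// Same counting scan as above, but returning the distinct echo substrings.
// listEchoSubstrings is a hypothetical name used only for illustration.
std::vector<std::string> listEchoSubstrings(const std::string& text) {
    std::set<std::string> found;  // std::set keeps the results sorted and unique
    const int n = static_cast<int>(text.size());
    for (int i = 1; i <= n / 2; ++i) {  // i is the length of one half
        int ct = 0;                     // consecutive matches text[j] == text[k]
        for (int j = 0, k = i; k < n; ++j, ++k) {
            if (text[j] == text[k]) ++ct;
            else ct = 0;
            if (ct == i) {
                // The echo substring spans text[j-i+1 .. k], which has length 2*i.
                found.insert(text.substr(j - i + 1, 2 * i));
                --ct;
            }
        }
    }
    return std::vector<std::string>(found.begin(), found.end());
}
```

For text = "abcabcabc" this returns "abcabc", "bcabca" and "cabcab", which matches the count of 3 produced by the solution above.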




