Lintcode DP Distinct Subsequences
18 Dec 2015###Task1 Given a string S and a string T, count the number of distinct subsequences of T in S.
A subsequence of a string is a new string which is formed from the original string by deleting some (can be none) of the characters without disturbing the relative positions of the remaining characters. (ie, “ACE” is a subsequence of “ABCDE” while “AEC” is not).
Have you met this question in a real interview? Yes
Example
Given S = “rabbbit”, T = “rabbit”, return 3.
Challenge Do it in O(n2) time and O(n) memory.
O(n2) memory is also acceptable if you do not know how to optimize memory.
###Java
public class Solution {
/**
* @param S, T: Two string.
* @return: Count the number of distinct subsequences
*/
public int numDistinct(String S, String T) {
// write your code here
int n = S.length();
int m = T.length();
if (n < m) {
return 0;
}
int[][] dp = new int[n + 1][m + 1];
dp[0][0] = 0;
for (int i = 0; i <= n; i++) {
dp[i][0] = 1;
}
for (int i = 1; i <= n; i++) {
for (int j = 1; j <= m; j++) {
dp[i][j] = dp[i - 1][j];
if (S.charAt(i - 1) == T.charAt(j - 1)) {
dp[i][j] += dp[i - 1][j - 1];
}
}
}
return dp[n][m];
}
}
###point 一看要求解的个数,然后又是string匹配,而且形式上和Minimum Edit Distance那题很像,基本就是DP题跑不了了。DP题惯例的三步走:定义状态,推导递推公式,确定状态计算方向和起始状态。
1. 状态:i, j分别表示T中长度为i的prefix:T[0:i-1],和S中长度为j的prefix:S[0:j-1]。
DP[i][j]:S[0:j-1]中存在T[0:i-1]作为distinct subsequence的个数。显然如果j<i,DP[i][j] = 0。
2. 递推公式:
(a) T[i]!=s[j]:
T = r a b
S = r c a c b c
DP[i+1][j+1] = DP[i+1][j]
(b) T[i] = s[j]:
T = r a b b
S = r a b b b - DP[i+1][j] = 1
S = r a b b b - DP[i][j] = 2
S = r a b b b /
DP[i+1][j+1] = DP[i][j] + DP[i+1][j]
公式总结:
S[j-1]!= T[i-1]:DP[i][j] = DP[i][j-1]
S[j-1]==T[i-1]:DP[i][j] = DP[i-1][j-1] + DP[i][j-1]
3. 计算方向和起始状态:
DP[i][j]
DP[i+1][j] DP[i+1][j+1]
所以从上向下,从左到右的顺序可以计算。
根据计算顺序,需要先设置第0行、0列的值。
第0列:DP[i][0] = 0,i>0。因为T的长度大于S的长度,不可能成为S的subsequence。
第0行:DP[0][j] = 1,j>=0。这是为了保证第1行的计算正确:
T = r
S = r a r b r c
r a r b r c
1 1 1 1 1 1 1
r 0 1 1 2 2 3 3