Edit distance variant — longest common substring
We take a look at a variant of the Longest common subsequence problem — which is the longest common substring problem.
8–3. The longest common substring (not subsequence) of two strings X and Y is the longest string that appears as a run of consecutive letters in both strings. For example, the longest common substring of photograph and tomography is ograph.
- Let n=|X| and m=|Y|. Give a Θ(nm) dynamic programming algorithm for longest common substring based on the longest common subsequence/edit distance algorithm.
In order to solve this problem — we need to figure out the recurrence relation
String X = “oabcd”
String Y = “oxbcda”
To evaluate a substring, we need to realize that if two characters in a string are equal, then the preceding characters also need to be added up if we need to evaluate the maximum length of the common substring.
dp[i][j] = 0, if X.charAt(i) != Y.charAt(j)
dp[i][j] = 1 + dp[i-1][j-1], if X.charAt(i) == Y.charAt(j)
Similar problem on leetcode : https://leetcode.com/problems/maximum-length-of-repeated-subarray/
Solution of https://leetcode.com/problems/maximum-length-of-repeated-subarray/:
public int findLength(int[] A, int[] B) {
int[][] dp = new int[A.length+1][B.length +1];
int max = 0;
for(int i = 1; i <= A.length; i++) {
for(int j = 1; j <= B.length; j++) {
int A_i = A[i-1];
int B_j = B[j-1];
if(A_i == B_j) {
dp[i][j] = 1 + dp[i-1][j-1];
}
max = Math.max(max, dp[i][j]);
}
}
return max;
}