📜  Longest Common Subsequence | DP-4

📅  Last modified: 2021-09-17 07:05:46             🧑  Author: Mango

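We have discussed the Overlapping Subproblems and Optimal Substructure properties in Set 1 and Set 2 respectively. We also discussed one example problem in Set 3. Let us discuss the Longest Common Subsequence (LCS) problem as one more example problem that can be solved using dynamic programming.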

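LCS problem statement: given two sequences, find the length of the longest subsequence present in both of them. A subsequence is a sequence that appears in the same relative order, but not necessarily contiguously. For example, "abc", "abg", "bdf", "aeg", and "acefg" are all subsequences of "abcdefg".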

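To find the complexity of the brute-force approach, we first need to know how many subsequences a string of length n can have, that is, the number of subsequences with lengths ranging from 1 to n. Recall from combinatorics that the number of combinations with 1 element is nC1, the number with 2 elements is nC2, and so on. We know that nC0 + nC1 + nC2 + … + nCn = 2^n. So a string of length n has 2^n - 1 possible subsequences (counted by the positions chosen), since we do not consider the subsequence of length 0. Checking whether one candidate subsequence is common to both strings takes O(n) time, so the brute-force approach takes O(n * 2^n) time. This time complexity can be improved using dynamic programming.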
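The counting argument above can be checked with a small brute-force sketch. This is only an illustration, not part of the original article: the function names `is_subsequence`, `lcs_brute_force`, and `count_nonempty_subsequences` are hypothetical, and the search enumerates subsequences of the first string only, longest first.

```python
from itertools import combinations


def is_subsequence(candidate, text):
    """Return True if candidate appears in text in order (not necessarily contiguously)."""
    remaining = iter(text)
    # `ch in remaining` advances the iterator until ch is found, so order is respected.
    return all(ch in remaining for ch in candidate)


def lcs_brute_force(x, y):
    """Try every non-empty subsequence of x, longest first; return the first one found in y."""
    n = len(x)
    for length in range(n, 0, -1):
        for positions in combinations(range(n), length):
            candidate = "".join(x[i] for i in positions)
            if is_subsequence(candidate, y):  # O(len(y)) check per candidate
                return candidate
    return ""


def count_nonempty_subsequences(n):
    """Count position subsets of sizes 1..n, i.e. nC1 + nC2 + ... + nCn."""
    return sum(1 for length in range(1, n + 1)
               for _ in combinations(range(n), length))


print(count_nonempty_subsequences(6))        # equals 2**6 - 1
print(lcs_brute_force("AGGTAB", "GXTXAYB"))
```

Because every candidate is generated explicitly, the running time grows with 2^n, which is why this approach is only practical for very short strings.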

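It is a classic computer science problem, the basis of diff (a file comparison program that outputs the differences between two files), and has applications in bioinformatics.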

Examples:
The LCS of the input sequences "ABCDGH" and "AEDFHR" is "ADH", of length 3.
The LCS of the input sequences "AGGTAB" and "GXTXAYB" is "GTAB", of length 4.

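The naive solution for this problem is to generate all subsequences of both given sequences and find the longest matching subsequence. This solution is exponential in terms of time complexity. Let us see how this problem possesses both important properties of a dynamic programming (DP) problem.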

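1) Optimal Substructure:
Let the input sequences be X[0..m-1] and Y[0..n-1], of lengths m and n respectively, and let L(X[0..m-1], Y[0..n-1]) be the length of the LCS of the two sequences X and Y. Following is the recursive definition of L(X[0..m-1], Y[0..n-1]).

If the last characters of both sequences match (X[m-1] == Y[n-1]), then
L(X[0..m-1], Y[0..n-1]) = 1 + L(X[0..m-2], Y[0..n-2])

If the last characters of both sequences do not match (X[m-1] != Y[n-1]), then
L(X[0..m-1], Y[0..n-1]) = MAX ( L(X[0..m-2], Y[0..n-1]), L(X[0..m-1], Y[0..n-2]) )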

Examples:
1) Consider the input strings "AGGTAB" and "GXTXAYB". The last characters of the strings match, so the length of the LCS can be written as:
L(“AGGTAB”, “GXTXAYB”) = 1 + L(“AGGTA”, “GXTXAY”)


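2) Consider the input strings "ABCDGH" and "AEDFHR". The last characters of the strings do not match, so the length of the LCS can be written as:
L("ABCDGH", "AEDFHR") = MAX ( L("ABCDG", "AEDFHR"), L("ABCDGH", "AEDFH") )
So the LCS problem has the optimal substructure property, because the main problem can be solved using solutions to its subproblems.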

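2) Overlapping Subproblems:
Following is a simple recursive implementation of the LCS problem. The implementation simply follows the recursive structure mentioned above.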

C++
/* A Naive recursive implementation of LCS problem */
#include <cstring>
#include <iostream>
using namespace std;
 
int max(int a, int b);
 
/* Returns length of LCS for X[0..m-1], Y[0..n-1] */
int lcs( char *X, char *Y, int m, int n )
{
    if (m == 0 || n == 0)
        return 0;
    if (X[m-1] == Y[n-1])
        return 1 + lcs(X, Y, m-1, n-1);
    else
        return max(lcs(X, Y, m, n-1), lcs(X, Y, m-1, n));
}
 
/* Utility function to get max of 2 integers */
int max(int a, int b)
{
    return (a > b)? a : b;
}
 
/* Driver code */
int main()
{
    char X[] = "AGGTAB";
    char Y[] = "GXTXAYB";
     
    int m = strlen(X);
    int n = strlen(Y);
     
    cout<<"Length of LCS is "<< lcs( X, Y, m, n ) ;
     
    return 0;
}
 
// This code is contributed by rathbhupendra


C
/* A Naive recursive implementation of LCS problem */
#include <stdio.h>
#include <string.h>
 
int max(int a, int b);
 
/* Returns length of LCS for X[0..m-1], Y[0..n-1] */
int lcs( char *X, char *Y, int m, int n )
{
    if (m == 0 || n == 0)
        return 0;
    if (X[m-1] == Y[n-1])
        return 1 + lcs(X, Y, m-1, n-1);
    else
        return max(lcs(X, Y, m, n-1), lcs(X, Y, m-1, n));
}
 
/* Utility function to get max of 2 integers */
int max(int a, int b)
{
    return (a > b)? a : b;
}
 
/* Driver program to test above function */
int main()
{
    char X[] = "AGGTAB";
    char Y[] = "GXTXAYB";
 
    int m = strlen(X);
    int n = strlen(Y);
 
    printf("Length of LCS is %d", lcs( X, Y, m, n ) );
 
    return 0;
}


Java
/* A Naive recursive implementation of LCS problem in java*/
public class LongestCommonSubsequence
{
 
/* Returns length of LCS for X[0..m-1], Y[0..n-1] */
int lcs( char[] X, char[] Y, int m, int n )
{
    if (m == 0 || n == 0)
        return 0;
    if (X[m-1] == Y[n-1])
        return 1 + lcs(X, Y, m-1, n-1);
    else
        return max(lcs(X, Y, m, n-1), lcs(X, Y, m-1, n));
}
 
/* Utility function to get max of 2 integers */
int max(int a, int b)
{
    return (a > b)? a : b;
}
 
public static void main(String[] args)
{
    LongestCommonSubsequence lcs = new LongestCommonSubsequence();
    String s1 = "AGGTAB";
    String s2 = "GXTXAYB";
 
    char[] X=s1.toCharArray();
    char[] Y=s2.toCharArray();
    int m = X.length;
    int n = Y.length;
 
    System.out.println("Length of LCS is" + " " +
                                lcs.lcs( X, Y, m, n ) );
}
 
}
 
// This Code is Contributed by Saket Kumar


Python
# A Naive recursive Python implementation of LCS problem
 
def lcs(X, Y, m, n):
    # Base case: an empty prefix has no common subsequence
    if m == 0 or n == 0:
        return 0
    elif X[m-1] == Y[n-1]:
        return 1 + lcs(X, Y, m-1, n-1)
    else:
        return max(lcs(X, Y, m, n-1), lcs(X, Y, m-1, n))
 
 
# Driver program to test the above function
X = "AGGTAB"
Y = "GXTXAYB"
print("Length of LCS is", lcs(X, Y, len(X), len(Y)))


C#
/* C# Naive recursive implementation of LCS problem */
using System;
 
class GFG
{
     
 
    /* Returns length of LCS for X[0..m-1], Y[0..n-1] */
    static int lcs( char[] X, char[] Y, int m, int n )
    {
        if (m == 0 || n == 0)
            return 0;
        if (X[m - 1] == Y[n - 1])
            return 1 + lcs(X, Y, m - 1, n - 1);
        else
            return max(lcs(X, Y, m, n - 1), lcs(X, Y, m - 1, n));
    }
     
    /* Utility function to get max of 2 integers */
    static int max(int a, int b)
    {
        return (a > b)? a : b;
    }
     
    public static void Main()
    {
        String s1 = "AGGTAB";
        String s2 = "GXTXAYB";
     
        char[] X=s1.ToCharArray();
        char[] Y=s2.ToCharArray();
        int m = X.Length;
        int n = Y.Length;
     
        Console.Write("Length of LCS is" + " "
                    +lcs( X, Y, m, n ) );
    }
}
// This code is Contributed by Sam007


Output:

Length of LCS is 4

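The time complexity of the above naive recursive approach is O(2^n) in the worst case. The worst case happens when all characters of X and Y mismatch, i.e., the length of the LCS is 0.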
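To make the exponential growth concrete, the sketch below wraps the same naive recursion in a call counter and runs it on fully mismatching strings. The helper name `count_naive_calls` is hypothetical and exists only for this illustration.

```python
def count_naive_calls(x, y):
    """Run the naive LCS recursion and return (lcs_length, number_of_recursive_calls)."""
    calls = 0

    def lcs(m, n):
        nonlocal calls
        calls += 1
        if m == 0 or n == 0:
            return 0
        if x[m - 1] == y[n - 1]:
            return 1 + lcs(m - 1, n - 1)
        # Mismatch: both branches are explored, which is what makes the worst case exponential.
        return max(lcs(m, n - 1), lcs(m - 1, n))

    return lcs(len(x), len(y)), calls


# Worst case: no character of x ever matches a character of y.
for size in (2, 4, 6, 8):
    length, calls = count_naive_calls("A" * size, "B" * size)
    print(size, length, calls)
```

When the strings are identical, every call takes the matching branch and the number of calls grows only linearly, so the blow-up is specific to mismatching inputs.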

Considering the above implementation, following is a partial recursion tree for the input strings "AXYT" and "AYZX":

                         lcs("AXYT", "AYZX")
                       /                    \
         lcs("AXY", "AYZX")            lcs("AXYT", "AYZ")
         /            \                  /               \
lcs("AX", "AYZX") lcs("AXY", "AYZ")   lcs("AXY", "AYZ") lcs("AXYT", "AY")

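In the above partial recursion tree, lcs("AXY", "AYZ") is solved twice. If we draw the complete recursion tree, we can see that many subproblems are solved again and again. So this problem has the Overlapping Subproblems property, and recomputation of the same subproblems can be avoided by using either memoization or tabulation. Following is a tabulated implementation of the LCS problem.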

C++

/* Dynamic Programming C++ implementation of LCS problem */
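#include <cstring>
#include <iostream>
#include <vector>
using namespace std;
 
int max(int a, int b);
 
/* Returns length of LCS for X[0..m-1], Y[0..n-1] */
int lcs( char *X, char *Y, int m, int n )
{
    // (m+1) x (n+1) table; std::vector avoids non-standard variable-length arrays
    vector<vector<int>> L(m + 1, vector<int>(n + 1));
    int i, j;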
     
    /* Following steps build L[m+1][n+1] in
    bottom up fashion. Note that L[i][j]
    contains length of LCS of X[0..i-1]
    and Y[0..j-1] */
    for (i = 0; i <= m; i++)
    {
        for (j = 0; j <= n; j++)
        {
        if (i == 0 || j == 0)
            L[i][j] = 0;
     
        else if (X[i - 1] == Y[j - 1])
            L[i][j] = L[i - 1][j - 1] + 1;
     
        else
            L[i][j] = max(L[i - 1][j], L[i][j - 1]);
        }
    }
         
    /* L[m][n] contains length of LCS
    for X[0..m-1] and Y[0..n-1] */
    return L[m][n];
}
 
/* Utility function to get max of 2 integers */
int max(int a, int b)
{
    return (a > b)? a : b;
}
 
// Driver Code
int main()
{
    char X[] = "AGGTAB";
    char Y[] = "GXTXAYB";
     
    int m = strlen(X);
    int n = strlen(Y);
     
    cout << "Length of LCS is "
        << lcs( X, Y, m, n );
     
    return 0;
}
 
// This code is contributed by rathbhupendra

C

/* Dynamic Programming C implementation of LCS problem */
#include <stdio.h>
#include <string.h>
 
int max(int a, int b);
 
/* Returns length of LCS for X[0..m-1], Y[0..n-1] */
int lcs( char *X, char *Y, int m, int n )
{
    int L[m+1][n+1];
    int i, j;
 
    /* Following steps build L[m+1][n+1] in bottom up fashion. Note
       that L[i][j] contains length of LCS of X[0..i-1] and Y[0..j-1] */
    for (i=0; i<=m; i++)
    {
        for (j=0; j<=n; j++)
        {
            if (i == 0 || j == 0)
                L[i][j] = 0;
 
            else if (X[i-1] == Y[j-1])
                L[i][j] = L[i-1][j-1] + 1;
 
            else
                L[i][j] = max(L[i-1][j], L[i][j-1]);
        }
    }
 
    /* L[m][n] contains length of LCS for X[0..m-1] and Y[0..n-1] */
    return L[m][n];
}
 
/* Utility function to get max of 2 integers */
int max(int a, int b)
{
    return (a > b)? a : b;
}
 
/* Driver program to test above function */
int main()
{
    char X[] = "AGGTAB";
    char Y[] = "GXTXAYB";
 
    int m = strlen(X);
    int n = strlen(Y);
 
    printf("Length of LCS is %d", lcs( X, Y, m, n ) );
 
    return 0;
}

Java

/* Dynamic Programming Java implementation of LCS problem */
public class LongestCommonSubsequence
{
 
/* Returns length of LCS for X[0..m-1], Y[0..n-1] */
int lcs( char[] X, char[] Y, int m, int n )
{
    int L[][] = new int[m+1][n+1];
 
    /* Following steps build L[m+1][n+1] in bottom up fashion. Note
        that L[i][j] contains length of LCS of X[0..i-1] and Y[0..j-1] */
    for (int i=0; i<=m; i++)
    {
        for (int j=0; j<=n; j++)
        {
            if (i == 0 || j == 0)
                L[i][j] = 0;
            else if (X[i-1] == Y[j-1])
                L[i][j] = L[i-1][j-1] + 1;
            else
                L[i][j] = max(L[i-1][j], L[i][j-1]);
        }
    }
    return L[m][n];
}
 
/* Utility function to get max of 2 integers */
int max(int a, int b)
{
    return (a > b)? a : b;
}
 
public static void main(String[] args)
{
    LongestCommonSubsequence lcs = new LongestCommonSubsequence();
    String s1 = "AGGTAB";
    String s2 = "GXTXAYB";
 
    char[] X=s1.toCharArray();
    char[] Y=s2.toCharArray();
    int m = X.length;
    int n = Y.length;
 
    System.out.println("Length of LCS is" + " " +
                                lcs.lcs( X, Y, m, n ) );
}
 
}
 
// This Code is Contributed by Saket Kumar

Python

# Dynamic Programming implementation of LCS problem
 
def lcs(X , Y):
    # find the length of the strings
    m = len(X)
    n = len(Y)
 
    # declaring the array for storing the dp values
    L = [[None]*(n+1) for i in range(m+1)]
 
    """Following steps build L[m+1][n+1] in bottom up fashion
    Note: L[i][j] contains length of LCS of X[0..i-1]
    and Y[0..j-1]"""
    for i in range(m+1):
        for j in range(n+1):
            if i == 0 or j == 0 :
                L[i][j] = 0
            elif X[i-1] == Y[j-1]:
                L[i][j] = L[i-1][j-1]+1
            else:
                L[i][j] = max(L[i-1][j] , L[i][j-1])
 
    # L[m][n] contains the length of LCS of X[0..m-1] & Y[0..n-1]
    return L[m][n]
#end of function lcs
 
 
# Driver program to test the above function
X = "AGGTAB"
Y = "GXTXAYB"
print("Length of LCS is", lcs(X, Y))
 
# This code is contributed by Nikhil Kumar Singh(nickzuck_007)

C#

// Dynamic Programming C# implementation
// of LCS problem
using System;
 
class GFG
{
 
    /* Returns length of LCS for X[0..m-1], Y[0..n-1] */
    static int lcs( char[] X, char[] Y, int m, int n )
    {
        int [,]L = new int[m+1,n+1];
     
        /* Following steps build L[m+1][n+1]
        in bottom up fashion. Note
        that L[i][j] contains length of
        LCS of X[0..i-1] and Y[0..j-1] */
        for (int i = 0; i <= m; i++)
        {
            for (int j = 0; j <= n; j++)
            {
                if (i == 0 || j == 0)
                    L[i, j] = 0;
                else if (X[i - 1] == Y[j - 1])
                    L[i, j] = L[i - 1, j - 1] + 1;
                else
                    L[i, j] = max(L[i - 1, j], L[i, j - 1]);
            }
        }
        return L[m, n];
    }
     
    /* Utility function to get max of 2 integers */
    static int max(int a, int b)
    {
        return (a > b)? a : b;
    }
     
    // Driver code
    public static void Main()
    {
         
        String s1 = "AGGTAB";
        String s2 = "GXTXAYB";
     
        char[] X=s1.ToCharArray();
        char[] Y=s2.ToCharArray();
        int m = X.Length;
        int n = Y.Length;
     
        Console.Write("Length of LCS is" + " " +lcs( X, Y, m, n ) );
    }
}
// This Code is Contributed by Sam007

Output:

Length of LCS is 4

The time complexity of the above implementation is O(mn), which is much better than the worst-case time complexity of the naive recursive implementation.
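Memoization, the other way of avoiding recomputation, reaches the same O(mn) bound from the top down. The following is a minimal sketch assuming Python's standard `functools.lru_cache` as the cache; the name `lcs_memo` is hypothetical and not from the original listings.

```python
from functools import lru_cache


def lcs_memo(x, y):
    """Top-down LCS length: same recursion as the naive version, with cached results."""

    @lru_cache(maxsize=None)
    def solve(m, n):
        # solve(m, n) is the LCS length of the prefixes x[0..m-1] and y[0..n-1]
        if m == 0 or n == 0:
            return 0
        if x[m - 1] == y[n - 1]:
            return 1 + solve(m - 1, n - 1)
        return max(solve(m, n - 1), solve(m - 1, n))

    return solve(len(x), len(y))


print("Length of LCS is", lcs_memo("AGGTAB", "GXTXAYB"))
```

Each (m, n) pair is computed at most once, so there are at most (m+1)(n+1) distinct calls. For long inputs the recursion depth can exceed Python's default limit, which is one practical reason to prefer the tabulated version.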
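The above algorithm/code returns only the length of the LCS. Please see the following post for printing the LCS.
Printing Longest Common Subsequence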
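As a rough idea of how the subsequence itself can be recovered, the sketch below fills the same table and then walks back from L[m][n] to L[0][0]. It is an assumption-labelled illustration, not the code from the linked post; the name `lcs_string` is hypothetical, and when several LCSs exist it returns just one of them.

```python
def lcs_string(x, y):
    """Return one longest common subsequence of x and y."""
    m, n = len(x), len(y)
    # table[i][j] holds the LCS length of x[0..i-1] and y[0..j-1]
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])

    # Walk back from the bottom-right corner, collecting matched characters.
    chars = []
    i, j = m, n
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            chars.append(x[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(chars))


print(lcs_string("AGGTAB", "GXTXAYB"))
```

The walk-back takes at most m + n steps, so the overall cost stays O(mn).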
You can also check the space optimized version of LCS at
Space Optimized Solution of LCS
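One common way to reduce the space is to notice that each row of the table depends only on the row before it. The sketch below keeps two rows instead of the full table; it is a minimal illustration under that assumption, not necessarily the approach of the linked post, and the name `lcs_length_two_rows` is hypothetical.

```python
def lcs_length_two_rows(x, y):
    """LCS length using O(min(m, n)) extra space instead of an (m+1) x (n+1) table."""
    if len(y) > len(x):
        x, y = y, x  # make y the shorter string so the rows are as short as possible

    previous = [0] * (len(y) + 1)  # row i-1 of the full table
    for i in range(1, len(x) + 1):
        current = [0] * (len(y) + 1)  # row i of the full table
        for j in range(1, len(y) + 1):
            if x[i - 1] == y[j - 1]:
                current[j] = previous[j - 1] + 1
            else:
                current[j] = max(previous[j], current[j - 1])
        previous = current
    return previous[len(y)]


print("Length of LCS is", lcs_length_two_rows("AGGTAB", "GXTXAYB"))
```

The time complexity is still O(mn). The trade-off is that the full table is no longer available, so the subsequence itself cannot be reconstructed by walking back through it.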

