📜  实现短文本大小的字符串搜索算法的Java程序

📅  最后修改于: 2022-05-13 01:54:54.139000             🧑  作者: Mango

实现短文本大小的字符串搜索算法的Java程序

模式搜索是计算机科学中的一个关键问题。当我们在记事本/word 文件或浏览器或数据库中搜索字符串时,会使用模式搜索算法来显示搜索结果。

一个典型的问题陈述是——
给定一个文本 txt[0..n-1] 和一个模式 pat[0..m-1],编写一个函数search(char pat[], char txt[]) 来打印 txt 中所有出现的 pat[] []。

例子:

Input:  txt[] = "THIS IS A TEST TEXT"
        pat[] = "TExT"
Output: Pattern found at index 15

Input:  txt[] =  "AABAACAADAABAABA"
        pat[] =  "AABA"
Output: Pattern found at index 0
        Pattern found at index 9
        Pattern found at index 12

在这个程序中,输入一个文本和一个模式,并在文本中搜索一个模式,我们得到模式的所有实例作为输出。

算法:

  • 以文本和模式作为输入。
  • 运行从 0 到模式文本长度长度的外部 for 循环。
  • 运行从 0 到模式长度的内部循环。
  • 通过这种方式,对于文本中每个索引处的每个字符,从该索引开始直到索引+模式长度,在文本中搜索该模式。
  • 如果找到该模式,则打印在文本中找到该模式的外循环的索引。
  • 否则,如果未找到该模式,则打印未找到。

下面是上述方法的实现:

Java
// Java Program to Implement the String Search
// Algorithm for Short Text Sizes
  
import java.io.*;
  
class GFG {
    public static void main(String[] args)
    {
  
        String text = "geeksforgeeks is a coding website for geeks";
        String pattern = "geeks";
  
        // calling the method that is designed for
        // printing the instances of pattern
        // found in the text string
        stringMatch(text, pattern);
    }
    public static void stringMatch(String text, String pattern)
    {
  
        int len_t = text.length();
        int len_p = pattern.length();
  
        int k = 0, i = 0, j = 0;
  
        // loop to find out the position Of searched pattern
        for (i = 0; i <= (len_t - len_p); i++) {
  
            for (j = 0; j < len_p; j++)
            {
                if (text.charAt(i + j) != pattern.charAt(j))
                    break;
            }
            
            if (j == len_p)
            {
                k++;
                System.out.println("Pattern Found at Position: " + i);
            }
        }
        
        if (k == 0)
            System.out.println("No Match Found!");
        else
            System.out.println("Total Instances Found = " + k);
    }
}


Java
// Java Program to Implement the String Search
// Algorithm for Short Text Sizes
  
class KMP_String_Matching {
    void KMPSearch(String pat, String txt)
    {
        int M = pat.length();
        int N = txt.length();
  
        // create lps[] that will hold the longest
        // prefix suffix values for pattern
        int lps[] = new int[M];
        int j = 0; // index for pat[]
  
        // Preprocess the pattern (calculate lps[]
        // array)
        computeLPSArray(pat, M, lps);
  
        int i = 0; // index for txt[]
        while (i < N) {
            if (pat.charAt(j) == txt.charAt(i)) {
                j++;
                i++;
            }
            if (j == M) {
                System.out.println("Found pattern "
                                   + "at index " + (i - j));
                j = lps[j - 1];
            }
  
            // mismatch after j matches
            else if (i < N
                     && pat.charAt(j) != txt.charAt(i)) {
                // Do not match lps[0..lps[j-1]] characters,
                // they will match anyway
                if (j != 0)
                    j = lps[j - 1];
                else
                    i = i + 1;
            }
        }
    }
  
    void computeLPSArray(String pat, int M, int lps[])
    {
        // length of the previous longest prefix suffix
        int len = 0;
        int i = 1;
        lps[0] = 0; // lps[0] is always 0
  
        // the loop calculates lps[i] for i = 1 to M-1
        while (i < M) {
            if (pat.charAt(i) == pat.charAt(len)) {
                len++;
                lps[i] = len;
                i++;
            }
            else // (pat[i] != pat[len])
            {
                // This is tricky. Consider the example.
                // AAACAAAA and i = 7. The idea is similar
                // to search step.
                if (len != 0) {
                    len = lps[len - 1];
  
                    // Also, note that we do not increment
                    // i here
                }
                else // if (len == 0)
                {
                    lps[i] = len;
                    i++;
                }
            }
        }
    }
  
    // Driver program to test above function
    public static void main(String args[])
    {
        String text
            = "geeksforgeeks is a coding website for geeks";
        String pattern = "geeks";
  
        KMP_String_Matching effective
            = new KMP_String_Matching();
        effective.KMPSearch(pattern, text);
    }
}


输出
Pattern Found at Position: 0
Pattern Found at Position: 8
Pattern Found at Position: 38
Total Instances Found = 3

最坏情况时间复杂度: O(m(n-m+1))

有效方法:

KMP 算法是一种在文本中搜索模式的有效方法。在检测到不匹配时进行遍历时,下一个窗口的文本中的某些字符是已知的。利用这个优势,时间复杂度降低到 O(n)。

下面是有效方法的实现:

Java

// Java Program to Implement the String Search
// Algorithm for Short Text Sizes
  
class KMP_String_Matching {
    void KMPSearch(String pat, String txt)
    {
        int M = pat.length();
        int N = txt.length();
  
        // create lps[] that will hold the longest
        // prefix suffix values for pattern
        int lps[] = new int[M];
        int j = 0; // index for pat[]
  
        // Preprocess the pattern (calculate lps[]
        // array)
        computeLPSArray(pat, M, lps);
  
        int i = 0; // index for txt[]
        while (i < N) {
            if (pat.charAt(j) == txt.charAt(i)) {
                j++;
                i++;
            }
            if (j == M) {
                System.out.println("Found pattern "
                                   + "at index " + (i - j));
                j = lps[j - 1];
            }
  
            // mismatch after j matches
            else if (i < N
                     && pat.charAt(j) != txt.charAt(i)) {
                // Do not match lps[0..lps[j-1]] characters,
                // they will match anyway
                if (j != 0)
                    j = lps[j - 1];
                else
                    i = i + 1;
            }
        }
    }
  
    void computeLPSArray(String pat, int M, int lps[])
    {
        // length of the previous longest prefix suffix
        int len = 0;
        int i = 1;
        lps[0] = 0; // lps[0] is always 0
  
        // the loop calculates lps[i] for i = 1 to M-1
        while (i < M) {
            if (pat.charAt(i) == pat.charAt(len)) {
                len++;
                lps[i] = len;
                i++;
            }
            else // (pat[i] != pat[len])
            {
                // This is tricky. Consider the example.
                // AAACAAAA and i = 7. The idea is similar
                // to search step.
                if (len != 0) {
                    len = lps[len - 1];
  
                    // Also, note that we do not increment
                    // i here
                }
                else // if (len == 0)
                {
                    lps[i] = len;
                    i++;
                }
            }
        }
    }
  
    // Driver program to test above function
    public static void main(String args[])
    {
        String text
            = "geeksforgeeks is a coding website for geeks";
        String pattern = "geeks";
  
        KMP_String_Matching effective
            = new KMP_String_Matching();
        effective.KMPSearch(pattern, text);
    }
}
输出
Found pattern at index 0
Found pattern at index 8
Found pattern at index 38

时间复杂度: O(n)