Python Program for Anagram Substring Search (Or Search for All Permutations)

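Given a text txt[0..n-1] and a pattern pat[0..m-1], write a function search(char pat[], char txt[]) that prints all occurrences of pat[] and its permutations (or anagrams) in txt[]. You may assume that n > m.
The expected time complexity is O(n).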

Examples:

1) Input:  txt[] = "BACDGABCDA"  pat[] = "ABCD"
   Output:   Found at Index 0
             Found at Index 5
             Found at Index 6
2) Input: txt[] =  "AAABABAA" pat[] = "AABA"
   Output:   Found at Index 0
             Found at Index 1
             Found at Index 4


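A simple idea is to modify the Rabin Karp algorithm. For example, we can keep the hash value as the sum of the ASCII values of all characters, modulo a large prime. For each character of the text, we add the current character to the hash value and subtract the first character of the previous window. This solution looks good, but like standard Rabin Karp, its worst-case time complexity is O(mn). The worst case occurs when every hash value matches and we have to compare the characters of every window one by one.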
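The additive-hash idea can be sketched as follows. This is only an illustration under stated assumptions: the function name `search_sum_hash`, the default `prime` value, and the use of `collections.Counter` for the verification step are choices made here, not part of the original description.

```python
from collections import Counter


def search_sum_hash(pat, txt, prime=1_000_003):
    """Return start indices of windows of txt that are permutations of pat.

    Assumption: `prime` is an arbitrary illustrative modulus.
    """
    m, n = len(pat), len(txt)
    if m == 0 or m > n:
        return []

    pat_counts = Counter(pat)
    pat_hash = sum(ord(c) for c in pat) % prime
    win_hash = sum(ord(c) for c in txt[:m]) % prime

    found = []
    for start in range(n - m + 1):
        # Equal hashes are only a hint: different multisets of characters
        # can have the same sum, so each candidate is verified in O(m).
        if win_hash == pat_hash and Counter(txt[start:start + m]) == pat_counts:
            found.append(start)
        # Slide the window: add the incoming character, drop the outgoing one.
        if start + m < n:
            win_hash = (win_hash + ord(txt[start + m]) - ord(txt[start])) % prime
    return found


print(search_sum_hash("ABCD", "BACDGABCDA"))
```

Because every window whose hash matches is verified in O(m) time, a text in which all hashes match still costs O(mn) overall.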
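We can achieve O(n) time complexity under the assumption that the alphabet size is fixed, which is typically true, since there are at most 256 possible characters in extended (8-bit) ASCII. The idea is to use two count arrays: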

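1) The first count array stores the frequencies of the characters in the pattern.
2) The second count array stores the frequencies of the characters in the current window of text.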
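As a small illustration of the two count arrays, the sketch below builds one array for a pattern and one for a text window and compares them. The helper name `char_counts` is hypothetical, and the code assumes every character has a code point below 256.

```python
MAX = 256  # assumed alphabet size (8-bit characters)


def char_counts(s):
    """Return a list of length MAX with the frequency of each character in s."""
    counts = [0] * MAX
    for ch in s:
        counts[ord(ch)] += 1
    return counts


count_p = char_counts("ABCD")    # pattern
count_tw = char_counts("BACD")   # first window of "BACDGABCDA"

# Both lists always hold exactly MAX entries, so this comparison does a
# fixed amount of work regardless of pattern or text length.
print(count_p == count_tw)               # True: "BACD" is a permutation of "ABCD"
print(char_counts("ACDG") == count_p)    # False: 'G' replaces 'B'
```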

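The important thing to note is that comparing the two count arrays takes O(1) time, because the number of elements in them is fixed (independent of the pattern and text sizes). Following are the steps of the algorithm.
1) Store the frequencies of the pattern's characters in the first count array countP[]. Also store the frequencies of the characters in the first window of text in the array countTW[].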

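2) Now run a loop from i = M to N-1. Do the following in the loop.
.....a) If the two count arrays are identical, we have found an occurrence.
.....b) Increment the count of the current character of the text in countTW[].
.....c) Decrement the count of the first character of the previous window in countTW[].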

3) The last window is not checked by the above loop, so check it explicitly.

Python3

# Python program to search all
# anagrams of a pattern in a text
  
MAX = 256 
  
# This function returns true
# if contents of arr1[] and arr2[]
# are same, otherwise false.
def compare(arr1, arr2):
    for i in range(MAX):
        if arr1[i] != arr2[i]:
            return False
    return True
      
# This function searches for all
# permutations of pat[] in txt[]
def search(pat, txt):
  
    M = len(pat)
    N = len(txt)
  
    # countP[]:  Store count of
    # all characters of pattern
    # countTW[]: Store count of
    # current window of text
    countP = [0]*MAX
  
    countTW = [0]*MAX
  
    for i in range(M):
        countP[ord(pat[i])] += 1
        countTW[ord(txt[i])] += 1
  
    # Traverse through remaining
    # characters of text
    for i in range(M, N):
  
        # Compare counts of current
        # window of text with
        # counts of pattern[]
        if compare(countP, countTW):
            print("Found at Index", (i-M))
  
        # Add current character to current window
        countTW[ord(txt[i])] += 1

        # Remove the first character of previous window
        countTW[ord(txt[i - M])] -= 1
      
    # Check for the last window in text    
    if compare(countP, countTW):
        print("Found at Index", N-M)
          
# Driver program to test above function       
txt = "BACDGABCDA"
pat = "ABCD"       
search(pat, txt)   
  
# This code is contributed
# by Upendra Singh Bartwal

Output:

Found at Index 0
Found at Index 5
Found at Index 6
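For use from other code, it may be handier to return the indices than to print them. The variant below is a sketch under stated assumptions: the name `find_anagrams` and the `alphabet_size` parameter are introduced here for illustration, and characters are assumed to have code points below `alphabet_size`. It compares the arrays after each slide, so the last window needs no separate check.

```python
def find_anagrams(pat, txt, alphabet_size=256):
    """Return the start index of every window of txt that is a permutation of pat."""
    m, n = len(pat), len(txt)
    if m == 0 or m > n:
        return []

    count_p = [0] * alphabet_size   # character frequencies of the pattern
    count_tw = [0] * alphabet_size  # character frequencies of the current window
    for i in range(m):
        count_p[ord(pat[i])] += 1
        count_tw[ord(txt[i])] += 1

    indices = []
    if count_p == count_tw:         # first window, starting at index 0
        indices.append(0)

    for i in range(m, n):
        count_tw[ord(txt[i])] += 1      # character entering the window
        count_tw[ord(txt[i - m])] -= 1  # character leaving the window
        if count_p == count_tw:         # window now starts at i - m + 1
            indices.append(i - m + 1)
    return indices


print(find_anagrams("AABA", "AAABABAA"))  # [0, 1, 4]
```

Run on the second example from the problem statement, it reports indices 0, 1 and 4.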

Please refer to the complete article on Anagram Substring Search (Or Search for all permutations) for more details!