📜  Find the missing end tag in the given HTML code

📅  Last modified: 2021-10-29 06:46:20             🧑  Author: Mango

Given a string htmlCode containing the HTML code of a web page, the task is to find the missing end tag in that code.

Examples:

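Input: htmlCode = "<!DOCTYPE html>
<html>
<head>
    <title>
        GeeksforGeeks
    </title>
</head>
<body>
    <button>
</body>
</html>"
Output: </button>

Input: htmlCode = "<!DOCTYPE html>
<html>
<body>
    <p>Hello</p>
</html>"
Output: </body>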

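Approach: The idea is to use a stack to keep track of the start tags that are currently open in the HTML code. If at any point an end tag does not match the start tag on top of the stack, the tag on top of the stack should have been closed first. The top of the stack is therefore the tag whose end tag is missing.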
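The stack idea can be sketched in isolation, assuming the tags have already been extracted from the HTML. The function name `first_missing_end_tag` and the `(kind, name)` token format are hypothetical and chosen only for illustration; this sketch returns `None` rather than `-1` when nothing is missing.

```python
def first_missing_end_tag(tokens):
    """Return the missing end tag, or None if the tags are balanced.

    `tokens` is a list of ("open", name) / ("close", name) pairs.
    This token format is an assumption made for the sketch.
    """
    stack = []  # names of tags that are open and not yet closed
    for kind, name in tokens:
        if kind == "open":
            stack.append(name)
        elif stack and stack[-1] == name:
            # The end tag closes the most recently opened tag
            stack.pop()
        elif stack:
            # Mismatch: the tag on top of the stack was never closed
            return "</" + stack[-1] + ">"
    # A tag still open at the end is also missing its end tag
    return "</" + stack[-1] + ">" if stack else None


tokens = [("open", "html"), ("open", "body"), ("open", "button"),
          ("close", "body"), ("close", "html")]
print(first_missing_end_tag(tokens))  # prints </button>
```

Here `</body>` arrives while `button` is on top of the stack, so `</button>` is reported.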

Below is the implementation of the above approach:

Python
# Python implementation to find the
# missing end tag in the HTML code

# Function to find the missing end tag in the
# HTML code; returns -1 if no tag is missing
def autoComplete(s):
      
    # Split the HTML code into lines
    linesOfCode = list(s.strip().split("\n"))

    # Void (self-closing) tags never take
    # an end tag, so they are not tracked
    selfClosedTags = ["area", "base", "br",
                      "col", "embed", "hr", "img",
                      "input", "link", "meta", "param",
                      "source", "track", "wbr"]
    n = len(linesOfCode)
  
    stack = []
      
    # Loop to iterate over the
    # lines of code
    for i in range(n):
        j = 0
          
        # Current Line
        line = linesOfCode[i]
        while j < len(linesOfCode[i]):
              
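            # Check whether an end tag ("</...")
            # starts at the current position
            if j + 1 < len(line) and line[j] == "<"\
                            and line[j + 1] == "/":
                tag = []
                j += 2

                # Collect the tag name
                while j < len(line) and\
                   "a" <= line[j] <= "z":
                    tag.append(line[j])
                    j += 1

                # Skip the rest of the tag up to ">"
                while j < len(line) and line[j] != ">":
                    j += 1

                # If the end tag does not match the most
                # recently opened tag, that open tag is
                # the one missing its end tag
                if stack and stack[-1] != tag:
                    tag = stack[-1]
                    return "</" + "".join(tag) + ">"

                # A stray end tag with nothing open is ignored
                if stack:
                    stack.pop()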
                  
            # Skip declarations such as <!DOCTYPE html>,
            # which never need an end tag
            elif j + 1 < len(line) and line[j] == "<"\
                                  and line[j + 1] == "!":
                while j < len(line) and line[j] != ">":
                    j += 1
                  
            # Condition to check if the current 
            # tag is a start tag of the HTML Code
            elif line[j] == "<":
                tag = []
                j += 1
                  
                # Loop to get the tag of the HTML
                while j < len(line) and\
                      "a" <= line[j] <= "z":
                    tag.append(line[j])
                    j += 1
                while j < len(line) and line[j] != ">":
                    j += 1
                      
                # Condition to check if the 
                # current tag is not a self closed tag
                if "".join(tag) not in selfClosedTags:
                    stack.append(tag)
            j += 1
              
    # If a tag is still open after the whole input
    # has been read, its end tag is the missing one
    if stack:
        tag = stack.pop()
        return "</" + "".join(tag) + ">"
    return -1
  
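# Driver Code
if __name__ == "__main__":

    # Sample input in which <button> is never closed
    s = """<!DOCTYPE html>
<html>
<head>
    <title>
        GeeksforGeeks
    </title>
</head>
<body>
    <button>
</body>
</html>"""
    print(autoComplete(s))

Output:

</button>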
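As a variation, the tokenizing could be delegated to Python's standard `html.parser` module, which already copes with attributes, upper-case tag names, comments, and the doctype. This is a sketch of a swapped-in technique rather than the implementation above. The class name `MissingTagFinder` and the helper `find_missing_end_tag` are hypothetical, and treating `<tag/>` as already closed is an assumption.

```python
from html.parser import HTMLParser

# Void elements never take an end tag (same list as in the code above)
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "param", "source", "track", "wbr"}


class MissingTagFinder(HTMLParser):
    """Hypothetical helper that records the first missing end tag."""

    def __init__(self):
        super().__init__()
        self.stack = []      # open tags that are not yet closed
        self.missing = None  # first missing end tag found, if any

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_startendtag(self, tag, attrs):
        # "<tag/>" is treated as opened and closed at once (assumption)
        pass

    def handle_endtag(self, tag):
        if self.missing is not None or not self.stack or tag in VOID_TAGS:
            return
        if self.stack[-1] != tag:
            # The tag on top of the stack was never closed
            self.missing = "</" + self.stack[-1] + ">"
        else:
            self.stack.pop()


def find_missing_end_tag(html_code):
    """Return the first missing end tag, or -1 if none is missing."""
    finder = MissingTagFinder()
    finder.feed(html_code)
    finder.close()
    if finder.missing is None and finder.stack:
        # A tag left open at the end of the input
        finder.missing = "</" + finder.stack[-1] + ">"
    return finder.missing if finder.missing is not None else -1


print(find_missing_end_tag("<div><p>Hello</div>"))  # prints </p>
```

The stack logic is unchanged; only the character-by-character scanning is replaced by the parser's callbacks.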