📜  Find the Missing End Tag in the Given HTML Code

📅  Last modified: 2021-04-29 06:10:11             🧑  Author: Mango

Given a string htmlCode that contains the HTML code of a web page, the task is to find the end tag that is missing from the code.

Examples:

Input: htmlCode = "


    
        GeeksforGeeks
    


    

Input: htmlCode = "


    

Hello

" Output:

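Approach: The idea is to use a stack to keep track of the start tags that are currently open. Each start tag is pushed onto the stack, and each end tag that matches the tag on top of the stack pops it. If an end tag does not match the top of the stack, the tag on top should have been closed first, so the top of the stack is the tag whose end tag is missing. If the input ends while tags are still on the stack, the tag on top is the one left unclosed.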
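To make the stack behaviour concrete, here is a minimal sketch that traces the stack for a sequence of tags that has already been split out of the markup. The function name `trace_missing_tag` and the pre-tokenized input format (a leading `/` marks an end tag) are assumptions made for illustration; they are not part of the implementation in this article, and void tags such as `br` are assumed to be absent from the input.

```python
def trace_missing_tag(tags):
    """Hypothetical helper: return (missing_tag_name, steps).

    `tags` is an assumed, pre-tokenized input such as
    ["html", "body", "p", "/p", "/html"].
    """
    stack = []   # start tags that are still open
    steps = []   # (tag, stack snapshot) pairs, for inspection

    for tag in tags:
        if not tag.startswith("/"):
            # Start tag: remember it until its end tag arrives.
            stack.append(tag)
        elif stack and stack[-1] == tag[1:]:
            # End tag matching the top of the stack: close it.
            stack.pop()
        else:
            # Mismatch: the tag on top was never closed
            # (None if the stack is empty, i.e. a stray end tag).
            steps.append((tag, list(stack)))
            return (stack[-1] if stack else None), steps
        steps.append((tag, list(stack)))

    # Input exhausted: anything left on top is unclosed.
    return (stack[-1] if stack else None), steps


missing, steps = trace_missing_tag(["html", "body", "p", "/p", "/html"])
for tag, snapshot in steps:
    print(tag, "->", snapshot)
print("missing end tag:", missing)
```

In this run the end tag `/html` arrives while `body` is still on top of the stack, so `body` is reported as the tag whose end tag is missing.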

Below is the implementation of the above approach:

Python
# Python implementation to find the
# missing end tag in the HTML code

# Function to find the missing
# end tag in the HTML code
def autoComplete(s):

    # Split the HTML code into lines
    linesOfCode = s.strip().split("\n")

    # Void (self-closed) tags do not
    # need an end tag
    selfClosedTags = ["area", "base", "br",
                      "col", "embed", "hr", "img",
                      "input", "link", "meta", "param",
                      "source", "track", "wbr"]

    # Stack of start tags that are still open
    stack = []

    # Loop to iterate over the
    # lines of code
    for line in linesOfCode:
        j = 0
        while j < len(line):

            # Condition to check if the current
            # character starts an end tag
            if line.startswith("</", j):
                tag = []
                j += 2

                # Loop to get the tag name
                while j < len(line) and "a" <= line[j] <= "z":
                    tag.append(line[j])
                    j += 1
                while j < len(line) and line[j] != ">":
                    j += 1

                # An end tag that does not match the top of
                # the stack means the top tag was never closed
                if stack and stack[-1] != tag:
                    return "</" + "".join(stack[-1]) + ">"
                if stack:
                    stack.pop()

            # Condition to check if the current
            # character starts a declaration such as
            # <!DOCTYPE html>, which has no end tag
            elif line.startswith("<!", j):
                while j < len(line) and line[j] != ">":
                    j += 1

            # Condition to check if the current
            # character starts a start tag
            elif line[j] == "<":
                tag = []
                j += 1

                # Loop to get the tag name
                while j < len(line) and "a" <= line[j] <= "z":
                    tag.append(line[j])
                    j += 1
                while j < len(line) and line[j] != ">":
                    j += 1

                # Push the tag only if it
                # is not a self-closed tag
                if "".join(tag) not in selfClosedTags:
                    stack.append(tag)
            j += 1

    # If any tag is still open at the end,
    # the one on top of the stack is unclosed
    if stack:
        return "</" + "".join(stack[-1]) + ">"
    return -1

# Driver Code
if __name__ == "__main__":

    # Sample input (assumed markup): the
    # <button> start tag is never closed
    s = """<!DOCTYPE html>
<html>
<head>
    <title>
        GeeksforGeeks
    </title>
</head>
<body>
    <button>
</body>
</html>"""
    print(autoComplete(s))

Output:

</button>
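The character-by-character scan above only recognises lowercase tag names and does not account for attributes that contain `>` or for comments. As a hedged alternative, the same stack idea can be sketched on top of Python's standard `html.parser.HTMLParser`, which handles tokenizing and lowercases tag names. The class name `MissingEndTagFinder` and the function `find_missing_end_tag` are hypothetical names chosen for this sketch, and it returns `None` rather than `-1` when nothing is missing.

```python
from html.parser import HTMLParser

# Void elements never take an end tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "param", "source", "track", "wbr"}


class MissingEndTagFinder(HTMLParser):
    """Hypothetical helper: records the first start tag left unclosed."""

    def __init__(self):
        super().__init__()
        self.stack = []      # start tags that are still open
        self.missing = None  # name of the first unclosed tag, if any

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Ignore void tags, stray end tags, and anything after the first hit.
        if tag in VOID_TAGS or self.missing is not None or not self.stack:
            return
        if self.stack[-1] != tag:
            self.missing = self.stack[-1]
        else:
            self.stack.pop()


def find_missing_end_tag(html_code):
    """Return the missing end tag as a string, or None if all tags are closed."""
    finder = MissingEndTagFinder()
    finder.feed(html_code)
    finder.close()
    if finder.missing is None and finder.stack:
        # Input ended with tags still open: the top one is unclosed.
        finder.missing = finder.stack[-1]
    if finder.missing is None:
        return None
    return "</" + finder.missing + ">"


sample = "<!DOCTYPE html>\n<html>\n<body>\n<p class='x'>Hello</p>\n</html>"
print(find_missing_end_tag(sample))
```

For the sample string, `</html>` arrives while `body` is still open, so the sketch reports `</body>`.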