Find the Siblings of Tags Using BeautifulSoup
Prerequisite: BeautifulSoup
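BeautifulSoup (bs4) is a Python library for extracting data from HTML and XML files. It does not ship with Python, so install it by running the following command in a terminal:
pip install beautifulsoup4
In this article, we use BeautifulSoup to navigate between sibling tags in an HTML document.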
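This article covers four sibling attributes:
- previous_sibling returns the node immediately before the given element
- next_sibling returns the node immediately after the given element
- previous_siblings iterates over all nodes before the given element
- next_siblings iterates over all nodes after the given element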
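The four attributes listed above can be compared side by side on one small document. The following is a minimal sketch, assuming a made-up snippet in which the tags b, d, and c sit next to each other inside a; the tag names and variable names are illustrative only.

```python
from bs4 import BeautifulSoup

# Illustrative markup: <b>, <d>, and <c> are siblings inside <a>
html_code = "<a><b>text1</b><d>text3</d><c>text2</c></a>"
soup = BeautifulSoup(html_code, "html.parser")

first, middle, last = soup.b, soup.d, soup.c

# Singular attributes return one node, or None at either end
print(middle.previous_sibling)  # <b>text1</b>
print(middle.next_sibling)      # <c>text2</c>
print(first.previous_sibling)   # None
print(last.next_sibling)        # None

# Plural attributes are generators, so wrap them in list() to see them all
print(list(first.next_siblings))     # [<d>text3</d>, <c>text2</c>]
print(list(last.previous_siblings))  # [<d>text3</d>, <b>text1</b>]
```

Note that previous_siblings walks backward from the element, so the nearest sibling comes first.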
Approach
- Import the module
- Load or create the HTML code
- Parse the HTML code
- Print the required siblings
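Example 1: Get the next immediate sibling
Python3
# Import the module
from bs4 import BeautifulSoup
# HTML code: <b> and <c> are siblings inside <a>
html_code = """<a><b>text1</b><c>text2</c></a>"""
# Parse the HTML code
soup = BeautifulSoup(html_code, 'html.parser')
# Print the node that immediately follows <b>
print(soup.b.next_sibling)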
Output:
<c>text2</c>
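Example 2: Get the previous immediate sibling
Python3
# Import the module
from bs4 import BeautifulSoup
# HTML code: <b> and <c> are siblings inside <a>
html_code = """<a><b>text1</b><c>text2</c></a>"""
# Parse the HTML code
soup = BeautifulSoup(html_code, 'html.parser')
# Print the node that immediately precedes <c>
print(soup.c.previous_sibling)
Output:
<b>text1</b>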
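The sample markup in these examples has no whitespace between tags. In pretty-printed HTML, the newline between two tags is itself a text node, so next_sibling and previous_sibling may return that string rather than a tag. The sketch below, which assumes a hypothetical multi-line version of the same snippet, shows the difference and uses BeautifulSoup's find_next_sibling() and find_previous_sibling() methods, which skip text nodes.

```python
from bs4 import BeautifulSoup

# Hypothetical pretty-printed markup: each tag sits on its own line
html_code = """<a>
<b>text1</b>
<c>text2</c>
</a>"""
soup = BeautifulSoup(html_code, "html.parser")

# The node right after <b> is the newline text, not the <c> tag
print(repr(soup.b.next_sibling))

# find_next_sibling() / find_previous_sibling() return the nearest tag
print(soup.b.find_next_sibling())      # <c>text2</c>
print(soup.c.find_previous_sibling())  # <b>text1</b>
```

When only tags matter, the find_* methods are usually the safer choice; the plain attributes are useful when text nodes matter too.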
Suppose we want every element that follows a tag. To get them, we simply iterate over its next siblings and print each one.
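Example 3: Get all siblings that follow a tag
Python3
# Import the module
from bs4 import BeautifulSoup
# HTML code: <b>, <d>, and <c> are siblings inside <a> (tag names are illustrative)
html_code = """<a><b>text1</b><d>text3</d><c>text2</c></a>"""
# Parse the HTML code
soup = BeautifulSoup(html_code, 'html.parser')
# Print every node that follows <b>
for element in soup.b.next_siblings:
    print(element)
Output:
<d>text3</d>
<c>text2</c>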
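Example 4: Get all siblings that precede a tag
Python3
# Import the module
from bs4 import BeautifulSoup
# HTML code: <b>, <d>, and <c> are siblings inside <a> (tag names are illustrative)
html_code = """<a><b>text1</b><d>text3</d><c>text2</c></a>"""
# Parse the HTML code
soup = BeautifulSoup(html_code, 'html.parser')
# Print every node that precedes <c>, nearest first
for element in soup.c.previous_siblings:
    print(element)
Output:
<d>text3</d>
<b>text1</b>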
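When the HTML contains line breaks or indentation, next_siblings and previous_siblings also yield the whitespace strings between tags. One way to keep only the tags is to filter on the Tag class, as in this minimal sketch; the helper name tag_siblings and the multi-line markup are assumptions made for illustration.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Hypothetical pretty-printed markup with newlines between sibling tags
html_code = """<a>
<b>text1</b>
<d>text3</d>
<c>text2</c>
</a>"""
soup = BeautifulSoup(html_code, "html.parser")


def tag_siblings(siblings):
    """Keep only Tag nodes, dropping whitespace and other text nodes."""
    return [node for node in siblings if isinstance(node, Tag)]


following = tag_siblings(soup.b.next_siblings)
preceding = tag_siblings(soup.c.previous_siblings)

print([tag.name for tag in following])  # ['d', 'c']
print([tag.name for tag in preceding])  # ['d', 'b']
```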