📜  使用 BeautifulSoup 抓取 Covid-19 统计数据

📅  最后修改于: 2022-05-13 01:55:31.702000             🧑  作者: Mango

使用 BeautifulSoup 抓取 Covid-19 统计数据

冠状病毒是最大的流行病之一,已将全世界置于危险之中。除此之外,它是热门新闻之一,每个人都有这一天。在本文中,我们将以人类可读的形式抓取数据并打印 Covid-19 统计数据。数据将从本网站抓取
先决条件:

  • 必须安装库“requests”、“bs4”和“texttable”——
pip install bs4
pip install requests
pip install texttable

项目:让我们开始编写代码,创建一个名为 run.py 的文件。

Python3
# importing modules
import requests
from bs4 import BeautifulSoup
 
# URL for scrapping data
url = 'https://www.worldometers.info/coronavirus/countries-where-coronavirus-has-spread/'
 
# get URL html
page = requests.get(url)
soup = BeautifulSoup(page.text, 'html.parser')
 
data = []
 
# soup.find_all('td') will scrape every
# element in the url's table
data_iterator = iter(soup.find_all('td'))
 
# data_iterator is the iterator of the table
# This loop will keep repeating till there is
# data available in the iterator
while True:
    try:
        country = next(data_iterator).text
        confirmed = next(data_iterator).text
        deaths = next(data_iterator).text
        continent = next(data_iterator).text
 
        # For 'confirmed' and 'deaths',
        # make sure to remove the commas
        # and convert to int
        data.append((
            country,
            int(confirmed.replace(', ', '')),
            int(deaths.replace(', ', '')),
            continent
        ))
 
    # StopIteration error is raised when
    # there are no more elements left to
    # iterate through
    except StopIteration:
        break
 
# Sort the data by the number of confirmed cases
data.sort(key = lambda row: row[1], reverse = True)


Python3
# create texttable object
import texttable as tt
table = tt.Texttable()
 
# Add an empty row at the beginning for the headers
table.add_rows([(None, None, None, None)] + data)
 
# 'l' denotes left, 'c' denotes center,
# and 'r' denotes right
table.set_cols_align(('c', 'c', 'c', 'c')) 
table.header((' Country ', ' Number of cases ', ' Deaths ', ' Continent '))
 
print(table.draw())


为了以人类可读的格式打印数据,我们将使用库“ texttable

Python3

# create texttable object
import texttable as tt
table = tt.Texttable()
 
# Add an empty row at the beginning for the headers
table.add_rows([(None, None, None, None)] + data)
 
# 'l' denotes left, 'c' denotes center,
# and 'r' denotes right
table.set_cols_align(('c', 'c', 'c', 'c')) 
table.header((' Country ', ' Number of cases ', ' Deaths ', ' Continent '))
 
print(table.draw())

输出:

+---------------------------+-------------------+----------+-------------------+
|          Country          |  Number of cases  |  Deaths  |     Continent     |
+===========================+===================+==========+===================+
|       United States       |      644348       |  28554   |   North America   |
+---------------------------+-------------------+----------+-------------------+
|           Spain           |      180659       |  18812   |      Europe       |
+---------------------------+-------------------+----------+-------------------+
|           Italy           |      165155       |  21645   |      Europe       |
+---------------------------+-------------------+----------+-------------------+
...


注意:输出取决于当前统计信息
待在家里,注意安全!