📌  相关文章
📜  Python|计算给定文本文件中每个单词的出现次数(使用字典)

📅  最后修改于: 2022-05-13 01:55:29.016000             🧑  作者: Mango

Python|计算给定文本文件中每个单词的出现次数(使用字典)

很多时候需要计算文本文件中每个单词的出现次数。为此,我们使用了一个字典对象,该对象将单词存储为键,将其计数存储为相应的值。我们遍历文件中的每个单词并将其添加到字典中,计数为 1。如果字典中已经存在该单词,我们将其计数增加 1。

示例 #1:
首先,我们创建一个文本文件,我们要计算其中的单词。让这个文件为sample.txt ,内容如下:

Mango banana apple pear
Banana grapes strawberry
Apple pear mango banana
Kiwi apple mango strawberry

注意:确保文本文件与Python文件位于同一目录中。

# Open the file in read mode
text = open("sample.txt", "r")
  
# Create an empty dictionary
d = dict()
  
# Loop through each line of the file
for line in text:
    # Remove the leading spaces and newline character
    line = line.strip()
  
    # Convert the characters in line to 
    # lowercase to avoid case mismatch
    line = line.lower()
  
    # Split the line into words
    words = line.split(" ")
  
    # Iterate over each word in line
    for word in words:
        # Check if the word is already in dictionary
        if word in d:
            # Increment count of word by 1
            d[word] = d[word] + 1
        else:
            # Add the word to dictionary with count 1
            d[word] = 1
  
# Print the contents of dictionary
for key in list(d.keys()):
    print(key, ":", d[key])

输出:

mango : 3
banana : 3
apple : 3
pear : 2
grapes : 1
strawberry : 2
kiwi : 1


示例 #2:

考虑一个包含带有标点符号的句子的文件sample.txt

Mango! banana apple pear.
Banana, grapes strawberry.
Apple- pear mango banana.
Kiwi "apple" mango strawberry.
import string
  
# Open the file in read mode
text = open("sample.txt", "r")
  
# Create an empty dictionary
d = dict()
  
# Loop through each line of the file
for line in text:
    # Remove the leading spaces and newline character
    line = line.strip()
  
    # Convert the characters in line to 
    # lowercase to avoid case mismatch
    line = line.lower()
  
    # Remove the punctuation marks from the line
    line = line.translate(line.maketrans("", "", string.punctuation))
  
    # Split the line into words
    words = line.split(" ")
  
    # Iterate over each word in line
    for word in words:
        # Check if the word is already in dictionary
        if word in d:
            # Increment count of word by 1
            d[word] = d[word] + 1
        else:
            # Add the word to dictionary with count 1
            d[word] = 1
  
# Print the contents of dictionary
for key in list(d.keys()):
    print(key, ":", d[key])

输出:

mango : 3
banana : 3
apple : 3
pear : 2
grapes : 1
strawberry : 2
kiwi : 1