Python | OCR on All Images in a Folder at Once (1)

📅 Last modified: 2023-12-03 15:04:25.608000 🧑 Author: Mango

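Running OCR on All Images in a Folder with Python

OCR (Optical Character Recognition) converts the text in an image into editable electronic text. Day-to-day work often involves batches of images that contain text, and automating OCR with Python can save considerable time.

This article shows how to use Python to run OCR on every image in a folder in one pass.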

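Program design

The program follows these steps:

Locate the folder that stores the images and collect all image files in it;
Run OCR on each image to obtain its text;
Save the results to a single text file, writing each image's text in turn.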

Implementation steps

Installing dependencies

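Using OCR from Python here requires the Tesseract OCR engine plus the pytesseract wrapper package. On Windows, an installer for Tesseract can be downloaded from https://github.com/UB-Mannheim/tesseract/wiki. On Debian or Ubuntu, the following commands install tesseract-ocr and pytesseract (the pip command is needed on Windows as well):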

sudo apt-get install tesseract-ocr
pip install pytesseract
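
Since pytesseract only wraps the tesseract command-line program, it can help to confirm that the executable is reachable before running OCR. The following is a minimal sketch, assuming the executable is named tesseract; the helper name find_tesseract is made up for illustration and is not part of pytesseract.

```python
import shutil


def find_tesseract(executable="tesseract"):
    """Return the full path of the tesseract executable, or None if it is not on PATH."""
    return shutil.which(executable)


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("tesseract was not found on PATH; install it or add its folder to PATH")
    else:
        print("tesseract found at", path)
```

If the engine is installed but not on PATH, which is common on Windows, pytesseract can be pointed at it by assigning the full path of the executable to pytesseract.pytesseract.tesseract_cmd.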

Importing modules

Before the program starts, import the modules it needs: os, glob, and pytesseract.

import os
import glob
import pytesseract

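Locating the folder and collecting all image files

Use the os module's getcwd function to get the directory the program is run from, then use the glob module to match the files with the desired extensions (here .png and .jpg).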

img_dir = os.getcwd()  # get the current working directory
img_files = glob.glob(img_dir + "/*.png") + glob.glob(img_dir + "/*.jpg")
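
The glob patterns above are matched as written, so on a case-sensitive file system a file named photo.PNG or scan.jpeg would be skipped, and the order of the results is not guaranteed. As an alternative sketch, assuming pathlib is acceptable in place of glob, the hypothetical helper list_images below compares extensions case-insensitively and returns a sorted list. The set of extensions is an assumption and can be adjusted.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}  # assumed set of extensions; adjust as needed


def list_images(folder):
    """Return the image paths directly inside `folder` (non-recursive), sorted."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
```

Sorting makes the order of entries in the output file reproducible from one run to the next.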

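Recognizing the images and saving the results

Use the image_to_string function of the pytesseract module to run OCR on each image and get the result as a string. The results are written to one text file, with a newline after each image's text (the recognized text itself may span several lines).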

with open(os.path.join(img_dir, "result.txt"), "w", encoding="utf-8") as f:
    for img_file in img_files:
        text = pytesseract.image_to_string(img_file, lang='eng')
        f.write(text + "\n")

This completes the program.
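
In the loop above, the output file does not show which image each piece of text came from, and a single unreadable image stops the whole run. One possible variation is sketched below, under the assumption that a labeled header per image is wanted. The function name ocr_images_to_file and the "===" header format are illustrative choices, and the OCR function is passed in as a parameter so that the sketch does not depend on a particular engine.

```python
import os


def ocr_images_to_file(img_files, out_path, ocr_func, lang="eng"):
    """Run ocr_func on each image and write labeled results to out_path.

    Returns the list of files that could not be processed.
    """
    failed = []
    with open(out_path, "w", encoding="utf-8") as f:
        for img_file in img_files:
            name = os.path.basename(img_file)
            try:
                text = ocr_func(img_file, lang=lang)
            except Exception as exc:  # keep going if one image fails
                failed.append(img_file)
                f.write(f"=== {name} (failed: {exc}) ===\n\n")
                continue
            f.write(f"=== {name} ===\n")
            f.write(text.strip() + "\n\n")
    return failed
```

With pytesseract installed, a call might look like ocr_images_to_file(img_files, "result.txt", pytesseract.image_to_string), since image_to_string accepts a file path and a lang keyword.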

Complete code

import os
import glob
import pytesseract

img_dir = os.getcwd()  # get the current working directory
img_files = glob.glob(img_dir + "/*.png") + glob.glob(img_dir + "/*.jpg")

with open(os.path.join(img_dir, "result.txt"), "w", encoding="utf-8") as f:
    for img_file in img_files:
        text = pytesseract.image_to_string(img_file, lang='eng')
        f.write(text + "\n")

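Summary

This article showed how to use Python to run OCR on all images in a folder: importing the modules, locating the folder and its image files, recognizing each image, and saving the results. Because each image is processed independently of the others, the loop is a natural candidate for multithreading or distribution across workers when the number of images is large.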
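
As a rough illustration of the multithreaded direction, the sketch below uses the standard library's ThreadPoolExecutor. The function name ocr_in_parallel and the default of four workers are assumptions made for the example, not something the program above prescribes. Threads are a reasonable fit here because pytesseract runs the tesseract engine as a separate process for each call.

```python
from concurrent.futures import ThreadPoolExecutor


def ocr_in_parallel(img_files, ocr_func, max_workers=4):
    """Apply ocr_func to every image using a thread pool.

    Returns a list of (image_path, text) pairs in the same order as img_files.
    """
    img_files = list(img_files)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, even if tasks finish out of order
        texts = list(pool.map(ocr_func, img_files))
    return list(zip(img_files, texts))
```

With pytesseract installed, ocr_func could be something like lambda p: pytesseract.image_to_string(p, lang="eng"). Note that an exception raised for one image propagates when the results are collected, so error handling per image would need to live inside ocr_func.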