OCR of English Alphabets in Python OpenCV

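OCR stands for Optical Character Recognition, a computer vision technique for recognizing characters such as digits, letters, and symbols. Such characters are common in everyday life, and character recognition can be tailored to specific needs. Here we implement optical character recognition for English letters using OpenCV, with the K-Nearest Neighbors (KNN) algorithm as the classifier.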
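To make the KNN idea concrete, here is a minimal pure-Python sketch of the voting rule: find the k training points closest to a query and return the most common label among them. The function name `knn_predict` and the toy points are made up for illustration, and OpenCV's own implementation may differ in details such as tie-breaking.

```python
from collections import Counter
import math


def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Pair every training point's Euclidean distance to the query with its label,
    # then sort so the nearest points come first.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Count the labels of the k nearest points and pick the most frequent one.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): two well-separated clusters labelled 0 and 1.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = [0, 0, 0, 1, 1, 1]

print(knn_predict(points, labels, (0.5, 0.5)))
print(knn_predict(points, labels, (5.5, 5.5)))
```

In the letter data each point has 16 coordinates instead of 2, but the voting rule is the same.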

Note: You can find the data on which we will perform OCR here.

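The dataset has 20,000 rows and 17 columns: the first column holds the letter, and the remaining 16 columns hold its features. We preprocess the data by converting each letter to a numeric label derived from its ASCII code (ord(ch) - ord('A'), so 'A' becomes 0 and 'Z' becomes 25). For classification, we use the first 10,000 rows as training data and the remaining 10,000 rows as testing data.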
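The preprocessing described above can be sketched without any third-party library. The four rows below are made-up samples that only mimic the layout (one letter followed by 16 integer features), and `parse_rows` is a hypothetical helper, not part of OpenCV or NumPy.

```python
import csv
import io

# Made-up rows in the dataset's layout: a letter, then 16 integer features.
SAMPLE = """T,2,8,3,5,1,8,13,0,6,6,10,8,0,8,0,8
I,5,12,3,7,2,10,5,5,4,13,3,9,2,8,4,10
D,4,11,6,8,6,10,6,2,6,10,3,7,3,7,3,9
N,7,11,6,6,3,5,9,4,6,4,4,10,6,10,2,8
"""


def parse_rows(text):
    """Turn CSV text into a list of (numeric_label, features) pairs."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if not record:
            continue  # skip blank lines
        label = ord(record[0]) - ord('A')  # 'A' -> 0, ..., 'Z' -> 25
        features = [float(value) for value in record[1:]]
        rows.append((label, features))
    return rows


rows = parse_rows(SAMPLE)

# First half for training, second half for testing.
half = len(rows) // 2
train_rows, test_rows = rows[:half], rows[half:]

print([label for label, _ in rows])  # [19, 8, 3, 13]
```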

Below is the implementation.

Python3

# Import the libraries
import cv2 as cv
import numpy as np
  
 
# Read the data and use a converter to map
# the letter in column 0 to a numeric value
# ('A' -> 0, ..., 'Z' -> 25).
data = np.loadtxt('letter-recognition',
                  dtype='float32',
                  delimiter=',',
                  converters={0: lambda ch: ord(ch) - ord('A')})
 
# split the data into train_data
# and test_data
train_data, test_data = np.vsplit(data,2)
  
# split train_data and test_data
# to features and responses.
responses, training = np.hsplit(train_data,[1])
classes, testing = np.hsplit(test_data,[1])
  
# Create the knn classifier
knn = cv.ml.KNearest_create()
knn.train(training, cv.ml.ROW_SAMPLE, responses)
  
# Obtain the results of the classifier
# determine the number of neighbors.
ret, Output, neighbours, distance = knn.findNearest(testing, k=7)
  
# Compare the Output with the true classes
# to count the correct predictions.
correct_OP = np.count_nonzero(Output == classes)
  
# Calculate the accuracy as a percentage of
# the test samples and display it.
accuracy = (correct_OP * 100.0) / classes.size
print(accuracy)

Output

92.82
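Since the classifier works with numeric labels, it can help to map predictions back to letters when inspecting results. The sketch below reverses the ord(ch) - ord('A') conversion and recomputes an accuracy figure on a tiny made-up example; `labels_to_letters` and `accuracy_percent` are hypothetical helper names, and the sample values are not taken from the dataset.

```python
def labels_to_letters(labels):
    """Map numeric labels (0..25, possibly floats) back to 'A'..'Z'."""
    return ''.join(chr(int(round(label)) + ord('A')) for label in labels)


def accuracy_percent(predicted, actual):
    """Percentage of positions where the prediction matches the true label."""
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct * 100.0 / len(actual)


# Made-up predictions and ground truth for illustration only.
predicted = [19.0, 8.0, 3.0, 13.0]
actual = [19.0, 8.0, 3.0, 12.0]

print(labels_to_letters(predicted))         # TIDN
print(accuracy_percent(predicted, actual))  # 75.0
```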