Face and Hand Landmark Detection Using Python – Mediapipe, OpenCV

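In this article, we will use the mediapipe Python library to detect face and hand landmarks. We will use the holistic model from the mediapipe solutions to detect all the face and hand landmarks. We will also see how to access individual face and hand landmarks, which can be used in computer vision applications such as sign language detection and drowsiness detection.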

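Required Libraries

Mediapipe is a cross-platform library developed by Google that provides ready-to-use ML solutions for computer vision tasks.
OpenCV is a computer vision library for Python that is widely used for image analysis, image processing, detection, recognition, and more.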

Installing the required libraries

pip install opencv-python mediapipe msvc-runtime

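Below is the step-by-step approach to face and hand landmark detection

STEP-1: Import all the necessary libraries. In our case only two third-party libraries are needed, OpenCV and Mediapipe; the time module ships with Python and is used for the FPS calculation.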

Python3

# Import Libraries
import cv2
import time
import mediapipe as mp

STEP-2: Initialize the holistic model and the drawing utilities used to detect and draw landmarks on the image.

Python3

# Grabbing the Holistic Model from Mediapipe and
# Initializing the Model
mp_holistic = mp.solutions.holistic
holistic_model = mp_holistic.Holistic(
    min_detection_confidence=0.5,
    min_tracking_confidence=0.5
)
  
# Initializing the drawing utils for drawing the facial landmarks on image
mp_drawing = mp.solutions.drawing_utils

Let us look at the parameters of the holistic model:

Holistic(
  static_image_mode=False, 
  model_complexity=1, 
  smooth_landmarks=True, 
  min_detection_confidence=0.5, 
  min_tracking_confidence=0.5
)

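static_image_mode: Specifies whether the input images are treated as unrelated static images or as a video stream. The default is False.
model_complexity: Specifies the complexity of the pose landmark model: 0, 1, or 2. Landmark accuracy and latency both increase with model complexity. The default is 1.
smooth_landmarks: Reduces jitter in the predictions by filtering pose landmarks across successive input images. The default is True.
min_detection_confidence: The minimum confidence value, in [0.0, 1.0], for a detection from the person-detection model to be considered successful. The default is 0.5.
min_tracking_confidence: The minimum confidence value, in [0.0, 1.0], for landmarks from the landmark-tracking model to be considered tracked successfully. The default is 0.5.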
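The ranges listed above can be checked before the model is constructed. The following is a minimal sketch, not part of Mediapipe: holistic_options is a hypothetical helper name, and it only builds a dictionary of keyword arguments that could then be passed to mp_holistic.Holistic.

```python
def holistic_options(static_image_mode=False, model_complexity=1,
                     smooth_landmarks=True, min_detection_confidence=0.5,
                     min_tracking_confidence=0.5):
    """Hypothetical helper: validate the documented ranges and return kwargs."""
    if model_complexity not in (0, 1, 2):
        raise ValueError("model_complexity must be 0, 1 or 2")
    for name, value in (("min_detection_confidence", min_detection_confidence),
                        ("min_tracking_confidence", min_tracking_confidence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0.0, 1.0]")
    return {
        "static_image_mode": static_image_mode,
        "model_complexity": model_complexity,
        "smooth_landmarks": smooth_landmarks,
        "min_detection_confidence": min_detection_confidence,
        "min_tracking_confidence": min_tracking_confidence,
    }


# Example: settings one might choose for a batch of unrelated still photos
photo_options = holistic_options(static_image_mode=True, model_complexity=2)
print(photo_options)
```

With Mediapipe imported as in STEP-1, the dictionary could be unpacked as mp_holistic.Holistic(**photo_options).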

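STEP-3: Detect face and hand landmarks in the image. The holistic model processes the image and produces landmarks for the face, the left hand, and the right hand, and it also detects the body pose.

Capture frames continuously from the camera using OpenCV.
Convert the BGR image to an RGB image and make predictions using the initialized holistic model.
The predictions made by the holistic model are saved in the results variable, from which the landmarks can be accessed using results.face_landmarks, results.right_hand_landmarks, and results.left_hand_landmarks respectively.
Draw the detected landmarks on the image using the draw_landmarks function from the drawing utils.
Display the resulting image.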
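When a face or hand is not visible in a frame, the corresponding field of the results object is None, so code that reads landmarks directly should check for that first. The sketch below illustrates the check with a stand-in object built from SimpleNamespace instead of a real Mediapipe result; count_landmarks is a hypothetical helper name.

```python
from types import SimpleNamespace

FIELDS = ("face_landmarks", "left_hand_landmarks", "right_hand_landmarks")


def count_landmarks(results):
    """Return how many landmarks each part contains, 0 if it was not detected."""
    counts = {}
    for field in FIELDS:
        landmark_list = getattr(results, field, None)
        counts[field] = 0 if landmark_list is None else len(landmark_list.landmark)
    return counts


# Stand-in for a frame in which only the left hand was detected (assumption:
# real results expose the same attribute names with a .landmark sequence)
fake_hand = SimpleNamespace(landmark=[SimpleNamespace(x=0.5, y=0.5, z=0.0)] * 21)
fake_results = SimpleNamespace(face_landmarks=None,
                               left_hand_landmarks=fake_hand,
                               right_hand_landmarks=None)
print(count_landmarks(fake_results))
# {'face_landmarks': 0, 'left_hand_landmarks': 21, 'right_hand_landmarks': 0}
```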

Python3

# (0) in VideoCapture is used to connect to your computer's default camera
capture = cv2.VideoCapture(0)
  
# Initializing current time and previous time for calculating the FPS
previousTime = 0
currentTime = 0
  
while capture.isOpened():
    # capture frame by frame
    ret, frame = capture.read()

    # Stop if no frame could be read (camera unavailable or stream ended)
    if not ret:
        break
  
    # resizing the frame for better view
    frame = cv2.resize(frame, (800, 600))
  
    # Converting the frame from BGR to RGB
    image = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
  
    # Making predictions using holistic model
    # To improve performance, optionally mark the image as not writeable to
    # pass by reference.
    image.flags.writeable = False
    results = holistic_model.process(image)
    image.flags.writeable = True
  
    # Converting back the RGB image to BGR
    image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
  
    # Drawing the Facial Landmarks
    mp_drawing.draw_landmarks(
      image,
      results.face_landmarks,
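      # Newer Mediapipe releases expose FACEMESH_CONTOURS in place of
      # the older FACE_CONNECTIONS constant
      mp_holistic.FACEMESH_CONTOURS,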
      mp_drawing.DrawingSpec(
        color=(255,0,255),
        thickness=1,
        circle_radius=1
      ),
      mp_drawing.DrawingSpec(
        color=(0,255,255),
        thickness=1,
        circle_radius=1
      )
    )
  
    # Drawing right-hand landmarks
    mp_drawing.draw_landmarks(
      image, 
      results.right_hand_landmarks, 
      mp_holistic.HAND_CONNECTIONS
    )
  
    # Drawing left-hand landmarks
    mp_drawing.draw_landmarks(
      image, 
      results.left_hand_landmarks, 
      mp_holistic.HAND_CONNECTIONS
    )
      
    # Calculating the FPS
    currentTime = time.time()
    fps = 1 / (currentTime-previousTime)
    previousTime = currentTime
      
    # Displaying FPS on the image
    cv2.putText(image, str(int(fps))+" FPS", (10, 70), cv2.FONT_HERSHEY_COMPLEX, 1, (0,255,0), 2)
  
    # Display the resulting image
    cv2.imshow("Facial and Hand Landmarks", image)
  
    # Press the 'q' key to break the loop
    if cv2.waitKey(5) & 0xFF == ord('q'):
        break
  
# When processing is done,
# release the capture and destroy all windows
capture.release()
cv2.destroyAllWindows()
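In the loop above, previousTime starts at 0, so the first FPS value is close to zero, and the division can fail if two frames happen to share the same timestamp. One possible alternative is a small smoothed counter, sketched below. FpsMeter and its smoothing parameter are assumptions for illustration, not part of OpenCV or Mediapipe.

```python
import time


class FpsMeter:
    """Exponentially smoothed frames-per-second estimate (illustrative helper)."""

    def __init__(self, smoothing=0.9):
        self.smoothing = smoothing  # weight given to the previous estimate
        self.previous = None        # timestamp of the previous frame
        self.fps = 0.0              # current estimate, 0.0 until two frames seen

    def tick(self, now=None):
        """Register one frame and return the current FPS estimate."""
        now = time.perf_counter() if now is None else now
        if self.previous is not None:
            elapsed = now - self.previous
            if elapsed > 0:  # skip the update instead of dividing by zero
                instant = 1.0 / elapsed
                if self.fps == 0.0:
                    self.fps = instant
                else:
                    self.fps = (self.smoothing * self.fps
                                + (1.0 - self.smoothing) * instant)
        self.previous = now
        return self.fps


# Simulated timestamps, one frame every 0.25 seconds
meter = FpsMeter()
for timestamp in (0.0, 0.25, 0.5, 0.75):
    fps = meter.tick(timestamp)
print(fps)  # 4.0
```

Inside the capture loop, one would call meter.tick() once per frame without an argument and pass the returned value to cv2.putText.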

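The holistic model produces 468 face landmarks, 21 left-hand landmarks, and 21 right-hand landmarks. Individual landmarks can be accessed by specifying the index of the required landmark, for example results.left_hand_landmarks.landmark[0]. You can get the indices of all the individual landmarks using the code below:

Python3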

# Code to access landmarks
for landmark in mp_holistic.HandLandmark:
    print(landmark, landmark.value)
  
print(mp_holistic.HandLandmark.WRIST.value)

HandLandmark.WRIST 0
HandLandmark.THUMB_CMC 1
HandLandmark.THUMB_MCP 2
HandLandmark.THUMB_IP 3
HandLandmark.THUMB_TIP 4
HandLandmark.INDEX_FINGER_MCP 5
HandLandmark.INDEX_FINGER_PIP 6
HandLandmark.INDEX_FINGER_DIP 7
HandLandmark.INDEX_FINGER_TIP 8
HandLandmark.MIDDLE_FINGER_MCP 9
HandLandmark.MIDDLE_FINGER_PIP 10
HandLandmark.MIDDLE_FINGER_DIP 11
HandLandmark.MIDDLE_FINGER_TIP 12
HandLandmark.RING_FINGER_MCP 13
HandLandmark.RING_FINGER_PIP 14
HandLandmark.RING_FINGER_DIP 15
HandLandmark.RING_FINGER_TIP 16
HandLandmark.PINKY_MCP 17
HandLandmark.PINKY_PIP 18
HandLandmark.PINKY_DIP 19
HandLandmark.PINKY_TIP 20
0
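These indices can be used to measure simple gestures. Each landmark carries x and y coordinates normalized by the image width and height, so they must be scaled before being used as pixel positions. The sketch below computes the pixel distance between the thumb tip (index 4) and the index fingertip (index 8). The helper names are hypothetical, and a SimpleNamespace object stands in for a real results.left_hand_landmarks value.

```python
import math
from types import SimpleNamespace

THUMB_TIP = 4          # HandLandmark.THUMB_TIP
INDEX_FINGER_TIP = 8   # HandLandmark.INDEX_FINGER_TIP


def to_pixels(landmark, width, height):
    """Scale a normalized landmark to integer pixel coordinates."""
    return int(landmark.x * width), int(landmark.y * height)


def pinch_distance(hand_landmarks, width, height):
    """Pixel distance between the thumb tip and the index fingertip."""
    thumb_x, thumb_y = to_pixels(hand_landmarks.landmark[THUMB_TIP], width, height)
    index_x, index_y = to_pixels(hand_landmarks.landmark[INDEX_FINGER_TIP], width, height)
    return math.hypot(index_x - thumb_x, index_y - thumb_y)


# Stand-in hand with 21 landmarks; only the two fingertips matter here
points = [SimpleNamespace(x=0.0, y=0.0, z=0.0) for _ in range(21)]
points[THUMB_TIP] = SimpleNamespace(x=0.5, y=0.5, z=0.0)
points[INDEX_FINGER_TIP] = SimpleNamespace(x=0.75, y=0.25, z=0.0)
fake_hand = SimpleNamespace(landmark=points)

# 800x600 matches the frame size used in the capture loop
print(pinch_distance(fake_hand, 800, 600))  # 250.0
```

A threshold on this distance could serve as a basic "pinch" detector, although a suitable value depends on how far the hand is from the camera.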

Hand landmarks and their indices

Output: