Face detection concepts
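Face detection locates human faces in visual media such as digital images or video. When a face is detected, it has an associated position, size, and orientation, and it can be searched for landmarks such as the eyes and nose.
Here are some of the terms that we use regarding the face detection feature of ML Kit: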
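- Face tracking extends face detection to video sequences. Any face that appears in a video for any length of time can be tracked from frame to frame. This means that a face detected in consecutive video frames can be identified as being the same person. Note that this isn't a form of face recognition; face tracking only makes inferences based on the position and motion of the faces in a video sequence.
- A landmark is a point of interest within a face. The left eye, right eye, and base of the nose are all examples of landmarks. ML Kit can find landmarks on a detected face.
- A contour is a set of points that follow the shape of a facial feature. ML Kit can find the contours of a face.
- Classification determines whether a certain facial characteristic is present. For example, a face can be classified by whether its eyes are open or closed, or by whether it is smiling.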
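Face orientation
The following terms describe the angle at which a face is oriented with respect to the camera:
- Euler X: A face with a positive Euler X angle is facing upward.
- Euler Y: A face with a positive Euler Y angle is looking to the right of the camera; a face with a negative Euler Y angle is looking to the left.
- Euler Z: A face with a positive Euler Z angle is rotated counter-clockwise relative to the camera.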
ML Kit doesn't report the Euler X, Euler Y, or Euler Z angle of a detected face when LANDMARK_MODE_NONE, CONTOUR_MODE_ALL, CLASSIFICATION_MODE_NONE, and PERFORMANCE_MODE_FAST are set together.
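The sign conventions above can be sketched as a small helper. This is an illustrative Python sketch, not part of the ML Kit API: the function name `describe_orientation` is hypothetical, angles are assumed to be in degrees, and `None` is assumed to stand for an angle that the detector did not report. Only the directions stated above are described; the text does not say what a negative Euler X or Euler Z angle means, so the sketch stays silent on those.

```python
from typing import List, Optional


def describe_orientation(euler_x: Optional[float],
                         euler_y: Optional[float],
                         euler_z: Optional[float]) -> List[str]:
    """Turn Euler angles (degrees) into plain-language descriptions.

    None is an assumed stand-in for an angle that was not reported.
    """
    descriptions = []
    # Positive Euler X: the face is facing upward.
    if euler_x is not None and euler_x > 0:
        descriptions.append("facing upward")
    # Positive Euler Y: looking to the right of the camera; negative: left.
    if euler_y is not None and euler_y != 0:
        side = "right" if euler_y > 0 else "left"
        descriptions.append(f"looking to the {side} of the camera")
    # Positive Euler Z: rotated counter-clockwise relative to the camera.
    if euler_z is not None and euler_z > 0:
        descriptions.append("rotated counter-clockwise relative to the camera")
    return descriptions


if __name__ == "__main__":
    print(describe_orientation(10.0, -5.0, None))
```

Returning a list rather than a single string keeps the three axes independent, which matches how the angles are defined.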
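Landmarks
A landmark is a point of interest within a face. The left eye, right eye, and nose base are all examples of landmarks.
ML Kit detects faces without looking for landmarks.
Landmark detection is an optional step that is disabled by default.
The following table summarizes all of the landmarks that can be detected given the Euler Y angle of the associated face: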
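| Euler Y angle | Detectable landmarks |
|---|---|
| < -36 degrees | left eye, left mouth, left ear, nose base, left cheek |
| -36 degrees to -12 degrees | left mouth, nose base, bottom mouth, right eye, left eye, left cheek, left ear tip |
| -12 degrees to 12 degrees | right eye, left eye, nose base, left cheek, right cheek, left mouth, right mouth, bottom mouth |
| 12 degrees to 36 degrees | right mouth, nose base, bottom mouth, left eye, right eye, right cheek, right ear tip |
| > 36 degrees | right eye, right mouth, right ear, nose base, right cheek |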
Each detected landmark includes its associated position in the image.
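The landmark table can be expressed as a lookup, which may help when deciding which landmarks to expect for a given head pose. The following Python sketch is illustrative only: `detectable_landmarks` is a hypothetical helper, not an ML Kit function, and the handling of the exact boundary values (-36, -12, 12, and 36 degrees) is an assumption, because the table does not say which row they belong to.

```python
from typing import List


def detectable_landmarks(euler_y: float) -> List[str]:
    """Return the landmarks listed in the table for an Euler Y angle (degrees).

    Assumption: boundary values are assigned to the row nearer to frontal.
    """
    if euler_y < -36:
        return ["left eye", "left mouth", "left ear", "nose base", "left cheek"]
    if euler_y < -12:
        return ["left mouth", "nose base", "bottom mouth", "right eye",
                "left eye", "left cheek", "left ear tip"]
    if euler_y <= 12:
        return ["right eye", "left eye", "nose base", "left cheek",
                "right cheek", "left mouth", "right mouth", "bottom mouth"]
    if euler_y <= 36:
        return ["right mouth", "nose base", "bottom mouth", "left eye",
                "right eye", "right cheek", "right ear tip"]
    return ["right eye", "right mouth", "right ear", "nose base", "right cheek"]


if __name__ == "__main__":
    for angle in (-40.0, 0.0, 20.0):
        print(angle, detectable_landmarks(angle))
```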
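Contours
A contour is a set of points that represent the shape of a facial feature.
[Figure: how contour points map to a face]
Each feature contour that ML Kit detects is represented by a fixed number of points: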
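| Feature contour | Number of points |
|---|---|
| Face oval | 36 points |
| Left eyebrow (top) | 5 points |
| Left eyebrow (bottom) | 5 points |
| Right eyebrow (top) | 5 points |
| Right eyebrow (bottom) | 5 points |
| Left eye | 16 points |
| Right eye | 16 points |
| Upper lip (top) | 11 points |
| Upper lip (bottom) | 9 points |
| Lower lip (top) | 9 points |
| Lower lip (bottom) | 9 points |
| Nose bridge | 2 points |
| Nose bottom | 3 points |
| Left cheek (center) | 1 point |
| Right cheek (center) | 1 point |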
When you get all of a face's contours at once, you get an array of 133 points, which map to feature contours as follows:
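| Indexes | Feature contour |
|---|---|
| 0-35 | Face oval |
| 36-40 | Left eyebrow (top) |
| 41-45 | Left eyebrow (bottom) |
| 46-50 | Right eyebrow (top) |
| 51-55 | Right eyebrow (bottom) |
| 56-71 | Left eye |
| 72-87 | Right eye |
| 88-96 | Upper lip (bottom) |
| 97-105 | Lower lip (top) |
| 106-116 | Upper lip (top) |
| 117-125 | Lower lip (bottom) |
| 126, 127 | Nose bridge |
| 128-130 | Nose bottom (note that the center point is at index 128) |
| 131 | Left cheek (center) |
| 132 | Right cheek (center) |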
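The index table translates directly into slicing. The Python sketch below assumes that the 133 points arrive as a flat sequence in the order shown in the table; the names `CONTOUR_RANGES` and `split_contours` are hypothetical and chosen for illustration, and the points can be any objects (for example, `(x, y)` tuples).

```python
from typing import Dict, List, Sequence, Tuple

# Inclusive (first, last) index of each feature contour, copied from the table.
CONTOUR_RANGES: Dict[str, Tuple[int, int]] = {
    "face_oval": (0, 35),
    "left_eyebrow_top": (36, 40),
    "left_eyebrow_bottom": (41, 45),
    "right_eyebrow_top": (46, 50),
    "right_eyebrow_bottom": (51, 55),
    "left_eye": (56, 71),
    "right_eye": (72, 87),
    "upper_lip_bottom": (88, 96),
    "lower_lip_top": (97, 105),
    "upper_lip_top": (106, 116),
    "lower_lip_bottom": (117, 125),
    "nose_bridge": (126, 127),
    "nose_bottom": (128, 130),
    "left_cheek": (131, 131),
    "right_cheek": (132, 132),
}


def split_contours(points: Sequence) -> Dict[str, List]:
    """Split a flat sequence of 133 contour points into named contours."""
    if len(points) != 133:
        raise ValueError(f"expected 133 points, got {len(points)}")
    return {
        name: list(points[first:last + 1])  # +1 because the ranges are inclusive
        for name, (first, last) in CONTOUR_RANGES.items()
    }


if __name__ == "__main__":
    contours = split_contours(list(range(133)))
    for name, pts in contours.items():
        print(name, len(pts))
```

The per-contour lengths produced this way agree with the point counts listed earlier (for example, 36 for the face oval and 11 for the top of the upper lip), and they sum to 133.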
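Classification
Classification determines whether a certain facial characteristic is present.
ML Kit currently supports two classifications: eyes open and smiling.
A classification is a certainty value. It indicates the confidence that a facial characteristic is present. For example, a value of 0.7 or more for the smiling classification indicates that the person is likely smiling.
Both of these classifications rely on landmark detection.
Also note that the "eyes open" and "smiling" classifications only work for frontal faces, that is, faces with a small Euler Y angle (between -18 and 18 degrees).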
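As a rough illustration of how the certainty value and the frontal-face restriction combine, consider the Python sketch below. The helper `is_probably_smiling` is hypothetical. The 0.7 threshold and the 18-degree limit come from the text above; treating a missing probability or a non-frontal face as "unknown" (`None`) is an assumption made for this sketch.

```python
from typing import Optional


def is_probably_smiling(smiling_probability: Optional[float],
                        euler_y: float,
                        threshold: float = 0.7,
                        max_abs_euler_y: float = 18.0) -> Optional[bool]:
    """Interpret a smiling certainty value for a face.

    Returns None (unknown) when no probability is available or when the
    face is not frontal enough for the classification to apply.
    """
    if smiling_probability is None:
        return None
    if abs(euler_y) > max_abs_euler_y:
        return None
    return smiling_probability >= threshold


if __name__ == "__main__":
    print(is_probably_smiling(0.85, euler_y=5.0))
    print(is_probably_smiling(0.85, euler_y=30.0))
```

Returning `None` instead of `False` for non-frontal faces keeps "not smiling" distinct from "cannot tell", which matters if the result drives user-visible behavior.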
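Minimum face size
The minimum face size is the desired face size, expressed as the ratio of the width of the head to the width of the image. For example, a value of 0.1 means that the smallest face to search for is roughly 10% of the width of the image being searched.
The minimum face size is a trade-off between performance and accuracy: setting the minimum size smaller lets the detector find smaller faces, but detection takes longer; setting it larger might exclude smaller faces, but detection runs faster.
The minimum face size is not a hard limit; the detector may find faces slightly smaller than specified.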
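To make the ratio concrete, the following Python sketch converts a minimum face size into an approximate head width in pixels. The function `min_face_width_px` is a hypothetical helper, and the accepted range of 0 to 1 is an assumption based on the value being a ratio of widths.

```python
def min_face_width_px(min_face_size: float, image_width_px: int) -> float:
    """Approximate smallest head width, in pixels, that the detector targets.

    min_face_size is the ratio of head width to image width.
    """
    if not 0.0 < min_face_size <= 1.0:
        raise ValueError("min_face_size is assumed to be a ratio in (0, 1]")
    if image_width_px <= 0:
        raise ValueError("image_width_px must be positive")
    return min_face_size * image_width_px


if __name__ == "__main__":
    # With a ratio of 0.1 and a 640-pixel-wide image, the target is about 64 px.
    print(round(min_face_width_px(0.1, 640)))
```

Because the minimum face size is not a hard limit, the result is best read as a guideline rather than a cutoff.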
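Next steps
Use face detection in your iOS or Android app:
- iOS (/ml-kit/vision/face-detection/ios)
- Android (/ml-kit/vision/face-detection/android)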
Last updated 2025-08-29 UTC.