You can use ML Kit to label objects recognized in an image. The default model provided with ML Kit supports 400+ different labels.
Try it out
- Play around with the sample app to see an example usage of this API.
Before you begin
- Include the following ML Kit pods in your Podfile:
pod 'GoogleMLKit/ImageLabeling', '7.0.0'
- After you install or update your project's Pods, open your Xcode project using its
.xcworkspace
. ML Kit is supported in Xcode version 12.4 or greater.
Now you are ready to label images.
1. Prepare the input image
Create a VisionImage
object using a UIImage
or a
CMSampleBuffer
.
If you use a UIImage
, follow these steps:
- Create a
VisionImage
object with theUIImage
. Make sure to specify the correct.orientation
.Swift
let image = VisionImage(image: UIImage) visionImage.orientation = image.imageOrientation
Objective-C
MLKVisionImage *visionImage = [[MLKVisionImage alloc] initWithImage:image]; visionImage.orientation = image.imageOrientation;
If you use a
CMSampleBuffer
, follow these steps:-
Specify the orientation of the image data contained in the
CMSampleBuffer
.To get the image orientation:
Swift
func imageOrientation( deviceOrientation: UIDeviceOrientation, cameraPosition: AVCaptureDevice.Position ) -> UIImage.Orientation { switch deviceOrientation { case .portrait: return cameraPosition == .front ? .leftMirrored : .right case .landscapeLeft: return cameraPosition == .front ? .downMirrored : .up case .portraitUpsideDown: return cameraPosition == .front ? .rightMirrored : .left case .landscapeRight: return cameraPosition == .front ? .upMirrored : .down case .faceDown, .faceUp, .unknown: return .up } }
Objective-C
- (UIImageOrientation) imageOrientationFromDeviceOrientation:(UIDeviceOrientation)deviceOrientation cameraPosition:(AVCaptureDevicePosition)cameraPosition { switch (deviceOrientation) { case UIDeviceOrientationPortrait: return cameraPosition == AVCaptureDevicePositionFront ? UIImageOrientationLeftMirrored : UIImageOrientationRight; case UIDeviceOrientationLandscapeLeft: return cameraPosition == AVCaptureDevicePositionFront ? UIImageOrientationDownMirrored : UIImageOrientationUp; case UIDeviceOrientationPortraitUpsideDown: return cameraPosition == AVCaptureDevicePositionFront ? UIImageOrientationRightMirrored : UIImageOrientationLeft; case UIDeviceOrientationLandscapeRight: return cameraPosition == AVCaptureDevicePositionFront ? UIImageOrientationUpMirrored : UIImageOrientationDown; case UIDeviceOrientationUnknown: case UIDeviceOrientationFaceUp: case UIDeviceOrientationFaceDown: return UIImageOrientationUp; } }
- Create a
VisionImage
object using theCMSampleBuffer
object and orientation:Swift
let image = VisionImage(buffer: sampleBuffer) image.orientation = imageOrientation( deviceOrientation: UIDevice.current.orientation, cameraPosition: cameraPosition)
Objective-C
MLKVisionImage *image = [[MLKVisionImage alloc] initWithBuffer:sampleBuffer]; image.orientation = [self imageOrientationFromDeviceOrientation:UIDevice.currentDevice.orientation cameraPosition:cameraPosition];
2. Configure and run the image labeler
To label objects in an image, pass theVisionImage
object to theImageLabeler
'sprocessImage()
method.- First, get an instance of
ImageLabeler
.
Swift
let labeler = ImageLabeler.imageLabeler() // Or, to set the minimum confidence required: // let options = ImageLabelerOptions() // options.confidenceThreshold = 0.7 // let labeler = ImageLabeler.imageLabeler(options: options)
Objective-C
MLKImageLabeler *labeler = [MLKImageLabeler imageLabeler]; // Or, to set the minimum confidence required: // MLKImageLabelerOptions *options = // [[MLKImageLabelerOptions alloc] init]; // options.confidenceThreshold = 0.7; // MLKImageLabeler *labeler = // [MLKImageLabeler imageLabelerWithOptions:options];
- Then, pass the image to the
processImage()
method:
Swift
labeler.process(image) { labels, error in guard error == nil, let labels = labels else { return } // Task succeeded. // ... }
Objective-C
[labeler processImage:image completion:^(NSArray
*_Nullable labels, NSError *_Nullable error) { if (error != nil) { return; } // Task succeeded. // ... }]; 3. Get information about labeled objects
If image labeling succeeds, the completion handler receives an array of
ImageLabel
objects. EachImageLabel
object represents something that was labeled in the image. The base model supports 400+ different labels. You can get each label's text description, index among all labels supported by the model, and the confidence score of the match. For example:Swift
for label in labels { let labelText = label.text let confidence = label.confidence let index = label.index }
Objective-C
for (MLKImageLabel *label in labels) { NSString *labelText = label.text; float confidence = label.confidence; NSInteger index = label.index; }
Tips to improve real-time performance
If you want to label images in a real-time application, follow these guidelines to achieve the best framerates:
- For processing video frames, use the
results(in:)
synchronous API of the image labeler. Call this method from theAVCaptureVideoDataOutputSampleBufferDelegate
'scaptureOutput(_, didOutput:from:)
function to synchronously get results from the given video frame. KeepAVCaptureVideoDataOutput
'salwaysDiscardsLateVideoFrames
astrue
to throttle calls to the image labeler. If a new video frame becomes available while the image labeler is running, it will be dropped. - If you use the output of the image labeler to overlay graphics on the input image, first get the result from ML Kit, then render the image and overlay in a single step. By doing so, you render to the display surface only once for each processed input frame. See the updatePreviewOverlayViewWithLastFrame in the ML Kit quickstart sample for an example.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2024-12-21 UTC.
[null,null,["Last updated 2024-12-21 UTC."],[[["ML Kit's image labeling API lets you identify objects in images using a pre-trained model that recognizes over 400 labels."],["To use this API, you need to include the `GoogleMLKit/ImageLabeling` pod, create a `VisionImage` object from your image, and then process it with an `ImageLabeler` instance."],["Results are provided as an array of `ImageLabel` objects, each containing the label's text, confidence score, and index."],["For real-time applications, leverage the synchronous `results(in:)` API and manage video frame processing efficiently to maintain optimal frame rates."]]],[]] - First, get an instance of
-