Language identification

With ML Kit's on-device language identification API, you can determine the
language of a string of text.
Language identification can be useful when working with user-provided text,
which often doesn't come with any language information.
iOS
Android
Key capabilities
- Broad language support. Identifies over one hundred different languages. See the
  complete list.
- Romanized text support. Identifies Arabic, Bulgarian, Greek, Hindi, Japanese,
  Russian, and Chinese text in both native and romanized script.
Example results
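Simple language identification

| Input text | Identified language |
|---------------------------------------|-----------------------------|
| "My hovercraft is full of eels." | en (English) |
| "Dao shan xue hai" | zh-Latn (Latinized Chinese) |
| "ph'nglui mglw'nafh wgah'nagl fhtagn" | und (undetermined) |

Confidence distribution

| Input text | Candidate languages (confidence) |
|---------------------------|----------------------------------|
| "an amicable coup d'etat" | en (0.52), fr (0.44), ca (0.03) |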
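
The example results can be post-processed in application code. The following is a minimal Python sketch, not part of ML Kit: it assumes the confidence distribution has already been obtained and is held as a plain dictionary of language tag to confidence. The helper names `pick_language` and `split_tag`, and the 0.5 cutoff, are hypothetical choices made for illustration.

```python
from typing import Dict, Tuple

UNDETERMINED = "und"  # tag used in the example results for unidentified text


def pick_language(distribution: Dict[str, float], threshold: float = 0.5) -> str:
    """Return the most confident tag, or "und" if none reaches the threshold.

    The threshold value is an illustrative assumption, not a documented setting.
    """
    if not distribution:
        return UNDETERMINED
    tag, confidence = max(distribution.items(), key=lambda item: item[1])
    return tag if confidence >= threshold else UNDETERMINED


def split_tag(tag: str) -> Tuple[str, str]:
    """Split a tag such as "zh-Latn" into (language, script).

    The script part is an empty string when the tag has no script subtag.
    """
    language, _, script = tag.partition("-")
    return language, script


# Confidence distribution from the example results table.
coup = {"en": 0.52, "fr": 0.44, "ca": 0.03}

print(pick_language(coup))        # en
print(pick_language(coup, 0.6))   # und
print(split_tag("zh-Latn"))       # ('zh', 'Latn')
```

A stricter threshold trades coverage for certainty: with a cutoff of 0.6, the mixed English and French phrase is reported as undetermined rather than as English.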
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-08-29 UTC.