Document scanner with ML Kit on Android
Use the ML Kit document scanner API to easily add a document scanner feature to your app.
Feature | Details |
---|---|
SDK name | play-services-mlkit-document-scanner |
Implementation | The models, scanning logic and UI flow are dynamically downloaded by Google Play services. |
App size impact | ~300KB download size increase. |
Initialization time | Users might have to wait for the models, logic and UI flow to download before first use. |
Try it out
Play around with the sample app to see an example usage of this API.
Before you begin
In your project-level build.gradle file, make sure to include Google's Maven repository in both your buildscript and allprojects sections.
Add the dependency for the ML Kit document scanner library to your module's app-level Gradle file, which is usually app/build.gradle:
dependencies {
    // …
    implementation 'com.google.android.gms:play-services-mlkit-document-scanner:16.0.0-beta1'
}
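The repository requirement mentioned before the dependency block is not shown in code. As a minimal sketch, assuming the Groovy DSL and a project that still declares its repositories in the project-level build.gradle (newer project templates may declare them in settings.gradle instead), the relevant entries might look like this:

```groovy
// Project-level build.gradle (sketch; only the repository entries matter here).
buildscript {
    repositories {
        google()  // Google's Maven repository
    }
}

allprojects {
    repositories {
        google()  // Google's Maven repository
    }
}
```

Any other repositories or plugins your project already declares stay alongside these entries.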
Document Scanner configuration
The document scanner user flow (which includes a dedicated viewfinder screen and a preview screen) is provided by the SDK. The viewfinder and preview screens support the following customizable controls:
- importing from the photo gallery
- setting a limit to the number of pages scanned
- scanner mode (to control the feature sets in the flow)
You can retrieve both PDF and JPEG files for your scanned documents.
Instantiate GmsDocumentScannerOptions to configure the scanner options:
Kotlin
val options = GmsDocumentScannerOptions.Builder()
    .setGalleryImportAllowed(false)
    .setPageLimit(2)
    .setResultFormats(RESULT_FORMAT_JPEG, RESULT_FORMAT_PDF)
    .setScannerMode(SCANNER_MODE_FULL)
    .build()
Java
GmsDocumentScannerOptions options = new GmsDocumentScannerOptions.Builder()
    .setGalleryImportAllowed(false)
    .setPageLimit(2)
    .setResultFormats(RESULT_FORMAT_JPEG, RESULT_FORMAT_PDF)
    .setScannerMode(SCANNER_MODE_FULL)
    .build();
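The options above use SCANNER_MODE_FULL with a two-page limit. As a hedged sketch of a lighter configuration, the following assumes that GmsDocumentScannerOptions also exposes a SCANNER_MODE_BASE constant (it is not shown on this page) and that the class lives in the com.google.mlkit.vision.documentscanner package. The function name buildBasicScannerOptions and its maxPages parameter are hypothetical.

```kotlin
import com.google.mlkit.vision.documentscanner.GmsDocumentScannerOptions

// Hypothetical helper: builds options for a reduced scanning flow.
// SCANNER_MODE_BASE is an assumption; check the API reference for the
// scanner modes available in your SDK version.
fun buildBasicScannerOptions(maxPages: Int): GmsDocumentScannerOptions =
    GmsDocumentScannerOptions.Builder()
        // Allow users to import existing photos from the gallery.
        .setGalleryImportAllowed(true)
        // Cap the number of pages a single scan session can produce.
        .setPageLimit(maxPages)
        .setResultFormats(
            GmsDocumentScannerOptions.RESULT_FORMAT_JPEG,
            GmsDocumentScannerOptions.RESULT_FORMAT_PDF
        )
        // Assumed constant selecting a smaller feature set than SCANNER_MODE_FULL.
        .setScannerMode(GmsDocumentScannerOptions.SCANNER_MODE_BASE)
        .build()
```

A caller could then pass the result to GmsDocumentScanning.getClient, for example with buildBasicScannerOptions(5) for a five-page limit.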
Scan documents
After creating your GmsDocumentScannerOptions, get an instance of GmsDocumentScanner. You can then start the scanner activity using the Activity Result APIs introduced in AndroidX.
When the document scanning is complete, a GmsDocumentScanningResult object gives access to the number of pages scanned, the URIs of the page images in JPEG format, and the URI of the PDF, according to the formats requested via setResultFormats:
Kotlin
val scanner = GmsDocumentScanning.getClient(options)
val scannerLauncher =
    registerForActivityResult(StartIntentSenderForResult()) { result ->
        if (result.resultCode == RESULT_OK) {
            // Safe calls cover the case where the result or a format is absent.
            val scanningResult =
                GmsDocumentScanningResult.fromActivityResultIntent(result.data)
            scanningResult?.getPages()?.let { pages ->
                for (page in pages) {
                    // URI of the JPEG image for this page.
                    val imageUri = page.getImageUri()
                }
            }
            scanningResult?.getPdf()?.let { pdf ->
                val pdfUri = pdf.getUri()
                val pageCount = pdf.getPageCount()
            }
        }
    }

scanner.getStartScanIntent(activity)
    .addOnSuccessListener { intentSender ->
        scannerLauncher.launch(IntentSenderRequest.Builder(intentSender).build())
    }
    .addOnFailureListener { ... }
Java
GmsDocumentScanner scanner = GmsDocumentScanning.getClient(options);
ActivityResultLauncher<IntentSenderRequest> scannerLauncher =
    registerForActivityResult(
        new StartIntentSenderForResult(),
        result -> {
          if (result.getResultCode() == RESULT_OK) {
            GmsDocumentScanningResult scanningResult =
                GmsDocumentScanningResult.fromActivityResultIntent(result.getData());
            for (Page page : scanningResult.getPages()) {
              // URI of the JPEG image for this page.
              Uri imageUri = page.getImageUri();
            }

            Pdf pdf = scanningResult.getPdf();
            Uri pdfUri = pdf.getUri();
            int pageCount = pdf.getPageCount();
          }
        });

scanner.getStartScanIntent(activity)
    .addOnSuccessListener(intentSender ->
        scannerLauncher.launch(new IntentSenderRequest.Builder(intentSender).build()))
    .addOnFailureListener(...);
Tips to improve performance
Generating document files takes time and processing power, so request only the output formats (JPEG, PDF, or both) that you actually need.
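As an illustration of this tip, the sketch below requests a single output format. It assumes that setResultFormats accepts one format on its own and that GmsDocumentScannerOptions lives in the com.google.mlkit.vision.documentscanner package. The function name buildPdfOnlyOptions is hypothetical.

```kotlin
import com.google.mlkit.vision.documentscanner.GmsDocumentScannerOptions

// Hypothetical helper: requests only a PDF, so the scanner does not need to
// produce per-page JPEG files as well.
fun buildPdfOnlyOptions(): GmsDocumentScannerOptions =
    GmsDocumentScannerOptions.Builder()
        // Assumption: a single format is a valid argument here.
        .setResultFormats(GmsDocumentScannerOptions.RESULT_FORMAT_PDF)
        .build()
```

With a configuration like this, code that reads the scanning result should presumably expect the page images to be absent and rely on the PDF only, which is why null-safe access to each result format is a sensible default.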
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2024-10-29 UTC.