Document scanner with ML Kit on Android

Use the ML Kit document scanner API to easily add a document scanner feature to your app.

Feature	Details
SDK name	play-services-mlkit-document-scanner
Implementation	The models, scanning logic and UI flow are dynamically downloaded by Google Play services.
App size impact	~300KB download size increase.
Initialization time	Users might have to wait for the models, logic and UI flow to download before first use.

Try it out

Try the sample app to see an example of how this API is used.

Before you begin

In your project-level build.gradle file, make sure to include Google's Maven repository in both your buildscript and allprojects sections.
Add the dependency for the ML Kit document scanner library to your module's app-level Gradle file, which is usually app/build.gradle:

dependencies {
   // …
   implementation 'com.google.android.gms:play-services-mlkit-document-scanner:16.0.0'
}
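
The repository requirement in the first step can be sketched as follows. This is a minimal fragment for a project-level build.gradle, assuming the classic buildscript/allprojects layout described above; google() is Gradle's built-in shortcut for Google's Maven repository, and any other repositories your project already declares would stay alongside it.

```groovy
// Project-level build.gradle (sketch): only the repository entries are shown.
buildscript {
    repositories {
        google()  // Google's Maven repository
    }
}

allprojects {
    repositories {
        google()  // Google's Maven repository
    }
}
```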

Document Scanner configuration

The document scanner user flow, which includes a dedicated viewfinder screen and a preview screen, is provided by the SDK. The viewfinder and preview screens support the following customizable controls:

importing from the photo gallery
setting a limit to the number of pages scanned
scanner mode (to control the feature sets in the flow)

You can retrieve both PDF and JPEG files for your scanned documents.

Instantiate GmsDocumentScannerOptions to configure the scanner options:

Kotlin

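import com.google.mlkit.vision.documentscanner.GmsDocumentScannerOptions
import com.google.mlkit.vision.documentscanner.GmsDocumentScannerOptions.RESULT_FORMAT_JPEG
import com.google.mlkit.vision.documentscanner.GmsDocumentScannerOptions.RESULT_FORMAT_PDF
import com.google.mlkit.vision.documentscanner.GmsDocumentScannerOptions.SCANNER_MODE_FULL

val options = GmsDocumentScannerOptions.Builder()
    .setGalleryImportAllowed(false)  // Do not offer import from the photo gallery.
    .setPageLimit(2)                 // Allow at most two pages per scan.
    .setResultFormats(RESULT_FORMAT_JPEG, RESULT_FORMAT_PDF)  // Request both outputs.
    .setScannerMode(SCANNER_MODE_FULL)  // Enable the full feature set.
    .build()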

Java

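import com.google.mlkit.vision.documentscanner.GmsDocumentScannerOptions;
import static com.google.mlkit.vision.documentscanner.GmsDocumentScannerOptions.RESULT_FORMAT_JPEG;
import static com.google.mlkit.vision.documentscanner.GmsDocumentScannerOptions.RESULT_FORMAT_PDF;
import static com.google.mlkit.vision.documentscanner.GmsDocumentScannerOptions.SCANNER_MODE_FULL;

GmsDocumentScannerOptions options = new GmsDocumentScannerOptions.Builder()
    .setGalleryImportAllowed(false)  // Do not offer import from the photo gallery.
    .setPageLimit(2)                 // Allow at most two pages per scan.
    .setResultFormats(RESULT_FORMAT_JPEG, RESULT_FORMAT_PDF)  // Request both outputs.
    .setScannerMode(SCANNER_MODE_FULL)  // Enable the full feature set.
    .build();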

Scan documents

After creating your GmsDocumentScannerOptions, get an instance of GmsDocumentScanner. You can then start the scanner activity using the Activity Result APIs introduced in AndroidX.

When scanning is complete, a GmsDocumentScanningResult object gives access to the number of pages scanned, the URIs of the page images in JPEG format, and the PDF, according to the formats requested through setResultFormats:

Kotlin

val scanner = GmsDocumentScanning.getClient(options)
val scannerLauncher = registerForActivityResult(StartIntentSenderForResult()) { result ->
  if (result.resultCode == RESULT_OK) {
    // Named scanResult so that it does not shadow the lambda parameter.
    val scanResult =
      GmsDocumentScanningResult.fromActivityResultIntent(result.data)
    // getPages() may be null, for example when JPEG output was not requested.
    scanResult?.getPages()?.let { pages ->
      for (page in pages) {
        val imageUri = page.getImageUri()
      }
    }
    // getPdf() may be null, for example when PDF output was not requested.
    scanResult?.getPdf()?.let { pdf ->
      val pdfUri = pdf.getUri()
      val pageCount = pdf.getPageCount()
    }
  }
}

scanner.getStartScanIntent(activity)
  .addOnSuccessListener { intentSender ->
    scannerLauncher.launch(IntentSenderRequest.Builder(intentSender).build())
  }
  .addOnFailureListener {
    ...
  }

Java

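GmsDocumentScanner scanner = GmsDocumentScanning.getClient(options);
ActivityResultLauncher<IntentSenderRequest> scannerLauncher =
  registerForActivityResult(
    new StartIntentSenderForResult(),
    result -> {
      if (result.getResultCode() == RESULT_OK) {
        // Named scanResult because a local cannot reuse the lambda parameter name.
        GmsDocumentScanningResult scanResult =
            GmsDocumentScanningResult.fromActivityResultIntent(result.getData());
        if (scanResult == null) {
          return;
        }
        // getPages() may be null, for example when JPEG output was not requested.
        if (scanResult.getPages() != null) {
          for (Page page : scanResult.getPages()) {
            Uri imageUri = page.getImageUri();
          }
        }
        // getPdf() may be null, for example when PDF output was not requested.
        Pdf pdf = scanResult.getPdf();
        if (pdf != null) {
          Uri pdfUri = pdf.getUri();
          int pageCount = pdf.getPageCount();
        }
      }
    });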

scanner.getStartScanIntent(activity)
  .addOnSuccessListener(intentSender ->
    scannerLauncher.launch(new IntentSenderRequest.Builder(intentSender).build()))
  .addOnFailureListener(...);

Tips to improve performance

Generating document files takes time and processing power, so request only the output formats (JPEG, PDF, or both) that you actually need.
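
One way to apply this tip is to derive the requested formats from what the app will actually consume, instead of always asking for both. The following is a minimal sketch in plain Java and is not part of the SDK: the ResultFormats class, its select method, and the two string constants are hypothetical stand-ins so that the logic can run outside Android. In an app, the selected values would be the SDK's own RESULT_FORMAT_JPEG and RESULT_FORMAT_PDF constants, supplied to setResultFormats.

```java
import java.util.ArrayList;
import java.util.List;

public class ResultFormats {
    // Hypothetical stand-ins for the SDK constants, used here only as labels.
    static final String JPEG = "RESULT_FORMAT_JPEG";
    static final String PDF = "RESULT_FORMAT_PDF";

    /**
     * Returns only the formats the caller will use.
     * Rejects the case where nothing is requested, since a scan
     * without any output format would produce no usable result.
     */
    public static List<String> select(boolean needsPageImages, boolean needsPdf) {
        if (!needsPageImages && !needsPdf) {
            throw new IllegalArgumentException("Request at least one output format");
        }
        List<String> formats = new ArrayList<>();
        if (needsPageImages) {
            formats.add(JPEG);
        }
        if (needsPdf) {
            formats.add(PDF);
        }
        return formats;
    }

    public static void main(String[] args) {
        // An app that only uploads the finished document needs the PDF alone.
        System.out.println(select(false, true));
    }
}
```

Keeping this decision in one place makes it harder for a screen that only shares a PDF to also pay for per-page JPEG generation.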