Typing card details into an app by hand can be a pain for users, but thanks to CameraX and MLKit we can add a scanning feature to our app with minimal effort.


CameraX

CameraX is a Jetpack library to use camera APIs with device compatibility issues already handled for us and with lifecycle support.

The library is based on use cases, and we are going to use two of them: Preview and Image Capture.


Library Setup

For this post, we are going to use the following dependencies:

def camerax_version = '1.0.0-beta07'
implementation "androidx.camera:camera-core:$camerax_version"
implementation "androidx.camera:camera-camera2:$camerax_version"
implementation "androidx.camera:camera-lifecycle:$camerax_version"
implementation 'androidx.camera:camera-view:1.0.0-alpha14'


Preview

This use case will allow us to display on-screen what the camera is capturing in real-time.

To do so, we need to add androidx.camera.view.PreviewView to our layout:

<androidx.camera.view.PreviewView
    android:id="@+id/previewView"
    android:layout_width="match_parent"
    android:layout_height="400dp"
    android:adjustViewBounds="true" />

Then in code, we can build the Preview use case as follows:

private fun buildPreview(): Preview = Preview.Builder()
    .build()
    .apply {
        setSurfaceProvider(previewView.createSurfaceProvider())
    }

To run our use cases we need a CameraProvider and a CameraSelector. I’ve created a couple of extensions using Kotlin coroutines to get a CameraProvider:

suspend fun Context.getCameraProvider(): ProcessCameraProvider =
    suspendCoroutine { continuation ->
        ProcessCameraProvider.getInstance(this).apply {
            addListener(Runnable {
                continuation.resume(get())
            }, executor)
        }
    }

val Context.executor: Executor
    get() = ContextCompat.getMainExecutor(this)

For the CameraSelector, this is all we need:

private fun buildCameraSelector(): CameraSelector = CameraSelector.Builder()
    .requireLensFacing(CameraSelector.LENS_FACING_BACK)
    .build()

We can now bind our use cases in this way:

override fun onCreate(savedInstanceState: Bundle?) {
    super.onCreate(savedInstanceState)
    setContentView(R.layout.activity_main)
    lifecycle.coroutineScope.launchWhenResumed {
        bindUseCases(getCameraProvider())
    }
}

private fun bindUseCases(cameraProvider: ProcessCameraProvider) {
    val preview = buildPreview()
    val cameraSelector = buildCameraSelector()
    cameraProvider.bindToLifecycle(this, cameraSelector, preview)
}


Image Capture

This is another CameraX use case, and the setup is pretty similar to the previous one.

We can create the use case in this way:

private fun buildTakePicture(): ImageCapture = ImageCapture.Builder()
    .setTargetRotation(previewView.display.rotation)
    .setCaptureMode(ImageCapture.CAPTURE_MODE_MINIMIZE_LATENCY)
    .build()

It’s time to update our bindUseCases method to use this new use case as well. Since CameraX exposes the result through a callback and we would like to use Kotlin coroutines, let’s first add a convenient extension method:

suspend fun ImageCapture.takePicture(executor: Executor): ImageProxy {
    return suspendCoroutine { continuation ->
        takePicture(executor, object : ImageCapture.OnImageCapturedCallback() {
            override fun onCaptureSuccess(image: ImageProxy) {
                continuation.resume(image)
                super.onCaptureSuccess(image)
            }

            override fun onError(exception: ImageCaptureException) {
                continuation.resumeWithException(exception)
                super.onError(exception)
            }
        })
    }
}

Now we can update the bindUseCases function as follows:

private fun bindUseCases(cameraProvider: ProcessCameraProvider) {
    val preview = buildPreview()
    val takePicture = buildTakePicture()
    val cameraSelector = buildCameraSelector()
    cameraProvider.bindToLifecycle(this, cameraSelector, preview, takePicture)
    button.setOnClickListener {
        lifecycle.coroutineScope.launchWhenResumed {
            val imageProxy = takePicture.takePicture(executor)
            //TODO do something with the image
        }
    }
}


MLKit

Now that we have our image, it is time to extract data from it. For this purpose, we are going to use MLKit Text Recognition.


Library Setup

We need to add this dependency:

implementation 'com.google.android.gms:play-services-mlkit-text-recognition:16.1.0'

and update our Manifest, inside the application element, so that the ML model is downloaded to the device after our app is installed:

<meta-data
    android:name="com.google.mlkit.vision.DEPENDENCIES"
    android:value="ocr" />

Let’s create an ExtractDataUseCase to encapsulate the MLKit integration:

class ExtractDataUseCase(private val textRecognizer: TextRecognizer) {
    suspend operator fun invoke(image: Image, rotationDegrees: Int): CardDetails {
        val imageInput = InputImage.fromMediaImage(image, rotationDegrees)
        val text = textRecognizer.process(imageInput).await().text
        return Extractor.extractData(text) //We are going to see this shortly
    }
}

In this use case, we are accepting an instance of TextRecognizer that could be retrieved with TextRecognition.getClient().

To extract data from our image we first need to transform it into something MLKit can understand. Luckily for us, there is the InputImage.fromMediaImage(image, rotationDegrees) factory method designed just for that.

Now that we have an image MLKit can understand we can pass it to our textRecognizer and await the text result.

Once again, we are using Kotlin coroutines, so we can use the await() extension (available in the kotlinx-coroutines-play-services library) as a bridge between the Task returned by textRecognizer and our suspend method.
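
To make the idea of such a bridge concrete, here is a minimal sketch that uses only the Kotlin standard library. FakeTask, addListeners and runSketch are hypothetical names invented for this illustration; they stand in for a callback-based result and are not part of MLKit or Play Services.

```kotlin
import kotlin.coroutines.Continuation
import kotlin.coroutines.EmptyCoroutineContext
import kotlin.coroutines.resume
import kotlin.coroutines.resumeWithException
import kotlin.coroutines.startCoroutine
import kotlin.coroutines.suspendCoroutine

// Hypothetical stand-in for a callback-based result (assumption, not a real API).
class FakeTask<T>(private val compute: () -> T) {
    fun addListeners(onSuccess: (T) -> Unit, onFailure: (Throwable) -> Unit) {
        val result = try {
            compute()
        } catch (e: Exception) {
            onFailure(e)
            return
        }
        onSuccess(result)
    }
}

// The bridge: suspend until one of the callbacks fires, then resume the coroutine.
suspend fun <T> FakeTask<T>.await(): T = suspendCoroutine { continuation ->
    addListeners(
        onSuccess = { continuation.resume(it) },
        onFailure = { continuation.resumeWithException(it) }
    )
}

// Tiny runner so the sketch needs no extra library.
// It only works because FakeTask completes synchronously.
fun <T> runSketch(block: suspend () -> T): T {
    var outcome: Result<T>? = null
    block.startCoroutine(Continuation(EmptyCoroutineContext) { outcome = it })
    return (outcome ?: error("coroutine did not complete synchronously")).getOrThrow()
}

fun main() {
    val text = runSketch { FakeTask { "1234 5678 9012 3456" }.await() }
    println(text) // prints 1234 5678 9012 3456
}
```

In the app itself none of this is needed: the real await() is called directly on the Task, exactly as in ExtractDataUseCase.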


Data Extraction

Thanks to CameraX we got the image, and thanks to MLKit we were able to extract the text on it. Now it is our turn to apply some minimal logic and turn the raw text into useful information.

As a first step let’s define what we can extract from the image, and create a data class:

data class CardDetails(
    val owner: String?,
    val number: String?,
    val expirationMonth: String?,
    val expirationYear: String?
)

All fields are optional because I have several (debit) cards, each with a different layout: some have the owner on the same side as the number, some do not.

Same for the expiration date, so ¯_(ツ)_/¯.

Now we can encapsulate our data extraction logic inside a Kotlin object:

object Extractor {
    fun extractData(input: String): CardDetails {
        val lines = input.split("\n")
        val owner = extractOwner(lines)
        val number = extractNumber(lines)
        val (month, year) = extractExpiration(lines)
        return CardDetails(
            owner = owner,
            number = number,
            expirationMonth = month,
            expirationYear = year
        )
    }
}

Let’s extract some data:

private fun extractOwner(lines: List<String>): String? {
    return lines
        .filter { it.contains(" ") }
        .filter { line -> line.none { char -> char.isDigit() } }
        // maxByOrNull (Kotlin 1.4+) returns null when no line matches
        .maxByOrNull { it.length }
}

Here the rationale is to try to get the string that:

  • contains at least a space char (between name and surname)
  • contains no digits (I think you can’t have them in a name)

If we have more than one match, choose the longest one.
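
To see the owner heuristic at work, here is a small self-contained sketch. The sample lines are made up for illustration; real OCR output will vary from card to card.

```kotlin
// Hypothetical OCR output, one entry per recognized line (assumption for this sketch).
val sampleLines = listOf(
    "MY BANK",
    "1234 5678 9012 3456",
    "VALID THRU 08/20",
    "MARIO ROSSI"
)

fun extractOwner(lines: List<String>): String? =
    lines
        .filter { it.contains(" ") }                     // at least one space
        .filter { line -> line.none { it.isDigit() } }   // no digits
        .maxByOrNull { it.length }                       // longest candidate wins

fun main() {
    println(extractOwner(sampleLines)) // prints MARIO ROSSI
}
```

Note the limit of the rule: any longer digit-free line with a space, such as a long bank name, would be picked instead of the owner.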

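private fun extractNumber(lines: List<String>): String? {
    return lines.firstOrNull { line ->
        // Split into space-separated groups, ignoring empty pieces from repeated spaces
        val subNumbers = line.split(" ").filter { it.isNotEmpty() }
        // Require more than one group (so there is a space) made of digits only
        subNumbers.size > 1 && subNumbers.all { group -> group.all { it.isDigit() } }
    }
}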

For the number, we are picking the first line that:

  • has at least a space (between numbers)
  • has all the chars as digits
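
The two rules above can be sketched as a standalone function. This variant checks both conditions explicitly and skips empty lines; the sample input is invented for illustration.

```kotlin
fun extractNumber(lines: List<String>): String? =
    lines.firstOrNull { line ->
        // Space-separated groups, ignoring empty pieces from repeated spaces
        val groups = line.split(" ").filter { it.isNotEmpty() }
        // More than one group means there is a space; every group must be digits only
        groups.size > 1 && groups.all { group -> group.all { it.isDigit() } }
    }

fun main() {
    // Hypothetical OCR lines (assumption for this sketch)
    val lines = listOf("MY BANK", "", "08/20", "1234 5678 9012 3456", "MARIO ROSSI")
    println(extractNumber(lines)) // prints 1234 5678 9012 3456
}
```
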

private fun extractExpirationLine(lines: List<String>) =
    lines.flatMap { it.split(" ") }
        .firstOrNull { (it.length == 5 || it.length == 7) && it[2] == '/' }

The expiration could be something like 08/20 or 08/2020, which is why I’m choosing the first space-separated piece of text with:

  • 5 chars (like 08/20) or 7 chars (like 08/2020)
  • ’/’ as third char

Now that we have found the expiration line, we need to separate the month from the year:

private fun extractExpiration(lines: List<String>): Pair<String?, String?> {
    val expirationLine = extractExpirationLine(lines)
    val month = expirationLine?.substring(startIndex = 0, endIndex = 2)
    val year = expirationLine?.substring(startIndex = 3)
    return Pair(month, year)
}
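
As a quick check of the expiration logic, here is a self-contained sketch that runs both steps on invented sample lines. The helper name extractExpirationToken is used only in this sketch.

```kotlin
// Finds the first space-separated token shaped like MM/YY or MM/YYYY.
fun extractExpirationToken(lines: List<String>): String? =
    lines.flatMap { it.split(" ") }
        .firstOrNull { (it.length == 5 || it.length == 7) && it[2] == '/' }

// Splits the token into month (first two chars) and year (everything after the slash).
fun extractExpiration(lines: List<String>): Pair<String?, String?> {
    val token = extractExpirationToken(lines)
    return Pair(token?.substring(0, 2), token?.substring(3))
}

fun main() {
    // Hypothetical OCR lines (assumption for this sketch)
    val (month, year) = extractExpiration(listOf("VALID THRU 08/20", "MARIO ROSSI"))
    println("$month/$year") // prints 08/20
}
```

The check only looks at length and the position of the slash, so it does not verify that the month and year are actually numeric.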


Final touches

We can now update the activity to use our ExtractDataUseCase.

Let’s create it:

private val useCase = ExtractDataUseCase(TextRecognition.getClient())

And then use it:

private fun bindUseCases(cameraProvider: ProcessCameraProvider) {
    val preview = buildPreview()
    val takePicture = buildTakePicture()
    val cameraSelector = buildCameraSelector()
    cameraProvider.bindToLifecycle(this, cameraSelector, preview, takePicture)
    button.setOnClickListener {
        lifecycle.coroutineScope.launchWhenResumed {
            val imageProxy = takePicture.takePicture(executor)
            val cardDetails = useCase(imageProxy.image!!, imageProxy.imageInfo.rotationDegrees)
            bindCardDetails(cardDetails)
        }
    }
}

We can now finally bind our result to UI:

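private fun bindCardDetails(card: CardDetails) {
    owner.text = card.owner
    number.text = card.number
    // Join only the parts that were found, so a missing date does not show as "null/null"
    date.text = listOfNotNull(card.expirationMonth, card.expirationYear).joinToString("/")
}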

That’s all folks! You can find the complete source code at https://github.com/dcampogiani/CreditCardScanner


Daniele Campogiani

Software Engineer