Entering card details manually into an app can be a pain for users, but thanks to CameraX and MLKit we can add a scanning feature to our app with minimal effort.
CameraX
CameraX is a Jetpack library for using the camera APIs, with device compatibility issues already handled for us and with lifecycle support built in.
The library is based on use cases, and we are going to use two of them: Preview and Image Capture.
Library Setup
For this post, we are going to use the following dependencies:
def camerax_version = '1.0.0-beta07'
implementation "androidx.camera:camera-core:$camerax_version"
implementation "androidx.camera:camera-camera2:$camerax_version"
implementation "androidx.camera:camera-lifecycle:$camerax_version"
implementation 'androidx.camera:camera-view:1.0.0-alpha14'
Preview
This use case will allow us to display on-screen what the camera is capturing in real-time.
To do so, we need to add androidx.camera.view.PreviewView to our layout:
<androidx.camera.view.PreviewView
    android:id="@+id/previewView"
    android:layout_width="match_parent"
    android:layout_height="400dp"
    android:adjustViewBounds="true" />
Then in code, we can build the Preview use case as follows:
private fun buildPreview(): Preview = Preview.Builder()
    .build()
    .apply {
        setSurfaceProvider(previewView.createSurfaceProvider())
    }
To run our use cases we need a CameraProvider and a CameraSelector. I’ve created an extension using Kotlin Coroutines to get a CameraProvider:
suspend fun Context.getCameraProvider(): ProcessCameraProvider =
    suspendCoroutine { continuation ->
        ProcessCameraProvider.getInstance(this).apply {
            addListener(Runnable {
                continuation.resume(get())
            }, executor)
        }
    }

val Context.executor: Executor
    get() = ContextCompat.getMainExecutor(this)
For the CameraSelector, this is all we need:
private fun buildCameraSelector(): CameraSelector = CameraSelector.Builder()
    .requireLensFacing(CameraSelector.LENS_FACING_BACK)
    .build()
We can now bind our use cases in this way:
override fun onCreate(savedInstanceState: Bundle?) {
    super.onCreate(savedInstanceState)
    setContentView(R.layout.activity_main)
    lifecycle.coroutineScope.launchWhenResumed {
        bindUseCases(getCameraProvider())
    }
}

private fun bindUseCases(cameraProvider: ProcessCameraProvider) {
    val preview = buildPreview()
    val cameraSelector = buildCameraSelector()
    cameraProvider.bindToLifecycle(this, cameraSelector, preview)
}
Image Capture
This is another CameraX use case, and the setup is pretty similar to the previous one.
We can create the use case in this way:
private fun buildTakePicture(): ImageCapture = ImageCapture.Builder()
    .setTargetRotation(previewView.display.rotation)
    .setCaptureMode(ImageCapture.CAPTURE_MODE_MINIMIZE_LATENCY)
    .build()
It’s time to update our bindUseCases method to use this new use case as well. CameraX exposes a callback-based API here, and since we would like to use Kotlin coroutines, let’s first add a convenient extension method:
suspend fun ImageCapture.takePicture(executor: Executor): ImageProxy {
    return suspendCoroutine { continuation ->
        takePicture(executor, object : ImageCapture.OnImageCapturedCallback() {
            override fun onCaptureSuccess(image: ImageProxy) {
                // The caller now owns the ImageProxy and is responsible for closing it.
                continuation.resume(image)
            }

            override fun onError(exception: ImageCaptureException) {
                continuation.resumeWithException(exception)
            }
        })
    }
}
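The callback-to-coroutine bridge used here can be tried without a device. The following is a minimal sketch, not CameraX code: FakeCamera, CaptureCallback and runSynchronously are hypothetical names invented for this illustration, and the fake camera calls back synchronously so that the example runs with the Kotlin standard library alone.

```kotlin
import kotlin.coroutines.Continuation
import kotlin.coroutines.EmptyCoroutineContext
import kotlin.coroutines.resume
import kotlin.coroutines.resumeWithException
import kotlin.coroutines.startCoroutine
import kotlin.coroutines.suspendCoroutine

// Hypothetical callback interface standing in for ImageCapture.OnImageCapturedCallback.
interface CaptureCallback {
    fun onSuccess(image: String)
    fun onError(error: Exception)
}

// Hypothetical camera that invokes the callback immediately.
class FakeCamera(private val shouldFail: Boolean) {
    fun takePicture(callback: CaptureCallback) {
        if (shouldFail) {
            callback.onError(IllegalStateException("capture failed"))
        } else {
            callback.onSuccess("image-bytes")
        }
    }
}

// Suspend wrapper: success resumes the coroutine, failure resumes it with the exception.
suspend fun FakeCamera.takePicture(): String = suspendCoroutine { continuation ->
    takePicture(object : CaptureCallback {
        override fun onSuccess(image: String) = continuation.resume(image)
        override fun onError(error: Exception) = continuation.resumeWithException(error)
    })
}

// Helper for this sketch only: starts the suspend block and returns its result.
// It relies on the block completing synchronously, which holds for FakeCamera.
fun <T> runSynchronously(block: suspend () -> T): T {
    var outcome: Result<T>? = null
    block.startCoroutine(Continuation<T>(EmptyCoroutineContext) { outcome = it })
    val result = outcome ?: error("coroutine did not complete synchronously")
    return result.getOrThrow()
}
```

Calling runSynchronously { FakeCamera(shouldFail = false).takePicture() } returns the fake image, while the failing camera makes the same call throw, which is the behaviour we want from the real extension inside a coroutine.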
Now we can update the bindUseCases function as follows:
private fun bindUseCases(cameraProvider: ProcessCameraProvider) {
    val preview = buildPreview()
    val takePicture = buildTakePicture()
    val cameraSelector = buildCameraSelector()
    cameraProvider.bindToLifecycle(this, cameraSelector, preview, takePicture)
    button.setOnClickListener {
        lifecycle.coroutineScope.launchWhenResumed {
            val imageProxy = takePicture.takePicture(executor)
            //TODO do something with the image
        }
    }
}
MLKit
Now that we have our image, it is time to extract data from it. For this purpose, we are going to use MLKit Text Recognition.
Library Setup
We need to add this dependency:
implementation 'com.google.android.gms:play-services-mlkit-text-recognition:16.1.0'
and update our Manifest to download the ML model to the device after our app is installed:
<meta-data
    android:name="com.google.mlkit.vision.DEPENDENCIES"
    android:value="ocr" />
Let’s create an ExtractDataUseCase to encapsulate the MLKit integration:
class ExtractDataUseCase(private val textRecognizer: TextRecognizer) {
    suspend operator fun invoke(image: Image, rotationDegrees: Int): CardDetails {
        val imageInput = InputImage.fromMediaImage(image, rotationDegrees)
        val text = textRecognizer.process(imageInput).await().text
        return Extractor.extractData(text) //We are going to see this shortly
    }
}
In this use case, we accept an instance of TextRecognizer, which can be retrieved with TextRecognition.getClient().
To extract data from our image we first need to transform it into something MLKit can understand. Luckily for us, the InputImage.fromMediaImage(image, rotationDegrees) factory method is designed just for that.
Now that we have an image MLKit can understand, we can pass it to our textRecognizer and await the text result.
Once again we are using Kotlin coroutines, so we can use the await() extension from the kotlinx-coroutines-play-services library as a bridge between the Task returned by textRecognizer and our suspend method.
Data Extraction
Thanks to CameraX we got the image, and thanks to MLKit we were able to extract the text from it. Now it is our turn to add some minimal logic and go from raw data to useful information.
As a first step let’s define what we can extract from the image, and create a data class:
data class CardDetails(
    val owner: String?,
    val number: String?,
    val expirationMonth: String?,
    val expirationYear: String?
)
All fields are optional because I have several (debit) cards, each with a different layout: some have the owner on the same side as the number, some don’t.
The same goes for the expiration date, so ¯\_(ツ)_/¯.
Now we can encapsulate our data extraction logic inside a Kotlin object:
object Extractor {
    fun extractData(input: String): CardDetails {
        val lines = input.split("\n")
        val owner = extractOwner(lines)
        val number = extractNumber(lines)
        val (month, year) = extractExpiration(lines)
        return CardDetails(
            owner = owner,
            number = number,
            expirationMonth = month,
            expirationYear = year
        )
    }
}
Let’s extract some data:
private fun extractOwner(lines: List<String>): String? {
    return lines
        .filter { it.contains(" ") }
        .filter { line -> line.asIterable().none { char -> char.isDigit() } }
        // maxByOrNull (Kotlin 1.4+) returns null when no line qualifies.
        .maxByOrNull { it.length }
}
Here the rationale is to pick the string that:
- contains at least one space (between name and surname)
- contains no digits (I think you can’t have them in a name)
If more than one line matches, we choose the longest one.
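To see the owner heuristic at work, here is a minimal standalone sketch. The function name pickOwner and the sample lines are hypothetical, made up for this illustration. Real OCR output will vary from card to card.

```kotlin
// Same rules as described above: keep lines with a space and no digits,
// then take the longest one (null when nothing qualifies).
fun pickOwner(lines: List<String>): String? =
    lines
        .filter { it.contains(" ") }
        .filter { line -> line.none { it.isDigit() } }
        .maxByOrNull { it.length }

// Hypothetical OCR output, one recognized line per entry.
val sampleOwnerLines = listOf(
    "DEBIT",
    "4000 1234 5678 9010",
    "VALID THRU 08/20",
    "MARIO ROSSI"
)
```

With these lines, pickOwner(sampleOwnerLines) keeps only "MARIO ROSSI": "DEBIT" has no space and the other two lines contain digits. Note that a printed label such as "GOOD THRU" on its own line would also pass both filters, so in that case only the longest-line rule decides the result.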
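private fun extractNumber(lines: List<String>): String? {
    return lines.firstOrNull { line ->
        val subNumbers = line.split(" ").filter { it.isNotEmpty() }
        // Require at least two space-separated groups, each made only of digits.
        // (split never returns an empty list, so isNotEmpty() would also accept blank lines.)
        subNumbers.size > 1 && subNumbers.all { group -> group.all { it.isDigit() } }
    }
}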
For the number, we are picking the first line that:
- has at least one space (between groups of digits)
- has only digits as its other characters
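The two rules for the number can be sketched as a standalone function. This is an illustrative version, with the hypothetical name pickNumber, written so that a line needs at least two groups of digits. Blank lines and single tokens are therefore rejected.

```kotlin
// First line made of two or more space-separated groups that contain only digits.
fun pickNumber(lines: List<String>): String? =
    lines.firstOrNull { line ->
        val groups = line.split(" ").filter { it.isNotEmpty() }
        groups.size > 1 && groups.all { group -> group.all { it.isDigit() } }
    }

// Hypothetical OCR output, including a blank line and a date that must be skipped.
val sampleNumberLines = listOf(
    "",
    "MARIO ROSSI",
    "08/20",
    "4000 1234 5678 9010"
)
```

Here pickNumber(sampleNumberLines) skips the blank line, the name and the date, and returns the line with the four digit groups. If the number is needed without separators, the caller could strip the spaces afterwards with replace(" ", "").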
private fun extractExpirationLine(lines: List<String>) =
    lines.flatMap { it.split(" ") }
        .firstOrNull { (it.length == 5 || it.length == 7) && it[2] == '/' }
The expiration could be something like 08/20 or 08/2020, which is why I’m choosing the first space-separated string with:
- 5 chars (like 08/20) or 7 chars (like 08/2020)
- ’/’ as the third char
Now that we have found the expiration string, we need to separate the month from the year:
private fun extractExpiration(lines: List<String>): Pair<String?, String?> {
    val expirationLine = extractExpirationLine(lines)
    val month = expirationLine?.substring(startIndex = 0, endIndex = 2)
    val year = expirationLine?.substring(startIndex = 3)
    return Pair(month, year)
}
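The two expiration steps can be exercised together in a small standalone sketch. The names findExpirationToken and splitExpiration are hypothetical and only mirror the logic described above.

```kotlin
// First space-separated token that is 5 or 7 chars long with '/' as its third char.
fun findExpirationToken(lines: List<String>): String? =
    lines.flatMap { it.split(" ") }
        .firstOrNull { (it.length == 5 || it.length == 7) && it[2] == '/' }

// Month is the two chars before the slash, year is everything after it.
// A null token yields a pair of nulls.
fun splitExpiration(token: String?): Pair<String?, String?> =
    Pair(token?.substring(0, 2), token?.substring(3))
```

For a line like "VALID THRU 08/20", the first function returns "08/20" and the second splits it into "08" and "20". The check only looks at length and the position of the slash, so a token such as "ab/cd" would pass too. A stricter version could also require digits on both sides.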
Final touches
We can now update the activity to use our ExtractDataUseCase.
Let’s create it:
private val useCase = ExtractDataUseCase(TextRecognition.getClient())
And then use it:
private fun bindUseCases(cameraProvider: ProcessCameraProvider) {
    val preview = buildPreview()
    val takePicture = buildTakePicture()
    val cameraSelector = buildCameraSelector()
    cameraProvider.bindToLifecycle(this, cameraSelector, preview, takePicture)
    button.setOnClickListener {
        lifecycle.coroutineScope.launchWhenResumed {
            val imageProxy = takePicture.takePicture(executor)
            try {
                val cardDetails = useCase(imageProxy.image!!, imageProxy.imageInfo.rotationDegrees)
                bindCardDetails(cardDetails)
            } finally {
                // Release the captured frame once we are done with it.
                imageProxy.close()
            }
        }
    }
}
Finally, we can bind our result to the UI:
private fun bindCardDetails(card: CardDetails) {
    owner.text = card.owner
    number.text = card.number
    date.text = "${card.expirationMonth}/${card.expirationYear}"
}
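Since every field of CardDetails is nullable, the string template used for the date would render as "null/null" when no expiration was found. One possible way to handle this, sketched with a hypothetical formatExpiration helper (the data class is repeated only to keep the example self-contained):

```kotlin
data class CardDetails(
    val owner: String?,
    val number: String?,
    val expirationMonth: String?,
    val expirationYear: String?
)

// Returns "MM/YY" (or "MM/YYYY") when both parts were recognized, otherwise an empty string.
fun formatExpiration(card: CardDetails): String {
    val month = card.expirationMonth
    val year = card.expirationYear
    return if (month != null && year != null) "$month/$year" else ""
}
```

The activity could then set date.text = formatExpiration(card). Returning an empty string is just one choice: a placeholder or hiding the view would work as well.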
That’s all folks! You can find the complete source code at https://github.com/dcampogiani/CreditCardScanner