Keep customers (happy) through the onboarding process: How Machine Learning improves the CX

tl;dr: ML without the BS | one practical application is document image enhancement and sharpening | we use a u-net autoencoder for faster and easier e-onboarding | ML denoises and enhances in milliseconds, at scale and in real time.

Banning buzzword bingo

“Machine learning” and “deep learning” are sprinkled too liberally in our industry. They are used and misused, hashtagged and waved away like a slightly inebriated uncle dancing at a wedding; oh don’t mind that, that’s machine learning, best you just let us worry about that.

We’re seriously surprised we haven’t seen an ML/DL buzzword bingo yet…neural networks, evolutionary algorithms, reinforcement learning – wait, FULL HOUSE!

Bridging the gap between hype and practicality

Of course, it is not easy to explain concepts that are extraordinarily technical by nature. Certainly not without either leaving the audience slightly more baffled than when they started, or dumbing it down so much that it becomes utterly meaningless.

So here’s what we propose: how about a technical blog that explains exactly how AimBrain is using deep learning and machine learning to enhance the biometric authentication process, and what this means to the end user?

Remote onboarding – how to fix a problem like poor images

One of our recent projects was to look at how we could make the onboarding process easier and faster for a customer enrolling with an identity document.

Many passports today have a foil or stamp covering part of the image, to make it harder to fabricate a document or identity. So whilst OCR can read the details of a document quickly and accurately, photos in government-issued documents are generally problematic to ‘read’ and use for authentication purposes. The knock-on effect is that remote onboarding – whilst adhering to regulations like Know Your Customer (KYC) – becomes trickier.

Note: The customer doesn’t really care why it’s difficult. All they know is that they may be asked several times to take a photo. And that’s annoying.

Fixing a problem like this applies not just to watermarked images in documents, but to poor quality images in general. Poor resolution, pixelation, shaky hands, shadows…any number of factors can impact image quality. What this translates to—for the customer trying to use facial authentication—is multiple requests, unexplained denials of access and a whopping great dollop of friction.

Who knows whether a potential customer that has abandoned facial authentication will ever come back to try again? It is critical that organisations get it right, to maximise the adoption of biometric authentication and benefit from it as soon as possible. Lower costs, better CX, regulatory compliance, omni-channel consistency and reduced fraud, as if you weren’t painfully aware of the demands on your business.

Machine learning – making customers happy

What we knew was that our customers needed a solution that could effectively de-noise (remove the problem) and enhance images (rebuild the image in an improved way), to get it right first time, and eradicate friction from the onboarding or authentication process.

The tech behind facial authentication

Before we look at machine learning in the context of image denoising and enhancing, let’s briefly explain the process of facial authentication.

Facial recognition relies on a user’s photo (the “query”) being compared to an existing “template” – an original image (or set of images) against which new queries are compared. The speed and accuracy of the verification process directly correlates with the quality of the image, at both query and template level.

Put simply, the better the image, the better the performance of the face recognition algorithms.
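To make the query-versus-template comparison concrete, here is a minimal, purely illustrative sketch in Python. It assumes that a face recognition model has already reduced each image to an embedding vector; verification is then a similarity check against a threshold. The toy vectors, the threshold value and the `verify` helper are our own assumptions for illustration, not AimBrain's actual pipeline.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def verify(query_emb: np.ndarray, template_emb: np.ndarray,
           threshold: float = 0.7) -> bool:
    """Accept the query if its embedding is close enough to the stored template."""
    return cosine_similarity(query_emb, template_emb) >= threshold

# Toy embeddings: a genuine query points roughly the same way as the template.
template = np.array([0.9, 0.1, 0.4])
genuine = np.array([0.85, 0.15, 0.38])
impostor = np.array([-0.2, 0.9, -0.1])

print(verify(genuine, template))   # accepted: embeddings nearly parallel
print(verify(impostor, template))  # rejected: embeddings point apart
```

A noisy or occluded photo produces a noisier embedding, pushing genuine queries below the threshold – which is exactly why image quality matters so much here.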

By the way….at AimBrain, biometric templates are stored server-side, in the form of algorithmic-derived code, and all queries (facial authentication requests, in this case) come in through an organisation’s channel (web, mobile, CCTV and so on).

The query comes into the organisation and is passed to our server, where it is coded and compared against the coded template. A BIDaaS (Biometric Identity as-a-Service) approach means that a query can come in from any channel and be compared to a single template…which means when you enrol for facial authentication, you can authenticate using your face from any channel with a camera. But we digress.

Avoiding a poor user experience

More often than not, however, users aren’t aware that there are minimum requirements for image quality when authenticating.

When prompted for a selfie to start the authentication process, a user might try to take the picture in low lighting or extreme sunlight (unlikely in the UK but certainly in our clients’ geographies), they might be partly/wholly in the shadows (there we go, UK!), or might be using a low-resolution camera.

Whatever the cause, it was obvious that we needed to overcome the challenges by improving image quality, whilst keeping the facial features intact.

(And at the risk of stating the obvious, it is clearly a prerequisite that this can be done in-session, in real-time, for huge volumes of people.)

It’s all about artificial neural networks

What we needed to do was to enhance the quality of the image, in real-time, to speed up the authentication process.

The way we chose to do this was through the application of a denoising autoencoder, a specific artificial neural network model.


Autoencoders are a type of unsupervised neural network, originally designed for “dimensionality reduction” purposes. Put as simply as possible, dimensionality reduction refers to reducing the number of variables in a dataset by removing any variables that aren’t relevant. As such, data goes from ‘high dimensionality’ to ‘lower dimensionality’ – its core defining variables.

Autoencoders learn an abstract representation (a code) of the input data (the image), using stacked/sequential encoder and decoder processes.

The encoder converts the image into code, and the decoder reconstructs the image from the code, as true to the original as possible.

This entire process of autoencoding is trained by comparing the original input to the reconstructed image, enhancing and refining the process by using multiple images and continual learning.
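The encode → decode → compare loop above can be sketched in a few lines. This is a deliberately tiny, hypothetical example: random weights stand in for learned parameters, and a flattened 8x8 patch stands in for a real photo. The dimensions and the mean-squared-error loss are our illustrative choices, not details of AimBrain's model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "image": a flattened 8x8 patch (64 values) squeezed into an 8-value code.
input_dim, code_dim = 64, 8

# Random weights stand in for parameters that training would learn.
W_enc = rng.normal(scale=0.1, size=(code_dim, input_dim))
W_dec = rng.normal(scale=0.1, size=(input_dim, code_dim))

def encode(x: np.ndarray) -> np.ndarray:
    """Encoder: project the image down to a compact code."""
    return np.tanh(W_enc @ x)

def decode(z: np.ndarray) -> np.ndarray:
    """Decoder: reconstruct the image from the code."""
    return W_dec @ z

x = rng.random(input_dim)      # original image
x_hat = decode(encode(x))      # reconstruction from the compact code

# Training minimises this reconstruction error over many images.
mse = np.mean((x - x_hat) ** 2)
print(f"code size: {encode(x).shape[0]}, reconstruction MSE: {mse:.4f}")
```

The 64-to-8 squeeze is the dimensionality reduction described earlier: the code must keep only the variables that matter, because that is all the decoder gets to work with.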


Denoising is the process of removing corruptions from an image and reconstructing the affected regions; a denoising autoencoder therefore takes a corrupted image as its input and aims to reconstruct a clean, uncorrupted image as its output. These autoencoders are normally trained by artificially corrupting a clean image, then comparing the original clean version to its reconstructed counterpart.
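The training trick – corrupt a clean image yourself, then penalise the network against the clean original – can be sketched as follows. The Gaussian-noise corruption and the 28x28 size are assumptions for illustration; in practice the corruption is chosen to mimic whatever actually degrades the photos.

```python
import numpy as np

rng = np.random.default_rng(42)

def corrupt(clean: np.ndarray, noise_std: float = 0.2) -> np.ndarray:
    """Artificially corrupt a clean image with Gaussian noise, clipped to [0, 1]."""
    return np.clip(clean + rng.normal(scale=noise_std, size=clean.shape), 0.0, 1.0)

# A clean grayscale image (values in [0, 1]) and its corrupted counterpart.
clean = rng.random((28, 28))
noisy = corrupt(clean)

# The denoising autoencoder trains on (noisy, clean) pairs: it receives
# `noisy` as input but is penalised against `clean`, so it learns to
# undo the corruption rather than simply copy its input.
training_pair = (noisy, clean)
print("input (corrupted):", noisy.shape, "target (clean):", clean.shape)
```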

The encoder-decoder model we used for this process is called u-net, due to its (unsurprisingly) U-shaped format. Originally developed for biomedical image segmentation, it has also had great success in tasks such as car detection and image translation, possibly because of its ability to transfer information from the encoder to the decoder – the grey “skip connection” arrows in the standard u-net diagram – for a more accurate and detailed reconstruction.
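The U shape can be sketched without any deep learning framework at all. In this toy version, average pooling stands in for the learned convolutions on the way down, nearest-neighbour upsampling stands in for the learned upconvolutions on the way up, and stacking encoder features next to decoder features plays the role of the skip connections. Everything here is an illustrative stand-in, not the real architecture.

```python
import numpy as np

def downsample(x: np.ndarray) -> np.ndarray:
    """Encoder step: halve spatial resolution by 2x2 average pooling."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(x: np.ndarray) -> np.ndarray:
    """Decoder step: double spatial resolution by nearest-neighbour repeat."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

rng = np.random.default_rng(1)
image = rng.random((16, 16))

# Contracting path: each level keeps a copy of its features for later.
enc1 = downsample(image)   # 8x8
enc2 = downsample(enc1)    # 4x4 (the "bottom" of the U)

# Expanding path: upsampled features are combined with the saved encoder
# features at the same resolution -- the grey skip-connection arrows.
dec1 = np.stack([upsample(enc2), enc1])       # skip connection at 8x8
merged1 = dec1.mean(axis=0)                   # stand-in for a learned conv
dec0 = np.stack([upsample(merged1), image])   # skip connection at 16x16
output = dec0.mean(axis=0)                    # reconstructed 16x16 image
print(output.shape)
```

The skip connections are why u-net reconstructs fine detail so well: the decoder sees not just the compressed code but also the full-resolution features the encoder computed on the way down.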

Being based on machine learning, our entire process of denoising and enhancement is completed in sub-second time. And, as a final flourish, the scalable architecture means that unprecedented volumes of data can be processed in real time – critical for today’s banks that are vying for the business of millions of customers.

Our thanks to Will Miller, our Business Development Manager and guinea pig in this image denoising and enhancing experiment

Denoising autoencoder and its applications today

If it has both queries and templates from which to learn, an algorithm can quickly be trained on practically any image.

This means that denoising autoencoders can be trained on all sorts of image problems and corruptions – from lighting to blurring to watermarks and holograms.

Training neural networks

However, it’s worth mentioning that an algorithm could be trained even if we only had the template. In this case, you could view the template as the output; a clean image free from distortions or corruptions upon which all facial authentication takes place.

We can train an algorithm to de-noise and enhance future images/queries by artificially corrupting this template image – applying occlusions or making the image blurry, for example. If we have the proposed output, machine learning can be trained through artificial manipulation of the output, to deal with any number of issues on the input or query side.
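Manufacturing training pairs from a single clean template might look something like this. The square-patch occlusion (standing in for a foil or hologram) and the box blur (standing in for a shaky capture) are hypothetical corruption choices; the sizes and parameters are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)

def occlude(image: np.ndarray, size: int = 8) -> np.ndarray:
    """Blank out a random square patch, mimicking a foil or stamp over the photo."""
    out = image.copy()
    h, w = image.shape
    y = rng.integers(0, h - size)
    x = rng.integers(0, w - size)
    out[y:y + size, x:x + size] = 1.0   # bright patch, like hologram glare
    return out

def blur(image: np.ndarray) -> np.ndarray:
    """Cheap box blur via neighbour averaging, mimicking a shaky capture."""
    padded = np.pad(image, 1, mode="edge")
    return (padded[:-2, 1:-1] + padded[2:, 1:-1] +
            padded[1:-1, :-2] + padded[1:-1, 2:] +
            padded[1:-1, 1:-1]) / 5.0

template = rng.random((32, 32))          # the clean enrolment image
pairs = [(occlude(template), template),  # train: remove occlusions
         (blur(template), template)]     # train: undo blur
print(len(pairs), pairs[0][0].shape)
```

Each pair feeds the corrupted version in as input with the untouched template as the target, so a single clean image can seed training against many different kinds of damage.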

So today, we are able to apply machine learning to improve and enhance vast quantities of images, so that friction is minimised and the customer journey becomes smoother. Will the customer be aware of this? Probably not. Just as they may not be aware of how we could apply this to any process or improvement if we have the necessary data.

But it’s not about the glory, is it? It’s about making the biometric onboarding or enrolment process so utterly seamless that the customer doesn’t even consider what’s going on behind the scenes.

If you liked this blog, you’ll love our newsletter. Opt in to receive it here and don’t miss a thing!
