Using Alternative Data in Credit Risk Modelling

Blog: Enterprise Decision Management Blog

“Whenever I bring up the topic of alternative data, the first question our board asks is, ‘Are we using Facebook data?’”

This comment from a participant in our recent EMEA Risk Leadership Forum drew a lot of chuckles and nodding heads. When it comes to evaluating credit risk, everyone wants to know if, when and how lenders will start probing their Facebook accounts.

For reasons that will be obvious to lenders, that tantalizing possibility doesn’t actually top the list of data sources to mine. At the forum, we instead explored a few other sources of data that can add to the picture of a consumer’s creditworthiness.

Multiple Types of Alternative Data

What is alternative data? In credit granting, it generally refers to any data that is not directly related to a consumer’s credit behavior. Traditional data usually means data from a credit bureau, a credit application or a lender’s own files on an existing customer. Alternative data is everything else.

Alternative data is a hot topic, in part because of the data explosion of the last few years, and in part because of the drive for financial inclusion. There are an estimated 3 billion adults worldwide who don’t have credit and so don’t have credit records. Opening up that market is a priority for lenders. And while many of these people are in developing markets with nascent credit infrastructures, there are so-called “credit invisibles” in the most mature credit markets, people who have no credit and are unknown to the credit bureaus.

With this in mind, let’s look at a few sources of alternative data, and how useful they are for credit decisions.

How Much Value?

FICO research has shown that these data sources do add predictive value at the margin to credit risk models based on traditional data. The amounts of predictive value outlined in the chart below should be viewed as relative indicators, not absolute values, because the additional value of a data source depends on many factors, such as the predictive power of the existing models and the strength of the customer’s relationship with the lender.

Chart showing value of multiple kinds of alternative data

Please note that the Traditional Models used as the baseline were application models, not credit bureau score models (such as the FICO® Score).

The chart below shows the result of one project FICO did for a personal loans origination portfolio. The traditional credit characteristics captured more value than the alternative data characteristics (with the alternative data capturing about 60% of the predictive power), and there was a high degree of overlap between the two. However, by combining the traditional and alternative data characteristics (and understanding the overlap so as not to overweight certain variables’ contributions), we were able to produce a more powerful model.

Lift curve for FICO study
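To make "predictive power" concrete: one widely used yardstick for comparing credit risk models is the Gini coefficient, which equals 2 × AUC − 1. The sketch below is an illustration only, not the measure or the data from the FICO project. The scores and outcomes are made-up toy values, the helper names `auc` and `gini` are hypothetical, and the simple average used for the "combined" score is a stand-in for properly refitting a model on both sets of characteristics.

```python
def auc(scores, labels):
    """Chance that a randomly chosen bad (label 1) is scored as riskier
    than a randomly chosen good (label 0). Ties count as half a win."""
    bads = [s for s, y in zip(scores, labels) if y == 1]
    goods = [s for s, y in zip(scores, labels) if y == 0]
    if not bads or not goods:
        raise ValueError("need at least one good and one bad account")
    wins = 0.0
    for b in bads:
        for g in goods:
            if b > g:
                wins += 1.0
            elif b == g:
                wins += 0.5
    return wins / (len(bads) * len(goods))


def gini(scores, labels):
    """Gini coefficient: 0 means no separation, 1 means perfect separation."""
    return 2.0 * auc(scores, labels) - 1.0


# Toy data (assumed): 1 = defaulted, 0 = repaid. Higher score = riskier.
labels      = [1, 1, 1, 0, 0, 0, 0, 0]
traditional = [90, 60, 30, 50, 40, 20, 20, 10]
alternative = [50, 30, 90, 60, 20, 40, 10, 10]

# Naive combination for illustration: average the two risk scores.
combined = [(t + a) / 2 for t, a in zip(traditional, alternative)]

for name, scores in [("traditional", traditional),
                     ("alternative", alternative),
                     ("combined", combined)]:
    print(name, round(gini(scores, labels), 3))
# prints: traditional 0.733, alternative 0.6, combined 0.867
```

In this toy set the two sources rank most applicants similarly (the overlap), but each catches one bad account the other misses, which is why the combined score separates goods from bads better than either alone.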

Machine Learning and Explainability

It’s impossible to talk about alternative data without talking about the analytic technologies used to mine it, including machine learning techniques such as neural networks, random forests and stochastic gradient boosting. With large, unstructured data sets, the smart use of these technologies can identify data patterns that relate to credit risk and make the model development process more manageable.

However, as is true with AI in general, data scientists play an important role. They need to check the accuracy of the output, make sure the model doesn’t overfit the data, make sure the model provides stable output, and ensure that the patterns discovered are strong, relevant and explainable.
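The stability check in particular can be reduced to a number. A measure often used in credit scoring is the Population Stability Index (PSI), which compares how scores are distributed across bins in the development sample versus a more recent sample. The source does not say which stability measure FICO uses, so treat the following as a minimal sketch under that assumption. The function name `psi`, the bin edges and the sample scores are all hypothetical.

```python
import math


def psi(expected, actual, edges):
    """Population Stability Index between two samples of scores.

    `edges` are ascending interior cut points, so len(edges) + 1 bins.
    A value of 0 means the two distributions fill the bins identically.
    """
    if not expected or not actual:
        raise ValueError("both samples must be non-empty")

    def shares(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            # Bin index = number of cut points at or below the value.
            counts[sum(1 for e in edges if v >= e)] += 1
        # Floor each share so an empty bin does not cause log(0).
        return [max(c / len(values), 1e-6) for c in counts]

    e_shares = shares(expected)
    a_shares = shares(actual)
    return sum((a - e) * math.log(a / e) for e, a in zip(e_shares, a_shares))


# Toy samples (assumed): scores at development time vs. a recent month.
development = [10, 20, 30, 40, 50, 60, 70, 80]
recent = [20, 40, 50, 60, 70, 80, 90, 95]
edges = [25, 50, 75]

print(psi(development, development, edges))  # identical samples give 0.0
print(psi(development, recent, edges))       # larger value: scores have shifted
```

A commonly cited rule of thumb treats a PSI below 0.1 as stable and above 0.25 as a significant shift, though the thresholds are conventions rather than hard limits. The overfitting check is simpler still: compute the same performance measure on the development sample and on a holdout sample, and look for a large gap between the two.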

Explainability is a challenge when dealing with AI and machine learning. Lenders need to explain how consumers are scored, certainly to regulators and often to consumers themselves. FICO uses a technology we call Scorecardizer, which takes the patterns identified by AI, machine learning and other techniques and turns them into scorecards that are easy to understand and implement, and that produce uplifts in predictive power similar to those of machine learning models. For more information on these techniques, see the blog post by FICO Chief Analytics Officer Scott Zoldi on How to Build Credit Risk Models Using AI and Machine Learning.
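For readers who have not worked with scorecards, the appeal is easy to show. The sketch below is a generic points-based scorecard, not Scorecardizer and not a FICO model. The characteristics, bin boundaries and point values are invented for illustration, and the function names are hypothetical. Each characteristic maps a value range to points, the score is the sum, and the "reasons" are simply the characteristics where the applicant lost the most points.

```python
INF = float("inf")

# Hypothetical scorecard: characteristic -> list of (upper bound, points).
# A value falls into the first bin whose upper bound it is below.
SCORECARD = {
    "months_at_address": [(12, 10), (60, 25), (INF, 40)],
    "utility_bills_on_time_pct": [(80, 5), (95, 30), (INF, 50)],
    "mobile_topups_per_month": [(1, 10), (4, 20), (INF, 30)],
}


def points_for(characteristic, value):
    """Look up the points awarded for one characteristic value."""
    for upper, points in SCORECARD[characteristic]:
        if value < upper:
            return points
    raise ValueError(f"no bin for {characteristic}={value!r}")


def score(applicant):
    """Return the total score and the per-characteristic breakdown."""
    breakdown = {name: points_for(name, applicant[name]) for name in SCORECARD}
    return sum(breakdown.values()), breakdown


def top_reasons(breakdown, n=2):
    """Characteristics where the applicant lost the most points
    relative to the best bin, largest shortfall first."""
    lost = {
        name: max(p for _, p in SCORECARD[name]) - pts
        for name, pts in breakdown.items()
    }
    ranked = sorted(lost, key=lost.get, reverse=True)
    return [name for name in ranked if lost[name] > 0][:n]


applicant = {
    "months_at_address": 8,
    "utility_bills_on_time_pct": 97,
    "mobile_topups_per_month": 2,
}
total, breakdown = score(applicant)
print(total)                   # 80
print(top_reasons(breakdown))  # ['months_at_address', 'mobile_topups_per_month']
```

Because every point in the total traces back to one characteristic and one bin, the same table that produces the score also produces the explanation for a regulator or a consumer.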

FICO’s analytics team will be presenting some of these concepts at the 2017 Credit Scoring and Credit Control Conference in Edinburgh, August 30-September 1. We have six presentations on a wide range of topics – if analytics is your thing, stop by and see us.

Read more posts by FICO experts on alternative data.

The post Using Alternative Data in Credit Risk Modelling appeared first on FICO.
