February | 2016 | SQL Pete

Getting it all wrong

Posted by sqlpete in chat, process on February 14, 2016

In consumer lending, there are many constraints that have to be satisfied before any money is paid out to a customer. Off the top of my head, the customer must be:

Affordable
In employment (not always, but usually)
Over 18 (and under some policy-defined upper age limit)
Not bankrupt or otherwise insolvent
Accepted by the credit risk scorecard
Not a PEP (Politically Exposed Person)
Not an SDN (Specially Designated National), someone the government doesn’t want us to do business with
Not suffering from mental health problems, or otherwise vulnerable
Not matched by our own ‘do not lend’ list – i.e. they don’t match (or have links to) previous frauds or undesirable customers
Accepted by the fraud risk scorecard
Not suffering from an addiction to gambling
Not currently in prison
Thought to be telling the truth about their financial situation

(The exact list isn’t important here.)

Some of these statuses are easier to determine than others, but it’s impossible to get any of them 100% correct. Credit files are certainly imperfect, and the data they contain is always going to be out-of-date to some extent; PEP/SDN lists are pretty fuzzy; and of course, some people lie about their financial situation in order to get a loan paid out.

For the sake of demonstration, let’s assume we’re 99% accurate for each one: we get it wrong one time in a hundred. Given that error rate, what’s the probability of getting the overall decision wrong — that is, not satisfying every constraint — and paying out to someone we shouldn’t? Simple probability tells us it’s one minus the probability of getting all of them right, so for the 13 constraints above, it’s 1 – (0.99 ^ 13) = 12.2%. In other words, we’ve a one in eight chance of paying out when we shouldn’t; the consequences of which could be loss of money, regulatory or legal issues, damage to reputation, etc.
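The arithmetic above can be sketched in a few lines of Python. This is only an illustration under the post's simplifying assumptions (every check is 99% accurate and the checks are independent); the function name `prob_any_wrong` is made up for this example.

```python
def prob_any_wrong(accuracy: float, n_checks: int) -> float:
    """Probability that at least one of n_checks independent checks is wrong.

    Assumes every check has the same accuracy and that checks are independent.
    """
    p_all_right = accuracy ** n_checks
    return 1.0 - p_all_right


# 13 constraints, each assumed to be 99% accurate
p = prob_any_wrong(0.99, 13)
print(f"{p:.1%}")  # 12.2%
```

Changing the accuracy or the number of checks shows how quickly small per-check error rates compound.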

You could break some of those constraints into smaller parts: e.g. affordability is a calculation that relies on both income and expenditure being calculated correctly. Hence, satisfying those 13 constraints comes down to many smaller decisions (a scorecard involves hundreds of calculations). In which case, our situation is even worse: at our 99% accuracy rate, it only takes 69 decisions before it’s more likely than not that our overall decision is wrong! (1 – (0.99 ^ 69) = 50.02%.)
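To see where the figure of 69 comes from, here is a rough Python sketch that counts how many independent decisions it takes before the chance of at least one error passes 50%. The function name `decisions_until_likely_wrong` and its `threshold` parameter are invented for this illustration, and it keeps the same assumptions as before (identical accuracy, independent decisions).

```python
def decisions_until_likely_wrong(accuracy: float, threshold: float = 0.5) -> int:
    """Smallest n for which 1 - accuracy**n exceeds threshold.

    Assumes identical, independent decisions; accuracy must be in (0, 1).
    """
    if not 0.0 < accuracy < 1.0:
        raise ValueError("accuracy must be strictly between 0 and 1")
    n = 0
    p_all_right = 1.0
    # Keep adding decisions until an error somewhere is more likely than the threshold
    while 1.0 - p_all_right <= threshold:
        n += 1
        p_all_right *= accuracy
    return n


n = decisions_until_likely_wrong(0.99)
print(n)                        # 69
print(f"{1 - 0.99 ** n:.2%}")   # 50.02%
```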

Luckily, the real-life situation isn’t this bad:

For some decisions we’re far better than 99% accurate, e.g. “aged over 18” is reasonably simple to get right
We test our scorecards thoroughly (yes, we do).^*
We corroborate information by obtaining it from several sources
The various decisions aren’t all independent (our calculation assumes they are)
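Two of the points above can be illustrated with a short Python sketch: what happens when some checks are more accurate than others, and what can be said if the checks are not independent. The accuracy figures here are hypothetical, chosen only for illustration, and the function names are invented. The bounds used are the standard ones: whatever the dependence between checks, the chance of at least one error is no smaller than the largest single error rate and no larger than the sum of the error rates.

```python
import math


def prob_any_wrong_independent(accuracies):
    """Chance of at least one error, assuming the checks are independent."""
    return 1.0 - math.prod(accuracies)


def prob_any_wrong_bounds(accuracies):
    """(lower, upper) bounds on the chance of at least one error.

    These hold for any dependence between checks: the lower bound is the
    largest single error rate, the upper bound is the sum of the error rates.
    """
    errors = [1.0 - a for a in accuracies]
    return max(errors), min(1.0, sum(errors))


# Hypothetical mix: 5 checks at 99.9% accuracy, 8 checks at 99%
mixed = [0.999] * 5 + [0.99] * 8
uniform = [0.99] * 13

print(f"uniform, independent: {prob_any_wrong_independent(uniform):.1%}")
print(f"mixed, independent:   {prob_any_wrong_independent(mixed):.1%}")

low, high = prob_any_wrong_bounds(uniform)
print(f"uniform, any dependence: between {low:.1%} and {high:.1%}")
```

For the 13 checks at 99%, the bounds run from 1% (errors always occur together) to 13% (errors never coincide), with the independent figure of 12.2% sitting near the top of that range.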

^* Note that we’re not concerned here with the scorecard predicting correctly – just functioning correctly, in accordance with how it was built.

To be honest, I’ve never estimated what the actual “incorrect lending decision” rate might be (my gut feeling is that it’s satisfactorily low), but I’ve put it on my list of things to consider. And, as ever, I know that I won’t have a hope of getting an accurate answer if the data isn’t readily available and undistorted.

accuracy, error rates
