# Long run probability of default (LRPD) estimation via truncated normal assumptions

# Intro

In the Basel II/III capital framework for advanced (A-IRB) banks, we use the so-called "Basel formula" to compute the minimum capital requirements. The formula is based on the Asymptotic Single Risk Factor (ASRF) model, also known as the Vasicek-Merton model. The Long Run Probability of Default (LRPD) is a key component in the computation of minimum capital requirements. This blog post presents a simplified derivation of the PD component of the formula and a novel technique for estimating the LRPD via truncated normal assumptions.

# The model

We have $Z, \epsilon \sim \mathcal{N}(0,\,1)$ where $Z$ is the macro-economic factor and $\epsilon$ is the idiosyncratic factor for each PD segment.

$PD = \Pr\left( \sqrt{\gamma}Z +\sqrt{1 - \gamma}\epsilon < B\right)$

for some barrier $B$. Rearranging the terms inside the $\Pr$, and treating $Z$ as given, we get

$PD = \Pr\left(\epsilon < \frac{B - \sqrt{\gamma}Z}{\sqrt{1 - \gamma}}\right)$

recall that $\epsilon \sim \mathcal{N}(0,\,1)$ so the RHS can be written in terms of the CDF of $\mathcal{N}(0,\,1)$, which is $\Phi$, yielding

$PD = \Phi\left(\frac{B - \sqrt{\gamma}Z}{\sqrt{1 - \gamma}}\right)$

We can approximate $PD$ by the Observed Default Rate $\left(ODR\right)$, leading to

$ODR \approx PD = \Phi\left(\frac{B - \sqrt{\gamma}Z}{\sqrt{1 - \gamma}}\right)$

we also recognise that the argument of $\Phi$ is linear in $Z$, so we can substitute $\mu = \frac{B}{\sqrt{1 - \gamma}}$ and $\sigma = \frac{\sqrt{\gamma}}{\sqrt{1 - \gamma}}$ and rewrite it as

$ODR \approx PD = \Phi\left(\mu - \sigma Z\right)$

We have a different $Z_t$ for each time point, $t$, so add in the subscript $t$

$ODR_t \approx \Phi\left(\mu - \sigma Z_t\right)$

and $\Phi$ is invertible, so apply $\Phi^{-1}$ to both sides

$\Phi^{-1}\left(ODR_t\right) \approx \mu - \sigma Z_t$

so if the Basel formula is close to describing reality (instead of being just a mathematical convenience), then the above implies that the $\Phi^{-1}\left(ODR_t\right)$ values (which are easy to compute) are normally distributed with mean $\mu$ and standard deviation $\sigma$. Hence to work out the LRPD, you just need to estimate $\mu$ and $\sigma$ (for a normal distribution that's easy to do) and evaluate the following integral (trust me, it's also easy to do as it has an analytical solution). Let $\phi(Z)$ be the *probability density function* (pdf) of $Z$

$\hat{LRPD} = \mathbb{E}\left(\Phi\left(\hat{\mu}-\hat{\sigma} Z\right)\right) = \int^\infty_{-\infty} \Phi\left(\hat{\mu}-\hat{\sigma} Z\right)\phi(Z)dZ$

which yields (you have to trust my maths on this)

$\hat{LRPD} = \Phi\left(\frac{\hat{\mu}}{\sqrt{1+\hat{\sigma}^2}}\right)$
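For readers who would rather not take this on trust, here is a short sketch of why the integral collapses. It reuses the idiosyncratic factor $\epsilon \sim \mathcal{N}(0,\,1)$ from the model and assumes (as the ASRF model does) that $\epsilon$ is independent of $Z$. For a given $Z$,

$\Phi\left(\hat{\mu}-\hat{\sigma} Z\right) = \Pr\left(\epsilon < \hat{\mu}-\hat{\sigma} Z \mid Z\right)$

so averaging over $Z$ gives

$\mathbb{E}\left(\Phi\left(\hat{\mu}-\hat{\sigma} Z\right)\right) = \Pr\left(\epsilon + \hat{\sigma} Z < \hat{\mu}\right)$

and since $\epsilon + \hat{\sigma} Z \sim \mathcal{N}(0,\,1+\hat{\sigma}^2)$, dividing both sides of the inequality by $\sqrt{1+\hat{\sigma}^2}$ turns the right-hand side into $\Phi\left(\frac{\hat{\mu}}{\sqrt{1+\hat{\sigma}^2}}\right)$.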

writing the above in the same form as the Basel formulation, we get

$\hat{LRPD} = \int^\infty_{-\infty} \Phi\left(\frac{B - \sqrt{\gamma}Z}{\sqrt{1 - \gamma}}\right)\phi(Z)dZ$

which gives

$\hat{LRPD} = \Phi\left(\hat{B}\right)$
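As a sanity check, the result can be verified numerically. The sketch below uses only the Python standard library; the values of $B$ and $\gamma$ are made up for illustration, and the trapezoidal rule on $[-8, 8]$ is just one convenient way to approximate the integral.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: provides pdf, cdf and inv_cdf


def lrpd_by_integration(barrier, gamma, lo=-8.0, hi=8.0, steps=4000):
    """Trapezoidal approximation of the integral of
    Phi((B - sqrt(gamma) z) / sqrt(1 - gamma)) * phi(z) over z."""
    width = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        z = lo + k * width
        conditional_pd = STD_NORMAL.cdf((barrier - sqrt(gamma) * z) / sqrt(1.0 - gamma))
        weight = 0.5 if k in (0, steps) else 1.0  # trapezoid end-point weights
        total += weight * conditional_pd * STD_NORMAL.pdf(z)
    return total * width


barrier, gamma = -2.0, 0.15          # illustrative values only
mu = barrier / sqrt(1.0 - gamma)     # mu = B / sqrt(1 - gamma)
sigma = sqrt(gamma / (1.0 - gamma))  # sigma = sqrt(gamma / (1 - gamma))

numeric = lrpd_by_integration(barrier, gamma)
closed_form = STD_NORMAL.cdf(mu / sqrt(1.0 + sigma ** 2))
print(numeric, closed_form, STD_NORMAL.cdf(barrier))
```

All three printed numbers should agree to several decimal places, which is the claim $\hat{LRPD} = \Phi(\hat{B})$ in numerical form.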

# Excursion: Relationship to Basel Capital formula

We started with

$PD = \Phi\left(\frac{B - \sqrt{\gamma}Z}{\sqrt{1 - \gamma}}\right)$

and as we saw above, $\Phi^{-1}\left(\hat{LRPD}\right) = \hat{B}$, so substituting that in, we get

$PD = \Phi\left(\frac{\Phi^{-1}\left(\hat{LRPD}\right) - \sqrt{\gamma}Z}{\sqrt{1 - \gamma}}\right)$

And what if I want the value-at-risk (VaR) at 99.9%? One way is to first compute the 99.9th percentile PD, $PD_{0.999}$. Because a lower $Z$ means a higher PD, the adverse scenario is the $Z$ at its 0.1th percentile, which is $\Phi^{-1}\left(0.001\right) = -\Phi^{-1}\left(0.999\right)$. Substituting this for $Z$, we have

$PD_{0.999} = \Phi\left(\frac{\Phi^{-1}\left(\hat{LRPD}\right) + \sqrt{\gamma}\Phi^{-1}\left(0.999\right)}{\sqrt{1 - \gamma}}\right)$

or equivalently

$PD_{0.999} = \Phi\left(\frac{\Phi^{-1}\left(\hat{LRPD}\right)}{\sqrt{1 - \gamma}} + \frac{\sqrt{\gamma}\Phi^{-1}\left(0.999\right)}{\sqrt{1 - \gamma}}\right)$

which is the "unexpected loss PD component" in the Basel capital formula.
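A minimal sketch of this calculation, using only the Python standard library. The function name `stressed_pd` and the inputs (an LRPD of 2% and $\gamma = 0.15$) are hypothetical and chosen purely for illustration. The adverse scenario is taken as the low tail of $Z$, on the assumption used throughout this post that a lower $Z$ means a higher PD.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def stressed_pd(lrpd, gamma, confidence=0.999):
    """Conditional PD when Z sits at its adverse (1 - confidence) quantile."""
    adverse_z = STD_NORMAL.inv_cdf(1.0 - confidence)  # low Z means a bad economy
    barrier = STD_NORMAL.inv_cdf(lrpd)                # B = Phi^{-1}(LRPD)
    return STD_NORMAL.cdf((barrier - sqrt(gamma) * adverse_z) / sqrt(1.0 - gamma))


pd_999 = stressed_pd(lrpd=0.02, gamma=0.15)  # illustrative inputs only
print(pd_999)
```

The stressed PD comes out well above the long run PD that was fed in, as you would expect from a 1-in-1000 downturn scenario.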

# LRPD estimation using truncated normals

Back to the main topic: we have shown that the LRPD can be expressed simply using the mean, $\mu$, and the standard deviation, $\sigma$, of the $\Phi^{-1}\left(ODR_t\right)$ values, by our interpretation of the Basel formulation of PD:

$\hat{LRPD} = \Phi\left(\frac{\hat{\mu}}{\sqrt{1+\hat{\sigma}^2}}\right)$
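Before any truncation is considered, the naive version of this estimator is only a few lines. The sketch below uses the Python standard library; the default rates are made up for illustration and do not come from any real portfolio.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

STD_NORMAL = NormalDist()

# Hypothetical annual observed default rates, made up for illustration.
odr = [0.012, 0.015, 0.011, 0.018, 0.025, 0.020, 0.014, 0.013, 0.016, 0.012]

q = [STD_NORMAL.inv_cdf(rate) for rate in odr]  # Q_t = Phi^{-1}(ODR_t)
mu_hat = mean(q)
sigma_hat = stdev(q)  # sample standard deviation (n - 1 denominator)

lrpd_hat = STD_NORMAL.cdf(mu_hat / sqrt(1.0 + sigma_hat ** 2))
print(mu_hat, sigma_hat, lrpd_hat)
```

Using the sample standard deviation rather than the maximum likelihood one (which divides by $n$) is a judgement call; with short default histories the difference can be noticeable.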

However, in the Australian context, most (if not all) banks and financial institutions don't have data from the severe recession (the one "we had to have") of the early 90's. Therefore we can't simply take a straight average of the data we do have, as that might be a biased estimate.

In this blog post, we discuss one way to overcome the bias, which is to assume that $Z$ comes from a truncated normal. Basically we assume that a proportion of the $Z$'s that can lead to high PDs are not available to be sampled. Below is a plot of the pdf of a truncated normal distribution.

If we assume that $Z$ is from a truncated normal then there is a need to estimate the truncation points. One can use MLE to estimate the truncation points. To do that we write the likelihood, $L$, for observing the data, as below

$L = \prod_t\frac{\phi(\Phi^{-1}\left(ODR_t\right), \mu, \sigma^2)}{\Phi(u,\mu,\sigma^2) - \Phi(l,\mu,\sigma^2)}$

where $u$ and $l$ are the *upper* and *lower* truncation points of $\Phi^{-1}\left(ODR_t\right)$. In capital modelling, conservative assumptions are fine so we can assume that $Z$ is sampled from a distribution where some of the worst $Z$'s are truncated (recall that lower $Z$ means higher PD, and hence higher $\Phi^{-1}\left(ODR_t\right)$) but all the best $Z$'s can be sampled. This conservatism can be achieved by setting $l = -\infty$, so that $\Phi(l,\mu,\sigma^2) = 0$, yielding

$L = \prod_t\frac{\phi(\Phi^{-1}\left(ODR_t\right), \mu, \sigma^2)}{\Phi(u,\mu,\sigma^2)}$

it can be proved (definitely TRUE!!! but is not intuitive to many) that

$\hat{u}_{MLE} = \max_t \Phi^{-1}\left(ODR_t\right)$

I have provided a sketch of the proof in a later section.

## Problem formulation: MLE of truncated normal parameters

The problem formulation is this: given $Q_1,\, Q_2,\, ...,\, Q_n \sim^{iid} \mathcal{N}(\mu,\,\sigma^2,\,-\infty,\, u)$, where $\mathcal{N}(\mu,\,\sigma^2,\, -\infty,\, u)$ is the truncated normal with upper truncation point $u$, what are the Maximum Likelihood Estimators (MLEs) of the parameters $\mu,\,\sigma,\,u$?

In this case we only want to solve the problem where the distribution is truncated from the top but not from the bottom.

Suppose a distribution has *probability density function* (pdf) $p$. Then the pdf of its truncated distribution, with lower and upper truncation points $l$ and $u$, is

$p_{trunc}(x) = \frac{p(x)}{\int^u_lp(x)\mathrm{d}x}$

where $l \le x \le u$ and it should be $0$ everywhere else.

Intuitively, $\int_{-\infty}^{\infty}p(x)\mathrm{d}x = 1$ but if we truncate the distribution then

- the pdf of the truncation must still follow the shape of the original where it isn't truncated; and
- it must still integrate to 1 over the range

It's not hard to see then that

$p_{trunc}(x) = \frac{p(x)}{\int^u_lp(x)\mathrm{d}x}$

is the only form that satisfies both criteria. In fact if we let $P$ be the *cumulative distribution function* (cdf) of the untruncated distribution then the pdf of the truncated distribution can be written as

$p_{trunc}(x) = \frac{p(x)}{P(u) - P(l)}$

In this particular case of truncated normals we let $\phi$ and $\Phi$ be the pdf and cdf respectively, and notice that $l = -\infty$ as we only want to truncate from the top, so $\Phi(l) = \Phi(-\infty) = 0$. Therefore the pdf becomes

$\phi_{trunc}(x) = \frac{\phi(x,\mu,\sigma)}{\Phi(u,\,\mu,\sigma)}$
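This density is straightforward to code up. Below is a minimal sketch using the Python standard library; the function name `truncated_normal_pdf` and the parameter values are hypothetical. The midpoint-rule check at the end simply confirms that the truncated density integrates to roughly 1.

```python
from statistics import NormalDist


def truncated_normal_pdf(x, mu, sigma, upper):
    """Density of N(mu, sigma^2) truncated from above at `upper` (no lower truncation)."""
    if x > upper:
        return 0.0  # no mass above the truncation point
    base = NormalDist(mu, sigma)
    return base.pdf(x) / base.cdf(upper)


# Sanity check with illustrative parameters: the density should integrate to about 1.
mu, sigma, upper = -2.1, 0.2, -1.9
lo, steps = mu - 10 * sigma, 20000
width = (upper - lo) / steps
area = sum(
    truncated_normal_pdf(lo + (k + 0.5) * width, mu, sigma, upper) for k in range(steps)
) * width
print(area)
```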

## $\max(Q_t)$

The truncation point MLE is always $\max Q_t$. To see why, note that the minimum value that $u$ can take is $\max Q_t$, because no observation can lie above the truncation point. Let $\hat{u}$ be the MLE for $u$; we can write

$\max Q_t \le \hat{u}$

the cdf $\Phi$ is a monotonically increasing function so we have

$\Phi(\max Q_t,\,\mu,\,\sigma) \le \Phi(\hat{u},\,\mu,\,\sigma)$

taking the reciprocal of both sides yields

$\frac{1}{\Phi(\max Q_t,\,\mu,\,\sigma)} \ge \frac{1}{\Phi(\hat{u},\,\mu,\,\sigma)}$

now multiply both sides by $\phi(Q_i,\mu,\sigma)$, we see that

$\frac{\phi(Q_i,\mu,\sigma)}{\Phi(\max Q_t,\,\mu,\,\sigma)} \ge \frac{\phi(Q_i,\mu,\sigma)}{\Phi(\hat{u},\,\mu,\,\sigma)}$

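the right hand side (RHS) is now the truncated pdf evaluated at $Q_i$, that is, the likelihood contribution of one observation, and the left hand side is an upper bound on it. A bit of thinking should lead us to the conclusion that, for any given $\mu$ and $\sigma$, to maximize the likelihood we need to have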

$\hat{u} = \max Q_t$

as that's where the upper bound on the likelihood (density) is attained. Therefore the MLE optimization problem is a two dimensional one in $\mu$ and $\sigma$.

It's worth noting that this logic can be applied to any truncated distribution, not just normal. Indeed, the MLE of the upper truncation point is always the maximum of the observed values.

Also note that setting $\hat{u} = \max Q_t$ will lead to higher PDs than any other estimate of $\hat{u}$ given $\hat{u} \ge \max Q_t$.
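The argument can also be seen numerically. The sketch below (Python standard library, hypothetical $Q_t$ values, with $\mu$ and $\sigma$ held fixed at arbitrary values) evaluates the log-likelihood at a few candidate truncation points. It is only an illustration of the inequality above, not a proof.

```python
from math import log
from statistics import NormalDist


def log_likelihood(q, mu, sigma, upper):
    """Log-likelihood of the sample `q` under N(mu, sigma^2) truncated above at `upper`."""
    if upper < max(q):
        return float("-inf")  # an observation above the truncation point is impossible
    base = NormalDist(mu, sigma)
    return sum(log(base.pdf(x)) for x in q) - len(q) * log(base.cdf(upper))


q = [-2.26, -2.17, -2.29, -2.10, -1.96, -2.05]  # hypothetical Q_t values
mu, sigma = -2.1, 0.15                           # held fixed for the comparison

candidates = [max(q) - 0.05, max(q), max(q) + 0.05, max(q) + 0.2, max(q) + 1.0]
values = [log_likelihood(q, mu, sigma, u) for u in candidates]
for u, value in zip(candidates, values):
    print(round(u, 2), value)
```

Any $u$ below $\max Q_t$ is ruled out, and the log-likelihood falls as $u$ moves above $\max Q_t$, so the best admissible choice is $\max Q_t$ itself.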

Judgemental choice of $u$ |
---|
A more judgemental way to choose $u$ is to use logic similar to this: "I have 20 years of data so I have not observed the worst 5% of PDs, so $\Phi(u) = 0.95$". |

So if we can estimate $\mu$ and $\sigma$ from the likelihood above, then we have an estimate of the LRPD in the truncated normal framework. Ergashev et al. have shown that, given a $u$, there is a **necessary condition** for maximizing the above likelihood, and the solution for $\mu$ and $\sigma$ is the solution to a simple root-finding problem (1 line of code to solve in R).

The **necessary condition** implies something important. There are certain values of $\Phi^{-1}\left(ODR_t\right)$ for which there is no unique solution! TROUBLE!!!

A promising way that I have used in the past to solve this problem is to recognise that the truncation points, $u_{seg1}$ and $u_{seg2}$, for some segments $seg1$ and $seg2$ should satisfy

$\Phi\left(u_{seg1},\mu_{seg1},\sigma^2_{seg1}\right) = \Phi\left(u_{seg2},\mu_{seg2},\sigma^2_{seg2}\right)$

that is, the *percentile* at which they sit should be the same, even though they may have different values! This adds a constraint to the model, which in practice almost always yields unique solutions! Problem SOLVED!!!

Although using normal distributions yields nice analytical solutions, the approach of

- fitting a distribution to the ODR via $F^{-1}\left(ODR\right)$ for some function $F$
- estimating the truncation point of the distribution
- simulating (or numerically integrating) from the whole (untruncated) distribution to recover an estimate of LRPD

should work for any distribution, assuming the real world follows a distribution at the extremes! Whether you believe this is a question for another day.

# The optimization problem

Given observations $Q_1,\, Q_2,\, ...,\, Q_n \sim^{iid} \mathcal{N}(\mu,\,\sigma^2,\,-\infty,\, u)$, where $Q_i = \mu - \sigma Z_i$ and $Z_i\sim \mathcal{N}(0,\,1)$ before truncation, we aim to find the parameters $\mu$ and $\sigma$ that maximize this likelihood

$L = \prod\left(\frac{\phi(Q_i,\mu,\sigma)}{\Phi(\max Q_t,\mu,\sigma)}\right)$

often we instead maximize the log likelihood

$\log L = l = \left(\sum_i \log\phi(Q_i,\mu,\sigma)\right) - n\log\Phi(\max Q_t,\mu,\sigma)$
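Writing the normal density out explicitly, the log of $L$ above can be sketched in a form that is convenient to code. Here $\Phi$ with a single argument denotes the standard normal cdf, as in the first half of this post:

$l = -\frac{n}{2}\log\left(2\pi\sigma^2\right) - \frac{1}{2\sigma^2}\sum_i\left(Q_i - \mu\right)^2 - n\log\Phi\left(\frac{\max Q_t - \mu}{\sigma}\right)$

The first two terms are the usual (untruncated) normal log likelihood; the last term is the adjustment for truncation, and it is what pulls $\hat{\mu}$ and $\hat{\sigma}$ away from the plain sample mean and standard deviation.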

## Numerical solutions in Julia, R, and Python/SciPy

In another blog post, I have coded up solutions to the above problem in Julia, R, and Python/SciPy.

# The algorithm from raw data to LRPD

Let $ODR_t$ be the observed default rate at time $t$

- Compute $Q_t = \Phi^{-1}\left(ODR_t\right) \,\forall t$
- Find the MLE for $\mu$ and $\sigma$ by numerically optimising this likelihood $L = \prod\left(\frac{\phi(Q_i,\mu,\sigma)}{\Phi(\max Q_t,\mu,\sigma)}\right)$
- Then

$\hat{LRPD} = \Phi\left(\frac{\hat{\mu}_{MLE}}{\sqrt{1+\hat{\sigma}_{MLE}^2}}\right)$
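The steps above can be sketched end to end as follows. This is not the code from the other blog post: it is a minimal illustration using only the Python standard library, with made-up default rates and a crude compass (pattern) search standing in for a proper numerical optimizer. The function names, the bounds on $\sigma$ and the iteration limits are all assumptions made for this example.

```python
from math import log, pi, sqrt
from statistics import NormalDist, mean, pstdev

STD_NORMAL = NormalDist()


def log_likelihood(q, mu, sigma):
    """Log-likelihood of q under N(mu, sigma^2) truncated above at max(q)."""
    tail = STD_NORMAL.cdf((max(q) - mu) / sigma)
    if tail <= 0.0:
        return float("-inf")  # guard against log(0) far out in the tail
    density_part = sum(
        -0.5 * log(2 * pi * sigma ** 2) - 0.5 * ((x - mu) / sigma) ** 2 for x in q
    )
    return density_part - len(q) * log(tail)


def fit_truncated_normal(q, sigma_bounds=(1e-3, 5.0), max_iter=5000, tol=1e-8):
    """Crude compass search for (mu, sigma), started from the untruncated estimates."""
    mu, sigma = mean(q), pstdev(q)
    best, step = log_likelihood(q, mu, sigma), 0.1
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for d_mu, d_sigma in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)):
            cand_mu, cand_sigma = mu + d_mu, sigma + d_sigma
            if not sigma_bounds[0] <= cand_sigma <= sigma_bounds[1]:
                continue  # keep sigma inside the assumed search bounds
            value = log_likelihood(q, cand_mu, cand_sigma)
            if value > best:
                mu, sigma, best, improved = cand_mu, cand_sigma, value, True
        if not improved:
            step /= 2  # no direction helped, so refine the step size
    return mu, sigma, best


# Hypothetical annual observed default rates, made up for illustration.
odr = [0.012, 0.015, 0.011, 0.018, 0.025, 0.020, 0.014, 0.013, 0.016, 0.012]
q = [STD_NORMAL.inv_cdf(rate) for rate in odr]  # step 1: Q_t = Phi^{-1}(ODR_t)

mu_mle, sigma_mle, ll = fit_truncated_normal(q)              # step 2: maximise L
lrpd = STD_NORMAL.cdf(mu_mle / sqrt(1.0 + sigma_mle ** 2))   # step 3: plug in
print(mu_mle, sigma_mle, lrpd)
```

Keep in mind the earlier warning that a unique maximum does not exist for every data set; if the fitted $\sigma$ ends up sitting on the upper bound of the search, treat the result with suspicion.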

# Looking for an R trainer and/or risk modeling consultant?

Don't hesitate to email me at dzj@analytixware.com or visit http://evalparse.io for more information.