Long run probability of default (LRPD) estimation via truncated normal assumptions

Published Feb 13, 2018. Last updated Feb 19, 2018.

In the Basel II/III capital framework, advanced (A-IRB) banks use the so-called "Basel formula" to compute minimum capital requirements. The formula is based on the Asymptotic Single Risk Factor (ASRF) model, also known as the Vasicek-Merton model. The Long Run Probability of Default (LRPD) is a key input to that computation. This blog post presents a simplified derivation of the PD component of the formula and a novel technique for estimating the LRPD via truncated normal assumptions.

The model

We have $Z, \epsilon \sim \mathcal{N}(0,\,1)$ where $Z$ is the macro-economic factor and $\epsilon$ is the idiosyncratic factor for each PD segment.

$$PD = \Pr\left( \sqrt{\gamma}Z +\sqrt{1 - \gamma}\epsilon < B\right)$$

for some barrier $B$. Rearranging the terms inside the $\Pr$, we get

$$PD = \Pr\left(\epsilon < \frac{B - \sqrt{\gamma}Z}{\sqrt{1 - \gamma}}\right)$$

Recall that $\epsilon \sim \mathcal{N}(0,\,1)$, so the RHS can be written in terms of the CDF of $\mathcal{N}(0,\,1)$, which is $\Phi$, yielding

$$PD = \Phi\left(\frac{B - \sqrt{\gamma}Z}{\sqrt{1 - \gamma}}\right)$$
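As a sanity check on this step, here is a minimal sketch that compares the closed form with a brute-force simulation of the idiosyncratic factor for one fixed $Z$. The function names and the inputs ($B = -2$, $\gamma = 0.15$, $Z = -1$) are hypothetical and chosen only for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: Phi is .cdf


def conditional_pd(barrier, gamma, z):
    """Closed form: Phi((B - sqrt(gamma) * Z) / sqrt(1 - gamma))."""
    return STD_NORMAL.cdf((barrier - sqrt(gamma) * z) / sqrt(1.0 - gamma))


def simulated_pd(barrier, gamma, z, n_obligors=100_000, seed=0):
    """Share of simulated obligors whose latent variable falls below the barrier."""
    rng = random.Random(seed)
    defaults = 0
    for _ in range(n_obligors):
        eps = rng.gauss(0.0, 1.0)  # idiosyncratic factor
        if sqrt(gamma) * z + sqrt(1.0 - gamma) * eps < barrier:
            defaults += 1
    return defaults / n_obligors


# Hypothetical inputs: barrier B = -2, gamma = 0.15, a mild downturn Z = -1.
closed_form = conditional_pd(-2.0, 0.15, -1.0)
monte_carlo = simulated_pd(-2.0, 0.15, -1.0)
print(closed_form, monte_carlo)
```

The two numbers should agree up to simulation noise, which shrinks as `n_obligors` grows.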

We can approximate $PD$ by the Observed Default Rate $\left(ODR\right)$, leading to

$$ODR \approx PD = \Phi\left(\frac{B - \sqrt{\gamma}Z}{\sqrt{1 - \gamma}}\right)$$

We also recognise that the argument of $\Phi$ is a linear function of $Z$ (and hence normally distributed), so we can substitute $\mu = \frac{B}{\sqrt{1 - \gamma}}$ and $\sigma = \sqrt{\frac{\gamma}{1 - \gamma}}$ and rewrite it as

$$ODR \approx PD = \Phi\left(\mu - \sigma Z\right)$$

We have a different $Z_t$ for each time point, $t$, so add in the subscript $t$

$$ODR_t \approx \Phi\left(\mu - \sigma Z_t\right)$$

and $\Phi$ is invertible, so apply $\Phi^{-1}$ to both sides

$$\Phi^{-1}\left(ODR_t\right) \approx \mu - \sigma Z_t$$

So if the Basel formula is close to describing reality (rather than being just a mathematical convenience), then the above implies that the values $\Phi^{-1}\left(ODR_t\right)$ (which are easy to compute) are normally distributed with mean $\mu$ and standard deviation $\sigma$ for all $t$. Hence to work out the LRPD, you just need to estimate $\mu$ and $\sigma$ (easy for a normal distribution) and evaluate the following integral (trust me, it's also easy as it has an analytical solution). Let $\phi(Z)$ be the probability density function (pdf) of $Z$

$$\hat{LRPD} = \mathbb{E}\left(\Phi\left(\hat{\mu}-\hat{\sigma} Z\right)\right) = \int^\infty_{-\infty} \Phi\left(\hat{\mu}-\hat{\sigma} Z\right)\phi(Z)dZ$$

which yields (you have to trust my maths on this)

$$\hat{LRPD} = \Phi\left(\frac{\hat{\mu}}{\sqrt{1+\hat{\sigma}^2}}\right)$$
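For readers who would rather not take the closed form on trust, here is a short derivation sketch. It introduces an auxiliary variable $\epsilon \sim \mathcal{N}(0,\,1)$, independent of $Z$, which is an assumption made only for this argument.

$$
\begin{aligned}
\mathbb{E}\left(\Phi\left(\hat{\mu}-\hat{\sigma} Z\right)\right)
&= \mathbb{E}\left(\Pr\left(\epsilon < \hat{\mu}-\hat{\sigma} Z \mid Z\right)\right) \\
&= \Pr\left(\epsilon + \hat{\sigma} Z < \hat{\mu}\right) \\
&= \Phi\left(\frac{\hat{\mu}}{\sqrt{1+\hat{\sigma}^2}}\right)
\end{aligned}
$$

The last step uses the fact that $\epsilon + \hat{\sigma} Z$ is a sum of independent normals, so it is $\mathcal{N}(0,\,1+\hat{\sigma}^2)$.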

Writing the above in the same form as the Basel formulation, we get

$$\hat{LRPD} = \int^\infty_{-\infty} \Phi\left(\frac{B - \sqrt{\gamma}Z}{\sqrt{1 - \gamma}}\right)\phi(Z)dZ$$

which gives

$$\hat{LRPD} = \Phi\left(\hat{B}\right)$$
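The result can be checked numerically. The sketch below assumes the mapping $\mu = B/\sqrt{1-\gamma}$ and $\sigma = \sqrt{\gamma/(1-\gamma)}$ implied by the Basel form, integrates with a plain trapezoidal rule, and compares the answer with both closed forms. The inputs $B = -2$ and $\gamma = 0.15$ are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def expected_pd(mu, sigma, lo=-8.0, hi=8.0, steps=4000):
    """Trapezoidal approximation of the integral of Phi(mu - sigma*z) * phi(z) dz."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        z = lo + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * STD_NORMAL.cdf(mu - sigma * z) * STD_NORMAL.pdf(z)
    return total * h


# Hypothetical inputs, for illustration only.
barrier, gamma = -2.0, 0.15
mu = barrier / sqrt(1.0 - gamma)
sigma = sqrt(gamma / (1.0 - gamma))

numeric = expected_pd(mu, sigma)
closed_form = STD_NORMAL.cdf(mu / sqrt(1.0 + sigma ** 2))
phi_of_barrier = STD_NORMAL.cdf(barrier)
print(numeric, closed_form, phi_of_barrier)
```

All three values should coincide, because $\mu/\sqrt{1+\sigma^2}$ simplifies to $B$ under this mapping.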

Excursion: Relationship to Basel Capital formula

We started with

$$PD = \Phi\left(\frac{B - \sqrt{\gamma}Z}{\sqrt{1 - \gamma}}\right)$$

and as we saw above, $\Phi^{-1}\left(\hat{LRPD}\right) = \hat{B}$, so substituting that in, we get

$$PD = \Phi\left(\frac{\Phi^{-1}\left(\hat{LRPD}\right) - \sqrt{\gamma}Z}{\sqrt{1 - \gamma}}\right)$$

And what if I want the value-at-risk (VaR) at 99.9%? One way is to first compute the 99.9th percentile PD, $PD_{0.999}$. Since lower $Z$ means higher PD, the adverse tail is the 0.1th percentile of $Z$, which is $\Phi^{-1}\left(0.001\right) = -\Phi^{-1}\left(0.999\right)$. Substituting that for $Z$, we have

$$PD_{0.999} = \Phi\left(\frac{\Phi^{-1}\left(\hat{LRPD}\right) + \sqrt{\gamma}\Phi^{-1}\left(0.999\right)}{\sqrt{1 - \gamma}}\right)$$

or equivalently

$$PD_{0.999} = \Phi\left(\frac{\Phi^{-1}\left(\hat{LRPD}\right)}{\sqrt{1 - \gamma}} + \frac{\sqrt{\gamma}\Phi^{-1}\left(0.999\right)}{\sqrt{1 - \gamma}}\right)$$

which is the "unexpected loss PD component" in the Basel capital formula.
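A minimal sketch of the stressed PD calculation follows. Because lower $Z$ means higher PD in this model, the sketch plugs the 0.1th percentile of $Z$, $\Phi^{-1}(0.001) = -\Phi^{-1}(0.999)$, into the conditional PD expression. The function name and the inputs (an LRPD of 2% and $\gamma = 0.15$) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def stressed_pd(lrpd, gamma, confidence=0.999):
    """Conditional PD evaluated at the adverse tail of the macro factor Z."""
    # Lower Z means higher PD, so the adverse state sits in the lower tail of Z.
    adverse_z = STD_NORMAL.inv_cdf(1.0 - confidence)
    numerator = STD_NORMAL.inv_cdf(lrpd) - sqrt(gamma) * adverse_z
    return STD_NORMAL.cdf(numerator / sqrt(1.0 - gamma))


# Hypothetical inputs, for illustration only.
pd_999 = stressed_pd(lrpd=0.02, gamma=0.15)
print(pd_999)
```

The stressed PD should come out well above the long run PD that was fed in, which is the point of conditioning on a bad macro state.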

LRPD estimation using truncated normals

Back to the main topic, we have shown that the LRPD can be expressed simply using

  • the mean, $\mu$, and the
  • standard deviation, $\sigma$, of
  • the $\Phi^{-1}\left(ODR_t\right)$'s

by our interpretation of the Basel formulation of PD:

$$\hat{LRPD} = \Phi\left(\frac{\hat{\mu}}{\sqrt{1+\hat{\sigma}^2}}\right)$$
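Before any truncation is considered, the formula can be applied directly to the sample mean and standard deviation of the transformed default rates. The sketch below does that for a made-up ODR series; the data and the function name are hypothetical, and the population standard deviation is used because it is the normal MLE.

```python
from math import sqrt
from statistics import NormalDist, mean, pstdev

STD_NORMAL = NormalDist()


def naive_lrpd(odrs):
    """LRPD from the untruncated normal fit to Phi^{-1}(ODR_t)."""
    q = [STD_NORMAL.inv_cdf(odr) for odr in odrs]  # Q_t = Phi^{-1}(ODR_t)
    mu_hat = mean(q)
    sigma_hat = pstdev(q)  # population sd, i.e. the normal MLE of sigma
    return STD_NORMAL.cdf(mu_hat / sqrt(1.0 + sigma_hat ** 2))


# Hypothetical annual observed default rates, for illustration only.
odrs = [0.012, 0.015, 0.011, 0.020, 0.018, 0.013, 0.016, 0.014]
lrpd = naive_lrpd(odrs)
print(lrpd)
```

This is the "straight average" style estimate that the rest of the post argues can be biased when the worst years are missing from the sample.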

However, in the Australian context, most (if not all) banks and financial institutions don't have data from the severe recession (the one "we had to have") of the early '90s. Therefore we can't simply take a straight average of the observed data, as that is likely to be a biased estimate.

In this blog post, we discuss one way to overcome the bias, which is to assume that $Z$ comes from a truncated normal. Basically, we assume that a proportion of the $Z$'s that lead to high PDs are not available to be sampled. Below is a plot of the pdf of a truncated normal distribution.


If we assume that $Z$ is from a truncated normal then we need to estimate the truncation points. One can use maximum likelihood estimation (MLE) to do so. We write the likelihood, $L$, of observing the data as

$$L = \prod_t\frac{\phi(\Phi^{-1}\left(ODR_t\right), \mu, \sigma^2)}{\Phi(u,\mu,\sigma^2) - \Phi(l,\mu,\sigma^2)}$$

where $u$ and $l$ are the upper and lower truncation points of $\Phi^{-1}\left(ODR_t\right)$. In capital modelling, conservative assumptions are fine, so we can assume that $Z$ is sampled from a distribution where some of the worst $Z$'s are truncated (recall that lower $Z$ means higher PD, and hence higher $\Phi^{-1}\left(ODR_t\right)$) but all the best $Z$'s can be sampled. This conservatism can be achieved by setting $l = -\infty$, yielding

$$L = \prod_t\frac{\phi(\Phi^{-1}\left(ODR_t\right), \mu, \sigma^2)}{\Phi(u,\mu,\sigma^2)}$$


It can be proved (definitely TRUE, but not intuitive to many) that

$$\hat{u}_{MLE} = \max_t \Phi^{-1}\left(ODR_t\right)$$

I have provided a sketch of the proof in a later section.

Problem formulation: MLE of truncated normal parameters

The problem formulation is this: given $Q_1,\, Q_2,\, ...,\, Q_n \sim^{iid} \mathcal{N}(\mu,\,\sigma^2,\,-\infty,\, u)$, where $\mathcal{N}(\mu,\,\sigma^2,\, -\infty,\, u)$ is the truncated normal with upper truncation point $u$, what are the Maximum Likelihood Estimators (MLE) of the parameters $\mu,\,\sigma,\,u$?


In this case we only want to solve the problem where the distribution is truncated from the top but not from the bottom.

Suppose a distribution has probability density function (pdf) $p$. Then its truncated distribution, with lower and upper truncation points $l$ and $u$, has pdf

$$p_{trunc}(x) = \frac{p(x)}{\int^u_lp(x)\mathrm{d}x}$$

where $l \le x \le u$, and $0$ everywhere else.

Intuitively, $\int_{-\infty}^{\infty}p(x)\mathrm{d}x = 1$, but if we truncate the distribution then

  1. the pdf of the truncated distribution must still follow the shape of the original where it isn't truncated; and
  2. it must still integrate to 1 over the range

It's not hard to see then that

$$p_{trunc}(x) = \frac{p(x)}{\int^u_lp(x)\mathrm{d}x}$$

is the only form that satisfies both criteria. In fact, if we let $P$ be the cumulative distribution function (cdf) of the untruncated distribution then the pdf of the truncated distribution can be written as

$$p_{trunc}(x) = \frac{p(x)}{P(u) - P(l)}$$

In this particular case of truncated normals we let $\phi$ and $\Phi$ be the pdf and cdf respectively, and notice that $l = -\infty$ as we only want to truncate from the top, so $\Phi(l) = \Phi(-\infty) = 0$. Therefore the pdf becomes

$$\phi_{trunc}(x) = \frac{\phi(x,\mu,\sigma)}{\Phi(u,\,\mu,\sigma)}$$
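The upper-truncated density is simple to write down in code. The sketch below defines it and checks numerically that it integrates to 1 up to the truncation point. The function names and parameter values are hypothetical.

```python
from statistics import NormalDist


def truncated_normal_pdf(x, mu, sigma, upper):
    """pdf of N(mu, sigma^2) truncated above at `upper` (no lower truncation)."""
    if x > upper:
        return 0.0
    base = NormalDist(mu, sigma)
    return base.pdf(x) / base.cdf(upper)


def trapezoid(f, lo, hi, steps=20_000):
    """Plain trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / steps
    inner = sum(f(lo + k * h) for k in range(1, steps))
    return h * (0.5 * f(lo) + inner + 0.5 * f(hi))


# Hypothetical parameters: truncate one standard deviation above the mean.
mu, sigma, upper = -2.2, 0.1, -2.1
total_mass = trapezoid(
    lambda x: truncated_normal_pdf(x, mu, sigma, upper),
    mu - 10 * sigma,  # far enough into the lower tail to capture all the mass
    upper,
)
print(total_mass)
```

Below the truncation point the truncated density is just the original density scaled up by $1/\Phi(u,\mu,\sigma)$, which is why the shape is preserved.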

The truncation point MLE is always $\max(Q_t)$

The minimum value that $u$ can take is $\max Q_t$, because no observation can lie above the truncation point. Let $\hat{u}$ be the MLE of $u$; we can write

$$\max Q_t \le \hat{u}$$

The cdf $\Phi$ is a monotonically increasing function so we have

$$\Phi(\max Q_t,\,\mu,\,\sigma) \le \Phi(\hat{u},\,\mu,\,\sigma)$$

Taking the reciprocal of both sides yields

$$\frac{1}{\Phi(\max Q_t,\,\mu,\,\sigma)} \ge \frac{1}{\Phi(\hat{u},\,\mu,\,\sigma)}$$

Now multiply both sides by $\phi(Q_i,\mu,\sigma)$, and we see that

$$\frac{\phi(Q_i,\mu,\sigma)}{\Phi(\max Q_t,\,\mu,\,\sigma)} \ge \frac{\phi(Q_i,\mu,\sigma)}{\Phi(\hat{u},\,\mu,\,\sigma)}$$

The right hand side (RHS) is the truncated pdf evaluated at $Q_i$, and it is bounded above by the left hand side for every $i$. Taking the product over $i$, the likelihood is therefore maximized by

$$\hat{u} = \max Q_t$$

as that's where the upper bound on the likelihood (density) is attained. Therefore the MLE optimization problem is a two dimensional one in $\mu$ and $\sigma$.

It's worth noting that this logic can be applied to any truncated distribution, not just the normal. Indeed, the MLE of the upper truncation point is always the maximum of the observed values.

Also note that setting $\hat{u} = \max Q_t$ will lead to higher PDs than any other estimate $\hat{u} \ge \max Q_t$.
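The argument can be illustrated numerically: holding $\mu$ and $\sigma$ fixed, the log-likelihood falls as the candidate truncation point moves above $\max Q_t$, and it is undefined (treated as $-\infty$ here) below it. The data and parameter values in this sketch are hypothetical.

```python
from math import log
from statistics import NormalDist


def log_likelihood(q, mu, sigma, upper):
    """Log-likelihood of an upper-truncated normal for a given truncation point."""
    if upper < max(q):
        return float("-inf")  # an observation above the truncation point is impossible
    base = NormalDist(mu, sigma)
    return sum(log(base.pdf(x)) for x in q) - len(q) * log(base.cdf(upper))


# Hypothetical Q_t values and fixed (mu, sigma), for illustration only.
q = [-2.26, -2.17, -2.29, -2.05, -2.10, -2.23]
candidates = [max(q) + d for d in (0.0, 0.05, 0.1, 0.5)]
values = [log_likelihood(q, -2.2, 0.1, u) for u in candidates]

for u, value in zip(candidates, values):
    print(round(u, 2), value)
```

The first candidate, $u = \max Q_t$, gives the largest value, whatever fixed $\mu$ and $\sigma$ are used.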

Judgemental choice of $u$

A more judgemental way to choose $u$ is to use logic similar to this: "I have 20 years of data, so I have not observed the worst 5% of PDs, so $\Phi(u) = 0.95$".

So if we can estimate $\mu$ and $\sigma$ from the likelihood above then we have an estimate of the LRPD in the truncated normal framework. Ergashev et al have shown that, given a $u$, there is a necessary condition for maximizing the above likelihood, and that $\mu$ and $\sigma$ can then be found by solving a simple root-finding problem (1 line of code to solve in R).

The necessary condition implies something important. There are certain values of $\Phi^{-1}\left(ODR_t\right)$ for which there is no unique solution! TROUBLE!!!

A promising way that I have used in the past to solve this problem is to recognise that the truncation points, $u_{seg1}$ and $u_{seg2}$, for some segments $seg1$ and $seg2$ should satisfy

$$\Phi\left(u_{seg1},\mu_{seg1},\sigma^2_{seg1}\right) = \Phi\left(u_{seg2},\mu_{seg2},\sigma^2_{seg2}\right)$$

that is, the percentile at which they sit should be the same, even though they may have different values! This adds a constraint to the model, which in practice almost always yields unique solutions! Problem SOLVED!!!

Although using normal distributions yields nice analytical solutions, the approach of

  1. fitting a distribution to the ODR via $F^{-1}\left(ODR\right)$ for some function $F$
  2. estimating the truncation point of the distribution
  3. simulating (or numerically integrating) from the whole (untruncated) distribution to recover an estimate of LRPD

should work for any distribution, assuming the real world follows a distribution at the extremes! Whether you believe this is a question for another day.

The optimization problem

Given observations $Q_1,\, Q_2,\, ...,\, Q_n \sim^{iid} \mathcal{N}(\mu,\,\sigma^2,\,-\infty,\, u)$, where $Q_i = \mu - \sigma Z_i$ and $Z_i\sim \mathcal{N}(0,\,1)$, we aim to find parameters $\mu$ and $\sigma$ that maximize this likelihood

$$L = \prod_i\left(\frac{\phi(Q_i,\mu,\sigma)}{\Phi(\max Q_t,\mu,\sigma)}\right)$$

In practice we usually maximize the log likelihood instead

$$\log L = l = \left(\sum_i \log\phi(Q_i,\mu,\sigma)\right) - n\log\Phi(\max Q_t,\mu,\sigma)$$

Numerical solutions in Julia, R, and Python/SciPy

In another blog post, I have coded up solutions to the above problem in Julia, R, and Python/Scipy.

The algorithm from raw data to LRPD

Let $ODR_t$ be the observed default rate at time $t$

  1. Compute $Q_t = \Phi^{-1}\left(ODR_t\right) \,\forall t$
  2. Find the MLE for $\mu$ and $\sigma$ by numerically optimising this likelihood $L = \prod_i\left(\frac{\phi(Q_i,\mu,\sigma)}{\Phi(\max Q_t,\mu,\sigma)}\right)$
  3. Then

$$\hat{LRPD} = \Phi\left(\frac{\hat{\mu}_{MLE}}{\sqrt{1+\hat{\sigma}_{MLE}^2}}\right)$$
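The three steps can be sketched end to end using only the Python standard library. This is not the code from the other blog post: the optimiser here is a simple compass (pattern) search over $(\mu, \log\sigma)$, swapped in to keep the example dependency-free, and the ODR series and all function names are hypothetical. As noted earlier, some data sets have no unique solution, so the search is capped and bounded rather than guaranteed to find a true maximum.

```python
from math import exp, log, pi, sqrt
from statistics import NormalDist, mean, pstdev

STD_NORMAL = NormalDist()


def log_likelihood(q, mu, sigma):
    """Log-likelihood of a normal truncated above at max(q)."""
    upper_mass = NormalDist(mu, sigma).cdf(max(q))
    if upper_mass <= 0.0:
        return float("-inf")
    log_pdf = sum(
        -0.5 * log(2.0 * pi) - log(sigma) - 0.5 * ((x - mu) / sigma) ** 2
        for x in q
    )
    return log_pdf - len(q) * log(upper_mass)


def fit_truncated_normal(q, max_iter=2000, tol=1e-8):
    """Compass search over (mu, log sigma), started at the untruncated estimates."""
    point = [mean(q), log(pstdev(q))]
    best = log_likelihood(q, point[0], exp(point[1]))
    step = 0.5
    for _ in range(max_iter):
        improved = False
        for axis in (0, 1):
            for direction in (1.0, -1.0):
                trial = list(point)
                trial[axis] += direction * step
                if abs(trial[1]) > 20.0:  # keep sigma in a numerically safe range
                    continue
                value = log_likelihood(q, trial[0], exp(trial[1]))
                if value > best:
                    point, best, improved = trial, value, True
        if not improved:
            step /= 2.0  # no better neighbour: refine the search
            if step < tol:
                break
    return point[0], exp(point[1]), best


def truncated_lrpd(odrs):
    """Steps 1 to 3: transform the ODRs, fit (mu, sigma), apply the closed form."""
    q = [STD_NORMAL.inv_cdf(odr) for odr in odrs]
    mu_hat, sigma_hat, _ = fit_truncated_normal(q)
    return STD_NORMAL.cdf(mu_hat / sqrt(1.0 + sigma_hat ** 2))


# Hypothetical annual observed default rates, for illustration only.
odrs = [0.012, 0.015, 0.011, 0.020, 0.018, 0.013, 0.016, 0.014]
lrpd_hat = truncated_lrpd(odrs)
print(lrpd_hat)
```

If the search stops because it hit `max_iter` or the bound on $\log\sigma$, that is a hint the likelihood may not have an interior maximum for the data at hand, in which case a constraint such as the shared-percentile one described earlier would be needed.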

Looking for a R trainer and/or risk modeling consultant?

Don't hesitate to email me at dzj@analytixware.com or visit http://evalparse.io for more information.
