Long run probability of default (LRPD) estimation via truncated normal assumptions
In the Basel II/III capital framework for advanced (A-IRB) banks, we use the so-called "Basel formula" to compute the minimum capital requirements. The formula is based on the Asymptotic Single Risk Factor (ASRF) model, also known as the Vasicek-Merton model. The Long Run Probability of Default (LRPD) is a key component in the computation of minimum capital requirements. This blog post presents a simplified derivation of the PD component of the formula and a novel technique to estimate the LRPD via truncated normal assumptions.
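We have
$$PD = P\left(\sqrt{\rho}\,Z + \sqrt{1-\rho}\,\epsilon < B\right)$$
where $Z \sim N(0,1)$ is the macro-economic factor, $\epsilon$ is the idiosyncratic factor for each PD segment and $\rho$ is the asset correlation, for some barrier $B$. Rearranging the values inside the $P(\cdot)$, we get
$$PD = P\left(\epsilon < \frac{B - \sqrt{\rho}\,Z}{\sqrt{1-\rho}}\right)$$
Recall that $\epsilon \sim N(0,1)$, so the RHS can be written in terms of the CDF of $N(0,1)$, which is $\Phi$, yielding
$$PD = \Phi\left(\frac{B - \sqrt{\rho}\,Z}{\sqrt{1-\rho}}\right)$$
We can approximate $PD$ as the Observed Default Rate, $ODR$, leading to
$$ODR = \Phi\left(\frac{B - \sqrt{\rho}\,Z}{\sqrt{1-\rho}}\right)$$
We also recognise that the part inside $\Phi$ is normal, so we can make variable substitutions and rewrite it using $\mu = B/\sqrt{1-\rho}$ and $\sigma = \sqrt{\rho/(1-\rho)}$, which gives
$$ODR = \Phi(\mu - \sigma Z)$$
We have a different $Z$ for each time point, $t$, so add in the subscript
$$ODR_t = \Phi(\mu - \sigma Z_t)$$
and $\Phi$ is invertible, so apply $\Phi^{-1}$ to both sides
$$\Phi^{-1}(ODR_t) = \mu - \sigma Z_t$$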
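So if the Basel formula is close to describing reality (instead of just being a mathematical convenience), then the above implies that $\Phi^{-1}(ODR_t)$ (which is easy to compute) is normally distributed, $\Phi^{-1}(ODR_t) \sim N(\mu, \sigma^2)$. Hence to work out the LRPD, you just need to estimate $\mu$ and $\sigma$ (for a normal distribution that's easy to do), and perform the following integral (trust me, it's also easy to do as it has an analytical solution). Let $\phi$ be the probability density function (pdf) of $N(0,1)$
$$LRPD = \int_{-\infty}^{\infty} \Phi(\mu - \sigma z)\,\phi(z)\,dz$$
which yields (you have to trust my maths on this, or note that $\Phi(\mu - \sigma z) = P(\epsilon < \mu - \sigma z)$ for an independent $\epsilon \sim N(0,1)$, so the integral is $P(\epsilon + \sigma Z < \mu)$ with $\epsilon + \sigma Z \sim N(0, 1+\sigma^2)$)
$$LRPD = \Phi\left(\frac{\mu}{\sqrt{1+\sigma^2}}\right)$$
Writing the above in the same form as the Basel formulation, with $\mu = B/\sqrt{1-\rho}$ and $\sigma^2 = \rho/(1-\rho)$ so that $1+\sigma^2 = 1/(1-\rho)$, we get
$$LRPD = \Phi(B), \quad \text{i.e.} \quad B = \Phi^{-1}(LRPD)$$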
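The closed form is easy to sanity-check numerically. Below is a minimal sketch using only the Python standard library. The default rates are made-up numbers, the function names are mine, and the sample mean and standard deviation of $\Phi^{-1}(ODR_t)$ stand in for $\mu$ and $\sigma$ under the plain (untruncated) normal assumption.

```python
from statistics import NormalDist, mean, stdev

STD_NORMAL = NormalDist()  # standard normal N(0, 1)


def lrpd_closed_form(mu, sigma):
    """LRPD = Phi(mu / sqrt(1 + sigma^2))."""
    return STD_NORMAL.cdf(mu / (1.0 + sigma * sigma) ** 0.5)


def lrpd_numeric(mu, sigma, n_steps=20_000, z_max=8.0):
    """Midpoint-rule approximation of the integral of Phi(mu - sigma*z) * phi(z) dz."""
    h = 2.0 * z_max / n_steps
    total = 0.0
    for k in range(n_steps):
        z = -z_max + (k + 0.5) * h
        total += STD_NORMAL.cdf(mu - sigma * z) * STD_NORMAL.pdf(z)
    return total * h


# Hypothetical yearly observed default rates, made up for illustration only.
odr = [0.010, 0.012, 0.020, 0.015, 0.008, 0.011, 0.017, 0.013]

# x_t = Phi^{-1}(ODR_t), then the usual sample mean and standard deviation.
x = [STD_NORMAL.inv_cdf(p) for p in odr]
mu_hat, sigma_hat = mean(x), stdev(x)

lrpd = lrpd_closed_form(mu_hat, sigma_hat)
lrpd_check = lrpd_numeric(mu_hat, sigma_hat)
print(f"mu={mu_hat:.4f} sigma={sigma_hat:.4f} LRPD={lrpd:.5f} numeric={lrpd_check:.5f}")
```

The two numbers should agree to several decimal places. Note that the LRPD is not simply $\Phi(\mu)$: the volatility $\sigma$ pulls it up whenever $\mu < 0$.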
Excursion: Relationship to Basel Capital formula
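We started with
$$PD = \Phi\left(\frac{B - \sqrt{\rho}\,Z}{\sqrt{1-\rho}}\right)$$
and as we saw above, $B = \Phi^{-1}(LRPD)$, so substituting that in, we get
$$PD = \Phi\left(\frac{\Phi^{-1}(LRPD) - \sqrt{\rho}\,Z}{\sqrt{1-\rho}}\right)$$
And what if I want the value-at-risk (VaR) at 99.9%? One way is to first compute the 99.9th percentile PD, $PD_{99.9\%}$. To do that, just substitute $Z$ with the $Z$ at the 99.9th percentile of severity, which is $\Phi^{-1}(0.001) = -\Phi^{-1}(0.999)$, and we have
$$PD_{99.9\%} = \Phi\left(\frac{\Phi^{-1}(LRPD) + \sqrt{\rho}\,\Phi^{-1}(0.999)}{\sqrt{1-\rho}}\right)$$
which is the "unexpected loss PD component" in the Basel capital formula.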
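As a quick illustration, here is a small sketch of the conditional PD and its 99.9% stressed version, again with only the Python standard library. The LRPD of 2% and asset correlation of 0.15 are hypothetical inputs chosen for the example, and the function names are mine.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal N(0, 1)


def conditional_pd(lrpd, rho, z):
    """PD given the macro factor z: Phi((Phi^{-1}(LRPD) - sqrt(rho) * z) / sqrt(1 - rho))."""
    barrier = STD_NORMAL.inv_cdf(lrpd)  # B = Phi^{-1}(LRPD)
    return STD_NORMAL.cdf((barrier - rho ** 0.5 * z) / (1.0 - rho) ** 0.5)


def stressed_pd(lrpd, rho, confidence=0.999):
    """PD at the given confidence level, i.e. at the (1 - confidence) quantile of z."""
    z_stress = STD_NORMAL.inv_cdf(1.0 - confidence)  # low z means a bad economy
    return conditional_pd(lrpd, rho, z_stress)


# Hypothetical inputs, for illustration only.
lrpd, rho = 0.02, 0.15
print(f"PD in an average year (z=0): {conditional_pd(lrpd, rho, 0.0):.4f}")
print(f"PD at 99.9%:                 {stressed_pd(lrpd, rho):.4f}")
```

The PD at $Z = 0$ comes out below the LRPD because the LRPD is an average over all states of the economy, and the bad states contribute disproportionately.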
LRPD estimation using truncated normals
Back to the main topic, we have shown that the LRPD can be expressed simply using
- the mean, $\mu$, and the
- standard deviation, $\sigma$, of $\Phi^{-1}(ODR_t)$
by our interpretation of the Basel formulation of PD:
$$LRPD = \Phi\left(\frac{\mu}{\sqrt{1+\sigma^2}}\right)$$
However, in the Australian context, most (if not all) banks and financial institutions don't have data from the severe recession (the one "we had to have") of the early 1990s. Therefore we can't simply take a straight average of the data we do have, as that is likely to be a biased estimate.
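In this blog post, we discuss one way to overcome the bias, which is to assume that $Z_t$ comes from a truncated normal. Basically, we assume that a proportion of the $Z$'s that can lead to high PDs are not available to be sampled. Below is a plot of the pdf of a truncated normal distribution.
If we assume that $Z_t$ is from a truncated normal, then $x_t = \Phi^{-1}(ODR_t) = \mu - \sigma Z_t$ is also a truncated normal and there is a need to estimate the truncation points. One can use MLE to estimate the truncation points. To do that we write the likelihood, $L$, for observing the data, as below
$$L(\mu, \sigma, a, b) = \prod_{t} \frac{\frac{1}{\sigma}\phi\left(\frac{x_t - \mu}{\sigma}\right)}{\Phi\left(\frac{b - \mu}{\sigma}\right) - \Phi\left(\frac{a - \mu}{\sigma}\right)}$$
where $b$ and $a$ are the upper and lower truncation points of $x_t$, and $\phi$ and $\Phi$ are the standard normal pdf and cdf. In capital modelling, conservative assumptions are fine, so we can assume that $Z_t$ is sampled from a distribution where some of the worst $Z$'s are truncated (recall that lower $Z$ means higher PD) but all the best $Z$'s can be sampled. Since $x_t$ decreases as $Z_t$ increases, the worst $Z$'s are the highest $x$'s, so this conservatism can be achieved by setting $a = -\infty$, yielding
$$L(\mu, \sigma, b) = \prod_{t} \frac{\frac{1}{\sigma}\phi\left(\frac{x_t - \mu}{\sigma}\right)}{\Phi\left(\frac{b - \mu}{\sigma}\right)}$$
It can be proved (definitely TRUE!!! but not intuitive to many) that
$$\hat{b} = \max_t x_t$$
I have provided a sketch of the proof in a later section.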
Problem formulation: MLE of truncated normal parameters
The problem formulation is this: given $x_1, \dots, x_n \sim TN(\mu, \sigma, b)$, where $TN(\mu, \sigma, b)$ is the truncated normal with upper truncation point $b$, what are the Maximum Likelihood Estimators (MLE) of the parameters $(\mu, \sigma, b)$?
In this case we only want to solve the problem where the distribution is truncated from the top but not from the bottom.
Suppose a distribution has probability density function (pdf) $f(x)$. Then its truncated distribution, with lower and upper truncation points $a$ and $b$, has pdf
$$g(x) = \frac{f(x)}{\int_a^b f(u)\,du}$$
where $a \le x \le b$, and it should be $0$ everywhere else.
Intuitively, $\int_{-\infty}^{\infty} f(x)\,dx = 1$, but if we truncate the distribution then
- the pdf of the truncation must still follow the shape of the original where it isn't truncated; and
- it must still integrate to 1 over the range $[a, b]$
It's not hard to see then that
$$g(x) = \frac{f(x)}{\int_a^b f(u)\,du}$$
is the only form that satisfies both criteria. In fact, if we let $F$ be the cumulative distribution function (cdf) of the untruncated distribution, then the pdf of the truncated distribution can be written as
$$g(x) = \frac{f(x)}{F(b) - F(a)}$$
In this particular case of truncated normals, we let $\phi$ and $\Phi$ be the standard normal pdf and cdf respectively, and notice that $a = -\infty$ as we only want to truncate from the top, and so $F(a) = 0$. Therefore the pdf becomes
$$g(x; \mu, \sigma, b) = \frac{\frac{1}{\sigma}\phi\left(\frac{x - \mu}{\sigma}\right)}{\Phi\left(\frac{b - \mu}{\sigma}\right)}, \quad x \le b$$
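To make the two criteria concrete, here is a minimal sketch of the upper-truncated normal pdf in Python (standard library only), together with a crude numerical check that it integrates to 1. The parameter values are hypothetical and the function name is mine.

```python
from statistics import NormalDist


def upper_truncated_normal_pdf(x, mu, sigma, b):
    """pdf of N(mu, sigma^2) truncated from above at b (no lower truncation)."""
    if x > b:
        return 0.0  # no mass above the truncation point
    untruncated = NormalDist(mu, sigma)
    # Same shape as the original pdf, rescaled by the mass that survives truncation.
    return untruncated.pdf(x) / untruncated.cdf(b)


# Hypothetical parameters, for illustration only.
mu, sigma, b = -2.2, 0.15, -2.05

# Midpoint-rule check that the truncated pdf integrates to 1 over (-inf, b].
# The lower limit mu - 10*sigma stands in for -inf.
lower = mu - 10.0 * sigma
n_steps = 20_000
h = (b - lower) / n_steps
area = sum(
    upper_truncated_normal_pdf(lower + (k + 0.5) * h, mu, sigma, b)
    for k in range(n_steps)
) * h
print(f"area under the truncated pdf: {area:.6f}")
```

Below $b$ the truncated density is everywhere higher than the untruncated one, because the mass that used to sit above $b$ has been spread proportionally over the rest.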
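The truncation point MLE is always $\max_i x_i$
The minimum value that $b$ can take is $\max_i x_i$, since no observation can lie above the truncation point. Let $\hat{b}$ be the MLE estimator for $b$; we can write
$$\hat{b} \ge \max_i x_i$$
The cdf is a monotonically increasing function, so we have
$$\Phi\left(\frac{\hat{b} - \mu}{\sigma}\right) \ge \Phi\left(\frac{\max_i x_i - \mu}{\sigma}\right)$$
Taking the reciprocal of both sides yields
$$\frac{1}{\Phi\left(\frac{\hat{b} - \mu}{\sigma}\right)} \le \frac{1}{\Phi\left(\frac{\max_i x_i - \mu}{\sigma}\right)}$$
Now raise both sides to the power $n$ and multiply both sides by $\prod_{i=1}^{n} \frac{1}{\sigma}\phi\left(\frac{x_i - \mu}{\sigma}\right)$, and we see that
$$\prod_{i=1}^{n} \frac{\frac{1}{\sigma}\phi\left(\frac{x_i - \mu}{\sigma}\right)}{\Phi\left(\frac{\hat{b} - \mu}{\sigma}\right)} \le \prod_{i=1}^{n} \frac{\frac{1}{\sigma}\phi\left(\frac{x_i - \mu}{\sigma}\right)}{\Phi\left(\frac{\max_i x_i - \mu}{\sigma}\right)}$$
The right hand side (RHS) now has the form of the likelihood (the truncated pdf) evaluated at $b = \max_i x_i$, and a bit of thinking should lead us to the conclusion that to maximize the likelihood we need to have
$$\hat{b} = \max_i x_i$$
as that's where the maximum cap of the likelihood (density) is. Therefore the MLE optimization problem is a two-dimensional one in $\mu$ and $\sigma$.
It's worth noting that this logic can be applied to any truncated distribution, not just the normal. Indeed, the MLE of the upper truncation point is always the maximum of the observed values.
Also note that setting $\hat{b} = \max_i x_i$ will lead to higher PDs than any other estimate of $b$ given the observed $x_1, \dots, x_n$.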
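The argument can be checked numerically: hold $\mu$ and $\sigma$ fixed and watch the log likelihood fall as the truncation point moves above the largest observation. A minimal sketch with the Python standard library follows. The sample and the parameter values are made up for illustration, and the function name is mine.

```python
from math import log
from statistics import NormalDist


def log_likelihood(xs, mu, sigma, b):
    """Log likelihood of the sample under N(mu, sigma^2) truncated from above at b."""
    if b < max(xs):
        return float("-inf")  # an observation above b has zero density
    dist = NormalDist(mu, sigma)
    return sum(log(dist.pdf(x)) for x in xs) - len(xs) * log(dist.cdf(b))


# Hypothetical values of Phi^{-1}(ODR_t), for illustration only.
xs = [-2.31, -2.25, -2.12, -2.19, -2.36, -2.28, -2.05, -2.22]
mu, sigma = -2.2, 0.15  # held fixed; only b varies
b_hat = max(xs)

for b in (b_hat - 0.01, b_hat, b_hat + 0.1, b_hat + 0.5):
    print(f"b = {b:6.2f}   log likelihood = {log_likelihood(xs, mu, sigma, b):.4f}")
```

Any $b$ below the sample maximum is impossible, and every $b$ above it only shrinks the likelihood, so the maximum sits exactly at $\max_i x_i$ whatever $\mu$ and $\sigma$ are.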
|Judgemental choice of $b$|
|A more judgemental way to choose $b$ is to use logic similar to this: "I have 20 years of data, so I have not observed the worst 5% of PDs, so $\Phi\left(\frac{b - \mu}{\sigma}\right) = 0.95$".|
So if we can estimate $\mu$ and $\sigma$ from the equation above, then we have an estimate of the LRPD in the truncated normal framework. Ergashev et al. have shown that, given a truncation point $b$, there is a necessary condition for maximizing the above likelihood, and the solution for $\mu$ and $\sigma$ is the solution to a simple root-finding problem (1 line of code to solve in R).
The necessary condition implies something important. There are certain values of $b$ for which there is no unique solution! TROUBLE!!!
A promising way that I have used in the past to solve this problem is to recognise that the truncation points, $b_i$ and $b_j$, for some segments $i$ and $j$ should satisfy
$$\Phi\left(\frac{b_i - \mu_i}{\sigma_i}\right) = \Phi\left(\frac{b_j - \mu_j}{\sigma_j}\right)$$
that is, the percentile at which they sit should be the same, even though they may have different values! This adds a constraint to the model, which in practice almost always yields unique solutions! Problem SOLVED!!!
Although using normal distributions yields nice analytical solutions, the approach of
- fitting a distribution to the ODR via $h(ODR)$ for some function $h$
- estimating the truncation point of the distribution
- simulating (or numerically integrating) from the whole (untruncated) distribution to recover an estimate of LRPD
should work for any distribution, assuming the real world follows that distribution at the extremes! Whether you believe this is a question for another day.
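The last step of that recipe can be sketched as a small Monte Carlo exercise. The example below uses the normal case with $h = \Phi^{-1}$ so that the simulated answer can be compared with the closed form. The fitted parameters are hypothetical, and for another distribution or another $h$ you would swap the sampling line and the inverse-transform line.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal N(0, 1)


def lrpd_by_simulation(mu, sigma, n_draws=100_000, seed=42):
    """Average PD over draws from the whole (untruncated) fitted distribution."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    total = 0.0
    for _ in range(n_draws):
        x = rng.gauss(mu, sigma)      # a draw of h(ODR) from the untruncated fit
        total += STD_NORMAL.cdf(x)    # invert h = Phi^{-1} to get back to a PD
    return total / n_draws


# Hypothetical fitted parameters, for illustration only.
mu_hat, sigma_hat = -2.2, 0.2

simulated = lrpd_by_simulation(mu_hat, sigma_hat)
analytic = STD_NORMAL.cdf(mu_hat / (1.0 + sigma_hat ** 2) ** 0.5)
print(f"simulated LRPD = {simulated:.5f}, closed form = {analytic:.5f}")
```

With 100,000 draws the simulated value should sit within a few hundredths of a percentage point of the closed form.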
The optimization problem
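Given observations $x_1, \dots, x_n$, where $x_i = \Phi^{-1}(ODR_i)$ and $b = \max_i x_i$, we aim to find parameters $\mu$ and $\sigma$ that optimize this likelihood
$$L(\mu, \sigma) = \prod_{i=1}^{n} \frac{\frac{1}{\sigma}\phi\left(\frac{x_i - \mu}{\sigma}\right)}{\Phi\left(\frac{b - \mu}{\sigma}\right)}$$
Often we try to optimize the log likelihood
$$\ell(\mu, \sigma) = \sum_{i=1}^{n} \log\phi\left(\frac{x_i - \mu}{\sigma}\right) - n\log\sigma - n\log\Phi\left(\frac{b - \mu}{\sigma}\right)$$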
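As a sketch, the log likelihood can be coded up as a negative log likelihood, which is the form most numerical optimizers expect. This version uses only the Python standard library. Working with $\log\sigma$ rather than $\sigma$ is my own choice to keep $\sigma$ positive, and the sample values are made up.

```python
from math import exp, inf, log, pi
from statistics import NormalDist


def neg_log_likelihood(mu, log_sigma, xs):
    """Negative log likelihood of an upper-truncated normal with b = max(xs)."""
    sigma = exp(log_sigma)  # optimise log(sigma) so that sigma stays positive
    b = max(xs)             # MLE of the truncation point
    cdf_b = NormalDist(mu, sigma).cdf(b)
    if cdf_b <= 0.0:        # guard against log(0) for extreme parameters
        return inf
    log_pdf_sum = 0.0
    for x in xs:
        z = (x - mu) / sigma
        # log of (1/sigma) * phi(z), written out to avoid underflow in phi
        log_pdf_sum += -0.5 * log(2.0 * pi) - log_sigma - 0.5 * z * z
    return -(log_pdf_sum - len(xs) * log(cdf_b))


# Hypothetical values of Phi^{-1}(ODR_t), for illustration only.
xs = [-2.31, -2.25, -2.12, -2.19, -2.36, -2.28, -2.05, -2.22]
value = neg_log_likelihood(-2.2, log(0.15), xs)
print(f"negative log likelihood at mu=-2.2, sigma=0.15: {value:.4f}")
```

Minimizing this function over `mu` and `log_sigma` is the two-dimensional problem described above.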
Numerical solutions in Julia, R, and Python/SciPy
The algorithm from raw data to LRPD
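Let $ODR_t$ be the observed default rate at time $t$
- Compute $x_t = \Phi^{-1}(ODR_t)$ and set $b = \max_t x_t$
- Find the MLE for $\mu$ and $\sigma$ by numerically optimising this likelihood
$$L(\mu, \sigma) = \prod_{t} \frac{\frac{1}{\sigma}\phi\left(\frac{x_t - \mu}{\sigma}\right)}{\Phi\left(\frac{b - \mu}{\sigma}\right)}$$
- Compute $LRPD = \Phi\left(\frac{\hat{\mu}}{\sqrt{1 + \hat{\sigma}^2}}\right)$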
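Putting the pieces together, here is a rough end-to-end sketch from raw default rates to an LRPD. It is not the Julia, R or SciPy code referred to above. To stay within the Python standard library it uses a crude compass search of my own as a stand-in for a proper optimizer, and the 20 yearly default rates are made up. As discussed earlier, the fit is not guaranteed to be unique for every data set, so treat the output with care.

```python
from math import exp, inf, log, pi
from statistics import NormalDist, mean, stdev

STD_NORMAL = NormalDist()  # standard normal N(0, 1)


def neg_log_likelihood(params, xs, b):
    """Negative log likelihood of N(mu, sigma^2) truncated from above at b."""
    mu, log_sigma = params
    if abs(log_sigma) > 20.0:  # keep sigma in a numerically safe range
        return inf
    sigma = exp(log_sigma)
    cdf_b = NormalDist(mu, sigma).cdf(b)
    if cdf_b <= 0.0:
        return inf
    total = 0.0
    for x in xs:
        z = (x - mu) / sigma
        total += -0.5 * log(2.0 * pi) - log_sigma - 0.5 * z * z
    return -(total - len(xs) * log(cdf_b))


def compass_search(objective, start, step=0.5, tol=1e-8, max_iter=5_000):
    """Crude derivative-free minimiser: try +/- step per coordinate, halve step when stuck."""
    point, best = list(start), objective(start)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(len(point)):
            for delta in (step, -step):
                candidate = list(point)
                candidate[i] += delta
                value = objective(candidate)
                if value < best:  # only ever accept improvements
                    point, best, improved = candidate, value, True
        if not improved:
            step *= 0.5
    return point, best


def lrpd_truncated_normal(odr_series):
    """Return (LRPD, mu, sigma, b) under the upper-truncated normal assumption."""
    xs = [STD_NORMAL.inv_cdf(p) for p in odr_series]  # x_t = Phi^{-1}(ODR_t)
    b = max(xs)                                       # MLE of the truncation point
    start = [mean(xs), log(stdev(xs))]                # untruncated estimates as a start
    (mu, log_sigma), _ = compass_search(lambda p: neg_log_likelihood(p, xs, b), start)
    sigma = exp(log_sigma)
    return STD_NORMAL.cdf(mu / (1.0 + sigma * sigma) ** 0.5), mu, sigma, b


# Hypothetical yearly observed default rates, made up for illustration only.
odr = [0.0105, 0.0120, 0.0098, 0.0142, 0.0165, 0.0131, 0.0112, 0.0089, 0.0150, 0.0178,
       0.0125, 0.0101, 0.0137, 0.0116, 0.0158, 0.0093, 0.0128, 0.0145, 0.0109, 0.0171]

lrpd, mu_fit, sigma_fit, b = lrpd_truncated_normal(odr)
print(f"mu={mu_fit:.4f} sigma={sigma_fit:.4f} b={b:.4f} LRPD={lrpd:.5f}")
```

In real work I would swap `compass_search` for a tested optimizer and compare the result with the plain untruncated estimate to see how much the truncation assumption moves the LRPD.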
Looking for an R trainer and/or risk modeling consultant?
Don't hesitate to email me at firstname.lastname@example.org or visit http://evalparse.io for more information.