Abstract
This paper argues that the incorrect application of mathematical models, particularly the Gaussian copula, played a critical role in the 2007-2008 financial crisis. The misinterpretation of default correlation theory led to a systemic underestimation of credit risk, exacerbating the collapse of mortgage-backed securities and collateralized debt obligations. Using a detailed mathematical analysis, this study examines how financial institutions misapplied risk assessment models and how flaws in risk management within the insurance sector contributed to market instability. Key findings indicate that reliance on oversimplified correlation assumptions produced a fragile financial system, ultimately triggering widespread economic fallout. The paper highlights the necessity of refining risk models to better capture financial dependencies and prevent future crises.
1. Introduction
As remarked by [1], there is no single factor that is solely responsible for the crisis that unfolded in 2007-2008, although it is acknowledged that the main issue was the transfer of the risk of mortgage default between two parties: mortgage lenders on one hand and the banks, hedge funds, and insurance companies on the other. This happened through a process called securitization, which is undertaken because institutions usually seek to reduce their costs as well as their tax obligations. The paper provides an overview of how CDOs, hazard rates, copulas, and Markov chains interacted to cause the crisis.
1.1. Creation of Collateralized Debt Obligations
Using the terminology in [1], a Collateralized Debt Obligation (CDO) is a product that is arranged by pooling loans or other assets. Familiar debts such as student loans, auto loans, and margin loans can all become part of a CDO. The process of creating a CDO is simple. Banks pool existing debts, such as mortgages, auto loans, and corporate debt, and restructure them into CDOs. These securities are then divided into different tranches to attract investors with varying risk appetites. CDOs do not create new financial assets; they simply repackage existing loans into structured financial products. Rating agencies assigned AAA ratings to senior tranches under the assumption that mortgage defaults were largely uncorrelated. However, this model severely underestimated the risk that defaults could cluster during a housing downturn, leading to a collapse in tranche values. There are benefits to both issuing and purchasing a CDO. One of the biggest attractions for borrowers is that credit becomes available to the majority of consumers, who receive the needed funds immediately and may also improve their credit standing. A CDO purchase affects not only the consumer but has an even larger impact on the wider economy: the funds the banks receive are used to create other assets and provide liquidity for the financial market.
However, when purchasing a CDO, the investor should be cautious and examine the liquidity, structural, rating, credit, and other risks. Most commonly, people lack an understanding of what a CDO actually is and how much damage it can cause if not properly handled. This lack of understanding was exacerbated by financial models, which assumed mortgage defaults were independent. In reality, defaults became highly correlated when the housing market declined, leading to mispricing of risk and massive losses for investors. The risk of a CDO is often overlooked because its complexity makes it difficult for investors to understand. A CDO can be seen as a complicated box filled with matching pieces: in many cases investors do not know or fully understand what is inside the box, which can lead to a higher risk than they anticipated. Liquidity risk is the risk that a liquid asset becomes illiquid, meaning the asset has decreased in value and no purchaser can be found. Structural risk is the risk associated with how the pieces fit together: the various tranches create different cash flows. Tranches are structured based on credit seniority, not just payment order. Senior tranches were considered safe because they were supposed to absorb losses last, but this assumption failed when default correlation spiked, causing highly rated tranches to collapse in value. Most assets also carry rating risk, the risk that the assigned rating misstates the credit quality of the company or security, and these ratings proved inaccurate much of the time.
1.2 Case Studies: Institutional Collapse and CDO Pricing Failures
The financial crisis of 2008 was driven by a combination of flawed pricing models, misjudged risk correlations, and regulatory oversight failures. This section examines key case studies, including Merrill Lynch’s Norma CDO, AIG’s mispriced CDS contracts, and pre-crisis CDO pricing models, using findings from the Financial Crisis Inquiry Commission (FCIC) report.
First, let us consider Merrill Lynch's Norma CDO, which was issued in 2007 and is considered a prime example of how pre-crisis CDOs relied on incorrect correlation assumptions. The Norma CDO was a synthetic CDO-squared, which means that it contained tranches of other synthetic CDOs, which themselves held credit default swaps (CDS) on subprime mortgage-backed securities (MBS). There were two key model flaws. The first is the Gaussian copula model assumptions. The pricing of the Norma CDO assumed low correlation among defaults within the underlying mortgage-backed securities. However, as the housing market collapsed, defaults became highly correlated, making senior tranches much riskier than models had predicted. The second issue was the tranche ratings versus the realized defaults. The CDO was rated AAA by Moody's and S&P, despite being built on highly unstable subprime loans. According to the FCIC, over 90% of the AAA-rated mortgage-backed securities from 2006 and 2007 were downgraded to junk status by 2008. The rapid deterioration of the Norma CDO's tranches exposed the flaws in the Gaussian copula model, particularly its failure to account for systematic shocks and extreme tail risk.
Next we have AIG's mispriced CDS contracts, which gave the illusion of risk protection. AIG Financial Products (AIGFP) sold credit default swaps (CDS) on senior tranches of CDOs and mortgage-backed securities (MBS), believing that the probability of default was near zero. This belief was based on flawed pricing models and risk assumptions. Among these assumptions, the most important was an underestimation of the probability of default: AIG priced CDS protection on super-senior CDO tranches with extremely low premiums, assuming defaults across different MBS tranches would remain uncorrelated. In addition, AIG ignored feedback loops; its models did not account for the self-reinforcing nature of the market, in which rising defaults increased mark-to-market losses, which forced collateral calls, which in turn reduced liquidity. There were also regulatory blind spots: the FCIC report found that AIG was not required to hold sufficient capital against these CDS contracts, relying instead on internal risk models that underestimated systemic risk.
Finally there was the issue of flawed CDO pricing models. Pre-crisis CDO models systematically failed due to misplaced confidence in default correlation assumptions and risk diversification. For instance, David Li's Gaussian copula model, widely used to price CDOs, assumed that default correlations were stable over time, meaning that the probability of simultaneous defaults remained low. It was also assumed that historical data from before 2005 was applicable to 2006-2007 subprime loans, despite weaker lending standards and higher loan-to-value ratios in the later mortgages. It turns out that realized defaults in 2007-2008 were significantly higher than modeled expectations. Moody's 2005 model predicted a worst-case scenario of 5% default rates on subprime mortgages; by 2008, the actual rate exceeded 20%. CDO tranche correlations, assumed to be around 0.2 to 0.3, surged to nearly 1.0 as the housing crisis deepened, leaving AAA-rated tranches vulnerable.
In addition, there were issues related to Markov chain models and KMV EDF (Expected Default Frequency) misestimations. These models underestimated systemic risk by treating firms' defaults as independent events. Markov chain based credit risk models, which predicted default probabilities from historical transition matrices, failed to account for the clustering of mortgage defaults.
Finally we come to the regulatory reports, which can be used to tie model failures to institutional collapses. The FCIC report of 2011 highlights several regulatory failures that exacerbated these model miscalculations. First, the SEC failed to enforce stronger oversight of rating agencies, allowing conflicted incentives to persist, since banks paid the rating agencies for their ratings. Basel II capital regulations incentivized banks to hold AAA-rated CDO tranches, reinforcing the mispricing of risk. The Federal Reserve and the Office of the Comptroller of the Currency (OCC) overlooked the risk concentration in AIG's CDS exposure, leading to a massive government bailout.
1.3. Securitization
While there are various causes of this crisis, the most impactful was the failure of securitization, the act of arranging and organizing loans and mortgages to create profit in the capital market. The pool of assets arranged into a package consists of consumer loans, debts, mortgages, and other illiquid assets. Securitization allowed banks to convert illiquid assets into tradable securities, providing liquidity while offloading risk onto investors. However, this process encouraged excessive lending, as banks no longer bore the consequences of risky mortgage issuance. When constructing the portfolios, the banks consider tranches, the different sections composed of different types of assets. The portfolios then become accessible to the public, attracting investors with a fixed rate of return: the money generated from the assets within the portfolio.
What role did securitization play in causing the mortgage crisis? During this time banks formed risky portfolios such as CDOs and MBS (mortgage-backed securities) which were misrated, leading people to view them as reliable and safe investments. These securities were misrated because risk models underestimated the correlation between mortgage defaults. Investors assumed diversification protected them, but when home prices fell, defaults surged across all mortgage pools, exposing systemic weaknesses. Rating agencies, pressured by the investment banks that paid for their ratings, assigned AAA grades to high-risk securities. This over-reliance on flawed mathematical models, such as Gaussian copulas, led investors to believe these assets were safe when they were actually highly volatile. As housing prices declined and interest rates increased, people started defaulting on their mortgages. Since banks and hedge funds had leveraged themselves heavily with mortgage-backed securities, the sharp rise in defaults triggered margin calls and fire sales. With no buyers for the toxic assets, major hedge funds collapsed, freezing credit markets and intensifying the financial crisis. The banks that had bought these securities fell into bankruptcy, deepening the mortgage crisis of 2008.
1.4. Hazard Rate Functions
In financial modeling, hazard rate functions were widely used to estimate the probability of mortgage default over time. However, these models incorrectly assumed that defaults occurred independently and followed a stable trend, which failed catastrophically when housing prices collapsed. Following [2], suppose $X$ is a discrete random variable assuming values in $\{x_0, x_1, x_2, \dots\}$ with probability mass function $p(x_k) = P(X = x_k)$ and survival function $S(x_k) = P(X > x_k)$. We can think of $X$ as the random lifetime of a device that can fail only at times in $\{x_0, x_1, x_2, \dots\}$. The hazard rate function of $X$ is defined as:

$$h(x_k) = P(X = x_k \mid X \ge x_k) = \frac{p(x_k)}{S(x_{k-1})} \qquad (1)$$

at points for which $S(x_{k-1}) > 0$. The hazard rate is also called the failure rate or intensity function. If $X$ has a finite support $\{x_0, x_1, \dots, x_n\}$, then $h(x_n) = 1$.

So given $h$ we can determine $p$ or $S$. Here is how:

$$S(x_k) = \prod_{j=0}^{k} \big(1 - h(x_j)\big), \qquad p(x_k) = h(x_k) \prod_{j=0}^{k-1} \big(1 - h(x_j)\big) \qquad (2)$$

Here we assume that $0 \le h(x_k) \le 1$. If $h(x_k) = 1$ for some $k$, then $S(x_j) = 0$ for all $j \ge k$.

This form of $p$ shows that we can use the hazard rate to model a life distribution. In this regard we have the following result:

Theorem 1.1. A necessary and sufficient condition that $h$ is the hazard rate function of a distribution with support $\{x_0, x_1, x_2, \dots\}$ is that $0 \le h(x_k) < 1$ for $k = 0, 1, 2, \dots$ and $\sum_{k=0}^{\infty} h(x_k) = \infty$.

In this case the probability mass function is $p(x_k) = h(x_k) \prod_{j=0}^{k-1} \big(1 - h(x_j)\big)$.
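To make the relationship in (2) concrete, the short sketch below computes the survival function and probability mass function implied by a given sequence of discrete hazard rates. It is an illustration only; the hazard values are hypothetical and not taken from any source cited in this paper.

```python
# Minimal sketch: recover S(x_k) and p(x_k) from discrete hazard rates h(x_k).
# The hazard values below are hypothetical and purely illustrative.

def pmf_and_survival_from_hazard(hazards):
    """Given h(x_0), h(x_1), ..., return the lists p(x_k) and S(x_k)."""
    pmf, survival = [], []
    surviving = 1.0  # probability of surviving past the previous point
    for h in hazards:
        p_k = h * surviving          # p(x_k) = h(x_k) * prod_{j<k} (1 - h(x_j))
        surviving *= (1.0 - h)       # S(x_k) = prod_{j<=k} (1 - h(x_j))
        pmf.append(p_k)
        survival.append(surviving)
    return pmf, survival

if __name__ == "__main__":
    hazards = [0.02, 0.03, 0.05, 0.08, 1.0]   # finite support: the last hazard equals 1
    pmf, survival = pmf_and_survival_from_hazard(hazards)
    print("p(x_k):", [round(p, 4) for p in pmf])
    print("S(x_k):", [round(s, 4) for s in survival])
    print("total mass:", round(sum(pmf), 10))  # equals 1 when the last hazard is 1
```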
Associated with this idea is the concept of a random variable called the time until default, also known as the survival time. This will apply to a security for instance. In order to define the time until default, we need a clearly defined origin of time, a time scale for measuring time and what actually defines a default. The time origin is defined to be the current time, the time scale is the years used for continuous models, or number of periods, and the default is defined by credit rating agencies such as Moody’s.
The survival function defined earlier gives the probability that the security will attain age $t$. Suppose we have an existing security A. The time until default, $T$, is a continuous random variable that measures the length of time from today until the point in time when default occurs. Suppose $F(t)$ represents the distribution function of $T$, i.e.

$$F(t) = P(T \le t), \qquad t \ge 0.$$

So here $S(t) = 1 - F(t) = P(T > t)$, with the assumption that $F(0) = 0$. This is reasonable since every security exists at time $0$. The probability density function is defined as:

$$f(t) = F'(t) = -S'(t) = \lim_{\Delta \to 0^+} \frac{P(t \le T < t + \Delta)}{\Delta} \qquad (3)$$

For a security that has survived $x$ years, the future lifetime for that security is $T - x$, as long as $T > x$.

The following notations are used in this regard:

${}_t q_x$ is understood as the probability that the security A will default within the next $t$ years, given that it survives for $x$ years; ${}_t p_x = 1 - {}_t q_x$ is the corresponding survival probability. For $x = 0$, ${}_t p_0 = S(t)$.

For $t = 1$, one writes $q_x$ and likewise $p_x$ is defined.

$q_x$ is termed the marginal default probability. This is the probability of default in the next year, conditional on the survival until the beginning of the year.

Another way to define the hazard rate function is as the ratio $f(x) / \big(1 - F(x)\big)$. The hazard rate function gives the instantaneous probability of default for a security that is of age $x$. To write this another way:

$$h(x) = \lim_{\Delta \to 0^+} \frac{P(x < T \le x + \Delta \mid T > x)}{\Delta} = \frac{f(x)}{S(x)} \qquad (4)$$

From here, $h(x) = -S'(x)/S(x) = -\frac{d}{dx}\ln S(x)$.

From here,

$$S(t) = \exp\left(-\int_0^t h(s)\, ds\right) \qquad (5)$$

Now, ${}_t p_x = \exp\left(-\int_x^{x+t} h(s)\, ds\right)$ and ${}_t q_x = 1 - {}_t p_x$. Suppose the hazard rate is a constant $h$ over a period of length $t$. If this is true then ${}_t p_x = e^{-ht}$. This means that the survival time follows an exponential distribution with parameter $h$. Hazard rates failed to account for contagion effects, where one default could trigger many others. As interest rates rose and home prices fell, borrowers defaulted, revealing that hazard rate models had significantly underestimated the likelihood of simultaneous defaults, leading to severe mispricing of CDOs. The survival probability over the interval $[x, x+t]$ for a constant hazard rate $h$ is given by:

$$e^{-ht}.$$

The crucial idea here is that the mathematical model of a default process is similar to modeling a hazard function. Li (2000) gives several reasons for such an assumption:
- We get information on the immediate default risk of each entity which is known to be alive at time $t$
- Group comparisons are easier with hazard rate functions
- Hazard rate functions can be adapted to cases of stochastic default fluctuations
- Hazard rate functions are similar to short rate models in the context of interest rate derivatives

With this assumption, the joint survival function of two entities A and B, with survival times $T_A$ and $T_B$, is given by $S_{T_A T_B}(s, t) = P(T_A > s, T_B > t)$. The joint distribution function is:

$$F(s, t) = P(T_A \le s, T_B \le t) = 1 - S_{T_A}(s) - S_{T_B}(t) + S_{T_A T_B}(s, t) \qquad (6)$$

With this background, we can define the default correlation of two entities A and B with respect to their survival times $T_A$ and $T_B$:

$$\rho_{AB} = \frac{\operatorname{Cov}(T_A, T_B)}{\sqrt{\operatorname{Var}(T_A)\operatorname{Var}(T_B)}} \qquad (7)$$

This is also called the survival time correlation. It is more general than the discrete default correlation, which depends on a single period. The discrete default correlation can be written as follows. Suppose $F(s, t)$ represents the joint distribution of the two survival times $T_A$ and $T_B$; let $q_A = P(T_A < 1)$, $q_B = P(T_B < 1)$ and $q_{AB} = P(T_A < 1, T_B < 1)$. Then the discrete default correlation is defined as

$$\rho^{d}_{AB} = \frac{q_{AB} - q_A q_B}{\sqrt{q_A(1 - q_A)\, q_B(1 - q_B)}}.$$
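As a numerical illustration of the discrete default correlation just defined, the sketch below computes it from marginal one-year default probabilities and a joint default probability. All inputs are hypothetical.

```python
import math

def discrete_default_correlation(q_a, q_b, q_ab):
    """(q_AB - q_A q_B) / sqrt(q_A (1 - q_A) q_B (1 - q_B))."""
    return (q_ab - q_a * q_b) / math.sqrt(q_a * (1 - q_a) * q_b * (1 - q_b))

if __name__ == "__main__":
    q_a, q_b = 0.05, 0.04            # hypothetical one-year default probabilities
    q_ab_independent = q_a * q_b     # joint default probability if defaults were independent
    q_ab_clustered = 0.015           # hypothetical clustered joint default probability
    print(discrete_default_correlation(q_a, q_b, q_ab_independent))  # 0.0
    print(discrete_default_correlation(q_a, q_b, q_ab_clustered))    # strongly positive
```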
2. How insurance theory works
Following [3], an insurance system is a mechanism for reducing the adverse financial impact of random events that prevent the fulfillment of reasonable expectations.
Insurance has a mathematical construct. Let us look at an example. Suppose a decision maker has wealth $w$ and faces a loss in the next period. We use a random variable, $X$, to model this loss. Suppose an insurance contract pays $I(x)$ for the loss of $x$. Since the contract is feasible, $0 \le I(x) \le x$. It is assumed that all feasible contracts with $E[I(X)] = \beta$ can be purchased for the same price $P$. The decision maker has a utility function $u$ and is risk averse so that $u''(w) < 0$, and has decided on the value of the premium $P$.

Traditional insurance pricing methods failed to capture the systemic risks of CDS, as issuers assumed that historical default patterns would continue unchanged. This led to a dangerous underpricing of risk, ultimately contributing to the 2008 crisis. With this setting, the question is: which insurance contract must be purchased to maximize the expected utility of the decision maker, given the wealth $w$, the distribution of the loss $X$, and the premium $P$ to be paid?

A typical insurance contract only pays out when the loss amount is above a deductible amount $d \ge 0$:

$$I_d(x) = \begin{cases} 0, & x < d \\ x - d, & x \ge d \end{cases} \qquad (8)$$

This type of insurance is called stop-loss or excess-of-loss insurance.
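A small sketch of the stop-loss payout in (8), together with a Monte Carlo estimate of the expected payment. The loss distribution (exponential) and the deductible are hypothetical choices made purely for illustration.

```python
import math
import random

def stop_loss_payment(loss, deductible):
    """I_d(x) = 0 if x < d, and x - d otherwise (stop-loss / excess-of-loss cover)."""
    return max(loss - deductible, 0.0)

def expected_stop_loss(deductible, mean_loss, n_sims=100_000, seed=0):
    """Monte Carlo estimate of E[I_d(X)] for a hypothetical exponential loss X."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sims):
        x = rng.expovariate(1.0 / mean_loss)
        total += stop_loss_payment(x, deductible)
    return total / n_sims

if __name__ == "__main__":
    mean_loss, d = 10_000.0, 5_000.0
    # For an exponential loss, E[(X - d)^+] = mean_loss * exp(-d / mean_loss), a useful check.
    print("simulated:  ", round(expected_stop_loss(d, mean_loss), 2))
    print("closed form:", round(mean_loss * math.exp(-d / mean_loss), 2))
```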
One of the biggest mistakes during the subprime mortgage crisis was how AIG, a major insurance company, mispriced credit default swaps (CDS). These contracts were meant to act as a safety net for investors holding CDOs, but AIG’s models wrongly assumed that mortgage defaults were mostly independent and that housing prices wouldn’t fall.
The company relied on Value at Risk (VaR) models that failed to consider extreme market crashes, where mortgage defaults suddenly became highly correlated. When the housing market collapsed, AIG found itself unable to cover its CDS payouts, leading to a liquidity crisis that nearly brought the company down. The situation became so dire that the government had to step in with a massive $182 billion bailout to prevent total collapse.
This crisis exposed a major flaw in how financial risks were measured. Traditional insurance models were built on the idea that risks are mostly independent, but in financial markets, a single event can trigger widespread failures. AIG’s approach didn’t account for worst-case scenarios or how interconnected the system really was, making it a prime example of why financial risk modeling needs to go beyond simple historical trends and incorporate stress testing for extreme situations.
2.1. Collective Risk Models: An Introduction
Based on the work of [4], a collective risk model assumes that there is a random process which generates claims for an entire portfolio. The idea here is to think of the portfolio in its entirety instead of a smaller subset. Suppose $N$ denotes the number of claims produced by a portfolio in a specific time period. Denote by $X_i$ the amount produced by the $i$th claim. Set

$$S = X_1 + X_2 + \dots + X_N.$$

This represents the aggregate of claims. Here $N$ is itself a random variable which is related to the frequency of claims. Collective risk models assume the following:
- The variables $X_1, X_2, \dots$ are identically distributed random variables
- The random variables $N, X_1, X_2, \dots$ are mutually independent

Since the variables $X_i$ are i.i.d., let $P(x)$ represent the common distribution function of these variables. Suppose $X$ is a random variable with this distribution function. Set $p_k = E[X^k]$. This is the $k$th moment about the origin. Set $M_X(t)$ to be the moment generating function of $X$. Let $M_N(t)$ be the moment generating function of the number of claims, and let $M_S(t)$ be the moment generating function of aggregate claims. Denote by $F_S(x)$ the distribution function of the aggregate claims. Recall the following formulas for mean and variance:

$$E[S] = p_1 E[N], \qquad \operatorname{Var}[S] = E[N]\operatorname{Var}(X) + p_1^2 \operatorname{Var}[N] \qquad (9)$$

Proof. It is straightforward to show that $E[S \mid N = n] = n p_1$. To see this note the following steps:

$$E[S \mid N = n] = E[X_1 + \dots + X_n] = n\, E[X] = n p_1.$$

Now,

$$E[S] = E\big[E[S \mid N]\big] = E[N p_1] = p_1 E[N].$$

Setting $E[S \mid N] = N p_1$ we get the desired result.

Now to prove the second statement, set

$$\operatorname{Var}[S] = E\big[\operatorname{Var}(S \mid N)\big] + \operatorname{Var}\big(E[S \mid N]\big)$$

to get the decomposition of the variance. Since $\operatorname{Var}(S \mid N = n) = n \operatorname{Var}(X) = n(p_2 - p_1^2)$, we have:

$$E\big[\operatorname{Var}(S \mid N)\big] = E[N](p_2 - p_1^2) \quad \text{and} \quad \operatorname{Var}\big(E[S \mid N]\big) = p_1^2 \operatorname{Var}[N].$$

From here:

$$\operatorname{Var}[S] = E[N]\operatorname{Var}(X) + p_1^2 \operatorname{Var}[N].$$

The result follows from here.

Now let us use these results. Here

$$E[S] = E[N]\, E[X], \qquad \operatorname{Var}[S] = E[N]\operatorname{Var}(X) + \big(E[X]\big)^2 \operatorname{Var}[N].$$

These statements simply mean that the expected value of aggregate claims is the product of the expected number of claims and the expected individual claim. The variance of aggregate claims has a component which depends on the variability of the number of claims and a component that depends on the variability of an individual claim.

The moment generating function of $S$ is written as:

$$M_S(t) = M_N\big(\log M_X(t)\big) \qquad (10)$$
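The formulas in (9) can be checked by simulation. The sketch below does so for one hypothetical choice of distributions (Poisson claim counts, gamma claim sizes); none of the parameter values come from the sources cited here.

```python
import numpy as np

def simulate_aggregate_claims(lam, shape, scale, n_years=100_000, seed=1):
    """Simulate S = X_1 + ... + X_N with N ~ Poisson(lam) and X_i ~ Gamma(shape, scale)."""
    rng = np.random.default_rng(seed)
    counts = rng.poisson(lam, size=n_years)
    totals = np.zeros(n_years)
    for year, n in enumerate(counts):
        if n > 0:
            totals[year] = rng.gamma(shape, scale, size=n).sum()
    return totals

if __name__ == "__main__":
    lam, shape, scale = 3.0, 2.0, 500.0          # hypothetical parameters
    s = simulate_aggregate_claims(lam, shape, scale)
    p1, var_x = shape * scale, shape * scale**2  # E[X] and Var[X] for the gamma severity
    # For Poisson claim counts, E[N] = Var[N] = lam.
    print("E[S]   simulated vs formula:", s.mean(), lam * p1)
    print("Var[S] simulated vs formula:", s.var(), lam * var_x + p1**2 * lam)
```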
We will need some background in sums of random variables now.
2.2. Sums of random variables
Suppose we have $S = X + Y$, the sum of two random variables.

We are concerned with the event $\{X + Y \le s\}$, so that the distribution function of $S$ is:

$$F_S(s) = P(X + Y \le s).$$

We can write this as:

$$F_S(s) = \int_{-\infty}^{\infty} F_X(s - y)\, dF_Y(y).$$

The probability density function is obtained by replacing the distribution function $F_X$ by the density $f_X$.

We can write these as:

$$F_S = F_Y * F_X, \qquad f_S(s) = \int_{-\infty}^{\infty} f_X(s - y)\, f_Y(y)\, dy.$$

These integrals are called convolutions. This can be used to obtain the distribution of a sum of several random variables, i.e. $X_1 + X_2 + \dots + X_n$. If $F$ is the common distribution function of the $X_i$, and $F^{*n}$ is the distribution function of the sum $X_1 + \dots + X_n$, then the recursion

$$F^{*n} = F^{*(n-1)} * F$$

exists. Here $*$ represents the convolution operator. So for instance $F^{*2} = F * F$.
2.3. Distribution of aggregate claims
As an example, suppose , the number of claims, has a geometric distribution given by:
with and
. In this case,
, so that
. In this case, the distribution function of
is given by:
So we have
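For a discrete claim-size distribution, the mixture-of-convolutions expression for $F_S$ can be evaluated directly. The sketch below does this for a hypothetical geometric claim count and a hypothetical discrete severity, truncating the sum over $n$ once the remaining geometric tail mass is negligible.

```python
import numpy as np

def aggregate_pmf_geometric(q, severity_pmf, max_total, tol=1e-10):
    """
    p_S(x) = sum_n P(N = n) * p^{*n}(x), with P(N = n) = (1 - q) q^n for n = 0, 1, 2, ...
    severity_pmf[k] = P(X = k) for integer claim sizes k.
    The sum over n is truncated once the remaining geometric tail mass drops below tol.
    """
    p = 1.0 - q
    agg = np.zeros(max_total + 1)
    conv = np.zeros(max_total + 1)
    conv[0] = 1.0                      # zero-fold convolution: a point mass at 0
    weight, n = p, 0
    while weight > tol:
        agg += weight * conv
        conv = np.convolve(conv, severity_pmf)[: max_total + 1]  # next convolution power
        n += 1
        weight = p * q ** n
    return agg

if __name__ == "__main__":
    severity = np.zeros(6)
    severity[1:6] = 0.2                # hypothetical: claim size uniform on {1, ..., 5}
    pmf_s = aggregate_pmf_geometric(q=0.6, severity_pmf=severity, max_total=60)
    print("P(S = 0)   =", round(pmf_s[0], 4), "(equals 1 - q = 0.4)")
    print("P(S <= 10) =", round(pmf_s.cumsum()[10], 4))
```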
2.4. Applications of Risk Theory
Typically we are interested in a compound Poisson distribution, which can be written as follows ($N$ is a Poisson random variable with parameter $\lambda$):

$$S = X_1 + X_2 + \dots + X_N.$$

Here the variables $X_i$ are i.i.d. and are also independent of $N$.

Suppose $U(t)$ represents the insurer's surplus, as in, the excess of the initial fund together with the premiums collected over the claims that have been paid. Denote by $P(t)$ the premiums collected through time $t$ and by $S(t)$ the aggregate claims paid through time $t$. Suppose $U(t)$ is the surplus at time $t$; then:

$$U(t) = u + P(t) - S(t) \qquad (11)$$

$U(t)$ is usually called the surplus process, and $S(t)$ is called the aggregate claims process. Suppose the premium rate is $c$ and is constant. Say $P(t) = ct$, a linear function. If the surplus becomes negative, we will say that ruin has occurred. The time of ruin, $T = \min\{t : U(t) < 0\}$, is considered to be $\infty$ if $U(t) \ge 0$ for all $t$. Consider the function $\psi(u, t) = P(T < t)$, which represents the probability of ruin before time $t$. Denote by $\tilde{U}(n)$ the discrete time surplus process:

$$\tilde{U}(n) = u + cn - \tilde{S}(n), \qquad n = 1, 2, \dots$$

With this setting, suppose $u$ is the initial surplus and we are looking at $n$ periods; then $\tilde{U}(n) = u + cn - \tilde{S}(n)$. Here $\tilde{S}(n)$ is the aggregate of claims in the first $n$ periods. Suppose $W_i$ represents the sum of claims in period $i$, and assume that these are all i.i.d. with $E[W_i] < c$.

So here, $\tilde{S}(n) = W_1 + W_2 + \dots + W_n$. Set $\tilde{T} = \min\{n : \tilde{U}(n) < 0\}$ to be the time of ruin. Set $\tilde{\psi}(u) = P(\tilde{T} < \infty)$ to be the probability of ruin. Define the adjustment coefficient $\tilde{R}$ to be a positive solution of this equation:

$$E\big[e^{r(W_i - c)}\big] = 1, \quad \text{equivalently} \quad M_W(r) = e^{rc}.$$
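As an illustration of the adjustment coefficient for the discrete-time model, the sketch below solves $M_W(r) = e^{rc}$ numerically for a hypothetical gamma-distributed period claim total $W$ and premium rate $c$; all numbers are invented for illustration.

```python
import math
from scipy.optimize import brentq

def adjustment_coefficient(mgf_w, c, r_hi):
    """Positive root of M_W(r) = exp(r c), i.e. E[exp(r (W - c))] = 1, assuming c > E[W]."""
    f = lambda r: mgf_w(r) - math.exp(r * c)
    # f(0) = 0 and f is negative just to the right of 0 because E[W] < c,
    # so the strictly positive root is bracketed away from zero.
    return brentq(f, 1e-8, r_hi)

if __name__ == "__main__":
    # Hypothetical example: W ~ Gamma(shape=4, scale=20), so E[W] = 80, with premium c = 100.
    shape, scale, c = 4.0, 20.0, 100.0
    mgf_w = lambda r: (1.0 - scale * r) ** (-shape)   # valid for r < 1 / scale
    R = adjustment_coefficient(mgf_w, c, r_hi=0.049)
    print("adjustment coefficient R:", round(R, 5))
    # Lundberg-type bound: the ruin probability satisfies psi(u) <= exp(-R u).
    print("exp(-R u) for u = 500:", math.exp(-R * 500.0))
```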
2.5 Specific Properties of the Poisson Distribution
We start with a theorem following [2], where we look at sums of compound Poisson random variables.

Theorem 2.2

If $S_1, S_2, \dots, S_m$ are mutually independent random variables such that $S_i$ has a compound Poisson distribution with parameter $\lambda_i$ and the distribution function of the claim amount is $P_i(x)$, with $i = 1, 2, \dots, m$, then $S = S_1 + S_2 + \dots + S_m$ has a compound Poisson distribution with

$$\lambda = \sum_{i=1}^{m} \lambda_i \quad \text{and} \quad P(x) = \sum_{i=1}^{m} \frac{\lambda_i}{\lambda} P_i(x).$$

Proof. We will use moment generating functions. Let $M_i(t)$ represent the moment generating function of the claim amount distribution $P_i$. Then the m.g.f. of $S_i$ is

$$M_{S_i}(t) = \exp\big(\lambda_i [M_i(t) - 1]\big).$$

Since $S_1, \dots, S_m$ were assumed to be independent, the m.g.f. of their sum is:

$$M_S(t) = \prod_{i=1}^{m} M_{S_i}(t) = \exp\left(\lambda\left[\sum_{i=1}^{m} \frac{\lambda_i}{\lambda} M_i(t) - 1\right]\right) \qquad (12)$$

From there we can recognize $M_S(t)$ as the m.g.f. of a compound Poisson distribution with Poisson parameter $\lambda$ and claim amount distribution $P(x) = \sum_{i} (\lambda_i / \lambda) P_i(x)$.

We will address the following question. Suppose $x_1, x_2, \dots, x_m$ are $m$ different numbers and suppose that $N_1, N_2, \dots, N_m$ are mutually independent random variables. Suppose each $N_i$ has a Poisson distribution with parameter $\lambda_i$. We seek the distribution of $x_1 N_1 + x_2 N_2 + \dots + x_m N_m$. In order to solve this problem, we interpret $x_i N_i$ to have a compound Poisson distribution with Poisson parameter $\lambda_i$ and a claim amount distribution degenerate at $x_i$.

Then this sum has a compound Poisson distribution with $\lambda = \sum_{i=1}^{m} \lambda_i$, the probability function of claim amounts $\pi(x)$ is defined to be $\pi(x_i) = \lambda_i / \lambda$, and $\pi(x) = 0$ otherwise. Let $x_1, \dots, x_m$ denote the discrete values for individual claim amounts, and suppose $N_i$ is the number of claims of amount $x_i$, so that the total number of claims is $N = N_1 + \dots + N_m$. In addition, recall that the multinomial probability distribution has the form:

$$P(N_1 = n_1, \dots, N_m = n_m \mid N = n) = \frac{n!}{n_1! \cdots n_m!}\, \pi_1^{n_1} \cdots \pi_m^{n_m}, \qquad n_1 + \dots + n_m = n \qquad (13)$$

With the $S$ defined in Theorem (2.2) having a compound Poisson distribution with parameter $\lambda$ and probability function of claim amounts given by the discrete probability function $\pi(x_i) = \pi_i$, two properties can be shown:
- $N_1, N_2, \dots, N_m$ are mutually independent
- $N_i$ has a Poisson distribution with parameter $\lambda \pi_i$

We will see why this is true.

Proof. Suppose there are $n$ claims in total; conditional on $N = n$, the numbers of claims of each amount have a multinomial distribution with parameters $(n; \pi_1, \dots, \pi_m)$. So now, with $n = n_1 + \dots + n_m$, we do the following calculation:

$$P(N_1 = n_1, \dots, N_m = n_m) = P(N = n)\, \frac{n!}{n_1! \cdots n_m!}\, \pi_1^{n_1} \cdots \pi_m^{n_m} = e^{-\lambda} \frac{\lambda^{n}}{n!} \cdot \frac{n!}{n_1! \cdots n_m!}\, \pi_1^{n_1} \cdots \pi_m^{n_m} = \prod_{i=1}^{m} e^{-\lambda \pi_i} \frac{(\lambda \pi_i)^{n_i}}{n_i!}.$$

The factorization shows both that the $N_i$ are mutually independent and that each $N_i$ is Poisson with parameter $\lambda \pi_i$.
2.6 The Theory of Copulas
Following [5], we need a few definitions to start off. Denote by $\mathbf{I}$ the interval $[0, 1]$, and by $\overline{\mathbb{R}}$ the extended real line $[-\infty, \infty]$. A rectangle in $\overline{\mathbb{R}}^{2}$ is the Cartesian product of two intervals: $B = [x_1, x_2] \times [y_1, y_2]$. The vertices of this rectangle are the points $(x_1, y_1)$, $(x_1, y_2)$, $(x_2, y_1)$, $(x_2, y_2)$. A two-place real function $H$ is a function whose domain, $\operatorname{Dom} H$, is a subset of $\overline{\mathbb{R}}^{2}$ and whose range, $\operatorname{Ran} H$, is a subset of $\mathbb{R}$.

Let $S_1$ and $S_2$ be nonempty subsets of $\overline{\mathbb{R}}$, and let $H$ be a two-place real function such that $\operatorname{Dom} H = S_1 \times S_2$.

Let $B = [x_1, x_2] \times [y_1, y_2]$ be a rectangle all of whose vertices are in $\operatorname{Dom} H$. Then the $H$-volume of $B$ is given by:

$$V_H(B) = H(x_2, y_2) - H(x_2, y_1) - H(x_1, y_2) + H(x_1, y_1).$$

In terms of the first order differences of $H$ on the rectangle $B$, define:

$$\Delta_{x_1}^{x_2} H(x, y) = H(x_2, y) - H(x_1, y), \qquad \Delta_{y_1}^{y_2} H(x, y) = H(x, y_2) - H(x, y_1).$$

Then, the $H$-volume of a rectangle $B$ is the second order difference of $H$ on $B$:

$$V_H(B) = \Delta_{y_1}^{y_2} \Delta_{x_1}^{x_2} H(x, y).$$

A two-place real function $H$ is 2-increasing if $V_H(B) \ge 0$ for all rectangles $B$ whose vertices lie in $\operatorname{Dom} H$.

When $H$ is 2-increasing, the $H$-volume of a rectangle $B$ is called the $H$-measure of $B$, and $H$ is sometimes called quasi-monotone.

We will need two results that are of importance in understanding this theory.

Let $S_1$ and $S_2$ be non-empty subsets of $\overline{\mathbb{R}}$ and let $H$ be a 2-increasing function with domain $S_1 \times S_2$. Let $x_1, x_2$ be in $S_1$ with $x_1 \le x_2$ and let $y_1, y_2$ be in $S_2$ with $y_1 \le y_2$. Then the function $t \mapsto H(t, y_2) - H(t, y_1)$ is non-decreasing on $S_1$, and the function $t \mapsto H(x_2, t) - H(x_1, t)$ is non-decreasing on $S_2$.

Let $S_1$ and $S_2$ be non-empty subsets of $\overline{\mathbb{R}}$, and let $H$ be a grounded 2-increasing function with domain $S_1 \times S_2$. Then $H$ is non-decreasing in each argument.

Here a function $H$ from $S_1 \times S_2$ into $\mathbb{R}$ is grounded if $H(x, a_2) = 0 = H(a_1, y)$ for all $(x, y)$ in $S_1 \times S_2$, where $a_1$ is the least element of $S_1$ and $a_2$ is the least element of $S_2$.

Suppose $S_1$ has greatest element $b_1$ and $S_2$ has greatest element $b_2$. A function $H$ from $S_1 \times S_2$ into $\mathbb{R}$ has margins, and the margins of $H$ are the functions $F$ and $G$ given by:

$$F(x) = H(x, b_2), \qquad x \in S_1,$$

$$G(y) = H(b_1, y), \qquad y \in S_2.$$

Suppose the function $H$ is grounded 2-increasing with margins, and its domain is $S_1 \times S_2$. Let $(x_1, y_1)$ and $(x_2, y_2)$ be any points of $S_1 \times S_2$; then:

$$\big|H(x_2, y_2) - H(x_1, y_1)\big| \le \big|F(x_2) - F(x_1)\big| + \big|G(y_2) - G(y_1)\big|.$$

Now we can define copulas. The standard approach in this direction is to define subcopulas as a class of grounded 2-increasing functions with margins; then copulas are defined as subcopulas with domain $\mathbf{I}^{2}$. Here $\mathbf{I}^{2} = \mathbf{I} \times \mathbf{I}$, where $\mathbf{I} = [0, 1]$.

A two-dimensional subcopula (2-subcopula) is a function $C'$ with the following properties:
- $\operatorname{Dom} C' = S_1 \times S_2$, where $S_1, S_2$ are subsets of $\mathbf{I}$ containing $0$ and $1$
- $C'$ is grounded and 2-increasing
- For every $u$ in $S_1$ and every $v$ in $S_2$, $C'(u, 1) = u$ and $C'(1, v) = v$

So for every $(u, v)$ in $\operatorname{Dom} C'$, $0 \le C'(u, v) \le 1$. This means that $\operatorname{Ran} C'$ is a subset of $\mathbf{I}$.

A two-dimensional copula is a 2-subcopula whose domain is $\mathbf{I}^{2}$. It is a function $C$ from $\mathbf{I}^{2}$ to $\mathbf{I}$ with the following properties:
- For every $u, v$ in $\mathbf{I}$, $C(u, 0) = 0 = C(0, v)$, $C(u, 1) = u$ and $C(1, v) = v$
- For every $u_1, u_2, v_1, v_2$ in $\mathbf{I}$ such that $u_1 \le u_2$ and $v_1 \le v_2$,

$$C(u_2, v_2) - C(u_2, v_1) - C(u_1, v_2) + C(u_1, v_1) \ge 0$$

A subcopula and a copula are different, and these differences matter. Let us look at some results in this area. For example, suppose $C'$ is a subcopula. Then for every $(u, v)$ in $\operatorname{Dom} C'$,

$$\max(u + v - 1, 0) \le C'(u, v) \le \min(u, v) \qquad (14)$$

Proof. Let $(u, v)$ be an arbitrary point in $\operatorname{Dom} C'$. Now $C'(u, v) \le C'(u, 1) = u$ and $C'(u, v) \le C'(1, v) = v$. Together these yield $C'(u, v) \le \min(u, v)$. Now since $V_{C'}\big([u, 1] \times [v, 1]\big) \ge 0$, we have $C'(1, 1) - C'(1, v) - C'(u, 1) + C'(u, v) \ge 0$. These imply $C'(u, v) \ge u + v - 1$. In addition, $C'(u, v) \ge 0$. Together this means $C'(u, v) \ge \max(u + v - 1, 0)$.

Every copula is also a subcopula, but the reverse is not true.
We state and look at a proof of one of the most important theorems in this theory, called Sklar’s theorem.
2.7 Sklar’s Theorem
Here we are looking at $n$-dimensional copulas. This is a function $C$ from the unit $n$-cube $[0, 1]^{n}$ to the unit interval $[0, 1]$ which satisfies the following:
- $C(1, \dots, 1, u_k, 1, \dots, 1) = u_k$ for every $k \le n$ and for all $u_k$ in $[0, 1]$
- $C(u_1, \dots, u_n) = 0$ if $u_k = 0$ for any $k \le n$
- $C$ is $n$-increasing

Let us understand what these are saying. The first property says that once the realizations of $n - 1$ of the variables are known, each with marginal probability $1$, then the joint probability of the $n$ outcomes is the same as the probability of the remaining uncertain outcome. The second property is saying that the joint probability of all outcomes is $0$ if the marginal probability of any outcome is $0$. The third property states that the $C$-volume of any $n$-dimensional interval is non-negative.
We will use the proof given in [6].

First let $D$ and $E$ be non-empty intervals of $\overline{\mathbb{R}}$ and suppose $F : D \to E$ is a non-decreasing mapping. By $\inf D$ and $\sup D$ are meant the lower end point and the upper end point of $D$, and similarly for $E$. The generalized inverse function of $F$ is defined by

$$F^{-1}(y) = \inf\{x \in D : F(x) \ge y\}, \qquad y \in E.$$

Following [6], we observe the following result:

Lemma 2.3. Let $F$ be a non-decreasing right continuous function. Then $F^{-1}$ is left continuous and we have:

$$F\big(F^{-1}(y)\big) \ge y \quad \text{and} \quad F^{-1}\big(F(x)\big) \le x,$$

in addition we also have:

$$F(x) \ge y \iff x \ge F^{-1}(y).$$

We also recall the definition of a distribution function on $\overline{\mathbb{R}}^{n}$. A mapping $F : \overline{\mathbb{R}}^{n} \to [0, 1]$ is a distribution function iff:
- $F$ is right continuous
- $F$ assigns non-negative volumes to any cuboid $]a, b]$ with $a \le b$, which is equivalent to

$$\Delta_{a}^{b} F = \sum_{\varepsilon \in \{0,1\}^{n}} (-1)^{\varepsilon_1 + \dots + \varepsilon_n}\, F\big(c_1(\varepsilon_1), \dots, c_n(\varepsilon_n)\big) \ge 0$$

for all $a \le b$, where $c_i(0) = b_i$, $c_i(1) = a_i$, and $\varepsilon = (\varepsilon_1, \dots, \varepsilon_n)$ runs over $\{0, 1\}^{n}$

In addition, in order to become a cumulative distribution function, we also need:

$$F(x_1, \dots, x_n) = 0 \text{ whenever some } x_i = -\infty, \quad \text{and} \quad F(+\infty, \dots, +\infty) = 1.$$

In [6], a copula on $[0, 1]^{n}$ is a cdf $C$ with marginal cdf's defined in the following manner, for $1 \le i \le n$:

$$C_i(u) = C(1, \dots, 1, u, 1, \dots, 1) \qquad (15)$$

(with $u$ in the $i$th position), where these are all equal to the uniform cdf, which is defined as

$$U(u) = \begin{cases} 0, & u < 0 \\ u, & 0 \le u \le 1 \\ 1, & u > 1 \end{cases} \qquad (16)$$

So for $1 \le i \le n$,

$$C_i(u) = u, \qquad 0 \le u \le 1 \qquad (17)$$
Now Sklar's theorem reads:

Theorem 2.4. For any cdf $F$ on $\overline{\mathbb{R}}^{n}$ with marginal cdf's $F_1, \dots, F_n$, there exists a copula $C$ on $[0, 1]^{n}$ such that:

$$F(x_1, \dots, x_n) = C\big(F_1(x_1), \dots, F_n(x_n)\big), \qquad (x_1, \dots, x_n) \in \overline{\mathbb{R}}^{n}.$$

Proof. For $(u_1, \dots, u_n) \in [0, 1]^{n}$, set

$$C(u_1, \dots, u_n) = F\big(F_1^{-1}(u_1), \dots, F_n^{-1}(u_n)\big).$$

$C$ will assign non-negative volumes to cuboids of $[0, 1]^{n}$, using the definition of $F$ above, with arguments of the form $F_i^{-1}(u_i)$.

$C$ is right continuous since $F$ is right continuous.

Using the result of Lemma 2.3, namely that $F_i\big(F_i^{-1}(F_i(x_i))\big) = F_i(x_i)$, together with the above construction, the theorem of Sklar follows.

Following [6], let us see why this identity holds. For $u$ in the range of $F$, we consider the limit of $F(x)$ as $x$ decreases to $F^{-1}(u)$. For any such $u$, $F^{-1}(u)$ is the infimum of the set of $x$ such that $F(x) \ge u$. All such $x$ satisfy $F(x) \ge u$, and since $F^{-1}(u)$ is their infimum, there exist such $x$ arbitrarily close to $F^{-1}(u)$ from above, so that $\lim_{x \downarrow F^{-1}(u)} F(x) \ge u$.

Next we show that $F\big(F^{-1}(u)\big) \ge u$. First, $\lim_{x \downarrow F^{-1}(u)} F(x) = F\big(F^{-1}(u)\big)$, as $F$ is right continuous and the right hand limit of the non-decreasing function $F$ exists, so that $F\big(F^{-1}(u)\big) \ge u$.

Now since $F$ is non-decreasing and $F^{-1}\big(F(x)\big) \le x$, this shows that $F\big(F^{-1}(F(x))\big) \le F(x)$; applying the previous inequality with $u = F(x)$ gives $F\big(F^{-1}(F(x))\big) \ge F(x)$, and the two together yield the identity.
2.8. Properties of the Copula function
We will illustrate important properties as given in the most important paper on this topic, the paper by [7]. Suppose we have $n$ uniform random variables, $U_1, U_2, \dots, U_n$. The joint distribution function $C$ is defined as:

$$C(u_1, u_2, \dots, u_n) = P(U_1 \le u_1, U_2 \le u_2, \dots, U_n \le u_n) \qquad (18)$$

Given the univariate marginal distribution functions $F_1(x_1), F_2(x_2), \dots, F_n(x_n)$, the function

$$C\big(F_1(x_1), F_2(x_2), \dots, F_n(x_n)\big) = F(x_1, x_2, \dots, x_n)$$

is a multivariate distribution function with univariate marginal distributions given by $F_1, F_2, \dots, F_n$.

To see this:

Proof. Note that the function can be written thus:

$$C\big(F_1(x_1), \dots, F_n(x_n)\big) = P\big(U_1 \le F_1(x_1), \dots, U_n \le F_n(x_n)\big) = P\big(F_1^{-1}(U_1) \le x_1, \dots, F_n^{-1}(U_n) \le x_n\big),$$

which is the joint distribution function of the random variables $X_i = F_i^{-1}(U_i)$, each of which has distribution function $F_i$.

In addition we can show that the marginal distribution of $X_i$ is $F_i(x_i)$ as follows:

Proof.

$$C\big(F_1(+\infty), \dots, F_i(x_i), \dots, F_n(+\infty)\big) = P\big(U_1 \le 1, \dots, U_i \le F_i(x_i), \dots, U_n \le 1\big) = P\big(U_i \le F_i(x_i)\big) = F_i(x_i).$$

Sklar's theorem given earlier shows the converse, namely that if $F$ is a multivariate joint distribution with univariate distribution functions $F_1, F_2, \dots, F_n$, then there exists a copula function $C$ such that $F(x_1, \dots, x_n) = C\big(F_1(x_1), \dots, F_n(x_n)\big)$. If each $F_i$ is continuous then $C$ is unique.
For the purpose of this paper, we will look only at properties of the bivariate copula function $C(u, v, \rho)$ for uniform random variables $U$ and $V$, defined over the area $\{(u, v) : 0 < u \le 1,\ 0 < v \le 1\}$, with $\rho$ a correlation parameter that is not necessarily equal to Pearson's correlation coefficient.

There are three main properties:
- As $U$ and $V$ are positive, $C(0, v, \rho) = C(u, 0, \rho) = 0$
- Since $U$ and $V$ are bounded above by 1, the marginal distributions are $C(1, v, \rho) = v$ and $C(u, 1, \rho) = u$
- For independent random variables $U$ and $V$, $C(u, v, \rho) = uv$
2.8.1. Examples of Copula Functions
Following [7], the following copula functions are of interest:
- The Frank copula function is defined as:

$$C(u, v) = \frac{1}{\alpha} \ln\left[1 + \frac{(e^{\alpha u} - 1)(e^{\alpha v} - 1)}{e^{\alpha} - 1}\right], \qquad \alpha \ne 0$$

- Bivariate normal:

$$C(u, v) = \Phi_2\big(\Phi^{-1}(u), \Phi^{-1}(v), \rho\big), \qquad -1 \le \rho \le 1,$$

where $\Phi_2$ is the bivariate normal distribution function with correlation coefficient $\rho$ and $\Phi^{-1}$ is the inverse of a univariate normal distribution function.
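A sketch of the bivariate normal (Gaussian) copula just listed: it evaluates $C(u, v) = \Phi_2(\Phi^{-1}(u), \Phi^{-1}(v), \rho)$ and, in the spirit of Li's approach, maps Gaussian-copula samples into correlated default times for two credits with constant hazard rates. The correlation, hazard rates, and horizon are hypothetical values chosen for illustration.

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

def gaussian_copula_cdf(u, v, rho):
    """C(u, v) = Phi_2(Phi^{-1}(u), Phi^{-1}(v); rho)."""
    x, y = norm.ppf(u), norm.ppf(v)
    cov = [[1.0, rho], [rho, 1.0]]
    return multivariate_normal(mean=[0.0, 0.0], cov=cov).cdf([x, y])

def correlated_default_times(h_a, h_b, rho, n_paths=100_000, seed=7):
    """Draw (T_A, T_B) with exponential marginals S_i(t) = exp(-h_i t), coupled by a Gaussian copula."""
    rng = np.random.default_rng(seed)
    z = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=n_paths)
    u = norm.cdf(z)                            # uniform marginals with Gaussian dependence
    t_a = -np.log(1.0 - u[:, 0]) / h_a         # invert F(t) = 1 - exp(-h t)
    t_b = -np.log(1.0 - u[:, 1]) / h_b
    return t_a, t_b

if __name__ == "__main__":
    print("C(0.5, 0.5; rho=0.3) =", gaussian_copula_cdf(0.5, 0.5, 0.3))
    t_a, t_b = correlated_default_times(h_a=0.03, h_b=0.05, rho=0.6)
    horizon = 5.0
    both = np.mean((t_a < horizon) & (t_b < horizon))
    print("P(both default within 5 years) ~", round(float(both), 4))
```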
It is also possible, following [7], for two independent uniform random variables $U$ and $V$ to have the copula function $C(u, v) = uv$. In this regard, we have the Frechet-Hoeffding boundary copulas theorem. We start by looking at the simplest cases. Firstly, an independent copula structure is given by $C(u, v) = uv$. A minimum copula is given by $C^{-}(u, v) = \max(u + v - 1, 0)$, and a maximum copula is given by $C^{+}(u, v) = \min(u, v)$.
2.9. Understanding Frechet-Hoeffding Bounds
Here we follow the work of [8]. We start with the following result.

Theorem 2.5. Suppose we have random variables $X_1, X_2$ whose dependence structure is given by a copula $C$. Let $T_1, T_2$ be strictly increasing functions. Then the dependence structure of the random variables $T_1(X_1), T_2(X_2)$ is also given by the copula $C$. This implies that such transformations do not change the dependence structure.

In addition, there are the Frechet-Hoeffding bounds. These essentially put a pyramid inside which every copula has to lie: a lower bound $W(u, v) = \max(u + v - 1, 0)$ and an upper bound $M(u, v) = \min(u, v)$.

It was proven independently by Hoeffding and Frechet that a copula will always lie between these bounds. For instance, consider two uniform random variables $U_1$ and $U_2$. If $U_1 = U_2$, these are extremely dependent on each other. In this case the copula is given by:

$$C(u_1, u_2) = P(U_1 \le u_1, U_1 \le u_2) = \min(u_1, u_2) \qquad (19)$$

Such a copula is always attained if $X_2 = T(X_1)$, where $T$ is a monotonic increasing transformation. Random variables of this kind are called comonotonic. There is also the idea of countermonotonic random variables. For this, we need $U_2 = 1 - U_1$. We then have:

$$C(u_1, u_2) = P(U_1 \le u_1, 1 - U_1 \le u_2) = \max(u_1 + u_2 - 1, 0).$$

In other cases, the copula lies between these two extremes. This brings us to the theorem on the Frechet-Hoeffding bounds.

Theorem 2.6. Consider a copula $C(u, v)$. Then

$$\max(u + v - 1, 0) \le C(u, v) \le \min(u, v) \qquad (20)$$

Proof. We start with the observation that a copula function satisfies three conditions:
- $C(u, 0) = 0$ and $C(0, v) = 0$
- $C(u, 1) = u$ and $C(1, v) = v$
- For all $u_1 \le u_2$ and $v_1 \le v_2$, the following is true:

$$C(u_2, v_2) - C(u_2, v_1) - C(u_1, v_2) + C(u_1, v_1) \ge 0$$

Now using the second property from above, we see that

$$C(u, v) \le C(u, 1) = u \qquad (21)$$

and

$$C(u, v) \le C(1, v) = v \qquad (22)$$

Together these give the upper bound. Now taking $u_2 = v_2 = 1$ in the third property gives:

$$1 - C(1, v_1) - C(u_1, 1) + C(u_1, v_1) \ge 0, \quad \text{i.e.} \quad C(u_1, v_1) \ge u_1 + v_1 - 1 \qquad (23)$$

As $C(u, v) \ge 0$, this gives the lower bound of the Frechet-Hoeffding theorem.
2.10 Construction of Credit Curves
2.10.1 Kaplan Meier Estimator
Following [9], let $T$ be the random variable that describes an individual's survival time, and let $t_i$ be a time for an event drawn from $T$. Here $t_1 \le t_2 \le \dots \le t_n$ denotes the ascending order of the event times. With this terminology, the Kaplan-Meier survival estimate, $\hat{S}(t_i)$, is given by the following formula:

$$\hat{S}(t_i) = \hat{S}(t_{i-1}) \left(1 - \frac{d_i}{n_i}\right).$$

Here $1 - d_i / n_i$ is the estimated conditional probability of surviving past time $t_i$, given survival to at least time $t_i$. The $\hat{S}(t_{i-1})$ is the Kaplan-Meier survival estimate of the previous time step. In this regard, one could also write:

$$\hat{S}(t) = \prod_{t_i \le t} \left(1 - \frac{d_i}{n_i}\right) \qquad (24)$$

Here $d_i$ is the number of events occurring at $t_i$ and $n_i$ is the number of individuals that have survived and are at risk just before time $t_i$.
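A minimal implementation of the Kaplan-Meier recursion in (24). The event times, default counts, and at-risk counts below are hypothetical; censoring is handled only implicitly through the at-risk counts.

```python
def kaplan_meier(event_times, defaults, at_risk):
    """
    Kaplan-Meier estimate S_hat(t_i) = prod_{t_j <= t_i} (1 - d_j / n_j).
    event_times: sorted distinct event times t_i
    defaults:    d_i, the number of events (defaults) at t_i
    at_risk:     n_i, the number of subjects still at risk just before t_i
    Returns a list of (t_i, S_hat(t_i)) pairs.
    """
    s_hat, curve = 1.0, []
    for t, d, n in zip(event_times, defaults, at_risk):
        s_hat *= 1.0 - d / n
        curve.append((t, s_hat))
    return curve

if __name__ == "__main__":
    # Hypothetical default history for a small cohort of 100 borrowers.
    times    = [1, 2, 3, 4, 5]         # years
    defaults = [2, 5, 9, 6, 3]         # defaults observed in each year
    at_risk  = [100, 96, 88, 75, 66]   # at risk just before each year (censoring removes some)
    for t, s in kaplan_meier(times, defaults, at_risk):
        print(f"S_hat({t}) = {s:.4f}")
```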
2.10.2. Cox Proportional Hazards Model
Here we use two components:
- A baseline hazard rate as a function of time
- Effect parameters

Let $x_i = (x_{i1}, x_{i2}, \dots, x_{ip})$ be a set of $p$ explanatory variables for subject $i$. The Cox model is then written in terms of the hazard function $h(t \mid x_i)$ and is defined as:

$$h(t \mid x_i) = h_0(t) \exp\left(\sum_{j=1}^{p} \beta_j x_{ij}\right) \qquad (25)$$

Here $\beta_j$ is the coefficient related to explanatory variable $j$, and $h_0(t)$ is the baseline hazard function. This model is also called a proportional hazards model. Suppose subject $1$ and subject $2$ are two observations with corresponding predictors $x_1$ and $x_2$. Then the hazard ratio for these observations is given by:

$$\frac{h(t \mid x_1)}{h(t \mid x_2)} = \frac{h_0(t) \exp\left(\sum_j \beta_j x_{1j}\right)}{h_0(t) \exp\left(\sum_j \beta_j x_{2j}\right)} = \exp\left(\sum_j \beta_j (x_{1j} - x_{2j})\right).$$

This is independent of time $t$.
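Because the baseline hazard cancels, the hazard ratio can be computed from the coefficients and covariates alone. A small sketch with hypothetical coefficients and borrower covariates:

```python
import math

def cox_hazard_ratio(beta, x1, x2):
    """h(t | x1) / h(t | x2) = exp(beta . (x1 - x2)); the baseline hazard h_0(t) cancels."""
    return math.exp(sum(b * (a - c) for b, a, c in zip(beta, x1, x2)))

if __name__ == "__main__":
    # Hypothetical covariates: (loan-to-value ratio, log income, adjustable-rate flag).
    beta = [2.1, -0.8, 0.6]
    borrower_1 = [0.95, 10.2, 1.0]   # high LTV, adjustable-rate loan
    borrower_2 = [0.70, 10.2, 0.0]   # lower LTV, fixed-rate loan
    print("hazard ratio:", round(cox_hazard_ratio(beta, borrower_1, borrower_2), 3))
```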
3. Markov Chain Model
A stochastic process with discrete time parameter is a sequence of random variables $\{X_n : n = 0, 1, 2, \dots\}$. The state of the process is given by the value of $X_n$. Here $X_n = i$ means that the process is in state $i$ at time $n$.

A Markov chain is a stochastic process where the next state of the process only depends on the current state, and is not influenced by any of the previous states. This is called the Markovian property. The stochastic process $\{X_n\}$ with state space $S$ is said to be a discrete time Markov chain if for each $n$, the following is true:

$$P(X_{n+1} = j \mid X_n = i, X_{n-1} = i_{n-1}, \dots, X_0 = i_0) = P(X_{n+1} = j \mid X_n = i) \qquad (26)$$

A Markov chain is called time homogeneous if, given states $i$ and $j$:

$$P(X_{n+1} = j \mid X_n = i) = p_{ij} \qquad (27)$$

This is independent of $n$, which represents the time.

The $p_{ij}$ are called transition probabilities. These satisfy two conditions:

$$p_{ij} \ge 0 \quad \text{and} \quad \sum_{j \in S} p_{ij} = 1 \text{ for every } i.$$
A transition matrix gives a matrix of transition probabilities. This matrix is of the following form:

$$P = \begin{pmatrix} p_{11} & p_{12} & \cdots & p_{1m} \\ p_{21} & p_{22} & \cdots & p_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ p_{m1} & p_{m2} & \cdots & p_{mm} \end{pmatrix} \qquad (28)$$

It must also be true that the following holds:

$$\sum_{j=1}^{m} p_{ij} = 1 \quad \text{for each } i \qquad (29)$$

This is a way of saying that the sum of all transition probabilities from one state to all other states, including itself, is $1$. To extend this from one step to $n$ steps, we raise the transition matrix to the $n$th power. In the following representation we consider $p_{ij}^{(n)}$, the probability of migrating from state $i$ to state $j$ in $n$ steps. This is given by:

$$p_{ij}^{(n)} = \big(P^{n}\big)_{ij} \qquad (30)$$

Here $p_{ij}^{(n)}$ is the probability of going from state $i$ to state $j$ in $n$ steps.
Note that the transition matrix must be estimated. One of the ways to achieve this is the cohort method. Define

$$N_i(t) = \text{the number of entities in state } i \text{ at the beginning of period } t \qquad (31)$$

Now the transition rate between two states $i$ and $j$ is estimated by:

$$\hat{p}_{ij} = \frac{N_{ij}(t)}{N_i(t)} \qquad (32)$$

Here $N_{ij}(t)$ denotes the number of entities that migrated from state $i$ to state $j$ during the period $t$, and $N_i(t)$ is the number of entities that started in state $i$ at time $t$. If there were no transitions between state $i$ and $j$, then $\hat{p}_{ij} = 0$.
3.1. Term Structure of Default Rates
In [10], three methods are shown:
- Historical default information from rating agencies
- The Merton option theoretic approach
- An implied approach using market prices of defaultable bonds or asset swap spreads
3.1.1. Understanding credit risks via CreditMetrics
We will reference [11]. CreditMetrics is a framework for quantifying credit risk in portfolios. We are interested in the section on portfolio risk calculations. In most financial risk estimations, there are three main directions:
- Estimating particular individual parameters such as expected default frequencies
- Estimating volatility of value which are the unexpected losses
- Estimating volatility of value within the context of a specific portfolio
Of these the most important is the idea of unexpected losses.
As seen in [11], there are several difficulties in dealing with unexpected losses, and the industry has taken an approach that is dangerous. Firstly, as [11] states, since it is difficult to explicitly address correlations, much of the time it is assumed that the correlations are all zero or all equal to one, corresponding to the cases of complete independence or perfect positive correlation; the issue is that neither case is realistic. Other times, practitioners assume that the correlations will be the same as those of some index portfolio. This needs a different type of analysis, because it assumes that a specific portfolio somehow mirrors the market in question, which may be the case but only under the assumption that there are parallels between the correlations and the profile of composition.
3.1.2. Asset Value Model
We are concerned with joint probabilities in terms of defaults. Following [12], denote by $JDF_{ij}$ the joint default frequency of firm $i$ and firm $j$, which is the actual probability of both firms defaulting together. Let $\rho_{ij}$ represent the default correlation for firms $i$ and $j$.

Then

$$\rho_{ij} = \frac{JDF_{ij} - p_i\, p_j}{\sqrt{p_i(1 - p_i)\, p_j(1 - p_j)}} \qquad (33)$$

Here $p_i$ represents the probability of default of firm $i$.
Define the following variables. Set $w_i$ to be the weight of exposure $i$ in the portfolio. Then let $EL_i$ be the expected loss of exposure $i$, $UL_i$ the unexpected loss (the standard deviation of loss) of exposure $i$, $EL_p$ the expected loss of the portfolio, and $UL_p$ the unexpected loss of the portfolio. In addition we have

$$EL_p = \sum_{i} w_i\, EL_i \qquad (34)$$

Define $\rho_{ij}$ to be the correlation between the losses of exposures $i$ and $j$, with $\rho_{ii} = 1$. Now if the individual unexpected losses and the pairwise correlations are known, then

$$UL_p = \sqrt{\sum_{i} \sum_{j} w_i\, w_j\, UL_i\, UL_j\, \rho_{ij}} \qquad (35)$$

The above equation gives the unexpected loss for the portfolio in question. Default correlation, which was the crux of the problems during the crisis, measures the strength of the default relationship between two borrowers. If there is no relationship, the default correlation is zero. However, if borrowers are positively correlated, the probability of both defaulting is higher. According to [12], the joint probability of default is defined to be the likelihood that both firms' market asset values will be below their respective default points in the same time period.
This probability depends on three factors:
- The current market values of the assets
- The asset volatilities
- The correlation between the market asset values

Denote by $N_2$ the bivariate normal distribution function, by $N^{-1}$ the inverse normal distribution function, and by $\rho_{ij}$ the correlation between firm $i$'s asset return and firm $j$'s asset return.

In this case, the $JDF_{ij}$ is given by

$$JDF_{ij} = N_2\big(N^{-1}(p_i),\, N^{-1}(p_j),\, \rho_{ij}\big) \qquad (36)$$
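A sketch of equation (36) together with the default correlation in (33), using hypothetical default probabilities and an assumed asset-return correlation; it also shows how quickly the joint default frequency grows as the asset correlation rises.

```python
import math
from scipy.stats import norm, multivariate_normal

def joint_default_frequency(p_i, p_j, rho_assets):
    """JDF_ij = N_2(N^{-1}(p_i), N^{-1}(p_j); rho), with rho the asset-return correlation."""
    x = [norm.ppf(p_i), norm.ppf(p_j)]
    cov = [[1.0, rho_assets], [rho_assets, 1.0]]
    return multivariate_normal(mean=[0.0, 0.0], cov=cov).cdf(x)

def default_correlation(p_i, p_j, jdf):
    """rho_ij = (JDF_ij - p_i p_j) / sqrt(p_i (1 - p_i) p_j (1 - p_j))."""
    return (jdf - p_i * p_j) / math.sqrt(p_i * (1 - p_i) * p_j * (1 - p_j))

if __name__ == "__main__":
    p_i, p_j = 0.02, 0.03   # hypothetical one-year default probabilities
    for rho_assets in (0.1, 0.3, 0.6):
        jdf = joint_default_frequency(p_i, p_j, rho_assets)
        print(rho_assets, round(float(jdf), 5), round(default_correlation(p_i, p_j, jdf), 4))
```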
Ideally, a firm’s return can be written as the sum of the composite factor return and firm specific effects. The composite factor returns include country factor returns and industry factor returns. The country factor return has four components: the global economic effect, the regional factor effect, the sector factor effect and the country specific effect. Finally the industry factor return has four components, the global economic effect, the regional factor effect, the sector factor effect and finally the industry specific effect.
The composite (custom) market factor index for firm $k$ can be written as:

$$\phi_k = \sum_{c} w_{kc}\, r_c + \sum_{i} w_{ki}\, r_i \qquad (37)$$

In this equation set, $\phi_k$ is the composite factor index for firm $k$, $w_{kc}$ is the weight of firm $k$ in country $c$, $r_c$ is the return index for country $c$, $w_{ki}$ is the weight of firm $k$ in industry $i$, and $r_i$ is the return index for industry $i$. Finally, the weights capture the firm's exposure to each country and industry.

It is also true that

$$\sum_{c} w_{kc} = 1 \quad \text{and} \quad \sum_{i} w_{ki} = 1 \qquad (38)$$
3.2. The KMV and Merton Models
The KMV corporation refers to Kealhofer, McQuown and Vasicek (KMV), which used to provide quantitative credit analysis and was acquired by Moody's in 2002. We start with the example provided by [12]. First consider a firm whose single asset is a holding of Microsoft shares. Assume that it has a single fixed liability, a one-year discount note with a fixed par amount, and that it is otherwise funded by equity. In a year the company will either be able to pay off the note by virtue of the market value of its business, or it will default. The equity of this company is therefore equivalent to a package of one-year call options on Microsoft stock whose total exercise price equals the par amount of the note. This entire example shows that the equity of a company can be thought of as a call option on the company's underlying assets. This means, according to [13], the value of equity will depend on three factors: the market value of the company's assets, the volatility of the assets, and the payment terms of the liabilities. Merton's original model from 1974 has specific properties:
- The company has equity, a single debt liability and no other obligations
- The liability has continuous fixed coupon flow and infinite maturity
- The company has no other cash payouts like equity dividends
Merton showed that, assuming that the company's assets follow a lognormal process, this model can be solved to give a closed form expression for the value of the company's debt. The aim of this model is the valuation of the company's debt.
The KMV model is based on the probability of default of the company as a whole, rather than the valuation of the debt. The KMV model has the following properties, following [14]:
- The company can have debt or non-debt fixed liabilities, in addition to common equity and preferred stock
- Warrants, convertible debt and convertible preferred stock are allowed
- Short term obligations can be demanded by creditors, and long term obligations can be treated as perpetuities
- Any and all classes of liabilities are allowed to make fixed cash payouts
- Default occurs when the market value of the company's assets falls below a fixed point, called the default point. This default point depends on both the nature and the extent of the fixed obligations
- Default occurs for the company as a whole
In [13], the distance to default is defined to be the number of standard deviations the asset value is away from the default point by the horizon $H$.

This is calculated as

$$DD = \frac{\ln(A / DPT) + \left(\mu - \tfrac{1}{2}\sigma^{2}\right) H}{\sigma \sqrt{H}} \qquad (39)$$

Here $A$ is the current market value of the company's assets, $DPT$ is the company's default point, $\mu$ is the expected market return to the assets per unit of time, and $\sigma$ is the volatility of the market value of the company's assets per unit of time. The KMV model focuses on default risk measurement and not on debt valuation. The reason for this is that debt valuation actually has default risk measurement built into it. The other issue is that, using a lognormal model, there will be differences between actual realized default rates and predicted default rates. An example given by [13] is that of a firm that is many standard deviations from its default point: under the lognormal assumption it has essentially zero probability of default, although in reality such firms do default with a small but non-negligible frequency, and this is significant in real life. This simply means that on paper such a firm would appear to be better than the highest investment grade, whereas its realized default probability is consistent with a rating that is not even investment grade.

There is a reason why the default risk measurements are used in place of debt valuations. The debt valuations already have the default risk measurements contained in them. In other words, if the default risk measurement is accurate, so is the debt valuation. Keep in mind that the distance to default is an ordinal measure, not an absolute measure. To turn it into a probability one needs, for instance, the lognormal asset value distribution of the Merton approach. The solution to this problem is the KMV EDF (Expected Default Frequency) credit measure. The EDF is the probability of default within a given time period.
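A sketch of the distance-to-default calculation in (39), with all inputs hypothetical. The last line also reports the lognormal-model default probability $N(-DD)$, which, as discussed above, tends to understate empirically observed default frequencies.

```python
import math
from scipy.stats import norm

def distance_to_default(asset_value, default_point, mu, sigma, horizon=1.0):
    """DD = [ln(A / DPT) + (mu - sigma^2 / 2) H] / (sigma sqrt(H))."""
    return (math.log(asset_value / default_point) + (mu - 0.5 * sigma**2) * horizon) / (
        sigma * math.sqrt(horizon)
    )

if __name__ == "__main__":
    # Hypothetical firm: assets 1,000, default point 600, 8% expected asset return, 25% volatility.
    dd = distance_to_default(asset_value=1000.0, default_point=600.0, mu=0.08, sigma=0.25)
    print("distance to default:", round(dd, 3))
    print("model-implied default probability N(-DD):", norm.cdf(-dd))
```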
3.3. Prediction of Default Rates
When it comes to bond yields, one needs to consider specific ideas. These are:
- Spread volatility: the average yield spread that corresponds to a given agency rating grade changes significantly with time
- Considerable variation in the shape of spread curves: this variation is significant when spreads are viewed as a function of term
3.4. Default Rates and Firm Values
Suppose one has a cash flow $C$, due at a single future date $T$. Suppose $r$ is the continuous discount rate to $T$ for a default risk free cash flow. The option theoretic formula for the value of the cash flow today is given by

$$V = C\, e^{-rT}\big(1 - Q_T \cdot LGD\big) \qquad (40)$$

Here $Q_T$ is the so called risk neutral cumulative default probability to $T$, and $LGD$ is the loss given default term (the expected percentage loss if the borrower defaults). There is a relationship between $Q_T$, the risk neutral cumulative default probability, and $P_T$, the actual cumulative default probability to $T$. Under the assumption of lognormality, this is given by:

$$Q_T = N\left(N^{-1}(P_T) + \frac{\mu - r}{\sigma}\sqrt{T}\right) \qquad (41)$$

Here $N$ and $N^{-1}$ represent the standard cumulative normal distribution and its inverse function, $\mu$ represents the instantaneous expected return to the asset, and $\sigma$ represents the volatility of asset returns. The cash flow is valued as if the default probability were $Q_T$, and this is larger than the actual probability $P_T$. Another approximate relation between $Q_T$ and $P_T$, obtained by expanding (41) to first order in $\frac{\mu - r}{\sigma}\sqrt{T}$, is given by:

$$Q_T \approx P_T + n\big(N^{-1}(P_T)\big)\, \frac{\mu - r}{\sigma}\sqrt{T} \qquad (42)$$

where $n$ denotes the standard normal density. Assuming the risk premium $\mu - r$ is determined by the capital asset pricing model, we write

$$\mu - r = \beta\,(\mu_m - r) \qquad (43)$$

We can write this as

$$\frac{\mu - r}{\sigma} = \rho\, \frac{\mu_m - r}{\sigma_m} = \rho\, \lambda \qquad (44)$$

Here $r_a$ is the asset return, $r_m$ is the market return, $\mu_m$ is the expected market return, $\sigma$ is the standard deviation of the asset return, $\sigma_m$ is the standard deviation of the market return, $\rho$ is the correlation of $r_a$ and $r_m$, and $\lambda = (\mu_m - r)/\sigma_m$ is the market Sharpe ratio. We can write these equations together as

$$Q_T = N\left(N^{-1}(P_T) + \rho\, \lambda\, \sqrt{T}\right) \qquad (45)$$

In the case of multiple cash flows, the valuation formula becomes

$$V = \sum_{i} C_i\, e^{-r_i t_i}\big(1 - Q_{t_i} \cdot LGD\big) \qquad (46)$$
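A sketch tying together equations (41) to (46): it converts an actual cumulative default probability into its risk-neutral counterpart via the correlation and the market Sharpe ratio, and then values a set of risky cash flows. All numerical inputs are hypothetical.

```python
import math
from scipy.stats import norm

def risk_neutral_pd(actual_pd, rho, sharpe, horizon):
    """Q_T = N(N^{-1}(P_T) + rho * lambda * sqrt(T))."""
    return norm.cdf(norm.ppf(actual_pd) + rho * sharpe * math.sqrt(horizon))

def risky_value(cash_flows, r, rho, sharpe, lgd, cumulative_pd):
    """V = sum_i C_i exp(-r t_i) (1 - Q_{t_i} * LGD)."""
    value = 0.0
    for (t, c), p in zip(cash_flows, cumulative_pd):
        q = risk_neutral_pd(p, rho, sharpe, t)
        value += c * math.exp(-r * t) * (1.0 - q * lgd)
    return value

if __name__ == "__main__":
    # Hypothetical: three annual cash flows of 100, a flat 4% risk-free rate, LGD of 60%,
    # asset/market correlation 0.5, and a market Sharpe ratio of 0.4.
    cash_flows = [(1.0, 100.0), (2.0, 100.0), (3.0, 100.0)]
    cumulative_pd = [0.010, 0.025, 0.045]   # actual cumulative default probabilities P_t
    v = risky_value(cash_flows, r=0.04, rho=0.5, sharpe=0.4, lgd=0.6, cumulative_pd=cumulative_pd)
    print("value of the risky cash flows:", round(v, 2))
```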
Let us recapitulate the model and show an example. The EDF is a forward looking measure of the actual probability of default. The KMV model is based on the structural approach to calculating EDF, with the credit risk being driven by the firm value process. In order to get the actual probability of default, one goes through three steps:
- Estimation of the market value and volatility of the firm's assets
- Calculation of the distance to default, an index measure of default risk
- Scaling of the distance to default to actual probabilities of default using a default database

Essentially we are looking at two items: the estimation of the firm value $V$, and the volatility of the firm value $\sigma_V$. What usually happens is that the price of equity for most public firms is directly observable, and sometimes part of the debt is traded. Typically one has two equations:

$$E = f(V, \sigma_V, K, c, r) \qquad (47)$$

$$\sigma_E = g(V, \sigma_V, K, c, r) \qquad (48)$$

Here $K$ denotes the leverage ratio in the capital structure, $c$ is the average coupon paid on the long term debt, and $r$ is the risk free rate. One usually solves for $V$ and $\sigma_V$ from these two equations. As an example, suppose the current market value of assets is $V_0$, the net expected growth of assets per annum is $g$, the expected asset value in one year is $V_T = V_0(1 + g)$, the annualized asset volatility (in value terms) is $\sigma_A$, and the default point is $DPT$. Then the default distance is

$$DD = \frac{V_T - DPT}{\sigma_A}.$$

Among the population of all firms with the same $DD$ at one point in time, suppose there were $N$ firms, of which $D$ defaulted in a year. In this case,

$$EDF = \frac{D}{N} \qquad (49)$$
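A numeric sketch of the three-step EDF recipe above, with purely hypothetical values chosen so the arithmetic is easy to follow; they are not figures from KMV or from the cited sources.

```python
def simple_distance_to_default(expected_asset_value, default_point, asset_volatility):
    """DD = (expected asset value - default point) / asset volatility, in value units."""
    return (expected_asset_value - default_point) / asset_volatility

def empirical_edf(defaulted, population):
    """EDF = (firms with this DD that defaulted within a year) / (firms with this DD)."""
    return defaulted / population

if __name__ == "__main__":
    v0, growth = 1000.0, 0.20                # hypothetical current assets and expected growth
    vt = v0 * (1.0 + growth)                 # expected asset value in one year
    sigma_a, dpt = 100.0, 800.0              # hypothetical volatility (value units) and default point
    dd = simple_distance_to_default(vt, dpt, sigma_a)
    print("distance to default:", dd)        # (1200 - 800) / 100 = 4
    # Hypothetical default database: of 5,000 firms with DD = 4, 20 defaulted within a year.
    print("EDF:", empirical_edf(20, 5000))   # 0.004, i.e. 40 basis points
```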
Acknowledgments
The author wishes to thank their mentor Rajit Chatterjea, from The University of Southern California for guidance on the sections related to probability theory.
References
1. C. Donnelly and P. Embrechts. "The devil is in the tails: Actuarial mathematics and the subprime mortgage crisis." ASTIN Bulletin, 40, pp. 1–33 (May 2010).
2. N. L. Bowers, H. U. Gerber, J. C. Hickman, D. A. Jones, and C. J. Nesbitt. Actuarial Mathematics. The Society of Actuaries (1997).
3. D. R. Cox and D. Oakes. Analysis of Survival Data. Chapman & Hall/CRC (1998).
4. L. H. Longley-Cook. "Society of Actuaries – Actuarial application of Monte Carlo technique by Russell M. Collins Jr." (Aug. 1964).
5. R. B. Nelsen. An Introduction to Copulas. Springer (1999).
6. E. Gane, M. Samb, and J. Lo. "A simple proof of the theorem of Sklar and its extension to distribution functions." (2018).
7. D. X. Li. "On default correlation: A copula function approach." SSRN Electronic Journal (1999).
8. T. Schmidt. "Coping with copulas." (Jan. 2006).
9. H. Englund and V. Mostberg. "Probability of Default Term Structure Modeling." PhD thesis (2022). URL: https://www.diva-portal.org/smash/get/diva2:1667201/FULLTEXT03 (visited on 12/07/2023).
10. D. X. Li. "On Default Correlation: A Copula Function Approach." SSRN Electronic Journal (1999). DOI: 10.2139/ssrn.187289.
11. G. Gupton. CreditMetrics Technical Document. Yale School of Management Program on Financial Stability (1997).
12. S. Kealhofer. "Quantifying Credit Risk I: Default Prediction." Financial Analysts Journal, 59, pp. 30–44 (Jan. 2003). DOI: 10.2469/faj.v59.n1.2501.
13. S. Kealhofer. "Quantifying Credit Risk I: Default Prediction." Financial Analysts Journal, 59, pp. 30–44 (Jan. 2003). DOI: 10.2469/faj.v59.n1.2501.
14. S. Kealhofer. "Quantifying Credit Risk I: Default Prediction." Financial Analysts Journal, 59, pp. 30–44 (Jan. 2003). DOI: 10.2469/faj.v59.n1.2501.