# reverse_engineer

Personal website of Aurélien Vermylen

# Cross-sectional methods for Proxy Credit Spreads

### “Nomura” method

Recently, I’ve been working on the so-called “Nomura” method for proxy credit spreads (see this paper). The paper, published in February 2013, proposes to model a credit spread as the product of a global factor and one factor each for the sector, region, rating and seniority of the name:

$$S_i^{proxy} = M_{glob} M_{sctr(i)} M_{rgn(i)} M_{rtg(i)} M_{snty(i)}$$

The advantage of this method over the earlier “intersection” methods (where a missing counterparty’s spread is taken as the average of the spreads of comparable counterparties) is that it “learns” cross-sectionally from the spreads of all counterparties that share any of its properties. For example, if we ask for the spread of the Walloon Region in Belgium (which has no liquid CDS, so its spread is essentially unobserved), the cross-sectional method answers by multiplying the “Sub-Sovereign” factor, the “Western Europe” factor, the AA- factor and the “global” factor. In this way it uses information from all Sub-Sovereigns, all Western European counterparties and all AA- counterparties. The earlier intersection methods often fail for this kind of counterparty, because the “Western European AA- Sub-Sovereign” bucket contains few or no comparables.

The method can then be calibrated easily by moving into log-space:

$$\log{S_i^{proxy}} = \log{M_{glob}} + \log{M_{sctr(i)}} + \log{M_{rgn(i)}} + \log{M_{rtg(i)}} + \log{M_{snty(i)}}$$

The factors are then found by minimizing the squared residuals between observed and modelled log-spreads:

$$\min_{M} \sum_i (\log{S_i^{obs}} - (\log{M_{glob}} + \log{M_{sctr(i)}} + \log{M_{rgn(i)}} + \log{M_{rtg(i)}} + \log{M_{snty(i)}}))^2$$

This is easily solved with linear algebra, as it is a simple over-determined linear system:

$$\pmb{Ax} = \pmb{y}, \quad \text{with} \quad y_i = \log{S_i^{obs}}, \quad x_j = \log{M_j}, \quad A_{i,j} = \begin{cases} 1 & \text{if spread } S_i \text{ has factor } j \\ 0 & \text{otherwise} \end{cases}$$
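As a rough illustration of this calibration, here is a minimal Python sketch using NumPy. The toy spreads, the category levels and the helper names `build_system` and `proxy_spread` are all made up for the example and do not come from the paper. Note that because every row of $\pmb{A}$ has exactly one active level per category, the columns are collinear, so the individual factors are not uniquely identified; `np.linalg.lstsq` returns the minimum-norm solution, and the resulting proxy spreads do not depend on that choice.

```python
import numpy as np

DIMS = ("sector", "region", "rating", "seniority")

# Hypothetical toy data: (sector, region, rating, seniority, observed spread)
observations = [
    ("Financials", "Europe", "AA", "Senior", 0.0080),
    ("Financials", "Europe", "BBB", "Senior", 0.0150),
    ("Financials", "N.America", "AA", "Senior", 0.0070),
    ("Industrials", "Europe", "BBB", "Senior", 0.0130),
    ("Industrials", "N.America", "AA", "Senior", 0.0060),
    ("Industrials", "N.America", "BBB", "Subordinated", 0.0200),
    ("Financials", "N.America", "BBB", "Subordinated", 0.0240),
    ("Industrials", "Europe", "AA", "Subordinated", 0.0110),
]

def build_system(obs):
    """Build the 0/1 design matrix A and the log-spread vector y."""
    columns = [("global", "all")]
    for d, dim in enumerate(DIMS):
        columns += [(dim, level) for level in sorted({row[d] for row in obs})]
    index = {col: j for j, col in enumerate(columns)}

    A = np.zeros((len(obs), len(columns)))
    y = np.empty(len(obs))
    for i, row in enumerate(obs):
        A[i, 0] = 1.0  # every spread carries the global factor
        for d, dim in enumerate(DIMS):
            A[i, index[(dim, row[d])]] = 1.0
        y[i] = np.log(row[4])
    return A, y, index

A, y, index = build_system(observations)

# Least-squares solution of A x = y; x holds the log-factors log(M_j).
x, *_ = np.linalg.lstsq(A, y, rcond=None)

def proxy_spread(sector, region, rating, seniority):
    """Multiply the calibrated factors (sum the log-factors, then exponentiate)."""
    levels = (sector, region, rating, seniority)
    cols = [0] + [index[(dim, lvl)] for dim, lvl in zip(DIMS, levels)]
    return float(np.exp(x[cols].sum()))

# A bucket with no observation in the toy data still gets a proxy spread.
print(proxy_spread("Financials", "Europe", "AA", "Subordinated"))
```

The last call asks for a combination that never appears in `observations`, which is exactly the situation where an intersection method would have nothing to average.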

### Earlier ideas

A similar idea was apparently already worked on by my boss at my current employer in 2012, but never published. His approach started with ratings as the only explanatory variable. Instead of one factor per rating, it fits a straight line through the ratings, with a single “slope” parameter:

$$\log{\frac{S_i^{proxy}}{1-RR_i}} = \alpha + \beta \cdot f(Rating_i) + \sigma \cdot \epsilon_i$$

Here $f$ is a linear function of the rating $Rating_i$, which is expressed numerically. (Note that recovery-adjusted log-spreads are used here.)
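To make the regression concrete, here is a small Python sketch under several assumptions of my own: the rating scale (equally spaced, AAA = 0 to C = 18, with the CCC notches collapsed into one), the toy spreads and recoveries, and the function names `fit_rating_curve` and `proxy_spread_from_rating` are all hypothetical. It takes $f$ to be the identity on the numeric rating.

```python
import numpy as np

# Assumed equally spaced scale: AAA -> 0, ..., C -> 18 (CCC notches collapsed).
RATING_SCALE = ["AAA", "AA+", "AA", "AA-", "A+", "A", "A-",
                "BBB+", "BBB", "BBB-", "BB+", "BB", "BB-",
                "B+", "B", "B-", "CCC", "CC", "C"]

def fit_rating_curve(ratings, spreads, recoveries):
    """Fit log(S / (1 - RR)) = alpha + beta * rating_number by least squares."""
    x = np.array([RATING_SCALE.index(r) for r in ratings], dtype=float)
    y = np.log(np.asarray(spreads) / (1.0 - np.asarray(recoveries)))
    beta, alpha = np.polyfit(x, y, 1)  # polyfit returns [slope, intercept]
    residuals = y - (alpha + beta * x)
    sigma = residuals.std(ddof=2)      # two fitted parameters
    return alpha, beta, sigma

def proxy_spread_from_rating(rating, recovery, alpha, beta):
    """Map the fitted log-level back to a spread (no convexity correction)."""
    return (1.0 - recovery) * np.exp(alpha + beta * RATING_SCALE.index(rating))

# Hypothetical toy inputs, for illustration only.
ratings = ["AA", "A", "BBB", "BB", "B"]
spreads = [0.004, 0.007, 0.012, 0.025, 0.050]
recoveries = [0.4] * 5

alpha, beta, sigma = fit_rating_curve(ratings, spreads, recoveries)
print(proxy_spread_from_rating("BBB+", 0.4, alpha, beta))
```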

He then proposed a further refinement of this method: assigning a “distance” between ratings. Rather than treating AAA as 0 and C as 18 (equally spaced numbering), the idea is to optimize over the entire history of spreads for natural rating distances, so that the curve fits the observed spreads as well as possible.

This refinement is very interesting, as it identifies the behavior of Rating Agencies when they assign ratings, and we can see some interesting results from these distances, which are not shown here, because the data does not belong to me.

Another interesting result is to see how rating agencies changed their methodology after the financial crisis of 2008, by calibrating rating distances before and after the crisis and looking at the differences.

### Important pitfall with the log-space methodology

I identified a dangerous pitfall with this methodology in early 2015, in my previous job. When optimizing on log-spreads in a least-squares way (as proposed above by both the “Nomura” method and my boss’s method), the fit naturally tries to converge to the arithmetic average of the observed log-spreads. We can thus easily prove that, for each “intersectional” bucket, the method tries in spread space (within its limited number of degrees of freedom) to converge to the geometric average of the observed spreads. I’ve shown this in a small paper I wrote (see here), if you’re interested.
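The core of the argument can be sketched in the idealized case of a single bucket with $n$ observed spreads and its own free log-level $m$ (a simplification, since the factor model has fewer degrees of freedom than buckets). Setting the derivative of the least-squares objective to zero gives:

$$\frac{d}{dm} \sum_{i=1}^{n} (\log{S_i^{obs}} - m)^2 = 0 \quad \Rightarrow \quad m^* = \frac{1}{n} \sum_{i=1}^{n} \log{S_i^{obs}} \quad \Rightarrow \quad e^{m^*} = \left( \prod_{i=1}^{n} S_i^{obs} \right)^{1/n}$$

So the fitted level, mapped back to spread space, is the geometric mean of the observed spreads in that bucket.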

Since we’re converging to the geometric average of the observed spreads, we must make sure we use the results of the method coherently. It is quite clear from, for example, Markit data that recovery-adjusted log-spreads are approximately normally distributed. This means that the spreads themselves follow a log-normal distribution, and that their geometric average is quite different from their arithmetic average! There is a convexity adjustment to take into account:

$$\mathbb{E}[S_i^{proxy}] = \mathbb{GM}[S_i^{proxy}] \cdot e^{\sigma^2/2}$$
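A quick way to convince yourself of this relation is a small simulation. The sketch below draws log-normal spreads with a hypothetical median of 1% and a hypothetical log-spread volatility $\sigma = 0.5$ (both chosen purely for illustration), then compares the geometric and arithmetic means of the sample.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters: median spread of 1%, log-spread volatility of 0.5.
mu, sigma = np.log(0.01), 0.5
spreads = rng.lognormal(mean=mu, sigma=sigma, size=1_000_000)

geometric_mean = np.exp(np.log(spreads).mean())  # what a log-space fit targets
arithmetic_mean = spreads.mean()                 # the expectation of the spread
adjustment = np.exp(sigma**2 / 2)                # exp(0.125), about 1.133

print(f"geometric mean : {geometric_mean:.6f}")
print(f"arithmetic mean: {arithmetic_mean:.6f}")
print(f"GM * exp(sigma^2 / 2): {geometric_mean * adjustment:.6f}")
```

With these assumed parameters the arithmetic mean sits about 13% above the geometric mean, so using the geometric mean as if it were the expected spread understates it by a comparable amount.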

I’ve actually shown that the impact on CVA of using these geometric averages instead of arithmetic ones can be very significant. For a 1% spread with 50bp $\sigma$, we typically see a 10% underestimation of CVA.

I wonder whether a lot of banks are aware of this, and are using cross-sectional proxy spreads in a coherent manner… If you have worked on this topic and want to discuss, feel free to contact me.