Working in the superannuation industry in Australia has some great advantages. A nice atmosphere at work, gym sessions with colleagues during lunch time, endless talks about girls after hours. However, as Brian Tracy once said: *When you go to work, you work*. And this is true. And this is rewarding.

Superannuation is the Australian way of making people rich when they retire. Sadly, it is barely understood by many. It offers a wide palette of investment options for your *super* (9-12% of every salary is credited to the super account by the employer, as regulated by law). There are over 120 super funds across Australia and over 7000 options to choose from, covering all styles of investment: from cash options, through growth and diversified fixed interest, to high-growth, high-risk options. Trust me, the funds’ portfolio managers do their best to beat the benchmarks, the markets, and competing super funds.

The industry establishes standards. Regulations. Smarter and unified ways of reporting. On 29 June 2010, the Australian Prudential Regulation Authority (APRA) issued a letter to superannuation trustees advising that APRA would be providing guidance on the disclosure of superannuation investment risk to fund members. The letter to trustees states that **risk should be measured as the likely number of negative annual returns over any 20 year period**. And this is where our story begins.

**Risk Assessment**

In practice, the way super funds across Australia have reported risk since 2013 (following the definition above) still leaves a lot behind the curtain. It is not clear how the risk assessment of an individual investment option has been derived.

APRA’s guidance states that this classification system should help members ‘readily distinguish the characteristics of each investment strategy’. To achieve this, there need to be enough categories to differentiate meaningfully between the various options based on risk.

The funds report the **risk score/label** according to a seven-level classification system to provide sufficient granularity. The Joint ASFA/FSC Working Group’s analysis supports the claim that the number of negative annual returns over any 20 year period is likely to fall in the range of 0 to 7 for the majority of investment options:

Risk Band | Risk Label | Estimated number of negative annual returns over any 20 year period |
---|---|---|
1 | Very Low | Less than 0.5 |
2 | Low | 0.5 to less than 1 |
3 | Low to Medium | 1 to less than 2 |
4 | Medium | 2 to less than 3 |
5 | Medium to High | 3 to less than 4 |
6 | High | 4 to less than 6 |
7 | Very High | 6 or greater |
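The table above is essentially a threshold lookup. A minimal Python sketch of it follows; the names `BAND_LOWER_BOUNDS`, `RISK_LABELS` and `standard_risk_band` are illustrative assumptions and do not come from the APRA or ASFA/FSC guidance.

```python
from bisect import bisect_right

# Lower bounds of bands 2..7, read off the table above.
BAND_LOWER_BOUNDS = [0.5, 1, 2, 3, 4, 6]

RISK_LABELS = {
    1: "Very Low",
    2: "Low",
    3: "Low to Medium",
    4: "Medium",
    5: "Medium to High",
    6: "High",
    7: "Very High",
}


def standard_risk_band(expected_negative_years):
    """Map an estimated number of negative annual returns per 20 years to a band 1..7."""
    if expected_negative_years < 0:
        raise ValueError("the estimate cannot be negative")
    # bisect_right counts how many lower bounds are <= the estimate,
    # so a value exactly on a boundary falls into the higher band.
    return bisect_right(BAND_LOWER_BOUNDS, expected_negative_years) + 1


band = standard_risk_band(2.5)
print(band, RISK_LABELS[band])
```

Boundary values follow the "X to less than Y" wording of the table, so an estimate of exactly 0.5 lands in band 2 and exactly 6 lands in band 7.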

Having access to the past performance of an investment option does not fully solve the problem of assessing its risk under the guidance. Firstly, not every option has existed long enough to yield meaningful results: the data span reaches 10 years on only a few occasions. Secondly, the data have gaps, so monthly performance figures cannot be fetched for every option. Thirdly, access to the data depends on a huge amount of labour by dedicated people who put all the grains of sand into one jar and are then willing to sell it (e.g. the research house SuperRatings in Sydney).

Lastly, even knowing an option’s performance over the past 3 years, questions arise. How should we estimate its risk score correctly? How many times in the upcoming 20 year period will the option record a negative annual return? We need a model.

**Model**

The quantitative idea behind a unified treatment of APRA’s risk measure for investment options can be found in the fundamentals of statistics. Below we design a way of *guessing* the answer to the question we have posed.

If we recall the concept of the **Binomial Distribution**, we immediately recognise its potential application. Consider a time-series of monthly returns, $\{ r_i \}$, of total length $N$ months. If $N$ is larger than 12, we can calculate the annual return

$$
m=(N-12)+1
$$

times, as

$$
R_j = \left[ \prod_{i=j}^{j+11} (1+r_i) \right] - 1 \ \ \mbox{for} \ j = 1,\ldots,m
$$

where $r_i$ are given as decimals and $j$ denotes a specific period of 12 consecutive months. $R_j$ should be understood as a rolling annual return, subject to the statistically justified minimum requirement of $N\ge 31$ months of data for a specific option. For example, if we have 5 years of data (60 months) then we are able to get $m=(60-12)+1=49$ test annual returns based on the uninterrupted data sequences. Here, the word *test* is crucial.
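The rolling annual returns $R_j$ can be sketched in a few lines of Python. This is only an illustration: the function name `rolling_annual_returns` and the flat 1% monthly series are assumptions made up for the example, not real fund data.

```python
def rolling_annual_returns(monthly_returns, window=12):
    """Compound every run of `window` consecutive monthly returns (given as decimals)."""
    n = len(monthly_returns)
    annual = []
    # There are m = (n - window) + 1 overlapping windows; none if the series is too short.
    for j in range(n - window + 1):
        growth = 1.0
        for r in monthly_returns[j:j + window]:
            growth *= 1.0 + r
        annual.append(growth - 1.0)
    return annual


# Hypothetical option: 60 months of a constant 1% monthly return.
monthly = [0.01] * 60
annual = rolling_annual_returns(monthly)
print(len(annual))  # 49 test annual returns, matching m = (60 - 12) + 1
```

Each window shares 11 months with its neighbour, which is why these are *test* annual returns rather than 49 separate calendar years.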

We build our risk score model in the following way. For any option with $N$ months of data we construct a vector $R$ of $m$ annual returns. We count the number of negative values and denote it as $k$ out of $m$ trials. The probability that a single year closes with a negative annual return is estimated as:

$$
p=\frac{k}{m}
$$

where $p$ plays the role of the probability of success (here, obtaining a negative annual return). Under the *assumption of independent trials*, the probability of obtaining exactly $k$ successes out of, in general, $n$ trials follows the Binomial Distribution, with the probability mass function:

$$
\Pr(X=k) = {{n}\choose{k}} p^k (1-p)^{n-k}
$$

where

$$
{{n}\choose{k}} = \frac{n!}{k!(n-k)!} \ \ .
$$

Note that $k$ now denotes the number of successes in $n$ trials, not the count of negative values used to estimate $p$. From the analysis of the option's performance data, we aim to calculate the probability of $k$ negative annual returns in any $n=20$ trials, namely:

$$
\Pr(X=k) = {{20}\choose{k}} p^k (1-p)^{20-k}
$$
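To make the formula concrete, here is a minimal Python sketch that estimates $p$ from a count of negative test returns and then finds the most probable number of negative years out of 20. The function names and the example counts (5 negative values out of 49 test returns) are hypothetical. The sketch scans every outcome from 0 to 20, which is a simplification of my own and differs from the macro in this article, which scans 1 to 7.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1.0 - p) ** (n - k)


def most_likely_negative_years(negative_count, m, n=20):
    """Return the most probable number of negative years in n, plus all probabilities."""
    if m <= 0:
        raise ValueError("need at least one test annual return")
    p = negative_count / m  # estimated chance of a negative annual return
    probs = [binomial_pmf(k, n, p) for k in range(n + 1)]
    mode = max(range(n + 1), key=lambda k: probs[k])
    return mode, probs


# Hypothetical option: 5 negative values among 49 rolling annual returns.
mode, probs = most_likely_negative_years(5, 49)
print(mode)  # 2
```

With $p = 5/49$, the ratio $\Pr(X=2)/\Pr(X=1) = \frac{19}{2}\cdot\frac{5}{44}$ exceeds 1 while $\Pr(X=3)/\Pr(X=2) = \frac{18}{3}\cdot\frac{5}{44}$ does not, so two negative years is the most probable outcome.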

The model can be easily coded in Excel/VBA as a macro. The core of the algorithm that assesses an option’s risk score is shown below:

    If m > 0 Then
        ' probability of "success": a negative rolling annual return
        Dim p As Double
        p = k / m

        ' binomial probability of r negative years out of 20, for r = 1..7
        Dim pr As Double, trials As Integer, r As Integer, newton As Double
        Dim ans As Integer, v0 As Double, v1 As Double
        v0 = -1
        v1 = -1
        trials = 20
        For r = 1 To 7
            ' binomial coefficient: trials choose r
            newton = Application.Fact(trials) / _
                     (Application.Fact(r) * Application.Fact(trials - r))
            pr = newton * Application.Power(p, r) * Application.Power(1 - p, trials - r)
            ' remember the most probable r seen so far
            If pr > v0 Then
                v0 = pr
                v1 = r
            End If
        Next r

        ' a new risk score
        If v1 = 1 Then
            scoreSR = 1 ' "Very Low"
        ElseIf v1 = 2 Then
            scoreSR = 2 ' "Low"
        ElseIf v1 = 3 Then
            scoreSR = 3 ' "Low to Medium"
        ElseIf v1 = 4 Then
            scoreSR = 4 ' "Medium"
        ElseIf v1 = 5 Then
            scoreSR = 5 ' "Medium to High"
        ElseIf v1 = 6 Then
            scoreSR = 6 ' "High"
        ElseIf v1 = 7 Then
            scoreSR = 7 ' "Very High"
        Else
            scoreSR = ""
        End If
    End If

The resulting solution may lead us to the **reformulation of the risk assessment** for individual investment options as presented in the following table:

Risk Band | New Risk Label | Estimated number of negative annual returns over any 20 year period ($k$) |
---|---|---|
1 | Very Low | 0 to 1 |
2 | Low | 2 |
3 | Low to Medium | 3 |
4 | Medium | 4 |
5 | Medium to High | 5 |
6 | High | 6 |
7 | Very High | 7 or greater |
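Read as a rule, the reformulated table maps an integer $k$ to a band. A small Python sketch of that mapping follows; the names `NEW_RISK_LABELS` and `new_risk_band` are illustrative assumptions, not part of the original macro.

```python
NEW_RISK_LABELS = {
    1: "Very Low",
    2: "Low",
    3: "Low to Medium",
    4: "Medium",
    5: "Medium to High",
    6: "High",
    7: "Very High",
}


def new_risk_band(k):
    """Map the most probable number of negative years in 20 to a band 1..7."""
    if k < 0:
        raise ValueError("k cannot be negative")
    if k <= 1:
        return 1  # "0 to 1" shares the lowest band
    return min(k, 7)  # 2..6 map to themselves, 7 or more is capped at band 7


for k in (0, 2, 9):
    print(k, NEW_RISK_LABELS[new_risk_band(k)])
```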

Intuitively, any Cash Option that always records positive monthly returns is also ranked (by our model) as an option of *Very Low* risk. You can apply this model to any monthly return series, since we only play with the numbers. This is what quants do best. Number crunching!

**Nasty Assumption**

The motivation behind the model lies in one word in the definition we highlighted at the very beginning: *Estimated number of negative annual returns over* **any** *20 year period*. ‘Any’ allows us to choose any sequence of 12 consecutive monthly returns, which pays off in a larger number of trials under the assumption of independence.

Yeah, right. The nasty piece of the model: the independence of individual trials. This is a cornerstone of the Binomial Distribution. As a smart quant you can question its validity within our model. I wrestled with this problem for a while and came to an **empirical** lemma supporting our line of defence.

The independence of individual trials can be justified by analogy with the following experiment. Imagine a sheet of A4 paper. Its colour changes gradually from left to right, from black to light grey, and is constant in the vertical direction. We cut the sheet horizontally into a finite number of strips 1 cm high, starting every new strip 1 cm further to the right (the concept of a sliding window in time-series analysis). Any comparison of two adjacent strips reveals a finite common area of the same colour, as defined by the gradient: correlation. We put all the strips into one jar and shake it. Next we randomly pull strips out of the jar. Each strip is an independent trial out of all the strips placed inside the jar, as the memory of the correlation between two formerly adjacent strips has been lost.
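The overlap between adjacent strips can be put into numbers. The Python sketch below compares rolling annual returns one month apart (sharing 11 months) with those twelve months apart (sharing none). The monthly returns are simulated with an arbitrary mean and volatility chosen for illustration, not taken from any fund, and the helper names are my own. It only measures the overlap before the strips go into the jar.

```python
import random


def rolling_annual_returns(monthly_returns, window=12):
    """Compound every run of `window` consecutive monthly returns."""
    annual = []
    for j in range(len(monthly_returns) - window + 1):
        growth = 1.0
        for r in monthly_returns[j:j + window]:
            growth *= 1.0 + r
        annual.append(growth - 1.0)
    return annual


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def lagged_correlation(series, lag):
    """Correlation between the series and itself shifted by `lag` steps."""
    return pearson(series[:-lag], series[lag:])


rng = random.Random(42)  # fixed seed so the run is repeatable
monthly = [rng.gauss(0.005, 0.03) for _ in range(1200)]  # simulated, independent months
annual = rolling_annual_returns(monthly)

lag1 = lagged_correlation(annual, 1)    # windows sharing 11 of 12 months
lag12 = lagged_correlation(annual, 12)  # windows sharing no months
print(round(lag1, 2), round(lag12, 2))
```

For independent monthly returns, the first figure should sit close to 1 and the second close to 0, which is the correlation between adjacent strips that the jar is meant to erase.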

Invest smartly and don’t worry. Correlations are everywhere.