Quantitative Analysis, Risk Management, Modelling, Algo Trading, and Big Data Analysis

Predicting Heavy and Extreme Losses in Real-Time for Portfolio Holders (2)

This part is awesome. Trust me! Previously, in Part 1, we examined two independent methods for estimating the probability of a very rare event (a heavy or extreme loss) that an asset could experience on the next day. The first one was based on the computation of the tail probability, i.e.: given the upper threshold $L_{thr}$ find the corresponding $\alpha$ such that $ Pr(L < -L_{thr}) = \alpha$, where the random variable $L$ denotes a daily asset return (in percent). In this case, we showed that this approach was strongly dependent on the number of similar events in the past that would contribute to the changes of $\alpha$'s in time. The second method was based on the classical concept of conditional probability. For example, given that the index (benchmark, etc.) drops by $L'$, what is the probability that an asset will lose $L\le L'$? We found that this method fails if the benchmark never experienced a loss of $L'$.

In both cases, the lack of information on the occurrence of a rare event was the major problem. Therefore, is there any tool we can use to predict an event that has not happened yet?!

This post addresses an appealing solution and delivers a black-box that every portfolio holder can use as an integrated part of a risk management computer system.

1. Bayesian Conjugates

Bayesian statistics treats the problem of finding the probability of an event a little bit differently than what we know from school. For example, if we flip a fair coin, in the long run we expect the same proportion of heads and tails. Formally, this is known as the frequentist interpretation. The Bayesian interpretation assumes nothing about the coin (is it fair or weighted?) and, with every toss of the coin, it builds a grander picture of the underlying probability distribution. And it's fascinating because we can start with an assumption that the coin is made such that we should observe heads more frequently whereas, after $N$ tosses, we may find that the coin is fair.
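To see this updating mechanism in action, here is a minimal sketch (my own illustration, not part of the original post): we start from a deliberately biased prior Beta(4, 1), i.e. "heads are frequent", toss a fair coin, and watch the posterior mean of the head probability drift towards 0.5 as the number of tosses grows.

import numpy as np

rng = np.random.default_rng(0)
prior_a, prior_b = 4.0, 1.0              # biased prior belief about the coin
for N in [10, 100, 1000, 10000]:
    tosses = rng.integers(0, 2, size=N)  # fair coin: 1 = head, 0 = tail
    a = prior_a + tosses.sum()           # conjugate Beta update
    b = prior_b + N - tosses.sum()
    print("N = %5d   posterior mean Pr(head) = %.3f" % (N, a/(a + b)))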

Let me skip the theoretical part on that aspect of Bayesian statistics, which has been beautifully explained by Mike Halls-Moore in his two articles (Bayesian Statistics: A Beginner's Guide and Bayesian Inference Of A Binomial Proportion – The Analytical Approach) that I recommend you study before reading this post further down.

Looking along the return-series of any asset (spanning $M$ days), we may denote by $x_i = 1$ ($i=1,\dots,M$) a day on which a rare event took place (e.g. the asset lost 16% or more) and by $x_i = 0$ the lack of such an event. Treating the days as a series of Bernoulli trials with unknown probability $p$, if we start with an uninformed prior, such as Beta($\alpha, \beta$) with $\alpha=\beta=1$, then the concept of Bayesian conjugates can be applied in order to update the prior belief. The update is given by:
$$
\alpha' = \alpha + n_e = \alpha + \sum_{i=1}^{N\le M} x_i \\
\beta' = \beta + N - n_e = \beta + N - \sum_{i=1}^{N\le M} x_i
$$
which simply means that after $N$ days from some initial moment in the past, having observed a rare event $n_e$ times, the probability distribution is given by Beta($\alpha', \beta'$). Therefore, the posterior predictive mean for the Bernoulli likelihood can be interpreted as the new probability of the event taking place on the next day:
$$
\mbox{Pr}(L < -L_{thr}) = \frac{\alpha'}{\alpha' + \beta'}
$$
And that's the key concept we are going to use now for the estimation of the probability of heavy or extreme losses that an asset can experience in trading.

As a quick visualisation, imagine you analyse the last 252 trading days of AAPL stock. The stock never lost more than 90% in one day; no such extreme event ever took place. According to our new method, the probability that AAPL will plummet 90% (or more) in the next trading session is not zero(!) but
$$
\mbox{Pr}(L < -90\%) = \frac{\alpha'}{\alpha' + \beta'} = \frac{1+0}{(1+0)+(1+252-0)} \approx 0.3937\%
$$
with the 95% confidence interval:
$$
\left[ B^{-1}(0.05, \alpha', \beta');\ B^{-1}(0.95, \alpha', \beta') \right]
$$
where $B^{-1}$ denotes the percent point function (quantile function) of the Beta distribution.

The expected number of events corresponding to a loss of $L < -L_{thr}$ during the next 252 trading days we find as:
$$
E(n_e) = \left\lceil 252\, B^{-1}(0.95, \alpha', \beta') \right\rceil
$$
where $\lceil\ \rceil$ denotes rounding.
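Before moving on, a minimal sketch (assuming scipy is installed) that reproduces the AAPL toy numbers quoted above:

from scipy.stats import beta

N, ne = 252, 0                       # observed days and extreme events
a1, b1 = 1 + ne, 1 + N - ne          # posterior Beta(alpha', beta')
pr = a1/(a1 + b1)                    # posterior predictive mean
cl1 = beta.ppf(0.05, a1, b1)
cl2 = beta.ppf(0.95, a1, b1)
print("Pr(L < -90%%) = %.4f%%   CI = [%.4f%%, %.4f%%]"
      % (pr*100, cl1*100, cl2*100))
print("E(ne) over the next 252 days = %g" % round(252*cl2))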

2. Analysis of Facebook since its IPO

Having a tool that incorporates the information on whether an event took place or not is powerful. You may also notice that the more information we gather (i.e. $N\to\infty$) the closer we get to the "true" probability function. It also means that the spread around the new posterior predictive mean:
$$
\sigma = \sqrt{ \frac{\alpha' \beta'} { (\alpha'+\beta')^2 (\alpha'+\beta'+1) } } \ \to\ 0 .
$$
It is logical that the analysis of an asset traded daily for the past 19 years will deliver a "more trustworthy" estimation of $\mbox{Pr}(L < -L_{thr})$ than a situation where we take into account only a limited period of time. However, what if the stock is a newbie in the market?
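How quickly does that spread shrink? A minimal check (my own sketch) for $n_e = 0$ and a growing number of observed days $N$:

import numpy as np

for N in [252, 1260, 5040]:
    a, b = 1.0, 1.0 + N              # posterior for ne = 0 events
    sd = np.sqrt(a*b/((a + b)**2 * (a + b + 1)))
    print("N = %4d   sigma = %.6f" % (N, sd))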

Let's investigate the case of Facebook (NASDAQ:FB). The record we have spans a bit over 3.5 years, back to Facebook's initial public offering (IPO) on May 18, 2012. From that day till today (December 6, 2015) the return-series (daily data) is $M=892$ points long. Using the analytical formulations given above and writing the Python code:

# Predicting Heavy and Extreme Losses in Real-Time for Portfolio Holders (2)
# (c) 2015 Pawel Lachowicz, QuantAtRisk.com
 
import numpy as np
from scipy.stats import beta
import matplotlib.pyplot as plt
import pandas_datareader.data as web
 
 
# Fetching Yahoo! Finance for Facebook
data = web.DataReader("FB", data_source='yahoo',
                  start='2012-05-18', end='2015-12-04')['Adj Close']
cp = np.array(data.values)  # daily adj-close prices
ret = cp[1:]/cp[:-1] - 1    # compute daily returns
N = len(ret)
 
# Plotting FB price- and return-series
plt.figure(num=2, figsize=(9, 6))
plt.subplot(2, 1, 1)
plt.plot(cp)
plt.axis("tight")
plt.ylabel("FB Adj Close [USD]")
plt.subplot(2, 1, 2)
plt.plot(ret, color=(.6, .6, .6))
plt.axis("tight")
plt.ylabel("Daily Returns")
plt.xlabel("Time Period [days]")
#plt.show()
 
# provide a threshold
Lthr = -0.21
 
# how many events of L < -21% occurred?
ne = np.sum(ret < Lthr)
# of what magnitude?
ev = ret[ret < Lthr]
# avgloss = np.mean(ev)  # if ev is non-empty array
 
 
# prior
alpha0 = beta0 = 1
# posterior
alpha1 = alpha0 + ne
beta1 = beta0 + N - ne
pr = alpha1/(alpha1+beta1)
cl1 = beta.ppf(0.05, alpha1, beta1)
cl2 = beta.ppf(0.95, alpha1, beta1)
ne252 = np.round(252/(1/cl2))
 
print("ne = %g" % ne)
print("alpha', beta' = %g, %g" % (alpha1, beta1))
print("Pr(L < %3g%%) = %5.2f%%\t[%5.2f%%, %5.2f%%]" % (Lthr*100, pr*100,
                                                       cl1*100, cl2*100))
print("E(ne) = %g" % ne252)

first we visualise both the price- and return-series:

[Figure 2: FB adjusted close price-series (top) and daily return-series (bottom)]

and from the computations we find:

ne = 0
alpha', beta' = 1, 893
Pr(L < -21%) =  0.11%	[ 0.01%,  0.33%]
E(ne) = 1

There is no record of an event within the 892 days of FB trading in which the stock lost 21% or more in a single session, i.e. $n_e = 0$. However, the probability that such an event may happen at the end of the next NASDAQ session is 0.11%. Non-zero but nearly zero.

A quick verification of the probability estimate returned by our Bayesian method can be obtained by changing the value of the threshold from -0.21 to Lthr = 0. Then, we get:

alpha', beta' = 422, 472
Pr(L <   0%) = 47.20%	[44.46%, 49.95%]
E(ne) = 126

i.e. a 47.2% chance that on the next day the stock will close lower. It is in excellent agreement with what can be calculated directly from the return-series, i.e. there were 421 days with negative daily returns, thus $421/892 = 0.4719$. Really cool, don't you think?!
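As a two-line sanity check (reusing ret and numpy from the script above):

ne0 = np.sum(ret < 0.0)              # number of sessions closing lower
print(ne0, ne0/len(ret))             # should display 421 and ca. 0.4719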

Now, imagine for a second that, starting a few days after the Facebook IPO, we recalculate the posterior predictive mean day-by-day, eventually reaching the most current estimate of the probability that the stock may lose -21% or more at the end of the next trading day. Such a procedure allows us to plot the change of $\mbox{Pr}(L < -L_{thr})$ over the whole trading history of FB. We achieve that by running a modified version of the previous code:

import numpy as np
from scipy.stats import beta
import matplotlib.pyplot as plt
import pandas_datareader.data as web
 
 
data = web.DataReader("FB", data_source='yahoo',
                  start='2012-05-18', end='2015-12-05')['Adj Close']
cp = np.array(data.values)  # daily adj-close prices
returns = cp[1:]/cp[:-1] - 1   # compute daily returns
 
Lthr = -0.21
 
Pr = CL1 = CL2 = np.array([])
for i in range(2, len(returns)):
    # data window
    ret = returns[0:i]
    N = len(ret)
    ne = np.sum(ret < Lthr)
    # prior
    alpha0 = beta0 = 1
    # posterior
    alpha1 = alpha0 + ne
    beta1 = beta0 + N - ne
    pr = alpha1/(alpha1+beta1)
    cl1 = beta.ppf(0.05, alpha1, beta1)
    cl2 = beta.ppf(0.95, alpha1, beta1)
    #
    Pr = np.concatenate([Pr, np.array([pr*100])])
    CL1 = np.concatenate([CL1, np.array([cl1*100])])
    CL2 = np.concatenate([CL2, np.array([cl2*100])])
 
plt.figure(num=1, figsize=(10, 5))
plt.plot(Pr, "k")
plt.plot(CL1, "r--")
plt.plot(CL2, "r--")
plt.axis("tight")
plt.ylim([0, 5])
plt.grid(True)
plt.show()

which displays:

[Figure 1: running probability estimate Pr(L < -21%) for FB with the 95% confidence interval]

i.e. after 892 days we reach the value of 0.11% as found earlier. The black line tracks the change of the probability estimate while both red dashed lines denote the 95% confidence interval. The black line is smooth: because of the lack of any -21% event in the past, the recalculated probability never changed its value drastically.

This is not the case when we consider the scenario for Lthr = -0.11, delivering:

[Figure: running probability estimate Pr(L < -11%) for FB with the 95% confidence interval]

which stays in agreement with:

ne = 1
alpha', beta' = 2, 892
Pr(L < -11%) =  0.22%	[ 0.04%,  0.53%]
E(ne) = 1

and confirms that only one daily loss of $L < -11\%$ took place during the entire happy life of Facebook on NASDAQ so far.

Since the computer can count the number of losses falling within a given interval, that gives us an opportunity to estimate:
$$
\mbox{Pr}(L_1 \le L < L_2)
$$
in the following way:

import numpy as np
from scipy.stats import beta
import matplotlib.pyplot as plt
import pandas_datareader.data as web
 
 
data = web.DataReader("FB", data_source='yahoo',
                  start='2012-05-18', end='2015-12-05')['Adj Close']
cp = np.array(data.values)  # daily adj-close prices
ret = cp[1:]/cp[:-1] - 1    # compute daily returns
N = len(ret)
 
dL = 2
for k in range(-50, 0, dL):
    ev = ret[(ret >= k/100) & (ret < (k/100+dL/100))]
    avgloss = 0
    ne = np.sum((ret >= k/100) & (ret < (k/100+dL/100)))
    #
    # prior
    alpha0 = beta0 = 1
    # posterior
    alpha1 = alpha0 + ne
    beta1 = beta0 + N - ne
    pr = alpha1/(alpha1+beta1)
    cl1 = beta.ppf(0.05, alpha1, beta1)
    cl2 = beta.ppf(0.95, alpha1, beta1)
    if(len(ev) > 0):
        avgloss = np.mean(ev)
        print("Pr(%3g%% < L < %3g%%) =\t%5.2f%%\t[%5.2f%%, %5.2f%%]  ne = %4g"
          "  E(L) = %.2f%%" % \
          (k, k+dL, pr*100, cl1*100, cl2*100, ne, avgloss*100))
    else:
        print("Pr(%3g%% < L < %3g%%) =\t%5.2f%%\t[%5.2f%%, %5.2f%%]  ne = " \
              "%4g" %  (k, k+dL, pr*100, cl1*100, cl2*100, ne))

By doing so, we visualise the distribution of potential losses (that may occur on the next trading day) along with their magnitudes:

Pr(-50% < L < -48%) =	 0.11%	[ 0.01%,  0.33%]  ne =    0
Pr(-48% < L < -46%) =	 0.11%	[ 0.01%,  0.33%]  ne =    0
Pr(-46% < L < -44%) =	 0.11%	[ 0.01%,  0.33%]  ne =    0
Pr(-44% < L < -42%) =	 0.11%	[ 0.01%,  0.33%]  ne =    0
Pr(-42% < L < -40%) =	 0.11%	[ 0.01%,  0.33%]  ne =    0
Pr(-40% < L < -38%) =	 0.11%	[ 0.01%,  0.33%]  ne =    0
Pr(-38% < L < -36%) =	 0.11%	[ 0.01%,  0.33%]  ne =    0
Pr(-36% < L < -34%) =	 0.11%	[ 0.01%,  0.33%]  ne =    0
Pr(-34% < L < -32%) =	 0.11%	[ 0.01%,  0.33%]  ne =    0
Pr(-32% < L < -30%) =	 0.11%	[ 0.01%,  0.33%]  ne =    0
Pr(-30% < L < -28%) =	 0.11%	[ 0.01%,  0.33%]  ne =    0
Pr(-28% < L < -26%) =	 0.11%	[ 0.01%,  0.33%]  ne =    0
Pr(-26% < L < -24%) =	 0.11%	[ 0.01%,  0.33%]  ne =    0
Pr(-24% < L < -22%) =	 0.11%	[ 0.01%,  0.33%]  ne =    0
Pr(-22% < L < -20%) =	 0.11%	[ 0.01%,  0.33%]  ne =    0
Pr(-20% < L < -18%) =	 0.11%	[ 0.01%,  0.33%]  ne =    0
Pr(-18% < L < -16%) =	 0.11%	[ 0.01%,  0.33%]  ne =    0
Pr(-16% < L < -14%) =	 0.11%	[ 0.01%,  0.33%]  ne =    0
Pr(-14% < L < -12%) =	 0.11%	[ 0.01%,  0.33%]  ne =    0
Pr(-12% < L < -10%) =	 0.34%	[ 0.09%,  0.70%]  ne =    2  E(L) = -11.34%
Pr(-10% < L <  -8%) =	 0.67%	[ 0.29%,  1.17%]  ne =    5  E(L) = -8.82%
Pr( -8% < L <  -6%) =	 0.89%	[ 0.45%,  1.47%]  ne =    7  E(L) = -6.43%
Pr( -6% < L <  -4%) =	 2.91%	[ 2.05%,  3.89%]  ne =   25  E(L) = -4.72%
Pr( -4% < L <  -2%) =	 9.84%	[ 8.26%, 11.53%]  ne =   87  E(L) = -2.85%
Pr( -2% < L <   0%) =	33.11%	[30.54%, 35.72%]  ne =  295  E(L) = -0.90%

For example, based on the complete trading history of FB, there is a probability of 0.34% that the stock will close with a loss between -12% and -10%, and if so, the "expected" loss would be -11.3% (estimated based solely on 2 events).

Again. The probability, not certainty.

Student t Distributed Linear Value-at-Risk

One of the most underestimated features of financial asset distributions is their kurtosis. A rough approximation of the asset return distribution by the Normal distribution often becomes an evident exaggeration or misinterpretation of the facts. And we know that. The problem arises when we investigate a Value-at-Risk (VaR) measure. Within the standard approach, it is computed from the analytical formula:
$$
\mbox{VaR}_{h, \alpha} = \Phi ^{-1} (1-\alpha)\sqrt{h}\sigma - h\mu
$$
which expresses the $(1-\alpha)$ $h$-day VaR, where $\Phi ^{-1} (1-\alpha)$ denotes the $(1-\alpha)$ quantile of the standardised Normal distribution and $h$ is the time horizon. The majority of stocks display significant excess kurtosis and negative skewness. In plain English that means that the distribution is peaked and likely to have a heavy left tail. In this scenario, the best fit of the Normal probability density function (pdf) to the asset return distribution underestimates the risk accumulated in far negative territory.
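As a quick back-of-the-envelope illustration (my rounded example, not taken from the data analysed below): for an asset with a daily volatility of $\sigma \approx 1.2\%$ and a negligible daily mean, the 99% 1-day Normal VaR would be:
$$
\mbox{VaR}_{1,\ 0.01} \approx \Phi^{-1}(0.99)\,\sigma \approx 2.326 \times 1.2\% \approx 2.8\% .
$$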

In this post we will introduce the concept of Student t Distributed Linear VaR, i.e. an alternative method to measure VaR by taking into account “a correction” for kurtosis of the asset return distribution. We will use the market stock data of IBM as an exemplary case study and investigate the difference in a standard and non-standard VaR calculation based on the parametric models. It will also be an excellent opportunity to learn how to do it in Python, quickly and effectively.

1. Leptokurtic Distributions and Student t VaR

A leptokurtic distribution is one whose density function has a higher peak and greater mass in the tails than the normal density function of the same variance. In a symmetric unimodal distribution, i.e. one whose density function has only one peak, leptokurtosis is indicated by a positive excess kurtosis (Alexander 2008). The impact of leptokurtosis on VaR is non-negligible. Firstly, assuming high significance levels, e.g. $\alpha \le 0.01$, the estimation of VaR for Normal and leptokurtic distribution would return:
$$
\mbox{VaR}_{lepto,\ \alpha\le 0.01} \gt \mbox{VaR}_{Norm,\ \alpha\le 0.01}
$$
which is fine, however for low significance levels the situation may change:
$$
\mbox{VaR}_{lepto,\ \alpha\ge 0.05} \lt \mbox{VaR}_{Norm,\ \alpha\ge 0.05} \ .
$$
A good example of a leptokurtic distribution is the Student t distribution. If it describes the data better and displays significant excess kurtosis, there is a good chance that VaR for high significance levels will be a much better measure of the tail risk. The Student t distribution is formally given by its pdf:
$$
f_\nu(x) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\left(\frac{\nu}{2}\right)}
(1+\nu^{-1} x^2)^{-(\nu+1)/2}
$$
where $\Gamma$ denotes the gamma function and $\nu$ the degrees of freedom. For $\nu\gt 2$ its variance is $\nu(\nu-2)^{-1}$ and the distribution has a finite excess kurtosis of $\kappa = 6(\nu-4)^{-1}$ for $\nu \gt 4$. It is obvious that for $\nu\to \infty$ the Student t pdf approaches the Normal pdf. Large values of $\nu$ for the fitted Student t pdf, in the case of financial asset daily return distributions, are unlikely, unless the Normal distribution is indeed the best pdf model (possible to obtain for very large samples).
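As a quick sanity check of those moment formulas (a sketch of mine, assuming scipy.stats), one can compare them against the moments scipy reports for the standardised Student t distribution:

from scipy.stats import t

for nu in [5, 10, 30]:
    var, kurt = t.stats(nu, moments='vk')
    # theory: variance = nu/(nu-2), excess kurtosis = 6/(nu-4)
    print("nu = %2d   var = %.4f vs %.4f   excess kurt = %.4f vs %.4f"
          % (nu, var, nu/(nu - 2), kurt, 6/(nu - 4)))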

We find the $\alpha$ quantile of the standardised Student t distribution ($\mu=0$ and $\sigma=1$) by integration of the pdf and usually denote it as $\sqrt{\nu^{-1}(\nu-2)}\, t_{\nu}^{-1}(\alpha)$, where $t_{\nu}^{-1}$ is the percent point function of the t distribution. Accounting for the fact that in practice we deal with random variables of the form $X = \mu + \sigma T$, where $X$ describes the asset return given as a transformation of the standardised Student t random variable, we arrive at the formal definition of the $(1-\alpha)$ $h$-day Student t VaR given by:
$$
\mbox{Student}\ t\ \mbox{VaR}_{\nu, \alpha, h} = \sqrt{\nu^{-1}(\nu-2)h}\ t_{\nu}^{-1}(1-\alpha)\sigma - h\mu
$$
Having that, we are ready to use this knowledge in practice and to understand, based on data analysis, the best regimes where one should (and should not) apply the Student t VaR as a more "correct" measure of risk.
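For convenience, the formula can be wrapped as a small helper function (my own sketch, not part of the post's scripts); sigma and mu denote the sample standard deviation and mean of the daily returns:

from scipy.stats import t

def student_t_var(nu, sigma, mu, alpha=0.01, h=1):
    # (1-alpha) h-day Student t VaR as defined above
    return (h*(nu - 2)/nu)**0.5 * t.ppf(1 - alpha, nu)*sigma - h*mu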

2. Normal and Student t VaR for IBM Daily Returns

In the following case study we will make use of the Yahoo! Finance data provider in order to fetch the price-series of the IBM stock (adjusted close) and transform it into a return-series. The first task we have to complete before heading further down this road is verification of the sample skewness and kurtosis. We begin, as usual:

# Student t Distributed Linear Value-at-Risk
#   The case study of VaR at the high significance levels
# (c) 2015 Pawel Lachowicz, QuantAtRisk.com
 
import numpy as np
import math
from scipy.stats import skew, kurtosis, kurtosistest
import matplotlib.pyplot as plt
from scipy.stats import norm, t
import pandas_datareader.data as web
 
 
# Fetching Yahoo! Finance for IBM stock data
data = web.DataReader("IBM", data_source='yahoo',
                  start='2010-12-01', end='2015-12-01')['Adj Close']
cp = np.array(data.values)  # daily adj-close prices
ret = cp[1:]/cp[:-1] - 1    # compute daily returns
 
# Plotting IBM price- and return-series
plt.figure(num=2, figsize=(9, 6))
plt.subplot(2, 1, 1)
plt.plot(cp)
plt.axis("tight")
plt.ylabel("IBM Adj Close [USD]")
plt.subplot(2, 1, 2)
plt.plot(ret, color=(.6, .6, .6))
plt.axis("tight")
plt.ylabel("Daily Returns")
plt.xlabel("Time Period 2010-12-01 to 2015-12-01 [days]")

where we cover the most recent 5 years of IBM trading. Both series visualised look like:

[Figure: IBM adjusted close price-series (top) and daily return-series (bottom)]

where, in particular, the bottom one reveals five days on which the stock experienced a daily loss of 5% or more.
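A one-line cross-check of that observation (reusing ret and np from the script above):

print(np.sum(ret <= -0.05))   # count of days with a daily loss of 5% or more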

In Python, a correct computation of the sample skewness and kurtosis is possible thanks to the scipy.stats module. As a test you may verify that for a huge sample (e.g. of 10 million) of random variables $X\sim N(0, 1)$, the expected skewness and kurtosis,

import numpy as np
from scipy.stats import skew, kurtosis

X = np.random.randn(10000000)
print(skew(X))
print(kurtosis(X, fisher=False))

are indeed equal to 0 and 3,

0.0003890933008049
3.0010747070253507

respectively.

As we will convince ourselves in a second, this is not the case for our IBM return sample. Continuing,

print("Skewness  = %.2f" % skew(ret))
print("Kurtosis  = %.2f" % kurtosis(ret, fisher=False))
# H_0: the null hypothesis that the kurtosis of the population from which the
# sample was drawn is that of the normal distribution kurtosis = 3(n-1)/(n+1)
_, pvalue = kurtosistest(ret)
beta = 0.05
print("p-value   = %.2f" % pvalue)
if(pvalue < beta):
    print("Reject H_0 in favour of H_1 at %.5f level\n" % beta)
else:
    print("Accept H_0 at %.5f level\n" % beta)

we compute the moments and run the (excess) kurtosis significance test. As already explained in the comment, we are interested in whether the derived value of kurtosis is significant at the $\beta=0.05$ level. We find that for IBM the results reveal:

Skewness  = -0.70
Kurtosis  = 8.39
p-value   = 0.00
Reject H_0 in favour of H_1 at 0.05000 level

id est, we reject the null hypothesis, that the kurtosis of the population from which the sample was drawn is that of the normal distribution, in favour of the alternative hypothesis ($H_1$) that it is not. And if not, then we have work to do!

In the next step we fit the IBM daily returns data sample with both Normal pdf and Student t pdf in the following way:

# N(x; mu, sig) best fit (finding: mu, stdev)
mu_norm, sig_norm = norm.fit(ret)
dx = 0.0001  # resolution
x = np.arange(-0.1, 0.1, dx)
pdf = norm.pdf(x, mu_norm, sig_norm)
print("Integral norm.pdf(x; mu_norm, sig_norm) dx = %.2f" % (np.sum(pdf*dx)))
print("Sample mean  = %.5f" % mu_norm)
print("Sample stdev = %.5f" % sig_norm)
print()
 
# Student t best fit (finding: nu)
parm = t.fit(ret)
nu, mu_t, sig_t = parm
pdf2 = t.pdf(x, nu, mu_t, sig_t)
print("Integral t.pdf(x; mu, sig) dx = %.2f" % (np.sum(pdf2*dx)))
print("nu = %.2f" % nu)
print()

where we ensure that the complete integral over each pdf returns 1, as expected:

Integral norm.pdf(x; mu_norm, sig_norm) dx = 1.00
Sample mean  = 0.00014
Sample stdev = 0.01205
 
Integral t.pdf(x; mu, sig) dx = 1.00
nu = 3.66

The best fit of the Student t pdf returns an estimate of 3.66 degrees of freedom. Note that $\nu = 3.66 \lt 4$ formally implies a non-finite excess kurtosis of the fitted model, which underlines how heavy the empirical tails are.

From this point, the last step is the derivation of two distinct VaR measures, the first one corresponding to the best fit of the Normal pdf and the second one to the best fit of the Student t pdf, respectively. The trick is

# Compute VaR
h = 1  # days
alpha = 0.01  # significance level
StudenthVaR = (h*(nu-2)/nu)**0.5 * t.ppf(1-alpha, nu)*sig_norm - h*mu_norm
NormalhVaR = norm.ppf(1-alpha)*sig_norm - mu_norm
 
lev = 100*(1-alpha)
print("%g%% %g-day Student t VaR = %.2f%%" % (lev, h, StudenthVaR*100))
print("%g%% %g-day Normal VaR    = %.2f%%" % (lev, h, NormalhVaR*100))

that in the calculation of the $(1-\alpha)$ $1$-day Student t VaR ($h=1$) according to:
$$
\mbox{Student}\ t\ \mbox{VaR}_{\nu, \alpha=0.01, h=1} = \sqrt{\nu^{-1}(\nu-2)}\ t_{\nu}^{-1}(0.99)\sigma - \mu
$$
we have to take the $\sigma$ and $\mu$ given by the sig_norm and mu_norm variables (in the code) and not the ones returned by the t.fit(ret) call above. We find out that:

99% 1-day Student t VaR = 3.19%
99% 1-day Normal VaR    = 2.79%

i.e. VaR derived from the Normal pdf best fit may underestimate the tail risk for IBM, as suspected in the presence of highly significant excess kurtosis. Since the beauty of the code is aesthetic, the picture is romantic. We visualise the results of our investigation, extending our code with:

plt.figure(num=1, figsize=(11, 6))
grey = .77, .77, .77
# main figure
plt.hist(ret, bins=50, normed=True, color=grey, edgecolor='none')
plt.hold(True)
plt.axis("tight")
plt.plot(x, pdf, 'b', label="Normal PDF fit")
plt.hold(True)
plt.axis("tight")
plt.plot(x, pdf2, 'g', label="Student t PDF fit")
plt.xlim([-0.2, 0.1])
plt.ylim([0, 50])
plt.legend(loc="best")
plt.xlabel("Daily Returns of IBM")
plt.ylabel("Normalised Return Distribution")
# inset
a = plt.axes([.22, .35, .3, .4])
plt.hist(ret, bins=50, normed=True, color=grey, edgecolor='none')
plt.hold(True)
plt.plot(x, pdf, 'b')
plt.hold(True)
plt.plot(x, pdf2, 'g')
plt.hold(True)
# Student VaR line
plt.plot([-StudenthVaR, -StudenthVaR], [0, 3], c='g')
# Normal VaR line
plt.plot([-NormalhVaR, -NormalhVaR], [0, 4], c='b')
plt.text(-NormalhVaR-0.01, 4.1, "Norm VaR", color='b')
plt.text(-StudenthVaR-0.0171, 3.1, "Student t VaR", color='g')
plt.xlim([-0.07, -0.02])
plt.ylim([0, 5])
plt.show()

which brings us to:

[Figure: IBM daily return distribution with Normal and Student t pdf fits; inset zooms in on the left tail with both VaR levels marked]

where in the inset we zoom in on the left tail of the distribution, marking both VaR results. Now it is much more clearly visible why at $\alpha=0.01$ the Student t VaR is greater: the effect of the leptokurtic distribution. Moreover, the Student t pdf appears to be a far better parametric model for the central mass (density) of the IBM daily returns.

There is an ongoing debate over which of these two risk measures should be formally applied when it comes to reporting 1-day VaR. I do hope that just by presenting the abovementioned results I made you more aware that: (1) it is better to have a closer look at the excess kurtosis of your favourite asset (portfolio) return distribution, and (2) normal is boring.

Someone wise once said: “Don’t be afraid of being different. Be afraid of being the same as everyone else.” The same means normal. Dare to fit non-Normal distributions!

DOWNLOAD
     tStudentVaR.py

REFERENCES
     Alexander, C., 2008, Market Risk Analysis. Volume IV., John Wiley & Sons Ltd

Predicting Heavy and Extreme Losses in Real-Time for Portfolio Holders (1)

The probability of improbable events. The simplicity amongst complexity. The purity in its best form. The ultimate cure for those who trade, for those who invest. Does it exist? Can we compute it? Is it really something impossible? In this post we challenge ourselves to the frontiers of accessible statistics and data analysis in order to find most optimal computable solutions to this enigmatic but mind-draining problem. Since there is no certainty, everything remains shrouded in the veil of probability. The probability we have to face and hope that something unlikely, determined to take place anyhow, eventually will not happen.


Our goal is to calculate the probability of a very rare event (e.g. a heavy and/or extreme loss) in the trading market (e.g. of a stock plummeting 5% or much more) in a specified time-horizon (e.g. on the next day, in one week, in one month, etc.). The probability. Not the certainty of that event.

In this Part 1, first, we look at the tail of an asset return distribution and compress our knowledge on Value-at-Risk (VaR) to extract the essence required to understand why VaR-stuff is not the best card in our deck. Next, we move to a classical Bayes’ theorem which helps us to derive a conditional probability of a rare event given… yep, another event that (hypothetically) will take place. Eventually, in Part 2, we will hit the bull between its eyes with an advanced concept taken from the Bayesian approach to statistics and map, in real-time, for any return-series its loss probabilities. Again, the probabilities, not certainties.

1. VaR (not) for Rare Events

In the framework of VaR we take into consideration $T$ days of an asset's trading history. Next, we derive a number (VaR) that describes a loss that is likely to take place with a probability of approximately $\alpha$. "To take place" does not mean here that it will take place. In this approach we try to provide some likelihood (a quantitative measure) of a rare event in a specified time-horizon (e.g. on the next day if daily return-series are under our VaR investigation; the scenario considered in this post).

If by $L$ we denote a loss (in percent) an asset can experience on the next day, then:
$$
\mbox{Pr}(L \le -\mbox{VaR}_{1-\alpha}) = \alpha
$$
would be the probability of a loss of $-L\times D$ dollars, where $-L\times D\ge -\mbox{VaR}\times D$, equal, for instance, to $\alpha=0.05$ (also referred to as the $(1-\alpha)$ VaR measure; $\mbox{VaR}_{95}$, etc.), and $D$ is the position size (money invested in the asset in terms of physical currency, e.g. in dollars). In other words, the historical data can help us find $\mbox{VaR}_{95}$ given $\alpha$, assuming a 5% chance that $\mbox{VaR}_{1-\alpha}$ will be exceeded on the next day. As a simple empirical example: if 5 out of the last 100 daily returns fall below -3%, then $\mbox{VaR}_{95} \approx 3\%$ of the position size.

In order to illustrate that case and its shortcomings when it comes to the analysis of rare events, let's look at the 10-year trading history of two stocks in the NASDAQ market: highly volatile CAAS (China Automotive Systems, Inc.; market capitalisation of $247M) and highly liquid AAPL (Apple Inc.; market capitalisation of $750B). First, we fetch their adjusted close price-series from Yahoo! Finance and derive the corresponding daily return-series utilising Python:

# Predicting Heavy and Extreme Losses in Real-Time for Portfolio Holders
# (c) 2015 QuantAtRisk.com, by Pawel Lachowicz
#
# heavy1.py
 
import pandas.io.data as web
import matplotlib.pyplot as plt
import numpy as np
from pyvar import findvar, findalpha
 
 
# ---1. Data Processing
 
# fetch and download daily adjusted-close price series for CAAS
#  and AAPL stocks using Yahoo! Finance public data provider
caas = web.DataReader("CAAS", data_source='yahoo',
                      start='2005-05-13', end='2015-05-13')['Adj Close']
aapl = web.DataReader("AAPL", data_source='yahoo',
                      start='2005-05-13', end='2015-05-13')['Adj Close']
 
CAAScp = np.array(caas.values)
AAPLcp = np.array(aapl.values)
 
f = file("data1.dat","wb")
np.save(f, CAAScp)
np.save(f, AAPLcp)
f.close()
 
# read in the data from a file
f = file("data1.dat","rb")
CAAScp = np.load(f)
AAPLcp = np.load(f)
f.close()
 
# compute return-series
retCAAS = CAAScp[1:]/CAAScp[:-1]-1
retAAPL = AAPLcp[1:]/AAPLcp[:-1]-1

The best way to understand the data is by plotting them:

# plotting (figure #1)
#  adjusted-close price-series
fig, ax1 = plt.subplots(figsize=(10, 6))
plt.xlabel("Trading days 13/05/2005-13/05/2015")
plt.plot(CAAScp, '-r', label="CAAS")
plt.axis("tight")
plt.legend(loc=(0.02, 0.8))
plt.ylabel("CAAS Adj Close Price (US$)")
ax2 = ax1.twinx()
plt.plot(AAPLcp, '-', label="AAPL")
plt.legend(loc=(0.02, 0.9))
plt.axis("tight")
plt.ylabel("AAPL Adj Close Price (US$)")
 
# plotting (figure #2)
#  daily return-series
plt.figure(num=2, figsize=(10, 6))
plt.subplot(211)
plt.grid(True)
plt.plot(retCAAS, '-r', label="CAAS")
plt.axis("tight")
plt.ylim([-0.25,0.5])
plt.legend(loc="upper right")
plt.ylabel("CAAS daily returns")
plt.subplot(212)
plt.grid(True)
plt.plot(retAAPL, '-', label="AAPL")
plt.legend(loc="upper right")
plt.axis("tight")
plt.ylim([-0.25,0.5])
plt.ylabel("AAPL daily returns")
plt.xlabel("Trading days 13/05/2005-13/05/2015")

We obtain the price-series

[Figure 1: CAAS and AAPL adjusted close price-series]

and return-series

[Figure 2: CAAS and AAPL daily return-series]

respectively. For the latter plot, by fixing the scaling of both $y$-axes we immediately gain a chance to inspect the number of daily trades closing with heavy losses. Well, at least at first glance and for both directions of trading. In this post we will be considering the long positions only.

Having our data pre-processed, we may implement two different strategies to make use of the VaR framework in order to work out the probabilities of tail events. The first one is based on setting the $\alpha$ level and finding $-VaR_{1-\alpha}$. Let's assume $\alpha=0.01$; then the following piece of code

# ---2. Computation of VaR given alpha
 
alpha = 0.01
 
VaR_CAAS, bardata1 = findvar(retCAAS, alpha=alpha, nbins=200)
VaR_AAPL, bardata2 = findvar(retAAPL, alpha=alpha, nbins=100)
 
cl = 100.*(1-alpha)

aims at computation of the corresponding numbers making use of the function:

def findvar(ret, alpha=0.05, nbins=100):
    # Function computes the empirical Value-at-Risk (VaR) for return-series
    #   (ret) defined as NumPy 1D array, given alpha
    # (c) 2015 QuantAtRisk.com, by Pawel Lachowicz
    #
    # compute a normalised histogram (\int H(x)dx = 1)
    #  nbins: number of bins used (recommended nbins>50)
    hist, bins = np.histogram(ret, bins=nbins, density=True)
    wd = np.diff(bins)
    # cumulative sum from -inf to +inf
    cumsum = np.cumsum(hist * wd)
    # find an area of H(x) for computing VaR
    crit = cumsum[cumsum <= alpha]
    n = len(crit)
    # (1-alpha)VaR
    VaR = bins[n]
    # supplementary data of the bar plot
    bardata = hist, n, wd
    return VaR, bardata

Here, we create the histogram with a specified number of bins and $-VaR_{1-\alpha}$ is found in an empirical manner. For many reasons this approach is much better than fitting the Normal distribution to the data and finding VaR based on the integration of that continuous function. It is a well-known fact that such a function would underestimate the probabilities in the far tail of the return distribution. And the game is all about capturing what is going on out there, right?
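A quick alternative estimate (my own sketch, reusing retCAAS, retAAPL, alpha and cl from the script above): NumPy's percentile function approximates the same quantile without explicit binning, and any small difference against findvar comes from the binning itself:

print("%g%% VaR (CAAS, percentile) = %.2f%%"
      % (cl, np.percentile(retCAAS, 100*alpha)*100.))
print("%g%% VaR (AAPL, percentile) = %.2f%%"
      % (cl, np.percentile(retAAPL, 100*alpha)*100.))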

The results of computation we display as follows:

print("%g%% VaR (CAAS) = %.2f%%" % (cl, VaR_CAAS*100.))
print("%g%% VaR (AAPL) = %.2f%%\n" % (cl, VaR_AAPL*100.))

i.e.

99% VaR (CAAS) = -9.19%
99% VaR (AAPL) = -5.83%

In order to gain a good feel for those numbers, we display the left tails of both return distributions

# plotting (figure #3)
#  histograms of daily returns; H(x)
#
plt.figure(num=3, figsize=(10, 6))
c = (.7,.7,.7)  # grey color (RGB)
#
# CAAS
ax = plt.subplot(211)
hist1, bins1 = np.histogram(retCAAS, bins=200, density=False)
widths = np.diff(bins1)
b = plt.bar(bins1[:-1], hist1, widths, color=c, edgecolor="k", label="CAAS")
plt.legend(loc=(0.02, 0.8))
#
# mark in red all histogram values where int_{-infty}^{VaR} H(x)dx = alpha
hn, nb, _ = bardata1
for i in range(nb):
    b[i].set_color('r')
    b[i].set_edgecolor('k')
plt.text(-0.225, 30, "VaR$_{%.0f}$ (CAAS) = %.2f%%" % (cl, VaR_CAAS*100.))
plt.xlim([-0.25, 0])
plt.ylim([0, 50])
#
# AAPL
ax2 = plt.subplot(212)
hist2, bins2 = np.histogram(retAAPL, bins=100, density=False)
widths = np.diff(bins2)
b = plt.bar(bins2[:-1], hist2, widths, color=c, edgecolor="k", label="AAPL")
plt.legend(loc=(0.02, 0.8))
#
# mark in red all histogram bars where int_{-infty}^{VaR} H(x)dx = alpha
hn, nb, wd = bardata2
for i in range(nb):
    b[i].set_color('r')
    b[i].set_edgecolor('k')
plt.text(-0.225, 30, "VaR$_{%.0f}$ (AAPL) = %.2f%%" % (cl, VaR_AAPL*100.))
plt.xlim([-0.25, 0])
plt.ylim([0, 50])
plt.xlabel("Stock Returns (left tail only)")
plt.show()

where we mark our $\alpha=$1% regions in red:

[Figure 3: left tails of the CAAS (top) and AAPL (bottom) daily return histograms with the VaR regions marked in red]

As you may notice, so far we haven't done much new. A classical textbook example coded in Python. However, the last figure reveals the main players of the game. For instance, there is only 1 event of a daily loss larger than 15% for AAPL while CAAS experienced 4 heavy losses. The much higher 99% VaR for CAAS takes those 4 historical events into account and puts more weight on the 1-day VaR as estimated in our calculation.

Imagine now that we monitor all those tail extreme/rare events day by day. It's not too difficult to notice (based on inspection of Figure #2) that in the case of AAPL, the stock recorded its first serious loss of $L \lt -10$% approximately 650 days after the beginning of our "monitoring", which commenced on May 13, 2005. In contrast, CAAS was much more volatile and you needed to wait only ca. 100 days to record a loss of the same magnitude.

If something has not happened, e.g. $L \lt -10$%, the VaR-like measure is a highly inadequate measure of probabilities for rare events. Once the event takes place, the estimation of VaR changes (is updated) but then decreases in time until a new rare event occurs. Let's illustrate it with Python. This is our second VaR strategy: finding $\alpha$ given the threshold for rare events. We write a simple function that does the job for us:

def findalpha(ret, thr=1, nbins=100):
    # Function computes the probability P(X<thr)=alpha given threshold
    #   level (thr) and return-series (NumPy 1D array). X denotes the
    #   returns as a rv and nbins is number of bins used for histogram
    # (c) 2015 QuantAtRisk.com, by Pawel Lachowicz
    #
    # compute normalised histogram (\int H(x)dx=1)
    hist, bins = np.histogram(ret, bins=nbins, density=True)
    # compute a default histogram
    hist1, bins1 = np.histogram(ret, bins=nbins, density=False)
    wd = np.diff(bins1)
    x = np.where(bins1 < thr)
    y = np.where(hist1 != 0)
    z = list(set(x[0]).intersection(set(y[0])))
    crit = np.cumsum(hist[z]*wd[z])
    # find alpha
    try:
        alpha = crit[-1]
    except Exception as e:
        alpha = 0
    # count number of events falling into (-inft, thr] intervals
    nevents = np.sum(hist1[z])
    return alpha, nevents

We call it in our main program:

# ---3. Computation of alpha, given the threshold for rare events
 
thr = -0.10
 
alpha1, ne1 = findalpha(retCAAS, thr=thr, nbins=200)
alpha2, ne2 = findalpha(retAAPL, thr=thr, nbins=200)
 
print("CAAS:")
print("  Pr( L < %.2f%% ) = %.2f%%" % (thr*100., alpha1*100.))
print("  %g historical event(s)" % ne1)
print("AAPL:")
print("  Pr( L < %.2f%% ) = %.2f%%" % (thr*100., alpha2*100.))
print("  %g historical event(s)" % ne2)

which returns the following results:

CAAS:
  Pr( L < -10.00% ) = 0.76%
  19 historical event(s)
AAPL:
  Pr( L < -10.00% ) = 0.12%
  3 historical event(s)

And that's great; however, all these numbers are given as a summary of 10 years of analysed data (May 13, 2005 to May 13, 2015 in our case). The final picture could be better understood if we could dynamically track in time the changes of both probabilities and the number of rare events. We achieve it by the following not-state-of-the-art code:

# ---4. Mapping alphas for Rare Events
 
alphas = []
nevents = []
for t in range(1,len(retCAAS)-1):
    data = retCAAS[0:t]
    alpha, ne = findalpha(data, thr=thr, nbins=500)
    alphas.append(alpha)
    nevents.append(ne)
 
alphas2 = []
nevents2 = []
for t in range(1,len(retAAPL)-1):
    data = retAAPL[0:t]
    alpha, ne = findalpha(data, thr=thr, nbins=500)
    alphas2.append(alpha)
    nevents2.append(ne)
 
# plotting (figure #4)
#   running probability for rare events
#
plt.figure(num=4, figsize=(10, 6))
ax1 = plt.subplot(211)
plt.plot(np.array(alphas)*100., 'r')
plt.plot(np.array(alphas2)*100.)
plt.ylabel("Pr( L < %.2f%% ) [%%]" % (thr*100.))
plt.axis('tight')
ax2 = plt.subplot(212)
plt.plot(np.array(nevents), 'r', label="CAAS")
plt.plot(np.array(nevents2), label="AAPL")
plt.ylabel("# of Events with L < %.2f%%" % (thr*100.))
plt.axis('tight')
plt.legend(loc="upper left")
plt.xlabel("Trading days 13/05/2005-13/05/2015")
plt.show()

revealing the following picture:

[Figure 4: running probability Pr(L < -10%) (top) and the number of such events (bottom) for CAAS and AAPL]

It's a great way of looking at and understanding the far left-tail volatility of the asset under investigation. The probability between two consecutive rare/extreme events decreases for an obvious reason: along the time axis we include more and more data, therefore the return distribution evolves and shifts its mass to the right, leaving left-tail events less and less probable.

The question remains: if an asset has never experienced an extreme loss of a given magnitude, the probability of such an event, within our VaR framework, simply remains zero! For example, for losses $L \lt -20$%,

[Figure: the same running analysis repeated for the -20% threshold]

CAAS displays 2 extreme events while AAPL none! Therefore, our estimation of a superbly rare daily loss of -20% (or more) for AAPL stock based on 10 years of data is zero or undefined or… completely unsound. Can we do better than this?

2. Classical Conditional Prediction for Rare Events

Let's consider a case where we want to predict the probability of an asset/stock (CAAS; traded at the NASDAQ exchange) falling more than $-L$%. Previously we achieved such an estimation through the integration of its probability density function. An alternative way is via derivation of the conditional probability, i.e. that CAAS will lose more than $-L$% given that the NASDAQ index drops more than $-L$% on the next day.

Formula? Well, Bayes' formula. That's all we need:
$$
\mbox{Pr}(B|R) = \frac{ \mbox{Pr}(R|B)\ \mbox{Pr}(B) } { \mbox{Pr}(R) } .
$$
Great. Now, the meaning of each term. Let $A$ denote the event of the stock's daily return being between $-L$% and 0%. Let $B$ be the event of the stock's return being $\lt -L$%. Let $R$ be the event of the NASDAQ daily return being $\lt -L$%. Given that, based on both the CAAS and NASDAQ historical time-series, we are able to compute $\mbox{Pr}(A)$, $\mbox{Pr}(B)$, $\mbox{Pr}(R|A)$, $\mbox{Pr}(R|B)$, and therefore
$$
\mbox{Pr}(R) = \mbox{Pr}(R|A)\mbox{Pr}(A) + \mbox{Pr}(R|B)\mbox{Pr}(B)
$$
as well. Here, $\mbox{Pr}(R|A)$ stands for the probability of NASDAQ falling more than $-L$% given the observation of the CAAS return being in the $(-L;0)$% interval, and $\mbox{Pr}(R|B)$ denotes the probability of NASDAQ falling more than $-L$% given the observation of the CAAS return also dropping more than $-L$% on the same day.
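Substituting that expansion for $\mbox{Pr}(R)$ into Bayes' formula gives the single expression we will actually evaluate:
$$
\mbox{Pr}(B|R) = \frac{ \mbox{Pr}(R|B)\,\mbox{Pr}(B) }{ \mbox{Pr}(R|A)\,\mbox{Pr}(A) + \mbox{Pr}(R|B)\,\mbox{Pr}(B) } .
$$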

With Bayes' formula we invert the problem, aiming at the answer for $\mbox{Pr}(B|R)$ given all available data. Ready to code it? Awesome. Here we go:

# Predicting Heavy and Extreme Losses in Real-Time for Portfolio Holders
# (c) 2015 QuantAtRisk.com, by Pawel Lachowicz
#
# heavy2.py
 
import pandas.io.data as web
import matplotlib.pyplot as plt
import numpy as np
 
# ---1. Data Processing
 
# fetch and download daily adjusted-close price series for CAAS stock
#  and NASDAQ index using Yahoo! Finance public data provider
'''
caas = web.DataReader("CAAS", data_source='yahoo',
                   start='2005-05-13', end='2015-05-13')['Adj Close']
nasdaq = web.DataReader("^IXIC", data_source='yahoo',
                   start='2005-05-13', end='2015-05-13')['Adj Close']
 
CAAScp = np.array(caas.values)
NASDAQcp = np.array(nasdaq.values)
 
f = file("data2.dat","wb")
np.save(f,CAAScp)
np.save(f,NASDAQcp)
f.close()
'''
 
f = file("data2.dat","rb")
CAAScp = np.load(f)
NASDAQcp = np.load(f)
f.close()
 
# compute the return-series
retCAAS = CAAScp[1:]/CAAScp[:-1]-1
retNASDAQ = NASDAQcp[1:]/NASDAQcp[:-1]-1

The same code as used in Section 1 but instead of AAPL data, we fetch NASDAQ index daily close prices. Let’s plot the return-series:

# plotting (figure #1)
#  return-series for CAAS and NASDAQ index
#
plt.figure(num=2, figsize=(10, 6))
plt.subplot(211)
plt.grid(True)
plt.plot(retCAAS, '-r', label="CAAS")
plt.axis("tight")
plt.ylim([-0.25,0.5])
plt.legend(loc="upper right")
plt.ylabel("CAAS daily returns")
plt.subplot(212)
plt.grid(True)
plt.plot(retNASDAQ, '-', label="NASDAQ")
plt.legend(loc="upper right")
plt.axis("tight")
plt.ylim([-0.10,0.15])
plt.ylabel("NASDAQ daily returns")
plt.xlabel("Trading days 13/05/2005-13/05/2015")
plt.show()

i.e.,

[Figure: CAAS (top) and NASDAQ (bottom) daily return-series]

where we observe different trading dynamics for the NASDAQ index between the 750th and 1000th day (as counted from May 13, 2005). Interestingly, the heaviest losses of NASDAQ are not ideally correlated with those of CAAS. That makes this case study more exciting! Please also note that CAAS is not a component of the NASDAQ index. Bayes' formula is designed around independent events and, within a fair approximation, we may think of CAAS as an asset ticking off that box here.

What remains and takes a lot of caution is the code that “looks at” the data in a desired way. Here is its final form:

# ---2. Computations of Conditional Probabilities for Rare Events
 
# isolate return-series displaying negative returns solely
#  set 1 for time stamps corresponding to positive returns
nretCAAS = np.where(retCAAS < 0, retCAAS, 1)
nretNASDAQ = np.where(retNASDAQ < 0, retNASDAQ, 1)
 
# set threshold for rare events
thr = -0.065
 
# compute the sets of events
A = np.where(nretCAAS < 0, nretCAAS, 1)
A = np.where(A >= thr, A, 1)
B = np.where(nretCAAS < thr, retCAAS, 1)
R = np.where(nretNASDAQ < thr, retNASDAQ, 1)
nA = float(len(A[A != 1]))
nB = float(len(B[B != 1]))
n = float(len(nretCAAS[nretCAAS != 1]))  # n must equal nA + nB
# (optional)
print(nA, nB, n == (nA + nB))  # check; if True then proceed further
print(len(A), len(B), len(R))
print()
 
# compute the probabilities
pA = nA/n
pB = nB/n
 
# compute the conditional probabilities
pRA = np.sum(np.where(R+A < 0, 1, 0))/n
pRB = np.sum(np.where(R+B < 0, 1, 0))/n
 
pR = pRA*pA + pRB*pB
 
# display results
print("Pr(A)\t = %5.5f%%" % (pA*100.))
print("Pr(B)\t = %5.5f%%" % (pB*100.))
print("Pr(R|A)\t = %5.5f%%" % (pRA*100.))
print("Pr(R|B)\t = %5.5f%%" % (pRB*100.))
print("Pr(R)\t = %5.5f%%" % (pR*100.))
 
if(pR>0):
    pBR = pRB*pB/pR
    print("\nPr(B|R)\t = %5.5f%%" % (pBR*100.))
else:
    print("\nPr(B|R) impossible to be determined. Pr(R)=0.")

Python's NumPy library helps us in a tremendous way with its smartly designed where function, which we employ when constructing the event sets A, B, R and when counting the joint events. First we test a logical condition; if it evaluates to True we grab the right data (actual returns), else we return 1. That opens up a couple of shortcuts for finding the number of specific events and making sure we are still on the right side of the force.

As you can see, in the code above we specified our threshold level of $L = -6.5$% (thr = -0.065). Given that, we derive the probabilities:

(1216.0, 88.0, True)
(2516, 2516, 2516)
 
Pr(A)	 = 93.25153%
Pr(B)	 = 6.74847%
Pr(R|A)	 = 0.07669%
Pr(R|B)	 = 0.15337%
Pr(R)	 = 0.08186%
 
Pr(B|R)	 = 12.64368%

The level of -6.5% is pretty arbitrary but delivers an interesting finding. Namely, based on 10 years of data there is a 12.6% chance that on May 14, 2015 CAAS will lose more than 6.5% if NASDAQ drops by the same amount. How much exactly? We don't know. It's not certain. Only probable, at 12.6%.

The outcome of $\mbox{Pr}(R|B)$, which is close to zero, may support our assumption that CAAS has a negligible "influence" on NASDAQ itself. On the other side of the rainbow, it's much more difficult to interpret $\mbox{Pr}(R)$, the probability of an "isolated" rare event (i.e., $L<-6.5$%), since its estimation is based solely on two players in the game: CAAS and NASDAQ. However, when computed for $N\ge 100$ stocks outside the NASDAQ index but traded at NASDAQ, such a distribution of $\mbox{Pr}(R)$'s would be of great value as another alternative estimator for rare events (I should write a distinct post about it). Now, back to business. There is one problem with our conditional estimation of rare event probabilities as outlined in this Section. A quick check for $L = -10$% reveals:

Pr(A)	 = 98.69632%
Pr(B)	 = 1.30368%
Pr(R|A)	 = 0.00000%
Pr(R|B)	 = 0.00000%
Pr(R)	 = 0.00000%
 
Pr(B|R) impossible to be determined. Pr(R)=0.

Let's recall that $R$ is the event of the NASDAQ daily return being $\lt -L$%. A quick look at the NASDAQ return-series tells us the whole story and addresses the question that comes to mind: why, why, why zero? Well, simply because there was no event in the past 10-year history of the index where it slid more than 10% in one day. And we are cooked in the water.

A visual representation of that problem can be obtained by executing the following piece of code:

from pyvar import cpr  # non-at-all PEP8 style ;)
 
prob = []
for t in range(2,len(retCAAS)-1):
    ret1 = retCAAS[0:t]
    ret2 = retNASDAQ[0:t]
    pBR, _ = cpr(ret1, ret2, thr=thr)
    prob.append(pBR)
 
# plotting (figure #2)
#   the conditional probability for rare events given threshold
#
plt.figure(num=2, figsize=(10, 6))
plt.plot(np.array(prob)*100., 'r')
plt.ylabel("Pr(B|R) [%%]")
plt.axis('tight')
plt.title("L$_{thr}$ = %.2f%%" % (thr*100.))
plt.xlabel("Trading days 13/05/2005-13/05/2015")
plt.show()

where we moved all the Bayes-based calculations (the whole block of Section 2) into a function cpr (a part of the pyvar.py local library; see the Download section below) and repeated our previous experiment, this time with $\mbox{Pr}(B|R)$ changing in time given the threshold of $L=-6.5$%. The resulting plot would be:
[Figure: running conditional probability Pr(B|R) for the threshold of -6.5%]

Before ca. the 800th day the NASDAQ index lacked any event of $L\lt -6.5$%, therefore $\mbox{Pr}(B|R)$ could not be determined. After that period we got some heavy hits and the conditional probability could be derived. The end value (as of May 13, 2015) is 12.64%.

Hope amongst Hopelessness?

There is no better title to summarise the outcomes we derived. Our goal was to come up with some (running) probability for very rare event(s) that, in our case, would be expressed as a 1-day loss of $-L$% in trading for a specified financial asset. In the first attempt we engaged the VaR-related framework and estimated the occurrence of the rare event based on the pdf integration. In our second attempt we sought an alternative solution making use of Bayes' conditional prediction of the asset's loss of $-L$% (or more) given the observation of the same event somewhere else (the NASDAQ index). For the former we ended up with a running probability of $\mbox{Pr}(L \le L_{thr})$ while for the latter with $\mbox{Pr}(B|R)$ given $L_{thr}$ for both the B and R events.

Assuming $L_{thr}$ to be equal to -6.5%, we can present the corresponding computations in the following chart:

[Figure: running Pr(L <= Lthr) and Pr(B|R) compared for Lthr = -6.5%]

Well, the situation looks hopeless. It doesn't get better even if, as discussed earlier, the probability $\mbox{Pr}(R)$ is added. Both methodologies seem to deliver the answer to the same question, yet somehow we are left confused, disappointed, misled, in tasteless despair…

Stay tuned as in Part 2 we will see the light at the end of the tunnel. I promise.

DOWNLOAD
    heavy1.py, heavy2.py, pyvar.py

RELATED POSTS
    VaR and Expected Shortfall vs. Black Swan
    Extreme VaR for Portfolio Managers

Applied Portfolio VaR Decomposition. (2) Impact vs Moving Elements.


Calculations of daily Value-at-Risk (VaR) for any $N$-asset portfolio, as we have already studied in Part 1, heavily depend on the covariance matrix we need to estimate. This estimation requires historical return time-series. An often neglected but superbly important question one should ask here is: how long?! How long do those return-series need to be in order to provide us with fair and sufficient information on the daily return distribution of each asset held in the portfolio?

It is intuitive that taking data covering solely the past 250 trading days versus, for example, the past 2500 days (the moving element) would have a substantial impact on the calculation of the current portfolio VaR. In this short post, we will examine this relation.

Case Study (Supplementary Research)

Using the example of 7-asset portfolio from Part 1, we modify our Matlab code that allows us to download the history of stock (adjusted) close prices covering past 10 calendar years (ca. 3600 days),

% Applied Portfolio VaR Decomposition. (2) Impact vs Moving Elements.
% (c) 2015 QuantAtRisk.com, by Pawel Lachowicz
 
 
clear all; close all; clc;
format short
 
% Read the list of the portfolio components
fileID = fopen('portfolio.lst');
tmp = textscan(fileID, '%s');
fclose(fileID);
pc=tmp{1};  % a list as a cell array
 
% fetch stock data for last 10 years since:
t0=735976; % 12-Jan-2015
date2=datestr(t0,'yyyy-mm-dd');         % to
date1=datestr(t0-10*365,'yyyy-mm-dd');  % from
 
%{
% create an empty array for storing stock data
stockd={};
% scan the tickers and fetch the data from Quandl.com
for i=1:length(pc)
    quandlc=['WIKI/',pc{i}];
    fprintf('%4.0f %s\n',i,quandlc);
    % fetch the data of the stock from Quandl
    % using recommended Quandl's command and
    % saving them directly into Matlab's FTS object (fts)
    fts=0;
    [fts,headers]=Quandl.get(quandlc,'type','fints', ...
                  'authcode','YourQuandlCode',...
                  'start_date',date1,'end_date',date2);
    stockd{i}=fts; % entire FTS object in an array's cell
end
save data2
%}
load data2

Assuming the same initial holdings (position sizes) for each asset:

% an initial dollar exposure per position
d=[   55621.00; ...
     101017.00; ...
      23409.00; ...
    1320814.00; ...
     131145.00; ...
     321124.00; ...
    1046867.00];

we aim at the calculation of the portfolio VaR,
$$
\mbox{VaR}_P = 1.65 \sigma_P C = 1.65\left(\boldsymbol{d}' \boldsymbol{M}_2 \boldsymbol{d} \right)^{1/2}
$$
as a function of time or, alternatively speaking, as a function of the varying length of the return time-series:

% examine VaR_P as a function of historical data
% taken into account
e=[];
for td=(10*250):-1:60   % loop over trading days
    rs=[];
    for i=1:length(pc)
        cp=fts2mat(stockd{i}.Adj_Close,0);
        rv=cp(2:end,1)./cp(1:end-1,1)-1;
        rs=[rs rv(end-td:end)];
    end
    rs(isnan(rs))=0.0;
    % covariance matrix
    M2=cov(rs);
    % the portfolio volatility
    vol_P=sqrt(d'*M2*d);
    % diversified portfolio VaR
    VaR_P=1.65*vol_P;
    e=[e; td VaR_P];
end

The meaning of the loop is straightforward: how does $\mbox{VaR}_P$ change if we take only the last 60 trading days into account, and what is its value when we include more and more data, going back in time as far as 10 years?

The best way to illustrate the results is to compare the running portfolio VaR with the stock market itself, here reflected by the SPY ETF,

% download close price time-series for SPY (ETF)
% tracking S&P500 Index
[fts,headers]=Quandl.get('GOOG/NYSE_SPY','type','fints', ...
                  'authcode','YourQuandlCode',...
                  'start_date',date1,'end_date',date2);
SPY_cp=fts2mat(fts.Close,0);
 
% plot results
subplot(2,1,1)
xtmp=(2501:-1:1)./250;
plot(xtmp,SPY_cp(end-10*250:end,1));
xlim([min(xtmp) max(xtmp)]);
set(gca,'xdir','reverse');
ylabel('SPY Close Price [$]')
 
subplot(2,1,2)
plot(e(:,1)./250,e(:,2),'r');
set(gca,'xdir','reverse');
ylabel('VaR_P [$]')
xlabel('Time since 12-Jan-2015 [trading years]')

uncovering the big picture as follows:

[Figure: SPY close price (top) and running portfolio VaR (bottom) as functions of the look-back period]

And now all becomes clear! Firstly, $\mbox{VaR}_P$ of our 7-asset portfolio is a function of the data. Secondly, $\mbox{VaR}_P$ changes in a way that we might expect when contrasted with the market behaviour. At this point, note the evident increase in the $\mbox{VaR}_P$ value when we include not 6 but 7 years of data. Why? The market went down in 2008 and the impact of it had been imprinted in the daily losses of single stocks. This contribution enriched the left tails of their return distributions. The inclusion of this information in our covariance matrix naturally increases the level of $\mbox{VaR}_P$.

So, how long? How much data should we include to get a fair estimation of $\mbox{VaR}_P$? It's not so easy to address. I would go with a lower bound of at least 1 year. As for the upper limit, it depends. If you are optimistic that in 2015 stocks will keep their upward momentum, look back up to 5 years. If you are a cautious investor, include 7 to 10 years of data.

Footnote

Some of you may still wonder why Impact vs Moving Elements?! Well, I believe the inspiration for a good title may come from the most unexpected direction. It was 2013 when I heard that song for the first time and I loved it! Through ages, beautiful women have always inspired men, therefore let me share my inspiration with you. You can become inspired too ;-)


Applied Portfolio VaR Decomposition. (1) Marginal and Component VaR.

Risk. The only ingredient of life that makes us grow and push outside our comfort zones. In finance, taking risk is a risky business. Once your money has been invested, you need to keep your eyes on the ball that is rolling. Controlling the risk is the art and the science: a quantitative approach that allows us to ease the pressure. With some proper tools we can manage our positions in the markets. No matter whether you invest over a long-term horizon or trade many times per day, your portfolio is subject to constantly changing levels of risk and asset volatility.


VaR. Value-at-Risk. The most hated and most adored quantitative measure of risk. Your running security amongst the unpredictability of the floor. A drilling pain in the ass; a few sleepless nights you have become so familiar with. The omnipresent and rarely confirmed assumption of asset returns being normally distributed. Well, we all know the truth but, somehow, we love to settle for the analytical solutions allowing us to "see a less attractive girl" as more attractive.

In this post we will stay in the mystic circle of those assumptions as a starting point. Don’t blame me. This is what they teach us in the top-class textbooks on financial risk management and test our knowledge within FRM or CFA exams. You may pass the exam but do you really, I mean, really understand the effects of what you have just studied in real portfolio trading?

We kick off from the theory of VaR decomposition in the framework of an active $N$-asset portfolio, when our initial capital $C$ has been fully invested in the stock market (as an example). Next, we investigate the portfolio VaR and, by computing its marginal VaR, we derive new positions in our portfolio that minimise the portfolio VaR.

The Analytical VaR for N-Asset Portfolio. The Beginning.

Let's keep our theoretical considerations to the absolute but essential minimum. At the beginning you have $C$ dollars in hand. You decide to invest them all into $N$ assets (stocks, currencies, commodities, etc.). Great. Say, you pick US equities and buy shares of $N=7$ stocks. The number of shares depends on the stock price, and each position size (in dollars) varies depending on your risk-return appetite. Therefore, the position size can be understood as a weighting factor, $w_i$, for $i=1,…,N$, such that:
$$
\sum_{i=1}^N w_i = 1 \ \ \mbox{or} \ \ 100\mbox{%}
$$ or $d_i=w_iC$ in dollars and $\sum_i d_i=C$. Once the orders have been fulfilled, you are a proud holder of $N$-asset portfolio of stocks. Congratulations!

From now on, every day when the markets close, you check the daily rate of return for the individual assets, $r_{i,t} = P_{i,t}/P_{i,t-1}-1$, where by $P$ we denote the stock close-price time-series. Your portfolio rate of return, day-to-day, i.e. from $t-1$ to $t$, will be:
$$
R_{p,t} = \sum_{i=1}^N w_ir_{i,t} = \boldsymbol{w}’\boldsymbol{r}
$$ or alternatively:
$$
R_{p,t} = \sum_{i=1}^N d_ir_{i,t} = \boldsymbol{d}’\boldsymbol{r}
$$ which provides us with the updated portfolio value. Note that this differs from the portfolio expected return, defined as:
$$
\mu_P = E\left(\sum_{i=1}^{N} w_ir_i \right) = \sum_{i=1}^{N} w_i\mu_i = \boldsymbol{w}’\boldsymbol{\mu}
$$ where the expected daily return of the $i$-th asset, $\mu_i$, needs to be estimated from its historical returns (a return-series). This is the most fragile spot in the practical application of Modern Portfolio Theory. If we estimate
$$
\mu_i = \frac{1}{T} \sum_{t'=t-T}^{t-1} r_{i,t'}
$$ looking back at the last $T$ trading days, it is obvious that we will derive different $\mu_i$'s for $T=252$ (1 year) and for $T=1260$ (5 years). That directly affects the computation of the portfolio covariance matrix and thus, as we will see below, the (expected) portfolio volatility. Keep that crucial remark in mind!
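
To see the scale of that effect, here is a minimal sketch, assuming rs is a matrix holding the daily return-series column-wise (one column per asset, as constructed in the code further below) and reaching back far enough:

% compare 1-year and 5-year estimates of the moments (a sketch)
mu252=mean(rs(end-252+1:end,:));    % expected daily returns for T=252
mu1260=mean(rs(end-1260+1:end,:));  % expected daily returns for T=1260
M252=cov(rs(end-252+1:end,:));      % covariance matrix for T=252
M1260=cov(rs(end-1260+1:end,:));    % covariance matrix for T=1260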

The expected portfolio variance is:
$$
\sigma_P^2 = \boldsymbol{w}’ \boldsymbol{M}_2 \boldsymbol{w}
$$
where $\boldsymbol{M}_2$ is the covariance matrix $N\times N$ with the individual covariances of $c_{ij}=\mbox{cov}(\{r_i\},\{r_j\})$ between security $i$ and $j$ computed based on the return-series. Therefore,
$$
\sigma_P^2 = [w_1,…,w_N]
\left[
\begin{array}{cccc}
c_{11} & c_{12} & … & c_{1N} \\
… & … & … & … \\
c_{N1} & c_{N2} & … & c_{NN}
\end{array}
\right]
\left[
\begin{array}{cccc}
w_{1} \\
… \\
w_{N}
\end{array}
\right]
$$ which can be expressed in terms of the dollar exposure as:
$$
\sigma_P^2C^2 = \boldsymbol{d}’ \boldsymbol{M}_2 \boldsymbol{d} \ .
$$ In order to move from the portfolio variance to the portfolio VaR we need to know the distribution of the portfolio returns. And we assume it to be normal (delta-normal model). Given that, the portfolio VaR at the 95% confidence level ($\alpha=1.65$) is:
$$
\mbox{VaR}_P = 1.65 \sigma_P C = 1.65\left(\boldsymbol{d}’ \boldsymbol{M}_2 \boldsymbol{d} \right)^{1/2}
$$ which can be understood as the diversified VaR, since we use our portfolio to reduce the overall risk of the investment. On the other hand, the 95% individual VaR, describing the individual risk of each asset in the portfolio, is:
$$
\mbox{VaR}_i = 1.65 |\sigma_i d_i|
$$ where $\sigma_i$ represents the asset's volatility over the past period of $T$ days.

It is extremely difficult to obtain a portfolio with perfect correlation (unity) among all its components. In such a case, we would talk about the undiversified VaR, which can be computed as the sum of the $\mbox{VaR}_i$'s provided that, as a required condition, there is no short position held in our portfolio. Reporting the risk of a live portfolio is often conducted by risk managers by quoting both the diversified and undiversified VaR.

Now, imagine that we add one unit of the $i$-th asset to our portfolio and we aim at understanding the impact of this action on the portfolio VaR. In this case we talk about a marginal contribution to risk, which can be derived by differentiating the portfolio volatility with respect to $w_i$:
$$
\frac{\mbox{d}\sigma_P}{\mbox{d}w_i} = \frac{\mbox{d}}{\mbox{d}w_i} \left[ \sum_{i=1}^{N} w_i^2\sigma_i^2 + \sum_{i=1}^N \sum_{j=1, j\ne i}^{N} w_iw_jc_{ij} \right]^{1/2}
$$
$$
= \frac{1}{2} \left[ 2w_i\sigma_i^2 + 2 \sum_{j=1, j\ne i}^{N} w_jc_{ij} \right]
\left[ \sum_{i=1}^{N} w_i^2\sigma_i^2 + \sum_{i=1}^N \sum_{j=1, j\ne i}^{N} w_iw_jc_{ij} \right]^{-1/2}
$$
$$
= \frac{w_i\sigma_i^2 + \sum_{j=1, j\ne i}^{N} w_j c_{ij} } {\sigma_P}
= \frac{c_{iP}}{\sigma_P}
$$ where, again, $c_{ij}$ denotes the covariance between the $i$-th and $j$-th assets and $c_{iP}$ that between the $i$-th component and the portfolio. That allows us to define the marginal VaR for the $i$-th asset in the portfolio in the following way:
$$
\Delta\mbox{VaR}_i = \frac{\mbox{dVaR}_P}{\mbox{d}d_i} = \frac{\mbox{dVaR}_P}{\mbox{d}w_iC} = \frac{\mbox{d}(\alpha\sigma_P C)}{\mbox{d}w_iC} = \alpha\frac{C}{C}
\frac{\mbox{d}\sigma_P}{\mbox{d}w_i} = \alpha \frac{c_{iP}}{\sigma_P}
$$ which has a close relation to the systematic risk of the $i$-th asset as confronted with the portfolio itself, described by:
$$
\beta_i = \frac{c_{iP}}{\sigma_P^2}
$$ therefore
$$
\Delta\mbox{VaR}_i = \alpha \sigma_P \beta_i = \beta_i \frac{ \mbox{VaR}_P } {C}
$$ The best and most practical interpretation of the marginal VaR calculated for all positions in the portfolio would be: the higher $\Delta\mbox{VaR}_i$, the more the corresponding exposure of the $i$-th component should be reduced to lower the overall portfolio VaR. Simple as that. Hold on, we will see that at work in the calculations a little bit later.

Since the portfolio volatility is a highly nonlinear function of its components, a simplistic computation of individual VaRs added all together injects error into the portfolio risk estimation. However, with the help of the marginal VaR we gain a tool to capture the fireflies amongst the darkness of the night.

We define the component VaR as:
$$
\mbox{cVaR}_i = \Delta\mbox{VaR}_i d_i = \Delta\mbox{VaR}_i w_i C = \beta_i \frac{ \mbox{VaR}_P w_i C } {C} = \beta_i w_i \mbox{VaR}_P
$$ utilising the observation that:
$$
\sigma_P^2 = \sum_{i=1}^{N} w_i c_{iP} = \sum_{i=1}^{N} w_i \beta_i \sigma_P^2 = \sigma_P^2 \sum_{i=1}^{N} w_i \beta_i \ \ \Rightarrow \ \ \sum_{i=1}^{N} w_i \beta_i = 1
$$ which very nicely leads us to:
$$
\sum_{i=1}^{N} \mbox{cVaR}_i = \mbox{VaR}_P \sum_{i=1}^{N} w_i \beta_i = \mbox{VaR}_P \ .
$$ Having that, we get the percent contribution of the $i$-th asset to the portfolio VaR as:
$$
\frac{ \mbox{cVaR}_i } { \mbox{VaR}_P } = w_i \beta_i
$$ It is possible to calculate the required changes in the position sizes based on our VaR analysis. We obtain them by re-evaluating the portfolio marginal VaRs so that they now meet the following condition:
$$
\Delta\mbox{VaR}_i = \beta_i \frac{ \mbox{VaR}_P } { C } = \ \mbox{constant .}
$$ which is the subject of finding a solution to the "Risk-Minimising Position" problem. All right. End of lecture. Time for coding!

 

Case Study

Do you like sex? I do! So, let's do it in Matlab this time! First, we pick randomly seven stocks among US equities (for the simplicity of this example); their tickers are given in the portfolio.lst text file as follows:

AAPL
DIS
IBM
JNJ
KO
NKE
TXN

and next we download their 3 years of historical data from Quandl.com as the data provider:

% Applied Portfolio VaR Decomposition. (1) Marginal and Component VaR.
%
% (c) 2015 QuantAtRisk.com, by Pawel Lachowicz
 
clear all; close all; clc;
format short
 
% Read the list of the portfolio components
fileID = fopen('portfolio.lst');
tmp = textscan(fileID, '%s');
fclose(fileID);
pc=tmp{1};  % a list as a cell array
 
% fetch stock data for last 3 years since:
t0=735976; % 12-Jan-2015
date2=datestr(t0,'yyyy-mm-dd');        % from
date1=datestr(t0-3*365,'yyyy-mm-dd');  % to
 
% create an empty array for storing stock data
stockd={};
for i=1:length(pc) % scan the tickers and fetch the data from Quandl.com
    quandlc=['WIKI/',pc{i}];
    fprintf('%4.0f %s\n',i,quandlc);
    % fetch the data of the stock from Quandl
    % using the recommended Quandl command and
    % saving them directly into Matlab's FTS object (fts)
    fts=0;
    [fts,headers]=Quandl.get(quandlc,'type','fints', ...
                  'authcode','YourQuandlCode',...
                  'start_date',date1,'end_date',date2);
    stockd{i}=fts; % entire FTS object in an array's cell
end
 
% limit data to 3 years of trading days, select the adjusted
% close price time-series and calculate return-series
rs=[];
for i=1:length(pc)
    cp=fts2mat(stockd{i}.Adj_Close,0);
    rv=cp(2:end,1)./cp(1:end-1,1)-1;
    rs=[rs rv(end-(3*5*4*12):end)];
end
rs(isnan(rs))=0.0;

The covariance matrix, $\boldsymbol{M}_2$, computed based on 7 return-series is:

% covariance matrix
M2=cov(rs);

Let’s now assume some random dollar exposure, i.e. position size for each asset,

% exposure per position
d=[   55621.00; ...
     101017.00; ...
      23409.00; ...
    1320814.00; ...
     131145.00; ...
     321124.00; ...
    1046867.00]
 
% invested capital of C
C=sum(d)

returning our total exposure of $C=\$$2,999,997 in the stock market, which is based on the number of shares purchased and/or our risk-return appetite (in Part II of this series we will witness how that works in simulated trading).

We turn the long theory of VaR decomposition into numbers within just a few lines of code, namely:

% the portfolio volatility
vol_p=sqrt(d'*M2*d);
 
% diversified portfolio VaR
VaR_P=1.65*vol_p;
 
% volatility of assets (based on daily returns)
v=sqrt(diag(cov(rs)));
 
% individual and undiversified portfolio VaR
VaRi=1.65*abs(v.*d);
uVaR_P=sum(VaRi);
 
% the portfolio betas
beta=C*(M2*d)/vol_p^2;
 
% the portfolio marginal VaR
DVaR=1.65*(M2*d)/vol_p;
 
% the component VaR: in dollars and in percent
cVaRd=DVaR.*d;
cVaR=100*cVaRd/VaR_P;
 
% initial positions of assets in portfolio [percent]
orgpos=100*d/C

The last column vector provides us with information on the original (currently held) percentage exposure of our invested capital as a function of the asset number in the portfolio:

orgpos =
          1.85
          3.37
          0.78
         44.03
          4.37
         10.70
         34.90
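
As a quick sanity check of the decomposition (a short sketch reusing the variables defined above), the weighted betas should sum to one and the percent component VaRs should add up to 100:

% sanity checks for the VaR decomposition
sum(d.*beta)/C   % the sum of w_i*beta_i; should return 1
sum(cVaR)        % the sum of percent component VaRs; should return 100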

Before we generate our very first risk report, let's develop a short piece of code that allows us to find a solution to the "Risk-Minimising Position" problem. Again, we want to find new position sizes (different from orgpos) delivering the lowest possible level of portfolio risk. We construct a dedicated function that does the job for us:

% Finding a solution for the "Risk-Minimising Position" problem
%   based on portfolio VaR
% (c) 2015 QuantAtRisk, by Pawel Lachowicz
 
function [best_x,best_dVaR,best_VaRp,best_volp]=rmp(M2,d,VaR_p)
    c=sum(d);
    N=length(d);
    best_VaRp=VaR_p;
    for i=1:100                     % number of random trials (recommended >50)
        k=0;
        while(k~=1)                 % repeat until all marginal VaRs agree
            d=random('uniform',0,1,N,1);
            d=d/sum(d)*c;           % rescale random exposures to the capital c
            vol_p=sqrt(d'*M2*d);
            % diversified VaR (portfolio)
            VaRp=1.65*vol_p;
            dVaR=1.65*(M2*d)/vol_p;
            test=fix(dVaR*100);     % set precision here
            m=fix(mean(test));
            t=(test==m);
            k=sum(t)/N;             % k=1 when all N marginal VaRs are equal
            if(VaRp<best_VaRp)
                best_x=d;
                best_dVaR=dVaR;
                best_VaRp=VaRp;
                best_volp=vol_p;
            end
        end
    end
end

We feed it with three input parameters, namely, the covariance matrix $\boldsymbol{M}_2$, the current dollar exposure per asset $\boldsymbol{d}$, and the current portfolio VaR. Inside, we randomly generate a new dollar-exposure column vector, then re-calculate the portfolio VaR and marginal VaRs until the condition
$$
\Delta\mbox{VaR}_i = \beta_i \frac{ \mbox{VaR}_P } { C } = \ \mbox{constant}
$$ is met. We need to be sure that the new portfolio VaR is lower than the original one and, somehow, the lowest among a large number of random possibilities. That is why we repeat that step 100 times. At this point, as a technical remark, I would suggest using more than 50 iterations and experimenting with the precision required for the marginal VaRs (or $\beta$'s) to be (nearly) the same. The greater both of these golden numbers are, the longer you will wait for the final solution.

It is also the apex of the day in a bank's daily portfolio evaluation process, which assesses the bank's market risk and global exposure. A risk manager knows how long it takes to find new risk-minimised positions and how fast C/C++/GPU solutions need to be implemented to cut the time of those computations. Here, with our Matlab function rmp, we can feel the pain of this game.

Eventually, we finalise our computations:

[nd,nDVaR,nVaR_P,nvol_P]=rmp(M2,d,VaR_P);
redVaR=100*(nVaR_P-VaR_P)/VaR_P;
newpos=100*nd/C;

followed by a nicely formatted report:

fprintf('\n============================================\n');
fprintf('  Portfolio Risk Report as of %s\n',datestr(t0));
fprintf('============================================\n\n');
fprintf('1-day VaR estimation at %1.0f%% confidence level\n\n',95.0)
fprintf('Number of assets:\t\t %1.0f \n',length(d))
fprintf('Current exposure:\t\t $%1s\n',bankformat(C));
fprintf('Portfolio VaR (undiversified):\t $%1s\n',bankformat(uVaR_P));
fprintf('Portfolio VaR (diversified):\t $%1s\n',bankformat(VaR_P));
fprintf('Component VaR [individual VaR]\n');
for i=1:length(d)
    fprintf('   %s\t%6.2f%%   $%-10s\t[$%-10s]\n',pc{i},cVaR(i), ...
            bankformat(cVaRd(i)),bankformat(VaRi(i)));
end
fprintf('\nRisk-Minimising Position scenario\n');
fprintf('--------------------------------------------------------------');
fprintf('\n\t\t   Original   Marginal     New        Marginal\n');
fprintf('\t\t   position   VaR          position   VaR\n');
for i=1:length(d)
    fprintf('\t  %-4s\t    %6.2f%%   %1.5f\t    %6.2f%%   %1.2f\n', ...
            pc{i},orgpos(i), ...
            DVaR(i),newpos(i),nDVaR(i))
end
fprintf('div VaR\t\t $%1s\t\t $%1s\n',bankformat(VaR_P), ...
        bankformat(nVaR_P));
fprintf('\t\t\t\t\t%10.2f%%\n',redVaR);
fprintf('ann Vol\t\t%10.2f%%\t\t%10.2f%%\n',  ...
        sqrt(252)*vol_p/1e6*1e2,sqrt(252)*nvol_P/1e6*1e2);
fprintf('--------------------------------------------------------------');
fprintf('\n\n');

where you can find the bankformat function here.

As a portfolio risk manager you are interested in numbers, numbers, and numbers that, in fact, reveal a lot. It is of paramount importance to analyse the portfolio risk report with attention before taking any further steps or decisions:

 
============================================
  Portfolio Risk Report as of 12-Jan-2015
============================================
 
1-day VaR estimation at 95% confidence level
 
Number of assets:		 7 
Current exposure:		 $2,999,997.00
Portfolio VaR (undiversified):	 $54,271.26
Portfolio VaR (diversified):	 $33,027.94
Component VaR [individual VaR]
   AAPL	  0.61%   $202.99    	[$1,564.29  ]
   DIS	  1.71%   $564.13    	[$1,885.84  ]
   IBM	  0.26%   $84.43     	[$429.01    ]
   JNJ	 30.99%   $10,235.65 	[$17,647.63 ]
   KO	  1.19%   $392.02    	[$2,044.05  ]
   NKE	  9.10%   $3,006.55  	[$7,402.58  ]
   TXN	 56.14%   $18,542.15 	[$23,297.83 ]
 
Risk-Minimising Position scenario
--------------------------------------------------------------
		   Original   Marginal     New        Marginal
		   position   VaR          position   VaR
	  AAPL	      1.85%   0.00365	      6.02%   0.01
	  DIS 	      3.37%   0.00558	      1.69%   0.01
	  IBM 	      0.78%   0.00361	     10.79%   0.01
	  JNJ 	     44.03%   0.00775	     37.47%   0.01
	  KO  	      4.37%   0.00299	     27.36%   0.01
	  NKE 	     10.70%   0.00936	      8.13%   0.01
	  TXN 	     34.90%   0.01771	      8.53%   0.01
div VaR		 $33,027.94		 $26,198.39
					    -20.68%
ann Vol		     31.78%		     25.21%
--------------------------------------------------------------

From the risk report we can spot the difference of $\$$21,243.32 between the undiversified and diversified portfolio VaR, and the largest contribution to the overall portfolio VaR coming from the Texas Instruments Incorporated stock (TXN), though it is the Johnson & Johnson stock (JNJ) that occupies the largest dollar position in our portfolio.

The original marginal VaR is the largest for TXN, NKE, and JNJ. Therefore, in order to minimise the portfolio VaR, we should reduce their positions. On the other hand, the exposure should be increased in the case of KO, IBM, AAPL, and DIS, which display the lowest marginal VaR. Both suggestions find their solution in the form of the derived new holdings, which would reduce the portfolio VaR by 20.68% while concurrently decreasing the annualised portfolio volatility from 31.78% to 25.21%.

Indeed. Risk is like sex. The more the merrier! Unless, you are a risk manager and you calculate your “positions”.

Visualisation of N-Asset Portfolio in Matlab

Many of you who wished to use Matlab's Financial Toolbox for portfolio visualisation most probably realised that something was wrong: the annualised risk and return for different time-series did not always match what was expected. Below you can find an improved piece of code which does the job correctly.

The input sampling of your financial time-series can now be specified directly as daily, weekly, monthly, or quarterly, and the risk-return diagram will be displayed with annualised results. Alternatively, you have the option to specify another time horizon instead of the annualised values. All smooth and easy.

% Specialised plotting function for visualisation of portfolios
% based on the Financial Toolbox's portfolio objects
% 
% Modified by Pawel Lachowicz, 2014/03/01
%
% Remarks: Essential corrections introduced in the calculation and plotting
% of the annualised risk and returns coded incorrectly by the team at 
% The MathWorks, Inc.
 
function plotportfolio(varargin)
 
    plottitle = varargin{1};  % plot title, e.g. 'Efficient Frontier'
    plotfreq =  varargin{2};  % frequency of raw data (dt; or 'daily', etc.)
    plotfreq2=  varargin{3};  % e.g. 'annualised'
    plotfreq3=  varargin{4};
    plotlegend = [];
 
    switch plotfreq2
        case 'annualised'
            switch plotfreq
                case 'daily'
                    periods=252;
                case 'weekly'
                    periods=52;
                case 'monthly'
                    periods=12;
                case 'quarterly'
                    periods=4;
            end
        otherwise
            periods=plotfreq3;
    end
 
    for i = 5:nargin
        plotinfo = varargin{i};
 
        plottype = plotinfo{1};
        plotrsk  = plotinfo{2};
        plotret  = plotinfo{3};
        if numel(plotinfo) > 3
            plotlabel = plotinfo{4};
        else
            plotlabel = [];
        end
        if numel(plotinfo) > 4
            plotstyle = plotinfo{5};
            if isempty(plotstyle)
                plotstyle = 'b';
            end
        else
            if strcmpi(plottype,'line')
                plotstyle = 'b';
            else
                plotstyle = 'g';
            end
        end
        if numel(plotinfo) > 5
            plotline = plotinfo{6};
            if isempty(plotline)
                plotline = 1;
            end
        else
            if strcmpi(plottype,'line')
                plotline = 2;
            else
                plotline = [];
            end
        end
 
        % line plot
        if strcmpi(plottype,'line')
            hold on;
            for k = 1:size(plotrsk,2)
                plot(plotrsk(:,k)*sqrt(periods),((plotret(:,k)+1).^periods-1), ...
                     plotstyle, 'LineWidth', plotline);
                if i == 2 && k == 1
                    hold on
                end
                if ~isempty(plotlabel) && ~isempty(plotlabel{k})
                    plotlegend = [ plotlegend, {plotlabel{k}} ];
                end
            end
 
        % scatter plot
        else
            if any(plotstyle == '.')
                scatter(plotrsk*sqrt(periods),((plotret+1).^periods-1),'or','Filled');
            else
                scatter(plotrsk*sqrt(periods),((plotret+1).^periods-1),'og','Filled');
            end
            if i==2
                hold on
            end
            if ~isempty(plotlabel)
                for k = 1:numel(plotrsk)
                    if ~isempty(plotlabel{k})
                        text(plotrsk(k)*sqrt(periods)+0.005, ...
                            ((plotret(k)+1).^periods-1), plotlabel{k},'FontSize', 9);
                    end
                end
            end
        end
    end
 
    if ~isempty(plotlegend)
        legend(plotlegend,'Location','SouthEast');
    end
    if(strcmpi(plotfreq2,'annualised'))
        xlabel('Annualised Volatility');
        ylabel('Annualised Expected Returns');
    else
        xlabel('Volatility');
        ylabel('Returns');
    end
    grid on
 
end

Now, you can call it with various parameters of your interest, matching the data directly, for instance:

plotportfolio('Portfolio',dt,'annualised',[], ... 
    {'line', prsk, pret}, ...
    {'scatter',CashRsk,CashRet, {'Cash'}}, ...
    {'scatter', sqrt(diag(p.AssetCovar)), p.AssetMean, p.AssetList,'.r'});

For some awesome practical application of this function in the process of construction and analysis of $N$-asset portfolio grab Applied Portfolio Optimization with Risk Management using Matlab ebook today.

Roy’s Safety-First Criterion in Portfolio Optimization Problem

The main objective of portfolio selection is the construction of a portfolio that maximises expected return given a certain tolerance for risk. There is an intuitive problem with the portfolio variance as a measure of risk: used in the portfolio optimisation context, it deems outcomes above the expected portfolio return as risky as outcomes below it. The use of portfolio variance as a risk measure has been the subject of many discussions. There's a thin line we step onto.

Since the class of risk measures for financial portfolios has been divided into two sub-branches, namely dispersion and downside risk measures, Roy's safety-first criterion (after the paper of Arthur Roy, Safety First and the Holding of Assets, Econometrica, July 1952, 431-450) refers more to the investor's control over the investment: the investor makes sure that a certain amount of the invested principal is preserved, i.e. tries to minimise the probability that the return earned is less than or equal to a threshold $t$. Therefore, the optimisation problem for a portfolio with $N$ assets and associated weights $w$ is:
$$
\min_{w} Pr(r_p\le t)
$$ or
$$
\min_{w} Pr\left( \sum_{i=1}^{N} w_ir_i \le t \right)
$$ where, by a clever observation, one can notice that Roy's safety-first criterion is a special case of the lower partial moment of order $k$,
$$
\mbox{LPM}_k(t) = E\left[\max\{t-r_p,\,0\}^k\right] = \int_{-\infty}^{t} (t-r)^k f(r)\, dr \ ,
$$ when $k=0$, in which case the integral reduces exactly to $Pr(r_p \le t)$. In financial risk management, Roy's safety-first criterion is referred to as shortfall risk and, additionally, for $t=0$, the shortfall risk is called the risk of loss.
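
Under a normality assumption for the portfolio return, $Pr(r_p\le t) = \Phi[(t-\mu_P)/\sigma_P]$, so minimising the shortfall probability is equivalent to maximising the ratio $(\mu_P-t)/\sigma_P$. A minimal Matlab sketch (the return matrix rs, the equal weights, and the -1% threshold below are assumptions, purely for illustration):

% Roy's safety-first criterion under the normality assumption (a sketch)
t=-0.01;                        % assumed threshold: a -1% daily return
mu=mean(rs)';                   % expected daily returns per asset
M2=cov(rs);                     % covariance matrix of the return-series
w=ones(size(mu))/length(mu);    % assumed equal weights
muP=w'*mu;                      % portfolio expected return
sigP=sqrt(w'*M2*w);             % portfolio volatility
shortfallPr=normcdf((t-muP)/sigP)  % Pr(r_p <= t), to be minimised over w
royRatio=(muP-t)/sigP              % equivalently, maximise this ratio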

For practical application of this risk measure check QuantAtRisk’s ebook on Applied Portfolio Optimization and Risk Management available from QaR eBook Store soon!

Extreme VaR for Portfolio Managers


Regardless of who you are, either an algo trader hidden in the wilderness of Alaska or an active portfolio manager in a large financial institution in São Paulo, your investments are subject to sudden and undesired movements in the markets. There is one fear. We know it very well. It's a low-probability, high-impact event. An extreme loss. An avalanche hitting your account balance hard. Really hard.

Financial risk management has developed a set of quantitative tools helping to raise awareness of the worst-case scenarios. The two most popular are Value-at-Risk (VaR) and Expected Shortfall (ES). Both numbers provide you with the amount of stress you put on your shoulders when the worst indeed takes place.

Quoting a $VaR_{0.05}^{\rm 1-day}$ value, we assume that in 95% of cases we should feel confident that the maximal loss for our portfolio, $C\times VaR_{0.05}^{\rm 1-day}$, will not be exceeded in trading on the next day (1-day VaR), where $C$ is the current portfolio value (e.g. $\$$35,292.13). However, if the unforeseen happens, our best guess is that we will lose $C\times ES_{0.05}^{\rm 1-day}$ dollars, where $ES_{0.05}^{\rm 1-day}$ stands for the 1-day expected shortfall estimate. Usually $ES_{0.05}^{\rm 1-day} > VaR_{0.05}^{\rm 1-day}$. We illustrated this case previously in the VaR and Expected Shortfall vs. Black Swan article using IBM stock trading data. We also discussed the superbly rare case of an extreme loss in the markets on Oct 19, 1987 (-22%), far beyond any measure.

In this post we will underline an additional quantitative risk measure, quite often overlooked by portfolio managers: the extreme value-at-risk (EVaR). Using Matlab and an exemplary portfolio of stocks, we are going to confront VaR, ES, and EVaR for a better understanding of the amount of risk involved in the game. Hitting hard is inevitable under drastic market swings, so you need to know the highest pain threshold: EVaR.

Generalised Extreme-Value Theory and Extreme VaR

The extreme-value theory (EVT) plays an important role in today's financial world. It provides a tailor-made solution for those who wish to deal with highly unexpected events. In the classical VaR approach, the risk manager is concentrated and concerned around central-tendency statistics. We are surrounded by them every day: mean values, standard deviations, $\chi^2$ distributions, etc. They are blood under the skin. The point is that they are governed by central limit theorems which, on a further note, cannot be applied to extremes. That is where EVT enters the game.

From the statistical point of view the problem is simple. If we assume that $X$ is a random loss variable and that the losses are independent and identically distributed (iid), then we also adopt a general framework in which $X$ comes from an unknown distribution $F(x)=Pr(X\le x)$. EVT makes use of the fundamental Fisher-Tippett theorem and delivers a practical solution for a large sample of $n$ extreme random loss variables, known as the generalised extreme-value (GEV) distribution, defined as:
$$
H(z;a,b) = \left\{
\begin{array}{lr}
\exp\left[-\left( 1+ z\frac{x-b}{a} \right)^{-1/z} \right] & : z \ne 0\\
\exp\left[-\exp\left(-\frac{x-b}{a} \right) \right] & : z = 0 \\
\end{array}
\right.
$$ Under the condition that $x$ meets $1+z(x-b)/a > 0$, the parameters $a$ and $b$ in the GEV distribution ought to be understood as the scale and location parameters of the limiting distribution $H$; their meaning is close to, but distinct from, the standard deviation and mean. The last parameter, $z$, corresponds to a tail index. The tail is important in $H$. Why? Simply because it points our attention to the heaviness of extreme losses in the data sample.

In our previous article on Black Swan and Extreme Loss Modeling we applied $H$ with $z=0$ to fit the data. This case of GEV is known as the Gumbel distribution, corresponding to $F(x)$ having exponential (light) tails. If the underlying data fitted with the $H$ model reveal $z>0$, we talk about the Fréchet distribution. From a practical point of view, the Fréchet case is more useful as it accounts for heavy tails more carefully. By plotting the standardised Gumbel and Fréchet distributions ($a=1$, $b=0$), you can notice that the gross mass of $H$ is contained between $x$ values of -2 and 6, which translates into the interval from $b-2a$ to $b+6a$. With $z>0$, the tail contributes more significantly to the production of very large $X$-values. The remaining case, $z<0$, the Weibull distribution, we leave out of our interest here (the case of tails lighter than normal).

If we set the left-hand side of the GEV distribution to $p$, take logs of both sides of the above equations and rearrange them accordingly, we end up with:
$$
\ln(p) = \left\{
\begin{array}{lr}
-\left(1+zk \right)^{-1/z} & : z \ne 0\\
-\exp(-k) & : z = 0 \\
\end{array}
\right.
$$ where $k=(x-b)/a$ and $p$ denotes any chosen cumulative probability. From this point we can derive $x$ values in order to obtain quantiles associated with $p$, namely:
$$
x = \left\{
\begin{array}{ll}
b - \frac{a}{z} \left[1-(-\ln(p))^{-z}\right] & : ({\rm Fréchet}; z>0) \\
b - a\ln\left[-\ln(p)\right] & : ({\rm Gumbel}; z=0) \\
\end{array}
\right.
$$ Note that when $z\rightarrow 0$ the Fréchet quantiles tend to their Gumbel equivalents. Also notice that, for instance, for the standardised Gumbel distribution the 95% quantile is $-\ln[-\ln(0.95)] = 2.9702$ and sits in the right tail, not in the left tail usually associated with the heaviest losses. Don't worry. Gandhi says relax. In practice we fit the GEV distribution with a reversed sign, or we reverse the sign of the 0.95 quantile for the random loss variables $x$. That point will become clearer with the examples that follow.
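
You can verify those quantiles quickly with the Statistics Toolbox's gevinv function, which takes the cumulative probability followed by the shape, scale, and location parameters (a sketch):

% standardised GEV quantiles (a=1, b=0)
q_gumbel=gevinv(0.95,0.0,1,0)   % = -log(-log(0.95)) = 2.9702
q_frechet=gevinv(0.95,0.5,1,0)  % a heavy tail (z=0.5) pushes it to ~6.83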

We allow ourselves to define the Extreme Value-at-Risk (EVaR), associated directly with the GEV distribution $H_n$ fitted to the data ($n$ data points), by passing
$$
Pr[H_n < H'] = p = \left( Pr[X < H'] \right)^n = \alpha^n
$$ where $\alpha$ is the VaR confidence level associated with the threshold $H'$, in the following way:
$$
\mbox{EVaR} = \left\{
\begin{array}{ll}
b_n - \frac{a_n}{z_n} \left[1-(-n\ln(\alpha))^{-nz_n}\right] & : ({\rm Fréchet}; z>0) \\
b_n - a_n\ln\left[-n\ln(\alpha)\right] & : ({\rm Gumbel}; z=0) \\
\end{array}
\right.
$$ These formulas provide a practical way of estimating VaR based on the analysis of the distribution and frequency of extreme losses in the data sample, and they are meant to work best at very high confidence levels. Additionally, as pointed out by Dowd and Hauksson, the EVaR does not obey the square-root-of-time scaling rule over a holding period, which is applicable to a standard VaR derived for (log-)normal distributions of return series.

1-day EVaR for Assets in your Portfolio

All right. Time to connect the dots. Imagine you hold a portfolio of 30 blue chips forming part of the Dow Jones Industrial Average. Before the market opens, regardless of the quantity and direction of the positions you hold in each of those 30 stocks, you, as a cautious portfolio manager, want to know an estimate of the extreme loss that can take place on the next trading day.

You look at the last 252 trading days of every stock; therefore you are able to extract a sample of $N=30$ largest losses (one per stock over the last 365 calendar days). Given this sample, we can fit it with the GEV distribution and find the best estimates of the $z$, $a$, and $b$ parameters. However, in practice, $N=30$ is a small sample. There is nothing wrong with extracting, for each stock, the 5 worst losses that took place within the last 252 trading days. That allows us to consider a sample of $n=150$ points (see the code below).

Let’s use Matlab and the approach described in Create a Portfolio of Stocks based on Google Finance Data fed by Quandl post, here re-quoted in a shorter form:

% Extreme Value-at-Risk
%
% (c) 2014 QuantAtRisk.com, by Pawel Lachowicz
 
 
close all; clear all; clc;
 
% Read the list of Dow Jones components
fileID = fopen('dowjones.lst');
tmp = textscan(fileID, '%s');
fclose(fileID);
dowjc=tmp{1};  % a list as a cell array
 
% Read in the list of tickers and internal code from Quandl.com
[ndata, text, alldata] = xlsread('QuandlStockCodeListUS.xlsx');
quandlc=text(:,1); % again, as a list in a cell array
quandlcode=text(:,3); % corresponding Quandl Price Codes
 
% fetch stock data for last 365 days
date2=datestr(today,'yyyy-mm-dd')     % from
date1=datestr(today-365,'yyyy-mm-dd') % to
 
stockd={};
% scan the Dow Jones tickers and fetch the data from Quandl.com
for i=1:length(dowjc)
    for j=1:length(quandlc)
        if(strcmp(dowjc{i},quandlc{j}))
            fprintf('%4.0f %s\n',i,quandlc{j});
            fts=0;
            [fts,headers]=Quandl.get(quandlcode{j},'type','fints', ...
                          'authcode','YourQUANDLtocken',...
                          'start_date',date1,'end_date',date2);
            cp=fts2mat(fts.Close,0); % close price
            R{i}=cp(2:end)./cp(1:end-1)-1; % return-series
        end
    end
end
N=length(dowjc);

where we make use of two external files: dowjones.lst (the tickers of the stocks in our portfolio) and QuandlStockCodeListUS.xlsx (see here for more details on utilising Quandl.com's API for fetching Google Finance stock data).

The Matlab cell array $R$ holds 30 return-series (each 251 points long). Having that, we find the set of the top 5 maximal daily losses for each stock, ending up with a data sample of $n=150$ points, as mentioned earlier.

Rmin=[];
for i=1:N
    ret=sort(R{i});
    Mn=ret(1:5); % extract 5 worst losses per stock
    Rmin=[Rmin; Mn];
end

Now, we fit the GEV distribution,
$$
H(z;a,b) = \exp\left[-\left( 1+ z\frac{x-b}{a} \right)^{-1/z} \right] \ \ ,
$$ employing a ready-to-use function, gevfit, from Matlab’s Statistics Toolbox:

% fit GEV distribution
[parmhat,parmci] = gevfit(Rmin)
z=parmhat(1); % tail index
a=parmhat(2); % scale parameter
b=parmhat(3); % location parameter

which, for date2 = 2014-03-01, returns:

parmhat =
   -0.8378    0.0130   -0.0326
parmci =
   -0.9357    0.0112   -0.0348
   -0.7398    0.0151   -0.0304

id est, the best estimates of the model’s parameters:
$$
z_{150} = -0.8378^{+0.0980}_{-0.0979}, \ \ a_{150} = 0.0130^{+0.0021}_{-0.0018}, \ \ b_{150} = -0.0326^{+0.0022}_{-0.0022} \ \ .
$$ At this point it is essential to note that the negative $z$ indicates the Fréchet distribution here, since we fitted the losses with their negative signs retained.

Plotting data with model,

% plot data with the GEV distribution for found z,a,b
x=-1:0.001:1;
hist(Rmin,-1:0.01:0);
h=findobj(gca,'Type','patch');
set(h,'FaceColor',[0.7 0.7 0.7],'EdgeColor',[0.6 0.6 0.6]);
h=findobj(gca,'Type','box');
set(h,'Color','k');
% GEV pdf 
pdf1=gevpdf(x,z,a,b);
y=0.01*length(Rmin)*pdf1;
line(x,y,'color','r'); box on;
xlabel('R_{min}');
ylabel(['f(R_{min}|z,a,b)']); 
xlim([-0.15 0]);

illustrates our efforts in a clear manner:

[Figure: histogram of the $R_{\rm min}$ sample with the fitted GEV probability density function overplotted in red]

Eventually, performing the calculation,

n=length(Rmin); % sample size (here: 150)
z=-z;           % reverse the sign of the fitted tail index
EVaR=b-(a/z)*(1-(-n*log(0.95))^(-n*z))

we calculate the best estimate of the 1-day EVaR as discussed:
$$
\mbox{EVaR} =
b_{150} - \frac{a_{150}}{-z_{150}} \left[1-(-n\ln(0.95))^{nz_{150}}\right] = -0.0481 \ \ .
$$ This number tells us that among the 30 blue chips in our portfolio we should expect an extreme loss of 4.81% on the following day (estimated using the last 252 days of trading history).

Now, is 4.81% a lot or not? That reminds me of an immortal question asked by one of my colleagues many years ago, a guy who loved studying philosophy: is 4 kilograms of potatoes a lot or not? The best way to address it is by comparison. Let us again quickly look at the return series of the 30 stocks in our portfolio (the R cell array in the code). For each individual series we derive its normalised distribution (histogram), fit it with a normal probability density function, and calculate the corresponding 95% VaR (the var local variable) in the following way:

VaR=[];
for i=1:N
    rs=R{i};
    % create the histogram for the return series
    [H,h]=hist(rs,100);
    s=0;
    for j=1:length(H)-1
        s=s+(H(j)*(h(j+1)-h(j)));
    end
    H=H/s; % normalise the histogram to a unit area
    % fit the histogram with a normal distribution
    [muhat,sigmahat]=normfit(rs);
    Y=normpdf(h,muhat,sigmahat);
    % find the 95% VaR for stock 'i' by integrating the fitted pdf
    % from the left tail until 5% of the probability mass is reached
    s=0;
    j=1;
    while (j<length(h))
        s=s+(Y(j)*(h(j+1)-h(j)));
        if(s>=0.05)
            break;
        end
        j=j+1;
    end
    var=h(j); % the 95% VaR estimate (a negative return)
    VaR=[VaR; var];
end

The vector VaR then holds all thirty 95% VaR estimates. By running the last piece of code,

figure;
[H,h]=hist(VaR*100,20);
line=[EVaR*100 0; EVaR*100 6];
bar(h,H,'FaceColor',[0.7 0.7 0.7],'EdgeColor',[0.7 0.7 0.7])
hold on; plot(line(:,1),line(:,2),'r:');
xlim([-6 0]); ylim([0 6]);
xlabel('95% VaR [percent]'); ylabel('Histogram');

we plot their distribution and compare it to the EVaR of -4.81%, here marked by the red dotted line in the chart:

[Figure: histogram of the 95% VaRs of the 30 portfolio stocks with the 1-day EVaR of -4.81% marked by a red dotted line]

The calculated 1-day 95% EVaR of -4.81% places itself much further out than any of the classical 95% VaRs derived for the 30 stocks. It is an extreme case indeed.

You can use the 1-day EVaR to estimate the expected loss in your portfolio simply by assuming that every single stock will drop by 4.81% on the next day. This number (in the currency of your portfolio) will surely make you more alert.
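
For example, with an assumed portfolio value (the number below is purely illustrative):

% expected extreme 1-day portfolio loss implied by EVaR (a sketch)
C=3000000;               % assumed portfolio value [USD]
extremeLoss=C*0.0481     % i.e. $144,300 if every stock drops by 4.81%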

Richard Branson once said: If you gonna fail, fail well. Don’t listen to him.

Pre-Processing of Asset Price Series for Portfolio Optimization

Portfolio Optimization is a significant component of Matlab's Financial Toolbox. It provides us with a ready-to-use solution for finding the optimal weights of the assets we consider for trading, deriving them from historical asset performance. From a practical point of view, we can include it in our algorithmic trading strategy and backtest its applicability under different initial conditions. This is the subject of my upcoming post. However, before we can enjoy the view from the peak, we need to climb the mountain first.

In Matlab, the portfolio is created as a dedicated object of the same name. It doesn't read raw stock data; we need to feed that beast. Two major ingredients satisfy the input: a vector of expected asset returns and a covariance matrix. Matlab helps us estimate these moments, but first we need to deliver the asset data in a digestible form.
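
Just to see where we are heading, here is a minimal sketch of how the portfolio object will consume our pre-processed data, once the return matrix R and the ticker list AssetList (both constructed below) are ready:

% feeding the portfolio object (a sketch; Financial Toolbox)
p=Portfolio('AssetList',AssetList);
p=estimateAssetMoments(p,R);    % expected returns and covariances from R
p=setDefaultConstraints(p);     % fully invested, long-only weights
w=estimateFrontier(p,10);       % weights of 10 portfolios on the frontier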

In this post we will see how one can quickly download the stock data from the Internet based on our own stock selection and pre-process them for solving portfolio optimization problem in Matlab.

Initial Setup for Portfolio Object

Let's say that at any point in time you have your own list of stocks you wish to buy. For simplicity, let's also assume that the list contains stocks traded on NYSE or NASDAQ. Since you have been a great fan of this game, now you are almost ready to buy what you jotted down on your ShoppingList.lst. Here, an example of 10 tech stocks:

AAPL   AOL   BIDU   GOOG   HPQ   IBM   INTC   MSFT   NVDA   TXN

They will constitute your portfolio of stocks. The problem of portfolio optimization requires a look back in time in the space of returns obtained in trading by each stock. Based on them the Return Proxy and Risk Proxy can be found.

The return matrix $R$ of dimensions $(N-1)\times M$, where $N$ stands for the number of historical prices (e.g. sampled daily, monthly, etc.) and $M$ for the number of stocks in our portfolio, is required by Matlab as an input. We will see how it works in the next post. For now, let's focus solely on the creation of this matrix.

In the article Create a Portfolio of Stocks based on Google Finance Data fed by Quandl I discussed Quandl.com as an attractive data provider for US stocks. Here, we will follow this solution making use of Quandl resources to pull out the stock price series for our shopping list. Ultimately, we aim at building a function, here: QuandlForPortfolio, that does the job for us:

% Pre-Processing of Asset Price Series for Portfolio Optimization in Matlab
%  (c) 2013, QuantAtRisk.com, by Pawel Lachowicz
 
clear all; close all; clc;
 
% Input Parameters
n=1*365;
tickers='ShoppingList.lst';
qcodes='QuandlStockCodeListUS.xlsx';
 
[X,Y,R,AssetList] = QuandlForPortfolio(n,tickers,qcodes);

We call this function with three input parameters. The first one, $n$, denotes the number of calendar days from today (counting backwards) for which we wish to retrieve the stock data. Usually, 365 days will correspond to about 250$-$252 trading days. The second parameter is a path/file name to our list of stocks (those to be taken into account in the portfolio optimisation process), while the last input defines the path/file name of the file storing the stocks' tickers and associated Quandl Price Codes (see here for more details).

Feeding the Beast

The QuandlForPortfolio Matlab function is an extended version of the previously discussed solution. It contains an important correcting procedure for the data fetched from the Quandl servers. First, let’s have a closer look on the function itself:

% Function assists in fetching Google Finance data from the Quandl.com
%  server for a given list of tickers of stocks traded on NYSE or
%  NASDAQ. Data are retrieved for last 'n' days with daily sampling.
%
% INPUT
%   n       : number of calendar days from 'today' (e.g. 365 would
%             correspond to about 252 business days)
%   tickers : a path/file name of a text file listing tickers
%   qcodes  : a path/file name of Excel workbook (.xlsx) containing a list
%              of tickers and Quandl Price Codes in the format of
%              [Ticker,Stock Name,Price Code,Ratios Code,In Market?]
% OUTPUT
%   X0        : [Nx1] column vector with days
%   Y0        : [NxM] matrix with Close Prices for M stocks
%   R0        : [(N-1)xM] matrix of Returns
%   AssetList : a list of tickers (cell array)
%
% (c) 2013, QuantAtRisk.com, by Pawel Lachowicz
 
function [X0,Y0,R0,AssetList0] = QuandlForPortfolio(n,tickers,qcodes)
    fileID = fopen(tickers);
    tmp = textscan(fileID, '%s');
    fclose(fileID);
    AssetList=tmp{1};  % a list as a cell array
 
    % Read in the list of tickers and internal Quandl codes
    %
    [~,text,~] = xlsread(qcodes);
    quandlc=text(:,1); % again, as a list in a cell array
    quandlcode=text(:,3); % corresponding Quandl's Price Code
 
    date1=datestr(today-n,'yyyy-mm-dd'); % from
    date2=datestr(today,'yyyy-mm-dd');   % to
 
    % Fetch the data from Quandl.com
    %
    QData={};
    for i=1:length(AssetList)
        for j=1:length(quandlc)
            if(strcmp(AssetList{i},quandlc{j}))
                fprintf('%4.0f %s\n',i,quandlc{j});
                fts=0;
                [fts,headers]=Quandl.get(quandlcode{j},'type','fints', ...
                   'authcode','x',...
                   'start_date',date1,'end_date',date2,'collapse','daily');
                QData{i}=fts;
            end
        end
    end
 
    % Post-Processing of Fetched Data
    %
    % create a list of days across all tickers
    TMP=[];
    for i=1:length(QData)
        tmp=fts2mat(QData{i},1);
        tmp=tmp(:,1);
        TMP=[TMP; tmp];
    end
    ut=unique(TMP);
    % use that list to find these days that are not present
    %  among all data sets
    TMP=[];
    for i=1:length(QData)
        tmp=fts2mat(QData{i},1);
        tmp=tmp(:,1);
        TMP=[TMP; setdiff(ut,tmp)];
    end
    ut=unique(TMP);
    % finally, extract Close Prices from FTS object and store them
    %  in Y0 matrix, plus corresponding days in X0
    X0=[];
    Y0=[]; 
    for i=1:length(QData)
        tmp=fts2mat(QData{i},1);
        cp=[];
        for j=1:size(tmp,1)
            [r,~,~]=find(ut==tmp(j,1));
            if(isempty(r))
                cp=[cp; tmp(j,5)]; % column 5 corresponds to Close Price
                if(i<2)
                    % create a time column vector listing days
                    % common among all data sets
                    X0=[X0; tmp(j,1)];
                end
            end
        end
        Y0=[Y0 cp];
    end
    % transform Close Prices into Returns, R(i)=cp(i)/cp(i-1)-1
    R0=tick2ret(Y0);
    AssetList0=AssetList';
end

The main bottleneck comes from the fact that Matlab's portfolio object demands an equal number of historical returns ($N-1$) in the matrix $R$ for all $M$ assets. We design the function in such a way that it sets a common timeframe for all stocks on our shopping list. Of course, we ensure that all stocks were traded in the markets for about the last $n$ days (a rough estimation).

Now, the timeframe of the last $n$ days should be understood as a first approximation. We fetch the data from Quandl (numeric date, Open, High, Low, Close, Volume) and save them in the cell array QData, for each stock separately, as FTS objects (Financial Time-Series objects; see the Financial Toolbox). However, it may occur that not every stock we fetched delivers the same amount of data. That is why we need to investigate for which days and for which stocks we miss data. We achieve that by scanning each FTS object and creating a unique list of all days for which we have data (the first post-processing loop in the function).

Next, we loop over the same data sets again, but now we compare that list with the list of dates available for each stock individually, capturing the dates that are missing for any stock; their complete list is stored in the vector ut. Eventually, given that, we are able to compile the full data set (e.g. Close Prices, stored in Y0) for all the stocks in our portfolio, ensuring that we include only those dates for which we have prices across all $M$ assets.
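
The same alignment idea can be expressed more compactly with Matlab's intersect function; here is a minimal sketch of it (not part of the function above), assuming the cell array QData has already been filled:

% keep only the dates present in every fetched FTS object
tmp=fts2mat(QData{1},1);
common=tmp(:,1);                       % dates of the first stock
for i=2:length(QData)
    tmp=fts2mat(QData{i},1);
    common=intersect(common,tmp(:,1)); % dates shared by all stocks so far
end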

Beast Unleashed

We test our data pre-processing simply by running the block of code listed above engaging QuandlForPortfolio function and we check the results in the Matlab’s command window as follows:

>> whos X Y R AssetList
  Name             Size            Bytes  Class     Attributes
 
  AssetList        1x10             1192  cell                
  R              250x10            20000  double              
  X              251x1              2008  double              
  Y              251x10            20080  double

which confirms that the dimensions are as expected.

At this stage, the aforementioned function can be used two-fold. First, when we are interested in portfolio optimisation, we look back at the last $n$ calendar days since the most current one (today). The second usage is handy too: we consider the stocks on our shopping list and fetch their last, say, $n=7\times365$ days of data. If all the stocks were traded over the past 7 years, we should be able to collect a rich data set. If not, the function will adjust the beginning and end dates to meet the initial time constraints, as required for the construction of the $R$ matrix. In the latter case, we can use the 7-year data sample for direct backtesting of algo models utilising Portfolio Optimization.

Stay tuned as we will rock this land in the next post!

Any Questions?

Share them across QuantCove.com – the official Forum of QuantAtRisk.

GARCH(p,q) Model and Exit Strategy for Intraday Algorithmic Traders

Forecasting the future has always been a skill humans have desired to possess. In the enjoyable Hollywood production of Next, Nicolas Cage, playing the character of Frank Cadillac, has the ability to see the future just up to a few minutes ahead. This allows him to take almost immediate action to avoid risks. Now, just imagine for a moment that you are an (algorithmic) intraday trader. What would you offer for a glimpse of what is going to happen within the following couple of minutes? What sort of risk would you undertake? Is it really possible to deduce the next move on your trading chessboard? Most probably the best answer to this question is: it is partially possible. Why only partially? Well, even with the help of mathematics and statistics, obviously God didn't want us to know the future, putting his fingerprint into our equations and calling it a random variable. Smart, isn't it? Therefore, our goal is to work harder in order to guess what is going to happen next!


In this post I will briefly describe one of the most popular methods of forecasting future volatility in financial time-series, employing the GARCH model. Next, I will make use of 5-min intraday stock data (close prices) to show how to infer a possible stock value in the next 5 minutes using current levels of volatility in intraday trading. Ultimately, I will discuss an exit strategy from a trade based on a forecasted worst-case scenario (the stock price is forecasted to exceed the assumed stop-loss level). But first, let's warm up with some cute equations we cannot live without.

Inferring Volatility

Capturing and digesting volatility is somehow like an art that does not attempt to represent external, recognisable reality, but seeks to achieve its effect using shapes, forms, colours, and textures. The basic idea we want to describe here is the volatility $\sigma_t$ of a random variable (rv), e.g. an asset price, on day $t$, as estimated at the end of the previous day $t-1$. How to do it in the easiest way? It's simple. First, let's assume that the logarithmic rate of change of the asset price between two time steps is:
$$
r_{t-1} = \ln\frac{P_{t-1}}{P_{t-2}}
$$ which corresponds to the return expressed in percent as $R_{t-1}=100[\exp(r_{t-1})-1]$; we will be using this transformation throughout the rest of the text. This notation leaves us with a window of opportunity to denote $r_t$ as an innovation to the rate of return, under the condition that we are able, somehow, to deduce, infer, and forecast the future asset price $P_t$.

Using the classical definition of the sample variance, we are allowed to write it down as:
$$
\sigma_t^2 = \frac{1}{m-1} \sum_{i=1}^{m} (r_{t-i}-\langle r \rangle)^2
$$ which is our forecast of the variance rate for the next time step $t$, based on the past $m$ data points, where $\langle r \rangle=m^{-1}\sum_{i=1}^{m} r_{t-i}$ is the sample mean. Now, if we examine a return series sampled every day, or hour, or minute, it is worth noticing that $\langle r \rangle$ is very small compared with the standard deviation of the changes. This observation pushes us a bit further in rewriting the estimation of $\sigma_t$ as:
$$
\sigma_t^2 = \frac{1}{m} \sum_{i=1}^{m} r_{t-i}^2
$$ where $m-1$ has been replaced with $m$ (equivalent to the maximum-likelihood estimate). What is superbly important about this formula is the fact that it gives an equal weight of unity to every value of $r_{t-i}$, as we can always imagine each term multiplied by one. In practice, however, we may wish to associate some weights $\alpha_i$ as follows:
$$
\sigma_t^2 = \sum_{i=1}^{m} \alpha_i r_{t-i}^2
$$ where $\sum_{i=1}^{m} \alpha_i = 1$, which replaces the factor of $m^{-1}$ in the previous formula. If you think for a second about the idea of the $\alpha$'s, it is pretty straightforward to see that every observation of $r_{t-i}$ makes some contribution to the overall value of $\sigma_t^2$. In particular, if we select $\alpha_i<\alpha_j$ for $i>j$, every observation further in the past from the most current time $t-1$ contributes less and less.
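
For instance, RiskMetrics-style exponentially decaying weights realise exactly that idea. A minimal sketch, assuming r is a column vector of returns with the most recent observation last and an assumed decay factor of 0.94:

% exponentially decaying weights for the variance estimate (a sketch)
lambda=0.94; m=100;                       % assumed decay factor and window
alpha=(1-lambda)*lambda.^(0:m-1);         % weights for r_{t-1}, r_{t-2}, ...
alpha=alpha/sum(alpha);                   % normalise so that they sum to 1
sigma2=sum(alpha'.*r(end:-1:end-m+1).^2)  % weighted variance forecast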

In 1982 R. Engle proposed a tiny extension of the discussed formula, finalised in the form of the AutoRegressive Conditional Heteroscedasticity ARCH($m$) model:
$$
\sigma_t^2 = \omega + \sum_{i=1}^{m} \alpha_i r_{t-i}^2
$$ where $\omega$ is the weighted long-run variance taking its position with a weight of $\gamma$, such that $\omega=\gamma V$ and now $\gamma+\sum_{i=1}^{m} \alpha_i = 1$. What the ARCH model allows is the estimation of future volatility, $\sigma_t$, taking into account only the past $m$ weighted squared rates of return $\alpha_i r_{t-i}^2$ and the additional parameter $\omega$. In practice, we aim at finding the weights $\alpha_i$ and $\gamma$ using the maximum-likelihood method for a given return series $\{r_i\}$. This approach, in general, requires approximately $m>3$ in order to describe $\sigma_t^2$ efficiently. So the question emerges: can we do much better? And the answer is: of course.

Four years later, in 1986, a new player entered the ring. His name was Mr T (Bollerslev) and he literally crushed Engle in the second round with an innovation of the Generalized AutoRegressive Conditional Heteroscedasticity GARCH($p,q$) model:
$$
\sigma_t^2 = \omega + \sum_{i=1}^{p} \alpha_i r_{t-i}^2 + \sum_{j=1}^{q} \beta_j \sigma_{t-j}^2
$$ which derives its $\sigma_t^2$ based on the $p$ past observations of $r^2$ and the $q$ most recent estimates of the variance rate. The inferred return is then equal to $r_t=\sigma_t\epsilon_t$ where $\epsilon_t\sim N(0,1)$, which leaves us with a rather pale and wry face, as we know what that truly means in practice! A sort of simplification, meeting wide applause in the financial hassle, delivers the solution of the GARCH(1,1) model:
$$
\sigma_t^2 = \omega + \alpha r_{t-1}^2 + \beta \sigma_{t-1}^2
$$ which derives its value based solely on the most recent updates of $r$ and $\sigma$. If we think for a short while, GARCH(1,1) should provide us with a good taste of forecasted volatility when the past couple of returns were similar; however, its weakness emerges in moments of sudden jumps (shocks) in the price changes, which cause overestimated volatility predictions. Well, no model is perfect.

Similarly as in the case of the ARCH model, for GARCH(1,1) we may use the maximum-likelihood method to find the best estimates of the $\alpha$ and $\beta$ parameters, leading us to a long-run volatility of $[\omega/(1-\alpha-\beta)]^{1/2}$. It is usually achieved in an iterative process by looking for the maximum of:
$$
\sum_{i=3}^{N} \left[ -\ln(\sigma_i^2) - \frac{r_i^2}{\sigma_i^2} \right]
$$ where $N$ denotes the length of the return series $\{r_j\}$ ($j=2,…,N$) available to us. There are dedicated algorithms for doing that and, as we will see later on, we will make use of one of them in Matlab.
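
As an illustration only, that iterative search can be sketched with Matlab's general-purpose fminsearch minimising the negative of the sum above; the function name garch11nll and the starting values below are assumptions for this sketch, not the dedicated algorithm we will use later:

% negative log-likelihood of GARCH(1,1); save as garch11nll.m
% r is a column vector of returns
function nll=garch11nll(theta,r)
    omega=theta(1); alpha=theta(2); beta=theta(3);
    s2=var(r)*ones(size(r));       % initialise the variance recursion
    for t=2:length(r)
        s2(t)=omega+alpha*r(t-1)^2+beta*s2(t-1);
    end
    nll=sum(log(s2)+r.^2./s2);     % the negative of the sum we maximise
end

It can be called, for instance, as theta=fminsearch(@(th) garch11nll(th,r),[1e-6 0.1 0.8]);.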

For the remaining discussion on the verification procedure of the GARCH model as a tool to explain volatility in return time-series, its pros and cons, and comparisons of GARCH to other ARCH derivatives, I refer you to the immortal and infamous quant's bible by John Hull and the more in-depth textbook by the financial time-series role model, Ruey Tsay.

Predicting the Unpredictable


The concept of predicting the next move in an asset price based on the GARCH model appears to be thrilling and exciting. The only worry we may have, and as has already been recognised by us, is the fact that the forecasted return value is $r_t=\sigma_t\epsilon_t$, with $\epsilon_t$ being an rv drawn from the normal distribution $N(0,1)$. That implies $r_t$ to be an rv such that $r_t\sim N(0,\sigma_t)$. This model we are allowed to extend further to the attractive form of:
$$
r_t = \mu + \sigma_t\epsilon_t \ \ \ \sim N(\mu,\sigma_t)
$$ where by $\mu$ we will understand a simple mean over the past $k$ data points:
$$
\mu = k^{-1} \sum_{i=1}^{k} r_{t-i} \ .
$$ Since we rather expect $\mu$ to be very small, its inclusion provides us with an offset in the modelling of the $r_t$ value. Okay, so what does it mean for us? A huge uncertainty in guessing where we are going to end up in the next time step. The greater the value of $\sigma_t$ inferred from the GARCH model, the wider the spread of possible values that $r_t$ may take. God's fingerprint. End of hopes. Well, not exactly.

Let’s look at the bright side of our $N(\mu,\sigma_t)$ distribution. Remember, never give in too quickly! Look for opportunities! Always! So, the distribution has two tails. We know very well what is concealed in its left tail. It’s the devil’s playground of worst losses! Can a bad thing be a good thing? Yes! But how? It’s simple. We can always compute, for example, the 5% quantile $Q$, which corresponds to finding the value of the rv $r_t$ such that there is a 5% or smaller chance that the actual value of $r_t$ will be smaller than or equal to $Q$ (a sort of equivalent of finding VaR). Having this opportunity, we may wish to design a test statistic as follows:
$$
d = Q + r’
where $r’$ represents the cumulative return of an open trading position. If you are an intraday (or algorithmic high-frequency) trader with a fixed stop-loss limit of $s$ set for every trade you enter, a simple verification of the logical condition:
$$
d < s \ ? $$ provides you immediately with an attractive model for your decision to close the position in the next time step based on the GARCH forecast derived at time $t-1$. All right, let’s cut to the chase and see how that works in practice!
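Before we do, the entire decision rule compresses into a few lines of Matlab; the numbers below are purely illustrative stand-ins for the quantities derived in real time:

Q=-0.34;       % 5% quantile of the forecasted return (percent; illustrative)
rc=-0.30;      % cumulative return r' of the open position (percent; illustrative)
s=-0.5;        % fixed stop-loss limit (percent)
d=Q+rc;        % the test statistic
if(d<s)
    disp('close the position in the next time step');
end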

5-min Trading with GARCH Exit Strategy

In order to illustrate the whole theory of the GARCH approach and dancing at the edge of uncertainty of the future, we analyze the intraday 5-min stock data of Toyota Motor Corporation traded at the Tokyo Stock Exchange with the ticker TYO:7203. The data file TYO7203m5.dat contains a historical record of trading in the following form:

TYO7203
DATES      TIMES OPEN HIGH LOW  CLOSE VOLUME
07/8/2010  11:00 3135 3140 3130 3130  50900
07/8/2010  11:05 3130 3135 3130 3130  16100
07/8/2010  11:10 3130 3145 3130 3140  163700
...
01/13/2011 16:50 3535 3540 3535 3535  148000
01/13/2011 16:55 3540 3545 3535 3535  1010900

i.e. it is spanned between 8-Jul-2010 11:00 and 13-Jan-2011 16:55. Using Matlab language,

% Forecasting Volatility, Returns and VaR for Intraday 
%   and High-Frequency Traders
%
% (c) 2013 QuantAtRisk.com, by Pawel Lachowicz
 
clear all; close all; clc;
 
%% ---DATA READING AND PREPROCESSING
 
fname=['TYO7203m5.dat'];
% construct FTS object for 5-min data
fts=ascii2fts(fname,'T',1,2,[1 2]);
% sampling
dt=5; % minutes
 
% extract close prices
cp=fts2mat(fts.CLOSE,1);
 
% plot 5-min close prices for entire data set
figure(1);
plot(cp(:,1)-cp(1,1),cp(:,2),'Color',[0.6 0.6 0.6]);
xlabel('Days since 08-Jul-2010 11:00');
ylabel('TYO:7203 Stock Price (JPY)');

let’s plot the 5-min close prices for the whole available data set:

[Figure 1: TYO:7203 5-min close prices over the entire data set]

Now, a moment of concentration. Let’s imagine you are trading this stock based on 5-min close price data points. It is late morning of 3-Dec-2010, you just finished your second cup of cappuccino and a ham-and-cheese sandwich when at 10:55 your trading model sends a signal to open a new long position in TYO:7203 at 11:00.

% ---DATES SPECIFICATION
 
% data begin on/at
date1='08-Jul-2010';
time1='11:00';
% last available data point (used for GARCH model feeding)
dateG='03-Dec-2010';
timeG='12:35';
% a corresponding data index for that time point (idxGARCH)
[idxGARCH,~,~]=find(cp(:,1)==datenum([dateG,' ' ,timeG]));
% enter (trade) long position on 03-Dec-2010 at 11:00am
dateT='03-Dec-2010';
timeT='11:00';
% a corresponding data index for that time point (idx)
[idx,~,~]=find(cp(:,1)==datenum([dateT,' ',timeT]));

You buy $X$ shares at $P_{\rm{open}}=3315$ JPY per share at 11:00 and you wait, observing the asset price every 5 min. The price first goes up, then starts to fall. You look at your watch; it is already 12:35. Following the trading data processing, which your Matlab does for you every 5 minutes,

 % ---DATA PREPROCESSING (2)
 
% extract FTS object spanned between the beginning of data
%  and last available data point
ftsGARCH=fetch(fts,date1,time1,dateG,timeG,1,'d');
cpG=fts2mat(ftsGARCH.CLOSE,1); % extract close prices (vector)
retG=log(cpG(2:end,2)./cpG(1:end-1,2)); % extract log return (vector)
 
figure(1); 
hold on; 
plot(cp(1:idxGARCH,1)-cp(1,1),cp(1:idxGARCH,2),'b') % a blue line in Figure 2
 
figure(2);
% plot close prices
x=1:5:size(cpG,1)*5;
x=x-x(idx+1); % 
plot(x,cpG(:,2),'-ok');
xlim([-20 110]);
ylim([3285 3330]);
ylabel('TYO:7203 Stock Price (JPY)'); 
xlabel('Minutes since 03-Dec-2010 11:00');
% add markers to the plot
hold on; %
plot(-dt,cpG(idx,2),'o','MarkerEdgeColor','g');
hold on; 
plot(0,cpG(idx+1,2),'o','MarkerFaceColor','k','MarkerEdgeColor','k');
% mark -0.5% stoploss line
xs=0:5:110;
stoploss=(1-0.005)*3315; % (JPY)
ys=stoploss*ones(1,length(xs));
plot(xs,ys,':r');

you re-plot the charts, here for the data extracted at 3-Dec-2010 12:35,

[Figure 2: TYO:7203 close prices with the currently available history marked in blue]

where the blue line marks the currently available price history of the stock (please ignore the grey line as it refers to future prices which we of course don’t know at 3-Dec-2010 12:35; we use it here to better understand the data feeding process in time, used next in the GARCH modeling), and

[Figure 3: TYO:7203 price since the trade entry at 11:00 with the -0.5% stop-loss line]

which presents the price history with the time axis adjusted to start at 0 when the trade was entered (black filled marker) as suggested by the model 5 min earlier (green marker). Assuming that in our trading we have a fixed stop-loss level of -0.5% per single trade, the red dotted line in the figure marks the corresponding stock price of the suggested exit action. As of 12:35, the actual trading price of TYO:7203 is still above the stop-loss. Great! Well, are you sure?

As we have been trading the stock since 11:00, every 5 min we recalculate the GARCH model based on the total data available, updated by each new innovation in the price series. Therefore, as of 12:35, our GARCH modeling in Matlab,

% ---ANALYSIS
 
% GARCH(p,q) parameter estimation
model = garch(3,3) % define model
[fit,VarCov,LogL,Par] = estimate(model,retG)
% extract model parameters
parC=Par.X(1) % omega
parG=Par.X(2) % beta (GARCH)
parA=Par.X(3) % alpha (ARCH)
% estimate unconditional volatility
gamma=1-parA-parG
VL=parC/gamma;
volL=sqrt(VL)
% redefine model with estimated parameters
model=garch('Constant',parC,'GARCH',parG,'ARCH',parA)
 
% infer 5-min variance based on GARCH model
sig2=infer(model,retG); % vector

runs,

model = 
    GARCH(3,3) Conditional Variance Model:
    --------------------------------------  
    Distribution: Name = 'Gaussian'
               P: 3
               Q: 3
        Constant: NaN
           GARCH: {NaN NaN NaN} at Lags [1 2 3]
            ARCH: {NaN NaN NaN} at Lags [1 2 3]
____________________________________________________________
   Diagnostic Information
 
Number of variables: 7
 
Functions 
Objective:                            @(X)OBJ.nLogLikeGaussian(X,V,E,Lags,...)
Gradient:                             finite-differencing
Hessian:                              finite-differencing (or Quasi-Newton)
 
Constraints
Nonlinear constraints:                do not exist
 
Number of linear inequality constraints:    1
Number of linear equality constraints:      0
Number of lower bound constraints:          7
Number of upper bound constraints:          7
 
Algorithm selected
   sequential quadratic programming
____________________________________________________________
   End diagnostic information
                                                          Norm of First-order
 Iter F-count            f(x) Feasibility  Steplength        step  optimality
    0       8   -2.545544e+04   0.000e+00                           1.392e+09
    1      60   -2.545544e+04   0.000e+00   8.812e-09   8.813e-07   1.000e+02
 
Local minimum found that satisfies the constraints.
 
Optimization completed because the objective function is non-decreasing in 
feasible directions, to within the selected value of the function tolerance,
and constraints are satisfied to within the selected value of the constraint tolerance.

and returns the best estimates of GARCH model’s parameters:

   GARCH(1,1) Conditional Variance Model:
    ----------------------------------------
    Conditional Probability Distribution: Gaussian
 
                                  Standard          t     
     Parameter       Value          Error       Statistic 
    -----------   -----------   ------------   -----------
     Constant    2.39895e-07   6.87508e-08        3.48934
     GARCH{1}            0.9     0.0224136        40.1542
      ARCH{1}           0.05     0.0037863        13.2055
 
fit = 
    GARCH(1,1) Conditional Variance Model:
    --------------------------------------  
    Distribution: Name = 'Gaussian'
               P: 1
               Q: 1
        Constant: 2.39895e-07
           GARCH: {0.9} at Lags [1]
            ARCH: {0.05} at Lags [1]

defining the GARCH(1,1) model on 3-Dec-2010 12:35 for TYO:7203 as follows:
$$
\sigma_t^2 = 2.39895\times 10^{-7} + 0.05r_{t-1}^2 + 0.9\sigma_{t-1}^2 \ \ .
$$ where the long-run volatility is $\sqrt{V_L}=0.0022$ ($0.22\%$) at $\gamma=0.05$. Please note that, due to some computational mind tricks, we allowed ourselves to define the raw template of the GARCH model as GARCH(3,3), which has been adjusted by Matlab’s solver down to the resulting GARCH(1,1).

Based on the model and data available till $t-1=$ 12:35, we get a forecast of $\sigma_t$ value at $t=$ 12:40 to be

% forecast variance 1-step (5-min) ahead, sigma^2_t
sig2_t=forecast(model,1,'Y0',retG,'V0',sig2); % scalar
sig_t=sqrt(sig2_t) % scalar

$$
\sigma_t=0.002 \ (0.2\%) \ .
$$
Plotting the return series for our trade, at 12:35 we have:

% update a plot of 5-min returns
figure(3);
x=1:length(retG); x=x-idx; x=x*5;
plot(x,100*(exp(retG)-1),'ko-');  
xlim([-20 110]);
ylim([-0.8 0.8]);
xlabel('Minutes since 03-Dec-2010 11:00');
ylabel('R_t [%]');

[Figure 4: 5-min returns R_t since 03-Dec-2010 11:00]

Estimating the mean value of the return series by taking into account the last 12 data points (60 min),

% estimate mean return over passing hour (dt*12=60 min)
mu=mean(retG(end-12:end))

we get $\mu=-2.324\times 10^{-4}$ (log values), which allows us to define the model for the return at $t=$ 12:40 to be:
$$
R_t = \mu + \sigma_t\epsilon_t = -0.02324\% + 0.2\%\cdot\epsilon \ , \ \ \ \epsilon\sim N(0,1)\ .
$$ The beauty of God’s fingerprint denoted by $\epsilon$ we can understand better by running a simple Monte-Carlo simulation and drawing 1000 rvs of $r_t=\mu+\sigma_t\epsilon$, which illustrate 1000 possible values of the stock return as predicted by the GARCH model!

% simulate 1000 rvs ~ N(0,sig_t)
r_t=sig_t*randn(1000,1);

One of them could be the realised one at 12:40, but we don’t know which one?! However, as discussed earlier in this article, we find the 5% quantile $Q$ from the simulation and we mark this value with a red filled circle in the plot.

tmp=[(length(retG)-idx)*5+5*ones(length(r_t),1) 100*(exp(r_t)-1)];
hold on; plot(tmp(:,1),tmp(:,2),'x');
 
Q=norminv(0.05,mu,sig_t); 
Q=100*(exp(Q)-1);
 
hold on; plot((length(retG)-idx)*5+5,Q,'or','MarkerFaceColor','r');

Including the 1000 possibilities (blue markers) of the realisation of $r_t$, the updated Figure 4 now looks like:

[Figure 5: Figure 4 updated with 1000 simulated realisations of r_t (blue markers) and the 5% quantile Q (red filled circle)]

where from the simulation we find $Q=-0.3447\%$. The current P&L for the open long position in TYO:7203,

% current P/L for the trade
P_open=3315; % JPY (3-Dec-2010 11:00)
cumRT=100*(cpG(end,2)/P_open-1) %  = r'

returns $r’=-0.3017\%$. Employing the proposed test statistic $d$,

% test statistics (percent)
d=cumRT+Q

we get $d=r’+Q = -0.6464\%$ which, for the first time since opening the position at 11:00, exceeds our stop-loss of $s=-0.5\%$. Therefore, based on the GARCH(1,1) forecast and the logical condition $d\lt s$ being true, our risk management engine sends out at 12:35 an order to close the position at 12:40. The forecasted closing price,

% forecasted stock price under test (in JPY)
f=P_open*(1+d/100)

is estimated to be equal to 3293.6 JPY, i.e. a -0.6464% loss.

At 12:40 the order is executed at 3305 JPY per share, i.e. with a realized loss of -0.3017%.

The very last question you ask yourself is: using the above GARCH Exit Strategy, was luck in my favor or not? In other words, did the algorithm take the right decision? We can find out only by letting time flow. As you come back from a super-quick lunch break, you sit down and look at the recent chart of TYO:7203 at 12:55,

[Figure 6: TYO:7203 at 12:55, after the exit at 12:40]

and you feel relief, as the price went further down, making the algo’s decision the right one.

For reference, the red filled marker denotes the price at which we closed our long position with a loss of -0.3017% at 12:40, and the open red circle marks the price forecasted under the $d$-statistic as derived at 12:35, added for the completeness of the grand picture and the idea standing behind this post.

Black Swan and Extreme Loss Modeling

When I read Nassim Nicholas Taleb’s book Black Swan my mind was captured by the beauty of extremely rare events and, concurrently, devastated by the message the book sent: the non-computability of the probability of consequential rare events using scientific methods (owed to the very nature of small probabilities). I rushed to the local library to find out what had been written on the subject. Surprisingly, I discovered the book by Embrechts, Klüppelberg & Mikosch on Modelling Extremal Events for Insurance and Finance which appeared to me very inaccessible, loaded with heavy mathematical theorems and proofs, and a negligible number of practical examples. I left it on the shelf to dust for a longer while until last week when a fresh drive to decompose the problem came back to me again.

In this article I will try to take you on a short but efficient journey through part of the classical extreme value theory, namely, fluctuations of maxima, and complete the story with easy-to-follow procedures on how one may run simulations of the occurrences of extremely rare losses in financial return series. Having this experience, I will briefly discuss the inclusion of the resulting modeling in future stock returns.

1. The Theory of Fluctuations of Maxima

Let’s imagine we have rich historical data (time-series) of returns for a specific financial asset or portfolio of assets. A good and easy example is the daily rate of return, $R_i$, for a stock traded e.g. at the NASDAQ Stock Market,
$$
R_t = \frac{P_t}{P_{t-1}} - 1 \ ,
$$ where $P_t$ and $P_{t-1}$ denote the stock price on days $t$ and $t-1$, respectively. The longer the time coverage, the more valuable the information that can be extracted. Given the time-series of daily stock returns, $\{R_i\}\ (i=1,…,N)$, we can create a histogram, i.e. the distribution of returns. By a rare event or, more precisely here, a rare loss we will refer to the returns placed in the far left tail of the distribution. As an assumption we also take $R_1,R_2,…$ to be a sequence of iid non-degenerate rvs (random variables) with a common distribution function (df) $F$. We define the fluctuations of the sample maxima as:
$$
M_1 = R_1, \ \ \ M_n = \max(R_1,…,R_n) \ \mbox{for}\ \ n\ge 2 \ .
$$ That simply says that for any time-series $\{R_i\}$, there is one maximum corresponding to the rv (random variable) with the most extreme value. Since the main line of this post is the investigation of maximum losses in return time-series, we are eligible to think about negative values (losses) in terms of maxima (and thus to carry over the theoretical understanding) thanks to the identity:
$$
\min(R_1,…,R_n) = -\max(-R_1,…,-R_n) \ .
$$ The distribution function of maximum $M_n$ is given as:
$$
P(M_n\le x) = P(R_1\le x, …, R_n\le x) = P(R_1\le x)\cdots P(R_n\le x) = F^n(x)
$$ for $x\in\Re$ and $n\in\mbox{N}$.
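A quick Monte Carlo cross-check of that identity, assuming (purely for illustration) standard normal returns:

% empirical cdf of the maximum of n iid N(0,1) draws vs F(x)^n
n=20; K=100000;
M=max(randn(n,K));           % K sample maxima, each over n draws
x=2.0;
[mean(M<=x) normcdf(x)^n]    % the two numbers should nearly agree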

What the extreme value theory first ‘investigates’ are the limit laws for the maxima $M_n$. An important question emerges here: is there somewhere out there any distribution which satisfies for all $n\ge 2$ the identity in law
$$
\max(R_1,…,R_n) = c_nR + d_n
$$ for appropriate constants $c_n>0$ and $d_n\in\Re$, or simply speaking, which classes of distributions $F$ are closed under maxima? The theory next defines the max-stable distribution, within which a random variable $R$ is called max-stable if it satisfies the aforegoing relation for iid $R_1,…,R_n$. If we assume that $\{R_i\}$ is a sequence of iid max-stable rvs then:
$$
R = c_n^{-1}(M_n-d_n)
$$ and one can say that every max-stable distribution is a limit distribution for maxima of iid rvs. That brings us to the fundamental Fisher-Tippett theorem saying that if there exist constants $c_n>0$ and $d_n\in\Re$ such that:
$$
c_n^{-1}(M_n-d_n) \rightarrow H, \ \ n\rightarrow\infty\ ,
$$ then $H$ must be of the type of one of the three so-called standard extreme value distributions, namely: Fréchet, Weibull, and Gumbel. In this post we will be only considering the Gumbel distribution $G$ of the corresponding probability density function (pdf) $g$ given as:
$$
G(z;\ a,b) = e^{-e^{-z}} \ \ \mbox{for}\ \ z=\frac{x-a}{b}, \ x\in\Re
$$ and
$$
g(z;\ a,b) = b^{-1} e^{-z}e^{-e^{-z}} \ .
$$ where $a$ and $b$ are the location and scale parameters, respectively. Having defined the extreme value distribution and being now equipped with a better understanding of the theory, we are ready for a test drive over the daily roads of profits and losses in the trading markets. This is the moment which separates men from boys.
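One practical remark worth a numerical sanity check before we start: Matlab’s evfit/evpdf functions (used below) implement the type-1 extreme value distribution for minima, i.e. the mirror image of the Gumbel-for-maxima pdf quoted above, which is exactly what we need since we will fit minimal returns directly (recall the min/max identity). A minimal sketch with illustrative $a$ and $b$:

a=-0.2265; b=0.1135;           % illustrative location and scale
x=-1:0.01:0.5;
z=(x-a)/b;
g=(1/b)*exp(z).*exp(-exp(z));  % type-1 EV (minima) pdf written explicitly
max(abs(g-evpdf(x,a,b)))       % should be numerically zero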

2. Gumbel Extreme Value Distribution for S&P500 Universe


As usual, we start with an entrée. Our goal is to find the empirical distribution of maxima (i.e. maximal daily losses) for all stocks belonging to the S&P500 universe between 3-Jan-1984 and 8-Mar-2011. There were $K=954$ stocks traded within this period and their data can be downloaded here as the sp500u.zip file (23.8 MB). The full list of stock names is provided in the sp500u.lst file. Therefore, performing the data processing in Matlab, first we need to compute a vector storing the daily returns for each stock, and next find the corresponding minimal value $M_n$ where $n$ stands for the length of each return vector:

% Black Swan and Extreme Loss Modeling
%  using Gumbel distribution and S&P500 universe
%
% (c) 2013 QuantAtRisk, by Pawel Lachowicz
 
 
clear all; close all; clc;
tic;
 
%% DATA READING AND PREPROCESSING
 
% read a list of stock names
StockNames=dataread('file',['sp500u.lst'],'%s','delimiter', '\n');
K=length(StockNames); % the number of stocks in the universe
% path to data files
path=['./SP500u'];
 
fprintf('data reading and preprocessing..\n');
for si=1:K
    % --stock name
    stock=StockNames{si};
    fprintf('%4.0f  %7s\n',si,stock);
    % --load data
    n=[path,'/',stock,'.dat'];
    % check for NULL and change to NaN (using 'sed' command
    % in Unix/Linux/MacOS environment)
    cmd=['sed -i ''s/NULL/NaN/g''',' ',n]; [status,result]=system(cmd);
    % construct FTS object for daily data
    FTS=ascii2fts(n,1,2);
    % fill any missing values denoted by NaNs
    FTS=fillts(FTS);
    % extract the close price of the stock
    cp=fts2mat(FTS.CLOSE,0);
    % calculate a vector with daily stock returns and store it in
    % the cell array
    R{si}=cp(2:end)./cp(1:end-1)-1;
end
 
%% ANALYSIS
 
% find the minimum daily return value for each stock
Rmin=[];
for si=1:K
    Mn=min(R{si},[],1);
    Rmin=[Rmin; Mn];
end

Having that ready, we fit the data with the Gumbel function which (as we believe) would best describe the distribution of maximal losses in the S&P500 universe:

% fit the empirical distribution with Gumbel distribution and 
% estimate the location, a, and scale, b, parameter
[par,parci]=evfit(Rmin);
a=par(1); 
b=par(2);
 
% plot the distribution
x=-1:0.01:1;
hist(Rmin,-1:0.01:0);
h=findobj(gca,'Type','patch');
set(h,'FaceColor',[0.7 0.7 0.7],'EdgeColor',[0.6 0.6 0.6]);
h=findobj(gca,'Type','box');
set(h,'Color','k');
 
% add a plot of Gumbel pdf 
pdf1=evpdf(x,a,b);
y=0.01*length(Rmin)*pdf1;
line(x,y,'color','r'); box on;
xlabel('R_{min}');
ylabel(['f(R_{min}|a,b)']);
text(-1,140,['a = ',num2str(par(1),3)]);
text(-1,130,['b = ',num2str(par(2),3)]); 
xlim([-1 0]);

The maximum likelihood estimates of the parameters $a$ and $b$ and the corresponding 95% confidence intervals can be found as follows:

>> [par,parci]=evfit(Rmin)
 
par =
   -0.2265    0.1135
 
parci =
   -0.2340    0.1076
   -0.2190    0.1197

That brings us to a visual representation of our analysis:

[Figure: Histogram of minimal daily returns R_min across the S&P500 universe with the fitted Gumbel pdf (red line)]

This is a very important result communicating that the expected value of the extreme daily loss is equal to about -22.6%. However, the left tail of the fitted Gumbel distribution extends far, up to nearly -98%, although the probability of the occurrence of such a massive daily loss is rather low.

On the other hand, the expected value of -22.6% is surprisingly close to the down-movement in the markets on Oct 19, 1987, known as Black Monday, when the Dow Jones Industrial Average (DJIA) dropped by 508 points to 1738.74, i.e. by 22.61%!

3. Blending Extreme Loss Model with Daily Returns of a Stock

Probably you wonder how we can include the results coming from the Gumbel modeling in the prediction of rare losses in the future daily returns of a particular stock. This can be done pretty straightforwardly by combining the best-fitted model (pdf) for extreme losses with the stock’s pdf. To do it properly we need to employ the concept of mixture distributions. Michael B. Miller in his book Mathematics and Statistics for Financial Risk Management provides a clear idea of this procedure. In our case, the mixture density function $f(x)$ can be denoted as:
$$
f(x) = w_1 g(x) + (1-w_1) n(x)
$$ where $g(x)$ is the Gumbel pdf, $n(x)$ represents the fitted stock pdf, and $w_1$ marks the weight (influence) of $g(x)$ in the resulting overall pdf.

In order to illustrate this process, let’s select one stock from our S&P500 universe, say Apple Inc. (NASDAQ: AAPL), and fit its daily returns with a normal distribution:

% AAPL daily returns (3-Jan-1984 to 11-Mar-2011)
rs=R{18};
figure(2);
hist(rs,50);
h=findobj(gca,'Type','patch');
set(h,'FaceColor',[0.7 0.7 0.7],'EdgeColor',[0.6 0.6 0.6]);
h=findobj(gca,'Type','box');
set(h,'Color','k');
% fit the normal distribution and plot the fit
[muhat,sigmahat]=normfit(rs)
x =-1:0.01:1;
pdf2=normpdf(x,muhat,sigmahat);
y=0.01*length(rs)*pdf2;
hold on; line(x,y,'color','r');
xlim([-0.2 0.2]); ylim([0 2100]);
xlabel('R');

[Figure: Histogram of AAPL daily returns with the fitted normal pdf (red line)]

where the red line represents the fit with a mean of $\mu=0.0012$ and a standard deviation $\sigma=0.0308$.

We can obtain the mixture distribution $f(x)$ executing a few more lines of code:

% Mixture Distribution Plot
figure(3);
w1=0.001; % enter your favorite value, e.g. 0.001
w2=1-w1;
pdfmix=w1*(pdf1*0.01)+w2*(pdf2*0.01);  % note: sum(pdfmix)=1 as expected
x=-1:0.01:1;
plot(x,pdfmix);
xlim([-0.6 0.6]);

It is important to note that our modeling is based on the $w_1$ parameter. It can be intuitively understood as follows. Let’s say that we choose $w_1=0.01$. That would mean that the Gumbel pdf contributes to the overall pdf at the 1% level. In the following section we will see that if a random variable is drawn from the distribution given by $f(x)$, $w_1=0.01$ simply means (not exactly but to a sufficient approximation) that there is a 99% chance of drawing this variable from $n(x)$ and only a 1% chance of drawing it from $g(x)$. The dependence of $f(x)$ on $w_1$ is illustrated in the next figure:

[Figure: The mixture pdf f(x) for different values of w_1]

It is well visible that a selection of $w_1>0.01$ would be a significant contributor to the left tail, making it fat. This is not what is observed in the empirical distribution of daily returns for AAPL (and in general for the majority of stocks), therefore we rather expect $w_1$ to be much, much smaller than 1%.
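To make the 99%/1% interpretation concrete, here is a minimal sketch of drawing from the mixture by composition, an alternative route to the inverse transform method described in the next section; the component parameters reuse the fits obtained above:

w1=0.01; K=100000;
fromG=rand(K,1)<w1;             % component indicator for each draw
r=zeros(K,1);
r(fromG)=evrnd(-0.2265,0.1135,sum(fromG),1);     % Gumbel (extreme-loss) part
r(~fromG)=normrnd(0.0012,0.0308,sum(~fromG),1);  % normal (AAPL) part
mean(fromG)                     % close to w1, as claimed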

4. Drawing Random Variables from Mixture Distribution

A short break between entrée and main course we fill with a sip of red wine. Having a discrete form of $f(x)$, we would like to be able to draw a random variable from this distribution. Again, this is easy too. Following a general recipe, for instance the one given in Chapter 12.2.2 of Philippe Jorion’s book Value at Risk: The New Benchmark for Managing Financial Risk, we wish to use the concept of the inverse transform method. In the first step we use the output (a random variable) coming from a pseudo-random generator drawing its rvs based on the uniform distribution $U(x)$. This rv is always between 0 and 1, and in the last step it is projected onto the cumulative distribution of our interest, $F(x)$, which in our case corresponds to the cumulative distribution of the $f(x)$ pdf. Finally, we read out the corresponding value on the x-axis: a rv drawn from the $f(x)$ pdf. Philippe illustrates that procedure more intuitively:

[Figure: Drawing a random variable via the inverse transform method (after Jorion)]

This method works smoothly when we know the analytical form of $F(x)$. However, if this is not on the menu, we need to use a couple of technical skills. First, we calculate $F(x)$ based on $f(x)$. Next, we set a very fine grid for the $x$ domain, and we perform interpolation between the given data points of $F(x)$.

% find cumulative pdf, F(x)
figure(4);
s=0;
x=-1;
F=[];
for i=1:length(pdfmix);
    s=s+pdfmix(i);
    F=[F; x s];
    x=x+0.01;
end
plot(F(:,1),F(:,2),'k')
xlim([-1 1]); ylim([-0.1 1.1]);
% perform interpolation of cumulative pdf using very fine grid
xi=(-1:0.0000001:1);
yi=interp1(F(:,1),F(:,2),xi,'linear'); % use linear interpolation method
hold on; plot(xi,yi);

The second sort of difficulty is in finding a good match between the rv drawn from the uniform distribution and the approximated value of our $F(x)$. That is why a very fine grid is required, supplemented with some matching techniques. The following code that I wrote deals with this problem pretty efficiently:

% draw a random variable from f(x) pdf: xi(row)
tF2=round((round(100000*yi')/100000)*100000);
RV=[];
for k=1:(252*40)
    notok=false;
    while(~notok)
        tU=round((round(100000*rand)/100000)*100000);
        [r,c,v]=find(tF2==tU);
        if(~isempty(r))
            notok=true;
        end
    end
    if(length(r)>1)
        rv=round(2+(length(r)-2)*rand);
        row=r(rv); 
    else
        row=r(1);
    end
    % therefore, xi(row) is a number representing a rv
    % drawn from f(x) pdf; we store 252*40 of those
    % new rvs in the following matrix:
    RV=[RV; xi(row) yi(row)];
end
% mark all corresponding rvs on the cumulative pdf
hold on; plot(RV(:,1),RV(:,2),'rx');

Finally, as the main course, we generate and verify the distribution of a large number of new rvs drawn from the $f(x)$ pdf. It is crucial to check whether our generating algorithm provides us with uniform coverage across the entire $F(x)$ plot,

[Figure: Cumulative distribution F(x) with the drawn rvs marked by red crosses]

where, in order to get more reliable (statistically) results, we generate 10080 rvs which correspond to the simulated 1-day stock returns for 252 trading days times 40 years.

5. Black Swan Detection

A -22% collapse in the markets on Oct 19, 1987 served as the day when the name of the Black Swan event took its birth, or at least was reinforced, in the financial community. Are black swans extremely rare? It depends. If you live, for example, in Perth, Western Australia, you can see a lot of them wandering around. So what defines an extremely rare loss in the sense of a financial event? Let’s assume, by definition, that by a Black Swan event we will understand a daily loss of 20% or more. If so, using the procedure described in this post, we are tempted to pass from the main course to dessert.

Our modeling concentrates on finding the most proper contribution of $w_1g(x)$ to the resulting $f(x)$ pdf. As an outcome of a few runs of Monte Carlo simulations with different values of $w_1$, we find that for $w_1=[0.0010,0.0005,0.0001]$ we detect in the simulations, respectively, 9, 5, and 2 events (rvs) displaying a one-day loss of 20% or more.

Therefore, the simulated daily returns for AAPL, assuming $w_1=0.0001$, generate in the distribution two Black Swan events, i.e. one event per 5040 trading days, or one per 20 years:

[Figure: Black Swan events in the simulated future AAPL returns]

That result agrees quite well with what has been observed so far, i.e. including Black Monday in 1987 and the Flash Crash in intra-day trading on May 6, 2010 for some of the stocks.

Acknowledgements

I am grateful to Peter Urbani from New Zealand for directing my attention towards Gumbel distribution for modeling very rare events.

Simulation of Portfolio Value using Geometric Brownian Motion Model


Having in mind the upcoming series of articles on building a backtesting engine for algo traded portfolios, today I decided to drop a short post on a simulation of the portfolio realised profit and loss (P&L). In the future I will use some results obtained below for a discussion of key statistics used in the evaluation of P&L at any time when it is required by the portfolio manager.

Assuming that we trade a portfolio of any assets, its P&L can be simulated in a number of ways. One of the quickest methods is the application of the geometric Brownian motion (GBM) model with a drift in time of $\mu_t$ and a process standard deviation of $\sigma_t$ over its total time interval. The model takes the following form:
$$
dS_t = \mu_t S_t dt + \sigma_t S_t dz
$$ where $dz\sim N(0,dt)$ and the process has variance equal to $dt$ (the process is Brownian). Let $t$ be the present time and let the portfolio have an initial value of $S_t$ dollars. The target time is $T$, therefore the portfolio evaluation horizon is $\tau=T-t$ split into $N$ time steps. Since the GBM model assumes no correlation between the values of the portfolio on two consecutive days (in general, over time), by integrating $dS/S$ over a finite interval we get a discrete change of the portfolio value:
$$
\Delta S_t = S_{t-1} (\mu_t\Delta t + \sigma_t\epsilon \sqrt{\Delta t}) \ .
$$ For simplicity, one can assume that both parameters of the model, $\mu_t$ and $\sigma_t$ are constant over time, and the random variable $\epsilon\sim N(0,1)$. In order to simulate the path of portfolio value, we go through $N$ iterations following the formula:
$$
S_{t+1} = S_t + S_t(\mu_t\Delta t + \sigma_t \epsilon_t \sqrt{\Delta t})
$$ where $\Delta t$ denotes a local step defined here (see the code below) as $\sigma_t/\sqrt{N}$ and $t=1,…,N$.

Example

Let’s assume that the initial portfolio value is $S_1=\$10,000$ and it is being traded over 252 days. We allow the underlying process to have a drift of $\mu_t=0.05$ and an overall volatility of $\sigma_t=5\%$, both constant over time. Since the simulation in each of the 252 steps depends on $\epsilon$ drawn from the normal distribution $N(0,1)$, we can obtain any number of possible realisations of the simulated portfolio value path.

Coding quickly the above model in Matlab,

mu=0.05;      % drift
sigma=0.05;   % std dev over total interval
S1=10000;     % initial capital ($)
N=252;        % number of time steps (trading days)
K=1;          % number of simulations
 
dt=sigma/sqrt(N);   % local volatility
 
St=[S1];
for k=1:K
    St=[S1];
    for i=2:N
        eps=randn; 
        S=St(i-1)+St(i-1)*(mu*dt+sigma*eps*sqrt(dt));
        St=[St; S];
    end
    hold on; plot(St);
end
xlim([1 N]);
xlabel('Trading Days')
ylabel('Simulated Portfolio Value ($)');

leads us to one of the possible process realisations,

[Figure: One realisation of the simulated portfolio value path]

not too bad at all, with an annual rate of return of about 13%.

The visual inspection of the path suggests no major drawdowns and low volatility, therefore pretty good trading decisions made by the portfolio manager or the trading algorithm. Of course, the simulation does not tell us anything about the model, the strategy involved, etc., but the result we obtained is sufficient for further discussion on portfolio key metrics and VaR evolution.

VaR and Expected Shortfall vs. Black Swan


It is one of the most fundamental approaches to measuring risk, but its calculation is truly worth revising. Value-at-Risk (VaR). A magical number quoted at the end of the day by the banks’ financial risk managers, portfolios’ risk managers, and risk managers simply concerned about the expected risk threshold not to be, hopefully, exceeded on the next day. What is so magical about it? According to the definition, given a specific time horizon and taking into account the last $N$ days of our trading activity, we can always calculate a number which provides a simple answer to the question: if something really unexpected happened on the next day, what is the loss threshold we should not expect to exceed at a given confidence level? Well, welcome to the world of VaR!

More formally, given a daily (or weekly, monthly, etc.) distribution of returns (of, for example, a single stock, a portfolio of assets, etc.), we are interested in finding the value $VaR_\alpha$ for, say, $\alpha=0.05$ (five percent), which would tell us that there is a 5% chance that the loss corresponding to $VaR_{0.05}$ will be exceeded in trading on the next day. This value is located in the left tail of the distribution and, by convention, it is quoted as a positive number. Therefore, we define $VaR_\alpha$ through the corresponding return quantile:
$$
P[X\le VaR_\alpha] = \alpha
$$ where $X$ is a random variable which can, as in our assumed case, represent a 1-day return on a traded asset or a portfolio of assets (the sign is flipped when $VaR_\alpha$ is reported).

Beautiful! We have the risk threshold provided as a number. But in practice, how should we calculate $VaR_\alpha$? Here some of the shortcomings of VaR come into the spotlight. VaR depends on the time-frame of the daily returns we examine: $VaR_\alpha$ as a number will be different if you include the last 252 trading days in the estimation compared with the last 504 trading days (2 years). Intuitively, the more data (more days, years, etc.) we have, the better the estimation of $VaR_\alpha$, right? Yes and no. Keep in mind that the dynamics of the trading market change continuously. In the early 80s there was no high-frequency trading (HFT). It came to dominate trading traffic after the year 2000 and now HFT plays a key role (an influential factor) in making an impact on prices. So, it would be a bit unfair to compare $VaR_\alpha$ estimated in the 80s with what is going on right now. Moreover, the parametric $VaR_\alpha$ assumes that the distribution of returns is normal. Not the most beautiful assumption, but the most straightforward for computing $VaR_\alpha$. Should we use $VaR_\alpha$? Yes, but with caution.

Okay then, having $VaR_\alpha$ calculated, we know how far the loss could reach at the $1-\alpha$ confidence level. The next super-important question in risk management that every risk manager should ask, or at least be interested in reporting, is: if the $VaR_\alpha$ event occurs on the next day, what is the loss we should expect to suffer (say, in dollars)? $VaR_\alpha$ is the threshold. Now we are interested in the expected value of the loss given the exceedance. That is the definition of the expected shortfall and it is based on the concept of conditional probability as follows:
$$
E[X\ |\ X\le VaR_\alpha] = ES \ .
$$ In general, if our daily distribution of returns can be described by a function $f(x)$ representing its probability density function (pdf), then:
$$
ES = \frac{1}{\alpha} \int_{-\infty}^{VaR_\alpha} xf(x)dx \ .
$$

Given any data set of returns, $R_t\ (t=1,…,N)$, calculated from our price history,
$$
R_t = \frac{P_{t+1}}{P_t} -1
$$ both numbers, $VaR_\alpha$ and $ES$ can be, in fact, calculated in two ways. The first method is based on the empirical distribution, i.e. using data as given:
$$
VaR_\alpha = h_i^{VaR} \ \ \mbox{for}\ \ \sum_{i=1}^{M-1} H_i(h_{i+1}-h_i) \le \alpha
$$ where $H$ represents the normalized histogram of $R_t$ (i.e., its integral is equal 1) and $M$ is the number of histograms bins. Similarly for $ES$, we get:
$$
ES = \frac{1}{\alpha} \sum_{i=1}^{i^{VaR}} h_iH_i(h_{i+1}-h_i) \ .
$$ where $i^{VaR}$ denotes the histogram bin index corresponding to $h_i^{VaR}$. The second method would be based on integration given the best fit to the histogram of $R_t$ using $f(x)$ as a normal distribution. As we will see in the practical example below, both approaches return different values, and extra caution should be exercised while reporting the final risk measures.
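For the normal fit, both quantities also have closed forms which serve as a handy cross-check of the discrete integration; a minimal sketch with illustrative daily parameters:

mu=0.0007; sigma=0.02;  % illustrative daily mean and standard deviation
alpha=0.05;
VaR=norminv(alpha,mu,sigma)                 % 5% return quantile (negative)
ES=mu-sigma*normpdf(norminv(alpha))/alpha   % expected return in the left tail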

Case Study: IBM Daily Returns in 1987

The year 1987 went down as one of the most memorable in the history of stock trading. Why? Simply because of the so-called Black Monday on Oct 19, when markets denoted over 20% losses in a single trading day. Let’s analyse the case of the daily returns and their distribution for IBM stock, as analysed by a risk manager after the close of the market on Thu, Dec 31, 1987. The risk manager is interested in the estimation of the 1-day 95% VaR threshold, i.e. $VaR_{0.05}$, and the expected shortfall if the exceedance event were to occur on the next trading day (Jan 4, 1988). Therefore, let’s assume that our portfolio is composed solely of the IBM stock and we have an allocation of capital of $C$ dollars in it as of Dec 31, 1987.

Using a record of historical trading, IBM.dat, of the form:

IBM
DATES 	      OPEN 	HIGH 	  LOW 	    CLOSE 	VOLUME
03-Jan-1984   0.000000 	0.000000  0.000000  30.437500 	0.000000
04-Jan-1984   0.000000 	0.000000  0.000000  30.968750 	0.000000
05-Jan-1984   0.000000 	0.000000  0.000000  31.062500 	0.000000
06-Jan-1984   0.000000 	0.000000  0.000000  30.875000 	0.000000
...
04-Mar-2011   0.000000 	0.000000  0.000000  161.830000 	0.000000
07-Mar-2011   0.000000 	0.000000  0.000000  159.930000 	0.000000

containing the close price of IBM stock, in Matlab, we extract the data for the year of 1987, then we construct a vector of daily returns $R_t$ on the stock,

% Reading in the data
fn=['IBM.dat'];
ftsD=ascii2fts(fn,1,2);
range=[datenum('02-Jan-1987'),datenum('31-Dec-1987')];
drr=[datestr(range(1)),'::',datestr(range(2))];
fts=ftsD(drr);
data=fts2mat(fts,1);
 
% R_t vector
rdata=data(2:end,5)./data(1:end-1,5)-1;
 
% Plotting
figure(1);
subplot(2,1,1);
plot(data(:,1)-datenum('2-Jan-1987'),data(:,5));
xlim([0 365]); ylabel('IBM Close Price ($)');
subplot(2,1,2);
plot(data(2:end,1)-datenum('2-Jan-1987'),rdata);
xlim([0 365]); ylabel('R_t')
xlabel('t (days)')

and present the data graphically:

[Figure 1: IBM close price and daily returns R_t in 1987]

Having our data, let’s find $VaR_{0.05}$ and corresponding $ES$,

% Construct the normalized histogram
[H,h]=hist(rdata,100);
hist(rdata,100);
sum=0;
for i=1:length(H)-1
    sum=sum+(H(i)*(h(i+1)-h(i)));
end
figure(2)
H=H/sum;
bar(h,H,'FaceColor',[0.7 0.7 0.7],'EdgeColor',[0.7 0.7 0.7])
 
% Calculate VaR 95 based on empirical distribution
sum=0;
i=1;
while (i<length(H))
    sum=sum+(H(i)*(h(i+1)-h(i)));
    if(sum>=0.05) 
        break;
    end
    i=i+1;
end
VaR=h(i)
hold on; plot(h(i),0,'ro'); % mark VaR in the plot
 
% Fit the normal distribution to R_t data
[muhat,sigmahat] = normfit([rdata]');
Y = normpdf(h,muhat(1),sigmahat(1));
hold on; plot(h,Y,'b-'); % and plot the fit using blue line
 
% Find for the fitted N(muhat,sigmahat) VaR 95%
sum=0;
i=1;
while (i<length(h))
    sum=sum+(Y(i)*(h(i+1)-h(i)));
    if(sum>=0.05) 
        break;
    end
    i=i+1;
end
VaR_fit=h(i)
hold on; plot(h(i),0,'ro','markerfacecolor','r'); % mark VaR_fit
ivar=i;
 
% Calculate the Expected Shortfall 
%  based on fitted normal distribution
sum=0;
i=1;
while (i<=ivar)
    sum=sum+(h(i)*Y(i)*(h(i+1)-h(i)));
    i=i+1;
end
ES_fit=sum/0.05
hold on; plot(ES_fit,0,'ko','markerfacecolor','k');
 
% Add numbers to the plot
text(-0.23,25,['1-Day VaR and Expected Shortfall']);
text(-0.23,23.5,['as for 31/12/1987']);
text(-0.23,20,['VaR = ',num2str(VaR*-1,3),' (',num2str(VaR*-100,2),'%)']);
text(-0.23,18,['ES  = ',num2str(ES_fit*-1,3),' (',num2str(ES_fit*-100,2),'%)']);
text(-0.23,13,['\mu = ',num2str(muhat,2)]);
text(-0.23,11,['\sigma = ',num2str(sigmahat,2)]);
xlabel('R_t');

marking all the numbers in the resulting plot:

[Figure 2: Histogram of IBM 1987 daily returns with the fitted normal pdf and the VaR and ES markers]

In the plot, the filled red and black markers correspond to $VaR_{0.05}$ and $ES$, respectively, and are calculated using the fitted normal distribution. Both values are displayed in the plot as well. In addition, the plot contains an open red circle marker denoting the value of $VaR_{0.05}$ obtained from the discrete integration of the empirical histogram. Please note how the choice of method influences the value of VaR.

The derived numbers tell the risk manager that in 95% of cases one should feel confident that the value of $VaR_{0.05}$ will not be exceeded in trading of IBM on Jan 4, 1988 (the next trading day). $C\times VaR_{0.05} = 0.028C$ provides the loss amount (in dollars) if such an undesired event occurred. And if that were indeed the case, the loss (in dollars) we should expect to suffer would be $C\times ES = 0.058C$.
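Putting an illustrative number on $C$, say one million dollars allocated to IBM:

C=1e6;                % illustrative capital ($)
VaR=0.028; ES=0.058;  % the figures derived above
C*VaR                 % $28,000 at risk at the 95% confidence level
C*ES                  % $58,000 expected loss given the exceedance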

Now it is easy to understand why the Black Swan event of Oct 19, 1987 falls into the category of the least likely moves in the markets. It places itself in the far left tail of both the empirical and fitted distributions, denoting an impressive loss of 23.52% for IBM on that day. If we assume that the daily price changes are normally distributed, then the probability of a drop of at least 22 standard deviations is
$$
P(Z\le -22) = \Phi(-22) = 1.4 \times 10^{-107} \ .
$$ This is an astronomically rare event!

They say that lightning does not strike twice, but if a 1-day drop in the market of the same magnitude had occurred again on Jan 4, 1988, our long position in IBM would have caused us to suffer not a loss of $0.058C$ dollars but approximately $0.23C$. Ouch!

Coskewness and Cokurtosis Computation for Portfolio Managers


Suppose you are responsible for the management of your company’s current portfolio $P$ consisting of $N$ securities (e.g. equities) traded actively at NASDAQ. As a portfolio manager you are probably interested in portfolio performance, new asset allocation, and risk control. Therefore, your basic set of statistics should include the portfolio expected return, $\mu_P$, and the portfolio standard deviation, $\sigma_P$, as two key numbers you want to calculate every time you evaluate your portfolio performance.

Coskewness, $s_P$, and cokurtosis, $k_P$, are two additional multivariate higher moments (co-moments) important in the asset allocation process and portfolio management. Similarly to the fundamental meaning of a sample’s skewness and kurtosis, coskewness and cokurtosis provide a portfolio manager with the ability to test the same portfolio under different compositions in order to facilitate the changes required to be introduced (e.g. an elimination of some allocations in badly performing securities causing a more pronounced negative coskewness).

In this post I will provide the basic matrix algebra covering the computational solution for four portfolio statistics, with a Matlab code example as a practical implementation of the mathematics in action.

Expected Return on Portfolio

Let $P$ be our portfolio containing $N$ securities $K_i$ such that $P=\{K_1,K_2,…,K_N\}$ and $i=1,…,N$. Each security has a trading history over the last $T$ days (or months, or years) and we are looking for its $T-1$ rates of return $r_t$ for $t=1,…,T-1$. Therefore, let us denote by $K_i$ a column vector of dimensions $(T-1,1)$ such that:
$$
K_i = [r_1,r_2,…,r_{T-1}]^T\ ,
$$
and portfolio matrix, $P$, containing all securities as:
$$
P =
[K_1\ |\ K_2\ |\ …\ |\ K_N ]
=
\left[
\begin{array}{cccc}
r_{1,1} & r_{1,2} & … & r_{1,N} \\
r_{2,1} & r_{2,2} & … & r_{2,N} \\
… & … & … & … \\
r_{T-1,1} & r_{T-1,2} & … & r_{T-1,N}
\end{array}
\right] \ .
$$ Our portfolio can be described in term of their weights:
$$
w_i = \frac{y_iS_i(0)}{V(0)}
$$ where the numerator describes the product of the number of shares and the stock purchase price in ratio to the amount initially invested in the portfolio. Rewriting it into a one-row matrix we have:
$$
w = [\ w_1\ \ w_2\ \ … \ \ w_N\ ]
$$ where for obvious reasons the sum of the individual weights is one, i.e. $1=uw^T$ with $u$ denoting a row vector of ones. For each security we can calculate its expected return $\mu_i=E(K_i)$ and form a vector:
$$
m = [\ \mu_1\ \ \mu_2\ \ … \ \ \mu_N\ ] \ .
$$
Finally, the expected return on portfolio $P$ is given as:
$$
\mu_P = mw^T \
$$ what follows:
$$
\mu_P = E\left(\sum_{i=1}^{N} w_iK_i \right) = \sum_{i=1}^{N} w_i\mu_i = mw^T .
$$

Portfolio Variance


By evaluation of variance, we are able to show that the expected portfolio variance can be expressed as:
$$
\sigma_P^2 = w M_2 w^T
$$
where $M_2$ is the covariance matrix $(N,N)$ with the individual covariances denoted by $c_{ij}=Cov(K_i,K_j)$. At this point, each element of $M_2$ could be calculated as $c_{ij}=\rho_{ij}\sigma_i\sigma_j$, i.e. with a knowledge of coefficient of correlation, $\rho_{ij}$, between the securities $i$ and $j$, and corresponding standard deviations calculated from the history provided by $P$ matrix.

Just as an educational remark, let me remind you that elements $c_{ij}$ can be obtained in the following way:
$$
c_{ij} = \frac{1}{(T-1)-1} \sum_{t=1}^{T-1} (K_{t,i}-\mu_i)(K_{t,j}-\mu_j) \ ,
$$ therefore
$$
M_2 =
\left[
\begin{array}{cccc}
c_{11} & c_{12} & … & c_{1N} \\
c_{21} & c_{22} & … & c_{2N} \\
… & … & … & … \\
c_{N1} & c_{N2} & … & c_{NN}
\end{array}
\right] .
$$

Portfolio Coskewness

It can be calculated as the product:
$$
s_P = w M_3 (w^T\otimes w^T)
$$
where the symbol of $\otimes$ denotes the Kronecker product between $w$ and $w$, and
$$
M_3 = E[(r-\mu)(r-\mu)^T\otimes(r-\mu)^T] = \{a_{ijk}\} \ , \\
a_{ijk} = E[(r_i-\mu_i)(r_j-\mu_j)(r_k-\mu_k)] \ \ \mbox{for}\ \ i,j,k=1,…,N.
$$
The co-skewness matrix of $M_3$ of dimensions $(N,N^2)$ can be represented by $N$ $A_{ijk}$ matrixes $(N,N)$ such that:
$$
M_3 = [\ A_{1jk}\ A_{2jk}\ …\ A_{Njk}\ ]
$$
where $j,k=1,…,N$ and the slice index $i$ runs from 1 to $N$ as well. The individual elements of the matrix can be obtained as:
$$
a_{ijk} = \frac{1}{T-1} \sum_{t=1}^{T-1} (K_{t,i}-\mu_i)(K_{t,j}-\mu_j)(K_{t,k}-\mu_k) \ .
$$

Portfolio Cokurtosis

Cokurtosis can be obtained by calculation of the following products:
$$
k_P = w M_4 (w^T\otimes w^T\otimes w^T)
$$ where
$$
M_4 = E[(r-\mu)(r-\mu)^T\otimes(r-\mu)^T\otimes(r-\mu)^T] = \{b_{ijkl}\} \ , \\
b_{ijkl} = E[(r_i-\mu_i)(r_j-\mu_j)(r_k-\mu_k)(r_l-\mu_l)] \ \ \mbox{for}\ \ i,j,k,l=1,…,N.
$$
The co-kurtosis matrix $M_4$ of dimensions $(N,N^3)$ can be represented by $N$ $B_{ijkl}$ matrixes $(N,N)$ such that:
$$
M_4 = [\ B_{11kl}\ B_{12kl}\ …\ B_{1Nkl}\ |\ B_{21kl}\ B_{22kl}\ …\ B_{2Nkl}\ |\ …\ |\ B_{N1kl}\ B_{N2kl}\ …\ B_{NNkl}\ ]
$$
where $k,l=1,…,N$. The individual elements of the matrix can be obtained as:
$$
b_{ijkl} = \frac{1}{T-1} \sum_{t=1}^{T-1} (K_{t,i}-\mu_i)(K_{t,j}-\mu_j)(K_{t,k}-\mu_k)(K_{t,l}-\mu_l) \ .
$$

Example

Below I present a simple Matlab code which computes the four key portfolio statistics discussed above based on $N=3$ securities traded within a given $P$. The choice of portfolio content is not limited solely to equities, and $P$ can contain any asset class under management, or even a mixture of them. Since the expected portfolio return and risk, $\mu_P$ and $\sigma_P$, respectively, are assumed to be expressed in percent, their transformation to a physical unit of money can be obtained straightforwardly.

% Program demonstrates the calculation of portfolio four key statistics:
% the expected portfolio return (muP), portfolio standard deviation (sigP),
% portfolio co-skewness (sP), and portfolio co-kurtosis (kP).
%
% (c) 2013, QuantAtRisk.com, by Pawel Lachowicz
%
% Assumptions: * portfolio comprises 252 trading days, i.e. T=252-1 values
%                of daily returns
%              * daily returns are simulated from the normal distribution
%                with zero mean and unit standard deviation (in practice
%                one can use real market data)
%              * N=3, number of securities in portfolio P
%
% Remark: the code can be easily modified for the need of any N securities
%         considered in your portfolio P; it is not difficult to note that
%         the calculation of all four statistics can be performed by risk
%         managers who manage the portfolio P of portfolios P_i such that
%         P=[P_1 P_2 ... P_N]
 
clear all; clc; close all;
 
T=252;
 
% Simulated returns for 3 securities X,Y,Z expressed in [%]
X=random('normal',0,1,T-1,1);
Y=random('normal',0,1,T-1,1);
Z=random('normal',0,1,T-1,1);
 
% Portfolio matrix
P=[X Y Z];
 
% Portfolio weights for all securities (sum=1)
w=[0.2 0.5 0.3];
 
% m-vector containing the expected returns of securities
m=mean([P(:,1) P(:,2) P(:,3)]);
 
% Computation of M2, the covariance matrix
S=[];
for i=1:size(P,2)
    for j=1:size(P,2)
        u=0;
        for t=1:T-1
            u=u+((P(t,i)-mean(P(:,i)))*(P(t,j)-mean(P(:,j))));
        end
        S(i,j)=u/((T-1)-1);
    end
end
M2=S; %
 
% Computation of M3, the co-skewness matrix
M3=[];
for i=1:size(P,2)
    S=[];
    for j=1:size(P,2)
        for k=1:size(P,2)
            u=0;
            for t=1:T-1 
                u=u+((P(t,i)-mean(P(:,i)))*(P(t,j)-mean(P(:,j))) ...
                    *(P(t,k)-mean(P(:,k))));
            end
            S(j,k)=u/(T-1); 
        end
    end
    M3=[M3 S];
end
 
% Computation of M4, the co-kurtosis matrix
M4=[];
for i=1:size(P,2)
    for j=1:size(P,2)
        S=[];
        for k=1:size(P,2)
            for l=1:size(P,2)
                u=0;
                for t=1:T-1 
                    u=u+((P(t,i)-mean(P(:,i)))*(P(t,j)-mean(P(:,j)))* ...
                        (P(t,k)-mean(P(:,k)))*(P(t,l)-mean(P(:,l))));
                end
                S(k,l)=u/(T-1);
            end
        end
        M4=[M4 S];
    end
end
 
% Results
muP=m*w'                           % expected return on a portfolio [%]
stdP=sqrt(w*M2*w')                 % portfolio standard deviation [%]
sP=w*M3*kron(w',w')                % portfolio coskewness
kP=w*M4*kron(kron(w',w'),w')       % portfolio cokurtosis

Measure of Risk of any Security in Equilibrium


The standard Capital Asset Pricing Model (CAPM) is an equilibrium model whose construction allows us to determine the relevant measure of risk for any asset and the relationship between expected return and risk when markets are in equilibrium. Within CAPM the expected return on security $i$ is given by:
$$
\bar{R}_i = R_F + \beta_i(\bar{R}_M-R_F)
$$ where $\beta_i$ is argued to be the correct measure of a security’s risk in very well-diversified portfolios. For such portfolios the non-systematic risk tends to go to zero and the remainder, the systematic risk, is measured by $\beta_i$. For the Market Portfolio, its Beta is equal to one. Therefore, the aforementioned equation defines the security market line.

It is possible to go one step further and write the same CAPM formula as follows:
$$
\bar{R}_i = R_F + \left( \frac{\bar{R}_M-R_F}{\sigma_M} \right) \frac{\sigma_{iM}}{\sigma_M}
$$ which keeps the linear relationship with the expected return but in the $\sigma_{iM}/\sigma_M$ space. Recall that the term in brackets is the market price of risk, with the riskless rate of interest given by $R_F$. $\sigma_{iM}$ denotes the measure of the risk of security $i$, but written as $\sigma_{iM}/\sigma_M$ it provides a more intuitive understanding of itself as the measure of how the risk of a security affects the risk of the market portfolio.

In other words, $\sigma_{iM}/\sigma_M$ is the measure of risk of any security in equilibrium and, as we will show below, it is equal to:
$$
\frac{\sigma_{iM}}{\sigma_M} = \frac{X_i\sigma_i^2 + \sum_{j=1, j\ne i}^{N} X_j \sigma_{ij}}{\sigma_M}
$$ We may get that by performing a calculation of the first derivative of the standard deviation of the market portfolio $\sigma_M$, i.e. for
$$
\sigma_M^2 = \sum_{i=1}^{N} X_i^2\sigma_i^2 + \sum_{i=1}^N \sum_{j=1, j\ne i}^{N} X_iX_j \sigma_{ij}
$$ the risk of a security is the change in the risk of the market portfolio as the holdings of that security are varied:
$$
\frac{d\sigma_M}{dX_i} = \frac{d}{dX_i} \left[ \sum_{i=1}^{N} X_i^2\sigma_i^2 + \sum_{i=1}^N \sum_{j=1, j\ne i}^{N} X_iX_j\sigma_{ij} \right]^{1/2} \\
= \frac{1}{2} \left[ 2X_i\sigma_i^2 + 2 \sum_{j=1, j\ne i}^{N} X_j\sigma_{ij} \right] \left[ \sum_{i=1}^{N} X_i^2\sigma_i^2 + \sum_{i=1}^N \sum_{j=1, j\ne i}^{N} X_iX_j\sigma_{ij} \right]^{-1/2} \\
= \frac{X_i\sigma_i^2 + \sum_{j=1, j\ne i}^{N} X_j \sigma_{ij}}{\sigma_M} = \frac{\sigma_{iM}}{\sigma_M}
$$ Above, $\sigma_{ij}$ denotes the covariance between any securities $i$ and $j$ in the market portfolio, and $X_i$ is the share of capital invested in the $i$-th security.
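A minimal numerical check of that measure, assuming an illustrative 3-asset ‘market’ with weights $X$ and covariance matrix $C$:

X=[0.5 0.3 0.2];          % shares of capital invested
C=[0.04 0.01 0.00;        % covariance matrix sigma_ij
   0.01 0.09 0.02;
   0.00 0.02 0.16];
sigM=sqrt(X*C*X')         % risk of the market portfolio
riskMeasure=(C*X')/sigM   % sigma_iM/sigma_M for each security i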

Interestingly, the only remaining question is: how do we know the best choice of the market portfolio? I will try to address the research one may conduct on that topic in a separate post. As for now, one may assume that the S&P500 or S&P900 universe of stocks would serve here as a quite good example. If so, that would also mean that in order to find $\sigma_{iM}/\sigma_M$ for any security we invested our money in, we need to find $\sigma_M$. Is it hard? Well, if we assume the market to be the S&P500 then for any $i$ its $X_i=1/500$. Now, all we need to care about is to have an idea about the security’s variance $\sigma_i^2$ and the measure of covariance, i.e. how much two securities like each other in our market portfolio. A clever reader would ask immediately: but what about the time-window? And that’s where the devil conceals itself.

Final Frontier of Efficient Frontier


Have you ever considered why, in the risk-return space, the efficient frontier takes a shape which resembles some of the Bézier curves? And also, why does everybody around suggest that if you approach the efficient frontier from the right-hand side, your selection of a new investment could be the best choice?

Well, as for the latter question the answer seems to be easy. First, you plot all your stocks or combinations of stocks into a number of various portfolios (i.e. you derive the expected return and volatility of a given stock or portfolio, respectively; check out my article on 2-Asset Portfolio Construction for some fundamentals in that field) in the expected risk-return space. Your result can be similar to the one that follows:

[Figure: Annualized risk-return plot of the considered securities (red dots) with the efficient frontier (blue line)]

where the red dots correspond to specific positions of securities which you actually consider for a future investment. The data in the plot are annualized, which simply means we are looking at the potential return and risk (volatility) over a 1-year time horizon only. Secondly, from the plot you can read out that UTX, based on historical data, offers the highest potential return but at quite a high risk. If you are not a risk-taker, the plot immediately suggests choosing MMM or JNJ, but at a lower expected return. So far so good.

The blue line represents the so-called efficient frontier in the risk-return space. The investor wants to choose a portfolio on the efficient frontier because, for each volatility, there is a portfolio on the efficient frontier that has a higher rate of return than any other portfolio with the same volatility. In addition, by holding a portfolio on the efficient frontier the investor benefits from diversification (see also Riskless Diversification). That brings us to an attempt at answering the very first question we posed at the beginning: about the shape, therefore about:

THE MATHEMATICS OF EFFICIENT FRONTIER


As we will show below, the efficient frontier is a solution to the problem of efficient selection between two securities. It is sufficient to conduct the basic calculation based on two assets considered together. For a higher number of potential securities or portfolios of securities (as presented in the plot above), finding the efficient frontier line is subject to more advanced optimization.

Let’s start from the simple case of two securities, A and B, respectively. In our consideration we will omit the scenario where short sales are allowed. The expected portfolio return would be:
$$
\bar{R_P} = x_A\bar{R_A} + x_B\bar{R_B}
$$
where $x_A+x_B=1$ represents the sum of shares (weights), therefore $x_B=1-x_A$ simply means that our whole initial capital has been fully invested in both securities, and we first decided to put a fraction $x_A$ of our money into security A (say, $x_A=0.75$ or 75%). By rewriting the above we get:
$$
\bar{R_P} = x_A\bar{R_A} + (1-x_A)\bar{R_B} .
$$
The volatility (standard deviation) of that portfolio is therefore equal:
$$
\sigma_P = [x_A^2\sigma_A^2 + (1-x_A)^2\sigma_B^2 + 2\sigma_A\sigma_B\rho_{AB}x_A(1-x_A)]^{1/2}
$$
where by $\rho_{AB}$ we denote the linear correlation between the expected rates of return of both securities, also written as $corr(\bar{R_A},\bar{R_B})$. Please also note that the third term can be jotted down as:
$$
2\sigma_A\sigma_B\rho_{AB}x_A(1-x_A) = 2x_A(1-x_A)cov(\bar{R_A},\bar{R_B})
$$
where the relation between covariance and correlation holds as follows:
$$
cov(\bar{R_A},\bar{R_B}) = \sigma_A\sigma_B corr(\bar{R_A},\bar{R_B}) .
$$
The correlation remains a free parameter if, from the historical data of securities A and B, we are given the following parameters: $\sigma_A, \sigma_B$ and $\bar{R_A}, \bar{R_B}$. This is important to keep in mind!
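Before moving on, here is a tiny helper that merely re-expresses the two formulas above in Python (a sketch with illustrative names; all inputs are assumed to be annualized figures):

import numpy as np

def two_asset_portfolio(x_A, R_A, R_B, s_A, s_B, rho_AB):
    # expected return of the 2-asset portfolio
    R_P = x_A * R_A + (1 - x_A) * R_B
    # portfolio variance with the cross-term expressed through the correlation
    var_P = (x_A**2 * s_A**2 + (1 - x_A)**2 * s_B**2
             + 2 * s_A * s_B * rho_AB * x_A * (1 - x_A))
    return R_P, np.sqrt(var_P)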

Below, we will consider four cases leading us to the final mathematical curvature of the efficient frontier.

CASE 1: Perfect Positive Correlation ($\rho_{AB}=+1$)

Considering two assets only, A and B, respectively, the perfect positive correlation of $\rho_{AB}=+1$ leads us to the portfolio volatility:
$$
\sigma_P = [x_A^2\sigma_A^2 + (1-x_A)^2\sigma_B^2 + 2\sigma_A\sigma_B\times 1 \times x_A(1-x_A)]^{1/2}
$$
where we recognize $a^2+b^2+2ab=(a+b)^2$, which allows us to rewrite it as follows:
$$
\sigma_P = [x_A\sigma_A + (1-x_A)\sigma_B] \ \ \mbox{for}\ \ \rho_{AB}=+1 .
$$
Solving for $x_A$ we get:
$$
x_A = \frac{\sigma_P-\sigma_B}{\sigma_A-\sigma_B} \ \ \mbox{i.e.}\ \ x_A \propto \sigma_P .
$$
Substituting that into the formula for the expected return of the 2-asset portfolio, we arrive at:
$$
\bar{R_P} = x_A\bar{R_A} + (1-x_A)\bar{R_B} \\
= \frac{\sigma_P-\sigma_B}{\sigma_A-\sigma_B}\bar{R_A} + \left(1-\frac{\sigma_P-\sigma_B}{\sigma_A-\sigma_B}\right)\bar{R_B} \\
… \\
= \sigma_P\left(\frac{\bar{R_A}-\bar{R_B}}{\sigma_A-\sigma_B}\right) - \sigma_B\left(\frac{\bar{R_A}-\bar{R_B}}{\sigma_A-\sigma_B}\right)+\bar{R_B} .
$$
Therefore, the solution for this case can be summarized as a linear (straight-line) relation between the expected portfolio return and the portfolio volatility whenever the portfolio is constructed from two perfectly positively correlated assets:
$$
\bar{R_P}(\sigma_P) = a\sigma_P + b ,
$$
with the constants $a$ and $b$ readable from the last line of the derivation above.

CASE 2: Perfect Negative Correlation ($\rho_{AB}=-1$)

In this case, we can denote the portfolio volatility as:
$$
\sigma_P = [x_A^2\sigma_A^2 + (1-x_A)^2\sigma_B^2 + 2\sigma_A\sigma_B\times (-1) \times x_A(1-x_A)]^{1/2}
$$
where, similarly as in Case 1, we recognize $a^2+b^2-2ab=(a-b)^2$, which leads us to two possible solutions, namely:
$$
\sigma_P = x_A\sigma_A - (1-x_A)\sigma_B \ \ \ \ \ \ \ \mbox{or} \\
\sigma_P = -x_A\sigma_A + (1-x_A)\sigma_B \ .
$$
Since $\sigma_P$ must be non-negative, only one of these two solutions holds at a time (whichever is non-negative for the given $x_A$).

Please also note that $\sigma_P$(Case 2) never exceeds $\sigma_P$(Case 1) for $x_A\in [0,1]$ and is strictly smaller for $0<x_A<1$. That allows us to formulate a very important note: the risk of the portfolio with $\rho_{AB}=-1$ is smaller than that of the portfolio with $\rho_{AB}=+1$. Interesting, isn’t it? Hold on and check this out!

It is possible to obtain a portfolio with zero risk by setting $\sigma_P$(Case 2) to zero and solving for $x_A$ (i.e. what fraction of our money we need to invest in asset A to minimize the risk) as follows:
$$
x_A\sigma_A - (1-x_A)\sigma_B = 0 \\
x_A\sigma_A - \sigma_B + x_A\sigma_B = 0 \\
x_A(\sigma_A+\sigma_B) = \sigma_B
$$
which provides us with the solution:
$$
x_A = \frac{\sigma_B}{\sigma_A+\sigma_B} .
$$
It simply means that for such a derived value of $x_A$ the portfolio risk reaches its minimum, i.e. $\sigma_P=0$.
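For instance, taking purely illustrative numbers of $\sigma_A=30\%$ and $\sigma_B=60\%$ with $\rho_{AB}=-1$, the zero-risk mix would be:
$$
x_A = \frac{0.60}{0.30+0.60} = \frac{2}{3}, \ \ \ \ x_B = 1-x_A = \frac{1}{3} ,
$$
i.e. two-thirds of the capital placed in the less volatile asset A.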

CASE 3: No Correlation between returns on the assets ($\rho_{AB}=0$)

In this case the cross-term vanishes leaving us with the simplified form of the expected portfolio volatility:
$$
\sigma_P = [x_A^2\sigma_A^2 + (1-x_A)^2\sigma_B^2 ]^{1/2} .
$$
The graphical presentation is as follows:

The point marked by a dot is peculiar. This portfolio has minimum risk. It is possible to derive the value of $x_A$ for which this minimum is achieved. In order to do so, for the general form of $\sigma_P$,
$$
\sigma_P = [x_A^2\sigma_A^2 + (1-x_A)^2\sigma_B^2 + 2x_A(1-x_A)\sigma_A\sigma_B\rho_{AB}]^{1/2} ,
$$
we find $x_A$ by calculating the first derivative,
$$
\frac{d{\sigma_P}}{d{x_A}} = \left(\frac{1}{2}\right) \frac{2x_A\sigma_A^2-2\sigma_B^2+2x_A\sigma_B^2+2\sigma_A\sigma_B\rho_{AB}-4x_A\sigma_A\sigma_B\rho_{AB}}
{[x_A^2\sigma_A^2+(1-x_A)^2\sigma_B^2+2x_A(1-x_A)\sigma_A\sigma_B\rho_{AB}]^{1/2}} \\
= \left(\frac{1}{2}\right) \frac{2x_A\sigma_A^2-2\sigma_B^2+2x_A\sigma_B^2+2\sigma_A\sigma_B\rho_{AB}-4x_A\sigma_A\sigma_B\rho_{AB}}{\sigma_P} ,
$$
and by setting it to zero:
$$
\frac{d{\sigma_P}}{d{x_A}} = 0
$$
we get:
$$
x_A = \frac{\sigma_B^2-\sigma_A\sigma_B\rho_{AB}}{\sigma_A^2+\sigma_B^2-2\sigma_A\sigma_B\rho_{AB}}
$$
which in the case of $\rho_{AB}=0$ reduces to:
$$
x_A = \frac{\sigma_B^2}{\sigma_A^2+\sigma_B^2} .
$$
This solution is also known as the minimum variance portfolio when two assets are combined in a portfolio.
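In code, the last two formulas boil down to a one-liner (a sketch; the function name is illustrative):

def min_variance_weight(s_A, s_B, rho_AB):
    # weight of asset A that minimizes the 2-asset portfolio variance
    return (s_B**2 - s_A * s_B * rho_AB) / (s_A**2 + s_B**2 - 2 * s_A * s_B * rho_AB)

# e.g. min_variance_weight(0.3, 0.6, 0.0) returns 0.8,
# i.e. x_A = sigma_B^2 / (sigma_A^2 + sigma_B^2) for uncorrelated assets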

CASE 4: All other correlations between two assets (e.g. $\rho_{AB}=0.4$)

Given $\sigma_A$ and $\sigma_B$ at $\rho_{AB}$ not equal to $0$, $-1$ or $1$, one finds that the volatility of the 2-asset portfolio takes, in general, the following form:
$$
\sigma_P = \sqrt{c_1x_A^2 + c_2x_A + c_3}
$$
where $c_1$, $c_2$ and $c_3$ denote constant coefficients (depending on $\sigma_A$, $\sigma_B$ and $\rho_{AB}$), and it can be presented in graphical form as follows:

There is a particular point of interest here and the shape of the curve is concave. Since the investor always prefers a higher rate of return at the same level of risk, and a lower risk at the same rate of return, the set of best portfolio combinations obtained by following these rules is called the efficient frontier, in general represented by the Case 4 line.
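All four cases can be traced numerically by sweeping $x_A$ from 0 to 1 for a few values of $\rho_{AB}$. The sketch below reuses the hypothetical two_asset_portfolio() helper defined earlier and uses made-up inputs for the returns and volatilities:

import numpy as np
import matplotlib.pyplot as plt

R_A, R_B = 0.26, 0.13   # assumed expected annual returns
s_A, s_B = 0.60, 0.30   # assumed annual volatilities

x = np.linspace(0, 1, 101)
for rho in [+1.0, -1.0, 0.0, 0.4]:
    R_P, vol_P = two_asset_portfolio(x, R_A, R_B, s_A, s_B, rho)
    plt.plot(vol_P, R_P, label=r'$\rho_{AB}=%+.1f$' % rho)
plt.xlabel('Portfolio Volatility')
plt.ylabel('Expected Portfolio Return')
plt.legend(loc='best')
plt.show()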

2-Asset Portfolio Construction


When you start your journey in the world of finance and money investment, sooner or later you hit the concept of a portfolio. A portfolio is simply a term used to describe a set of assets you are holding, or are planning to hold, in order to make a profit over a given time horizon.

If you are a fan of stocks, your desire is to purchase some shares of a company you like or anticipate will bring you a profit. Make a note here. I said: a company. Single, not plural. If so, having some initial capital set aside for a particular choice of yours, say $C=\$100,000$, you decided to buy IBM stock at $\$100$ per share. Great! You now own a fraction of the IBM company. Regardless of whether you are a beginner in investing/trading or more advanced in that field, your purchase of IBM shares is now called a long position, i.e. you expect the value of the stock to go up. We will skip the scenario of short-selling (i.e. making a profit when the price of the stock goes down) for the simplicity of this article. All right, you waited a couple of days and, luckily, two wonderful events took place in the meantime. Firstly, the IBM stock is no longer worth $\$100$ (the demand for the stock was so high that the buying pressure pushed its price higher), and now IBM is worth $\$120$ per share. And guess what?! You can’t believe it! You’re smiling because now you can sell your part (share) of the IBM company to someone else (a buyer who is now willing to pay $\$120$ for the same share of IBM, i.e. $\$20$ more!). So, you do it! Secondly, in the meantime, IBM decided that a part of its net income would be distributed (given to, divided among) all shareholders in the form of dividends, the value of which had been announced by IBM to be $\$5$ per share held. As a consequence of these two events, your net return turned out to be:
$$
R = \frac{S_t+D_t}{S_{t-1}} - 1 = \frac{\$120+\$5}{\$100} - 1 = 0.25 \ \ \mbox{or} \ \ 25\%
$$
which, in terms of your capital gain, simply means you became richer by an extra $\$25,000$ according to
$$
\mbox{Capital updated} \ = C(1+R) = \$100,000(1+0.25) = \$125,000 .
$$

To be completely honest, this situation would take place only under one condition of the market’s operations of buying and selling the IBM stock. You may ask: what’s that? It is the assumption of perfect financial market(s), i.e. no taxes, no transaction costs, no costs of writing and enforcing contracts, no restrictions on investment in securities, no differences in information across investors, and investors who take prices as given because they are too small to affect them by their actions of buying/selling (i.e. there exists good liquidity of the stock in the market), taken as the default. Oh, it’s a rough assumption! It is, indeed. But it has been assumed with the clear purpose of simplifying the basic understanding of what you can gain/lose in the process of trading. For a curious reader, I refer to my article Number of Shares for Limit Orders which provides a broader feeling for how the extra fees of trading operations influence your initial capital, $C$, that you wish to invest in a single stock.

Let’s now consider that you want to split your initial capital of $C$ between two choices: IBM and NVDA (NVIDIA Corporation). From some sources you have found out that the latter stock is more volatile than IBM (i.e. its price jumps more rapidly; it’s more risky), therefore you decided to invest:

  1. $\$75,000$ in IBM (a 0.75 weight in the portfolio)
  2. $\$25,000$ in NVDA (a 0.25 weight in the portfolio) .

Now, also having the information that the expected return over a 1-year horizon is equal to 13% and 26% for IBM and NVDA, respectively, and that their corresponding volatilities are 30% and 60%, we may first calculate the fundamental portfolio measures based on these two assets (stocks) we wish to hold.

The Expected Portfolio ($P$) Return
$$
E(R_P) = \sum_{i=1}^{N} w_iE(R_i)
$$
which in our case study returns the following expectation for two assets ($N=2$),
$$
E(R_P) = 0.75\times 0.13+0.25\times 0.26 = 0.1625 \ \ \mbox{or} \ \ 16.25\% .
$$

The Volatility of the Return of the Portfolio


Before reaching the exact formula for the volatility of the expected return of our 2-asset portfolio, we need to consider the basic mathematics standing behind it. Firstly, keep in mind that for the variance of a scaled random variable $R_i$ (here, the return of asset $i$),
$$
Var(w_iR_i) = w_i^2 Var(R_i) .
$$
Secondly, keep in mind a high-school formula of $(a+b)^2=a^2+b^2+2ab$. Got it? Great! Let’s move forward with the variance of the sum of two variables, say $a$ and $b$:
$$
Var(a+b) = E[a+b-E(a+b)]^2 \\
= E[a+b-E(a)-E(b)]^2 \\
= E[a-E(a)+b-E(b)]^2 \\
= E[a-E(a)]^2 + E[b-E(b)]^2 + 2E\{[a-E(a)][b-E(b)]\} .
$$
Where does it take us? Well, to the conclusion of the day, i.e.:
$$
Var(a+b) = Var(a) + Var(b) + 2Cov(a,b)
$$
where $Cov(a,b)$ is the covariance that measures how $a$ and $b$ move together. At this point, we define the correlation between two variables $a$ and $b$ through:
$$
Cov(a,b) = Vol(a)\times Vol(b)\times Corr(a,b)
$$
which, for two identical assets (variables) perfectly correlated with each other (i.e. $Corr(a,a)=1$), would return:
$$
Cov(a,a)=Vol(a)\times Vol(a)\times 1 = Var(a) .
$$
Keeping all foregoing mathematics in mind, we may define the variance of the portfolio return as follows:
$$
Var(P) = Var\left(\sum_{i=1}^{N} w_iR_i\right) = \sum_{i=1}^{N} w_i^2Var(R_i)+\sum_{i=1}^{N}\sum_{j=1,\ j\ne i}^{N} w_iw_jCov(R_i,R_j)
$$
where again,
$$
Cov(R_i,R_j) = Vol(R_i)Vol(R_j)Corr(R_i,R_j) ,
$$
describes the relation between the covariance of the expected returns of assets $i$ and $j$, their corresponding volatilities, and their mutual (linear) correlation, the latter lying within the $[-1,1]$ interval. The formula for $Var(P)$ is extremely important to memorize as it provides the basic information on whether a combination of two assets is less or more risky than putting all our money into one basket.
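For completeness, in matrix notation the same quantity is just $w^{T}\Sigma w$, where $\Sigma$ is the covariance matrix of returns. A minimal sketch in Python could look as follows (the arguments are plain NumPy arrays and the names are illustrative):

import numpy as np

def portfolio_variance(w, cov_matrix):
    # Var(P) = w' * Sigma * w for an N-asset portfolio,
    # where w is a length-N weight vector and cov_matrix is the N x N
    # covariance matrix of the asset returns
    w = np.asarray(w, dtype=float)
    return float(w @ cov_matrix @ w)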

Therefore, 2-asset portfolio construction constitutes the cornerstone of a basic risk management approach for any investment which requires, in general, a division of the investor’s initial capital $C$ among $N$ assets (investments) he or she wishes to take positions in.

Coming back to our case study of IBM and NVDA investment opportunities combined into the portfolio of two assets, we find that:
$$
Var(P) = (0.75)^2(0.30)^2 + (0.25)^2(0.60)^2 + 2(0.75)(0.25)(0.30)(0.60)\times \\ Corr(\mbox{IBM},\mbox{NVDA}) = 0.1069
$$
at an assumed, ad hoc, correlation of 0.5 between these two securities. Since
$$
Vol(P)=\sqrt{Var(P)} ,
$$
eventually we find that our potential investment in both IBM and NVDA companies – the 2-asset portfolio – given their expected rates of returns at derived volatility levels for a 1-year time-frame is:
$$
E(P) \ \ \mbox{at}\ \ Vol(P) = 16.25\% \ \ \mbox{at}\ \ 32.69\% .
$$
It turns out to be more rewarding ($16.25\%$ versus the expected return of $13\%$ from investing solely in IBM) at only slightly higher risk (less than a $3\%$ difference in volatility).
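As a quick sanity check, the numbers above can be reproduced with a few lines of Python (the 0.5 correlation remains a pure assumption of this example):

import numpy as np

w    = np.array([0.75, 0.25])            # weights: IBM, NVDA
mu   = np.array([0.13, 0.26])            # expected 1-year returns
vols = np.array([0.30, 0.60])            # annual volatilities
corr = np.array([[1.0, 0.5],
                 [0.5, 1.0]])            # assumed correlation matrix

cov   = np.outer(vols, vols) * corr      # covariance matrix
var_P = w @ cov @ w                      # ~0.1069
print(w @ mu, np.sqrt(var_P))            # 0.1625 and ~0.3269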

We will come back to the concept of 2-asset portfolio during the discussion of risk-reward investigation for N-asset investment case.

Riskless Diversification

Diversification is viewed by many as an attractive tool for limiting the risk of a portfolio currently held by an investor. Dealing with numbers, one can show that if one holds $N$ completely uncorrelated securities, the volatility of such a portfolio, $Vol(P)$, decreases with an increasing number of securities $N$ as follows:
$$
Vol(P) = \left[\sum_{i=1}^{N} \left(\frac{1}{N}\right)^2 Var(y_i) \right]^{1/2} = \frac{Vol(y)}{\sqrt{N}} .
$$
What remains unspoken but important to keep in mind is that all securities $i$, where $i=1,…,N$, are assumed to have the same volatility and the same expected return. Therefore, by holding uncorrelated securities, one can make the portfolio volatility arbitrarily small, provided one holds sufficiently many of them.
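A quick numerical illustration of that $1/\sqrt{N}$ effect (a sketch, assuming every security carries the same annual volatility of, say, 30% and zero mutual correlations):

import numpy as np

vol_single = 0.30
for N in [1, 5, 10, 50, 100]:
    vol_P = vol_single / np.sqrt(N)
    print("N = %3d  ->  Vol(P) = %5.2f%%" % (N, 100 * vol_P))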

That is the theory. The investor’s reality is usually far from this ideal situation, as portfolios usually contain a number of investments mutually connected or dependent upon one another through their feedback to market movements or sentiment. The remaining risk, which cannot be eliminated by the diversification process, is known as systematic risk.
