Quantitative Analysis, Risk Management, Modelling, Algo Trading, and Big Data Analysis

Python for Algo-Trading: Workshops in Poland


QuantAtRisk.com would love to invite you to attend

poland-banner

where we will cover all essential aspects of algo-trading on the Warsaw Stock Exchange and selected EU/UK/US exchanges, plus FOREX markets.


Interested? Click here for Early-Bird Registration Special Offer and Details!

Organised and Led by
Dr. Pawel Lachowicz (Sydney, Australia)

Recovery of Financial Price-Series based on Daily Returns Matrix in Python

Lesson 10>>

As a financial analyst or algo trader, you are often handed information on daily asset trading in the form of a daily returns matrix. In many cases it is easier to operate with return-series rather than with price-series, and there are excellent reasons behind such a decision, e.g. the possibility of plotting the histogram of daily returns, the calculation of daily Value-at-Risk (VaR), etc.

When you use Python (not Matlab), the recovery of price-series from return-series may be a bit of a challenge, especially when you face the problem for the first time. The technical question, i.e. "how do I do it?!" within Python, requires you to switch your thinking mode and adjust your vantage point from Matlab-ish to Pythonic. Therefore, let's see the best recipe for turning your world upside down!

Say we start with an $N$-asset portfolio traded over the last $L+1$ days. It will require the use of NumPy arrays. We begin with:

import numpy as np
import matplotlib.pyplot as plt
 
np.random.seed(2014)
 
# define portfolio
N = 5  # number of assets
L = 4  # number of days
 
# asset close prices
p = np.random.randint(10, 30, size=(N, 1)) + \
    np.random.randn(N, 1)  # a mixture of uniform and N(0,1) rvs
print(p)
print(p.shape)
print()

where we specify the number of assets in the portfolio and the number of days. $L = 4$ has been selected for clarity of the printed output below; however, feel free to increase that number (or both) anytime you rerun this code.

Next, we create an ($N\times 1$) matrix of random starting prices for all $N$ assets, each a random integer between \$10 and \$30 supplemented by an N(0, 1) random component. Printing p returns:

[[ 25.86301396]
 [ 19.82072772]
 [ 22.33569347]
 [ 21.38584671]
 [ 24.56983489]]
(5, 1)

Now, let's generate a matrix of random daily returns over the next 4 days for all 5 assets:

r = np.random.randn(N, L)/50
print(r)
print(r.shape)

delivering:

[[ 0.01680965 -0.00620443 -0.02876535 -0.03946471]
 [-0.00467748 -0.0013034   0.02112921  0.01095789]
 [-0.01868982 -0.01764086  0.01275301  0.00858922]
 [ 0.01287237 -0.00137129 -0.0135271   0.0080953 ]
 [-0.00615219 -0.03538243  0.01031361  0.00642684]]
(5, 4)

Having that, we wish, for each asset, to take its first close-price value from the p array and, using the daily returns stored row-by-row (i.e. asset by asset) in the r array, reconstruct the close-price time-series:

for i in range(r.shape[0]):      # outer loop over assets (rows)
    tmp = []
    for j in range(r.shape[1]):  # inner loop over days (columns)
        if(j == 0):
            tmp.append(p[i][0].tolist())  # starting close price of the i-th asset
            y = p[i] * (1 + r[i][j])      # price after the first daily return
        else:
            y = y * (1 + r[i][j])         # compound the next daily return
        tmp.append(y.tolist()[0])

    if(i == 0):
        P = np.array(tmp)                   # first row of the price matrix
    else:
        P = np.vstack([P, np.array(tmp)])   # append the next asset's price-series

print()
print(P)
print(P.shape)
print()

That returns:

[[ 25.86301396  26.29776209  26.13459959  25.38282873  24.38110263]
 [ 19.82072772  19.72801677  19.70230331  20.11859739  20.33905475]
 [ 22.33569347  21.91824338  21.53158666  21.8061791   21.99347727]
 [ 21.38584671  21.66113324  21.63142965  21.33881915  21.51156325]
 [ 24.56983489  24.41867671  23.55468454  23.79761839  23.95056194]]
(5, 5)

Thus, we have two loops: the outer one over rows/assets (index i) and the inner one over columns/days (index j). For j = 0 we copy the price of the asset from p as the "starting close price", i.e. on the first day. Concurrently, using the first entry of the r matrix, we compute the price change on the next day. In the tmp list we store (per asset) the history of close prices over all L+1 days. These operations are based on simple list processing. Finally, having complete information on the i-th asset and its price changes over r.shape[1] + 1 days, we build a new array P with the aid of the np.vstack function (see more in Section 3.3.4 of Python for Quants. Volume I). Therefore, P stores the simulated close-price time-series for N assets.
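For the record, the same reconstruction can be obtained without explicit loops. Appended to the main code above, a minimal vectorised sketch (using NumPy's cumprod and hstack; P_vec is just an illustrative name) reads:

# vectorised reconstruction: compound the gross returns along each row
# and prepend the starting prices stored in p
P_vec = np.hstack([p, p * np.cumprod(1.0 + r, axis=1)])
print(np.allclose(P, P_vec))  # should print True: both methods agree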

We can display them by adding to our main code:

plt.figure(num=1, figsize=(8, 5))
plt.plot(P.T, '+-')
plt.xlabel("Days")
plt.ylabel("Asset Close Price (\$)")

which reveals:
stockss
where the transposition of P has been applied for plotting in order to deliver the price-series asset by asset (try to plot the array without it, see what happens, and understand why).

The easiest solutions are the most romantic, as we have proven above. You don't impress your girlfriend by buying 44 red roses. 44 lines of Python code will do the same! Trust me on that! ;-)

This lesson is a fragment of my newest book, Python for Quants. Volume I, which became a bestseller within 3 days of publishing!

5 Words on How To Write A Quant Blog

An extract from Jacques Joubert’s newest article on How To Write A Great Quant Blog.

by Pawel Lachowicz

Do not commence working on your blog without a vision. "If you don't know where you are going, any road will get you there!" You want to avoid that mistake. Spend some time dreaming of the final form of your site.

Highly sought-after content is important, but not as much as your commitment to excel in its delivery. Write from your heart. Listen to your inner voice. Follow your own curiosity. Not the trends. Not what would be profitable. Forget about the noise in the blogosphere. If you want to shine like a diamond, you need to be cut like a diamond.

Your blog should be a reflection of your personality. Forget about achieving success quickly. It will take time. It will count in years. Just remember: “Success [in this venture] is something that you attract by becoming an attractive person”.

Don’t rush. Make a plan. Build your blog around the uniqueness of your writing. And don’t worry about “instant” followers.

Lastly, write less frequently. However, deliver pearls. Not plums.

Applied Portfolio VaR Decomposition. (3) Incremental VaR and Portfolio Revaluation.

Portfolios are like commercial aircraft. Rely on computers. Dotted around the world. Vulnerable to ever-changing weather conditions. Designed to bring benefits. Crafted to take risks and survive when the unexpected happens. As portfolio managers we have the ability to control those risk levels and adjust positions accordingly. Turbulence may occur anytime. A good pilot then knows what to do.


In Part 1 we applied the analytical VaR methodology to an exemplary 7-asset portfolio of equities, deriving a running portfolio VaR, marginal VaR, and component VaR. In Part 2 we examined the approximate impact of historical data on the covariance matrix estimation used in the portfolio VaR calculations.

Today, we will add a third component: an incremental VaR.

Altitude Adjustment

For a current portfolio holding $N$ assets with a total exposure of $C$ (dollars), we get a 95% portfolio VaR estimate simply by computing:
$$
\mbox{VaR}_P = 1.65 \sigma_P C = 1.65\left(\boldsymbol{d}' \boldsymbol{M}_2 \boldsymbol{d} \right)^{1/2}
$$ This can be estimated at any time. For example, if we evaluate a portfolio of US stocks on a daily basis, and new data are available for download after 4pm when the markets close, the 1-day VaR can be recalculated. If it increases day-over-day and, say, one of the component VaRs behaves in an abnormal way (raising a warning flag), we may want to do something about that particular position: remove it completely or decrease the exposure.

A second scenario could be more optimistic. Say we have extra cash to invest and we want to increase one (or a few) currently held positions in our portfolio. Adding more shares of the $i$-th asset will increase the portfolio VaR. The secret is in not pushing the running engines too hard and keeping the oil temperature within acceptable brackets.

Alternatively, you are a risk manager working in a global bank whose portfolio lists 10,453 open positions (across various markets and continents), and you have an incoming proposal of a new trade from one of the bank's clients. He wants to invest $\$$2,650,000 in a highly illiquid asset. It is physically possible, but the bank's VaR may increase significantly overnight. By how much?

Well, in this or any similar scenario, you need to perform a portfolio revaluation employing the incremental VaR procedure. Your aircraft may be brand new but remember that big planes fall from the sky too.

In general, by the incremental VaR we mean the difference between the new portfolio VaR (including the new trade(s) or the change in the current holding(s)) and the current portfolio VaR:
$$
\mbox{iVaR}_P = \mbox{VaR}_{P+a} - \mbox{VaR}_P
$$ where $a$ symbolises the proposed change. The incremental VaR changes in a non-linear way, as the position size can be increased or reduced by any amount. That is the main difference compared to the marginal VaR.

It is possible to expand $\mbox{VaR}_{P+a}$ in a series around the original point, ignoring second-order terms if the deviations $a$ are small:
$$
\mbox{VaR}_{P+a} = \mbox{VaR}_{P} + (\Delta \mbox{VaR})'a + \ldots
$$ and by "small" I mean a dollar change in the current holding(s) much, much smaller than the total market exposure $C$. In such a case,
$$
\mbox{iVaR}_P^{\rm approx} \approx (\Delta \mbox{VaR})'a
$$ where the apostrophe denotes the transposition of the vector. This approximation becomes important for really large portfolios (e.g. in banks) where a revaluation and a new estimate of the bank's VaR need to be derived quickly. A time saver at a reliable precision of landing. In all other cases, the new vector of extra exposures can simply be added to the present position sizes and the portfolio revaluated. Let's see how it works in practice!
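Before we do, here is a minimal numerical sketch of that relation, written in Python/NumPy rather than Matlab and fed with made-up inputs, so treat it purely as an illustration: for a toy covariance matrix and exposure vector it compares the exact incremental VaR with its first-order approximation based on the marginal VaR.

import numpy as np

np.random.seed(7)
N = 7
A = np.random.randn(N, N) / 100
M2 = A @ A.T                           # a toy (positive semi-definite) covariance matrix
d = np.random.uniform(5e4, 1e6, N)     # current dollar exposures per asset
a = np.zeros(N); a[1] = 1e4            # proposed change: +$10,000 in the second asset

vol_P  = np.sqrt(d @ M2 @ d)
VaR_P  = 1.65 * vol_P                            # current portfolio VaR
VaR_Pa = 1.65 * np.sqrt((d + a) @ M2 @ (d + a))  # portfolio VaR after the change

DVaR = 1.65 * (M2 @ d) / vol_P         # marginal VaR vector
iVaR_exact  = VaR_Pa - VaR_P
iVaR_approx = DVaR @ a                 # first-order approximation

print(iVaR_exact, iVaR_approx)         # close to each other for a "small" a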

 

Case Study A: Increasing Position of One Asset

Let's recall the same initial settings as in Part 1. We deal with a 7-asset portfolio of US equities as of the 12th of January, 2015. There is nearly 3 million dollars invested in the stock market (I wish it were mine! ;) and, based on 3-year historical data of all underlying stocks, the estimated 1-day 95% portfolio VaR is $\$$33,027.94 as forecasted for the next trading day. Id est, there is a 5% chance that we lose 1.1% of the portfolio value before Englishmen in New York raise their cups of tea:

% Applied Portfolio VaR Decomposition. (3) Incremental VaR and Portfolio 
% Revaluation.
%
% (c) 2015 QuantAtRisk.com, by Pawel Lachowicz
 
clear all; close all; clc;
format bank
 
%{
% Read the list of the portfolio components
fileID = fopen('portfolio.lst');
tmp = textscan(fileID, '%s');
fclose(fileID);
pc=tmp{1};  % a list as a cell array
 
% fetch stock data for last 3 years since:
t0=735976; % 12-Jan-2015
date2=datestr(t0,'yyyy-mm-dd');        % from
date1=datestr(t0-3*365,'yyyy-mm-dd');  % to
 
% create an empty array for storing stock data
stockd={};
% scan the tickers and fetch the data from Quandl.com
for i=1:length(pc)
    quandlc=['WIKI/',pc{i}];
    fprintf('%4.0f %s\n',i,quandlc);
    % fetch the data of the stock from Quandl
    % using the recommended Quandl command and
    % saving them directly into Matlab's FTS object (fts)
    fts=0;
    [fts,headers]=Quandl.get(quandlc,'type','fints', ...
                  'authcode','YourQuandlCode',...
                  'start_date',date1,'end_date',date2);
    stockd{i}=fts; % entire FTS object in an array's cell
end
save data
%}
 
load data
 
% limit data to 3 years of business days, select adjusted
% close price time-series and calculate return-series
rs=[];
for i=1:length(pc)
    cp=fts2mat(stockd{i}.Adj_Close,0);
    rv=cp(2:end,1)./cp(1:end-1,1)-1;
    rs=[rs rv(end-(3*5*4*12):end)];
end
rs(isnan(rs))=0.0;
 
% covariance matrix
M2=cov(rs)
 
% an initial dollar exposure per position
d=[   55621.00; ...
     101017.00; ...
      23409.00; ...
    1320814.00; ...
     131145.00; ...
     321124.00; ...
    1046867.00]
 
% invested capital of C
C=sum(d)
 
% the portfolio volatility
vol_P=sqrt(d'*M2*d);
 
% diversified portfolio VaR
VaR_P=1.65*vol_P
 
% volatility of assets (based on daily returns)
v=sqrt(diag(cov(rs)));
 
% individual and undiversified portfolio VaR
VaRi=1.65*abs(v.*d);
uVaR_P=sum(VaRi);
 
% the portfolio betas
beta=C*(M2*d)/vol_P^2;
 
% the portfolio marginal VaR
DVaR=1.65*(M2*d)/vol_P;
 
% the component VaR
cVaR=100*DVaR.*d/VaR_P; % [percent]
cVaRd=DVaR.*d; % dollars

Now, we want to add $\$$10,000 to our exposure in Walt Disney Co. stock. As of Jan/12 2015 DIS closes at $\$$94.48, therefore we consider buying $\$$10,000/$\$$94.48 = 105.8 shares on the next day. We are flexible, so we buy at the market price of $\$$95.23 on Jan/13, i.e. an extra 105 shares of DIS. Therefore we added $105\times \$95.23$ or $\$$9,999.15 to our portfolio:

% proposed changes to current exposure
da=[0;       ...
    9999.15; ...
    0;       ...
    0;       ...
    0;       ...
    0;       ...
    0            ];

Of course, we derive the new portfolio VaR estimate assuming that exactly 105 extra shares of DIS are added at that price. In real weather conditions this number can be lower. Keep that in mind.

The remaining part seems to be easy to code:

% new portfolio VaR
VaR_Pa=1.65*sqrt((d+da)'*M2*(d+da))
% incremental VaR
iVaR=VaR_Pa-VaR_P;
% incremental VaR (approximation; see text)
iVaR_approx=DVaR'*da;
 
% new component VaR
ncVaR=100*DVaR.*(d+da)/VaR_P; % [percent]
ncVaRd=DVaR.*(d+da); % dollars

followed by a new risk report on the impact of proposed changes:

fprintf('Current exposure:\t\t\t $%1s\n',bankformat(C));
fprintf('Current Portfolio VaR (diversified):\t $%1s\n',bankformat(VaR_P));
fprintf('Proposed changes in holdings:\n');
for i=1:length(d)
    fprintf('   %s\t $%-s\n',pc{i},bankformat(da(i)) );
end
fprintf('New Portfolio VaR (diversified):\t $%1s\n',bankformat(VaR_Pa));
fprintf('Incremental VaR:\t\t\t $%1s\n',bankformat(iVaR));
fprintf('Incremental VaR (approximation):\t $%1s\n',bankformat(iVaR_approx));
fprintf('Change in exposure:\t\t\t %-1.4f%%\n',sum(da)/C);
fprintf('Component VaR:\n');
fprintf('\t Current\t\t New\t\t\tChange in CVaR\n');
for i=1:length(d)
    fprintf('   %s\t%6.2f%%   $%-10s\t%6.2f%%   $%-10s\t$%-10s\n', ...
            pc{i},cVaR(i), ...
            bankformat(cVaRd(i)),ncVaR(i), ...
            bankformat(ncVaRd(i)),bankformat(ncVaRd(i)-cVaRd(i)) );
end

In this case study, the report says:

Current exposure:			 $2,999,997.00
Current Portfolio VaR (diversified):	 $33,027.94
Proposed changes in holdings:
   AAPL	 $0.00
   DIS	 $9,999.15
   IBM	 $0.00
   JNJ	 $0.00
   KO	 $0.00
   NKE	 $0.00
   TXN	 $0.00
New Portfolio VaR (diversified):	 $33,084.26
Incremental VaR:			 $56.31
Incremental VaR (approximation):	 $55.84
Change in exposure:			 0.0033%
Component VaR:
	 Current		 New			Change in CVaR
   AAPL	  0.61%   $202.99    	  0.61%   $202.99    	$0.00      
   DIS	  1.71%   $564.13    	  1.88%   $619.97    	$55.84     
   IBM	  0.26%   $84.43     	  0.26%   $84.43     	$0.00      
   JNJ	 30.99%   $10,235.65 	 30.99%   $10,235.65 	$0.00      
   KO	  1.19%   $392.02    	  1.19%   $392.02    	$0.00      
   NKE	  9.10%   $3,006.55  	  9.10%   $3,006.55  	$0.00      
   TXN	 56.14%   $18,542.15 	 56.14%   $18,542.15 	$0.00

That's a great example of how the approximation of the incremental VaR works in practice. $\$$10,000 added to DIS is "small" compared to $C$ of nearly 3 million dollars in the game. By adding 105 shares of DIS we didn't move the 1-day portfolio VaR significantly, which is good. Let's play again.

 

Case Study B: Increase and Decrease Exposure

From the initial risk report we learnt that TXN contributed the most to the overall portfolio VaR. Let's say we have an extra $\$$126,000 to invest and we think that AAPL will be the next shining star. At the same time, we want to reduce our market exposure to TXN by selling stock worth $\$$500,000 in total. This operation would be denoted as:

% proposed changes to current exposure
da=[126000;  ...
    0;       ...
    0;       ...
    0;       ...
    0;       ...
    0;       ...
    -500000      ];

and

Current exposure:			 $2,999,997.00
Current Portfolio VaR (diversified):	 $33,027.94
Proposed changes in holdings:
   AAPL	 $126,000.00
   DIS	 $0.00
   IBM	 $0.00
   JNJ	 $0.00
   KO	 $0.00
   NKE	 $0.00
   TXN	 $-500,000.00
New Portfolio VaR (diversified):	 $25,869.04
Incremental VaR:			 $-7,158.90
Incremental VaR (approximation):	 $-8,396.16
Change in exposure:			 -0.1247%
Component VaR:
	 Current		 New			Change in CVaR
   AAPL	  0.61%   $202.99    	  2.01%   $662.85    	$459.85    
   DIS	  1.71%   $564.13    	  1.71%   $564.13    	$0.00      
   IBM	  0.26%   $84.43     	  0.26%   $84.43     	$0.00      
   JNJ	 30.99%   $10,235.65 	 30.99%   $10,235.65 	$0.00      
   KO	  1.19%   $392.02    	  1.19%   $392.02    	$0.00      
   NKE	  9.10%   $3,006.55  	  9.10%   $3,006.55  	$0.00      
   TXN	 56.14%   $18,542.15 	 29.33%   $9,686.13  	$-8,856.02

The incremental VaR reacts to the massive sell-off of TXN, reducing the overall portfolio VaR significantly. Adding $\$$126k to AAPL (share-number precision omitted for the sake of simplicity) does not affect the portfolio risk levels in any dramatic way.

 

Case Study C: Adding Two New Assets

You are a child of a new era, the era of credit cards. More and more people use them worldwide. You have this brilliant idea to add to your portfolio an extra $\$$100,000 invested in (split equally between) Mastercard Inc. and Visa Inc.

Considering such an operation requires a new portfolio revaluation. Let's do it step by step. First, let's fetch the historical data for both MA and V stocks, the same way as we did at the very beginning. We save the ticker names in a separate file named proposal.lst:

MA
V

We supplement our Matlab code with a few extra lines:

% Read the list of the portfolio components
fileID = fopen('proposal.lst');
tmp = textscan(fileID, '%s');
fclose(fileID);
pcE=tmp{1};  % a list as a cell array
 
% fetch stock data for last 3 years since:
t0=735976; % 12-Jan-2015
date2=datestr(t0,'yyyy-mm-dd');        % from
date1=datestr(t0-3*365,'yyyy-mm-dd');  % to
 
% create an empty array for storing stock data
stockd={};
% scan the tickers and fetch the data from Quandl.com
for i=1:length(pcE)
    quandlc=['WIKI/',pcE{i}];
    fprintf('%4.0f %s\n',i,quandlc);
    % fetch the data of the stock from Quandl
    % using the recommended Quandl command and
    % saving them directly into Matlab's FTS object (fts)
    fts=0;
    [fts,headers]=Quandl.get(quandlc,'type','fints', ...
                  'authcode','YourQuandlCode',...
                  'start_date',date1,'end_date',date2);
    stockdp{i}=fts; % entire FTS object in an array's cell
end
save datap
%}
 
load datap
 
% limit data to 3 years of business days, select adjusted
% close price time-series and calculate return-series
rsp=[];
for i=1:length(pcE)
    cp=fts2mat(stockdp{i}.Adj_Close,0);
    rv=cp(2:end,1)./cp(1:end-1,1)-1;
    rsp=[rsp rv(end-(3*5*4*12):end)];
end
rsp(isnan(rsp))=0.0;

As soon as the new data arrive, we build an extended return-series matrix, now $721\times 9$ instead of $721\times 7$ in its dimensions:

rsE=[rs rsp];

What follows is pretty intuitive in coding:

% covariance matrix
M2E=cov(rsE);
 
% proposed changes to current exposure
daE=[0;       ...
     0;       ...
     0;       ...
     0;       ...
     0;       ...
     0;       ...
     0;       ...
     50000;   ...
     50000        ];
 
% new portfolio VaR
d=[d; 0; 0]  % we have to extend it to allow d+daE
VaR_PaE=1.65*sqrt((d+daE)'*M2E*(d+daE))
 
% a running list of tickers, too
pcE=[pc; pcE]
 
% incremental VaR
iVaRE=VaR_PaE-VaR_P;
 
fprintf('Current exposure:\t\t\t $%1s\n',bankformat(C));
fprintf('Current Portfolio VaR (diversified):\t $%1s\n',bankformat(VaR_P));
fprintf('Proposed changes in holdings:\n');
for i=1:length(daE)
    fprintf('   %s\t $%-s\n',pcE{i},bankformat(daE(i)) );
end
fprintf('New Portfolio VaR (diversified):\t $%1s\n',bankformat(VaR_PaE));
fprintf('Incremental VaR:\t\t\t $%1s\n',bankformat(iVaRE));

generating new risk estimations:

Current exposure:			 $2,999,997.00
Current Portfolio VaR (diversified):	 $33,027.94
Proposed changes in holdings:
   AAPL	 $0.00
   DIS	 $0.00
   IBM	 $0.00
   JNJ	 $0.00
   KO	 $0.00
   NKE	 $0.00
   TXN	 $0.00
   MA	 $50,000.00
   V	 $50,000.00
New Portfolio VaR (diversified):	 $33,715.09
Incremental VaR:			 $687.15

From this example we can see that extending our portfolio with the two assets would increase the 1-day portfolio VaR by approximately $\$$687. A digestible amount.

Stay tuned as in Part 4 we go extreme with Extreme VaR.

Endnote

When you pilot an aircraft from Amsterdam to Tokyo, the path usually runs over Norway, the North Pole, and eastern Russia. If one of the engines goes down, you switch it off and land the plane at the nearest airport. The risk in commercial aviation is too high to continue the flight.

As a portfolio manager you deal with similar situations, but you do not necessarily close your portfolio at once. Instead, you reduce the exposure to the troublesome assets. You can still fly in the markets, but with reduced thrust and hopes for big gains. But don't worry! Every plane lands with its nose up towards the sky. You can be losing but stay positive till the very end. You never know when cross-winds will change your situation. You may be back in the game before the bell closes the session.

Quants, NumPy, and LOTTO

Lesson 9>>

While working on Volume I of the Python for Quants ebook and going through NumPy's abilities, they leave me speechless despite the rain. Somehow. Every time. There is so much flexibility in expressing your thoughts and ideas, so much freedom in coding logical and mathematical concepts as a quant. With its grace of syntax, with its purity of simplicity.

When I was 16 I used to play LOTTO: the game where for your 3 favourite combinations of 6 numbers out of 49 you pay $\$1$. It was obvious that the probability of hitting the "six" was:
$$
\frac{1}{C^6_{49}} = \left( \frac{49!}{6!(49-6)!} \right)^{-1} = \frac{1}{13983816} \ \ .
$$ I had my favourite 6 numbers. The magic was that I believed that I could win the entire prize. Somehow, the game was worth trying. You never know when your lucky day comes!
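As a quick sanity check of that number (a trivial Python snippet, nothing more), the standard library confirms the count of combinations:

from math import comb  # Python 3.8+

print(comb(49, 6))      # 13983816
print(1 / comb(49, 6))  # ~7.15e-08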

Monte-Carlo Simulation for LOTTO

It is beyond any doubt that our mind can create certain circumstances in the real world (that we do not immediately understand) that allow us to bend reality to our knees. How? Well, if you attend any mind-control seminar you learn about the power of your mind and how to trick mathematical, logical reasoning. That applies to the LOTTO game as well. There are nearly 14M combinations and we aim to win that game. A low probability. A high reward if you are lucky.

I used to play my favourite combination, 7, 9, 11, 21, 35, 45, for nearly 3 months. I was visualising a situation when one day exactly those numbers would appear in the LOTTO drawing machine next Wednesday or Saturday. It took me 3 months to hit 5 out of 6 numbers! The financial reward was nice, trust me! But because I was of small faith, I did not hit the bull's eye. Now it's your turn!

The Python language is superb. It offers you an opportunity to simulate your luck in just a few minutes. Say you are like me: you randomly pick 6 out of 49 numbers and you play, twice a week, and... wait for your early retirement. With the NumPy library in Python, the estimation of the waiting time is straightforward. Analyse the following code:

# Quants, NumPy, and LOTTO
# (c) 2014 QuantAtRisk.com, by Pawel Lachowicz
 
import numpy as np
 
def lotto_numbers():
    ok = False
    while not ok:
        x = np.random.choice(49,6,replace=False)
        x.sort()
        tmp = np.where(x == 0)
        (m, )= tmp[0].shape
        if(m == 0):
            ok = True
    return x
 
fav = lotto_numbers()  # choose your favourite 6 numbers
print(fav)
 
match = False; i = 0
while not match:
    tmp = lotto_numbers()
    cmp = (tmp == fav)
    i += 1
    if cmp.all():
        match = True
 
print("Iterations: %g\nProbability = %.2e" % (i,1./i) )
print(tmp)

It's a nice combination of the most useful NumPy functions in action: .random.choice(), .where(), .shape, .sort(), and .all().

In line #9 we create a 6-element vector of random integers drawn from the set $\{0, 1, \ldots, 48\}$ and in line #10 we force them all to be sorted. Because $0$ is not part of the game, we need to check for its presence and draw again if it is detected. Line #11 returns a temporary array storing the indexes where $x$ has a zero element. Since we use the .random.choice function with the parameter replace=False, we can be sure that there is at most one zero. Therefore, the result of line #12 should be $m$ equal to 0 or 1.

In the main program (lines #17-29) we repeat the loop until we find a match between a (random) selection of our favourite 6 numbers and a new LOTTO draw. In line #23 the array cmp holds 6 boolean elements as the result of the element-wise comparison. Therefore, keeping the arrays sorted is an essential part of the code. Finally, in line #25 the NumPy method .all() returns True if all elements are True, and False otherwise. Nice and easy. An exemplary outcome you can get is:

[ 1  2  4 11 13 18]
Iterations: 37763
Probability = 2.65e-05
[ 1  2  4 11 13 18]

To give you a feeling for how long you would need to wait to hit your favourite "six", let's calculate:

>>> 37763/(2*52)
363

which returns the number of years (playing twice a week). Trust me, if you really, really, really want to win playing only your lucky numbers, you can. It is a matter of faith, luck, and... time.

As homework, try to modify the code to run it $10^4$ times and plot the histogram of the resulting probabilities. Is it Normal or Uniform?

Now I guess I should say Good luck!

Setting up Python for Quantitative Analysis in OS X 10.10 Yosemite

Welcome!

The most recent update of Apple's OS X, 10.10 Yosemite, makes its mark with a bit of splendour. But under the hood not too much has changed: the same Unix shell, the same way to perform our daily programming. Setting up Python in Yosemite today appears easier than ever (compare my post on performing the same task in OS X 10.9 Mavericks a year ago).

Download and Installation

The evolution of Continuum Analytics, with its unique mission to deliver Python to everyone, has started to pay off. They created Anaconda, a completely free, enterprise-ready Python distribution for large-scale data processing, predictive analytics, and scientific computing. It's free, including for commercial purposes. You can get a working version for Yosemite (as well as for Linux and Microsoft Windows). Just go to:

Anaconda1

fill in your email details, and choose which main Python version (2.7 vs 3.4) you prefer,

Anaconda2

and download the .dmg file to begin the installation on your Mac in a graphical mode.

It installs into a single directory and does not affect other Python installations on your system. No admin/root privileges are required either. You can switch among Python interpreters in versions 2.6, 2.7, 3.3, and 3.4. After installation you get instant access to 195+ of the most popular Python packages for science, math, engineering, and data analysis, including NumPy, SciPy, Pandas, IPython, Matplotlib, Numba, Blaze, and others.

Don't rush to pay for the additional Anaconda Accelerate package making use of GPUs. As I will address in my next post, Python using CUDA within the NumbaPro module offered by Continuum Analytics disappoints.

Visualisation of N-Asset Portfolio in Matlab

Many of you who wished to use Matlab's Financial Toolbox for portfolio visualisation most probably realised that something was wrong: the annualised risk and return for different time-series did not always match what was expected. Below you can find an improved piece of code which does the job correctly.

The input sampling of your financial time-series can now be specified directly as daily, weekly, monthly, or quarterly, and the risk-return diagram will be displayed with annualised results. Otherwise, you have the option of specifying another time horizon instead of the annualised one. All smooth and easy.

% Specialised plotting function for visualisation of portfolios
% based on the Financial Toolbox's portfolio objects
% 
% Modified by Pawel Lachowicz, 2014/03/01
%
% Remarks: Essential corrections introduced in the calculation and plotting
% of the annualised risk and returns coded incorrectly by the team at 
% The MathWorks, Inc.
 
function plotportfolio(varargin)
 
    plottitle = varargin{1};  % plot title, e.g. 'Efficient Frontier'
    plotfreq =  varargin{2};  % frequency of raw data (dt; or 'daily', etc.)
    plotfreq2=  varargin{3};  % e.g. 'annualised'
    plotfreq3=  varargin{4};
    plotlegend = [];
 
    switch plotfreq2
        case 'annualised'
            switch plotfreq
                case 'daily'
                    periods=252;
                case 'weekly'
                    periods=52;
                case 'monthly'
                    periods=12;
                case 'quarterly'
                    periods=4;
            end
        otherwise
            periods=plotfreq3;
    end
 
    for i = 5:nargin
        plotinfo = varargin{i};
 
        plottype = plotinfo{1};
        plotrsk  = plotinfo{2};
        plotret  = plotinfo{3};
        if numel(plotinfo) > 3
            plotlabel = plotinfo{4};
        else
            plotlabel = [];
        end
        if numel(plotinfo) > 4
            plotstyle = plotinfo{5};
            if isempty(plotstyle)
                plotstyle = 'b';
            end
        else
            if strcmpi(plottype,'line')
                plotstyle = 'b';
            else
                plotstyle = 'g';
            end
        end
        if numel(plotinfo) > 5
            plotline = plotinfo{6};
            if isempty(plotline)
                plotline = 1;
            end
        else
            if strcmpi(plottype,'line')
                plotline = 2;
            else
                plotline = [];
            end
        end
 
        % line plot
        if strcmpi(plottype,'line')
            hold on;
            for k = 1:size(plotrsk,2)
                plot(plotrsk(:,k)*sqrt(periods),((plotret(:,k)+1).^periods-1), ...
                     plotstyle, 'LineWidth', plotline);
                if i == 2 && k == 1
                    hold on
                end
                if ~isempty(plotlabel) && ~isempty(plotlabel{k})
                    plotlegend = [ plotlegend, {plotlabel{k}} ];
                end
            end
 
        % scatter plot
        else
            if any(plotstyle == '.')
                scatter(plotrsk*sqrt(periods),((plotret+1).^periods-1),'or','Filled');
            else
                scatter(plotrsk*sqrt(periods),((plotret+1).^periods-1),'og','Filled');
            end			
            if i==2
                hold on
            end
            if ~isempty(plotlabel)
                for k = 1:numel(plotrsk)
                    if ~isempty(plotlabel{k})
                        text(plotrsk(k)*sqrt(periods)+0.005, ...
                            ((plotret(k)+1).^periods-1), plotlabel{k},'FontSize', 9);
                    end
                end
            end
        end
    end
 
    if ~isempty(plotlegend)
        legend(plotlegend,'Location','SouthEast');
    end
    if strcmpi(plotfreq2,'annualised')
        xlabel('Annualised Volatility');
        ylabel('Annualised Expected Returns');
    else
        xlabel('Volatility');
        ylabel('Returns');
    end
    grid on
 
end

Now you can call it with various parameters of interest, matching your data directly, for instance:

plotportfolio('Portfolio',dt,'annualised',[], ... 
    {'line', prsk, pret}, ...
    {'scatter',CashRsk,CashRet, {'Cash'}}, ...
    {'scatter', sqrt(diag(p.AssetCovar)), p.AssetMean, p.AssetList,'.r'});
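For reference, the essential correction mentioned above boils down to how risk and return are annualised. Here is a minimal sketch of that conversion, written in Python with made-up numbers purely for illustration (the function itself stays in Matlab):

import numpy as np

periods = 252                 # daily data -> annualised (52 weekly, 12 monthly, 4 quarterly)
daily_vol = 0.012             # hypothetical daily volatility of a portfolio
daily_ret = 0.0004            # hypothetical mean daily return

ann_vol = daily_vol * np.sqrt(periods)        # volatility scales with the square root of time
ann_ret = (1.0 + daily_ret) ** periods - 1.0  # returns compound geometrically

print(round(ann_vol, 4), round(ann_ret, 4))   # 0.1905 0.106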

For some awesome practical applications of this function in the construction and analysis of an $N$-asset portfolio, grab the Applied Portfolio Optimization with Risk Management using Matlab ebook today.

Merry…

chrisk

– from QuantAtRisk!

Stay tuned, as later this year we are coming back to the problem of Portfolio Optimization and a new lesson on Accelerated Python for Quants. Be happy, relax, take your time, but strive for excellence, and never be satisfied with who you are and what you know! There is always someone out there who works harder than you!

Lots of Positive Vibes!
-Pawel

QuantAtRisk on YouTube!

Since Dec/1 you can follow the QuantAtRisk YouTube Channel! My goal is to deliver quality information and visualisations of solutions to complex financial problems.

We kick off this session with the very first video on Accelerated Matlab for Quants, a tutorial for complete beginners in Matlab programming.


 

Enjoy!

Join us at HPC Technologies in Finances in Singapore (15-16 July 2013)

To All of You,

who dare to know more, to understand more, to become amazing at work! Don't miss the opportunity of your life and sign up for this spectacular event in the picturesque city of Singapore! Enhance your skills, broaden your financial experience, make your boss proud of you! Visit us and enjoy this state-of-the-art workshop (theory + practice) just one degree north of the equator.

See you in Singapore very soon!

Marek T. Michalewicz, PhD, Director of A*STAR
Pawel Lachowicz, PhD, QaR

Click on the banner for more information:

HPC in Finance

VaR and Expected Shortfall vs. Black Swan


It is one of the most fundamental approaches to measuring risk, yet its calculation is truly worth revising. Value-at-Risk (VaR). A magical number quoted at the end of the day by the banks' financial risk managers, portfolio risk managers, and risk managers simply concerned about the expected risk threshold that, hopefully, will not be exceeded on the next day. What is so magical about it? By definition, given a specific time horizon and taking into account the last $N$ days of our trading activity, we can always calculate a number which provides a simple answer to the question: with a given confidence, what is the loss threshold we should not exceed on the next day? Well, welcome to the world of VaR!

More formally, given a daily (or weekly, monthly, etc.) distribution of returns (of, for example, a single stock, a portfolio of assets, etc.), we are interested in finding the value of $VaR_\alpha$ for, say, $\alpha=0.05$ (five percent), which tells us that there is a 5% chance that a loss beyond $VaR_{0.05}$ occurs in trading on the next day. This value is located in the left tail of the distribution and, by convention, it is reported as a positive number (its magnitude). In what follows we work with the corresponding $\alpha$-quantile of the return distribution and define $VaR_\alpha$ through:
$$
P[X\le VaR_\alpha] = \alpha
$$ where $X$ is a random variable and can, as in our assumed case, represent a 1-day return on a traded asset or a portfolio of assets.

Beautiful! We have the risk threshold provided as a number. But in practice, how should we calculate $VaR_\alpha$? Here some of the shortcomings of VaR come into the spotlight. VaR depends on the time-frame of the daily returns we examine: $VaR_\alpha$ as a number will therefore differ if you include the last 252 trading days in the estimation compared to the last 504 trading days (2 years). Intuitively, the more data (more days, years, etc.) we have, the better the estimation of $VaR_\alpha$, right? Yes and no. Keep in mind that the dynamics of the trading market change continuously. In the early 80s there was no high-frequency trading (HFT). It came to dominate trading traffic after the year 2000, and now HFT plays a key role (an influential factor) in the impact on prices. So it would be a bit unfair to compare $VaR_\alpha$ estimated in the 80s with what is going on right now. The parametric calculation of $VaR_\alpha$ also assumes that the distribution of returns is normal. Not the most beautiful assumption, but the most straightforward one for computing $VaR_\alpha$. Should we use $VaR_\alpha$? Yes, but with caution.

Okay then, having $VaR_\alpha$ calculated, we know how far the loss could reach at the $1-\alpha$ confidence level. The next super-important question in risk management, which every risk manager should ask or at least be interested in reporting, is: if the $VaR_\alpha$ event occurs on the next day, what loss should we expect to suffer (say, in dollars)? $VaR_\alpha$ is the threshold. Now we are interested in the expected loss given the exceedance. That is the definition of the expected shortfall, and it is based on the concept of conditional expectation as follows:
$$
E[X\ |\ X \le VaR_\alpha] = ES \ .
$$ In general, if our daily distribution of returns can be described by a function $f(x)$ representing its probability density function (pdf), then:
$$
ES = \frac{1}{\alpha} \int_{-\infty}^{VaR_\alpha} xf(x)dx \ .
$$

Given any data set of returns, $R_t\ (t=1,\ldots,N)$, calculated from our price history,
$$
R_t = \frac{P_{t+1}}{P_t} -1
$$ both numbers, $VaR_\alpha$ and $ES$, can in fact be calculated in two ways. The first method is based on the empirical distribution, i.e. using the data as given:
$$
VaR_\alpha = h_{i^*} \ \ \mbox{where}\ \ i^* = \min\left\{ k:\ \sum_{i=1}^{k} H_i(h_{i+1}-h_i) \ge \alpha \right\}
$$ where $H$ represents the normalized histogram of $R_t$ (i.e., its integral is equal to 1), $h_i$ are the bin locations, and $M$ is the number of histogram bins ($k < M$). Similarly for $ES$, we get:
$$
ES = \frac{1}{\alpha} \sum_{i=1}^{i^*} h_i H_i(h_{i+1}-h_i) \ .
$$ The second method is based on the same integrations but applied to the best fit to the histogram of $R_t$, with $f(x)$ taken to be a normal distribution. As we will see in the practical example below, both approaches return different values, and extra caution should be exercised when reporting the final risk measures.
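As a side note, both estimates can also be obtained in a few lines of Python; the sketch below uses simulated returns and reads the empirical figures directly from the sample rather than from a histogram, so treat it as an illustration independent of the Matlab code that follows:

import numpy as np
from scipy.stats import norm

np.random.seed(1987)
R = np.random.normal(0.0005, 0.02, 252)   # simulated daily returns
alpha = 0.05

# empirical estimates taken straight from the sample
VaR_emp = np.percentile(R, 100 * alpha)   # the alpha-quantile (a negative number)
ES_emp  = R[R <= VaR_emp].mean()          # average return beyond the VaR threshold

# parametric estimates from a fitted normal distribution
mu, sig = R.mean(), R.std(ddof=1)
VaR_fit = mu + sig * norm.ppf(alpha)
ES_fit  = mu - sig * norm.pdf(norm.ppf(alpha)) / alpha

print(VaR_emp, ES_emp)   # empirical VaR and ES (negative, i.e. losses)
print(VaR_fit, ES_fit)   # parametric VaR and ES; close to the empirical ones here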

Case Study: IBM Daily Returns in 1987

The year 1987 went down in history as one of the most memorable in stock trading. Why? Simply because of the so-called Black Monday on Oct 19, when markets recorded over 20% losses in a single trading day. Let's analyse the daily returns and their distribution for IBM stock, as seen by a risk manager after the close of the market on Thu, Dec 31, 1987. The risk manager is interested in the estimation of the 1-day 95% VaR threshold, i.e. $VaR_{0.05}$, and the expected shortfall if on the next trading day (Jan 4, 1988) the exceedance event were to occur. Therefore, let's assume that our portfolio is composed solely of IBM stock and that we have an allocation of capital of $C$ dollars in it as of Dec 31, 1987.

Using a record of historical trading, IBM.dat, of the form:

IBM
DATES 	      OPEN 	HIGH 	  LOW 	    CLOSE 	VOLUME
03-Jan-1984   0.000000 	0.000000  0.000000  30.437500 	0.000000
04-Jan-1984   0.000000 	0.000000  0.000000  30.968750 	0.000000
05-Jan-1984   0.000000 	0.000000  0.000000  31.062500 	0.000000
06-Jan-1984   0.000000 	0.000000  0.000000  30.875000 	0.000000
...
04-Mar-2011   0.000000 	0.000000  0.000000  161.830000 	0.000000
07-Mar-2011   0.000000 	0.000000  0.000000  159.930000 	0.000000

containing the close price of IBM stock, we extract, in Matlab, the data for the year 1987, then construct the vector of daily returns $R_t$ of the stock,

% Reading in the data
fn=['IBM.dat'];
ftsD=ascii2fts(fn,1,2);
range=[datenum('02-Jan-1987'),datenum('31-Dec-1987')];
drr=[datestr(range(1)),'::',datestr(range(2))];
fts=ftsD(drr);
data=fts2mat(fts,1);
 
% R_t vector
rdata=data(2:end,5)./data(1:end-1,5)-1;
 
% Plotting
figure(1);
subplot(2,1,1);
plot(data(:,1)-datenum('2-Jan-1987'),data(:,5));
xlim([0 365]); ylabel('IBM Close Price ($)');
subplot(2,1,2);
plot(data(2:end,1)-datenum('2-Jan-1987'),rdata);
xlim([0 365]); ylabel('R_t')
xlabel('t (days)')

and present the data graphically:

varesR

Having our data, let's find $VaR_{0.05}$ and the corresponding $ES$,

% Construct the normalized histogram
[H,h]=hist(rdata,100);
hist(rdata,100);
sum=0;
for i=1:length(H)-1
    sum=sum+(H(i)*(h(i+1)-h(i)));
end
figure(2)
H=H/sum;
bar(h,H,'FaceColor',[0.7 0.7 0.7],'EdgeColor',[0.7 0.7 0.7])
 
% Calculate VaR 95 based on empirical distribution
sum=0;
i=1;
while (i<length(H))
    sum=sum+(H(i)*(h(i+1)-h(i)));
    if(sum>=0.05) 
        break;
    end
    i=i+1;
end
VaR=h(i)
hold on; plot(h(i),0,'ro'); % mark VaR in the plot
 
% Fit the normal distribution to R_t data
[muhat,sigmahat] = normfit([rdata]');
Y = normpdf(h,muhat(1),sigmahat(1));
hold on; plot(h,Y,'b-'); % and plot the fit using blue line
 
% Find for the fitted N(muhat,sigmahat) VaR 95%
sum=0;
i=1;
while (i<length(h))
    sum=sum+(Y(i)*(h(i+1)-h(i)));
    if(sum>=0.05) 
        break;
    end
    i=i+1;
end
VaR_fit=h(i)
hold on; plot(h(i),0,'ro','markerfacecolor','r'); % mark VaR_fit
ivar=i;
 
% Calculate the Expected Shortfall 
%  based on fitted normal distribution
sum=0;
i=1;
while (i<=ivar)
    sum=sum+(h(i)*Y(i)*(h(i+1)-h(i)));
    i=i+1;
end
ES_fit=sum/0.05
hold on; plot(ES_fit,0,'ko','markerfacecolor','k');
 
% Add numbers to the plot
text(-0.23,25,['1-Day VaR and Expected Shortfall']);
text(-0.23,23.5,['as for 31/12/1987']);
text(-0.23,20,['VaR = ',num2str(VaR*-1,3),' (',num2str(VaR*-100,2),'%)']);
text(-0.23,18,['ES  = ',num2str(ES_fit*-1,3),' (',num2str(ES_fit*-100,2),'%)']);
text(-0.23,13,['\mu = ',num2str(muhat,2)]);
text(-0.23,11,['\sigma = ',num2str(sigmahat,2)]);
xlabel('R_t');

marking all numbers in the resulting plot:

VarES

In the plot, the filled red and black markers correspond to $VaR_{0.05}$ and $ES$, respectively, calculated using the fitted normal distribution. Both values are displayed in the plot as well. In addition, the plot contains a red open-circle marker denoting the value of $VaR_{0.05}$ obtained from the discrete integration of the empirical histogram. Please note how the choice of method influences the value of VaR.

The derived numbers tell the risk manager that in 95% of cases one should feel confident that the value of $VaR_{0.05}$ will not be exceeded in the trading of IBM on Jan 4, 1988 (the next trading day). $C\times VaR_{0.05} = 0.028C$ gives the loss amount (in dollars) at that threshold. And if the exceedance indeed occurred, the expected loss (in dollars) we would potentially suffer would be $C\times ES = 0.058C$.

Now it is easy to understand why the Black Swan event of Oct 19, 1987 falls into the category of the least likely moves in the markets. It places itself in the far left tail of both the empirical and the fitted distributions, denoting an impressive loss of 23.52% for IBM on that day. If we assume that the daily price changes are normally distributed, then the probability of a drop of at least 22 standard deviations is
$$
P(Z\le -22) = \Phi(-22) = 1.4 \times 10^{-107} \ .
$$ This is an astronomically rare event!
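A quick way to double-check a probability of that order (a throwaway snippet, assuming SciPy is available):

from scipy.stats import norm

print(norm.cdf(-22))   # ~1.4e-107, i.e. Phi(-22)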

They say that lightning does not strike twice, but if a 1-day drop of the same magnitude had occurred again in the market on Jan 4, 1988, our long position in IBM would have caused us to suffer not a loss of $0.058C$ dollars but approximately $0.23C$. Ahhhuc!

If there is No Risk, there is No Reward

Today I allowed myself to post something different. Something that could help you understand that the true risks in life are not only those related to the financial markets or portfolio management. I learned a lot from this video clip and now, every time I go back to it, over and over again, I find more encouragement and power to simply follow what is worth taking the risk for. Enjoy your weekend break!
