Quantitative Analysis, Risk Management, Modelling, Algo Trading, and Big Data Analysis

## Visualisation of N-Asset Portfolio in Matlab

Many of you who wished to use the Matlab’s Financial Toolbox for portfolio visualization most probably realised that something was wrong. Not every time the annualised risk and return for different time-series matched with what was expected. Below you can find an improved piece of code which does the job correctly.

The input sampling of your financial time-series now can be specified directly as daily, weekly, monthly, or quarterly and the risk-return diagram will be displayed with annualised results. Otherwise, you have an option, instead of annualised value, specify another time-horizon. All smooth and easy.

% Specialised plotting function for visualisation of portfolios % based on the Financial Toolbox's portfolio objects % % Modified by Pawel Lachowicz, 2014/03/01 % % Remarks: Essential corrections introduced in the calculation and plotting % of the annualised risk and returns coded incorrectly by the team at % The MathWorks, Inc.   function plotportfolio(varargin)   plottitle = varargin{1}; % plot title, e.g. 'Efficient Frontier' plotfreq = varargin{2}; % frequency of raw data (dt; or 'daily', etc.) plotfreq2= varargin{3}; % e.g. ’annualised’ plotfreq3= varargin{4}; plotlegend = [];   switch plotfreq2 case 'annualised' switch plotfreq case 'daily' periods=252; case 'weekly' periods=52; case 'monthly' periods=12; case 'quarterly' periods=4; end otherwise periods=plotfreq3; end   for i = 5:nargin plotinfo = varargin{i};   plottype = plotinfo{1}; plotrsk = plotinfo{2}; plotret = plotinfo{3}; if numel(plotinfo) > 3 plotlabel = plotinfo{4}; else plotlabel = []; end if numel(plotinfo) > 4 plotstyle = plotinfo{5}; if isempty(plotstyle) plotstyle = 'b'; end else if strcmpi(plottype,'line') plotstyle = 'b'; else plotstyle = 'g'; end end if numel(plotinfo) > 5 plotline = plotinfo{6}; if isempty(plotline) plotline = 1; end else if strcmpi(plottype,'line') plotline = 2; else plotline = []; end end   % line plot if strcmpi(plottype,'line') hold on; for k = 1:size(plotrsk,2) plot(plotrsk(:,k)*sqrt(periods),((plotret(:,k)+1).^periods-1), ... plotstyle, 'LineWidth', plotline); if i == 2 && k == 1 hold on end if ~isempty(plotlabel) && ~isempty(plotlabel{k}) plotlegend = [ plotlegend, {plotlabel{k}} ]; end end   % scatter plot else if any(plotstyle == '.') scatter(plotrsk*sqrt(periods),((plotret(:,k)+1).^periods-1),'or','Filled'); else scatter(plotrsk*sqrt(periods),((plotret(:,k)+1).^periods-1),'og','Filled'); end if i==2 hold on end if ~isempty(plotlabel) for k = 1:numel(plotrsk) if ~isempty(plotlabel{k}) text(plotrsk(k)*sqrt(periods)+0.005, ... ((plotret(k)+1).^periods-1), plotlabel{k},'FontSize', 9); end end end end end   if ~isempty(plotlegend) legend(plotlegend,'Location','SouthEast'); end if(plotfreq2=='annualised') xlabel('Annualised Volatility'); ylabel('Annualised Expected Returns'); else xlabel('Volatility'); ylabel('Returns'); end grid on   end

Now, you can call it with various parameters of your interest, matching the data directly, for instance:

plotportfolio('Portfolio',dt,'annualised',[], ... {'line', prsk, pret}, ... {'scatter',CashRsk,CashRet, {'Cash'}}, ... {'scatter', sqrt(diag(p.AssetCovar)), p.AssetMean, p.AssetList,'.r'});

For some awesome practical application of this function in the process of construction and analysis of $N$-asset portfolio grab Applied Portfolio Optimization with Risk Management using Matlab ebook today.

## Information Ratio and its Relative Strength for Portfolio Managers

The risk and return are the two most significant quantities investigated within the investment performance. We expect the return to be highest at the minimum risk. In practice, a set of various combinations is always achieved depending on a number of factors related to the investment strategies and market behaviours, respectively. The portfolio manager or investment analyst is interested in the application of most relevant tool in order to described the overall performance. The arsenal is rich but we need to know the character of outcomes we wish to underline prior to fetching the proper instruments from the workshop.

In general there are two schools of investment performance. The first one is strictly related to the analysis of time-series on the tick-by-tick basis. All I mean by tick is a time scale of your data, e.g. minutes or months. The analyst is concerned about changes in investments and risk tracked continuously. A good example of this sort of approach is measurement of Value-at-Risk and Expected Shortfall, Below Target Risk or higher-moments of Coskewness and Cokurtosis. It all supplements the n-Asset Portfolio Construction with Efficient Frontier diagnostic.

The second school in the investment analysis is a macro-perspective. It includes the computation on $n$-period rolling median returns (or corresponding quantities), preferably annualized for the sake of comparison. This approach carries more meaning behind the lines rather than a pure tick time-series analysis. For the former, we track a sliding changes in accumulation and performance while for the latter we focus more closely on tick-by-tick risks and returns. For the sake of global picture and performance (market-oriented vantage point), a rolling investigation of risk and return reveals mutual tracking factors between different considered scenarios.

In this article I scratch the surface of the benchmark-oriented analysis with a special focus on the Information Ratio (IR) as an investment performance-related measure. Based on the market data, I test the IR in action and introduce a new supplementary measure of Information Ratio Relative Strength working as a second dimension for IR results.

1. Benchmark-oriented Analysis

Let’s assume you a portfolio manager and you manage over last 10 years 11 investment strategies. Each strategy is different, corresponds to different markets, style of investing, and asset allocation. You wish to get a macro-picture how well your investment performed and are they beating the benchmark. You collect a set of 120 monthly returns for each investment strategy plus their benchmarks. The easiest way to kick off the comparison process is the calculation of $n$-period median (or mean) returns.

Annualized Returns

The rolling $n$Y ($n$-Year) median return is defined as:
$$\mbox{rR}_{j}(n\mbox{Y}) = \left[ \prod_{i=1}^{12n} (r_{i,j}+1)^{1/n} \right] -1 \ \ \ \ \ \mbox{for}\ \ \ j=12n,…,N\ \ (\mbox{at}\ \Delta j =1)$$
where $r$ are the monthly median returns meeting the initial data selection criteria (if any) and $N$ denotes the total number of data available (in our case, $N=120$).

Any pre-selection (pre-filtering) of the input data becomes more insightful when contrasted with the benchmark performance. Thereinafter, by the benchmark we can assume a relative measure we will refer to (e.g. S\&P500 Index for US stocks, etc.) calculated as the rolling $n$Y median. Therefore, for instance, the 1Y (rolling) returns can be compared against the corresponding benchmark (index). The selection of a good/proper benchmark belongs to us.

Active Returns

We define active returns on the monthly basis as the difference between actual returns and benchmark returns,
$$a_i = r_i – b_i ,$$
as available at $i$ where $i$ denotes a continuous counting of months over past 120 months.

Annualized Tracking Risk (Tracking Error)

We assume the widely accepted definition of the tracking risk as the standard deviation of the active returns. For the sake of comparison, we transform it the form of the annualized tracking risk in the following way:
$$\mbox{TR}_{j}(n\mbox{Y}) = \sqrt{12} \times \sqrt{ \frac{1}{12n-1} \sum_{i=1}^{12n} \left( a_{i,j}-\langle a_{i,j} \rangle \right)^2 }$$
where the square root of 12 secures a proper normalization between calculations and $j=12n,…,N$.

Annualized Information Ratio (IR)

The information ratio is often referred to as a variation or generalised version of Sharpe ratio. It tells us how much excess return ($a_i$) is generated from the amount of excess risk taken relative to the benchmark. We calculate the annualized information ratio by dividing the annualised active returns by the annualised tracking risk as follows:
$$\mbox{IR}_j(n\mbox{Y}) = \frac{ \left[ \prod_{i=1}^{12n} (a_{i,j}+1)^{1/n} \right] -1 } { TR_j(n\mbox{Y}) } \ \ \ \ \ \mbox{for}\ \ \ j=12n,…,N$$

We assume the following definition of the investment under-performance and out-performance, $P$, as defined based on IR:
$$P_j = \begin{cases} P_j^{-} =\mbox{under-performance}, & \text{if } \mbox{IR}_j(n\mbox{Y}) \le 0 \\ P_j^{+} =\mbox{out-performance}, & \text{if } \mbox{IR}_j(n\mbox{Y}) > 0 . \end{cases}$$

Information Ratio Relative Strength of Performance

Since $P_j$ marks in time only those periods of time (months) where a given quantity under investigation achieved better results than the benchmark or not, it does not tell us anything about the importance (weight; strength) of this under/over-performance. The case could be, for instance, to observe the out-performance for 65\% of time over 10 years but barely beating the benchmark in the same period. Therefore, we introduce an additional measure, namely, the IR relative strength defined as:
$$\mbox{IRS} = \frac{ \int P^{+} dt } { \left| \int P^{-} dt \right| } .$$
The IRS measures the area under $\mbox{IR}_j(n\mbox{Y})$ function when $\mbox{IR}_j(n\mbox{Y})$ is positive (out-performance) relative to the area cofined between $\mbox{IR}_j(n\mbox{Y})$ function and $\mbox{IR}_j(n\mbox{Y})=0$. Thanks to that measure, we are able to estimate the revelance of
$$\frac { n(P^{+}) } { n(P^{-}) + n(P^{+}) }$$
ratio, i.e. the fraction of time when the examined quantity displayed out-performance over specified period of time (e.g. 10 years of data within our analysis). Here, $n(P^{+})$ counts number of $P_j^{+}$ instances.

2. Case Study

Let’s demonstrate the aforementioned theory in practice. Making use of real data, we perform a benchmark-oriented analysis for all 11 investment strategies. We derive 1Y rolling median returns supplemented by the correspoding 1Y tracking risk and information ratio measures. The following figure presents the results for 1Y returns of for the Balanced Options:

Interestingly to notice, the Investment Strategy-2 (Inv-2) displays an increasing tracking risk between mid-2007 and mid-2009 corresponding to the Credit Global Financial Crisis (GFC). This consistent surge in TR is explained by worse than average performance of the individual options, most probably dictated by higher than expected volatility of assets within portfolio. The 1Y returns of Inv-2 were able to beat the benchmark over only 25% of time, i.e. about 2.25 years since Jun/30 2004. We derive that conclusion based on the quantitative inspection of the Information Ratio plot (bottom panel).

In order to understand the overall performance for all eleven investment strategies, we allow ourselves to compute the under-performance and out-performance ratios,
$$\frac { n(P^{-}) } { n(P^{-}) + n(P^{+}) } \ \ \ \mbox{and} \ \ \ \frac { n(P^{+}) } { n(P^{-}) + n(P^{+}) } ,$$
and plot these ratios in red and blue colour in the following figure, respectively:

Only two investment strategies, Inv-7 and Inv-11, managed to keep the out-performance time longer than their under-performance. Inv-8 denoted the highest ratio of under-performance to out-performance but also Inv-2 (consider above)performed equally badly.

The 3rd dimension to this picture delivers the investigation of the IR Relative Strength measure for all investment strategies. We derive and display the results in next figure:

The plot reveals close-to-linear correlation between two considered quantities. Inv-7 and Inv-11 strategies confirm their relative strength of out-performance while Inv-2 and Inv-8 their weakness in delivering strong returns over their short period of out-performance.

It is of great interest to compare Inv-3, Inv-4, and Inv-10 strategies. While relative strength of out-performance of Inv-4 remains lower as contrasted with the latter ones (and can be explained by low volatility within this sort of investment strategy), Inv-10 denotes much firmer gains over benchmark than Inv-3 despite the fact that their out-performance period was nearly the same ($\sim$40%). Only the additional cross-correlation of IR Relative Strength outcomes as a function of the average return year-over-year is able to address the drivers of this observed relation.

Discuss this topic on QaR Forum

You can discuss the current post with other quants within a new Quant At Risk Forum.

## Create a Portfolio of Stocks based on Google Finance Data fed by Quandl

There is an old saying Be careful what you wish for. Many quants and traders wished for a long time for a better quality of publicly available data. It wasn’t an easy task to accomplish but with a minimum of effort, it has become possible to get all what you truly dreamt of. When I started my experience with daily backtesting of my algo strategies I initially relied on Yahoo! data. Yeah, easily accessible but of low quality. Since the inception of Google Finance the same dream had been reborn. More precise data, covering more aspects of trading, platforms, markets, instruments. But how to get them all?

Quandl.com provides us with an enormous access to an enormous database. Just visit the webpage to find out what I mean by barley putting my thoughts all together in one sentence. In fact, the portal’s accessibility via different programming languages (e.g. Matlab, C++, Python, etc.) makes that database more alive than ever. The space of parameters the quants get an access to becomes astronomical in number!

Alleluja! Let’s ride this horse. Enough of watching the online charts changing a direction of our investing moods with a one-mintute time resolution. We were born to be smarter than stochastic processes taking control over our daily strategies. In this short post, making use of the Matlab R2013a platform, I want to show you how quickly you can construct any portfolio of US stocks taken from the Dow Jones Index universe (as an example). We will use the Google Finance stock data fed by Quandl.com.

Google Finance Stock Data for Model Feeding

Similar to Yahoo! Finance’s resources, Quandl.com has an own system of coding the names of all US stocks. Therefore, we need to know it before doing any further step. The .csv file containing all of the codes for Quandl’s stock data can be obtained via API for Stock Data webpage. The file contains the header of the form:

Ticker Stock Name Price Code Ratios Code In Market?

linking Google Finance ticker with Quandl’s code (Price Code). Doing some extra job with data filtering, I created the pre-processed set of all tickers excluding those non-active in the markets and removing all invalid data. You can download it here QuandlStockCodeListUS.xlsx as an Excel workbook.

The second list we need to have is a list of potential stock we want to deal with in our portfolio. As for today, the current list of Dow Jones Index’s components you can download here: dowjones.lst. Let’s start with it:

1 2 3 4 5 6 7 8 9 10 11 12 13 % Create a Portfolio from Dow Jones Stocks based on Google Finance % Data fed by Quandl.com % % (c) 2013 QuantAtRisk.com, by Pawel Lachowicz     close all; clear all; clc;   % Read the list of Dow Jones components fileID = fopen('dowjones.lst'); tmp = textscan(fileID, '%s'); fclose(fileID); dowjc=tmp{1}; % a list as a cell array

The guys from Mathworks suggest to make a new and good practice of using textscan command (lines 10-12) instead of dataread for reading data from the text files. The latter will be removed from the next Matlab release.

Next, we need to import the current list of tickers for US stocks and their corresponding codes in the Quandl’s database. We will make the use of my pre-processed data set of QuandlStockCodeListUS.xlsx as mentioned earlier in the following way:

15 16 17 % Read in the list of tickers and internal code from Quandl.com [ndata, text, alldata] = xlsread('QuandlStockCodeListUS.xlsx'); quandlc=text(:,1); % again, as a list in a cell array

For the simplicity of our example, we will be considering all 30 stocks as a complete sample set of stocks in our portfolio. A portfolio construction is not solely a selection of stocks. It usually links a time period of the past performance of each individual stock. In the following move, we will collect daily stock returns over last 365 calendar year (corresponding to about 252 trading days).

This framework is pretty handful. Why? If you build or run live a model, most probably your wish is to update your database when US markets are closed. You fetch Quandl.com for data. Now, if your model employs risk management or portfolio optimisation as an integrated building-block, it’s very likely, you are interested in risk(-return) estimation, e.g. by calculating Value-at-Risk (VaR) or similar risk measures. Of course, the choice of the time-frame is up to you based on your strategy.

Having that said,

19 20 21 % fetch stock data for last 365 days date2=datestr(today,'yyyy-mm-dd') % from date1=datestr(today-365,'yyyy-mm-dd') % to

what employs an improved Matlab’s command of datestr with formatting as a parameter. It’s really useful when it comes to data fetching from Quandl.com as the format of the date they like to see in your code is yyyy-mm-dd. The above two commands return in Matlab the following values:

date2 = 2013-11-03 date1 = 2012-11-03

as for the time of writing this article (string type).

Excellent! Let’s catch some fishes in the pond! Before doing it we need to make sure that Quandl’s software is installed and linked to our Matlab console. In order to download Quandl Matlab package just visit this website and follow the installation procedure at the top of the page. Add proper path in Matlab environment too. The best practice is to register yourself on Quandl. That allows you to rise the daily number of data call from 50 to 500. If you think you will need more data requests per day, from the same webpage you can send your request.

When registered, in your Profile section on Quandi, you will gain an access to your individual Authorization Token, for example:
You will need to replace ‘yourauthenticationtoken’ in the Matlab code (see line 35) with your token in order to get the data.

Fetching Stock Data

From this point the road stops to wind. For all 30 stocks belonging to the current composition of Dow Jones Index, we get their Open, High, Low, Close prices (as retrieved from Google Finance) in the following way:

23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 % create an empty array for storing stock data stockd={}; % scan the Dow Jones tickers and fetch the data from Quandl.com for i=1:length(dowjc) for j=1:length(quandlc) if(strcmp(dowjc{i},quandlc{j})) fprintf('%4.0f %s\n',i,quandlc{j}); % fetch the data of the stock from Quandl % using recommanded Quandl's command and % saving them directly into Matlab's FTS object (fts) fts=0; [fts,headers]=Quandl.get(quandlcode{j},'type','fints', ... 'authcode','yourauthenticationtoken',... 'start_date',date1,'end_date',date2); stockd{i}=fts; % entire FTS object in an array's cell end end end % use 'save' command to save all actual variables from memory on the disc, % if you need it for multiple code running % save data

A code in line 35 to 37 is an example how you call Quandi for data and fetch them into FINTS (financial time series object from the Financial Toolbox in Matlab). So easy. If you wish to change the time frequency, apply transformation or get the data not in FINTS but .csv format, all is available for you as a parameter at the level of calling Quandl.get() command (explore all possibilities here).

Having that, you can easily plot the Google Finance data, for instance, of IBM stock straight away: