
Information Ratio and its Relative Strength for Portfolio Managers

Risk and return are the two most significant quantities investigated within investment performance. We expect the return to be highest at the minimum risk. In practice, a whole set of combinations is achieved, depending on a number of factors related to the investment strategies and market behaviour. The portfolio manager or investment analyst is interested in applying the most relevant tools in order to describe the overall performance. The arsenal is rich, but we need to know the character of the outcomes we wish to underline before fetching the proper instruments from the workshop.

In general, there are two schools of investment performance analysis. The first one is strictly related to the analysis of time-series on a tick-by-tick basis. All I mean by a tick is the time scale of your data, e.g. minutes or months. The analyst is concerned with changes in investments and risk tracked continuously. Good examples of this sort of approach are the measurement of Value-at-Risk and Expected Shortfall, Below Target Risk, or the higher moments of Coskewness and Cokurtosis. It all supplements the n-Asset Portfolio Construction with the Efficient Frontier diagnostic.

The second school of investment analysis takes a macro-perspective. It includes the computation of $n$-period rolling median returns (or corresponding quantities), preferably annualized for the sake of comparison. This approach carries more meaning between the lines than a pure tick time-series analysis: in the former we track sliding changes in accumulation and performance, while in the latter we focus more closely on tick-by-tick risks and returns. For the sake of a global picture of performance (a market-oriented vantage point), a rolling investigation of risk and return reveals mutual tracking factors between the different scenarios under consideration.

In this article I scratch the surface of the benchmark-oriented analysis with a special focus on the Information Ratio (IR) as an investment performance-related measure. Based on the market data, I test the IR in action and introduce a new supplementary measure of Information Ratio Relative Strength working as a second dimension for IR results.

1. Benchmark-oriented Analysis

Let’s assume you are a portfolio manager who has managed 11 investment strategies over the last 10 years. Each strategy is different: it corresponds to different markets, styles of investing, and asset allocations. You wish to get a macro-picture of how well your investments performed and whether they beat the benchmark. You collect a set of 120 monthly returns for each investment strategy plus their benchmarks. The easiest way to kick off the comparison process is the calculation of $n$-period median (or mean) returns.

Annualized Returns

The rolling $n$Y ($n$-Year) median return is defined as:
$$ \mbox{rR}_{j}(n\mbox{Y}) = \left[ \prod_{i=1}^{12n} (r_{i,j}+1)^{1/n} \right] -1 \ \ \ \ \ \mbox{for}\ \ \ j=12n,…,N\ \ (\mbox{at}\ \Delta j =1) $$
where $r_{i,j}$ are the monthly returns meeting the initial data selection criteria (if any) and $N$ denotes the total number of data points available (in our case, $N=120$).

Any pre-selection (pre-filtering) of the input data becomes more insightful when contrasted with the benchmark performance. Hereinafter, by the benchmark we understand a relative measure we refer to (e.g. the S\&P500 Index for US stocks, etc.) calculated as the rolling $n$Y median. Therefore, for instance, the 1Y (rolling) returns can be compared against the corresponding benchmark (index). The selection of a good and proper benchmark is up to us.

Active Returns

We define active returns on a monthly basis as the difference between actual returns and benchmark returns,
$$ a_i = r_i - b_i \ , $$
where $i$ denotes a continuous counting of months over the past 120 months.

Annualized Tracking Risk (Tracking Error)

We adopt the widely accepted definition of the tracking risk as the standard deviation of the active returns. For the sake of comparison, we transform it into the annualized tracking risk in the following way:
$$ \mbox{TR}_{j}(n\mbox{Y}) = \sqrt{12} \times \sqrt{ \frac{1}{12n-1} \sum_{i=1}^{12n} \left( a_{i,j}-\langle a_{i,j} \rangle \right)^2 } $$
where the square root of 12 annualizes the monthly tracking risk and $j=12n,…,N$.

Annualized Information Ratio (IR)

The Information Ratio is often referred to as a variation or generalised version of the Sharpe ratio. It tells us how much excess return ($a_i$) is generated per unit of excess risk taken relative to the benchmark. We calculate the annualized Information Ratio by dividing the annualized active return by the annualized tracking risk as follows:
$$ \mbox{IR}_j(n\mbox{Y}) = \frac{ \left[ \prod_{i=1}^{12n} (a_{i,j}+1)^{1/n} \right] -1 } { \mbox{TR}_j(n\mbox{Y}) } \ \ \ \ \ \mbox{for}\ \ \ j=12n,…,N \ . $$
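For illustration, here is a minimal Matlab sketch of the above pipeline under the assumption that r and b are column vectors of monthly strategy and benchmark returns; these variable names are mine and chosen only for this example.

% rolling 1Y (n=1) annualized active return, tracking risk and Information Ratio
% assuming r, b: column vectors of N=120 monthly returns (names are illustrative)
n=1; w=12*n;                      % rolling window of 12n months
a=r-b;                            % monthly active returns
N=length(r);
aR=NaN(N,1); TR=NaN(N,1); IR=NaN(N,1);
for j=w:N
    aw=a(j-w+1:j);                % active returns within the window
    aR(j)=prod(1+aw)^(1/n)-1;     % annualized active return
    TR(j)=sqrt(12)*std(aw);       % annualized tracking risk (std uses 1/(12n-1))
    IR(j)=aR(j)/TR(j);            % annualized Information Ratio
end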

We assume the following definition of investment under-performance and out-performance, $P$, based on IR:
$$ P_j = \begin{cases} P_j^{-} = \mbox{under-performance}, & \text{if } \mbox{IR}_j(n\mbox{Y}) \le 0 \\ P_j^{+} = \mbox{out-performance}, & \text{if } \mbox{IR}_j(n\mbox{Y}) > 0 \ . \end{cases} $$

Information Ratio Relative Strength of Performance

Since $P_j$ only marks those periods (months) in which a given quantity under investigation achieved better results than the benchmark or not, it does not tell us anything about the importance (weight; strength) of this under/out-performance. One could, for instance, observe out-performance for 65\% of the time over 10 years while barely beating the benchmark in that period. Therefore, we introduce an additional measure, namely the IR Relative Strength, defined as:
$$ \mbox{IRS} = \frac{ \int P^{+} dt } { \left| \int P^{-} dt \right| } \ . $$
The IRS measures the area under the $\mbox{IR}_j(n\mbox{Y})$ function where $\mbox{IR}_j(n\mbox{Y})$ is positive (out-performance) relative to the area confined between the $\mbox{IR}_j(n\mbox{Y})$ function and $\mbox{IR}_j(n\mbox{Y})=0$ where it is negative. Thanks to that measure, we are able to estimate the relevance of the ratio
$$ \frac { n(P^{+}) } { n(P^{-}) + n(P^{+}) } \ , $$
i.e. the fraction of time during which the examined quantity displayed out-performance over the specified period (e.g. the 10 years of data within our analysis). Here, $n(P^{+})$ counts the number of $P_j^{+}$ instances.
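In code, the IRS and the out-performance time fraction can be approximated directly from the rolling IR series, e.g. continuing the sketch above, where IR holds the rolling Information Ratio values.

% discrete approximation of the IR Relative Strength and of the
% fraction of time spent out-performing the benchmark
ir=IR(~isnan(IR));                        % drop the rolling warm-up period
IRS=sum(ir(ir>0))/abs(sum(ir(ir<=0)));    % area(+) relative to |area(-)|
fracOut=sum(ir>0)/length(ir);             % n(P+)/[n(P-)+n(P+)]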

2. Case Study

Let’s demonstrate the aforementioned theory in practice. Making use of real data, we perform a benchmark-oriented analysis for all 11 investment strategies. We derive 1Y rolling median returns supplemented by the corresponding 1Y tracking risk and Information Ratio measures. The following figure presents the results for the 1Y returns of the Balanced Options:

It is interesting to notice that Investment Strategy-2 (Inv-2) displays an increasing tracking risk between mid-2007 and mid-2009, corresponding to the Global Financial Crisis (GFC). This consistent surge in TR is explained by worse-than-average performance of the individual options, most probably dictated by higher-than-expected volatility of the assets within the portfolio. The 1Y returns of Inv-2 were able to beat the benchmark over only 25% of the time, i.e. about 2.25 years since Jun 30, 2004. We derive that conclusion from a quantitative inspection of the Information Ratio plot (bottom panel).

In order to understand the overall performance for all eleven investment strategies, we allow ourselves to compute the under-performance and out-performance ratios,
$$ \frac { n(P^{-}) } { n(P^{-}) + n(P^{+}) } \ \ \ \mbox{and} \ \ \ \frac { n(P^{+}) } { n(P^{-}) + n(P^{+}) } \ , $$
and plot these ratios in red and blue colour in the following figure, respectively:

Only two investment strategies, Inv-7 and Inv-11, managed to keep their out-performance time longer than their under-performance. Inv-8 denoted the highest ratio of under-performance to out-performance, but Inv-2 (considered above) performed equally badly.

The investigation of the IR Relative Strength measure delivers the 3rd dimension to this picture. We derive and display the results in the next figure:

The plot reveals a close-to-linear correlation between the two considered quantities. The Inv-7 and Inv-11 strategies confirm their relative strength of out-performance, while Inv-2 and Inv-8 confirm their weakness in delivering strong returns over their short periods of out-performance.

It is of great interest to compare the Inv-3, Inv-4, and Inv-10 strategies. While the relative strength of out-performance of Inv-4 remains lower when contrasted with the other two (and can be explained by the low volatility within this sort of investment strategy), Inv-10 denotes much firmer gains over the benchmark than Inv-3, despite the fact that their out-performance periods were nearly the same ($\sim$40%). Only an additional cross-correlation of the IR Relative Strength outcomes as a function of the average year-over-year return would be able to address the drivers of this observed relation.


Black Swan and Extreme Loss Modeling

When I read Nassim Nicholas Taleb’s book The Black Swan, my mind was captured by the beauty of extremely rare events and, concurrently, devastated by the message the book sent: the non-computability of the probability of consequential rare events using scientific methods (owing to the very nature of small probabilities). I rushed to the local library to find out what had been written on the subject. Surprisingly, I discovered the book by Embrechts, Kluppelberg & Mikosch on Modelling Extremal Events for Insurance and Finance, which appeared to me very inaccessible, loaded with heavy mathematical theorems and proofs, and with a negligible number of practical examples. I left it on the shelf to gather dust for a long while, until last week, when a fresh drive to decompose the problem came back to me.

In this article I will try to take you on a short but efficient journey through part of the classical extreme value theory, namely the fluctuations of maxima, and complete the story with easy-to-follow procedures on how one may simulate occurrences of extremely rare losses in financial return series. Having this experience, I will briefly discuss how to incorporate the resulting model into future stock returns.

1. The Theory of Fluctuations of Maxima

Let’s imagine we have rich historical data (time-series) of returns for a specific financial asset or portfolio of assets. A good and easy example is the daily rate of return, $R_t$, for a stock traded e.g. at the NASDAQ Stock Market,
$$ R_t = \frac{P_t}{P_{t-1}} - 1 \ , $$
where $P_t$ and $P_{t-1}$ denote the stock price on day $t$ and $t-1$, respectively. The longer the time coverage, the more valuable the information that can be extracted. Given the time-series of daily stock returns, $\{R_i\}\ (i=1,…,N)$, we can create a histogram, i.e. the distribution of returns. By a rare event or, more precisely here, a rare loss we will refer to the returns placed in the far left tail of the distribution. As an assumption we also agree that $R_1,R_2,…$ is a sequence of iid non-degenerate rvs (random variables) with a common df (distribution function) $F$. We define the fluctuations of the sample maxima as:
$$ M_1 = R_1, \ \ \ M_n = \max(R_1,…,R_n) \ \mbox{for}\ \ n\ge 2 \ . $$
That simply says that for any time-series $\{R_i\}$ there is one maximum, corresponding to the rv (random variable) with the most extreme value. Since the main line of this post is the investigation of maximum losses in return time-series, we are entitled to think about the negative values (losses) in terms of maxima (and therefore carry over the theoretical understanding) thanks to the identity:
$$ \min(R_1,…,R_n) = -\max(-R_1,…,-R_n) \ . $$
The distribution function of the maximum $M_n$ is given as:
$$ P(M_n\le x) = P(R_1\le x, …, R_n\le x) = P(R_1\le x)\cdots P(R_n\le x) = F^n(x) $$
for $x\in\Re$ and $n\in\mbox{N}$.

What the extreme value theory first ‘investigates’ are the limit laws for the maxima $M_n$. An important question emerges here: is there, somewhere out there, any distribution which satisfies for all $n\ge 2$ the identity in law
$$ \max(R_1,…,R_n) \stackrel{d}{=} c_nR + d_n $$
for appropriate constants $c_n>0$ and $d_n\in\Re$ or, simply speaking, which classes of distributions $F$ are closed under maxima? The theory next defines the max-stable distribution: a random variable $R$ is called max-stable if it satisfies the aforegoing relation for iid $R_1,…,R_n$. If we assume that $\{R_i\}$ is a sequence of iid max-stable rvs, then:
$$ R \stackrel{d}{=} c_n^{-1}(M_n-d_n) $$
and one can say that every max-stable distribution is a limit distribution for maxima of iid rvs. That brings us to the fundamental Fisher-Tippett theorem, saying that if there exist constants $c_n>0$ and $d_n\in\Re$ such that:
$$ c_n^{-1}(M_n-d_n) \stackrel{d}{\rightarrow} H, \ \ n\rightarrow\infty\ , $$
then $H$ must be of the type of one of the three so-called standard extreme value distributions, namely: Fréchet, Weibull, and Gumbel. In this post we will only be considering the Gumbel distribution $G$ with the corresponding probability density function (pdf) $g$ given as:
$$ G(z;\ a,b) = e^{-e^{-z}} \ \ \mbox{for}\ \ z=\frac{x-a}{b}, \ x\in\Re $$
and
$$ g(z;\ a,b) = b^{-1} e^{-z}e^{-e^{-z}} \ , $$
where $a$ and $b$ are the location parameter and scale parameter, respectively. Having defined the extreme value distribution and being now equipped with a better understanding of the theory, we are ready for a test drive over the daily roads of profits and losses in the trading markets. This is the moment which separates men from boys.

2. Gumbel Extreme Value Distribution for S&P500 Universe

As usual, we start with an entrée. Our goal is to find the empirical distribution of maxima (i.e. maximal daily losses) for all stocks belonging to the S&P500 universe between 3-Jan-1984 and 8-Mar-2011. There were $K=954$ stocks traded within this period and their data can be downloaded here as a sp500u.zip file (23.8 MB). The full list of stock names is provided in the sp500u.lst file. Performing the data processing in Matlab, we first need to compute a vector storing the daily returns for each stock, and next find the corresponding minimal value $M_n$, where $n$ stands for the length of each return vector:

% Black Swan and Extreme Loss Modeling
%  using Gumbel distribution and S&P500 universe
% (c) 2013 QuantAtRisk, by Pawel Lachowicz
clear all; close all; clc;
% read a list of stock names
StockNames=dataread('file',['sp500u.lst'],'%s','delimiter', '\n');
K=length(StockNames); % the number of stocks in the universe
% path to data files (assumed location of the unzipped price files)
dpath='./sp500u/';
fprintf('data reading and preprocessing..\n');
R={};  % cell array collecting the daily return vectors
for si=1:K
    % --stock name
    stock=StockNames{si};
    fprintf('%4.0f  %7s\n',si,stock);
    % --load data (assumed naming convention: one ascii file per stock)
    n=[dpath,stock,'.dat'];
    % check for NULL and change to NaN (using 'sed' command
    % in Unix/Linux/MacOS environment)
    cmd=['sed -i ''s/NULL/NaN/g''',' ',n]; [status,result]=system(cmd);
    % construct FTS object for daily data
    % (assumed layout: description in row 1, column headers in row 2)
    fts=ascii2fts(n,1,2);
    % fill any missing values denoted by NaNs
    fts=fillts(fts,'linear');
    % extract the close price of the stock
    % (assumed column name of CLOSE in the data files)
    cp=fts2mat(fts.CLOSE,0);
    % calculate a vector with daily stock returns and store it in
    % the cell array
    R{si}=cp(2:end)./cp(1:end-1)-1;
end
% find the minimum daily return value for each stock
Rmin=[];
for si=1:K
    Mn=min(R{si},[],1);
    Rmin=[Rmin; Mn];
end

Having that ready, we fit the data with the Gumbel function which (as we believe) would describe the distribution of maximal losses in the S&P500 universe best:

% fit the empirical distribution with Gumbel distribution and
% estimate the location, a, and scale, b, parameter
paramEstsMinima=evfit(Rmin);   % maximum likelihood estimates [a b]
% plot the distribution (histogram of minimal daily returns;
% the number of bins is an assumption of this sketch)
[counts,xout]=hist(Rmin,25);
h=bar(xout,counts);
set(h,'FaceColor',[0.7 0.7 0.7],'EdgeColor',[0.6 0.6 0.6]);
% add a plot of Gumbel pdf scaled to the histogram counts
x=-1:0.01:0;
y=evpdf(x,paramEstsMinima(1),paramEstsMinima(2))*(xout(2)-xout(1))*length(Rmin);
line(x,y,'color','r'); box on;
text(-1,140,['a = ',num2str(paramEstsMinima(1),3)]);
text(-1,130,['b = ',num2str(paramEstsMinima(2),3)]);
xlim([-1 0]);

The maximum likelihood estimates of the parameters $a$ and $b$, and the corresponding 95% confidence intervals, can be found as follows:

>> [par,parci]=evfit(Rmin)
par =
   -0.2265    0.1135
parci =
   -0.2340    0.1076
   -0.2190    0.1197

That brings us to a visual representation of our analysis:


This is a very important result. It communicates that the most probable extreme daily loss, given by the location parameter $a$ of the fitted Gumbel distribution, is equal to about -22.6%. However, the left tail of the fitted Gumbel distribution extends down to nearly -98%, although the probability of the occurrence of such a massive daily loss is rather low.

On the other hand, this value of -22.6% is surprisingly close to the trading down-movement in the markets on Oct 19, 1987, known as Black Monday, when the Dow Jones Industrial Average (DJIA) dropped by 508 points to 1738.74, i.e. by 22.61%!

3. Blending Extreme Loss Model with Daily Returns of a Stock

You probably wonder how we can include the results coming from the Gumbel modeling in the prediction of rare losses in the future daily returns of a particular stock. This can be done pretty straightforwardly by combining the best-fitted model (pdf) for extreme losses with the stock’s pdf. To do it properly we need to employ the concept of mixture distributions. Michael B. Miller, in his book Mathematics and Statistics for Financial Risk Management, provides a clear idea of this procedure. In our case, the mixture density function $f(x)$ can be denoted as:
$$ f(x) = w_1 g(x) + (1-w_1) n(x) \ , $$
where $g(x)$ is the Gumbel pdf, $n(x)$ represents the fitted stock pdf, and $w_1$ marks the weight (influence) of $g(x)$ in the resulting overall pdf.

In order to illustrate this process, let’s select one stock from our S&P500 universe, say Apple Inc. (NASDAQ: AAPL), and fit its daily returns with a normal distribution:

% AAPL daily returns (3-Jan-1984 to 11-Mar-2011)
% (assumed here to be stored in a column vector ret; the number of
%  histogram bins is a choice of this sketch)
[counts,xout]=hist(ret,100);
h=bar(xout,counts);
set(h,'FaceColor',[0.7 0.7 0.7],'EdgeColor',[0.6 0.6 0.6]);
% fit the normal distribution and plot the fit
[mu,sig]=normfit(ret);
x =-1:0.01:1;
y=normpdf(x,mu,sig)*(xout(2)-xout(1))*length(ret);  % pdf scaled to counts
hold on; line(x,y,'color','r');
xlim([-0.2 0.2]); ylim([0 2100]);


where the red line represents the fit with a mean of $\mu=0.0012$ and a standard deviation of $\sigma=0.0308$.

We can obtain the mixture distribution $f(x)$ executing a few more lines of code:

% Mixture Distribution Plot
x=-1:0.01:1;                           % grid with step dx=0.01
pdf1=evpdf(x,paramEstsMinima(1),paramEstsMinima(2));  % Gumbel pdf, g(x)
pdf2=normpdf(x,mu,sig);                % fitted normal pdf, n(x)
w1=0.001;                              % enter your favorite value, e.g. 0.001
w2=1-w1;
pdfmix=w1*(pdf1*0.01)+w2*(pdf2*0.01);  % note: sum(pdfmix)=1 as expected
plot(x,pdfmix); xlim([-0.6 0.6]);

It is important to note that our modeling is based on the $w_1$ parameter. It can be intuitively understood as follows. Let’s say that we choose $w_1=0.01$. That would mean that the Gumbel pdf contributes 1% to the overall pdf. In the following section we will see that if a random variable is drawn from the distribution given by $f(x)$, $w_1=0.01$ simply means (not exactly but to a sufficient approximation) that there is a 99% chance of drawing this variable from $n(x)$ and only a 1% chance of drawing it from $g(x)$. The dependence of $f(x)$ on $w_1$ is illustrated in the next figure:


It is well visible that a selection of $w_1>0.01$ would be a significant contributor to the left tail, making it fat. This is not what is observed in the empirical distribution of daily returns for AAPL (and, in general, for the majority of stocks), therefore we rather expect $w_1$ to be much smaller than 1%.
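The dependence itself can be reproduced with a few extra lines looping over a handful of $w_1$ values; this sketch assumes x, pdf1, and pdf2 from the previous snippet, and the particular grid of $w_1$ values is my own choice.

% dependence of the mixture pdf f(x) on the weight w1
figure; hold on; box on;
for w1=[0.0001 0.001 0.01 0.05]
    pdfmix=w1*(pdf1*0.01)+(1-w1)*(pdf2*0.01);
    plot(x,pdfmix);
end
set(gca,'YScale','log');   % log scale exposes the fattening left tail
xlim([-0.6 0.6]);
legend('w_1=0.0001','w_1=0.001','w_1=0.01','w_1=0.05');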

4. Drawing Random Variables from Mixture Distribution

A short break between the entrée and the main course we fill with a sip of red wine. Having the discrete form of $f(x)$, we would like to be able to draw a random variable from this distribution. Again, this is easy. Following a general recipe, for instance the one given in Chapter 12.2.2 of Philippe Jorion’s book Value at Risk: The New Benchmark for Managing Financial Risk, we wish to use the concept of the inverse transform method. In the first step we use the output (a random variable) coming from a pseudo-random generator drawing its rvs from the uniform distribution $U(x)$. This rv is always between 0 and 1, and in the last step it is projected onto the cumulative distribution of our interest, $F(x)$, which in our case corresponds to the cumulative distribution of the $f(x)$ pdf. Finally, we read out the corresponding value on the x-axis: a rv drawn from the $f(x)$ pdf. Philippe illustrates that procedure more intuitively:

Drawing a Random Variable Process

This method works smoothly when we know the analytical form of $F(x)$. However, if this is not on the menu, we need to use a couple of technical skills. First, we calculate $F(x)$ based on $f(x)$. Next, we set a very fine grid for the $x$ domain, and we perform interpolation between the given data points of $F(x)$.

% find cumulative pdf, F(x)
F=[]; s=0;
for i=1:length(pdfmix)
    s=s+pdfmix(i);          % running sum approximating the cdf
    F=[F; x(i) s];
end
plot(F(:,1),F(:,2));
xlim([-1 1]); ylim([-0.1 1.1]);
% perform interpolation of cumulative pdf using very fine grid
xi=-1:0.0001:1;
yi=interp1(F(:,1),F(:,2),xi,'linear'); % use linear interpolation method
hold on; plot(xi,yi);

The second sort of difficulty lies in finding a good match between the rv drawn from the uniform distribution and the approximated values of our $F(x)$. That is why a very fine grid is required, supplemented with some matching techniques. The following code that I wrote deals with this problem pretty efficiently:

% draw a random variable from f(x) pdf: xi(row)
RV=[];
for k=1:(252*40)
    u=rand;                     % rv drawn from the uniform distribution U(0,1)
    [~,row]=min(abs(yi-u));     % match u with the closest value of F(x)
    % therefore, xi(row) is a number representing a rv
    % drawn from f(x) pdf; we store 252*40 of those
    % new rvs in the following matrix:
    RV=[RV; xi(row) yi(row)];
end
% mark all corresponding rvs on the cumulative pdf
hold on; plot(RV(:,1),RV(:,2),'rx');

Finally, as the main course we get and verify the distribution of a large number of new rvs drawn from $f(x)$ pdf. It is crucial to check whether our generating algorithm provides us with a uniform coverage across the entire $F(x)$ plot,


where, in order to get more reliable (statistically) results, we generate 10080 rvs which correspond to the simulated 1-day stock returns for 252 trading days times 40 years.

5. Black Swan Detection

A -22% collapse in the markets on Oct 19, 1987 served as the day when the name of the Black Swan event took its birth, or at least was reinforced in the financial community. Are black swans extremely rare? It depends. If you live, for example, in Perth, Western Australia, you can see a lot of them wandering around. So what defines an extremely rare loss in the sense of a financial event? Let’s assume, by definition, that by a Black Swan event we will understand a daily loss of 20% or more. If so, using the procedure described in this post, we are tempted to pass from the main course to dessert.

Our modeling concentrates on finding the most proper contribution of $w_1g(x)$ to the resulting $f(x)$ pdf. As an outcome of a few runs of Monte Carlo simulations with different values of $w_1$, we find that for $w_1=[0.0010,0.0005,0.0001]$ we detect in the simulations, respectively, 9, 5, and 2 events (rvs) displaying a one-day loss of 20% or more.
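Counting such events in a simulated sample is a one-liner; the sketch below assumes the matrix RV of drawn rvs from the earlier snippet.

% count simulated one-day losses of 20% or more (Black Swan events)
nBS=sum(RV(:,1)<=-0.20);
fprintf('Black Swans detected: %d out of %d simulated daily returns\n',...
        nBS,size(RV,1));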

Therefore, the simulated daily returns for AAPL, assuming $w_1=0.0001$, generate in the distribution two Black Swan events, i.e. one event per 5040 trading days, or one per 20 years:

Black Swans in future AAPL returns

That result agrees quite well with what has been observed so far, i.e. including Black Monday in 1987 and the Flash Crash in intra-day trading on May 6, 2010 for some of the stocks.


I am grateful to Peter Urbani from New Zealand for directing my attention towards Gumbel distribution for modeling very rare events.

Dutch Book: Making a Riskless Profit

If you think there is no way to make a riskless profit, think again. This concept is known as a Dutch Book. First, let me recall the definition of probability stated as odds. Given a probability $P(E)$ of an event $E$, the odds for $E$ are equal to
$$ P(E)/[1-P(E)] \ . $$
Given odds for $E$ of a:b, the implied probability of $E$ is $a/(a+b)$. Conversely, the odds against $E$ are
$$ [1-P(E)]/P(E) \ . $$
Thus, given odds against $E$ of a:b, the implied probability of $E$ is $b/(a+b)$.

Now, suppose John places a $\$100$ bet on $X$ at odds of 10:1 against $X$, and later he is able to place a $\$600$ bet against $X$ at odds of 1:1 against $X$. If $X$ occurs, he collects $\$1000$ from the first bet and loses the $\$600$ stake of the second; if $X$ does not occur, he loses the $\$100$ stake but collects $\$600$ from the second bet. Whatever the outcome of $X$, John makes a riskless profit equal to $\$400$ if $X$ occurs or $\$500$ if $X$ does not occur, because the implied probabilities are inconsistent.

John is said to have made a Dutch Book in $X$.
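As a quick numerical sanity check, the payoffs and the implied probabilities of John’s two bets can be verified with a short Matlab sketch; all numbers are taken from the example above.

% John's Dutch Book on X
stake1=100; oddsAgainst1=10;    % $100 on X at 10:1 against X
stake2=600; oddsAgainst2=1;     % $600 against X at 1:1 against X
profit_if_X   =stake1*oddsAgainst1-stake2;   % X occurs:         +$400
profit_if_notX=stake2*oddsAgainst2-stake1;   % X does not occur: +$500
impliedP1=1/(oddsAgainst1+1);   % implied P(X) from bet 1: 1/11
impliedP2=1/(oddsAgainst2+1);   % implied P(X) from bet 2: 1/2 (inconsistent)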

Performance-related Risk Measures

Enterprise Risk Management (ERM) can be described as a discipline by which an organization in any industry assesses, controls, exploits, finances, and monitors risks from all sources for the purpose of increasing the organization’s short- and long-term value to its stakeholders. It is a conceptual framework and, when adopted by a company, it provides a set of tools to, inter alia, describe and quantify a risk profile. In general, most of the measures common in the practice of ERM can be broken into two categories: (a) solvency-related measures, and (b) performance-related measures. From a quantitative viewpoint the latter refer to the volatility of the organization’s performance on a going-concern basis.

Performance-related risk measures provide us with a good opportunity to quickly review the fundamental definitions of the tools which concentrate on the mid-region of the probability distribution, i.e. the region near the mean, relevant for the determination of the volatility around expected results:

Volatility (standard deviation), Variance, Mean
$$ Vol(x) = \sqrt{Var(x)} = \left[\frac{\sum_{i=1}^{N}(x_i-\bar{x})^2}{N}\right]^{0.5} $$

Shortfall Risk
$$ SFR = \frac{1}{N} \sum_{i=1}^{N} 1_{[x_i < T]} \times 100\% $$
where $T$ is the target value for the financial variable $x$. The Shortfall Risk measure reflects an improvement over the $Vol(x)$ measure by taking into account the fact that most people are risk averse, i.e. they are more concerned with unfavorable deviations than with favorable ones. Therefore, $SFR$ can be understood as the probability that the financial variable $x_i$ falls below the specified target level $T$ (if true, the indicator $1_{[x_i < T]}$ above takes the value of 1).

Value at Risk (VaR)
In VaR-type measures, the equation is reversed: the shortfall risk is specified first, and the corresponding value at risk ($T$) is solved for.

Downside Volatility (or Downside Standard Deviation)
$$ DVol(x) = \left[\frac{\sum_{i=1}^{N}\min[0,(x_i-T)]^2}{N}\right]^{0.5} $$
where again $T$ is the target value for the financial variable $x$. Downside volatility focuses not only on the probability of an unfavorable deviation in a financial variable (as SFR does) but also on the extent to which it is unfavorable. It is usually interpreted as the extent to which the financial variable could deviate below a specified target level.

Below Target Risk
$$ BTR = \frac{\sum_{i=1}^{N}\min[0,(x_i-T)]}{N} $$
takes its origin from the definition of the downside volatility, but the argument is not squared and there is no square root taken of the sum.
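A minimal Matlab sketch of these definitions, for a generic vector x of a financial variable and a target level T, could look as follows; the sample data is purely illustrative and the last line is just a simple empirical illustration of the reversed VaR logic.

% performance-related risk measures for a financial variable x and target T
x=0.005+0.03*randn(1000,1);              % illustrative data only
T=0;                                     % target level
N=length(x);
Vol =sqrt(sum((x-mean(x)).^2)/N);        % volatility
SFR =sum(x<T)/N*100;                     % Shortfall Risk, in per cent
DVol=sqrt(sum(min(0,x-T).^2)/N);         % Downside Volatility
BTR =sum(min(0,x-T))/N;                  % Below Target Risk
VaR =quantile(x,0.05);                   % level T for which SFR would equal 5%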

Probability of Financial Ruin

Very often in the modeling of rare events in the finance and insurance industries the analysts are interested in both the probability of events and their financial consequences. That revolves around the notion of a probability of ruin, i.e. the probability that one of two dealing parties becomes insolvent (loses all its assets).

The estimation of the chances of going broke may be understood by considering the following gambler’s problem. Two parties, X and Y, start the game with $x$ and $y$ dollars, respectively. If X wins a round (its probability of winning being $p$), Y pays $\$1$ to X. The game ends when one of the parties amasses all the money, $\$(x+y)$.

Let’s denote by $u_{n+1}$ the probability that X eventually wins the game given that it currently holds $\$(n+1)$. If X wins the next round (with probability $p$), it will hold $\$(n+2)$ with winning probability $u_{n+2}$; if Y wins (with probability $q=1-p$), X will hold $\$n$ with winning probability $u_n$:
$$ u_{n+1} = pu_{n+2} + qu_n \ \ \mbox{for}\ \ 0 < n+1 < x+y \ . $$
But because $u_{n+1} = pu_{n+1} + qu_{n+1}$, we get:
$$ (u_{n+2}-u_{n+1}) = \frac{q}{p}(u_{n+1}-u_n) \ . $$
Applying this recurrence relation (and noting that $u_0=0$), we may write:
$$ (u_2-u_1) = \frac{q}{p}(u_1) \\ (u_3-u_2) = \frac{q}{p}(u_2-u_1) = \left(\frac{q}{p}\right)^2 (u_1) \\ ... \\ (u_i-u_{i-1}) = \left(\frac{q}{p}\right)^{i-1}(u_1) \ . $$
Now, if we sum both sides we will find:
$$ (u_i-u_1) = \left[\sum_{j=1}^{i-1} \left(\frac{q}{p}\right)^{j}\right](u_1), \ \ \ \mbox{or} \\ u_i = \left[1+\frac{q}{p}+\left(\frac{q}{p}\right)^2+ ... + \left(\frac{q}{p}\right)^{i-1} \right](u_1) \ . $$
Doing the maths, one can prove that:
$$ \left(1-\frac{q}{p}\right) \left[1+\sum_{j=1}^{i-1} \left(\frac{q}{p}\right)^{j} \right] = 1-\left(\frac{q}{p}\right)^i \ , $$
therefore for $p\neq q$ we have
$$ u_i = \left[ \frac{1-\left(\frac{q}{p}\right)^i}{1-\left(\frac{q}{p}\right)} \right] u_1 \ . $$
Making a note that $u_{x+y}=1$, we evaluate:
$$ u_1 = \left[ \frac{1-\left(\frac{q}{p}\right)}{1-\left(\frac{q}{p}\right)^{x+y}} \right]u_{x+y} \ , $$
which, generalized for $i$, gives
$$ u_i = \left[ \frac{1-\left(\frac{q}{p}\right)^i}{1-\left(\frac{q}{p}\right)^{x+y}} \right] \ , $$
and therefore we have:
$$ u_x = \left[ \frac{1-\left(\frac{q}{p}\right)^x}{1-\left(\frac{q}{p}\right)^{x+y}}\right] \ . $$
For the case of $p=q=\frac{1}{2}$ we have:
$$ u_x = \frac{x}{x+y} \ . $$
So, what does this $u_x$ mean for us? Let’s illustrate it using a real-life example.

Learn to count cards before going to Las Vegas

If you start playing a poker-like game having $y=\$1000$ in your pocket, and the casino’s total cash that you may win equals $x=\$$50,000,000, but your winning rate solely depends on your luck (say your $q=1-p=0.49$, i.e. only a bit lower than the bank’s odds of winning), then your chances of going broke are equal to $u_x = 1$. However, if you can count cards and do it consistently over and over again (now your $q=0.51$), then $u_x \approx 4 \times 10^{-18}$, which simply means you have a huge potential to become very rich overnight using your skills and mathematics.
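The two numbers quoted above can be verified with a short sketch of the $u_x$ formula; the rearranged branch simply avoids numerical overflow when $q>p$.

% probability of the player's ruin, u_x, for both scenarios above
x=50e6; y=1000;                    % bank's and player's capital in $
for q=[0.49 0.51]                  % player's per-round winning probability
    p=1-q; r=q/p;
    if r<=1
        ux=(1-r^x)/(1-r^(x+y));                      % direct formula
    else
        ux=(r^(-(x+y))-r^(-y))/(r^(-(x+y))-1);       % rearranged form
    end
    fprintf('q=%.2f  ->  P(going broke) = %.3g\n',q,ux);
end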
