------------------------------------------------------
# #
# Chapter 11: Simulated Likelihood Inference #
# for Stochastic Volatility Models #
# #
------------------------------------------------------
Consider the model
S(t_k) = S(t_{k-1}) * [ 1 + vol(t_{k-1}) * phi_k ] (1)
or
ret(t_k) = vol(t_{k-1}) * phi_k (2)
with a general stochastic volatility process
vol_k = vol_k( vol_{k-1} , eps_k ) (3)
where the (phi_k,eps_k) are mutually independent
standard normal distributed random numbers. Let's
abbreviate
r_k := ret(t_k)
v_k := vol(t_k)
and assume that the time series under consideration
has length N+1 (instead of N). Then the likelihood
function for the model (1-3) is given by
L = int_{R^N} prod_{k=1}^N 1/sqrt{2pi} 1/v_k *
prod_{k=1}^N exp( -1/2*r_{k+1}^2/v_k^2 ) *
prod_{k=1}^N 1/sqrt{2pi} exp( -1/2*eps_k^2 ) deps_k (4)
where in the expression above v_k is a function of
eps_k and v_{k-1},
v_k = v_k( eps_k , v_{k-1} ) (5)
For example, for the GARCH-Diffusion model we had
v_k^2 = v_{k-1}^2 + kappa * [ bsvol^2 - v_{k-1}^2 ]
+ beta * v_{k-1}^2 * eps_k (6)
If we look at the expression (4), we see that it con-
tains a factor
prod_{k=1}^N 1/sqrt{2pi} 1/v_k (7)
Thus, if we would choose
v_k^2 = sigma^2 * eps_k^2 (8)
this factor looks somehow similar to the quantities
pi_N := phi_1^2 * ... * phi_N^2 (9)
which we considered in Uebungsblatt 1-3 where we saw
that we can't calculate these quantities with simple
Monte Carlo. That is, for N = 2500, the following simple
Monte Carlo sum does not converge for any reasonable
M <= 10^6 or even M <= 10^9 simulations:
mcsum = 1/M sum_{k=1}^M phi_{1,k}^2 * ... * phi_{N,k}^2 (10)
Now, for the pi_N's we know that all factors phi_i^2
are independent and we can calculate
E[pi_N] = E[phi_1^2] * ... * E[phi_N^2] (11)
= 1 .
The Monte Carlo analog of the calculation (11) would be
[1/M sum_{k_1} phi_{1,k_1}^2 ] * ... * [1/M sum_{k_N} phi_{1,k_N}^2 ]
= 1/M^N sum_{k_1,...,k_N} phi_{1,k_1}^2 * ... * phi_{1,k_N}^2 (12)
Observe that in (10) and (12) the same set of random
numbers can be used, namely N*M phi's. However, not too
surprisingly, the calculation (12) has much better conver-
gence properties. Let's make a quick check:
#---------------------------------------------------------
# warmup excercise: improved MC-Calculation of the pi_N's:
N = 2500 # length of time series
M = 40000 # number of MC simulations
phi = matrix(rnorm(N*M),nrow=M,ncol=N) # 100 Mio random numbers!
# old method:
piN = apply(phi*phi,1,cumprod)
piN = t(piN)
EpiN = colMeans(piN)
par(mfrow=c(1,2))
plot(EpiN,ylim=c(0,3))
plot(log(EpiN))
# new method:
EpiN = rep(0,N)
EpiN[1] = mcsum = sum(phi[,1]^2)/M
for(k in 2:N)
{
mcsum = sum(phi[,k]^2)/M
EpiN[k] = EpiN[k-1]*mcsum
}
par(mfrow=c(1,2))
plot(EpiN,ylim=c(0,3))
plot(log(EpiN),ylim=c(-1,1))
#
# end warmup excercise
#---------------------------------------------------------
THUS: It is critical to perform an MC-sum at every factor,
at every probability density factor for the r_k's,
factor := 1/sqrt{2pi*v_k^2} * exp( -1/2*r_{k+1}^2/v_k^2 ) (13)
and not just one sum at the very end.
HOWEVER: For a stochvol model, it is not clear how to do this,
since the factors (13) in the likelihood function (4)
are not completely independent, the vol-function v_k
depends, besides on the random number eps_k, also on
the vol v_{k-1} of the previous factor, and this vol
depends on the eps_{k-1}..
The following algorithm, which is taken from the very nice and
very transparent publication of Pitt, Malik and Doucet: "Simu-
lated Likelihood Inference for Stochastic Volatility Models"
of 2014, is quite performant and pretty general, such that it
can be applied to a large class of models. To be specific, we
implement it for the GARCH-Diffusion model (6):
logL_withComments = function( bsvol , w0 , d )
{
N = length(ret) # length of time series
M = length(eps[,1]) # number of MC-simus; the M x N matrix eps
# will be precalculated, such that, when
# varying model parameters, always the same
# random numbers will be taken.
w = 1 - 1/d
kappa = (1-w)*w0
beta = (1-w)*(1-w0)*sqrt(2)
vol0sq = bsvol^2 # we put start-vol = mean-vol, vol0 = volbar
vol = rep(0,N)
logL = 0
volsq_withinfo = rep( vol0sq , M ) # with return information
volsq_withoutinfo = rep( vol0sq , M ) # without return information
for( k in 1:(N-1) ) # same loop as in new method of warmup exercise above:
{ # for each factor_k we perform an MC-sum
# GARCH-Diffusion vol-specification: v_k^2 =: volsq_k =>
#
# volsq_k = volsq_{k-1} + kappa * [ vol0sq - volsq_{k-1} ]
# + beta * volsq_{k-1} * eps_k
#
# this is a vector of length M = number MC-simus:
# [k] [k-1] [k-1]
volsq_withoutinfo = volsq_withinfo + kappa * ( vol0sq - volsq_withinfo ) +
beta * volsq_withinfo * eps[,k]
volsq_withoutinfo = abs( volsq_withoutinfo )
# the probability for the occurence of a given return ret_{k+1}, given
# the value of the volatility v_k of the previous day, is given by the
# probability density which we simply define as a quantity factor:
#
# factor := 1/sqrt{2pi*v_k^2} * exp( -1/2*r_{k+1}^2/v_k^2 )
#
# this is also a vector of length M = number MC-simus:
factor = dnorm( ret[k+1] , mean=0 , sd=sqrt(volsq_withoutinfo) )
# factor = prob( r_{k+1} | v_k )
# v_k without return information, but with eps_k and with v_{k-1} information
# to make the algorithm more transparent, we now use two different names,
# mcsum and normalization, for nearly the same quantity; the reason for this
# will become more clear below where we define the quantity prob_for_vol:
mcsum_factor = sum( factor ) / M # this is a sum over MC-simus and this
# goes directly into the calculation of logL
normalization = sum( factor ) # a normalization factor for the probabilities
# prob( v_k | r_{k+1} ), defined below
# the actual calculation of the log-Likelihood-function:
logL = logL + log( mcsum_factor )
# now we have to calculate the v_k's, the volsq_withinfo, which we will
# use for the calculation of the v_{k+1}'s in the GARCH_Diffusion equation
# at the very top:
# instead of considering the factor as a probability that a given return
# ret_{k+1} realizes, that is, factor = prob( r_{k+1} | v_k ), we consider
# it as a probability that, given this return ret_{k+1}, the squared vola-
# tility vol(t_k)^2 will be equal to v_k^2 = volsq_withinfo :
prob_for_vol = factor / normalization
# prob_for_vol = prob( v_k | r_{k+1} )
# now, to take these probabilities into account, the v_k's which are used
# for v_{k+1} = v_{k+1}( v_k , eps_{k+1} ) are simply drawn from the v_k's
# without return information, the quantities volsq_withoutinfo, where each
# volatility volsq_withoutinfo[j] in that vector comes with a probability
# prob_for_vol[j]. Fortunately, R has the built-in function sample() which
# does exactly this:
volsq_withinfo = sample( volsq_withoutinfo, M , replace=TRUE , prob_for_vol )
# [k]
# actually the algorithm does not only estimate the log-Likelihood function,
# but it can also estimate the whole vol-process itself: we do this purely
# for information and checking purposes, the quantity is not needed
# in the algorithm:
mcsum_volsq = sum( volsq_withinfo ) / M
vol[k] = sqrt( mcsum_volsq )
}#next k
result = list( logL , vol )
names(result) = c( "logL" , "vol" )
return(result)
}
# This completes the algorithm. Let's copy and paste the code without all the
# comments to see that it is actually quite short:
logL_withVol = function( bsvol , w0 , d )
{
N = length(ret)
M = length(eps[,1])
w = 1 - 1/d
kappa = (1-w)*w0
beta = (1-w)*(1-w0)*sqrt(2)
vol0sq = bsvol^2
vol = rep(0,N)
logL = 0
volsq_withinfo = rep( vol0sq , M )
volsq_withoutinfo = rep( vol0sq , M )
for( k in 1:(N-1) )
{
# GARCH-Diffusion:
volsq_withoutinfo = abs( volsq_withinfo + kappa * ( vol0sq - volsq_withinfo ) +
beta * volsq_withinfo * eps[,k] )
factor = dnorm( ret[k+1] , mean=0 , sd=sqrt(volsq_withoutinfo) )
normalization = sum( factor )
mcsum_factor = normalization / M
logL = logL + log( mcsum_factor )
prob_for_vol = factor / normalization
volsq_withinfo = sample( volsq_withoutinfo, M , replace=TRUE , prob_for_vol )
mcsum_volsq = sum( volsq_withinfo ) / M
vol[k] = sqrt( mcsum_volsq )
}#next k
result = list( logL , vol )
names(result) = c( "logL" , "vol" )
return(result)
}
# we check the algorithm on an artificially generated return set with
# GARCH-Diffusion dynamics. We modify the GenerateStochVolPaths()-function
# of Loesung9.R:
GenerateGARCHDiffusionPath = function( N , bsvol , w0 , d )
{
S0 = 100
S = rep(0,N)
vol = rep(0,N)
ret = rep(0,N)
nu = rep(0,N) # nu := vol^2
phi = rnorm(N)
eps = rnorm(N)
w = 1 - 1/d
kappa = (1-w)*w0
beta = (1-w)*(1-w0)*sqrt(2)
vol0sq = bsvol^2
# the first step, as usual, is special:
S[1] = S0 * ( 1 + bsvol * phi[1] )
ret[1] = (S[1] - S0)/S0
nu[1] = bsvol^2 + kappa*0 + beta*bsvol^2*eps[1]
vol[1] = sqrt(abs(nu[1]))
# now the actual simulation:
for(k in 2:N)
{
S[k] = S[k-1] * ( 1 + vol[k-1] * phi[k] )
ret[k] = (S[k]-S[k-1])/S[k-1]
# GARCH-Diffusion:
nu[k] = nu[k-1] + kappa*(bsvol^2-nu[k-1]) + beta*nu[k-1]*eps[k]
vol[k] = sqrt(abs(nu[k]))
}
par(mfrow=c(1,2))
plot(S,type="l",ylim=c(0,300),main="GARCH-Diffusion Preis")
plot(vol,type="l",ylim=c(-0.02,0.06),main="GARCH-Diffusion Vol")
abline(a=bsvol,b=0,col=2)
result = list( S , vol , ret )
names(result) = c("S","vol","ret")
return(result)
}
res = GenerateGARCHDiffusionPath( N=2500 , bsvol=1.5/100 , w0=0.15 , d=10 )
ret = res$ret
plot(ret,type="l")
# Now, let's try to recover the model parameters
# bsvol=1.5/100 , w0=0.15 , d=10
# from the return data set:
M = 500
N = length(ret)
eps = matrix( rnorm(N*M) , ncol=N , nrow=M)
bsvol = sd(ret)
bsvol
# first let's see whether we get a number at all:
resL = logL_withVol( bsvol , w0=0.15 , d=10 )
resL$logL # ok, looks good
# let's take a look at the estimated vol and compare to the exact vol
# (of course, an exact vol we only have for artificially generated data;
# for DAX or SPX we only have the return data ret):
plot(res$vol,type="l",ylim=c(0,0.06))
lines(resL$vol,col="red") # quite okay!
# to maximize logL, we don't need the estimated vols, so we just skip them
# from the code. Since the sample()-function uses random numbers, we set a
# seed at the beginning to make sure that when maximizing logL always the
# same random numbers are used:
logL = function( bsvol , w0 , d , seed=123 )
{
set.seed(seed)
N = length(ret)
M = length(eps[,1])
w = 1 - 1/d
kappa = (1-w)*w0
beta = (1-w)*(1-w0)*sqrt(2)
vol0sq = bsvol^2
vol = rep(0,N)
logL = 0
volsq_withinfo = rep( vol0sq , M )
volsq_withoutinfo = rep( vol0sq , M )
for( k in 1:(N-1) )
{
volsq_withoutinfo = abs( volsq_withinfo + kappa * ( vol0sq - volsq_withinfo ) +
beta * volsq_withinfo * eps[,k] )
factor = dnorm( ret[k+1] , mean=0 , sd=sqrt(volsq_withoutinfo) )
normalization = sum( factor )
mcsum_factor = normalization / M
logL = logL + log( mcsum_factor )
prob_for_vol = factor / normalization
volsq_withinfo = sample( volsq_withoutinfo, M , replace=TRUE , prob_for_vol )
}
return(logL)
}
logL( bsvol=sd(ret), w0=0.15 , d=10 )
# we make a contour-plot in the (w0,d)-plane:
# takes about 2:40 min for M=500, N=2500:
w0 = seq(from=0.05,to=0.5,by=0.05)
dd = 1:40
nw = length(w0)
nd = length(dd)
z = matrix( 0 , nrow=nd , ncol=nw )
for(i in 1:nd)
{
for(j in 1:nw )
{
z[i,j] = logL( bsvol=sd(ret) , w0[j] , dd[i] )
}
}
# let's look at the result:
contour( dd , w0 , z )
contour( dd , w0 , z , nlevels = 50 )
contour( dd , w0 , z , nlevels = 100 ,zlim=c(7300,7500))
# maximum close to w0 = 0.15, d=10 !!
# we are able to recover model parameters!
# more refined:
# takes about 2:40 min for M=1000, N=2500: (half the number of points,
# double number of MC-simus)
w0 = seq(from=0.025,to=0.25,by=0.025)
dd = 1:20
nw = length(w0)
nd = length(dd)
M = 1000
N = length(ret)
eps = matrix( rnorm(N*M) , ncol=N , nrow=M)
z = matrix( 0 , nrow=nd , ncol=nw )
for(i in 1:nd)
{
for(j in 1:nw )
{
z[i,j] = logL( bsvol=sd(ret) , w0[j] , dd[i] )
}
}
contour( dd , w0 , z , nlevels = 100 ,zlim=c(7300,7500))
contour( dd , w0 , z , nlevels = 100 ,zlim=c(7400,7500))
max(z) # 7441.77
logL( sd(ret) , w0=0.15 , d=10 ) # 7438.83
logL( sd(ret) , w0=0.2 , d=10 ) # 7441.77
# okay, w0 a bit larger..