```------------------------------------------------------
#                                                    #
#     Chapter 11: Simulated Likelihood Inference     #
#         for Stochastic Volatility Models           #
#                                                    #
------------------------------------------------------

Consider the model

S(t_k) = S(t_{k-1}) * [ 1 + vol(t_{k-1}) * phi_k ]               (1)

or

ret(t_k) = vol(t_{k-1}) * phi_k                                 (2)

with a general stochastic volatility process

vol_k = vol_k( vol_{k-1} , eps_k )                              (3)

where the (phi_k,eps_k) are mutually independent
standard normal distributed random numbers. Let's
abbreviate

r_k := ret(t_k)
v_k := vol(t_k)

and assume that the time series under consideration
has length N+1 (instead of N). Then the likelihood
function for the model (1-3) is given by

L = int_{R^N} prod_{k=1}^N 1/sqrt{2pi} 1/v_k *

prod_{k=1}^N exp( -1/2*r_{k+1}^2/v_k^2 ) *

prod_{k=1}^N 1/sqrt{2pi} exp( -1/2*eps_k^2 ) deps_k           (4)

where in the expression above v_k is a function of
eps_k and v_{k-1},

v_k = v_k( eps_k , v_{k-1} )                            (5)

For example, for the GARCH-Diffusion model we had

v_k^2 = v_{k-1}^2 + kappa * [ bsvol^2 - v_{k-1}^2 ]

+ beta * v_{k-1}^2 * eps_k                     (6)

If we look at the expression (4), we see that it con-
tains a factor

prod_{k=1}^N 1/sqrt{2pi} 1/v_k                         (7)

Thus, if we would choose

v_k^2 = sigma^2 * eps_k^2                           (8)

this factor looks somehow similar to the quantities

pi_N := phi_1^2 * ... * phi_N^2                        (9)

which we considered in Uebungsblatt 1-3 where we saw
that we can't calculate these quantities with simple
Monte Carlo. That is, for N = 2500, the following simple
Monte Carlo sum does not converge for any reasonable
M <= 10^6 or even M <= 10^9 simulations:

mcsum = 1/M sum_{k=1}^M phi_{1,k}^2 * ... * phi_{N,k}^2          (10)

Now, for the pi_N's we know that all factors phi_i^2
are independent and we can calculate

E[pi_N] = E[phi_1^2] * ... * E[phi_N^2]                        (11)

= 1 .

The Monte Carlo analog of the calculation (11) would be

[1/M sum_{k_1} phi_{1,k_1}^2 ] * ... * [1/M sum_{k_N} phi_{1,k_N}^2 ]

= 1/M^N sum_{k_1,...,k_N} phi_{1,k_1}^2 * ... * phi_{1,k_N}^2   (12)

Observe that in (10) and (12) the same set of random
numbers can be used, namely N*M phi's. However, not too
surprisingly, the calculation (12) has much better conver-
gence properties. Let's make a quick check:

#---------------------------------------------------------
# warmup excercise: improved MC-Calculation of the pi_N's:
N = 2500                      # length of time series
M = 40000                     # number of MC simulations
phi = matrix(rnorm(N*M),nrow=M,ncol=N)  # 100 Mio random numbers!

# old method:
piN = apply(phi*phi,1,cumprod)
piN = t(piN)
EpiN = colMeans(piN)
par(mfrow=c(1,2))
plot(EpiN,ylim=c(0,3))
plot(log(EpiN))

# new method:
EpiN = rep(0,N)
EpiN[1] = mcsum = sum(phi[,1]^2)/M
for(k in 2:N)
{
mcsum = sum(phi[,k]^2)/M
EpiN[k] = EpiN[k-1]*mcsum
}
par(mfrow=c(1,2))
plot(EpiN,ylim=c(0,3))
plot(log(EpiN),ylim=c(-1,1))

#
# end warmup excercise
#---------------------------------------------------------

THUS:    It is critical to perform an MC-sum at every factor,
at every probability density factor for the r_k's,

factor := 1/sqrt{2pi*v_k^2} * exp( -1/2*r_{k+1}^2/v_k^2 )   (13)

and not just one sum at the very end.

HOWEVER: For a stochvol model, it is not clear how to do this,
since the factors (13) in the likelihood function (4)
are not completely independent, the vol-function v_k
depends, besides on the random number eps_k, also on
the vol v_{k-1} of the previous factor, and this vol
depends on the eps_{k-1}..

The following algorithm, which is taken from the very nice and
very transparent publication of Pitt, Malik and Doucet: "Simu-
lated Likelihood Inference for Stochastic Volatility Models"
of 2014, is quite performant and pretty general, such that it
can be applied to a large class of models. To be specific, we
implement it for the GARCH-Diffusion model (6):

logL_withComments = function( bsvol , w0 , d )
{
N = length(ret)             # length of time series
M = length(eps[,1])         # number of MC-simus; the M x N matrix eps
# will be precalculated, such that, when
# varying model parameters, always the same
# random numbers will be taken.
w = 1 - 1/d
kappa = (1-w)*w0
beta = (1-w)*(1-w0)*sqrt(2)
vol0sq = bsvol^2            # we put start-vol = mean-vol, vol0 = volbar

vol = rep(0,N)
logL = 0

volsq_withinfo    = rep( vol0sq , M )     # with    return information
volsq_withoutinfo = rep( vol0sq , M )     # without return information

for( k in 1:(N-1) )         # same loop as in new method of warmup exercise above:
{                           # for each factor_k we perform an MC-sum

# GARCH-Diffusion vol-specification: v_k^2 =: volsq_k =>
#
# volsq_k = volsq_{k-1}  + kappa * [ vol0sq - volsq_{k-1} ]
#                        + beta * volsq_{k-1} * eps_k
#
# this is a vector of length M = number MC-simus:

#              [k]            [k-1]                              [k-1]
volsq_withoutinfo = volsq_withinfo + kappa * ( vol0sq - volsq_withinfo ) +
beta * volsq_withinfo *  eps[,k]
volsq_withoutinfo = abs( volsq_withoutinfo )

# the probability for the occurence of a given return ret_{k+1}, given
# the value of the volatility v_k of the previous day, is given by the
# probability density which we simply define as a quantity factor:
#
#      factor :=  1/sqrt{2pi*v_k^2} * exp( -1/2*r_{k+1}^2/v_k^2 )
#
# this is also a vector of length M = number MC-simus:

factor = dnorm( ret[k+1] , mean=0 , sd=sqrt(volsq_withoutinfo) )
# factor = prob( r_{k+1} | v_k )
# v_k without return information, but with eps_k and with v_{k-1} information

# to make the algorithm more transparent, we now use two different names,
# mcsum and normalization, for nearly the same quantity; the reason for this
# will become more clear below where we define the quantity prob_for_vol:

mcsum_factor =  sum( factor ) / M     # this is a sum over MC-simus and this
# goes directly into the calculation of logL
normalization = sum( factor )         # a normalization factor for the probabilities
# prob( v_k | r_{k+1} ), defined below

# the actual calculation of the log-Likelihood-function:
logL = logL + log( mcsum_factor )

# now we have to calculate the v_k's, the volsq_withinfo, which we will
# use for the calculation of the v_{k+1}'s in the GARCH_Diffusion equation
# at the very top:
# instead of considering the factor as a probability that a given return
# ret_{k+1} realizes, that is, factor = prob( r_{k+1} | v_k ), we consider
# it as a probability that, given this return ret_{k+1}, the squared vola-
# tility vol(t_k)^2 will be equal to v_k^2 = volsq_withinfo :

prob_for_vol = factor / normalization
# prob_for_vol = prob( v_k | r_{k+1} )

# now, to take these probabilities into account, the v_k's which are used
# for v_{k+1} = v_{k+1}( v_k , eps_{k+1} ) are simply drawn from the v_k's
# without return information, the quantities volsq_withoutinfo, where each
# volatility volsq_withoutinfo[j] in that vector comes with a probability
# prob_for_vol[j]. Fortunately, R has the built-in function sample() which
# does exactly this:

volsq_withinfo = sample( volsq_withoutinfo, M , replace=TRUE , prob_for_vol )
#          [k]

# actually the algorithm does not only estimate the log-Likelihood function,
# but it can also estimate the whole vol-process itself: we do this purely
# for information and checking purposes, the quantity is not needed
# in the algorithm:

mcsum_volsq = sum( volsq_withinfo ) / M
vol[k] = sqrt( mcsum_volsq )

}#next k

result = list( logL , vol )
names(result) = c( "logL" , "vol" )
return(result)
}
# This completes the algorithm. Let's copy and paste the code without all the
# comments to see that it is actually quite short:

logL_withVol = function( bsvol , w0 , d )
{
N = length(ret)
M = length(eps[,1])

w = 1 - 1/d
kappa = (1-w)*w0
beta = (1-w)*(1-w0)*sqrt(2)
vol0sq = bsvol^2

vol = rep(0,N)
logL = 0

volsq_withinfo    = rep( vol0sq , M )
volsq_withoutinfo = rep( vol0sq , M )

for( k in 1:(N-1) )
{
# GARCH-Diffusion:
volsq_withoutinfo = abs( volsq_withinfo + kappa * ( vol0sq - volsq_withinfo ) +
beta * volsq_withinfo *  eps[,k] )

factor = dnorm( ret[k+1] , mean=0 , sd=sqrt(volsq_withoutinfo) )

normalization = sum( factor )
mcsum_factor =  normalization / M

logL = logL + log( mcsum_factor )

prob_for_vol = factor / normalization
volsq_withinfo = sample( volsq_withoutinfo, M , replace=TRUE , prob_for_vol )

mcsum_volsq = sum( volsq_withinfo ) / M
vol[k] = sqrt( mcsum_volsq )

}#next k

result = list( logL , vol )
names(result) = c( "logL" , "vol" )
return(result)
}

# we check the algorithm on an artificially generated return set with
# GARCH-Diffusion dynamics. We modify the GenerateStochVolPaths()-function
# of Loesung9.R:

GenerateGARCHDiffusionPath = function( N , bsvol , w0 , d )
{
S0 = 100
S   = rep(0,N)
vol = rep(0,N)
ret = rep(0,N)
nu  = rep(0,N)    # nu := vol^2

phi = rnorm(N)
eps = rnorm(N)

w = 1 - 1/d
kappa = (1-w)*w0
beta = (1-w)*(1-w0)*sqrt(2)
vol0sq = bsvol^2

# the first step, as usual, is special:
S[1] = S0 * ( 1 + bsvol * phi[1] )
ret[1] = (S[1] - S0)/S0
nu[1] = bsvol^2 + kappa*0 + beta*bsvol^2*eps[1]
vol[1] = sqrt(abs(nu[1]))

# now the actual simulation:
for(k in 2:N)
{
S[k] = S[k-1] * ( 1 + vol[k-1] * phi[k] )
ret[k] = (S[k]-S[k-1])/S[k-1]
# GARCH-Diffusion:
nu[k] = nu[k-1] + kappa*(bsvol^2-nu[k-1]) + beta*nu[k-1]*eps[k]
vol[k] = sqrt(abs(nu[k]))
}

par(mfrow=c(1,2))
plot(S,type="l",ylim=c(0,300),main="GARCH-Diffusion Preis")
plot(vol,type="l",ylim=c(-0.02,0.06),main="GARCH-Diffusion Vol")
abline(a=bsvol,b=0,col=2)

result = list( S , vol , ret )
names(result) = c("S","vol","ret")
return(result)
}

res = GenerateGARCHDiffusionPath( N=2500 , bsvol=1.5/100 , w0=0.15 , d=10 )
ret = res\$ret
plot(ret,type="l")

# Now, let's try to recover the model parameters
# bsvol=1.5/100 , w0=0.15 , d=10
# from the return data set:

M = 500
N = length(ret)
eps = matrix( rnorm(N*M) , ncol=N , nrow=M)

bsvol = sd(ret)
bsvol

# first let's see whether we get a number at all:
resL = logL_withVol( bsvol , w0=0.15 , d=10 )
resL\$logL                      # ok, looks good

# let's take a look at the estimated vol and compare to the exact vol
# (of course, an exact vol we only have for artificially generated data;
# for DAX or SPX we only have the return data ret):
plot(res\$vol,type="l",ylim=c(0,0.06))
lines(resL\$vol,col="red")      # quite okay!

# to maximize logL, we don't need the estimated vols, so we just skip them
# from the code. Since the sample()-function uses random numbers, we set a
# seed at the beginning to make sure that when maximizing logL always the
# same random numbers are used:

logL = function( bsvol , w0 , d , seed=123 )
{
set.seed(seed)
N = length(ret)
M = length(eps[,1])
w = 1 - 1/d
kappa = (1-w)*w0
beta = (1-w)*(1-w0)*sqrt(2)
vol0sq = bsvol^2
vol = rep(0,N)
logL = 0
volsq_withinfo    = rep( vol0sq , M )
volsq_withoutinfo = rep( vol0sq , M )

for( k in 1:(N-1) )
{
volsq_withoutinfo = abs( volsq_withinfo + kappa * ( vol0sq - volsq_withinfo ) +
beta * volsq_withinfo *  eps[,k] )
factor = dnorm( ret[k+1] , mean=0 , sd=sqrt(volsq_withoutinfo) )
normalization = sum( factor )
mcsum_factor =  normalization / M
logL = logL + log( mcsum_factor )
prob_for_vol = factor / normalization
volsq_withinfo = sample( volsq_withoutinfo, M , replace=TRUE , prob_for_vol )
}
return(logL)
}

logL( bsvol=sd(ret), w0=0.15 , d=10 )

# we make a contour-plot in the (w0,d)-plane:
# takes about 2:40 min for M=500, N=2500:
w0 = seq(from=0.05,to=0.5,by=0.05)
dd = 1:40
nw = length(w0)
nd = length(dd)

z = matrix( 0 , nrow=nd , ncol=nw )
for(i in 1:nd)
{
for(j in 1:nw )
{
z[i,j] = logL( bsvol=sd(ret) , w0[j] , dd[i] )
}
}
# let's look at the result:
contour( dd , w0 , z )
contour( dd , w0 , z , nlevels = 50 )
contour( dd , w0 , z , nlevels = 100 ,zlim=c(7300,7500))

# maximum close to w0 = 0.15, d=10 !!
# we are able to recover model parameters!

# more refined:
# takes about 2:40 min for M=1000, N=2500: (half the number of points,
#                                           double number of MC-simus)
w0 = seq(from=0.025,to=0.25,by=0.025)
dd = 1:20
nw = length(w0)
nd = length(dd)

M = 1000
N = length(ret)
eps = matrix( rnorm(N*M) , ncol=N , nrow=M)

z = matrix( 0 , nrow=nd , ncol=nw )
for(i in 1:nd)
{
for(j in 1:nw )
{
z[i,j] = logL( bsvol=sd(ret) , w0[j] , dd[i] )
}
}

contour( dd , w0 , z , nlevels = 100 ,zlim=c(7300,7500))
contour( dd , w0 , z , nlevels = 100 ,zlim=c(7400,7500))
max(z)                               # 7441.77
logL( sd(ret) , w0=0.15 , d=10 )     # 7438.83
logL( sd(ret) , w0=0.2 , d=10 )      # 7441.77
# okay, w0 a bit larger..

```