---------------------------------------------------- # # # Chapter 5: Maximum Likelihood Estimation # # of the ARCH(1)-Model # # # ---------------------------------------------------- In the last chapter we defined the ARCH(d)-model and we simulated a couple of paths for various choices of model paramters. We saw that, depending on the choice of these model parameters, the model is already able to generate quite a variety of different price dynamics. Now the question arises how we can determine which of these price dynamics, that is, which choice of model parameters, are the best for a given time series. In this chapter we will answer this question for the ARCH(1)-model. The logic for the ARCH(d)-model is completely analog and will be considered in the next chapter. The ARCH(1)-Model is given by S(t_k) = S(t_{k-1}) * [ 1 + vol(t_{k-1})*phi_k ] (1) with volatility specification vol^2(t_{k-1}) = w0*bsvol^2 + w1*ret^2(t_{k-1}) (2) with weights w0+w1=1 and a constant Black-Scholes volatility bsvol. For w0=1, w1=0 the model reduces to the time discrete Black-Scholes model (with zero drift). Let's fit this model to the DAX time series. That is, we would like to determine the parameters w0 (then w1=1-w0 is also fixed) and bsvol, such that the corresponding price dynamics fits `best' to the given time series data. The meaning of `best' of course has to be formulated in some precise mathematical way. To this end we rewrite (1) as ret(t_k) = vol(t_{k-1}) * phi_k (3) and recall, that phi_k is a standard normal-distributed random number. Thus, if we are at time t_{k-1}, then the probability that the next return ret(t_k) will realize in the intervall [y_k,y_k+dy_k], is given by Prob[ ret(t_k) in [y_k,y_k+dy_k] | t_{k-1} ] (4) = 1/sqrt{2pi} 1/vol(t_{k-1}) * exp( -1/2*y_k^2/vol^2(t_{k-1}) ) dy_k Thus, the probability that all returns have values in some intervalls [y_k,y_k+dy_k], for all k=1,2,...,N, is given by the product of the probabilities in (4), Prob[ DAX-ret in [y_k,y_k+dy_k] for all k ] = prod_{k=1}^N 1/sqrt{2pi} 1/vol(t_{k-1}) * (5) prod_{k=1}^N exp( -1/2*y_k^2/vol^2(t_{k-1}) ) dy_k Now, if we substitute the ARCH(1) vol-specification given by (2) above, vol^2(t_{k-1}) = w0*bsvol^2 + w1*y_{k-1}^2 (2) on the right hand side of (5), then the probability in (5), Prob[ DAX-ret in [y_k,y_k+dy_k] for all k ] can be considered as a function of the model parameters bsvol and w0 exclusively, since the y_k are actually to be substituted by the actual realized returns of the last 10 years. To make this more explicit, that we actually consider this probability as a function of the model parameters bsvol and w0, we introduce a new letter L (for "Likelihood") and write Prob[ DAX-ret in [y_k,y_k+dy_k] for all k ] = L( w0 , bsvol ) (6) = L( w0 , bsvol | y_k=ret(t_k) ) where the notation in the last line of (6) should just be a reminder that the y_k are given data, namely the returns which have realized. This L is then called the likelihood function and a standard procedure in statistics is simply to estimate model parameters by maximizing this likelihood func- tion. Accordingly, this method is called "Maximum Likelihood Estimation". Since we have about N=2500 returns in a time series of 10 years, L is given by a product of N=2500 factors. Thus, we should expect that L behaves like L ~ exp( +N*something ) or L ~ exp( -N*something ) (7) which is either extremely large or extremely small. To avoid numerical issues, it is therefore common to consider the logarithm log(L) of the likelihood function. Apparently, maximizing L is equivalent of maximizing log(L), so the logic stays the same, but for log(L) we can expect a behaviour of log(L) ~ +N*something or log(L) ~ -N*something (8) which is much more tractable than the expressions in (7). Thus we consider log(L) = sum_{k=1}^N { log[1/vol(t_{k-1})] - (9) 1/2 * y_k^2 / vol^2(t_{k-1}) } + const where the constant const = N*log[1/sqrt{2pi}] + sum_{k=1}^N log[dy_k] (10) does not depend on the model paramters w0 and bsvol. Since we maximize with respect to model parameters, we therefore can ignore the constant (10) and end up with the problem of maximizing the function log(tilde L) = log(tilde L)( bsvol , w0 ) (11) = - sum_{k=1}^N { log[vol(t_{k-1})] + 1/2 * y_k^2 / vol^2(t_{k-1}) } with y_k = ret(t_k) and vol^2(t_{k-1}) = w0*bsvol^2 + (1-w0)*ret^2(t_{k-1}) (2) This maximization problem has no longer an analytic closed form solution and one has to resort to some kind of numerical method. This is actually a typical situation in Maximum Likeli- hood Estimation: the maximum cannot be found in analytic closed form, but has to be computed numerically. We make an R-implementation to perform this task: # Start R-Session: # # before we fit to DAX-data, we check on an artificially # generated ARCH(1) return set: let's use the pathArchd # function from the last chapter with d=1 to generate these # artificial return data: # to make this reproducible, we fix a seed for # the random number generator: set.seed(456) pathArchd = function( N , bsvol , w0 , d ) { S = rep(0,N) ret = rep(0,N) phi = rnorm(N) S0 = 100 vol0 = bsvol # we put startvol to bsvol vol = rep(vol0,N) # ARCH-vol sumretd = 0 # the first step, k=1, is special: S[1] = S0*(1+vol0*phi[1]) ret[1] = (S[1]-S0)/S0 vol[1] = sqrt( w0*bsvol^2 + (1-w0)*ret[1]^2 ) sumretd = sumretd + ret[1]^2 for(k in 2:N) { if(k<=d) { S[k] = S[k-1]*(1+vol[k-1]*phi[k]) ret[k] = (S[k]-S[k-1])/S[k-1] sumretd = sumretd + ret[k]^2 vol[k] = sqrt( w0*bsvol^2 + (1-w0)*sumretd/k ) } else { S[k] = S[k-1]*(1+vol[k-1]*phi[k]) ret[k] = (S[k]-S[k-1])/S[k-1] sumretd = sumretd + ret[k]^2 sumretd = sumretd - ret[k-d]^2 vol[k] = sqrt( w0*bsvol^2 + (1-w0)*abs(sumretd)/d ) } }#next k par(mfrow=c(2,1)) plot(S,type="l",ylim=c(0,250)) plot(ret,type="l",ylim=c(-0.08,0.08)) result = list(S,ret) names(result) = c("S","ret") return(result) } N = 2500 bsvol = 0.01 w0 = 0.5 res = pathArchd( N , bsvol , w0 , d=1 ) ret=res$ret summary(ret) # artificially generated price path # and return set:
# for this return set ret we now want to calculate the # log-likelihood function and a maximization should # result in model parameters around bsvol = 1% = 0.01 # and w0 = 0.5: # # we calculate the log-Likelihood function: logL = function( bsvol , w0 ) { w1 = 1 - w0 vol = sqrt(w0*bsvol^2+w1*ret^2) # we can use the vector ret, although we # did not pass it through the arguments # shift vol by 1 element: n = length(ret) vol = vol[-n] # remove last one vol = c(sqrt(w0*bsvol^2),vol) # add a new first one terms = log(vol) + 1/2*ret^2/vol^2 result = -sum(terms) return(result) } # let's check: logL( bsvol=0.01 , w0=0.5 ) # ok, we get some number # since we have only 2 variables, we can look for the maximum # by making a contour-plot. To become familiar with the # contour()-function, let's first consider an example: #----------------------------------------------------- # START Example: Contour-Plot: # # function of 2 variables to be plotted: testf = function(x,y) { result = (-x^2+y^2)*exp(-0.25*x^2-0.5*y^2) return(result) } # the contour()-command has syntax # contour( vector , vector , matrix ) = # contour( x-Achse , y-Achse , f(x,y) ) x = seq(from=-4,to=4,by=0.1) y = x # matrix with testf(x,y)-values: z = outer( x , y , FUN = testf ) # this works only if the function testf can # accept vectors as input variables # and now the level curves: contour( x , y , z )
contour( x , y , z , nlevels=50 ) # more refined
# instead of using the outer()-command # we can use a loop to generate z: nx = length(x) ny = length(y) z = matrix( 0 , nrow=nx , ncol=ny ) for(i in 1:nx) { for(j in 1:ny) { z[i,j] = testf(x[i],y[j]) # now scalars as arguments, # no longer vectors } } # same contour-plots as above: contour( x , y , z ) contour( x , y , z , nlevels=50 ) # more refined # also useful: image( x , y , z )
persp( x , y , z )
# with plot3D package: require(plot3D) persp3D( x , y , z )
# # END Example Contour-Plot #----------------------------------------------------- # Let's make a contour-plot of the log-likelihood function: # # Grid for the model parameters (recall that the model # is unstable for w0=0 or bsvol=0): bsvol = seq( from=0.1/100 , to=4/100 , by=0.1/100 ) w0 = seq( from=0.05 , to=0.95 , by=0.05 ) # the following does not work, since logL does not accept # vectors as input; the reason is that in logL there is a line # vol = sqrt(w0*bsvol^2+w1*ret^2) # which does not make sense if all w0, w1 and ret are vectors: z = outer( bsvol , w0 , FUN = logL ) # does not work # so let's do this with a loop: nbs = length(bsvol) nw = length(w0) z = matrix( 0 , nrow=nbs , ncol=nw ) for(i in 1:nbs) { for(j in 1:nw ) { z[i,j] = logL( bsvol[i] , w0[j] ) # this is ok } } # let's look at the result: contour( bsvol , w0 , z )
contour( bsvol , w0 , z , nlevels = 50 )
contour( bsvol , w0 , z , nlevels = 50 , zlim=c(10000,11000) ) # eventually vary range for zlim-values
# okay, that works!! -> we recover correct parameter values!! # bsvol around 0.01 and w0 around 0.5 # some more pictures (would require more fine-tuning): image( bsvol , w0 , z )
image( bsvol , w0 , z , col = rainbow(100) , zlim=c(10000,11000) )
persp( bsvol , w0 , z )
persp3D( bsvol , w0 , z )
--------------------------------------- # ARCH(1)-Fit to DAX-Data: # --------------------------------------- # load DAX.txt data and generate return vector ret: dax = read.table("C:/Users/Admin/Courses/FinancialStatistics/DAX.txt",header=TRUE,sep=";") dax = na.omit(dax) summary(dax) S = dax$index n = length(S) ret = rep(0,n) for(i in 2:n) { ret[i] = (S[i]-S[i-1])/S[i-1] } plot(ret,type="l",main="dax-returns")
# # calculate DAX-logL and make setup for contour-plot: # bsvol = seq( from=0.1/100 , to=4/100 , by=0.1/100 ) w0 = seq( from=0.05 , to=0.95 , by=0.05 ) nbs = length(bsvol) nw = length(w0) z = matrix( 0 , nrow=nbs , ncol=nw ) for(i in 1:nbs) { for(j in 1:nw ) { z[i,j] = logL( bsvol[i] , w0[j] ) } } # let's look at the result: contour( bsvol , w0 , z )
contour( bsvol , w0 , z , nlevels = 50 )
contour( bsvol , w0 , z , nlevels = 50 , zlim=c(8000,10000) )
contour( bsvol , w0 , z , nlevels = 50 , zlim=c(9000,9600) )
# we take a more refined look at the region w0 = 0.5 to 0.9 # and bsvol = 1.0% to 2.0%: bsvol = seq( from=1.0/100 , to=2.0/100 , by=0.02/100 ) w0 = seq( from=0.5 , to=0.9 , by=0.01 ) nbs = length(bsvol) nw = length(w0) z = matrix( 0 , nrow=nbs , ncol=nw ) for(i in 1:nbs) { for(j in 1:nw ) { z[i,j] = logL( bsvol[i] , w0[j] ) } } # let's look at the results: contour( bsvol , w0 , z , nlevels = 100 , zlim=c(9000,9600) )
contour( bsvol , w0 , z , nlevels = 100 , zlim=c(9500,9550) )
max(z) # 9534.98 logL( bsvol=sd(ret) , w0=0.72 ) # 9534.64 sd(ret) # 1.39% # thus: maximum approximately at # bsvol = sd(ret) = 1.39% and w0 = 0.72 # Before we redo the calculations for SPX and GE, we code # a small function GetZ which sets up the matrix z for # the contour-plots: GetZ = function( x , y ) { nx = length(x) ny = length(y) z = matrix( 0 , nrow=nx , ncol=ny ) for(i in 1:nx) { for(j in 1:ny) { z[i,j] = logL(x[i],y[j]) } } return(z) } --------------------------------------- # ARCH(1)-Fit to S&P500-Data: # --------------------------------------- # load SPX.txt data and generate return vector ret: spx = read.table("C:/Users/Admin/Courses/FinancialStatistics/SPX.txt",header=TRUE,sep=";") head(spx) summary(spx) # no NA's S = spx[,2] n = length(S) ret = rep(0,n) for(i in 2:n) { ret[i] = (S[i]-S[i-1])/S[i-1] } plot(ret,type="l",main="spx-returns") # returns look ok
# compute spx-logL and make setup for contour-plot: bsvol = seq( from=0.1/100 , to=4/100 , by=0.1/100 ) w0 = seq( from=0.05 , to=0.95 , by=0.05 ) z = GetZ( bsvol , w0 ) # let's look at the results: contour( bsvol , w0 , z )
contour( bsvol , w0 , z , nlevels = 50 )
contour( bsvol , w0 , z , nlevels = 50 , zlim=c(60000,80000) )
contour( bsvol , w0 , z , nlevels = 100 , zlim=c(60000,80000) )
# we take a more refined look at the region w0 = 0.5 to 0.9 # and bsvol = 0.5% to 1.5%: bsvol = seq( from=0.5/100 , to=1.5/100 , by=0.02/100 ) w0 = seq( from=0.5 , to=0.9 , by=0.01 ) z = GetZ( bsvol , w0 ) contour( bsvol , w0 , z , nlevels = 100 , zlim=c(60000,80000) )
contour( bsvol , w0 , z , nlevels = 100 , zlim=c(69000,70000) )
max(z) # 69471.9 logL( bsvol=sd(ret) , w0=0.69 ) # 69472.17 sd(ret) # 0.97% # thus: maximum approximately at # bsvol = sd(ret) = 0.97% and w0 = 0.69 --------------------------------------- # ARCH(1)-Fit to GE-Data: # --------------------------------------- # load GE.txt data and generate return vector ret: ge = read.table("C:/Users/Admin/Courses/FinancialStatistics/GE.txt",header=TRUE,sep=";") head(ge) summary(ge) # no NA's S = ge[,2] n = length(S) ret = rep(0,n) for(i in 2:n) { ret[i] = (S[i]-S[i-1])/S[i-1] } plot(ret,type="l",main="General Electric returns") # returns look ok:
# compute GE-logL and make setup for contour-plot: bsvol = seq( from=0.1/100 , to=4/100 , by=0.1/100 ) w0 = seq( from=0.05 , to=0.95 , by=0.05 ) z = GetZ( bsvol , w0 ) # let's look at the result: contour( bsvol , w0 , z )
contour( bsvol , w0 , z , nlevels = 50 )
contour( bsvol , w0 , z , nlevels = 50 , zlim=c(40000,50000) )
# we take a more refined look at the region w0 = 0.5 to 0.9 # and bsvol = 1.0% to 2.0%: bsvol = seq( from=1.0/100 , to=2.0/100 , by=0.02/100 ) w0 = seq( from=0.5 , to=0.9 , by=0.01 ) z = GetZ( bsvol , w0 ) contour( bsvol , w0 , z , nlevels = 50 , zlim=c(40000,50000) )
contour( bsvol , w0 , z , nlevels = 100 , zlim=c(49500,50000) )
max(z) # 49811.81 logL( bsvol=sd(ret) , w0=0.68 ) # 49811.89 sd(ret) # 1.62% # thus: maximum approximately at # bsvol = sd(ret) = 1.62% and w0 = 0.68