```
---------------------------------------------------
#                                                 #
#     Chapter 2: Typical Properties of            #
#           Financial Time Series                 #
#                                                 #
---------------------------------------------------

Returns of liquid tradable assets like stocks, stock
indices, currencies or commodities like oil, gold or
silver have some common typical properties which are
usually referred to as stylized facts. In the following
we list these stylized facts and illustrate them on
concrete data.

Let S_k denote the closing price of some liquid
tradable asset on day t_k. Its return ret(t_k) from
day t_{k-1} to day t_k is defined by

ret(t_k) := (S_k - S_{k-1}) / S_{k-1}                (1)

The d-day mean on day t_k is defined by

mean_d(t_k) := 1/d sum_{j=0}^{d-1} ret(t_{k-j})      (2)

There are the following so-called stylized facts:

Stylized Fact 1) For arbitrary time horizons d (larger
than 2 or 3 weeks) the d-day means are close to zero.

Because of Stylized Fact 1 the d-day standard deviation
(also called d-day volatility) on day t_k can be defined
by

stddev_d(t_k) := sqrt{ 1/d sum_{j=0}^{d-1} ret(t_{k-j})^2 }   (3)

Normalized quantities, in a probabilistic context, are
defined by (quantity - mean)/stddev which reduces to
quantity/stddev if the mean is zero. Thus, because of
Stylized Fact 1, we define normalized returns (for a
d-day time horizon) through

normret_d(t_k) := ret(t_k) / stddev_d(t_{k-1})       (4)

Then one can take a look at the distribution of the
normalized returns by making a histogram plot. One
finds:

Stylized Fact 2a) For large time horizon d (say, d>=250,
which corresponds to one year), the distribution of the
normalized returns is "leptokurtic", that is, there is
a more pronounced peak at zero and a faster decay
in a neighborhood of zero when compared to the standard
normal distribution. Furthermore there are "heavy tails":
there are some very large positive or negative normalized
returns which one would not see under the assumption of
a normal distribution.

Stylized Fact 2b) By making d smaller and finally setting
it to d=15 or d=20 (3 or 4 weeks), the distribution
of the normalized returns approaches more and more the
shape of a normal distribution. The occurrence of heavy
tails decreases, but the variability of the d-day stddev
or, equivalently, the variability of the d-day volatility
increases. This observation is the basis of the
stochastic volatility models.

Furthermore one finds the following:

Stylized Fact 3) Returns for different days t_k are
nearly uncorrelated. However, the absolute value or
the square of returns have a significant positive
correlation (which could be around 10%-20%).

Stylized Fact 4) "Volatility clustering": returns tend
to group themselves into clusters or phases with high
or low volatility; the returns are not distributed
uniformly over time. This is also a motivation
for stochastic volatility models.

Let's now demonstrate these facts on concrete data:
the file DAX.txt contains the daily closing prices of
the DAX for the time period 2005-2014, the file SPX.txt
contains the daily closings of the S&P500 for 1950-2015
and GE.txt contains the daily closing prices of the
General Electric stock from 1962 to 2015. This results
in approximately
DAX:  2500
SPX:  16500
GE:   13500
observation days. All data are taken from Yahoo Finance.

#
# Start R Session:
#
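# (assumes DAX.txt has already been read into the data frame dax,
# e.g. with read.table(); that call is not shown here)
dax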
tail(dax)
summary(dax)
str(dax)
# let's remove all rows with NAs:
dax = na.omit(dax)
summary(dax)

# in R there are basically 4 different data types: vectors,
# matrices, data frames and lists:
# dax is of type data frame
# the content of each column can be written into some vector:

names(dax)
S = dax$index              # equivalent to S = dax[,2]
S
str(S)
plot(S)
plot(S,type="l")
dates = as.Date(dax$dates)
plot(dates,S,type="l",main="DAX30, 2005-2014")
```
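As a quick sanity check of definitions (1) to (4), the following sketch evaluates them by hand on a made-up four-day price series. The prices and the variable names (mean3, stddev2_day3, normret_day4) are illustrative assumptions and have nothing to do with the DAX data.

```r
# made-up closing prices S_1..S_4 (illustration only, not market data)
S = c(100, 102, 99.96, 101.9592)
n = length(S)

# equation (1): ret(t_k) = (S_k - S_{k-1}) / S_{k-1}, for k = 2..n
ret = (S[2:n] - S[1:(n-1)]) / S[1:(n-1)]     # returns of days 2, 3 and 4

# equation (2) with d = 3 on day 4: average of the last 3 returns
mean3 = sum(ret[1:3]) / 3

# equation (3) with d = 2 on day 3: uses the returns of days 2 and 3
stddev2_day3 = sqrt(sum(ret[1:2]^2) / 2)

# equation (4): return of day 4 divided by the volatility known on day 3
normret_day4 = ret[3] / stddev2_day3

print(c(mean3 = mean3, stddev2_day3 = stddev2_day3, normret_day4 = normret_day4))
```

Here the three returns are +2%, -2% and +2%, so the 2-day volatility on day 3 is 2% and the normalized return of day 4 is 1.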

```
S[1]
n=length(S)
n
S[n]
S[n+1]                    # index out of range: R returns NA

# we calculate the returns:
ret = rep(0,n)            # vector of length n, all entries 0
for(i in 2:n)
{
ret[i] = (S[i]-S[i-1])/S[i-1]
}
plot(ret)
plot(dates,ret,type="l",main="dax-returns, 2005-2014")
```
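The explicit loop mirrors equation (1) step by step. A vectorized alternative based on diff() should give the same result; the sketch below compares both on a small made-up price vector (the prices and the names retLoop and retVec are assumptions for illustration).

```r
# made-up prices, illustration only
S = c(100, 102, 99.96, 101.9592, 101.9592)
n = length(S)

# loop version, as in the session above
retLoop = rep(0, n)
for (i in 2:n) {
  retLoop[i] = (S[i] - S[i-1]) / S[i-1]
}

# vectorized version: diff(S) holds S[i]-S[i-1] for i = 2..n,
# S[-n] holds the matching previous prices S[1..n-1]
retVec = c(0, diff(S) / S[-n])

print(all.equal(retLoop, retVec))
```

Both versions keep the convention that the first entry is 0, since no return is defined for the first day.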

```
# we code 3 functions for the d-day means, stddevs and normrets:

dDayMean = function( d , ret )
{
n = length(ret)
result = rep(0,n)
summe = 0

for(i in 1:n)
{
summe = summe + ret[i]
if(i > d)
{
summe = summe - ret[i-d]
result[i] = summe/d
}
else
{
result[i] = summe/i
}
}
return(result)
}

# let's take a look at Stylized Fact 1:
mean20 = dDayMean(20,ret)
mean60 = dDayMean(60,ret)
mean180 = dDayMean(180,ret)
plot(dates,mean20,main="means of dax-returns, d = 20, 60 and 180 days")
points(dates,mean60,col="red")
points(dates,mean180,col="yellow")
```
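The running-sum idea in dDayMean can also be written with cumsum(). The following is a minimal sketch under that assumption; rollMeanCumsum and naiveMean are hypothetical helper names, and the warm-up convention (divide by i while i <= d) is copied from dDayMean.

```r
# rolling d-day mean via cumulative sums (hypothetical helper)
rollMeanCumsum = function(d, x) {
  n = length(x)
  cs = cumsum(x)
  i = seq_len(n)
  lagged = c(rep(0, d), cs)[i]   # cs[i-d] for i > d, and 0 while i <= d
  (cs - lagged) / pmin(i, d)     # divide by i during warm-up, by d afterwards
}

# slow reference: mean over the window ending at i
naiveMean = function(d, x) {
  sapply(seq_along(x), function(i) mean(x[max(1, i - d + 1):i]))
}

x = c(1, 2, 3, 4, 5)
print(rollMeanCumsum(2, x))
print(all.equal(rollMeanCumsum(2, x), naiveMean(2, x)))
```

For very long series, differences of large cumulative sums can lose some floating point precision, so the loop with its running sum remains a reasonable choice.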

```
dDayStdDev = function( d , ret )
{
n = length(ret)
result = rep(0,n)
summe = 0

for(i in 1:n)
{
summe = summe + ret[i]*ret[i]
if(i > d)
{
summe = summe - ret[i-d]*ret[i-d]
result[i] = summe/d
}
else
{
result[i] = summe/i
}
}
# a theoretical 0 could numerically become slightly negative:
result = abs(result)     # we want to take a square root
result = sqrt(result)    # square root element by element
return(result)
}

# let's check:
stddev20 = dDayStdDev(20,ret)
stddev60 = dDayStdDev(60,ret)
stddev180 = dDayStdDev(180,ret)
stddev_n = dDayStdDev(n,ret)
plot(dates,stddev20,main="standard deviation of dax-returns,\n d = 20, 60, 180 and all=2497 days")
points(dates,stddev60,col="red")
points(dates,stddev180,col="yellow")
points(dates,stddev_n,col="green")
# apparently: volatility is not constant
```
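Stylized Facts 3 and 4 can be illustrated with the built-in acf() function. The sketch below does not use market data: it simulates returns that alternate between calm and turbulent phases (a made-up two-regime setup, chosen only for illustration) and compares the lag-1 autocorrelation of the returns with that of the squared returns.

```r
set.seed(1)

# synthetic returns: blocks of 250 days with low (0.5%) and high (2%) volatility
# (assumed numbers, not estimated from any data)
sigma = rep(rep(c(0.005, 0.02), each = 250), times = 10)
n = length(sigma)
ret = sigma * rnorm(n)

# lag-1 autocorrelations; element 1 of $acf is lag 0, element 2 is lag 1
acfRet = acf(ret,   lag.max = 1, plot = FALSE)$acf[2]
acfSq  = acf(ret^2, lag.max = 1, plot = FALSE)$acf[2]

print(c(returns = acfRet, squared_returns = acfSq))

# the clusters are also visible by eye
plot(ret, type = "l", main = "synthetic returns with volatility clusters")
```

With the session's data one could apply the same two acf() calls to the vector ret of DAX returns.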

```
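dDayNormRet = function( d , ret )
{
stddev = dDayStdDev(d,ret)
# stddev could be 0 if data are constant:
stddev = pmax(stddev,0.00000001)     # we want to divide by stddev
# note: this divides ret(t_k) by stddev_d(t_k), which already
# contains ret(t_k), whereas definition (4) uses stddev_d(t_{k-1})
result = ret/stddev                  # division element by element
return(result)
}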

normret15 = dDayNormRet(15,ret)
normret60 = dDayNormRet(60,ret)
normret250 = dDayNormRet(250,ret)
normret_n = ret/stddev_n[n]

par(mfrow=c(2,2),oma=c(0,0,2,0))       # set up 2 times 2 plot array,
                                       # that is, 4 pictures at once
hist(normret_n,breaks=50,xlim=c(-5,5),ylim=c(0,0.8),prob=TRUE,main="d = 2497 (all days)")
hist(normret250,breaks=50,xlim=c(-5,5),ylim=c(0,0.8),prob=TRUE,main="d = 250")
hist(normret60,breaks=50,xlim=c(-5,5),ylim=c(0,0.8),prob=TRUE,main="d = 60")
hist(normret15,breaks=40,xlim=c(-5,5),ylim=c(0,0.8),prob=TRUE,main="d = 15")
title(main="Histograms of d-Day Normalized DAX30 Returns",outer=TRUE)
```
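Definition (4) divides the return of day t_k by the volatility of the previous day t_{k-1}. A variant that follows this literally could look like the sketch below. The names rollStdDev and normRetLagged are hypothetical, and marking undefined entries as NA is a design choice of this sketch rather than something taken from the text.

```r
# rolling d-day stddev as in equation (3), with the warm-up convention of
# dDayStdDev (divide by i while i <= d); hypothetical helper
rollStdDev = function(d, ret) {
  n = length(ret)
  cs = cumsum(ret^2)
  i = seq_len(n)
  lagged = c(rep(0, d), cs)[i]
  sqrt(pmax((cs - lagged) / pmin(i, d), 0))   # pmax guards against tiny negatives
}

# normalized returns with the volatility of the previous day, as in (4)
normRetLagged = function(d, ret) {
  n = length(ret)
  sdPrev = c(NA, rollStdDev(d, ret)[-n])      # stddev_d(t_{k-1}); none for day 1
  sdPrev[!is.na(sdPrev) & sdPrev <= 0] = NA   # no division by zero
  ret / sdPrev
}

ret = c(0, 0.01, -0.02, 0.01)                 # made-up returns
print(normRetLagged(2, ret))
```

The NA entries do no harm in the histograms, because hist() only uses finite values.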

```
# thus: if returns are normalized with more recent volatility
# data, then normalized returns are more Gaussian.
#
# Heavy tails are less likely to occur, since the normalization
# can take care of a recent increase in volatility, if the
# normalization is calculated from more recent data instead of
# using the whole time horizon of 10 or 20 years.

----------------------------------------------
#    let's redo this for SPX, since 1950:    #
----------------------------------------------

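# (assumes SPX.txt has already been read into the data frame spx,
# e.g. with read.table(); that call is not shown here)
tail(spx)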
summary(spx)
str(spx)                    # no NA's!

names(spx)
S = spx[,2]                 # equivalent to S = spx$Adj.Close
dates = as.Date(spx$Date,format="%d-%m-%y")
dates                       # dates prior to 1969 have wrong year (%y maps 00-68 to 20xx)
datecorr = as.Date("1968-12-31") - as.Date("2068-12-31")
dates[1:4750] = dates[1:4750] + datecorr
dates
plot(dates,S,type="l",main="S&P500, 1950 - 2015\n linear scale")
```
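The year correction above relies on a hard-coded row count. A sketch that finds the affected rows by a condition instead is shown below; it assumes, as stated in the text, that the data end in 2015, so any parsed date after that must come from a two-digit year that was mapped to the wrong century. The sample strings are made up.

```r
# made-up date strings in the day-month-2-digit-year format of the files
raw = c("03-01-50", "31-12-68", "02-01-69", "15-06-15")
dates = as.Date(raw, format = "%d-%m-%y")     # %y maps 00-68 to 20xx, 69-99 to 19xx

# same shift as in the session: exactly 100 years, i.e. -36525 days
datecorr = as.Date("1968-12-31") - as.Date("2068-12-31")

# assumption: no genuine observation lies after 2015
wrong = dates > as.Date("2015-12-31")
dates[wrong] = dates[wrong] + datecorr

print(dates)
```

This way the same lines work for SPX.txt and GE.txt without counting the rows before 1969.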

```
plot(dates,S,type="l",log="y",main="S&P500, 1950 - 2015\n log scale")  # straight line would be constant exponential growth
```

```
n=length(S)
n

# we calculate the returns:
ret = rep(0,n)
for(i in 2:n)
{
ret[i] = (S[i]-S[i-1])/S[i-1]
}
plot(dates,ret,type="l",main="spx-returns, 1950 - 2015")
```

```
# let's take a look at Stylized Fact 1:
mean20 = dDayMean(20,ret)
mean60 = dDayMean(60,ret)
mean180 = dDayMean(180,ret)
plot(dates,mean20,main="means of spx-returns, d = 20, 60 and 180 days")
points(dates,mean60,col="red")
points(dates,mean180,col="yellow")
```

```
# stddev's = daily volatilities:
stddev20 = dDayStdDev(20,ret)
stddev60 = dDayStdDev(60,ret)
stddev180 = dDayStdDev(180,ret)
stddev_n = dDayStdDev(n,ret)
plot(dates,stddev20,main="standard deviation of spx-returns,\n d = 20, 60, 180 and all=16553 days")
points(dates,stddev60,col="red")
points(dates,stddev180,col="yellow")
points(dates,stddev_n,col="green")
# apparently: volatility is not constant
```

```
# normalized returns:
normret15 = dDayNormRet(15,ret)
normret60 = dDayNormRet(60,ret)
normret250 = dDayNormRet(250,ret)
normret_n = ret/stddev_n[n]

par(mfrow=c(2,2),oma=c(0,0,2,0))       # set up 2 times 2 plot array,
                                       # that is, 4 pictures at once
hist(normret_n,breaks=150,xlim=c(-5,5),ylim=c(0,0.8),prob=TRUE,main="d = 16553 (all days)")
hist(normret250,breaks=100,xlim=c(-5,5),ylim=c(0,0.8),prob=TRUE,main="d = 250")
hist(normret60,breaks=50,xlim=c(-5,5),ylim=c(0,0.8),prob=TRUE,main="d = 60")
hist(normret15,breaks=50,xlim=c(-5,5),ylim=c(0,0.8),prob=TRUE,main="d = 15")
title(main="Histograms of d-Day Normalized S&P500 Returns",outer=TRUE)
```
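To compare a histogram with the standard normal distribution more directly, one can overlay the normal density and compute the sample kurtosis (3 for a normal distribution, larger for leptokurtic data). The sketch below uses simulated data only: a scaled t-distribution with 5 degrees of freedom stands in for heavy-tailed normalized returns, which is an assumption for illustration. The kurtosis helper is hypothetical and not a base R function.

```r
# sample kurtosis: fourth central moment over squared variance (hypothetical helper)
kurtosis = function(x) {
  x = x[is.finite(x)]
  m = mean(x)
  mean((x - m)^4) / mean((x - m)^2)^2
}

set.seed(1)
z     = rnorm(5000)                      # Gaussian reference sample
heavy = rt(5000, df = 5) * sqrt(3/5)     # t(5) has variance 5/3, rescaled to variance 1

print(c(normal = kurtosis(z), heavy = kurtosis(heavy)))

# histogram with the standard normal density on top
hist(heavy, breaks = 100, xlim = c(-5, 5), ylim = c(0, 0.8), prob = TRUE,
     main = "simulated heavy-tailed sample vs. N(0,1)")
curve(dnorm(x), from = -5, to = 5, add = TRUE, col = "red")
```

In the session one might call kurtosis(normret250) and kurtosis(normret15) and add the same curve() line after each hist() call.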

```
# thus: if returns are normalized with more recent volatility
# data, then normalized returns are more Gaussian

-----------------------------------------------
#    finally we take a single stock:          #
#    General Electric Company, since 1962:    #
-----------------------------------------------
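# (assumes GE.txt has already been read into the data frame GE,
# e.g. with read.table(); that call is not shown here)
tail(GE)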
summary(GE)                 # no NA's
names(GE)
S = GE[,2]
dates = as.Date(GE$Date,format="%d-%m-%y")
dates                       # dates prior to 1969 have wrong year (%y maps 00-68 to 20xx)
datecorr = as.Date("1968-12-31") - as.Date("2068-12-31")
dates[1:1737] = dates[1:1737] + datecorr
dates
plot(dates,S,type="l",main="General Electric, 1962 - 2015\n linear scale")
```

```
plot(dates,S,type="l",log="y",main="General Electric, 1962 - 2015\n log scale")  # straight line would be constant exponential growth
```

```
n=length(S)
n

# we calculate the returns:
ret = rep(0,n)
for(i in 2:n)
{
ret[i] = (S[i]-S[i-1])/S[i-1]
}
plot(dates,ret,type="l",main="GE-returns, 1962 - 2015")
```

```
# let's take a look at Stylized Fact 1:
mean20 = dDayMean(20,ret)
mean60 = dDayMean(60,ret)
mean180 = dDayMean(180,ret)
plot(dates,mean20,main="means of GE-returns, d = 20, 60 and 180 days")
points(dates,mean60,col="red")
points(dates,mean180,col="yellow")
```

```
# stddev's = daily volatilities:
stddev20 = dDayStdDev(20,ret)
stddev60 = dDayStdDev(60,ret)
stddev180 = dDayStdDev(180,ret)
stddev_n = dDayStdDev(n,ret)
plot(dates,stddev20,main="standard deviation of GE-returns,\n d = 20, 60, 180 and all=13540 days")
points(dates,stddev60,col="red")
points(dates,stddev180,col="yellow")
points(dates,stddev_n,col="green")
# apparently: volatility is not constant
```

```
# normalized returns:
normret15 = dDayNormRet(15,ret)
normret60 = dDayNormRet(60,ret)
normret250 = dDayNormRet(250,ret)
normret_n = ret/stddev_n[n]

par(mfrow=c(2,2),oma=c(0,0,2,0))       # set up 2 times 2 plot array,
                                       # that is, 4 pictures at once
hist(normret_n,breaks=150,xlim=c(-5,5),ylim=c(0,0.8),prob=TRUE,main="d = 13540 (all days)")
hist(normret250,breaks=100,xlim=c(-5,5),ylim=c(0,0.8),prob=TRUE,main="d = 250")
hist(normret60,breaks=50,xlim=c(-5,5),ylim=c(0,0.8),prob=TRUE,main="d = 60")
hist(normret15,breaks=50,xlim=c(-5,5),ylim=c(0,0.8),prob=TRUE,main="d = 15")
title(main="Histograms of d-Day Normalized GE Returns",outer=TRUE)
```
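The "heavy tails" of Stylized Fact 2a can also be expressed as a number: the fraction of normalized returns beyond, say, 3 standard deviations, compared with the probability a standard normal distribution assigns to that region. The helper name tailFraction and the cutoff of 3 are assumptions of this sketch.

```r
# fraction of observations with absolute value above the cutoff (hypothetical helper)
tailFraction = function(x, cutoff = 3) {
  x = x[is.finite(x)]
  mean(abs(x) > cutoff)
}

# probability of |Z| > 3 for a standard normal Z, roughly 0.27 percent
normalTail = 2 * pnorm(-3)
print(normalTail)

# toy check on a made-up vector: 2 of 4 values exceed 3 in absolute value
print(tailFraction(c(0, 0, 4, -5)))

# a simulated Gaussian sample should come out near normalTail
set.seed(1)
print(tailFraction(rnorm(10000)))
```

Applied to the session's vectors, tailFraction(normret_n) and tailFraction(normret15) could then be compared with normalTail.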

```
# thus: if returns are normalized with more recent volatility
# data, then normalized returns are more Gaussian

-------------------------------------------
#            Main   Conclusion:           #
-------------------------------------------

The normalized returns

normret_d(t_k) = ret(t_k) / stddev_d(t_{k-1})                (4)

may be approximated pretty well by a standard normal
distribution if the time horizon d is not chosen too large:
d=15 or d=20 days turned out to be quite reasonable choices.
Thus we can write

ret(t_k) / stddev_d(t_{k-1}) = phi_k                         (5)

with phi_k being a standard normal random number.
Equation (5) is equivalent to

S_k = S_{k-1} * [ 1 + stddev_d(t_{k-1}) * phi_k ]            (6)

Using the data which are known on day t_{k-1} and by drawing
a normally distributed random number, this equation gives us
a price for day t_k.

Now, is this a reasonable stochastic model? To find out, we
will simulate a couple of paths with price dynamics given
by (6) in the next chapter, and we will find that the model
actually cannot be used as exactly given by (6), but we have