frequency, the spread for BJS is equal to 1 tick in 90.5% of all data points and to 2 ticks otherwise; for MSFT, the spread is equal to 1 tick in 96.6% of all data points, otherwise being equal to 2 ticks.
5.3 Calibration
We henceforth assume that we have a cleaned dataset of the form $(t_i, \beta(t_i), \alpha(t_i), z(t_i))$, $i = 1, \ldots, N$. We want to calibrate a continuous diffusion model to high-frequency financial market data, so we are faced with a number of problems:
- The data is observed at discrete time points.
- Time points are not evenly spaced.
- Observations are made at extremely high frequency; sometimes there are only a few milliseconds between two trades.
- Observations reflect typical market microstructure noise due to, e.g., the discreteness of price/volume changes.
We need to disentangle the different volatility terms described in section 4.1, that is, microstructure volatility caused by the spread, microstructure volatility caused by volume imbalance, and exogenous volatility.
5.3.1 Estimation of volatility parameters
We start by estimating the volatility parameters $\sigma_0$, $\sigma_1$ and $\sigma_2$. If we naively try to estimate them using every data point, the market microstructure noise introduces a bias. In fact, instead of estimating the volatility of the process we are interested in, we would estimate the volatility of the noise, scaled by the number of observations, as has been noticed in Ait-Sahalia, Zhang, and Mykland (2005, section 2.2). Thus we would obtain estimates for $\sigma_i$, $i = 0, 1, 2$, which are far too large. The microstructure noise is mainly due to the discreteness of price ticks and other microstructure effects which are not explicitly included in our model, stemming, for example, from price ticks beyond the best quotes. To overcome this problem, we use the method of two scales realized volatility, which was introduced in Ait-Sahalia, Zhang, and Mykland (2005) as the first-best approach for volatility estimation of noisy high-frequency data, and which we briefly describe below:
Suppose we want to estimate the quadratic variation $[X]$ of a process $X$, and we observe the noisy process $Y_{t_i} = X_{t_i} + \epsilon_{t_i}$ at times $t_i$, $i = 1, \ldots, N$, where the $\epsilon_{t_i}$'s are independent noise around the true value of $X_{t_i}$. We proceed in 3 steps:
Partition the original time grid $G = \{t_1, \ldots, t_N\}$ into subsamples $G^{(k)}$, $k = 1, \ldots, K$, with $G^{(k)} = \{ t_i^{(k)} \mid i = 1, \ldots, N^{(k)} \}$, $t_i^{(k)} \le t_{i+1}^{(k)}$, and $N/K \to \infty$ as $N \to \infty$.
[Figure 5.1: Overview of midquote price, spread and volume imbalance for each stock; panels (a) AAPL, (b) BJS, (c) MSFT.]
For each subsample $G^{(k)}$, compute
$$\widehat{[X]}^{(k)} = \sum_{i=1}^{N^{(k)}} \left( X_{t_{i+1}^{(k)}} - X_{t_i^{(k)}} \right)^2.$$
Then compute
$$\widehat{[X]}^{\mathrm{all}} = \sum_{i=1}^{N} \left( X_{t_{i+1}} - X_{t_i} \right)^2 \qquad \text{and} \qquad \widehat{[X]}^{\mathrm{avg}} = \frac{1}{K} \sum_{k=1}^{K} \widehat{[X]}^{(k)}.$$
Finally, compute a combination of the estimators obtained over the two time scales 'all' and 'avg', the bias-adjusted estimator
$$\widehat{[X]} = \widehat{[X]}^{\mathrm{avg}} - \frac{1}{K} \widehat{[X]}^{\mathrm{all}}.$$
The estimator $\widehat{[X]}$ is shown to have desirable properties in Ait-Sahalia, Zhang, and Mykland (2005). In particular, one can show that the estimator is correctly centered and converges at rate $n^{1/6}$ (see Ait-Sahalia, Zhang, and Mykland, 2005, Theorem 4). Using this method, we obtain estimates for $\sigma_1$ and $\sigma_2$.
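The three steps above are compact enough to sketch in a few lines. The following Python snippet is not part of the thesis's MATLAB code in appendix A.1; it is an illustrative sketch of the TSRV estimator on simulated data, with made-up parameter values:

```python
import numpy as np

def tsrv(y, K):
    """Two scales realized volatility estimate of [X] from noisy
    observations y (Ait-Sahalia, Zhang, and Mykland, 2005): average
    the realized variance over K offset subgrids and bias-adjust
    with the full-grid realized variance."""
    rv_all = np.sum(np.diff(y) ** 2)                  # 'all' time scale
    rv_avg = np.mean([np.sum(np.diff(y[k::K]) ** 2)   # 'avg' time scale
                      for k in range(K)])
    return rv_avg - rv_all / K                        # bias-adjusted estimator

# Illustration on simulated data: Brownian motion plus i.i.d. noise.
rng = np.random.default_rng(0)
n, sigma, noise_sd = 20_000, 0.2, 0.01
x = np.cumsum(sigma * np.sqrt(1.0 / n) * rng.standard_normal(n))
y = x + noise_sd * rng.standard_normal(n)
est = tsrv(y, K=100)  # true [X] over [0, 1] is sigma**2 = 0.04
```

On this simulated path the naive full-grid realized variance is dominated by the noise term of order $2n\,\mathrm{Var}(\epsilon)$, while the TSRV estimate stays close to the true quadratic variation.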
It remains to estimate the parameter $\sigma_0$ in (5.2). We need to proceed carefully, because the infinite variation part of the midquote price is composed of
- the constant volatility term $\frac{\sigma_0}{2}\, dB_0(t)$,
- the stochastic volatility term, depending on the current spread, $\frac{\sigma_1}{2\sqrt{2}} \sqrt{s(t)}\, dB_2(t)$, and
- additional volatility coming from market microstructure noise.
Again using the method of two scales realized volatility, we obtain estimators $\widehat{[m]}(t)$ for $[m](t)$ and $\widehat{[s]}(t)$ for $[s](t)$. Now note that
$$[s](t) = \frac{\sigma_1^2}{2} \int_0^t s(u)\, du, \qquad [m](t) = \frac{\sigma_1^2}{8} \int_0^t s(u)\, du + \frac{\sigma_0^2}{4} t,$$
so we obtain a good estimator for $\sigma_0$ by setting
$$\hat{\sigma}_0 = \sqrt{\frac{4 \widehat{[m]}(t) - \widehat{[s]}(t)}{t}}. \tag{5.4}$$
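As a quick numerical illustration of (5.4) (with hypothetical TSRV estimates, not the fitted values reported later in Table 5.3), the computation is a one-liner:

```python
import numpy as np

# Hypothetical TSRV estimates of [m](t) and [s](t) over a trading day
# of t = 23,400 seconds; the numbers are illustrative only.
qv_m, qv_s, t = 2.6e-5, 4.0e-5, 23_400.0
sigma0_hat = np.sqrt((4.0 * qv_m - qv_s) / t)  # equation (5.4)
```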
5.3.2 Estimation of CIR process
Suppose we have data $(t_i, s(t_i))$, $i = 1, \ldots, N$, and we assume that $s(t)$ is a realization of a CIR process given by the SDE
$$ds(t) = \kappa(\mu - s(t))\, dt + \sigma \sqrt{s(t)}\, dW(t),$$
where $W$ is a standard Brownian motion and $\sigma$ is known. We will estimate the parameters $\vartheta = (\kappa, \mu) \in \Theta = \mathbb{R}_+ \times \mathbb{R}_+$ using the method of maximum likelihood. Given $s(t)$ at time $t$, let
$$p(s(t + \Delta t) \mid s(t); \vartheta, \Delta t)$$
be the conditional density of $s(t + \Delta t)$ at time $t + \Delta t$. A maximum likelihood estimator $\hat{\vartheta}$ is defined as
$$\hat{\vartheta} = \arg\max_{\vartheta \in \Theta} l(\vartheta),$$
where $l$ is the log-likelihood function
$$l(\vartheta) = \sum_{i=1}^{N-1} \log p(s(t_{i+1}) \mid s(t_i); \vartheta, t_{i+1} - t_i).$$
The transition density of the CIR process is
$$p(s(t + \Delta t) \mid s(t); \vartheta, \Delta t) = c\, e^{-u - v} \left( \frac{v}{u} \right)^{q/2} I_q(2\sqrt{uv}),$$
where $I_q$ is the modified Bessel function of the first kind of order $q$, and
$$c = \frac{2\kappa}{\sigma^2 (1 - e^{-\kappa \Delta t})}, \qquad u = c\, s(t)\, e^{-\kappa \Delta t}, \qquad v = c\, s(t + \Delta t), \qquad q = \frac{2\kappa\mu}{\sigma^2} - 1.$$
As pointed out in Kladivko (2007), we use a scaled version of the Bessel function, $I_q^1(x) = e^{-x} I_q(x)$, when implementing the log-likelihood function in MATLAB to ensure stability of the optimisation. For a MATLAB implementation of the CIR log-likelihood function see appendix A.1.
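For readers without MATLAB, the same log-likelihood can be sketched in Python using SciPy's exponentially scaled Bessel function `ive` (the counterpart of MATLAB's `besseli(q, x, 1)`). This is a sketch under our own naming and layout choices, assuming observations at a constant time step, and is not the appendix A.1 code:

```python
import numpy as np
from scipy.special import ive  # ive(q, x) = exp(-x) * I_q(x), scaled Bessel

def cir_neg_loglik(theta, s, dt, sigma):
    """Negative log-likelihood of an observed CIR path s (sampled at
    constant step dt, sigma known) for theta = (kappa, mu), written
    with the scaled Bessel function for numerical stability."""
    kappa, mu = theta
    s0, s1 = s[:-1], s[1:]
    c = 2.0 * kappa / (sigma**2 * (1.0 - np.exp(-kappa * dt)))
    u = c * s0 * np.exp(-kappa * dt)
    v = c * s1
    q = 2.0 * kappa * mu / sigma**2 - 1.0
    x = 2.0 * np.sqrt(u * v)
    # log p = log c - u - v + (q/2) log(v/u) + log I_q(x),
    # where log I_q(x) = log ive(q, x) + x avoids overflow of I_q.
    logp = np.log(c) - u - v + 0.5 * q * np.log(v / u) + np.log(ive(q, x)) + x
    return -np.sum(logp)
```

The negative log-likelihood can then be passed to any standard minimizer (e.g. `scipy.optimize.minimize`), started from the OLS initial estimate described below.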
Any numerical method used to find the optimal $\hat{\vartheta}$ (e.g., gradient search or derivative-free methods) requires a good initial estimate. As suggested in Kladivko (2007), we use an ordinary least squares (OLS) approximation to obtain the initial estimate; for this, consider a discretized version of the CIR process:
$$s(t + \Delta t) - s(t) = \kappa(\mu - s(t))\, \Delta t + \sigma \sqrt{s(t)}\, \epsilon(\Delta t), \tag{5.5}$$
where $\epsilon(\Delta t)$ is normally distributed with zero mean and variance $\Delta t$. Transforming (5.5), we obtain
$$\frac{s(t + \Delta t) - s(t)}{\sqrt{s(t)}} - \frac{\kappa\mu \Delta t}{\sqrt{s(t)}} + \kappa \sqrt{s(t)}\, \Delta t = \sigma \epsilon(\Delta t),$$
and we take as initial estimate for $\kappa, \mu$ the minimizer of the OLS objective function
$$(\hat{\kappa}, \hat{\mu}) = \arg\min_{\kappa, \mu} \sum_{i=1}^{N-1} \left( \frac{s(t_{i+1}) - s(t_i)}{\sqrt{s(t_i)}} - \frac{\kappa\mu \Delta t_i}{\sqrt{s(t_i)}} + \kappa \sqrt{s(t_i)}\, \Delta t_i \right)^2,$$
where $\Delta t_i = t_{i+1} - t_i$. Appendix A.1 contains the corresponding MATLAB code.
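Because the transformed regression is linear in $\kappa\mu$ and $\kappa$, the minimizer has a closed form via linear least squares. A possible Python sketch (the helper name and interface are our own, not from appendix A.1):

```python
import numpy as np

def cir_ols_init(t, s):
    """OLS initial estimates (kappa, mu) for the CIR drift: regress the
    transformed increments (s_{i+1}-s_i)/sqrt(s_i) on dt_i/sqrt(s_i)
    (coefficient kappa*mu) and -sqrt(s_i)*dt_i (coefficient kappa)."""
    dt = np.diff(t)
    s0 = s[:-1]
    y = np.diff(s) / np.sqrt(s0)
    X = np.column_stack([dt / np.sqrt(s0), -np.sqrt(s0) * dt])
    (km, kappa0), *_ = np.linalg.lstsq(X, y, rcond=None)
    return kappa0, km / kappa0  # (kappa_hat, mu_hat)
```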
5.3.3 Estimation of OU process
Suppose we have data $(t_i, z(t_i))$, $i = 1, \ldots, N$, and it is assumed that $z(t)$ is a realization of an OU process given by the SDE
$$dz(t) = \rho(\nu - z(t))\, dt + \sigma\, dW(t),$$
where $W$ is a standard Brownian motion and $\sigma$ is known. As in the CIR case, we estimate the parameters $\vartheta = (\rho, \nu) \in \Theta = \mathbb{R}_+ \times \mathbb{R}$ using the method of maximum likelihood. Again, the transition density is known in closed form:
$$p(z(t + \Delta t) \mid z(t); \vartheta, \Delta t) = \frac{1}{\sqrt{2\pi \tilde{\sigma}^2}} \exp\left\{ -\frac{\left( z(t + \Delta t) - z(t) e^{-\rho \Delta t} - \nu (1 - e^{-\rho \Delta t}) \right)^2}{2 \tilde{\sigma}^2} \right\},$$
with $\tilde{\sigma}^2 = \sigma^2 \frac{1 - e^{-2\rho \Delta t}}{2\rho}$. To obtain a good initial estimate, we again apply the OLS method. The discretized version of the OU process reads
$$z(t + \Delta t) - z(t) = \rho(\nu - z(t))\, \Delta t + \sigma \epsilon(\Delta t), \tag{5.6}$$
where $\epsilon(\Delta t)$ is normally distributed with zero mean and variance $\Delta t$. Minimizing the OLS objective function, we find initial estimates for the drift parameters
$$(\hat{\rho}, \hat{\nu}) = \arg\min_{\rho, \nu} \sum_{i=1}^{N-1} \left( z(t_{i+1}) - z(t_i) - \rho\nu \Delta t_i + \rho z(t_i) \Delta t_i \right)^2,$$
with $\Delta t_i = t_{i+1} - t_i$. For a MATLAB implementation see appendix A.1.
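In the same spirit as the CIR case, the exact OU log-likelihood is straightforward to implement from the Gaussian transition density above. A Python sketch (our own function name, assuming a constant time step; not the appendix A.1 code):

```python
import numpy as np

def ou_neg_loglik(theta, z, dt, sigma):
    """Negative log-likelihood of an observed OU path z (constant step
    dt, sigma known) for theta = (rho, nu), using the exact Gaussian
    transition density with variance sigma^2 (1 - e^{-2 rho dt}) / (2 rho)."""
    rho, nu = theta
    z0, z1 = z[:-1], z[1:]
    e = np.exp(-rho * dt)
    mean = z0 * e + nu * (1.0 - e)
    var = sigma**2 * (1.0 - e**2) / (2.0 * rho)
    return 0.5 * np.sum(np.log(2.0 * np.pi * var) + (z1 - mean) ** 2 / var)
```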
5.3.4 Estimation of midquote price
Consider a discretized version of (5.2):
$$m(t + \Delta t) - m(t) = \frac{1}{2}(d_1 + d_2 z(t))\, \Delta t + \epsilon(\Delta t, s(t)),$$
where $\epsilon(\Delta t, s(t))$ is normally distributed with mean zero and variance $\frac{\sigma_1^2 s(t)\, \Delta t}{8} + \frac{\sigma_0^2\, \Delta t}{4}$. The parameters $d_1$ and $d_2$ are estimated by a simple ordinary least squares optimization
$$(\hat{d}_1, \hat{d}_2) = \arg\min_{d_1, d_2} \sum_{i=1}^{N-1} \left( m(t_{i+1}) - m(t_i) - \frac{1}{2} d_1 \Delta t_i - \frac{1}{2} d_2 z(t_i) \Delta t_i \right)^2.$$
Again, the MATLAB code can be found in appendix A.1.
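Since the objective is an ordinary least squares problem in $(d_1, d_2)$, it reduces to a single call to a linear solver. A possible Python sketch (helper name is our own, not from appendix A.1):

```python
import numpy as np

def drift_ols(t, m, z):
    """OLS estimates (d1, d2) of the midquote drift parameters: regress
    the midquote increments m_{i+1} - m_i on the regressors
    (1/2) dt_i and (1/2) z_i dt_i."""
    dt = np.diff(t)
    X = np.column_stack([0.5 * dt, 0.5 * z[:-1] * dt])
    (d1, d2), *_ = np.linalg.lstsq(X, np.diff(m), rcond=None)
    return d1, d2
```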
5.3.5 Calibration algorithm
The following pseudocode illustrates the calibration algorithm.
Input: Order book data $(t_i, \beta_i, \alpha_i, z_i)_{i=1}^N$.
Output: Estimators for the model parameters $(\hat{\kappa}, \hat{\mu}, \hat{\sigma}_1, \hat{\rho}, \hat{\nu}, \hat{\sigma}_2, \hat{d}_1, \hat{d}_2, \hat{\sigma}_0)$.

for $i = 1$ to $N$
    $s_i = \alpha_i - \beta_i$; $m_i = (\alpha_i + \beta_i)/2$;
end
Compute $\widehat{[s]}^{\mathrm{avg}}, \widehat{[m]}^{\mathrm{avg}}, \widehat{[z]}^{\mathrm{avg}}$; (section 5.3.1)
Compute $\widehat{[s]}^{\mathrm{all}}, \widehat{[m]}^{\mathrm{all}}, \widehat{[z]}^{\mathrm{all}}$; (section 5.3.1)
$\widehat{[s]} = \widehat{[s]}^{\mathrm{avg}} - \widehat{[s]}^{\mathrm{all}}/K$;
$\widehat{[m]} = \widehat{[m]}^{\mathrm{avg}} - \widehat{[m]}^{\mathrm{all}}/K$;
$\widehat{[z]} = \widehat{[z]}^{\mathrm{avg}} - \widehat{[z]}^{\mathrm{all}}/K$;
$\hat{\sigma}_1 = \left( 2 \widehat{[s]} / \int_{t_1}^{t_N} s(u)\, du \right)^{1/2}$;
$\hat{\sigma}_2 = \left( 2 \widehat{[z]} / (t_N - t_1) \right)^{1/2}$;
$\hat{\sigma}_0 = \left( (4 \widehat{[m]} - \widehat{[s]}) / (t_N - t_1) \right)^{1/2}$;
$(\hat{\kappa}, \hat{\mu}) = \arg\min_{\kappa, \mu} \left\{ -\mathrm{LogLikeCIR}(\kappa, \mu; \hat{\sigma}_1, (t_i)_{i=1}^N, (s_i)_{i=1}^N) \right\}$;
$(\hat{\rho}, \hat{\nu}) = \arg\min_{\rho, \nu} \left\{ -\mathrm{LogLikeOU}(\rho, \nu; \hat{\sigma}_2, (t_i)_{i=1}^N, (z_i)_{i=1}^N) \right\}$;
$(\hat{d}_1, \hat{d}_2) = \arg\min_{d_1, d_2} \left\{ \sum_{i=1}^{N-1} \left( m_{i+1} - m_i - \frac{1}{2} d_1 \Delta t_i - \frac{1}{2} d_2 z_i \Delta t_i \right)^2 \right\}$;

Parameter |    AAPL     |    BJS      |   MSFT
----------|-------------|-------------|------------
κ         | 7.4×10^-5   | 1.2×10^-2   | 2.6×10^-4
µ         | 3.2×10^-2   | 1.1×10^-2   | 1.0×10^-2
σ1        | 1.2×10^-3   | 3.7×10^-4   | 3.7×10^-4
ρ         | 8.7×10^-5   | 3.6×10^-5   | 5.6×10^-5
ν         | −8.9×10^-2  | −1.1×10^-1  | −9.7×10^-3
σ2        | 2.0×10^-2   | 1.8×10^-2   | 2.3×10^-2
d1        | −2.3×10^-7  | 2.1×10^-8   | −7.2×10^-8
d2        | 1.9×10^-6   | 8.0×10^-8   | 1.4×10^-7
σ0        | 9.5×10^-4   | 1.2×10^-4   | 1.4×10^-4

Table 5.3: Parameters obtained by the estimation procedure
The detailed MATLAB code can be found in appendix A.1.
5.3.6 Calibration results
Table 5.3 shows the parameters obtained by the algorithms described above. Note that we have d2 > 0 in all cases, which is consistent with the order flow assumption (X6) regarding volume-imbalance in section 3.1.
Also note that the algorithm achieves the objectives we set at the beginning of the chapter:
- The algorithm is numerically stable, provided that we use stable optimization routines to find the minima of the log-likelihood functions (such routines certainly exist for the two-dimensional problems we consider here).