Volatility and duration models for financial intraday data: formulation, estimation and evaluation

This paper develops and empirically tests count-data models for high frequency data, namely the BIN(1,1) model driven by a Poisson process, in order to check whether this model captures the clustering phenomenon observed in high frequency intraday stock data. Estimating the model first on a data generating process (DGP), and then on actual data for three stocks traded on the NYSE (BOEING, DISNEY, and AWK), yields good results that validate the model, both for generalisation to BIN(n,n) and for further work on density forecasting. We study the adequacy of BIN models for capturing financial market activity in intraday stock data (volumes, quotes, prices), and their usefulness for forecasting the evolution of that activity.


INTRODUCTION
It is usual to find time series consisting of count data. Such series record the number of events of a particular type occurring in a given interval. Since the data must consist of non-negative integers, a model based on the normal distribution is not appropriate, although it might provide a reasonable approximation if the number of events observed in each time period is relatively large. For small numbers of observations the appropriate distribution is the binomial, but for a large number of observations it is the Poisson; hence the Poisson process should be used for count data. The conditional distribution is important for the analysis of log-linear models, and it leads to an analysis based on the multinomial distribution. Financial events such as trades can be studied through autoregressive conditional duration (ACD) models, while the so-called BIN models for count data deal with the number of high frequency events (such as trades) occurring during fixed durations.

The autoregressive conditional duration (ACD) model of Engle and Russell (1998) is one of the most important duration models in the econometric literature. It is formulated as $x_i = \psi_i \varepsilon_i$, where $\psi_i = E[x_i \mid F_{i-1}]$ is the conditional expected waiting time and $\{\varepsilon_i\}$ is an i.i.d. positive innovation sequence; for the ACD(1,1), $\psi_i = \omega + \alpha x_{i-1} + \beta \psi_{i-1}$. In practice Engle and Russell (1998) used an exponential or a Weibull distribution for the $\{\varepsilon_i\}$. A straightforward alternative structure is to parameterize $\log \psi_i$ instead of $\psi_i$; this formulation, called logACD and proposed by Bauwens and Giot, avoids imposing constraints on the parameters.
The two types of models are applied to high frequency data, particularly financial data. The ACD models study the distribution of the duration between events (quote, trade, volume or price durations), while the BIN models focus on the distribution of the number of events during a fixed length of time. The two types of models thus study two faces of the same reality, and they should be considered complementary rather than substitutes.
The aim of this survey is to study the degree of relevance of the BIN(1,1), the autoregressive form of the BIN models; in other words, its degree of explanation of financial market events. In their paper, Rydberg and Shephard (2000) argue that, for modelling and forecasting securities price changes on the stock market, one can focus on the $N_i$, which are the count data.

COUNT DATA MODEL: BIN MODELS
In their paper, Rydberg and Shephard (2000) proposed to model an asset price $p(r)$ at time $r$ using a compound Poisson process, $p(r) = \sum_{t=1}^{N(r)} z_t$, where $N(r)$ is the number of trades recorded up until $r$ and $z_t$ is the price movement, or change, associated with the $t$-th trade. Rydberg and Shephard (2000) specified $N(r)$ to be a counting process, modelled as a Cox process, that is, a Poisson process with a random intensity. From an economic viewpoint these authors are typically interested in comparing the rate of return on holding the asset with that obtainable from other risky investments (the opportunity cost) or from riskless interest-bearing accounts. In order to do this one has to compute the return over a fixed length of time $\Delta > 0$. These returns are then based on the difference $p(i\Delta) - p((i-1)\Delta) = \sum_{t=N((i-1)\Delta)+1}^{N(i\Delta)} z_t$, which shows that the number of trades in the interval $((i-1)\Delta, i\Delta]$ plays a crucial role. To reflect this, Rydberg and Shephard define $N_i = N(i\Delta) - N((i-1)\Delta)$ as the number of trades in that time interval. This operation, called the "binning operation", partitions time into sections and counts the number of trades in each interval. Predicting the variance of the price over the next period of length $\Delta$ thus requires modelling the mean and variance of the future number of trades. In practice $\Delta$ will be small, and so what matters in the above setup is really only $N_i$ (Rydberg and Shephard (2000)). Basic empirical modelling requires the assumption that the $\{N_i\}$ and the $\{z_t\}$ are stochastically independent.
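To fix ideas, the compound Poisson price path and the binning operation can be sketched numerically. The rate `lam`, the one-tick price moves of 0.01, and the bin length `delta` below are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sketch of the compound Poisson price model: trades arrive as a Poisson
# process with rate lam; the t-th trade moves the price by z_t; p(r) is
# the cumulative sum of the z_t.
lam, horizon, delta = 2.0, 500.0, 1.0   # rate, total time, bin length

n_trades = rng.poisson(lam * horizon)             # trades on [0, horizon]
times = np.sort(rng.uniform(0.0, horizon, n_trades))
z = rng.choice([-1, 1], size=n_trades) * 0.01     # +/- one tick of 0.01
price_path = np.cumsum(z)                         # p(r) sampled at trade times

# "Binning": count trades in each interval ((i-1)*delta, i*delta]
edges = np.arange(0.0, horizon + delta, delta)
N_i, _ = np.histogram(times, bins=edges)

print(N_i.mean())   # close to lam * delta
```

The counts `N_i` produced this way are exactly the series the BIN models below are designed for.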

Poisson process

Definition 1
The counting process $\{N(t), t \ge 0\}$ is a Poisson process with rate $\lambda > 0$ if: (i) $N(0) = 0$; (ii) the process has independent increments, in other words the distribution is memoryless; (iii) the number of events in any interval of length $t$ is Poisson distributed with mean $\lambda t$, that is, for all $s, t \ge 0$ and $n = 0, 1, \ldots$, $P(N(t+s) - N(s) = n) = e^{-\lambda t} (\lambda t)^n / n!$. Note that it follows from condition (iii) that a Poisson process has stationary increments and also that $E[N(t)] = \lambda t$, which explains why $\lambda$ is called the rate of the Poisson process.
As a prelude to giving a second definition of a Poisson process, we define the concept of an $o(h)$ function: a function $f$ is said to be $o(h)$ if $\lim_{h \to 0} f(h)/h = 0$.

Definition 2
The counting process $\{N(t), t \ge 0\}$ is a Poisson process with rate $\lambda > 0$ if: (i) $N(0) = 0$; (ii) the process has stationary and independent increments; (iii) $P(N(h) = 1) = \lambda h + o(h)$; (iv) $P(N(h) \ge 2) = o(h)$.
Under these assumptions, the durations between events follow an exponential distribution with parameter $\lambda$: the hazard function depends only on $\lambda$, and it is constant. The particularity of the BIN models is that $\lambda$ is random. This last model is the topic of this survey.
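The constant-hazard (memoryless) property of exponential durations can be checked with a small simulation; the rate `lam = 3` and the checkpoints `s`, `t` are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
lam = 3.0

# Durations between Poisson events are i.i.d. Exponential(lam):
# simulate the gaps and check the rate and the memoryless property.
gaps = rng.exponential(1.0 / lam, size=100_000)
print(gaps.mean())          # should be near 1/lam

# constant hazard: P(gap > s + t | gap > s) = P(gap > t)
s, t = 0.2, 0.5
cond = (gaps[gaps > s] > s + t).mean()
uncond = (gaps > t).mean()
print(cond, uncond)         # memoryless: the two proportions are close
```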

Structure of the model
In order to model the sequence $\{N_i\}$, Rydberg and Shephard (2000) suggested the BIN models, which specify the one-step-ahead forecast distribution of the $\{N_i\}$ series using a counting distribution. In particular they specify $N_i \mid F_{i-1} \sim Po(\lambda_i)$, where $Po(\lambda_i)$ denotes a Poisson distribution with mean $\lambda_i$, and $\lambda_i$ is a linear function of past data, as in moving average models. The BIN(1,1) is then given by
$\lambda_i = \alpha + \gamma N_{i-1} + \delta \lambda_{i-1}$.  (3)
This model is inspired by the GARCH model due to Bollerslev (1986) and Taylor (1986).
This model is thus of autoregressive moving average (ARMA) type, and it can be analyzed as a standard ARMA model with a white noise error term. Writing $u_i = N_i - \lambda_i$, the model becomes $N_i = \alpha + (\gamma + \delta) N_{i-1} - \delta u_{i-1} + u_i$. Many of the interesting features of the BIN model follow from this structure. This equation (as in the case of the ACD(1,1)) shows that a BIN(1,1) process corresponds to a constrained ARMA(1,1) representation for $N_i$, with autoregressive coefficient $\gamma + \delta$ and moving average coefficient $-\delta$, and with a martingale difference (MD) error term, provided $\gamma + \delta < 1$. The autocorrelation function (ACF) can be obtained from the standard formulae for the ARMA(1,1) model. The main features (mean, variance, autocorrelation function) of this model are described in the following points.
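A minimal sketch of the BIN(1,1) recursion, assuming the specification $\lambda_i = \alpha + \gamma N_{i-1} + \delta \lambda_{i-1}$ discussed above; the parameter values follow the paper's DGP choice of an unconditional mean of one, so $\alpha = 1 - \gamma - \delta$.

```python
import numpy as np

def simulate_bin11(alpha, gamma, delta, n, seed=0):
    """Simulate the BIN(1,1) recursion:
       N_i | past ~ Poisson(lam_i),  lam_i = alpha + gamma*N_{i-1} + delta*lam_{i-1}."""
    rng = np.random.default_rng(seed)
    lam = alpha / (1.0 - gamma - delta)   # start at the unconditional mean
    N = np.empty(n, dtype=int)
    for i in range(n):
        N[i] = rng.poisson(lam)
        lam = alpha + gamma * N[i] + delta * lam
    return N

gamma, delta = 0.10, 0.85
N = simulate_bin11(1.0 - gamma - delta, gamma, delta, n=100_000)
print(N.mean(), N.var() / N.mean())   # mean near 1, variance/mean above 1
```

The variance/mean ratio above one is the overdispersion that distinguishes the BIN(1,1) from an i.i.d. Poisson sample.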

Statistical properties of the BIN models
By definition, the conditional expectation in (3), $E[N_i \mid F_{i-1}] = \lambda_i$, allows us to forecast expected counts based on the information set at the previous period. The error $u_i = N_i - \lambda_i$ is considered a martingale difference, since $\lambda_i$ is the compensator of $N_i$.
The unconditional moments are
$E[N_i] = \mu = \alpha / (1 - \gamma - \delta)$,  (7)
$Var(N_i) = \mu \, (1 - (\gamma+\delta)^2 + \gamma^2) / (1 - (\gamma+\delta)^2)$,  (8)
and the autocorrelation function is derived as
$\rho(k) = (\gamma+\delta)^{k-1} \, \gamma (1 - \delta(\gamma+\delta)) / (1 - (\gamma+\delta)^2 + \gamma^2)$.  (9)
From the expression in (8), it is easy to check that $Var(N_i)/E[N_i] - 1 = \gamma^2 / (1 - (\gamma+\delta)^2)$, which measures the overdispersion, is greater than zero: $N_i$ becomes more dispersed as $\gamma$ increases. Equivalently, the variance is greater than the mean whenever $\gamma$ is greater than zero, which implies that the series of observations is overdispersed.
The properties of $\lambda_i$ are sometimes helpful. As in the standard ARMA case, $p$ denotes the number of autoregressive terms in the model and $q$ the number of moving average ones. Since $\lambda_i = \alpha + \gamma N_{i-1} + \delta \lambda_{i-1}$ with $\alpha, \gamma, \delta \ge 0$, $\lambda_i$ is a non-negative sequence with probability one. The ARMA representation of the model can be written as $N_i = \alpha + (\gamma + \delta) N_{i-1} - \delta u_{i-1} + u_i$, where $u_i = N_i - \lambda_i$ is a martingale difference sequence, for the BIN(1,1) model.
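The closed-form moments can be wrapped in a small helper, assuming formulas (7) and (8) take the standard forms given above; the parameter values are the illustrative ones used later in the paper's DGP.

```python
def bin11_mean_var(alpha, gamma, delta):
    """Unconditional mean and variance of BIN(1,1), using the closed forms
       assumed in the text:
       mu  = alpha / (1 - gamma - delta),
       var = mu * (1 - (gamma+delta)**2 + gamma**2) / (1 - (gamma+delta)**2)."""
    phi = gamma + delta
    mu = alpha / (1.0 - phi)
    var = mu * (1.0 - phi**2 + gamma**2) / (1.0 - phi**2)
    return mu, var

mu, var = bin11_mean_var(alpha=0.05, gamma=0.10, delta=0.85)
print(mu, var, var / mu)   # var/mu = 1 + gamma**2 / (1 - (gamma+delta)**2) > 1
```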

Numerical illustrations
The first and second unconditional moments, and the autocovariances can be computed analytically as shown above. Then it is interesting to give numerical results about these moments and autocovariances for several sets of parameters.
Numerical simulations, using (7), (8), and (9), allow us to compute the degree of overdispersion and to obtain a figure (Figure 1) that plots the autocorrelation function for four sets of parameters.
As in the ARCH, GARCH, and ACD classes of models, $\gamma + \delta$ close to one implies a slowly decreasing autocorrelation function, and a large value of $\gamma$ implies a large degree of overdispersion. Figure 1 gives the graphs of the theoretical and empirical autocorrelation functions.
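The geometric decay of the theoretical ACF can be computed directly, assuming formula (9) and the constrained ARMA(1,1) representation given above; the two parameter sets are illustrative.

```python
def bin11_acf(gamma, delta, lags=10):
    """Theoretical ACF of BIN(1,1), rho(k) = (gamma+delta)**(k-1) * rho(1),
       with rho(1) from the constrained ARMA(1,1) representation."""
    phi = gamma + delta
    rho1 = gamma * (1.0 - delta * phi) / (1.0 - phi**2 + gamma**2)
    return [rho1 * phi ** (k - 1) for k in range(1, lags + 1)]

print(bin11_acf(0.10, 0.85)[:3])   # gamma+delta = 0.95: slow decay
print(bin11_acf(0.10, 0.50)[:3])   # gamma+delta = 0.60: faster decay
```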

Figure 1c seems to exhibit the best-fitting representation of the model according to the residuals. Thus the real values of the parameters $\gamma$ and $\delta$ should be close to 0.10 and 0.85 respectively. The missing parameter here is the time interval of the count data, denoted D. It will be taken into account in the empirical work using the actual data.
On the other hand, we can examine the experimental overdispersion implied by changes in the values of $\gamma$ and $\delta$ in the following table. The results are derived from simulated data produced by a data generating process. The overdispersion ratio is defined as the ratio standard deviation / mean, computed according to formulas (7) and (8). We set $\lambda = 1$, so that $\alpha = 1 - \gamma - \delta$. In brackets we report the theoretical overdispersion ratio.
The results of Table 1 show that the overdispersion ratio is an increasing function of $\gamma$ and a decreasing function of $\delta$ in the BIN(1,1) model.
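The dependence of the theoretical ratio on $\gamma$ can be tabulated from the variance form assumed above, with $\lambda = 1$ so that $\alpha = 1 - \gamma - \delta$; the grid of $\gamma$ values below is illustrative and is not Table 1 itself.

```python
# Theoretical sd/mean overdispersion ratio at unconditional mean 1.
def sd_over_mean(gamma, delta):
    """sd/mean = sqrt(1 + gamma**2 / (1 - (gamma+delta)**2)), the
       overdispersion ratio implied by formulas (7)-(8) as assumed."""
    return (1.0 + gamma**2 / (1.0 - (gamma + delta) ** 2)) ** 0.5

for gamma in (0.05, 0.10, 0.20, 0.30):
    print(f"gamma={gamma:.2f}, delta=0.50: sd/mean = {sd_over_mean(gamma, 0.50):.4f}")
```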

Estimation by maximum likelihood
Consider $N_1, \ldots, N_T$, the $T$ non-negative integer event counts, where the dependent variable represents the number of events (here financial: quotes, prices or volumes) that have occurred during observation period $i$. If the events occurring within each period are independent and have a constant rate of occurrence, then $N_i$ follows a Poisson distribution with conditional probability density function
$P(N_i = n_i \mid F_{i-1}) = e^{-\lambda_i} \lambda_i^{n_i} / n_i!$.
The Poisson regression model, the standard model for count data, is a non-linear regression. It is based on the Poisson distribution with an intensity parameter $\lambda_i$ that depends on covariates (regressors). In the absence of stochastic variation, with exact parametric dependence on exogenous covariates, we have the standard Poisson regression. The mixed Poisson regression is obtained if the function relating $\lambda_i$ to the covariates is stochastic, typically because it involves unobserved random variables; assumptions must then be made about the random term, either to obtain its precise form or to come back to the standard Poisson model.
For applied work the appropriate data are cross-sectional, consisting of $T$ independent observations $(n_i, x_i)$, indexed by $i$. The log-linear form is the usual parameterization of $\lambda_i$, namely $\lambda_i = \exp(x_i' \beta)$, which ensures $\lambda_i > 0$. The Poisson distribution property allows us to write $E[n_i \mid x_i] = Var[n_i \mid x_i] = \lambda_i$.

This probability is called the likelihood function, and it is a function of the parameters conditional on the data. It is formulated as
$L(\beta) = \prod_{i=1}^{T} e^{-\lambda_i} \lambda_i^{n_i} / n_i!$.
This formulation suppresses the dependence of $L(\beta)$ on the data and assumes independence over $i$. The definition can be extended to time series data by allowing $x_i$ to include lagged dependent and independent variables, even though it implicitly assumes cross-section data.
Maximizing the likelihood function is equivalent to maximizing the log-likelihood function
$\ln L(\beta) = \sum_{i=1}^{T} \left( -\lambda_i + n_i \ln \lambda_i - \ln n_i! \right)$.
Suppose the data generating process for $n_i$ has density $f(n_i \mid x_i, \beta_0)$, where $\beta_0$ is the true parameter value; the asymptotic distribution of the MLE is usually obtained under the assumption that this density is correctly specified. Then, under the regularity conditions, $\hat{\beta} \to_P \beta_0$, so the MLE is consistent for $\beta_0$, and $\sqrt{T}(\hat{\beta} - \beta_0)$ is asymptotically normal with a variance given by the inverse of the $q \times q$ information matrix $A$.

Simulation
In this section we implement an algorithm for generating a reference sample; we then compute the characteristics of the theoretical model (after a parameterization) and the parameters estimated from the generated sample, and compare the two. We also compute the autocorrelation function (see Figure 1).
The estimation of a Poisson distribution can be performed by Monte Carlo simulation of a discrete distribution, but Gauss also provides a command that performs the estimation of a Poisson process directly.
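The generic discrete-distribution Monte Carlo method mentioned here can be sketched by inverse-CDF sampling of a Poisson variate (as opposed to calling a library routine); the rate `lam = 2` is an illustrative assumption.

```python
import numpy as np

def poisson_inverse_cdf(lam, u):
    """Inverse-CDF draw from Poisson(lam) given a uniform u in (0,1):
       walk the CDF term by term until it exceeds u."""
    p = np.exp(-lam)   # P(N = 0)
    cdf, k = p, 0
    while u > cdf:
        k += 1
        p *= lam / k   # P(N = k) = P(N = k-1) * lam / k
        cdf += p
    return k

rng = np.random.default_rng(4)
draws = [poisson_inverse_cdf(2.0, u) for u in rng.uniform(size=50_000)]
print(np.mean(draws), np.var(draws))   # both near lam = 2 for a Poisson
```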

Comparison analysis
Using a data generating process (DGP), we produce a sample of 1,000 observations.
The simulation allows us, by varying the parameters, to obtain the experimental results reported in Table 2 below; in brackets we report the theoretical results. We set $\lambda = 1$, so that $\alpha = 1 - \gamma - \delta$ in all computations.
The results allow us to draw the following conclusion.
For $\gamma = 0.10$ and $\delta = 0.85$, we have the best-fitted model.
The means of the theoretical and empirical results are respectively 1.000 and 1.009. The standard deviations of the theoretical and empirical results are respectively 1.050 and 1.041. The theoretical and empirical overdispersion ratios are respectively 1.050 and 1.032, the lowest overdispersion values among the theoretical and empirical results. The graph of the autocorrelation function in this case indicates that it is the best-fitted model.
It is easy to check that the overdispersion coefficient, equal to $\sigma_N / \mu_N$, increases with $\gamma$.
Other tools to check the adequacy of the model are the skewness and the kurtosis.
The skewness indicates the direction in which a frequency distribution (or frequency curve, or frequency polygon) leans: a skewness equal to zero implies a symmetric distribution, and the skewness can also be negative or positive. The kurtosis measures a distribution's peakedness, the degree to which one narrow range of values contains a large fraction of the sample data. So the skewness indicates whether the histogram leans to the left (negative value) or to the right (positive value), and the kurtosis indicates how peaked it is. Their formulas are $skew = \frac{1}{T s^3} \sum_i (N_i - \bar{N})^3$ and $kur = \frac{1}{T s^4} \sum_i (N_i - \bar{N})^4$. For our data the skewness is positive, so the frequency curve (or density curve) leans to the right: there are more values to the right of the mode than to the left; and $kur = 4.38 > 3$, which implies a peaked curve.
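These two diagnostics can be computed directly from a sample; as a sanity check, a Poisson(1) sample should give skewness near $1/\sqrt{\lambda} = 1$ and kurtosis near $3 + 1/\lambda = 4$, in line with the peaked, right-leaning shape described above.

```python
import numpy as np

def skew_kurt(x):
    """Sample skewness and kurtosis as defined in the text:
       skew = mean((x - xbar)**3) / s**3,
       kurt = mean((x - xbar)**4) / s**4  (equals 3 for a normal)."""
    x = np.asarray(x, dtype=float)
    m, s = x.mean(), x.std()
    skew = ((x - m) ** 3).mean() / s**3
    kurt = ((x - m) ** 4).mean() / s**4
    return skew, kurt

rng = np.random.default_rng(5)
sample = rng.poisson(1.0, size=100_000)
print(skew_kurt(sample))   # Poisson(1): skewness near 1, kurtosis near 4
```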
These results conform to the Poisson density function.

Validation of the model: Monte Carlo method
For this purpose we used the data obtained by the DGP for maximum likelihood estimation, and we obtain the results reported in Tables 3, 4, 5 and 6; in brackets, the fixed value of the parameter. The figures show the graphs of counts and forecast counts. Q(10) and Q(10)* correspond respectively to the Ljung-Box Q-statistic of order 10 on the counts ($N_i$) and on the residuals $u_i$ defined in the BIN(1,1) model. If Q is above 18.307, there is autocorrelation of order 10 at the 5% threshold.
The t-statistic must be compared with 1.96: if it is greater than 1.96, the parameter is significant at the 5% threshold. The results in Table 3 and the graphs of Figure 2 show that the BIN(1,1) model behaves well, so the model is a candidate for estimation and tests. It is easy to check from Tables 3, 4 and 5 and their corresponding graphs that the model behaves well under MLE; thus it can be used for estimation and for forecasting. There is no residual autocorrelation in the three cases above, so we can apply the model to actual data.
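The Q(10) diagnostic used throughout can be sketched from its definition, $Q = T(T+2)\sum_{k=1}^{m} \hat{\rho}_k^2/(T-k)$, compared with the chi-square critical value 18.307 for $m = 10$ at 5%; the two test series below are illustrative.

```python
import numpy as np

def ljung_box_q(x, m=10):
    """Ljung-Box Q-statistic of order m:
       Q = T(T+2) * sum_{k=1..m} rho_k**2 / (T - k)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    T, denom = len(x), np.sum(x**2)
    q = 0.0
    for k in range(1, m + 1):
        rho_k = np.sum(x[k:] * x[:-k]) / denom   # sample autocorrelation at lag k
        q += rho_k**2 / (T - k)
    return T * (T + 2) * q

rng = np.random.default_rng(6)
white = rng.poisson(1.0, size=2000)   # i.i.d., so no autocorrelation
dep = white[1:] + white[:-1]          # overlapping sums are autocorrelated

print(ljung_box_q(white))   # should fall below the 18.307 threshold
print(ljung_box_q(dep))     # far above the threshold
```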

APPLICATION TO NYSE DATA

The data of three stocks
For the empirical analysis, we consider financial activity on three stocks traded on the NYSE: BOEING, DISNEY and AWK (American Water Works). The data, which were previously used for ACD modelling by Bauwens, Giot and Veredas, were extracted from the Trade and Quote (TAQ) database of the NYSE (for more detail, see the CORE discussion paper of Bauwens and Giot (1999)). For the three stocks, we choose quote data for BOEING, quote volumes for DISNEY, and trades for AWK.
Before using these data, we must transform durations into counts. The data are put into count form using Veredas' method, a program that transforms durations into counts for a fixed length of time. In our case we use intervals of 0.25 second, 1 second and 4 seconds, and estimations are performed on each form of the data. The estimation method used here is maximum likelihood. The results are analyzed in the estimation results section below. In brackets we report the standard deviation. Q(10) and Q(10)* are the Ljung-Box Q-statistics of order 10 on the counts ($N_i$) and on the residuals $u_i$ defined in the BIN(1,1) model. If Q is above 18.307, there is autocorrelation of order 10 at the 5% threshold.
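The durations-to-counts step can be sketched as follows (a sketch of the transformation, not Veredas' actual program); the exponential durations and the rate used in the example are illustrative assumptions.

```python
import numpy as np

def durations_to_counts(durations, D):
    """Transform a sequence of durations (time between successive events)
       into counts per fixed interval of length D."""
    event_times = np.cumsum(durations)       # arrival time of each event
    horizon = event_times[-1]
    n_bins = int(np.ceil(horizon / D))
    edges = np.arange(n_bins + 1) * D        # bin edges 0, D, 2D, ...
    counts, _ = np.histogram(event_times, bins=edges)
    return counts

# Example: exponential durations with mean 0.25 s, binned at D = 0.25 s
rng = np.random.default_rng(7)
durs = rng.exponential(0.25, size=10_000)
counts = durations_to_counts(durs, D=0.25)
print(counts.mean())   # about 1 event per 0.25 s bin
```

Re-running with D = 1 or D = 4 reproduces the coarser binning used in the tables: larger D means fewer, larger counts.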

Estimation results
D is the fixed length of time. In brackets we report the standard deviation. Q(10) and Q(10)* are the Ljung-Box Q-statistics of order 10 on the counts ($N_i$) and on the residuals $u_i$ defined in the BIN(1,1) model. For one stock the number of cases is 10,491 for D = 0.25 and 2,623 for D = 1; for another, the number of cases is 26,225 for D = 0.25; 6,557 for D = 1; and 1,640 for D = 4.
We can see that when the fixed length of time D increases, the values of $\alpha$ and $\gamma$ increase too.
Inversely, increasing D makes $\delta$ decrease. This reflects the volatility clustering phenomenon, which is the basic motivation of the ARMA representation: to capture the law of the process of high frequency count data in the financial market microstructure system.
The best results are given by the fixed length D = 4, where there is no residual autocorrelation in the estimation of the count data for the three stocks. Thus, when D increases the estimation and test results may improve, and the clustering volatility becomes more perceptible.

CONCLUSION
The aim of this survey was to build and test the BIN(1,1) model for count data. After generating data by parameterization of a DGP, we used these data for estimation by the ML method in order to validate the BIN(1,1). The results exhibit a good behaviour of this model, so it could be applied to actual data, which is what we did for the empirical analysis. For this purpose, we used the transformed data (durations to counts) for different fixed lengths of time. The estimation results for the three stocks traded on the NYSE (BOEING, DISNEY, and AWK), obtained by the maximum likelihood method, lead to the following remarks. There is less dependence between the high frequency data when we consider a large value of the fixed length of time, and the value of $\gamma$ becomes larger and larger, which increases the dispersion. The model could be generalized to take into account other variables, and it could be used for density forecasting.