to a variety of datasets to gauge the explanatory power and stability of each model.
3.4.3 Transactions Data
In Hausman, Lo, and MacKinlay (1992), three specific aspects of transactions data are examined using the ordered probit model of Section 3.3.3: (1) Does the particular sequence of trades affect the conditional distribution of price changes, e.g., does the sequence of three price changes +1, -1, +1 have the same effect on the conditional distribution of the next price change as the sequence -1, +1, +1? (2) Does trade size affect price changes, and if so, what is the price impact per unit volume of trade from one transaction to the next? (3) Does price discreteness matter? In particular, can the conditional distribution of price changes be modeled as a simple linear regression of price changes on explanatory variables without accounting for discreteness at all?
To address these three questions, Hausman, Lo, and MacKinlay (1992) estimate the ordered probit model for 1988 transactions data of over a hundred stocks. To conserve space, we focus only on their smaller and more detailed sample of six stocks: International Business Machines Corporation (IBM), Quantum Chemical Corporation (CUE), Foster Wheeler Corporation (FWC), Handy and Harman Company (HNH), Navistar International Corporation (NAV), and American Telephone and Telegraph Incorporated (T). For these six stocks, they focus only on intraday transaction price changes since it has been well-documented that overnight returns differ substantially from intraday returns (see, for example, Amihud and Mendelson [1987], Stoll and Whaley [1990], and Wood, McInish, and Ord [1985]). They also impose several other filters to eliminate problem transactions and quotes, which yielded sample sizes ranging from 3,174 trades for HNH to 206,794 trades for IBM.
They also use bid and ask prices in their analysis, and since bid-ask quotes are reported only when they are revised, some effort is required to match quotes to transactions. A natural algorithm is to match each transaction price to the most recently reported quote prior to the transaction; however, Bronfman (1991) and Lee and Ready (1991) have shown that prices of trades that precipitate quote revisions are sometimes reported with a lag, so that the order of quote revision and transaction price is reversed in official records such as the Consolidated Tape. To address this issue, Hausman, Lo, and MacKinlay (1992) match transaction prices to quotes that are set at least five seconds prior to the transaction; the evidence in Lee and Ready (1991) suggests that this will account for most of the missequencing. This is only one example of the kind of unique challenges that transactions data pose.
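The five-second matching rule can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function name `match_quotes`, the representation of a quote as a `(bid, ask)` tuple, and the sample timestamps are hypothetical and are not taken from Hausman, Lo, and MacKinlay (1992). Quote times are assumed to be sorted in ascending order and measured in seconds.

```python
from bisect import bisect_right

def match_quotes(trade_times, quote_times, quotes, lag_seconds=5.0):
    """For each trade, return the most recent quote that was set at least
    lag_seconds before the trade, or None if no such quote exists.

    quote_times must be sorted ascending; quotes[i] is the (bid, ask)
    pair that became effective at quote_times[i].
    """
    matched = []
    for t in trade_times:
        # Number of quotes with time <= t - lag_seconds.
        idx = bisect_right(quote_times, t - lag_seconds)
        matched.append(quotes[idx - 1] if idx > 0 else None)
    return matched

# Hypothetical data: three quote revisions and four trades (times in seconds).
quote_times = [0.0, 10.0, 18.0]
quotes = [(20.000, 20.125), (20.125, 20.250), (20.250, 20.375)]
trade_times = [3.0, 12.0, 15.0, 24.0]

prevailing = match_quotes(trade_times, quote_times, quotes)
for t, q in zip(trade_times, prevailing):
    print(t, q)
```

In this toy example the trade at 12 seconds is matched to the quote set at time 0 rather than to the revision at 10 seconds, because that revision is less than five seconds old and may have been caused by the trade itself.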
To provide some intuition for this enormous dataset, we report a few summary statistics in Table 3.7. Our sample contains considerable price dispersion, with the low stock price ranging from $3.125 for NAV to $104.250 for IBM, and the high ranging from $7.875 for NAV to $129.500 for IBM. At $219 million, HNH has the smallest market capitalization in our sample, and IBM has the largest with a market value of $69.8 billion.
The empirical analysis also requires some indicator of whether a transaction was buyer-initiated or seller-initiated; otherwise the notion of price impact is ill-defined, since a 100,000-share block purchase has quite a different price impact from a 100,000-share block sale. Obviously, this is a difficult task because for every trade there is always a buyer and a seller. What we hope to capture is which of the two parties is more anxious to consummate the trade and is therefore willing to pay for it by being closer to the bid price or the ask price. Perhaps the most obvious indicator is whether the transaction occurs at the ask price or at the bid price; if it is the former then the transaction is most likely a buy and if it is the latter then the transaction is most likely a sell. Unfortunately, a large number of transactions occur at prices strictly within the bid-ask spread, so that this method for signing trades will leave the majority of trades indeterminate.
Hausman, Lo, and MacKinlay (1992) use the well-known algorithm of signing a transaction as a buy if the transaction price is higher than the mean of the prevailing bid-ask quote (the most recent quote that is set at least five seconds prior to the trade); they classify it as a sell if the price is lower. If the price equals the mean of the prevailing bid-ask quote, they classify the trade as an indeterminate trade. This method yields far fewer indeterminate trades than classifying according to transactions at the bid or at the ask. Unfortunately, little is known about the relative merits of this method of classification versus others such as the tick test (which classifies a transaction as a buy, a sell, or indeterminate if its price is greater than, less than, or equal to the previous transaction's price, respectively), simply because it is virtually impossible to obtain the data necessary to evaluate these alternatives.
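The three signing rules discussed above can be compared side by side in a small sketch. The function names and the sample quote are hypothetical, chosen only for illustration; the convention of returning +1 for a buy, -1 for a sell, and 0 for an indeterminate trade is an assumption made here for convenience.

```python
def sign_at_quote(price, bid, ask):
    """Buy (+1) only if the trade is at the ask, sell (-1) only if at the bid."""
    if price == ask:
        return 1
    if price == bid:
        return -1
    return 0  # strictly inside (or outside) the spread: indeterminate

def sign_by_midpoint(price, bid, ask):
    """Buy (+1) if above the quote midpoint, sell (-1) if below, else 0."""
    mid = 0.5 * (bid + ask)
    if price > mid:
        return 1
    if price < mid:
        return -1
    return 0

def sign_by_tick_test(price, previous_price):
    """Buy (+1) on an uptick, sell (-1) on a downtick, 0 on a zero tick."""
    if price > previous_price:
        return 1
    if price < previous_price:
        return -1
    return 0

# Hypothetical prevailing quote with a three-tick spread (ticks of $1/8).
bid, ask = 20.000, 20.375
prices = [20.000, 20.125, 20.250, 20.375]

at_quote = [sign_at_quote(p, bid, ask) for p in prices]
midpoint = [sign_by_midpoint(p, bid, ask) for p in prices]
tick = [sign_by_tick_test(p, q) for q, p in zip(prices, prices[1:])]
print(at_quote, midpoint, tick)
```

For this quote, the two trades strictly inside the spread are indeterminate under the at-the-quote rule but are signed as a sell and a buy under the midpoint rule, which is the sense in which the midpoint rule leaves fewer trades unclassified.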
The Empirical Specification
To estimate the parameters of the ordered probit model via maximum likelihood, three specification decisions must be made: (i) the number of states $m$, (ii) the explanatory variables $X_k$, and (iii) the parametrization of the variance $\sigma_k^2$.
Table 3.7. Summary statistics for transactions data of six stocks.
Summary statistics for transaction prices and corresponding ordered probit explanatory variables of International Business Machines Corporation (IBM, 206,794 trades), Quantum Chemical Corporation (CUE, 26,927 trades), Foster Wheeler Corporation (FWC, 18,199 trades), Handy and Harman Company (HNH, 3,174 trades), Navistar International Corporation (NAV, 96,127 trades), and American Telephone and Telegraph Company (T, 180,726 trades), for the period from January 4, 1988 to December 30, 1988.
In choosing $m$, we must balance price resolution against the practical constraint that too large an $m$ will yield no observations in the extreme states $s_1$ and $s_m$. For example, if we set $m$ to 101 and define the states $s_1$ and $s_{101}$ symmetrically to be price changes of -50 ticks and +50 ticks, respectively, we would find no $Y_k$'s among our six stocks falling into these two states. Using the empirical distribution of the data as a guide, Hausman, Lo, and MacKinlay (1992) set $m = 9$ for the larger stocks, implying extreme states of -4 ticks or less and +4 ticks or more, and set $m = 5$ for the two smaller stocks, FWC and HNH, implying extreme states of -2 ticks or less and +2 ticks or more.
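The mapping from raw price changes to $m$ states can be sketched as follows. This is a minimal illustration, assuming an odd $m$ with states placed symmetrically around a zero price change; the function names and the sample price changes are hypothetical.

```python
from collections import Counter

def censor_price_change(ticks, m):
    """Clamp a price change (in ticks) to the range covered by m states."""
    if m < 3 or m % 2 == 0:
        raise ValueError("m must be an odd integer of at least 3")
    half = (m - 1) // 2
    return max(-half, min(half, ticks))

def state_index(ticks, m):
    """Map a price change to a state index 1..m (1 = lowest, m = highest)."""
    return censor_price_change(ticks, m) + (m - 1) // 2 + 1

# Made-up price changes in ticks, for illustration only.
changes = [-7, -1, 0, 0, 1, 2, 6]

counts_9 = Counter(state_index(c, 9) for c in changes)
counts_101 = Counter(state_index(c, 101) for c in changes)

print(sorted(counts_9.items()))
print(counts_101[1], counts_101[101])  # extreme states of the 101-state grid
```

With $m = 9$ the changes of -7 and +6 ticks fall into the extreme states, whereas with $m = 101$ the extreme states receive no observations at all, which is the practical constraint described above.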
The explanatory variables $X_k$ are selected to capture several aspects of transaction price changes: clock-time effects (such as the arithmetic Brownian motion model), the effects of bid-ask bounce (since many transactions are merely movements from the bid price to the ask price or vice versa), the size of the transaction (so price impact can be determined as a function of the quantity traded), and the impact of systematic or marketwide movements on the conditional distribution of an individual stock's price changes. These aspects call for the following explanatory variables:
$\Delta t_k$: The time elapsed between transactions $k-1$ and $k$, in seconds.
$AB_{k-1}$: The bid-ask spread prevailing at time $t_{k-1}$, in ticks.
$Y_{k-l}$: Three lags ($l = 1, 2, 3$) of the dependent variable $Y_k$. Recall that for $m = 9$, price changes less than -4 ticks are set equal to -4 ticks (state $s_1$), and price changes greater than +4 ticks are set equal to +4 ticks (state $s_9$), and similarly for $m = 5$.
$V_{k-l}$: Three lags ($l = 1, 2, 3$) of the dollar volume of the $(k-l)$th transaction, defined as the price of the $(k-l)$th transaction (in dollars, not ticks) times the number of shares traded (denominated in hundreds of shares); hence dollar volume is denominated in hundreds of dollars. To reduce the influence of outliers, if the share volume of a trade exceeds the 99.5 percentile of the empirical distribution of share volume for that stock, it is set equal to the 99.5 percentile.
$\mathrm{SP500}_{k-l}$: Three lags ($l = 1, 2, 3$) of five-minute continuously compounded returns of the Standard and Poor's (S&P) 500 index futures price, for the contract maturing in the closest month beyond the month in which transaction $k-l$ occurred, where the return is computed with the futures price recorded one minute before the nearest round minute prior to $t_{k-l}$ and the price recorded five minutes before this.
$\mathrm{IBS}_{k-l}$: Three lags ($l = 1, 2, 3$) of an indicator variable that takes the value +1 if the $(k-l)$th transaction price is greater than the average of the quoted bid and ask prices at time $t_{k-l}$, the value -1 if the $(k-l)$th transaction price is less than the average of the bid and ask prices at time $t_{k-l}$, and zero otherwise, i.e.,
$$
\mathrm{IBS}_{k-l} =
\begin{cases}
\phantom{-}1 & \text{if } P_{k-l} > \tfrac{1}{2}\,(P^a_{k-l} + P^b_{k-l}), \\
\phantom{-}0 & \text{if } P_{k-l} = \tfrac{1}{2}\,(P^a_{k-l} + P^b_{k-l}), \\
-1 & \text{if } P_{k-l} < \tfrac{1}{2}\,(P^a_{k-l} + P^b_{k-l}).
\end{cases}
$$
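The construction of the dollar volume variable in the list above, including the 99.5 percentile cap on share volume, can be sketched as follows. The source does not say which percentile convention is used; the nearest-rank definition below is an assumption, as are the function names and the synthetic data.

```python
import math
from fractions import Fraction

def nearest_rank_percentile(values, pct):
    """Nearest-rank percentile of values (an assumed convention).

    pct is passed as a string such as "99.5" so that the rank is
    computed exactly, without floating-point rounding.
    """
    ordered = sorted(values)
    rank = math.ceil(Fraction(pct) * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]

def dollar_volumes(prices, share_lots, pct="99.5"):
    """Dollar volume in hundreds of dollars.

    prices are in dollars; share_lots are share volumes in hundreds of
    shares. Share volume above the pct percentile is set to that percentile.
    """
    cap = nearest_rank_percentile(share_lots, pct)
    return [p * min(s, cap) for p, s in zip(prices, share_lots)]

# Synthetic data: 1,000 trades at $10 with share volumes of 1..1000 lots.
share_lots = list(range(1, 1001))
prices = [10.0] * 1000

volumes = dollar_volumes(prices, share_lots)
print(nearest_rank_percentile(share_lots, "99.5"), max(volumes))
```

In this synthetic sample the five largest trades are capped at 995 round lots, so the largest dollar volume is 9,950 hundreds of dollars instead of 10,000.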
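The specification of $X_k'\beta$ is then given by the following expression:
$$
\begin{aligned}
X_k'\beta ={}& \beta_1 \Delta t_k + \beta_2 Y_{k-1} + \beta_3 Y_{k-2} + \beta_4 Y_{k-3} + \beta_5\,\mathrm{SP500}_{k-1} + \beta_6\,\mathrm{SP500}_{k-2} \\
&+ \beta_7\,\mathrm{SP500}_{k-3} + \beta_8\,\mathrm{IBS}_{k-1} + \beta_9\,\mathrm{IBS}_{k-2} + \beta_{10}\,\mathrm{IBS}_{k-3} \\
&+ \beta_{11}\{T_\lambda(V_{k-1}) \cdot \mathrm{IBS}_{k-1}\} + \beta_{12}\{T_\lambda(V_{k-2}) \cdot \mathrm{IBS}_{k-2}\} + \beta_{13}\{T_\lambda(V_{k-3}) \cdot \mathrm{IBS}_{k-3}\}.
\end{aligned}
$$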
The variable $\Delta t_k$ is included in $X_k$ to allow for clock-time effects on the conditional mean of $Y_k^*$. If prices are stable in transaction time rather than clock time, this coefficient should be zero. Lagged price changes are included to account for serial dependencies, and lagged returns of the S&P 500 index futures price are included to account for market-wide effects on price changes.
To measure the price impact of a trade per unit volume, the term $T_\lambda(V_{k-l})$ is included, which is dollar volume transformed according to the Box and Cox (1964) transformation $T_\lambda(\cdot)$:
$$
T_\lambda(x) = \frac{x^\lambda - 1}{\lambda},
$$
where $\lambda \in [0, 1]$ is also a parameter to be estimated. The Box-Cox transformation allows dollar volume to enter into the conditional mean nonlinearly, a particularly important innovation since common intuition suggests that price impact may exhibit economies of scale with respect to dollar volume; i.e., although total price impact is likely to increase with volume, the marginal price impact probably does not. The Box-Cox transformation captures the linear specification ($\lambda = 1$) and concave specifications up to and including the logarithmic function ($\lambda = 0$). The estimated curvature of this transformation will play an important role in the measurement of price impact.
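A short numerical sketch may help fix ideas about the Box-Cox transformation. The function name `box_cox` and the sample volumes are hypothetical; the logarithmic case is handled as the limit of $(x^\lambda - 1)/\lambda$ as $\lambda$ tends to zero.

```python
import math

def box_cox(x, lam):
    """Box-Cox transformation (x**lam - 1) / lam, with log(x) at lam == 0."""
    if x <= 0:
        raise ValueError("x must be positive")
    if lam == 0.0:
        return math.log(x)
    return (x ** lam - 1.0) / lam

# Made-up dollar volumes (in hundreds of dollars), equally spaced.
v_small, v_mid, v_large = 1.0, 100.0, 199.0

for lam in (1.0, 0.5, 0.0):
    first = box_cox(v_mid, lam) - box_cox(v_small, lam)
    second = box_cox(v_large, lam) - box_cox(v_mid, lam)
    print(lam, round(first, 3), round(second, 3))
```

For $\lambda = 1$ the two equal volume increments produce equal increments in the transformed variable, while for $\lambda < 1$ the second increment is smaller than the first, which is the diminishing marginal price impact that the concave specifications are meant to allow.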
The transformed dollar volume variable is interacted with $\mathrm{IBS}_k$, an indicator of whether the trade was buyer-initiated ($\mathrm{IBS}_k = 1$), seller-initiated ($\mathrm{IBS}_k = -1$), or indeterminate ($\mathrm{IBS}_k = 0$). A positive $\beta_{11}$ would imply that buyer-initiated trades tend to push prices up and seller-initiated trades tend to drive prices down. Such a relation is predicted by several information-based models of trading, e.g., Easley and O'Hara (1987). Moreover, the magnitude of $\beta_{11}$ is the per-unit volume impact on the conditional mean of $Y_k^*$, which may be readily translated into the impact on the conditional probabilities of observed price changes. The sign and magnitudes of $\beta_{12}$ and $\beta_{13}$ measure the persistence of price impact.
Finally, to complete the specification the conditional variance $\sigma_k^2 = \gamma_0^2 + \sum_i \gamma_i^2 W_{ik}$ must be parametrized. To allow for clock-time effects $\Delta t_k$ is included, and since there is some evidence linking bid-ask spreads to the information content and volatility of price changes (see, for example, Glosten [1987], Hasbrouck [1988, 1991a,b], and Petersen and Umlauf [1990]), the lagged spread $AB_{k-1}$ is also included. And since the parameter vectors $\alpha$, $\beta$, and $\gamma$ are unidentified without additional restrictions, $\gamma_0$ is set to one. This yields the specification
$$
\sigma_k^2 = 1 + \gamma_1^2\,\Delta t_k + \gamma_2^2\,AB_{k-1}.
$$
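To see how the conditional mean and variance enter the model together, the following sketch computes the conditional variance specified above and the implied state probabilities. It assumes the standard ordered probit form, in which the latent price change is conditionally normal and state $s_i$ is observed when the latent variable falls between consecutive partition boundaries; the function names and all parameter values are hypothetical and are not estimates from the study.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def conditional_variance(dt, spread, gamma1, gamma2):
    """sigma_k^2 = 1 + gamma1^2 * dt + gamma2^2 * spread (gamma0 set to one)."""
    return 1.0 + gamma1 ** 2 * dt + gamma2 ** 2 * spread

def state_probabilities(mean, variance, boundaries):
    """Probabilities of the m = len(boundaries) + 1 ordered states.

    boundaries are the partition boundaries in ascending order; mean is
    the conditional mean X_k' beta of the latent variable.
    """
    sigma = math.sqrt(variance)
    cdf = [0.0] + [normal_cdf((a - mean) / sigma) for a in boundaries] + [1.0]
    return [hi - lo for lo, hi in zip(cdf, cdf[1:])]

# Hypothetical values for a 5-state model (four partition boundaries).
boundaries = [-3.0, -1.0, 1.0, 3.0]
var_k = conditional_variance(dt=10.0, spread=2.0, gamma1=0.1, gamma2=0.5)

probs_zero_mean = state_probabilities(0.0, var_k, boundaries)
probs_positive_mean = state_probabilities(0.5, var_k, boundaries)
print([round(p, 4) for p in probs_zero_mean])
print([round(p, 4) for p in probs_positive_mean])
```

A larger time gap or a wider lagged spread raises the conditional variance and spreads probability toward the extreme states, while a positive conditional mean shifts probability toward the higher states.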
In summary, the 9-state specification requires the estimation of 24 parameters: the partition boundaries $\alpha_1, \ldots, \alpha_8$, the variance parameters $\gamma_1$ and $\gamma_2$, the coefficients of the explanatory variables $\beta_1, \ldots, \beta_{13}$, and the Box-Cox parameter $\lambda$. The 5-state specification requires the estimation of only 20 parameters.
Parameter IBM CUE FWC HNH NAV T
Maximum likelihood estimates of the partition boundaries of the ordered probit model for transaction price changes of International Business Machines Corporation (IBM, 206,794 trades), Quantum Chemical Corporation (CUE, 26,927 trades), Foster Wheeler Corporation (FWC, 18,199 trades), Handy and Harman Company (HNH, 3,174 trades), Navistar International Corporation (NAV, 96,127 trades), and American Telephone and Telegraph Company (T, 180,726 trades), for the period from January 4, 1988 to December 30, 1988.
The Maximum Likelihood Estimates
Tables 3.8a and 3.8b report the maximum likelihood estimates of the ordered probit model for the six stocks. Table 3.8a contains the estimates of the boundary partitions $\alpha$, and Table 3.8b contains the estimates of the slope coefficients $\beta$. Entries in each of the columns labeled with ticker symbols are the parameter estimates for that stock; z-statistics, which are asymptotically standard normal under the null hypothesis that the corresponding coefficient is zero, are contained in parentheses below each estimate.
Table 3.8a shows that the partition boundaries are estimated with high precision for all stocks and, as expected, the z-statistics are much larger for those stocks with many more observations. Note that the partition bound-
Table 3.8a. Estimates of ordered probit partition boundaries.