<pre>
-------------------------------------------------------------------------------
help for <b>xtabond2</b>
-------------------------------------------------------------------------------
<p>
<b><u>"Difference" and "system" GMM dynamic panel estimator</u></b>
<p>
        <b>xtabond2</b> <i>depvar</i> <i>varlist</i> [<b>if</b> <i>exp</i>] [<b>in</b> <i>range</i>] [<i>weight</i>] [<b>,</b> <b><u>l</u></b><b>evel(</b><i>#</i><b>)</b>
                <b><u>sv</u></b><b>mat</b> <b><u>two</u></b><b>step</b> <b><u>r</u></b><b>obust</b> <b><u>cl</u></b><b>uster(</b><i>varname</i><b>)</b> <b><u>noc</u></b><b>onstant</b> <b><u>sm</u></b><b>all</b>
                <b><u>nol</u></b><b>eveleq</b> <b><u>or</u></b><b>thogonal</b> <b>gmmopt</b> [<b>gmmopt</b> <i>...</i>] <b>ivopt</b> [<b>ivopt</b> <i>...</i>]
                <b>pca</b> <b><u>comp</u></b><b>onents(</b><i>#</i><b>)</b> <b><u>ar</u></b><b>tests(</b><i>#</i><b>)</b> <b><u>arl</u></b><b>evels</b> <b><u>h</u></b><b>(</b><i>#</i><b>)</b> <b><u>nod</u></b><b>iffsargan</b>
                <b><u>nom</u></b><b>ata</b>]
<p>
    where <b>gmmopt</b> is
<p>
        <b><u>gmm</u></b><b>style(</b><i>varlist</i> [<b>,</b> <b><u>lag</u></b><b>limits(</b><i>#</i> <i>#</i><b>)</b> <b><u>c</u></b><b>ollapse</b> <b><u>o</u></b><b>rthogonal</b> <b><u>e</u></b><b>quation(</b>{<b><u>d</u></b><b>iff</b>
                | <b><u>l</u></b><b>evel</b> | <b><u>b</u></b><b>oth</b>}<b>)</b> <b><u>p</u></b><b>assthru</b> <b><u>sp</u></b><b>lit</b>]<b>)</b>
<p>
    and <b>ivopt</b> is
<p>
        <b><u>iv</u></b><b>style(</b><i>varlist</i> [<b>,</b> <b><u>e</u></b><b>quation(</b>{<b><u>d</u></b><b>iff</b> | <b><u>l</u></b><b>evel</b> | <b><u>b</u></b><b>oth</b>}<b>)</b> <b><u>p</u></b><b>assthru</b> <b><u>mz</u></b>]<b>)</b>
<p>
    <b>aweight</b>s, <b>pweight</b>s, and <b>fweight</b>s are allowed. <b>fweights</b> must be constant
    over time. See help weights.
<p>
    <b>xtabond2</b> is for use with cross-section time-series data.  You must <b>tsset</b>
    your data before using <b>xtabond2</b>; see help tsset.
<p>
    All <i>varlist</i>s may contain time-series operators and, in Stata version 11
    or later, factor variables. See help varlist.
<p>
    <b>by</b> <i>...</i> <b>:</b> may be used with <b>xtabond2</b> if no time-series operators are used
    in the command line.  The <b>by</b> clause will not restrict the sample from
    which lags are drawn in building instruments. See help by.
<p>
    <b>xtabond2</b> shares features of all estimation commands; see help estcom.
<p>
    The syntax of predict following <b>xtabond2</b> is
<p>
        <b>predict</b> [<i>type</i>] <i>newvarname</i> [<b>if</b> <i>exp</i>] [<b>in</b> <i>range</i>] [<b>,</b> <i>statistic</i>]
                [<b><u>diff</u></b><b>erence</b>]
<p>
    where <i>statistic</i> is
<p>
        <b>xb</b>          bx_it, fitted values (the default)
        <b><u>re</u></b><b>siduals</b>   e_it, the residuals
<p>
<p>
<b><u>Description</u></b>
<p>
    <b>xtabond2</b> can fit two closely related dynamic panel data models.  The
    first is the Arellano-Bond (1991) estimator, which is also available with
    <b>xtabond</b>, though without the two-step standard error correction described
    below.  It is sometimes called "difference GMM." The second is an
    augmented version outlined by Arellano and Bover (1995) and fully
    developed by Blundell and Bond (1998).  It is known as "system GMM."
    Roodman (2009) provides a pedagogic introduction to linear GMM, these
    estimators, and <b>xtabond2</b>.  The estimators are designed for dynamic
    "small-T, large-N" panels that may contain fixed effects and--separate
    from those fixed effects--idiosyncratic errors that are heteroskedastic
    and correlated within but not across individuals.  Consider the model:
<p>
    y_it = x_it * b_1 + w_it * b_2 + u_it      i=1,...,N;     t=1,...,T
    u_it = v_i + e_it,
<p>
    where
<p>
    v_i are unobserved individual-level effects;
<p>
    e_it are the observation-specific errors;
<p>
    x_it is a vector of strictly exogenous covariates (ones dependent on
            neither current nor past e_it);
<p>
    w_it is a vector of predetermined covariates (which may include the lag
            of y) and endogenous covariates, all of which may be correlated
            with the v_i (Predetermined variables are potentially correlated
            with past errors.  Endogenous ones are potentially correlated
            with past and present errors.);
<p>
    b_1 and b_2 are vectors of parameters to be estimated;
<p>
    and E[v_i]=E[e_it]=E[v_i*e_it]=0, and E[e_it*e_js]=0 for each i, j, t, s,
    i&lt;&gt;j.
<p>
    First-differencing the equation removes the v_i, thus eliminating a
    potential source of omitted variable bias in estimation.  However,
    differencing variables that are predetermined but not strictly exogenous
    makes them endogenous since the w_it in some D.w_it = w_it - w_i,t-1 is
    correlated with the e_i,t-1 in D.e_it.  Following Holtz-Eakin, Newey, and
    Rosen (1988), Arellano and Bond (1991) develop a Generalized Method of
    Moments estimator that instruments the differenced variables that are not
    strictly exogenous with all their available lags in levels.  (Strictly
    exogenous variables are uncorrelated with current and past errors.)
    Arellano and Bond also develop an appropriate test for autocorrelation,
    which, if present, can render some lags invalid as instruments.
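<p>
    The effect of first-differencing on the v_i can be checked numerically.
    The sketch below is plain Python, not <b>xtabond2</b> code; the coefficient
    b=0.5, the data, and the function names are hypothetical and chosen only
    for illustration.  Two individuals share the same x and e but have very
    different fixed effects, and their differenced outcomes coincide.
<p>
```python
# Illustrative only: a one-regressor panel, not xtabond2 code.
def first_difference(series):
    """Return D.series: each element minus its predecessor (period 1 is lost)."""
    return [cur - prev for prev, cur in zip(series, series[1:])]

def simulate_y(v_i, x, e, b=0.5):
    """y_it = b * x_it + v_i + e_it for one individual (b is hypothetical)."""
    return [b * x_t + v_i + e_t for x_t, e_t in zip(x, e)]

x = [1.0, 2.0, 4.0, 7.0]
e = [0.25, -0.5, 0.0, 0.75]

# Same x and e, very different fixed effects.
dy_small = first_difference(simulate_y(v_i=0.0, x=x, e=e))
dy_large = first_difference(simulate_y(v_i=100.0, x=x, e=e))

print(dy_small)
print(dy_large)
```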
<p>
    A problem with the original Arellano-Bond estimator is that lagged levels
    are poor instruments for first differences if the variables are close to
    a random walk.  Arellano and Bover (1995) describe how, if the original
    equation in levels is added to the system, additional instruments can be
    brought to bear to increase efficiency.  In this equation, variables in
    <i>levels</i> are instrumented with suitable lags of their own <i>first</i>
    <i>differences</i>.  The assumption needed is that these differences are
    uncorrelated with the unobserved individual effects.  Blundell and Bond show
    that this assumption in turn depends on a more precise one about initial
    conditions.
<p>
    <b>xtabond2</b> implements both estimators--twice.  The version in Stata’s ado
    programming language is slow but compatible with Stata 7 and 8.  The Mata
    version is usually faster, and runs in Stata 10.0 or later.  The <b>xtabond2</b>
    option <b>nomata</b> prevents the use of Mata even when it is available.
<p>
    The Mata version also includes the option to use the forward orthogonal
    deviations transform instead of first differencing.  Proposed by Arellano
    and Bover (1995), the orthogonal deviations transform, rather than
    subtracting the previous observation, subtracts the average of all
    available future observations.  The result is then multiplied by a scale
    factor chosen to yield the nice but relatively unimportant property that
    if the original e_it are i.i.d., then so are the transformed ones (see
    Arellano and Bover (1995) and Roodman (2009)).  Like differencing, taking
    orthogonal deviations removes fixed effects.  Because lagged observations
    of a variable do not enter the formula for the transformation, they
    remain orthogonal to the transformed errors (assuming no serial
    correlation), and available as instruments.  In fact, for consistency,
    the software stores the orthogonal deviation of an observation one period
    late, so that, as with differencing, observations for period 1 are
    missing and, for an instrumenting variable w, w_i,t-1 enters the formula
    for the transformed observation stored at i,t.  With this move, exactly
    the same lags of variables are valid as instruments under the two
    transformations.
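<p>
    A minimal Python sketch of the transform for one individual with no
    gaps follows.  It is not the Mata implementation.  The scale factor
    sqrt(n/(n+1)), with n the number of future observations, is the one
    given in Arellano and Bover (1995) and Roodman (2009); the function
    name and data are hypothetical.  The result is stored one period late,
    so the first position is missing.
<p>
```python
import math

def forward_orthogonal_deviations(series):
    """Forward orthogonal deviations of one individual's complete series.

    Position t of the result holds the scaled deviation of observation
    t-1 from the mean of all later observations, so position 0 is missing.
    """
    out = [None]  # period 1 is missing, as with differencing
    for t in range(len(series) - 1):
        future = series[t + 1:]
        n_future = len(future)
        scale = math.sqrt(n_future / (n_future + 1))
        out.append(scale * (series[t] - sum(future) / n_future))
    return out

e = [0.25, -0.5, 0.0, 0.75]
with_effect = [100.0 + e_t for e_t in e]   # add a fixed effect v_i = 100

fod_e = forward_orthogonal_deviations(e)
fod_y = forward_orthogonal_deviations(with_effect)
print(fod_e)
print(fod_y)
```
<p>
    Up to floating-point rounding, the two printed lists agree, which is the
    sense in which the transform removes the fixed effect.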
<p>
    On balanced panels, GMM estimators based on the two transforms return
    numerically identical coefficient estimates, holding the instrument set
    fixed (Arellano and Bover 1995).  But orthogonal deviations has the
    virtue of preserving sample size in panels with gaps.  If some e_it is
    missing, for example, neither D.e_it nor D.e_i,t+1 can be computed.  But
    the orthogonal deviation can be computed for every complete observation
    except the last for each individual.  (First differencing can do no
    better since it must drop the first observation for each individual.)
    Note that "difference GMM" is still called that even when orthogonal
    deviations are used.  We will refer to the equation in differences or
    orthogonal deviations as the <i>transformed</i> equation.  In system GMM with
    orthogonal deviations, the levels or <i>untransformed</i> equation is still
    instrumented with differences as described above.
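<p>
    The sample-size point can be made concrete by counting usable
    observations in a series with a gap.  The Python sketch below is
    illustrative only (hypothetical names and data, with None marking a
    missing value); it counts observations rather than computing the
    transforms.
<p>
```python
def usable_first_differences(series):
    """Count periods t where both y_t and y_(t-1) are observed."""
    return sum(
        1
        for prev, cur in zip(series, series[1:])
        if prev is not None and cur is not None
    )

def usable_orthogonal_deviations(series):
    """Count observed y_t that have at least one observed later value."""
    count = 0
    for t, value in enumerate(series):
        later = [v for v in series[t + 1:] if v is not None]
        if value is not None and later:
            count += 1
    return count

gappy = [1.0, 2.0, None, 4.0, 5.0]   # period 3 is missing
n_diff = usable_first_differences(gappy)
n_fod = usable_orthogonal_deviations(gappy)
print(n_diff, n_fod)
```
<p>
    With one interior gap, differencing keeps two observations while
    orthogonal deviations keep three: every observed value except the last.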
<p>
    <b>xtabond2</b> reports the Arellano-Bond test for autocorrelation, which is
    applied to the differenced residuals in order to purge the unobserved and
    perfectly autocorrelated v_i.  AR(1) is expected in first differences,
    because D.e_i,t = e_i,t - e_i,t-1 should correlate with D.e_i,t-1 =
    e_i,t-1 - e_i,t-2 since they share the e_i,t-1 term.  So to check for
    AR(1) in levels, look for AR(2) in differences, on the idea that this
    will detect the relationship between the e_i,t-1 in D.e_i,t and the
    e_i,t-2 in D.e_i,t-2.  This reasoning does not work for orthogonal
    deviations, in which the residuals for an individual are all
    mathematically interrelated, thus contaminated from the point of view of
    detecting AR in the e_it.  So the test is run on differenced residuals
    even after estimation in deviations.  Autocorrelation indicates that lags
    of the dependent variable (and any other variables used as instruments
    that are not strictly exogenous), are in fact endogenous, thus bad
    instruments.  For example, if there is AR(s), then y_i,t-s would be
    correlated with e_i,t-s, which would be correlated with D.e_i,t-s, which
    would be correlated with D.e_i,t.
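<p>
    The expected pattern under i.i.d. errors can be verified with a few
    lines of Python.  This is a sketch under the assumption var[e]=I, not
    the Arellano-Bond test itself, and the function names are hypothetical.
    Each differenced error is written as a vector of coefficients on
    e_1,...,e_T, so covariances reduce to dot products.
<p>
```python
def difference_weights(t, n_periods):
    """Coefficients of e_1..e_T in D.e_t = e_t - e_(t-1); t is 1-based."""
    weights = [0] * n_periods
    weights[t - 1] = 1
    weights[t - 2] = -1
    return weights

def covariance(a, b):
    """Covariance of two linear combinations of e when var[e] = I."""
    return sum(a_k * b_k for a_k, b_k in zip(a, b))

T = 6
d5 = difference_weights(5, T)
d4 = difference_weights(4, T)
d3 = difference_weights(3, T)

ar1 = covariance(d5, d4) / covariance(d5, d5)   # first-order correlation
ar2 = covariance(d5, d3) / covariance(d5, d5)   # second-order correlation
print(ar1, ar2)
```
<p>
    The first-order correlation is -0.5 by construction, while the
    second-order correlation is zero unless the e_it themselves are
    serially correlated.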
<p>
    <b>xtabond2</b> also reports tests of over-identifying restrictions--of whether
    the instruments, as a group, appear exogenous.  For one-step, non-robust
    estimation, it reports the Sargan statistic, which is the minimized value
    of the one-step GMM criterion function.  The Sargan statistic is not
    robust to heteroskedasticity or autocorrelation.  So for one-step, robust
    estimation (and for all two-step estimation), <b>xtabond2</b> also reports the
    Hansen <i>J</i> statistic, which is the minimized value of the two-step GMM
    criterion function, and is robust.  <b>xtabond2</b> still reports the Sargan
    statistic in these cases because the <i>J</i> test has its own problem: it can
    be greatly weakened by instrument proliferation.  The Mata version goes
    further, reporting difference-in-Sargan statistics (really,
    difference-in-Hansen statistics, except in one-step robust estimation),
    which test for whether subsets of instruments are valid.  To be precise,
    it reports one test for each group of instruments defined by an <b>ivstyle()</b>
    or <b>gmmstyle()</b> option (explained below).  So replacing <b>gmmstyle(x y)</b> in a
    command line with <b>gmmstyle(x) gmmstyle(y)</b> will yield the same estimate
    but distinct difference-in-Sargan/Hansen tests.  In addition, including
    the <b><u>sp</u></b><b>lit</b> suboption in a <b>gmmstyle()</b> option in system GMM splits an
    instrument group in two for difference-in-Sargan/Hansen purposes, one
    each for the transformed equation and levels equations.  This is
    especially useful for testing the instruments for the levels equation
    based on lagged differences of the dependent variable, which are the most
    suspect in system GMM and the subject of the "initial conditions" in the
    title of Blundell and Bond (1998).  In the same vein, in system GMM,
    <b>xtabond2</b> also tests all the GMM-type instruments for the levels equation
    as a group.  All of these tests, however, are weak when the instrument
    count is high.  Difference-in-Sargan/Hansen tests are computationally
    intensive since they involve re-estimating the model for each test; the
    <b>nodiffsargan</b> option is available to prevent them.
<p>
    As linear GMM estimators, the Arellano-Bond and Blundell-Bond estimators
    have one- and two-step variants.  But though two-step is asymptotically
    more efficient, the reported two-step standard errors tend to be severely
    downward biased (Arellano and Bond 1991; Blundell and Bond 1998).  To
    compensate, <b>xtabond2</b> makes available a finite-sample correction to the
    two-step covariance matrix derived by Windmeijer (2005).  This can make
    two-step robust estimations more efficient than one-step robust,
    especially for system GMM.
<p>
    Standard errors can also be "bootstrapped"--but not with the <b>bootstrap</b>
    command. That command builds temporary data sets by sampling the real one
    <i>with replacement</i>. And having multiple observations for a given
    observational unit and time period violates panel structure. Instead, use
    <b>jackknife</b>, perhaps with the <b>cluster()</b> option, clustering on the panel
    identifier variable, in order to drop each observational unit in turn.
<p>
    The syntax of <b>xtabond2</b> differs substantially from that of <b>xtabond</b> and
    <b>xtdpdsys</b>.  <b>xtabond2</b> almost completely decouples specification of
    <i>regressors</i> from specification of <i>instruments</i>.  As a result, most
    variables used will appear twice in an <b>xtabond2</b> command line.  <b>xtabond2</b>
    requires the initial <i>varlist</i> of the command line to include all
    regressors except for the optional constant term, be they strictly
    exogenous, predetermined, or endogenous.  Variables used to form
    instruments then appear in <b>gmmstyle()</b> or <b>ivstyle()</b> options after the
    comma.  The result is a loss of parsimony, but fuller control over the
    instrument matrix.  Variables can be used as the basis for "GMM-style"
    instrument sets without being included as regressors, or vice versa.
<p>
    The <b><u>gmm</u></b><b>style()</b> and <b><u>iv</u></b><b>style()</b> options also have suboptions that allow
    further customization of the instrument matrix.
<p>
<p>
<b><u>Citation</u></b>
    <b>xtabond2</b> is not an official Stata command.  It is a free contribution to
        the research community.  Please cite it as such:
        Roodman, D. 2009. How to do xtabond2: An introduction to difference
        and system GMM in Stata. <i>Stata Journal</i> 9(1): 86-136.
<p>
<p>
<b><u>Options</u></b>
<p>
    <b>level(</b><i>#</i><b>)</b> specifies the confidence level, in percent, for confidence
        intervals of the coefficients; see help level. The default is 95.
<p>
    <b><u>sv</u></b><b>mat</b> tells <b>xtabond2</b> to save the X, Y, Z, H, and weight matrices as e()
        return macros. These are not included by default because the matrices
        can be larger than the data set itself. If the <b>pca</b> option is used,
        <b>svmat</b> will also save the eigenvectors matrix as
        xtabond2_eigenvectors. This option is available only when using
        the Mata implementation in Mata's speed-favoring mode.  Data are
        stored in balanced matrices and sorted by individual, equation (for
        System GMM), then time. Rows and columns are labelled for clarity.
        The instrument matrix typically contains all-zero columns, which do
        not affect estimation. For compatibility with Stata column-labeling
        conventions, instruments subject to the backward orthogonal
        deviations transform (see below) are still denoted with a "D."
        operator.
<p>
    <b>twostep</b> specifies that the two-step estimator is to be calculated instead
        of the one-step.
<p>
    <b>robust</b>: For one-step estimation, <b>robust</b> specifies that the robust
        estimator of the covariance matrix of the parameter estimates be
        calculated.  The resulting standard error estimates are consistent in
        the presence of any pattern of heteroskedasticity and autocorrelation
        within panels.  In two-step estimation, the standard covariance
        matrix is already robust in theory--but typically yields standard
        errors that are downward biased.  <b>twostep robust</b> requests
        Windmeijer’s finite-sample correction for the two-step covariance
        matrix.
<p>
    <b>cluster(</b><i>varname</i><b>)</b> overrides the default use of the panel identifier (as
        set by <b>tsset</b>) as the basis for defining groups. <b>cluster(</b><i>varname</i><b>)</b>
        implies <b>robust</b> in the senses just described. For example, in two-step
        estimation, it requests the Windmeijer correction. Changing the group
        identifier with this option affects one-step "robust" standard
        errors, all two-step results, the Hansen and difference-in-Hansen
        tests, and the Arellano-Bond serial correlation tests.
<p>
    <b>noconstant</b> suppresses the constant term in the levels equation.  By
        default, the term is included as a regressor and IV-style instrument.
        Unlike xtabond and DPD (the original implementation of these
        estimators), <b>xtabond2</b> does not include the constant term in the
        transformed equation in difference GMM.  Rather, the constant is
        transformed out.
<p>
    <b>small</b> requests <i>t</i> statistics instead of <i>z</i> statistics and an <i>F</i> test instead
        of a Wald chi-squared test of overall model fit.
<p>
    <b>noleveleq</b> specifies that the levels equation should be excluded from the
        estimation, yielding difference rather than system GMM.
<p>
    <b>nodiffsargan</b> prevents difference-in-Sargan/Hansen tests, which are
        computationally intensive since they involve re-estimating the model
        for each test.  The option has no effect on the ado version of
        <b>xtabond2</b>, which does not perform difference-in-Sargan/Hansen testing
        anyway.
<p>
    <b>nomata</b> prevents the use of Mata code even when the language is available
        (in Stata 10.0 or later). It is not necessary in Stata 7-9.
        Ordinarily this switch does not affect results.  However, if some
        variables are collinear or nearly so, the two versions of the program
        may drop different ones, which can affect the results.  They can
        even differ in how many they drop, since the versions use different
        routines and tolerances for determining collinearity.  In addition,
        the Mata version does not perfectly handle strange and unusual
        expressions like <b>gmm(L.x, lag(-1 -1))</b>. (Documentation for the
        <b>gmmstyle()</b> option is below.) This expression is the same as <b>gmm(x,</b>
        <b>lag(0 0))</b> in principle.  But the Mata code would interpret it by
        lagging x, thus losing the observations of x for <i>t=T</i>, then unlagging
        the remaining information.  The slow, ado version would not lose data
        in this way.
<p>
    <b>orthogonal</b> requests the forward orthogonal deviations transform instead
        of differencing.
<p>
    <b>ivstyle()</b> specifies a set of variables to serve as standard instruments,
        with one column in the instrument matrix per variable.  Normally,
        strictly exogenous regressors are included in <b>ivstyle</b> options, in
        order to enter the instrument matrix, as well as being listed before
        the main comma of the command line.  The <b>equation()</b> suboption
        specifies which equation(s) should use the instruments:
        first-difference only (<b>equation(diff)</b>), levels only
        (<b>equation(level)</b>), or both (<b>equation(both)</b>), the default.  Also by
        default, the instruments are transformed (into differences or
        orthogonal deviations) for use in the transformed equation and
        entered untransformed for the levels equation.  The suboption
        <b>passthru</b> may be used after <b>equation(diff)</b>, or when the option
        <b>noleveleq</b> is invoked, to prevent this transformation.  <b>equation()</b> is
        useful for proper handling of predetermined variables used as
        IV-style instruments in system GMM.  For example, if x is
        predetermined, it is a valid instrument for the levels equation since
        it is assumed to be uncorrelated with the contemporaneous error term.
        However, x becomes endogenous in first differences, so D.x is not a
        valid instrument for the transformed equation.  <b>ivstyle(x)</b> would
        therefore be inappropriate.  The use of x as an IV-style instrument
        in levels only could be specified by <b>iv(x, eq(level))</b>.
<p>
        If the suboption <b>mz</b> is included in an <b>ivstyle</b> option, missing values
        in the instruments are converted to zeroes.  <b>mz</b> does not change the
        precise moment conditions generated by <b>ivstyle</b>--they still apply only
        to the error terms of observations which have data for the
        instruments.  Rather, <b>mz</b> allows observations that are missing data
        for the instruments in question to nonetheless stay in the regression
        <i>if</i> the instruments are not also regressors.  (Observations missing
        values for regressors must still be dropped.)
<p>
    <b><u>gmm</u></b><b>style()</b> specifies a set of variables to be used as bases for
        "GMM-style" instrument sets described in Holtz-Eakin, Newey, and
        Rosen (1988) and Arellano and Bond (1991).  By default <b>xtabond2</b> uses,
        for each time period, all available lags of the specified variables
        in levels dated t-1 or earlier as instruments for the transformed
        equation; and uses the contemporaneous first differences as
        instruments in the levels equation. These defaults are appropriate
        for predetermined variables that are not strictly exogenous (Bond
        2000).  Missing values are always replaced by zeros.  The optional
        <b>laglimits(</b><i>a b</i><b>)</b> suboption can override these defaults: for the
        transformed equation, lagged levels dated t-<i>a</i> to t-<i>b</i> are used as
        instruments, while for the levels equation, the first-difference
        dated t-<i>a</i>+1 is normally used.  <i>a</i> and <i>b</i> can each be missing ("."); <i>a</i>
        defaults to 1 and <i>b</i> to infinity.  They can even be negative, implying
        "forward" lags.  If <i>a</i>&gt;<i>b</i> then <b>xtabond2</b> swaps their values.  (Note that
        if <i>a</i>&lt;=<i>b</i>&lt;0 then the first-difference dated t-<i>b</i>+1 is normally used as
        an instrument in the levels equation instead of that dated t-<i>a</i>+1,
        because it is more frequently in the range [1,T] of valid time
        indexes.  Or, for the same reasons, if  <i>a</i>&lt;=0&lt;=<i>b</i> or  <i>b</i>&lt;=0&lt;=<i>a</i>, the
        first-difference dated t is used.) Since the <b>gmmstyle()</b> <i>varlist</i>
        allows time-series operators, there are many routes to the same
        specification.  E.g., <b>gmm(w, lag(2 .))</b>, the standard treatment for an
        endogenous variable, is equivalent to <b>gmm(L.w, lag(1 .))</b>, thus
        <b>gmm(L.w)</b>.
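<p>
        The equivalence of these specifications can be traced with a small
        Python sketch.  It is a simplified model of how lag limits select
        instrument dates for the transformed equation, not the <b>xtabond2</b>
        code: the function and its <i>shift</i> argument (the number of L.
        operators applied) are hypothetical, and math.inf stands in for ".".
<p>
```python
import math

def instrument_dates(t, n_periods, a=1, b=math.inf, shift=0):
    """Dates of the underlying levels used to instrument period t.

    a, b   lag limits (b=math.inf plays the role of ".")
    shift  number of L. operators on the variable (L.w has shift=1)
    """
    a, b = sorted((a, b))                    # limits are swapped if a > b
    first = 1 if b == math.inf else t - b
    dates = range(first, t - a + 1)          # dates t-b .. t-a of the listed variable
    valid = range(1, n_periods + 1)
    # A lagged variable dated s carries the level dated s - shift.
    return [s - shift for s in dates if s in valid and s - shift in valid]

T = 6
endog = instrument_dates(5, T, a=2)              # gmm(w, lag(2 .))
lagged = instrument_dates(5, T, a=1, shift=1)    # gmm(L.w, lag(1 .))
print(endog, lagged)
```
<p>
        For period 5 both calls select the levels of w dated 1 to 3.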
<p>
        The <b><u>e</u></b><b>quation()</b> suboption of <b>gmmstyle()</b> works much like that of
        <b>ivstyle()</b> (see above), with one important exception.  In response to
        <b>equation(level)</b>, <b>xtabond2</b> generates the <i>full set</i> of available
        instruments for the levels equation since it is no longer the case
        that most are made mathematically redundant by the presence of the
        full set of moment conditions for the transformed equation.  To be
        precise, if the lag limits are <i>a</i> and <i>b</i>, then lags of the specified
        variables in differences dated t-<i>b</i> to t-<i>a</i> are used.  <b>equation(diff)</b>
        has no effect in difference GMM.
<p>
        The <b><u>p</u></b><b>assthru</b> suboption of <b>gmmstyle()</b> is meaningful only in system
        GMM, and only for variables for which <b>equation(level)</b> has also been
        specified.  It directs <b>xtabond2</b> to create instruments for the levels
        equation that use not the first-differences of the specified
        variables but the original levels of the same dates.  For example,
        <b>equation(level) passthru laglimits(1 .)</b> requests that all lagged
        levels be used as instruments.  Under the standard assumptions, these
        instruments are not valid.
<p>
        The <b><u>o</u></b><b>rthogonal</b> suboption tells <b>xtabond2</b> to apply the backward
        orthogonal deviations transform to the instruments for the
        transformed equation. Essentially, instruments are replaced with
        their deviations from past means. Since the resulting instruments
        depend on all past values of the underlying variables, the regressors
        in the transformed equation should not be similarly transformed.
        Otherwise the instruments may be correlated with the error. That is,
        if this suboption is used the <b><u>or</u></b><b>thogonal</b> <i>option</i> should also be
        included (outside a <b>gmmstyle()</b> option). In simulations, Hayakawa
        (2009) finds that "Difference GMM" with this combination--backward
        orthogonal deviations for the instruments and forward for the
        regressors--is less biased and more stable than traditional
        Difference GMM for a standard AR(1) model when <i>T</i>&gt;=10. (For an AR(p)
        model, he uses only the most recent p instrument lags, equivalent to
        <b>gmm(L.y, orthog lag(1 </b><i>p</i><b>))</b>.) This option does not affect the
        instruments for the levels equation.
<p>
        The <b><u>sp</u></b><b>lit</b> suboption of <b>gmmstyle()</b> is also meaningful only in system
        GMM, and then only when neither <b>eq(diff)</b> nor <b>eq(level)</b> is specified.
        Its sole effect is to split the specified instrument group in two for
        purposes of difference-in-Sargan/Hansen testing--one instrument set
        for the transformed equation and one for the levels equation.
<p>
        The <b><u>c</u></b><b>ollapse</b> suboption of <b>gmmstyle()</b> specifies that <b>xtabond2</b> should
        create one instrument for each variable and lag distance, rather than
        one for each time period, variable, and lag distance.  In large
        samples, <b>collapse</b> reduces statistical efficiency.  But in small
        samples it can avoid the bias that arises as the number of
        instruments climbs toward the number of observations.  (When
        instruments are many, they tend to overfit the instrumented variables
        and bias the results toward those of OLS/GLS.) <b>collapse</b> also greatly
        curtails computational demands by reducing the width of the
        instrument matrix, and (relevant for the ado version of the program)
        helps keep the matrix within Stata's size limit.
<p>
        For example, if a model assumes that E[w_is*D.e_it] = 0 for all s&lt;t,
        this is expressed in standard Arellano-Bond estimation as:
<p>
            sum_i (w_is * D.e_it) = 0 for each s and t, s&lt;t.
<p>
        This translates into columns in the instrument matrix of the form:
<p>
            w_i1  0    0    0    0    0   ...
             0   w_i1 w_i2  0    0    0   ...
             0    0    0   w_i1 w_i2 w_i3 ...
             .    .    .    .    .    .   ...
             .    .    .    .    .    .   ...
<p>
        <b>collapse</b> divides the "GMM-style" moment conditions into groups and
        sums the conditions in each group to form a smaller set of conditions
        of the form:
<p>
            sum_i,t (w_i,t-j * D.e_it)= 0 for each j&gt;0.
<p>
        This is equivalent to combining columns of the instrument matrix by
        addition, yielding:
<p>
            w_i1  0    0   ...
            w_i2 w_i1  0   ...
            w_i3 w_i2 w_i1 ...
             .    .    .   ...
             .    .    .   ...
<p>
        Similarly, the standard instruments for the levels equation (in
        system GMM) collapse from:
<p>
            D.w_i2    0      0   ...
               0   D.w_i3    0   ...
               0      0   D.w_i4 ...
               .      .      .   ...
<p>
        To the single column:
<p>
            D.w_i2
            D.w_i3
            D.w_i4
               .  
               .  
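<p>
        The column-addition view of <b>collapse</b> can be reproduced for a single
        individual in Python.  This is a sketch for illustration, not the
        <b>xtabond2</b> implementation; the function names and the values of w
        are hypothetical, and only the transformed-equation instruments
        with default lag limits are built.
<p>
```python
def gmm_instruments(w):
    """Uncollapsed GMM-style block for one individual.

    Row t (t = 2..T) holds w_1..w_(t-1) in its own set of columns, one
    column per (period, instrument date) pair, and zeros elsewhere.
    """
    T = len(w)
    columns = [(t, s) for t in range(2, T + 1) for s in range(1, t)]
    rows = []
    for t in range(2, T + 1):
        rows.append([w[s - 1] if col_t == t else 0 for col_t, s in columns])
    return rows, columns

def collapse(rows, columns):
    """Add up all columns that share the same lag distance t - s."""
    max_lag = max(t - s for t, s in columns)
    collapsed = []
    for row in rows:
        collapsed.append([
            sum(x for x, (t, s) in zip(row, columns) if t - s == lag)
            for lag in range(1, max_lag + 1)
        ])
    return collapsed

w = [10, 20, 30, 40]            # w_i1..w_i4 for one individual
Z, cols = gmm_instruments(w)
Z_collapsed = collapse(Z, cols)
for row in Z:
    print(row)
for row in Z_collapsed:
    print(row)
```
<p>
        With T=4 the uncollapsed block has six columns and the collapsed
        block three, one per lag distance.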
<p>
    <b>pca</b> tells <b>xtabond2</b> to replace the "GMM-style" instruments with their
        principal components in order to reduce the instrument count in a
        minimally arbitrary way (Kapetanios and Marcellino 2010; Bai and Ng
        2010; Mehrhoff 2009). Principal components analysis is run on the
        correlation, not covariance, matrix of the "GMM-style" instruments.
        By default <b>xtabond2</b> will select all components with eigenvalues at
        least 1, and will select more if necessary to guarantee that
        instruments are at least as numerous as regressors, favoring those
        with largest eigenvalues.
<p>
    <b><u>comp</u></b><b>onents(</b><i>#</i><b>)</b> allows the user to override the default number of
        components described just above.
<p>
    <b>artests(</b><i>#</i><b>)</b> specifies the maximum order of the autocorrelation tests to be
        reported. The default is 2.
<p>
    <b>arlevels</b> specifies that the autocorrelation tests should be applied to
        the residuals from the levels, not first-difference, equation.  It
        cannot be specified along with <b>noleveleq</b>.  If there are fixed
        effects, then autocorrelation in levels is expected and would not
        call the specification into question.
<p>
    <b>h(</b><i>#</i><b>)</b> controls the form of H, the <i>a priori</i> estimate of the covariance
        matrix of the idiosyncratic errors.  In one-step linear GMM, the
        inverse of Z'HZ, where Z is the instrument matrix, proxies for the
        covariance matrix of the moments, and is used to weight the sample
        moments whose magnitudes are jointly minimized.  Since H merely
        controls the weights on instruments believed exogenous, for any
        non-degenerate choice of H, one-step estimates will be consistent.
        And two-step estimates will be asymptotically efficient (Baum,
        Schaffer, and Stillman 2003).  So the priority in designing H is
        minimizing arbitrariness.  H always has block diagonal form, with all
        blocks the same. Let * indicate variables transformed by orthogonal
        deviations or differencing and M be the (T-1)xT matrix that performs
        the chosen transform.  We assume for the purposes of designing H that
        var[e]=I, the identity matrix.  Then, for difference GMM, the
        (T-1)x(T-1) blocks of H by default are MM', which is var[u*]
        (= var[e*]) when var[e]=I (see Roodman 2009). For orthogonal
        deviations, MM'=I.  For differencing, it is:
<p>
             2 -1  0 ...
            -1  2 -1 ...
             0 -1  2 ...
             .  .  . ...
<p>
        To perform system GMM, <b>xtabond2</b> treats the transformed data as being
        for periods 2 to T and levels data as being for periods T+1 to 2T.
        The blocks of H are then (2T-1)x(2T-1) <i>a priori</i> estimates of the
        covariance of the compound vector [u*' u']'. If we assume, in
        addition to var[e]=I, that var[v]=0 (no fixed effects), then the
        blocks of H are
<p>
            MM'   M
            M'    I
<p>
        However, more than one choice for H is present in the literature.  In
        <b>xtabond2</b>, <b>h(3)</b>, the default, specifies the matrices described above.
        <b>h(2)</b> differs in that for system GMM the upper right and lower left
        quadrants of the depicted H are zeroed out.  This matches current
        versions of DPD for Gauss and Ox (Arellano and Bond 1998; Doornik,
        Arellano, and Bond 2002). <b>h(1)</b> specifies that H=I for both difference
        and system GMM.  H took this value in the original implementation of
        the system GMM estimator, in Blundell and Bond (1998).  In one-step
        GMM, setting H=I essentially gives 2SLS.
<p>
    The Mata system parameter matafavor influences the behavior of the Mata
        version of <b>xtabond2</b>.  Type <b>mata: mata set matafavor speed</b> or <b>mata:</b>
        <b>mata set matafavor space</b> before running <b>xtabond2</b> to influence the
        tradeoff it makes between speed and memory use.  Add the <b>, perm</b>
        option to these commands to make the change permanent.  <b>Note:</b>
        Increasing the amount of memory available for Stata data sets using
        the <b>set memory</b> command <i>reduces</i> that available to Mata.  So if Mata
        <b>xtabond2</b> is running out of memory, usually indicated by an "unable to
        allocate real" message, also try reducing Stata memory with <b>set</b>
        <b>memory</b>.
<p>
<b><u>Options for </u></b><b><u>predict</u></b>
<p>
    <b>xb</b>, the default, calculates the linear prediction.
<p>
    <b><u>re</u></b><b>siduals</b> calculates the residual error of the dependent variable from
        the linear prediction.
<p>
    <b><u>diff</u></b><b>erence</b> requests that the first-differences of the dependent variable,
        rather than the levels, be predicted.
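<p>
    A minimal sketch of these options in use, assuming the abdata example
    data set from the Examples section.  The new variable names (n_hat,
    e_hat, dn_hat) are hypothetical.

```stata
use http://www.stata-press.com/data/r7/abdata.dta
xtabond2 n L.n L(0/1).(w k) yr1980-yr1984, gmm(L.n w k) ///
    iv(yr1980-yr1984) robust twostep small

predict double n_hat                // xb is the default: linear prediction
predict double e_hat, residuals     // residuals from the linear prediction
predict double dn_hat, difference   // prediction for first differences of n
```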
<p>
<b><u>Return values</u></b>
<p>
   Scalars
       <b>e(N)</b>               Number of complete observations in untransformed data
                             (system GMM) or transformed data (difference GMM)
       <b>e(sargan)</b>          Sargan statistic 
       <b>e(sar_df)</b>          Degrees of freedom for Sargan statistic
       <b>e(sarganp)</b>         p value of Sargan statistic
       <b>e(hansen)</b>          Hansen J statistic 
       <b>e(hansen_df)</b>       Degrees of freedom for Hansen statistic
       <b>e(hansenp)</b>         p value of Hansen statistic
       <b>e(artests)</b>         Number of AR tests requested
       <b>e(ar</b><i>i</i><b>)</b>             AR(<i>i</i>) test statistic 
       <b>e(ar</b><i>i</i><b>p)</b>            p value of AR(<i>i</i>) statistic
       <b>e(df_m)</b>            Model degrees of freedom
       <b>e(df_r)</b>            Residual degrees of freedom (if <b>small</b> specified)
       <b>e(chi2)</b>            Wald chi-squared statistic (if <b>small</b> not specified)
       <b>e(chi2p)</b>           p value of Wald statistic (if <b>small</b> not specified)
       <b>e(sig2)</b>            Estimated variance of the e_it
       <b>e(sigma)</b>           Square root thereof
       <b>e(F)</b>               F statistic (if <b>small</b> specified)
       <b>e(F_p)</b>             p value of F statistic (if <b>small</b> specified)
       <b>e(g_min)</b>           Lowest number of observations in an included
                             individual
       <b>e(g_max)</b>           Highest number of observations in an included
                             individual
       <b>e(g_avg)</b>           Average number of observations per included
                             individual
       <b>e(h)</b>               Value of <b>h()</b> option (default is 3)
       <b>e(j)</b>               Number of instruments
       <b>e(j0)</b>              Number of instruments, including collinear ones
       <b>e(N_g)</b>             Number of included individuals
       <b>e(N_clust)</b>         Number of clusters
       <b>e(components)</b>      Number of components extracted if pca option invoked
       <b>e(kmo)</b>             Kaiser-Meyer-Olkin measure of sampling adequacy if
                             pca option invoked
       <b>e(pcaR2)</b>           Sum of eigenvalues of included components divided by
                             sum of all eigenvalues
<p>
   Macros
       <b>e(predict)</b>         "xtab2_p"
       <b>e(artype)</b>          "first differences" or "levels"
       <b>e(vcetype)</b>         "Robust" for one-step <b>robust</b>, "Corrected" for
                             <b>twostep robust</b>, empty otherwise
       <b>e(twostep)</b>         "twostep" for <b>twostep</b>
       <b>e(small)</b>           "small" for <b>small</b>
       <b>e(esttype)</b>         "system" or "difference"
       <b>e(pca)</b>             "pca" if pca option invoked
       <b>e(gmminsts</b><i>i</i><b>)</b>       Variables listed in <b>gmmstyle</b> group <i>i</i>
       <b>e(ivinsts</b><i>i</i><b>)</b>        Variables listed in <b>ivstyle</b> group <i>i</i>
       <b>e(transform)</b>       "first differences" or "orthogonal deviations" 
       <b>e(depvar)</b>          Dependent variable
       <b>e(clustvar)</b>        Clustering group identifier
       <b>e(tvar)</b>            Time variable
       <b>e(ivar)</b>            Individual (panel) variable
       <b>e(cmd)</b>             "xtabond2"
       <b>e(cmdline)</b>         Full command line
       <b>e(diffgroup</b><i>i</i><b>)</b>      Variables in <i>i</i>th group subject to
                             difference-Sargan/Hansen testing
<p>
   Matrices
       <b>e(b)</b>               Coefficient vector
       <b>e(V)</b>               Variance-covariance matrix
       <b>e(A1)</b>              First-step GMM weighting matrix
       <b>e(A2)</b>              Second-step GMM weighting matrix (if <b>twostep</b>
                             specified)
       <b>e(Ze)</b>              Z'E where E=2nd-step residuals, used in computing
                             Hansen statistic
       <b>e(eigenvalues)</b>     Eigenvalues of principal components of GMM-style
                             instruments (if <b>pca</b> specified)
       <b>e(diffsargan)</b>      Table of difference-in-Sargan/Hansen tests
       <b>e(ivequation)</b>      Value of equation() suboption for each ivstyle()
                             option, in order
                             (0=level, 1=diff, 2=both)
       <b>e(ivpassthru)</b>      Value of passthru option for each ivstyle() option
       <b>e(ivmz)</b>            Value of mz suboption for each ivstyle() option
       <b>e(gmmequation)</b>     Value of equation() suboption for each gmmstyle()
                             option
                             (0=level, 1=diff, 2=both)
       <b>e(gmmpassthru)</b>     Value of passthru option for each gmmstyle() option
       <b>e(gmmpasscollapse)</b> Value of collapse option for each gmmstyle() option
       <b>e(gmmlaglimits)</b>    Lag limits for each gmmstyle() option
       <b>e(gmmorthogonal)</b>   Value of orthogonal option for each gmmstyle() option
       <b>e(X)</b>               Matrix of right-side variables used in estimation, if
                             <b><u>sv</u></b><b>mat</b> invoked
       <b>e(Y)</b>               Column of dependent variable used in estimation, if
                             <b><u>sv</u></b><b>mat</b> invoked
       <b>e(Z)</b>               Instrument matrix used in estimation, if <b><u>sv</u></b><b>mat</b>
                             invoked
       <b>e(H)</b>               H matrix used in estimation, if <b><u>sv</u></b><b>mat</b> invoked
       <b>e(wt)</b>              Weight vector used in estimation, if <b><u>sv</u></b><b>mat</b> invoked
                             and weights used
       <b>e(eigenvectors)</b>    Principal component scores, if <b><u>sv</u></b><b>mat</b> and <b>pca</b> invoked
<p>
   Functions
       <b>e(sample)</b>          Marks estimation sample
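<p>
    A short sketch of how the stored results listed above might be inspected
    after estimation, assuming the abdata example data set.  It also assumes
    that the default number of AR tests includes AR(2), so that <b>e(ar2)</b> and
    <b>e(ar2p)</b> are set.

```stata
use http://www.stata-press.com/data/r7/abdata.dta
xtabond2 n L.n L(0/1).(w k) yr1978-yr1984, gmm(L.(w k n)) ///
    iv(yr1978-yr1984, eq(level)) h(2) robust twostep

ereturn list                        // everything xtabond2 left behind in e()

// Compare the instrument count with the number of groups
display "instruments: " e(j) "   groups: " e(N_g)

// Overidentification and serial-correlation diagnostics
display "Hansen J: " e(hansen) "   df: " e(hansen_df) "   p: " e(hansenp)
display "AR(2) z:  " e(ar2) "   p: " e(ar2p)

// Macros are read with macro expansion; matrices with matrix list
display "`e(esttype)' GMM, transform: `e(transform)'"
matrix list e(b)
```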
<p>
<b><u>Examples</u></b>
<p>
    use http://www.stata-press.com/data/r7/abdata.dta
    xtabond2 n l.n l(0/1).(w k) yr1980-yr1984, gmm(l.n w k) iv(yr1980-yr1984,
        passthru) noleveleq small
    xtabond2 n l.n l(0/1).(w k) yr1980-yr1984, gmm(l.n w k) iv(yr1980-yr1984,
        mz) robust twostep small h(2)
    xtabond2 n l(1/2).n l(0/1).w l(0/2).(k ys) yr1980-yr1984, gmm(l.n w k)
        iv(yr1980-yr1984) robust twostep small
    <b>* Next two are equivalent, assuming id is the panel identifier</b>
    ivreg2 n cap (w = k ys rec) [pw=_n], cluster(ind) orthog(rec)
    xtabond2 n w cap [pw=_n], iv(cap k ys, eq(level)) iv(rec, eq(level))
        cluster(ind) h(1)
    <b>* Same for next two</b>
    regress n w k
    xtabond2 n w k, iv(w k, eq(level)) small h(1)
    <b>* And next two, assuming xtabond updated since May 2004 with</b> update
        <b>command.</b>
    xtabond n yr*, lags(1) pre(w, lags(1,.)) pre(k, endog) robust small
        noconstant
    xtabond2 n L.n w L.w k yr*, gmm(L.(w n k)) iv(yr*) noleveleq robust small
    <b>* And next two</b>
    xtdpd n L.n L(0/1).(w k) yr1978-yr1984, dgmm(w k n) lgmm(w k n)
        liv(yr1978-yr1984) vce(robust) two hascons
    xtabond2 n L.n L(0/1).(w k) yr1978-yr1984, gmm(L.(w k n))
        iv(yr1978-yr1984, eq(level)) h(2) robust twostep
    <b>* Three ways to reduce the instrument count</b>
    xtabond2 n L.n L(0/1).(w k) yr1978-yr1984, gmm(L.(w k n))
        iv(yr1978-yr1984, eq(level)) h(2) robust twostep pca
    xtabond2 n L.n L(0/1).(w k) yr1978-yr1984, gmm(L.(w k n), collapse)
        iv(yr1978-yr1984, eq(level)) h(2) robust twostep
    xtabond2 n L.n L(0/1).(w k) yr1978-yr1984, gmm(L.(w k n), lag(1 1))
        iv(yr1978-yr1984, eq(level)) h(2) robust twostep
    <b>* Estimation a la Hayakawa 2009</b>
    xtabond2 n L.n L(0/1).(w k) yr1979-yr1984, gmm(L.(w k n), lag(1 1)
        orthog) iv(yr1979-yr1984, eq(level)) h(2) robust twostep orthog
        noleveleq
<p>
    <b>Three sample files</b> are included in the package that contains this
    command.  <b>abest.do</b> reproduces two sample files that come with DPD for Ox,
    which in turn generate most of the GMM results in Arellano and Bond
    (1991).  <b>bbest.do</b> reproduces another sample file that comes with DPD for
    Ox, based on Blundell and Bond (1998).  <b>greene.do</b> reproduces an example in
    Greene (2002).  To download them, type the following command or click on
    it:  ssc install xtabond2, all replace.  This will save the files to your
    current directory, as set by the <b>cd</b> command.
 
<b><u>References</u></b>
<p>
    Arellano, M., and S. Bond. 1991.  Some tests of specification for panel
        data: Monte Carlo evidence and an application to employment
        equations. <i>The Review of Economic Studies</i> 58: 277-97.
    Arellano, M., and S. Bond. 1998.  Dynamic panel data estimation using
        DPD98 for Gauss: A guide for users.
    Arellano, M., and O. Bover. 1995.  Another look at the instrumental
        variable estimation of error-components models. <i>Journal of</i>
        <i>Econometrics</i> 68: 29-51.
    Bai, J., and S. Ng. 2010. Instrumental Variables Estimation in a Data
        Rich Environment.  <i>Econometric Theory</i> 26(6): 1577-1606.
    Baum, C.F., M.E. Schaffer, and S. Stillman. 2003. Instrumental variables
        and GMM: Estimation and testing. <i>Stata Journal</i> 3: 1-31.
    Blundell, R., and S. Bond. 1998.  Initial conditions and moment
        restrictions in dynamic panel data models. <i>Journal of Econometrics</i>
        87: 115-43.
    Bond, S. 2002.  Dynamic panel data models: A guide to micro data methods
        and practice. Working Paper 09/02. Institute for Fiscal Studies,
        London.
    Doornik, J.A., M. Arellano, and S. Bond. 2002.  Panel data estimation
        using DPD for Ox. http://www.nuff.ox.ac.uk/Users/Doornik.
    Greene, W.H. 2002. <i>Econometric Analysis</i>, 5th ed. Prentice-Hall.
    Hayakawa, K. 2009. A simple efficient instrumental variable estimator for
        panel AR(p) models when both N and T are large.  <i>Econometric Theory</i>
        25: 873-90.
    Holtz-Eakin, D., W. Newey, and H.S. Rosen. 1988.  Estimating vector
        autoregressions with panel data.  <i>Econometrica</i> 56: 1371-95.
    Kapetanios, G., and M. Marcellino. 2010. Factor-GMM estimation with large
        sets of possibly weak instruments.  <i>Computational Statistics &amp; Data</i>
        <i>Analysis</i> 54(11): 2655-75.
    Mehrhoff, J. 2009. A solution to the problem of too many instruments in
        dynamic panel data GMM.  Discussion Paper Series 1. No 31/2009.
    Roodman, D. 2009. How to Do xtabond2: An Introduction to "Difference" and
        "System" GMM in Stata. <i>Stata Journal</i> 9(1): 86-136.
    Windmeijer, F. 2005.  A finite sample correction for the variance of
        linear efficient two-step GMM estimators.  <i>Journal of Econometrics</i>
        126: 25-51.
<p>
<b><u>Author</u></b>
<p>
    David Roodman
    Senior Fellow
    Center for Global Development
    Washington, DC
    droodman@cgdev.org
<p>
<b><u>Also see</u></b>
<p>
    Manual: <b>[U] 23 Estimation and post-estimation commands</b>,
            <b>[U] 29 Overview of Stata estimation commands</b>,
            <b>[XT] xtabond</b>
<p>
    Online: help for xtabond, ivreg, ivreg2, estcom, postest; xtgee, 
</pre>