-------------------------------------------------------------------------------
help for dirifit
-------------------------------------------------------------------------------

Fitting a Dirichlet distribution by maximum likelihood

dirifit depvarlist [weight] [if exp] [in range] [, {alphavar(varlist_a) alpha1|2|3|...|k(varlist_a_j)} | {muvar(varlist_m) phivar(varlist_p) mu1|2|3|...|k(varlist_m_j) baseoutcome(var) alternative} robust cluster(clustervar) level(#) maximize_options]

by ...: may be used with dirifit; see help by.

fweights and aweights are allowed; see help weights.

Description

dirifit fits by maximum likelihood a Dirichlet distribution to a set of variables depvarlist. Each variable in depvarlist ranges between 0 and 1, and all variables in depvarlist must, for each observation, add up to 1: for example, they may be proportions. Note that observations are ignored if one or more of the dependent variables has a value less than or equal to zero or greater than or equal to one, or if the dependent variables do not add up to one.
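Because observations that violate these conditions are ignored, it can help to count them before fitting. The following is a minimal sketch using the dataset and variable names from the Examples section; the variable name rowsum and the tolerance 1e-6 are assumptions made for illustration, and the check that dirifit applies internally may differ.

```stata
use http://fmwww.bc.edu/repec/bocode/c/citybudget.dta, clear

* Row total of the dependent variables (rowsum is a hypothetical name)
egen double rowsum = rowtotal(governing safety education recreation social urbanplanning)

* Observations whose proportions do not add up to one (assumed tolerance)
count if abs(rowsum - 1) > 1e-6

* Observations with a value outside the open interval (0, 1);
* missing values also satisfy the >= 1 condition in Stata
foreach v of varlist governing safety education recreation social urbanplanning {
    count if `v' <= 0 | `v' >= 1
}
```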

dirifit uses one of two parameterizations:

A conventional parameterization with shape parameters alpha_j > 0 (one for each variable in depvarlist) (e.g. Evans et al. 2000 or Kotz et al. 2000) will be used if only depvarlist is specified or if one or more of alphavar() and alpha1|2|3|...|k() is specified. alpha_j is reported on the logarithmic scale to ensure that it remains positive. The conventional parameterization is especially useful when no covariates are present.

An alternative parameterization with location parameters mu_j (one for each variable in depvarlist except the baseoutcome) and scale parameter phi will be used if one or more of muvar(), mu1|2|3|...|k(), baseoutcome(), and phivar() is specified or if the alternative option is specified. The alternative parameterization is especially useful when covariates are present. The mu_j are reported on the multinomial logit scale so that they stay between 0 and 1 and add up to one. To help interpretation, various types of marginal effects can be calculated with ddirifit. phi is reported on the logarithmic scale to ensure that it remains positive. This parameterization is analogous to the parameterization proposed by Paolino (2001), Ferrari and Cribari-Neto (2004), and Smithson and Verkuilen (2006) for the beta distribution.
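For reference, the two parameterizations can be written out as follows. The density is the standard Dirichlet density. The mapping between the parameterizations is assumed here to be the usual mean/precision mapping, by analogy with the beta regression literature cited above. The covariate vectors x and z and the coefficient vectors beta_j and gamma are notation introduced only for this sketch.

```latex
% Dirichlet density in the conventional parameterization
\[
f(y_1,\dots,y_k \mid \alpha_1,\dots,\alpha_k)
  = \frac{\Gamma\!\left(\sum_{j=1}^{k}\alpha_j\right)}
         {\prod_{j=1}^{k}\Gamma(\alpha_j)}
    \prod_{j=1}^{k} y_j^{\alpha_j-1},
\qquad \alpha_j > 0,\quad 0 < y_j < 1,\quad \sum_{j=1}^{k} y_j = 1 .
\]

% Assumed mapping to the alternative parameterization
\[
\phi = \sum_{j=1}^{k}\alpha_j, \qquad
\mu_j = \frac{\alpha_j}{\phi}, \qquad
\alpha_j = \mu_j\,\phi .
\]

% Scales on which the parameters are reported
\[
\ln \alpha_j = x'\beta_j
\quad\text{(conventional)};
\qquad
\mu_j = \frac{\exp(x'\beta_j)}{\sum_{l=1}^{k}\exp(x'\beta_l)},\;
\beta_{\text{base}} = 0,
\qquad
\ln \phi = z'\gamma
\quad\text{(alternative)}.
\]
```

Under this mapping the mu_j are the expected proportions and phi governs how tightly the observed proportions concentrate around them.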

Options

alphavar() and alpha1|2|3|...|k() allow the user to specify each parameter in the conventional parameterization as a function of the covariates specified in the variable list. The covariates in alphavar() are common to all parameters, while alpha1|2|3|...|k() allow the user to specify (additional) covariates for the first, second, third, ..., kth parameter. The order of the parameters is determined by the order of depvarlist. A constant term is always included in each equation.

muvar(), mu1|2|3|...|k(), and phivar() allow the user to specify each parameter in the alternative parameterization as a function of the covariates specified in the respective variable list. The covariates in muvar() are common to all mu parameters, while mu1|2|3|...|k() allow the user to specify (additional) covariates for the first, second, third, ..., kth mu parameter. The order of the parameters is determined by the order of depvarlist. A constant term is always included in each equation.

As implied above, just one parameterization should be chosen.

alternative ensures that the alternative parameterization is used instead of the conventional parameterization if only depvarlist is specified. This option cannot be used with alphavar() or alpha1|2|3|...|k().

baseoutcome(var) specifies the variable in depvarlist that will be the base outcome. The default is the first variable of depvarlist. This option cannot be used with alphavar() or alpha1|2|3|...|k().
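The following sketch shows how the two groups of options might be used, one parameterization per call. It reuses the dataset and variable names from the Examples section. The particular choice of covariates for alphavar() and phivar() is an assumption made for illustration, and convergence of these specific models has not been verified. The /// line continuations require running the commands from a do-file.

```stata
use http://fmwww.bc.edu/repec/bocode/c/citybudget.dta, clear

* Conventional parameterization: covariates common to all alpha equations
dirifit governing safety education recreation social urbanplanning, ///
    alphavar(houseval popdens)

* Alternative parameterization without covariates
dirifit governing safety education recreation social urbanplanning, alternative

* Alternative parameterization with covariates for the mu equations and
* for phi, and an explicitly chosen base outcome
dirifit governing safety education recreation social urbanplanning, ///
    muvar(minorityleft noleft houseval popdens) phivar(popdens)     ///
    baseoutcome(governing)
```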

robust specifies that the Huber/White/sandwich estimator of variance is to be used in place of the traditional calculation; see [U] 20.14 Obtaining robust variance estimates ([U] 23.14 in version 8). robust combined with cluster() allows observations which are not independent within cluster (although they must be independent between clusters).

cluster(clustervar) specifies that the observations are independent across groups (clusters) but not necessarily within groups. clustervar specifies to which group each observation belongs; e.g., cluster(personid) in data with repeated observations on individuals. See [U] 20.14 Obtaining robust variance estimates ([U] 23.14 in version 8). Specifying cluster() implies robust.

level(#) specifies the confidence level, in percent, for the confidence intervals of the coefficients; see help level.

nolog suppresses the iteration log.

maximize_options control the maximization process; see help maximize. If you are seeing many "(not concave)" messages in the log, using the difficult option may help convergence.

Saved results

In addition to the usual results saved after ml, dirifit also saves the following, as appropriate:

e(b_alpha1) to e(b_alphak) (where k is the number of variables in depvarlist) are row vectors containing the parameter estimates from each equation in the conventional parameterization.

e(b_phi) and e(b_mu1) to e(b_muk) (where k is the number of variables in depvarlist), except for the baseoutcome, are row vectors containing the parameter estimates from each equation in the alternative parameterization.

e(length_b_alpha1) to e(length_b_alphak) or e(length_b_mu1) to e(length_b_muk) and e(length_b_phi) contain the lengths of these vectors. If no covariates are specified in an equation, the corresponding vector has length equal to 1 (the constant term); otherwise, the length is one plus the number of covariates.
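As a sketch of how these results might be inspected after estimation, the commands below fit the model from the Examples section and then look at the saved vectors. The matrix name b_mu2 is a hypothetical name chosen here. Because the first variable is the default base outcome, the second equation is used, on the assumption that no e(b_mu1) is saved in that case.

```stata
use http://fmwww.bc.edu/repec/bocode/c/citybudget.dta, clear

dirifit governing safety education recreation social urbanplanning, ///
    muvar(minorityleft noleft houseval popdens)

* Coefficients of the phi equation and of the second mu equation
matrix list e(b_phi)
matrix list e(b_mu2)

* Lengths of those vectors (constant plus number of covariates)
display e(length_b_phi)
display e(length_b_mu2)

* Copy a saved vector into a matrix for later use
matrix b_mu2 = e(b_mu2)
```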

Examples

use http://fmwww.bc.edu/repec/bocode/c/citybudget.dta, clear

dirifit governing safety education recreation social urbanplanning, ///
    mu(minorityleft noleft houseval popdens)

ddirifit, at(minorityleft 0 noleft 0)

Authors

Maarten L. Buis, Universitaet Tuebingen
maarten.buis@uni-tuebingen.de

Nicholas J. Cox, Durham University
n.j.cox@durham.ac.uk

Stephen P. Jenkins, University of Essex
stephenj@essex.ac.uk

Acknowledgement

Philipp Rehm provided a bug report.

References

Evans, M., Hastings, N. and Peacock, B. 2000. Statistical distributions. New York: John Wiley.

Ferrari, S.L.P. and Cribari-Neto, F. 2004. Beta regression for modelling rates and proportions. Journal of Applied Statistics 31(7): 799-815.

Kotz, S., Balakrishnan, N. and Johnson, N.L. 2000. Continuous multivariate distributions: Volume 1. New York: John Wiley.

MacKay, D.J.C. 2003. Information theory, inference, and learning algorithms. Cambridge: Cambridge University Press (see pp. 316-318). http://www.inference.phy.cam.ac.uk/itprnn/book.pdf

Paolino, P. 2001. Maximum likelihood estimation of models with beta-distributed dependent variables. Political Analysis 9(4): 325-346. http://polmeth.wustl.edu/polanalysis/vol/9/WV008-Paolino.pdf

Smithson, M. and Verkuilen, J. 2006. A better lemon squeezer? Maximum likelihood regression with beta-distributed dependent variables. Psychological Methods 11(1): 54-71.

Also see

Online: help for dirifit_postestimation, betafit, fmlogit (if installed)