Today I want to talk about effect sizes such as Cohen’s d, Hedges’s g, Glass’s Δ, η², and ω². Effect sizes concern rescaling parameter estimates to make them easier to interpret, especially in terms of practical significance. Many researchers in psychology and education advocate reporting of effect sizes; professional organizations such as the American Psychological […]
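To make the rescaling idea concrete, here is a minimal sketch in Python (not Stata, and not the code from the post) of three of the standardized mean differences named above. The function names and the toy data are hypothetical, and the Hedges correction shown is the common approximation 1 − 3/(4·df − 1), which is an assumption rather than the formula any particular package uses.

```python
import math
import statistics


def cohens_d(x, y):
    """Mean difference divided by the pooled standard deviation."""
    n1, n2 = len(x), len(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)  # n - 1 denominators
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (statistics.mean(x) - statistics.mean(y)) / pooled_sd


def hedges_g(x, y):
    """Cohen's d shrunk by an approximate small-sample correction (assumed form)."""
    df = len(x) + len(y) - 2
    return cohens_d(x, y) * (1 - 3 / (4 * df - 1))


def glass_delta(x, control):
    """Mean difference divided by the control group's standard deviation only."""
    return (statistics.mean(x) - statistics.mean(control)) / statistics.stdev(control)


# Hypothetical toy data, chosen so the arithmetic is easy to check by hand.
treated = [2.0, 4.0, 6.0]
control = [1.0, 3.0, 5.0]

d = cohens_d(treated, control)
g = hedges_g(treated, control)
delta = glass_delta(treated, control)
print(d, g, delta)
```

With these toy values both groups have standard deviation 2 and the means differ by 1, so d and Δ are both 0.5 and the corrected g is smaller.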
↧
Measures of effect size in Stata 13
↧
Fitting ordered probit models with endogenous covariates with Stata’s gsem command
The new command gsem allows us to fit a wide variety of models; among the many possibilities, we can account for endogeneity in different models. As an example, I will fit an ordinal model with endogenous covariates. Parameterizations for an ordinal probit model The ordinal probit model is used to model ordinal dependent […]
↧
Using resampling methods to detect influential points
As stated in the documentation for jackknife, an often forgotten utility for this command is the detection of overly influential observations. Some commands, like logit or stcox, come with their own set of prediction tools to detect influential points. However, these kinds of predictions can be computed for virtually any regression command. In particular, we […]
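The leave-one-out idea behind this use of the jackknife can be sketched in a few lines. The Python below is an illustration under stated assumptions, not Stata's jackknife command and not the post's code: it refits a simple least-squares slope with each observation dropped in turn and records how far the estimate moves. The helper names and the data are hypothetical.

```python
def ols_slope(xs, ys):
    """Slope of a simple least-squares regression of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def leave_one_out_shifts(xs, ys):
    """Change in the slope when each observation is dropped in turn."""
    full = ols_slope(xs, ys)
    return [
        ols_slope(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:]) - full
        for i in range(len(xs))
    ]


# Hypothetical data: five points on the line y = x plus one outlying point.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.0, 2.0, 3.0, 4.0, 5.0, 12.0]

shifts = leave_one_out_shifts(xs, ys)
most_influential = max(range(len(xs)), key=lambda i: abs(shifts[i]))
print(most_influential, shifts[most_influential])
```

Because the same loop works for any estimator that can be refit on a subsample, the approach is not tied to commands that ship their own influence statistics.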
↧
How to simulate multilevel/longitudinal data
I was recently talking with my friend Rebecca about simulating multilevel data, and she asked me if I would show her some examples. It occurred to me that many of you might also like to see some examples, so I decided to post them to the Stata Blog. Introduction We simulate data all […]
↧
Using gsem to combine estimation results
gsem is a very flexible command that allows us to fit very sophisticated models. However, it is also useful in situations that involve simple models. For example, when we want to compare parameters among two or more models, we usually use suest, which combines the estimation results under one parameter vector and creates a simultaneous […]
↧
Using gmm to solve two-step estimation problems
Two-step estimation problems can be solved using the gmm command. When a two-step estimator produces consistent point estimates but inconsistent standard errors, it is known as the two-step-estimation problem. For instance, inverse-probability weighted (IPW) estimators are a weighted average in which the weights are estimated in the first step. Two-step estimators use first-step estimates to […]
↧
Stata 14 announced, ships
We’ve just announced the release of Stata 14. Stata 14 ships and downloads starting now. I just posted on Statalist about it. Here’s a copy of what I wrote. Stata 14 is now available. You heard it here first. There’s a long tradition that Statalisters hear about Stata’s new releases first. The new forum is […]
↧
Bayesian modeling: Beyond Stata’s built-in models
This post was written jointly with Nikolay Balov, Senior Statistician and Software Developer, StataCorp. A question on Statalist motivated us to write this blog entry. A user asked if the churdle command (http://www.stata.com/stata14/hurdle-models/) for fitting hurdle models, new in Stata 14, can be combined with the bayesmh command (http://www.stata.com/stata14/bayesian-analysis/) for fitting Bayesian models, also new […]
↧
Introduction to treatment effects in Stata: Part 1
This post was written jointly with David Drukker, Director of Econometrics, StataCorp. The topic for today is the treatment-effects features in Stata. Treatment-effects estimators estimate the causal effect of a treatment on an outcome based on observational data. In today’s posting, we will discuss four treatment-effects estimators: RA: Regression adjustment; IPW: Inverse probability weighting; IPWRA: […]
↧
Spotlight on irt
New to Stata 14 is a suite of commands to fit item response theory (IRT) models. IRT models are used to analyze the relationship between the latent trait of interest and the items intended to measure the trait. Stata’s irt commands provide easy access to some of the commonly used IRT models for binary and […]
↧
probit or logit: ladies and gentlemen, pick your weapon
We often use probit and logit models to analyze binary outcomes. A case can be made that the logit model is easier to interpret than the probit model, but Stata’s margins command makes any estimator easy to interpret. Ultimately, estimates from both models produce similar results, and using one or the other is a matter […]
↧
regress, probit, or logit?
In a previous post I illustrated that the probit model and the logit model produce statistically equivalent estimates of marginal effects. In this post, I compare the marginal effect estimates from a linear probability model (linear regression) with marginal effect estimates from probit and logit models. My simulations show that when the true model is […]
↧
Bayesian binary item response theory models using bayesmh
This post was written jointly with Yulia Marchenko, Executive Director of Statistics, StataCorp. Table of Contents: Overview; 1PL model; 2PL model; 3PL model; 4PL model; 5PL model; Conclusion. Overview: Item response theory (IRT) is used for modeling the relationship between the latent abilities of a group of subjects and the examination items used for measuring […]
↧
Testing model specification and using the program version of gmm
This post was written jointly with Joerg Luedicke, Senior Social Scientist and Statistician, StataCorp. The command gmm is used to estimate the parameters of a model using the generalized method of moments (GMM). GMM can be used to estimate the parameters of models that have more identification conditions than parameters, that is, overidentified models. The specification of […]
↧
Vector autoregression—simulation, estimation, and inference in Stata
Vector autoregression (VAR) is a useful tool for analyzing the dynamics of multiple time series. VAR expresses a vector of observed variables as a function of its own lags. Simulation Let’s begin by simulating a bivariate VAR(2) process using the following specification, \[ \begin{bmatrix} y_{1,t}\\ y_{2,t} \end{bmatrix} […]
↧
How to generate random numbers in Stata
Overview I describe how to generate random numbers and discuss some features added in Stata 14. In particular, Stata 14 includes a new default random-number generator (RNG) called the Mersenne Twister (Matsumoto and Nishimura 1998), a new function that generates random integers, the ability to generate random numbers from an interval, and several new functions […]
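As a language-neutral illustration of the features listed here (seeded reproducibility, random integers, and draws from an interval), the sketch below uses Python's standard random module, whose core generator is also a Mersenne Twister. It is not Stata code, the seed and ranges are arbitrary choices, and the streams it produces should not be expected to match Stata's.

```python
import random

# Two generators created with the same seed produce identical streams.
rng_a = random.Random(12345)
rng_b = random.Random(12345)
draws_a = [rng_a.random() for _ in range(5)]  # uniform on [0, 1)
draws_b = [rng_b.random() for _ in range(5)]
print(draws_a == draws_b)

# Random integers on a closed range, here 1 through 6 inclusive.
integers = [rng_a.randint(1, 6) for _ in range(1000)]

# Uniform draws from an arbitrary interval, here between -2 and 3.
interval = [rng_a.uniform(-2.0, 3.0) for _ in range(1000)]

print(min(integers), max(integers))
print(min(interval), max(interval))
```

Setting the seed explicitly is what makes a simulation reproducible; without it, each run starts from a different state.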
↧
Fitting distributions using bayesmh
This post was written jointly with Yulia Marchenko, Executive Director of Statistics, StataCorp. As of update 03 Mar 2016, bayesmh provides a more convenient way of fitting distributions to the outcome variable. By design, bayesmh is a regression command, which models the mean of the outcome distribution as a function of predictors. There are cases […]
↧
A simulation-based explanation of consistency and asymptotic normality
Overview In the frequentist approach to statistics, estimators are random variables because they are functions of random data. The finite-sample distributions of most of the estimators used in applied work are not known, because the estimators are complicated nonlinear functions of random data. These estimators have large-sample convergence properties that we use to approximate their […]
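A small Monte Carlo experiment shows what these large-sample properties look like in practice. The Python sketch below is an illustration, not the post's Stata code. It assumes, purely for the example, that the estimator is the sample mean of squared standard-normal draws (so the true mean is 1), and it records how the simulated estimates concentrate as the sample size grows. The function name, seed, sample sizes, and replication count are hypothetical choices.

```python
import random
import statistics


def simulate_sample_means(n, reps, rng):
    """Return `reps` sample means, each from n squared standard-normal draws."""
    means = []
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) ** 2 for _ in range(n)]
        means.append(sum(sample) / n)
    return means


rng = random.Random(2024)  # fixed seed so the experiment is reproducible
results = {}
for n in (10, 100, 1000):
    means = simulate_sample_means(n, 500, rng)
    # Store the average estimate and its spread across replications.
    results[n] = (sum(means) / len(means), statistics.stdev(means))

for n, (center, spread) in results.items():
    print(n, round(center, 3), round(spread, 3))
```

The average estimate stays near the true value at every sample size, while the spread across replications shrinks as n grows, which is the pattern consistency predicts.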
↧
ARMA processes with nonnormal disturbances
Autoregressive (AR) and moving-average (MA) models are combined to obtain ARMA models. The parameters of an ARMA model are typically estimated by maximizing a likelihood function assuming independently and identically distributed Gaussian errors. This is a rather strict assumption. If the underlying distribution of the error is nonnormal, does maximum likelihood estimation still work? The […]
↧
Understanding omitted confounders, endogeneity, omitted variable bias, and related concepts
Initial thoughts Estimating causal relationships from data is one of the fundamental endeavors of researchers. Ideally, we could conduct a controlled experiment to estimate causal relations. However, conducting a controlled experiment may be infeasible. For example, education researchers cannot randomize educational attainment, so they must learn from observational data. In the absence of experimental data, […]
↧