Channel: Statistics – The Stata Blog
Browsing latest articles
Browse All 90 View Live

Introduction to treatment effects in Stata: Part 2

This post was written jointly with David Drukker, Director of Econometrics, StataCorp. In our last post, we introduced the concept of treatment effects and demonstrated four of the treatment-effects...



Maximum likelihood estimation by mlexp: A chi-squared example

Overview: In this post, I show how to use mlexp to estimate the degrees-of-freedom parameter of a chi-squared distribution by maximum likelihood (ML). One example is unconditional, and another example...




Efficiency comparisons by Monte Carlo simulation

Overview: In this post, I show how to use Monte Carlo simulations to compare the efficiency of different estimators. I also illustrate what we mean by efficiency when discussing statistical estimators....


Estimating parameters by maximum likelihood and method of moments using mlexp...



Probit model with sample selection by mlexp

Overview: In a previous post, David Drukker demonstrated how to use mlexp to estimate the degrees-of-freedom parameter in a chi-squared distribution by maximum likelihood (ML). In this post, I am going...



Fixed Effects or Random Effects: The Mundlak Approach

Today I will discuss Mundlak’s (1978) alternative to the Hausman test. Unlike the latter, the Mundlak approach may be used when the errors are heteroskedastic or have intragroup correlation. What is...


Using mlexp to estimate endogenous treatment effects in a probit model

I use features new to Stata 14.1 to estimate an average treatment effect (ATE) for a probit model with an endogenous treatment. In 14.1, we added new prediction statistics after mlexp that margins can...


xtabond Cheat Sheet

Random-effects and fixed-effects panel-data models do not allow me to use observable information of previous periods in my model. They are static. Dynamic panel-data models use current and past...




Understanding the generalized method of moments (GMM): A simple example

This post was written jointly with Enrique Pinzon, Senior Econometrician, StataCorp. The generalized method of moments (GMM) is a method for constructing estimators,...



Using mlexp to estimate endogenous treatment effects in a heteroskedastic...

I use features new to Stata 14.1 to estimate an average treatment effect (ATE) for a heteroskedastic probit model with an endogenous treatment. In 14.1, we added new prediction statistics after mlexp...


probit or logit: ladies and gentlemen, pick your weapon

We often use probit and logit models to analyze binary outcomes. A case can be made that the logit model is easier to interpret than the probit model, but Stata’s margins command makes any estimator...


regress, probit, or logit?

In a previous post I illustrated that the probit model and the logit model produce statistically equivalent estimates of marginal effects. In this post, I compare the marginal effect estimates from a...


Bayesian binary item response theory models using bayesmh

This post was written jointly with Yulia Marchenko, Executive Director of Statistics, StataCorp. Table of contents: Overview, 1PL model, 2PL model, 3PL model, 4PL model, 5PL model, Conclusion. Overview: Item...



Testing model specification and using the program version of gmm

This post was written jointly with Joerg Luedicke, Senior Social Scientist and Statistician, StataCorp. The command gmm is used to estimate the parameters of a model using the generalized method of...


Vector autoregression—simulation, estimation, and inference in Stata




How to generate random numbers in Stata

Overview: I describe how to generate random numbers and discuss some features added in Stata 14. In particular, Stata 14 includes a new default random-number generator (RNG) called the Mersenne Twister...


Fitting distributions using bayesmh

This post was written jointly with Yulia Marchenko, Executive Director of Statistics, StataCorp. As of update 03 Mar 2016, bayesmh provides a more convenient way of fitting distributions to the outcome...



A simulation-based explanation of consistency and asymptotic normality

Overview: In the frequentist approach to statistics, estimators are random variables because they are functions of random data. The finite-sample distributions of most of the estimators used in applied...


ARMA processes with nonnormal disturbances

Autoregressive (AR) and moving-average (MA) models are combined to obtain ARMA models. The parameters of an ARMA model are typically estimated by maximizing a likelihood function assuming independently...


Understanding omitted confounders, endogeneity, omitted variable bias, and...

Initial thoughts: Estimating causal relationships from data is one of the fundamental endeavors of researchers. Ideally, we could conduct a controlled experiment to estimate causal relations. However,...


Gelman–Rubin convergence diagnostic using multiple chains

Overview: MCMC algorithms used for simulating posterior distributions are indispensable tools in Bayesian analysis. A major consideration in MCMC simulations is that of convergence. Has the simulated...



Tests of forecast accuracy and forecast encompassing

Applied time-series researchers often want to compare the accuracy of a pair of...



Multiple equation models: Estimation and marginal effects using gsem

Starting point: A hurdle model with multiple hurdles. In a sequence of posts, we are going to illustrate how to obtain correct standard errors and marginal effects for models with multiple steps. Our...


Multiple equation models: Estimation and marginal effects using mlexp

We continue with the series of posts where we illustrate how to obtain correct standard errors and marginal effects for models with multiple steps. In this post, we estimate the marginal effects and...


Unit-root tests in Stata

Determining the stationarity of a time series is a key step before embarking on any...



Flexible discrete choice modeling using a multinomial probit model, part 1

We have no choice but to choose. We make choices every day, and...


Flexible discrete choice modeling using a multinomial probit model, part 2

Overview: In the first part of this post, I discussed the multinomial probit model from a random utility model perspective. In this part, we will have a closer look at how to interpret our estimation...


Effects of nonlinear models with interactions of discrete and continuous...

I want to estimate, graph, and interpret the effects of nonlinear models with interactions of continuous and discrete variables. The results I am after are not trivial, but obtaining what I want using...


Doctors versus policy analysts: Estimating the effect of interest

The change in a regression function that results from an everything-else-held-equal change in a covariate defines an effect of a covariate. I am interested in estimating...




Probability differences and odds ratios measure conditional-on-covariate...

Differences in conditional probabilities and ratios of odds are two common measures of the effect of a...


Multiple-equation models: Estimation and marginal effects using gmm

We estimate the average treatment effect (ATE) for an exponential mean model with an endogenous treatment. We have a two-step estimation problem where the first step corresponds to the treatment model...


Vector autoregressions in Stata

Introduction: In a univariate autoregression, a stationary time-series variable \(y_t\) can often be modeled as depending on its own lagged values: \begin{align} y_t = \alpha_0 + \alpha_1 y_{t-1} +...


Exact matching on discrete covariates is the same as regression adjustment

I illustrate that exact matching on discrete covariates and regression adjustment (RA) with fully interacted discrete covariates perform the same nonparametric estimation. Comparing exact matching with...



Group comparisons in structural equation models: Testing measurement invariance

When fitting almost any model, we may be interested in investigating whether parameters differ across groups such as time periods, age groups, gender, or school attended. In other words, we may wish to...


Two faces of misspecification in maximum likelihood: Heteroskedasticity and...

For a nonlinear model with heteroskedasticity, a maximum likelihood estimator gives misleading inference and inconsistent marginal effect estimates unless I model the variance. Using a robust estimate...


Cointegration or spurious regression?

Time-series data often appear nonstationary and also tend to comove. A set of nonstationary series that are cointegrated implies the existence of a long-run...



An ordered-probit inverse probability weighted (IPW) estimator

teffects ipw uses multinomial logit to estimate the weights needed to estimate the potential-outcome means (POMs) from a multivalued treatment. I show how to estimate the POMs when the weights come...



Structural vector autoregression models

In my last post, I discussed...


Quantile regression allows covariate effects to differ by quantile

Quantile regression models a quantile of the outcome as a function of covariates. Applied researchers use quantile regressions because they allow the effect of a covariate to differ across conditional...


Estimating covariate effects after gmm

In Stata 14.2, we added the ability to use margins to estimate covariate effects after gmm. In this post, I illustrate how to use margins and marginsplot after gmm to estimate covariate effects for a...


Solving missing data problems using inverse-probability-weighted estimators

We discuss estimating population-averaged parameters when some of the data are missing. In particular, we show how to use gmm to estimate population-averaged parameters for a probit model when the...



Long-run restrictions in a structural vector autoregression

Introduction: In this blog post, I describe Stata’s capabilities for estimating and analyzing vector autoregression (VAR) models with long-run...


Introduction to Bayesian statistics, part 1: The basic concepts

In this blog post, I’d like to give you a relatively nontechnical introduction to Bayesian statistics. The Bayesian approach to statistics has become increasingly popular, and you can fit Bayesian...



Introduction to Bayesian statistics, part 2: MCMC and the Metropolis–Hastings...

In this blog post, I’d like to give you a relatively nontechnical introduction to Markov chain Monte Carlo, often shortened to “MCMC”. MCMC is frequently used for fitting Bayesian statistical models....


Understanding truncation and censoring

Truncation and censoring are two distinct phenomena that cause our samples to be incomplete. These phenomena arise in medical sciences, engineering, social sciences, and other research fields. If we...



Estimation under omitted confounders, endogeneity, omitted variable bias, and...

Initial thoughts: Estimating causal relationships from data is one of the fundamental endeavors of researchers, but causality is elusive. In the presence of omitted confounders, endogeneity, omitted...


Stata 15 announced, available now

We announced Stata 15 today. It’s a big deal because this is Stata’s biggest release ever. I posted to Statalist this morning and listed sixteen of the most important new features. Here on the blog I...


Nonparametric regression: Like parametric regression, but not

Initial thoughts: Nonparametric regression is similar to linear regression, Poisson regression, and logit or probit regression; it predicts a mean of an outcome for a set of covariates. If you work with...


Estimating the parameters of DSGE models

Introduction: Dynamic stochastic general equilibrium (DSGE) models are used in macroeconomics to model the joint behavior of aggregate time series like inflation, interest rates, and unemployment. They...



Bayesian logistic regression with Cauchy priors using the bayes prefix

Introduction: Stata 15 provides a convenient and elegant way of fitting Bayesian regression models by simply prefixing the estimation command with bayes. You can choose from 45 supported estimation...

