Endogeneity in Poisson regression
Wooldridge and Terza provide a methodology to both deal with and test for endogeneity within the exponential regression framework, which the following discussion follows closely.[8] While the example focuses on a Poisson regression model, it is possible to generalize to other exponential regression models, although this may come at the cost of additional assumptions (e.g. for binary response or censored data models).
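Assume the following exponential regression model, where $c_i$ is an unobserved term in the latent variable. We allow for correlation between $c_i$ and $x_i$ (implying $x_i$ is possibly endogenous), but allow for no such correlation between $c_i$ and $z_i$.
$$\operatorname{E}[y_i \mid x_i, z_i, c_i] = \exp(x_i b_0 + z_i c_0 + c_i)$$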
The variables $z_i$ serve as instrumental variables for the potentially endogenous $x_i$. One can assume a linear relationship between these two variables or alternatively project the endogenous variable $x_i$ onto the instruments to get the following reduced form equation, with reduced form error $v_i$:
$$x_i = z_i \Pi + v_i \tag{1}$$
The usual rank condition is needed to ensure identification. The endogeneity is then modeled in the following way, where $\rho$ determines the severity of endogeneity and $e_i$ is assumed to be independent of the reduced form error $v_i$.
$$c_i = v_i \rho + e_i$$
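Imposing these assumptions, assuming the models are correctly specified, and normalizing $\operatorname{E}[\exp(e_i)] = 1$, we can rewrite the conditional mean as follows:
$$\operatorname{E}[y_i \mid x_i, z_i, v_i] = \exp(x_i b_0 + z_i c_0 + v_i \rho) \tag{2}$$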
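The step to equation (2) can be spelled out as a short derivation. This is a sketch rather than the authors' own presentation. It writes the structural mean as $\exp(x_i b_0 + z_i c_0 + c_i)$ with $c_i = v_i \rho + e_i$, and it assumes that $e_i$ is independent of $(x_i, z_i, v_i)$ jointly and that $v_i$ carries no information about $y_i$ beyond $c_i$; both conditions are stated here as assumptions.

```latex
\begin{aligned}
\operatorname{E}[y_i \mid x_i, z_i, v_i, e_i]
  &= \exp(x_i b_0 + z_i c_0 + v_i \rho + e_i)
  && \text{substitute } c_i = v_i \rho + e_i \\
\operatorname{E}[y_i \mid x_i, z_i, v_i]
  &= \exp(x_i b_0 + z_i c_0 + v_i \rho)\,
     \operatorname{E}[\exp(e_i) \mid x_i, z_i, v_i]
  && \text{iterated expectations over } e_i \\
  &= \exp(x_i b_0 + z_i c_0 + v_i \rho)\,
     \operatorname{E}[\exp(e_i)]
  && \text{independence of } e_i \\
  &= \exp(x_i b_0 + z_i c_0 + v_i \rho)
  && \text{normalization } \operatorname{E}[\exp(e_i)] = 1
\end{aligned}
```

Without the normalization, the factor $\operatorname{E}[\exp(e_i)]$ would simply be absorbed into the intercept of the exponential mean.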
If $v_i$ were known at this point, it would be possible to estimate the relevant parameters by quasi-maximum likelihood estimation (QMLE). Following the usual two-step estimation strategy, Wooldridge and Terza propose estimating equation (1) by ordinary least squares. The fitted residuals $\hat{v}_i$ from this regression can then be plugged into the estimating equation (2) in place of $v_i$, and QMLE methods will lead to consistent estimators of the parameters of interest. Significance tests on $\hat{\rho}$, the estimated coefficient on $\hat{v}_i$, can then be used to test for endogeneity within the model.
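The two-step procedure can be illustrated on simulated data. The following is a minimal sketch using only NumPy, not a reference implementation. The data-generating process, the parameter values, the helper names `ols_residuals` and `poisson_qmle`, and the choice of one exogenous regressor `z1` that enters the mean plus one excluded instrument `z2` are all assumptions made for illustration. The Poisson QMLE is fitted with a plain Newton-Raphson loop, and the reported standard errors are the usual robust sandwich ones, which do not correct for the first-stage estimation of the residuals.

```python
import numpy as np


def ols_residuals(Z, x):
    """Step 1: OLS of x on the instruments Z; returns the fitted residuals."""
    pi_hat, *_ = np.linalg.lstsq(Z, x, rcond=None)
    return x - Z @ pi_hat


def poisson_qmle(X, y, max_iter=100, tol=1e-10):
    """Step 2: Poisson quasi-MLE with exponential mean via Newton-Raphson.

    Returns the coefficients and robust (sandwich) standard errors.
    Assumes the first column of X is a constant.
    """
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())               # intercept-only starting value
    for _ in range(max_iter):
        mu = np.exp(X @ beta)
        score = X.T @ (y - mu)                # gradient of the quasi-log-likelihood
        hessian = X.T @ (X * mu[:, None])     # negative Hessian
        step = np.linalg.solve(hessian, score)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    mu = np.exp(X @ beta)
    A_inv = np.linalg.inv(X.T @ (X * mu[:, None]))
    B = X.T @ (X * ((y - mu) ** 2)[:, None])
    se = np.sqrt(np.diag(A_inv @ B @ A_inv))
    return beta, se


# --- Simulated data (all values below are illustrative assumptions) ---
rng = np.random.default_rng(0)
n = 20_000
z1 = rng.normal(size=n)                       # exogenous, enters the mean
z2 = rng.normal(size=n)                       # excluded instrument
v = rng.normal(scale=0.5, size=n)             # reduced form error
x = 0.3 * z1 + 0.7 * z2 + v                   # equation (1)

rho_true, b_x_true, sigma_e = 0.8, 0.5, 0.2
# Normal e with mean -sigma^2/2 satisfies the normalization E[exp(e)] = 1.
e = rng.normal(loc=-sigma_e**2 / 2, scale=sigma_e, size=n)
c = rho_true * v + e                          # endogeneity: c depends on v
y = rng.poisson(np.exp(0.2 + b_x_true * x - 0.3 * z1 + c))

const = np.ones(n)
Z = np.column_stack([const, z1, z2])

# Step 1: first-stage OLS residuals.
v_hat = ols_residuals(Z, x)

# Step 2: Poisson QMLE with v_hat as an additional regressor (equation (2)).
X_cf = np.column_stack([const, x, z1, v_hat])
beta_cf, se_cf = poisson_qmle(X_cf, y)
rho_hat = beta_cf[3]
t_rho = rho_hat / se_cf[3]                    # test statistic for rho = 0

# For comparison: Poisson QMLE that ignores the endogeneity of x.
X_naive = np.column_stack([const, x, z1])
beta_naive, _ = poisson_qmle(X_naive, y)

print("coefficient on x, naive:           ", beta_naive[1])
print("coefficient on x, control function:", beta_cf[1])
print("rho_hat and its t-statistic:       ", rho_hat, t_rho)
```

In this design the naive estimate of the coefficient on `x` should be pushed away from the value used in the simulation, while the control function estimate should lie close to it. For inference on parameters other than $\rho$, the second-stage standard errors would need an adjustment for the generated regressor, for example by bootstrapping both steps together.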