Sample Problem for Assignment 2

Note: The following is an example of the spirit of a problem that might form the basis for your simulation project. Do not, however, feel that the problem you select must conform in any way to this example! E.g., you can study testing procedures rather than estimators and be motivated by completely different considerations. The write-up here is intended only for your information and should not be taken to be an example of the kind of write-up you should prepare.

The effect of estimating weights in regression: In regression with independent data, it is well known that the ordinary least squares (OLS) estimator for the regression parameter, which assumes that the variance of all observations is constant, is asymptotically normally distributed with mean equal to the true value of the regression parameter. When the variances are not constant, it is also well known that the weighted least squares (WLS) estimator with the known, true weights is asymptotically normally distributed as well, but with smaller variance than OLS. In real problems, the true weights are rarely known. Thus, a common practice is to model and estimate them and then substitute the estimated weights for the true weights in the WLS estimator, yielding what is often called the (estimated) generalized least squares (GLS) estimator. The large-sample theory states that the GLS estimator has asymptotically the same normal distribution as WLS, suggesting that there is no penalty to be paid for having to estimate the weights rather than knowing them. This seems rather optimistic; intuitively, one might expect GLS to be less precise than WLS.
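
To make the three estimators concrete, here is a minimal Python sketch using NumPy. It is an illustration only, not a prescribed implementation. The function names and the variance model Var(y_i) = sigma^2 * exp(2 * theta * x_i) are assumptions introduced for this example; any other model for the weights could be substituted.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares: all observations get equal weight."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def wls(X, y, w):
    """Weighted least squares with weights w_i, ideally proportional to 1 / Var(y_i)."""
    sw = np.sqrt(w)
    # Scaling each row by sqrt(w_i) turns the weighted problem into an ordinary one.
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta

def gls_estimated_weights(X, y, x):
    """GLS with estimated weights, under the ASSUMED (hypothetical) variance model
    Var(y_i) = sigma^2 * exp(2 * theta * x_i)."""
    resid = y - X @ ols(X, y)
    # Under the assumed model, log(resid^2) is roughly linear in x with slope 2 * theta.
    Z = np.column_stack([np.ones_like(x), x])
    gamma = ols(Z, np.log(resid**2 + 1e-12))  # small constant guards against log(0)
    # Weights are needed only up to a constant factor, so the intercept is dropped.
    w_hat = np.exp(-gamma[1] * x)
    return wls(X, y, w_hat)
```

With all weights equal, `wls` reduces to `ols`; the only difference between WLS and GLS here is whether the weights passed in are the true ones or estimates computed from the same data.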

It is natural to wonder if any of these results are relevant in the kinds of finite samples seen in practice. In particular, is each estimator approximately normally distributed with mean equal to the truth and variance that can be estimated accurately by the expression from the large-sample theory? Does the need to estimate the weights not matter in finite samples, or is the GLS estimator inefficient relative to the WLS estimator for which the weights are known, as intuition would suggest? What if the model for the weights used to construct the GLS estimator is incorrect? In this case, the estimated weights may not accurately reflect the true ones. What are the possible consequences? As WLS has smaller variance than OLS (at least asymptotically) only when the true weights are used, is it possible that one might be better off using OLS rather than using GLS with a misspecified model for the weights?
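
One way such questions might be explored is with a small Monte Carlo study. The sketch below is self-contained and purely illustrative. The sample size, number of replications, true parameter values, the true variance function, and the deliberately wrong weight model (log-variance linear in (x - 1)^2) are all assumptions chosen for this example, not settings taken from the problem description.

```python
import numpy as np

def wls(X, y, w):
    """Weighted least squares; passing w = 1 for every observation gives OLS."""
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta

def estimated_weights(X, y, z):
    """Estimate weights assuming log Var(y_i) is linear in z_i (an assumed model)."""
    ones = np.ones(len(y))
    resid = y - X @ wls(X, y, ones)            # OLS residuals
    Z = np.column_stack([ones, z])
    gamma = wls(Z, np.log(resid**2 + 1e-12), ones)
    return np.exp(-gamma[1] * z)               # weights up to a constant factor

def simulate(n=30, reps=2000, beta=(1.0, 2.0), theta=1.0, seed=0):
    """Return a dict mapping estimator name to a (reps, 2) array of estimates."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 2.0, n)               # fixed design across replications
    X = np.column_stack([np.ones(n), x])
    sd = 0.5 * np.exp(theta * x)               # true standard deviations (assumed)
    names = ("OLS", "WLS", "GLS", "GLS-wrong")
    out = {k: np.empty((reps, 2)) for k in names}
    for r in range(reps):
        y = X @ np.asarray(beta) + sd * rng.standard_normal(n)
        out["OLS"][r] = wls(X, y, np.ones(n))
        out["WLS"][r] = wls(X, y, 1.0 / sd**2)                    # true weights
        out["GLS"][r] = wls(X, y, estimated_weights(X, y, x))     # correct model
        # Misspecified model: log-variance taken to be linear in (x - 1)^2.
        out["GLS-wrong"][r] = wls(X, y, estimated_weights(X, y, (x - 1.0) ** 2))
    return out

def summarize(out, slope_true=2.0):
    """Print Monte Carlo mean, variance, and mean squared error of the slope estimates."""
    for name, est in out.items():
        slope = est[:, 1]
        mse = np.mean((slope - slope_true) ** 2)
        print(f"{name:10s} mean={slope.mean():.4f} "
              f"var={slope.var(ddof=1):.5f} mse={mse:.5f}")
```

Calling `summarize(simulate())` prints one line per estimator. Comparing the Monte Carlo variances of WLS and GLS speaks to the cost of estimating the weights, and comparing OLS with the misspecified GLS speaks to the last question above. Histograms or normal quantile plots of the columns of each array could be used to judge how close the sampling distributions are to normal.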