Instrumental Variables (IV) Estimator

IV Estimator - Resources

IV Estimator - Other

IV Estimator - Intuition & Derivation

assume we have:

  • 𝑦𝑖 = 𝜃0 + 𝜃1𝑥𝑖 + 𝑒𝑖 # regression model
  • 𝐶𝑜𝑣(𝑥𝑖,𝑒𝑖) ≠ 0 # 𝑥𝑖 is endogenous

the least squares estimate 𝜃1ˆ of true population parameter 𝜃1 is defined as:

  • 𝜃1ˆ = 𝛥𝑦/𝛥𝑥
  • 𝜃1ˆ = (𝛥𝑦𝑥 + 𝛥𝑦𝑒)/𝛥𝑥 # because of endogeneity, 𝛥𝑦 has a part caused by 𝑥 (𝛥𝑦𝑥) and a part caused by 𝑒 (𝛥𝑦𝑒)
  • 𝜃1ˆ = (𝛥𝑦𝑥/𝛥𝑥) + (𝛥𝑦𝑒/𝛥𝑥)
  • 𝜃1ˆ = 𝜃1 + (𝛥𝑦𝑒/𝛥𝑥) # population parameter 𝜃1 = (𝛥𝑦𝑥/𝛥𝑥) by definition

PROBLEM: therefore, the least squares estimate 𝜃1ˆ is a BIASED estimate of the true population parameter 𝜃1 because of endogeneity: the extra term (𝛥𝑦𝑒/𝛥𝑥) does not vanish when 𝑥 and 𝑒 move together
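
The bias term above can be checked numerically. The following is a minimal sketch on simulated data, not part of the original page: the parameter values, the shared factor `u` that creates the endogeneity, and all variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
theta0, theta1 = 1.0, 2.0            # assumed true population parameters

u = rng.normal(size=n)               # unobserved factor that enters both x and e
x = rng.normal(size=n) + u           # regressor, contaminated by u
e = u + rng.normal(size=n)           # error term, also contains u, so Cov(x, e) != 0
y = theta0 + theta1 * x + e

# least squares slope = Sample-Cov(x, y) / Sample-Var(x)
ols_slope = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

# the extra term from the derivation, written with sample covariances
bias_term = np.cov(x, e)[0, 1] / np.var(x, ddof=1)

print("least squares slope:", ols_slope)
print("theta1 + bias term :", theta1 + bias_term)
```

In this setup 𝐶𝑜𝑣(𝑥𝑖,𝑒𝑖) = 1 and 𝑉𝑎𝑟(𝑥𝑖) = 2, so the least squares slope should land near 2.5 rather than the assumed true value 2.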

SOLUTION: introduce a third variable (instrumental variable) 𝑧𝑖 such that:

  • 𝐶𝑜𝑣(𝑧𝑖,𝑥𝑖) ≠ 0
  • 𝐶𝑜𝑣(𝑧𝑖,𝑒𝑖) = 0 

next we expand 𝐶𝑜𝑣(𝑧𝑖,𝑦𝑖) using the regression model 𝑦𝑖 = 𝜃0 + 𝜃1𝑥𝑖 + 𝑒𝑖:

  • 𝐶𝑜𝑣(𝑧𝑖,𝑦𝑖) = 𝐶𝑜𝑣(𝑧𝑖,𝜃0 + 𝜃1𝑥𝑖 + 𝑒𝑖)
  • 𝐶𝑜𝑣(𝑧𝑖,𝑦𝑖) = 𝐶𝑜𝑣(𝑧𝑖,𝜃0) + 𝜃1𝐶𝑜𝑣(𝑧𝑖,𝑥𝑖) + 𝐶𝑜𝑣(𝑧𝑖,𝑒𝑖) # by properties of covariance
  • 𝐶𝑜𝑣(𝑧𝑖,𝑦𝑖) = 0 + 𝜃1𝐶𝑜𝑣(𝑧𝑖,𝑥𝑖) + 𝐶𝑜𝑣(𝑧𝑖,𝑒𝑖) # covariance with constant equals 0
  • 𝐶𝑜𝑣(𝑧𝑖,𝑦𝑖) = 𝜃1𝐶𝑜𝑣(𝑧𝑖,𝑥𝑖) + 0 # by above statement 𝐶𝑜𝑣(𝑧𝑖,𝑒𝑖) = 0

therefore:

  • 𝜃1 = 𝐶𝑜𝑣(𝑧𝑖,𝑦𝑖) / 𝐶𝑜𝑣(𝑧𝑖,𝑥𝑖)

therefore, the IV Estimate 𝜃1ˆ of the true population parameter 𝜃1 is defined as:

  • 𝜃1ˆ = 𝑆𝑎𝑚𝑝𝑙𝑒-𝐶𝑜𝑣(𝑧𝑖,𝑦𝑖) / 𝑆𝑎𝑚𝑝𝑙𝑒-𝐶𝑜𝑣(𝑧𝑖,𝑥𝑖)
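
The ratio of sample covariances can be computed directly. Below is a minimal sketch on simulated data; the helper name `iv_estimate`, the data-generating process, and the parameter values are assumptions chosen for illustration, not something defined on this page.

```python
import numpy as np

def iv_estimate(z, x, y):
    """IV estimate of the slope: Sample-Cov(z, y) / Sample-Cov(z, x)."""
    return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

rng = np.random.default_rng(1)
n = 200_000
theta0, theta1 = 1.0, 2.0                 # assumed true population parameters

z = rng.normal(size=n)                    # instrument: moves x, independent of e
u = rng.normal(size=n)                    # unobserved factor shared by x and e
x = 0.8 * z + u + rng.normal(size=n)      # Cov(z, x) = 0.8 != 0
e = u + rng.normal(size=n)                # Cov(z, e) = 0, but Cov(x, e) != 0
y = theta0 + theta1 * x + e

ols = np.cov(x, y)[0, 1] / np.var(x, ddof=1)   # least squares slope, for comparison
iv = iv_estimate(z, x, y)

print("least squares slope:", ols)
print("IV estimate        :", iv)
```

With this many observations the IV estimate should sit close to the assumed 𝜃1 = 2, while the least squares slope stays visibly above it.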

IV Estimator - Examples


IV Estimator - Bad/Weak/Good Instrument Variables

IV Type | Conditions of IV Type | IV Estimate is Unbiased | IV Estimate is Consistent
Good Instrument Variables | 𝐶𝑜𝑣(𝑧𝑖,𝑥𝑖) ≠ 0 and 𝐶𝑜𝑣(𝑧𝑖,𝑒𝑖) = 0 | No | Yes
Bad Instrument Variables | 𝐶𝑜𝑣(𝑧𝑖,𝑥𝑖) ≠ 0 and 𝐶𝑜𝑣(𝑧𝑖,𝑒𝑖) ≠ 0 | No | No
Weak Instrument Variables | 𝐶𝑜𝑣(𝑧𝑖,𝑥𝑖) ≈ 0 and 𝐶𝑜𝑣(𝑧𝑖,𝑒𝑖) = 0 | ? | ?
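
The three instrument types can be compared in a small simulation. This is a hedged sketch only: the function names `simulate` and `spread`, the knobs `strength` (controls 𝐶𝑜𝑣(𝑧𝑖,𝑥𝑖)) and `leak` (controls 𝐶𝑜𝑣(𝑧𝑖,𝑒𝑖)), and every numeric value are assumptions for illustration.

```python
import numpy as np

def iv_estimate(z, x, y):
    return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

def simulate(n, strength, leak, seed):
    """One simulated sample; returns the IV estimate of theta1 (true value 2.0)."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)
    u = rng.normal(size=n)                       # shared factor => x is endogenous
    x = strength * z + u + rng.normal(size=n)    # strength sets Cov(z, x)
    e = leak * z + u + rng.normal(size=n)        # leak sets Cov(z, e)
    y = 1.0 + 2.0 * x + e
    return iv_estimate(z, x, y)

def spread(strength, leak, n=200, reps=200):
    """Interquartile range of the IV estimate across repeated small samples."""
    est = [simulate(n, strength, leak, seed=s) for s in range(reps)]
    q25, q75 = np.percentile(est, [25, 75])
    return q75 - q25

good = simulate(100_000, strength=0.8, leak=0.0, seed=0)
bad = simulate(100_000, strength=0.8, leak=0.4, seed=0)
good_spread = spread(strength=0.8, leak=0.0)
weak_spread = spread(strength=0.02, leak=0.0)

print("good instrument, large n:", good)         # close to 2.0
print("bad instrument, large n :", bad)          # off by about leak / strength
print("spread, good vs weak    :", good_spread, weak_spread)
```

In this setup the bad instrument settles near 2 + 0.4/0.8 = 2.5 even with a large sample, and the weak instrument produces estimates that are far more dispersed than the good one at the same sample size.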

IV Estimator - Bias & Consistency

 bias

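an explanation of why the IV Estimate 𝜃1ˆ of population parameter 𝜃1 is biased in finite samples

assume we have:

  • 𝑦𝑖 = 𝜃0 + 𝜃1𝑥𝑖 + 𝑒𝑖 # regression model with endogenous 𝑥𝑖
  • 𝐶𝑜𝑣(𝑧𝑖,𝑥𝑖) ≠ 0 and 𝐶𝑜𝑣(𝑧𝑖,𝑒𝑖) = 0 # 𝑧𝑖 is a good instrumental variable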

then the IV Estimate 𝜃1ˆ of population parameter 𝜃1 is biased, in other words:

  • 𝐄[𝜃1ˆ] ≠ 𝜃1

from the page "Instrumental Variable Estimate vs 2 Stage Least Squares Estimate" we see that the IV Estimate is similar to the 2SLS Estimate. In 2SLS we have 2 stages of LS regression:

  1. 𝑥𝑖 = 𝛿0 + 𝛿1𝑧𝑖 + 𝜀𝑖
  2. 𝑦𝑖 = 𝜃0 + 𝜃1𝑥𝑖 + 𝑒𝑖 # original regression model, fitted with the stage 1 prediction of 𝑥𝑖 in place of 𝑥𝑖
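
The two stages can be sketched on simulated data and compared with the covariance-ratio form of the IV Estimate. The data-generating process and all names below are assumptions for illustration; `np.polyfit` with `deg=1` is used here simply as a convenient least squares line fit.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5_000
z = rng.normal(size=n)                          # instrument
u = rng.normal(size=n)                          # shared factor => x is endogenous
x = 0.5 + 0.8 * z + u + rng.normal(size=n)
e = u + rng.normal(size=n)
y = 1.0 + 2.0 * x + e

# stage 1: least squares regression of x on z (polyfit returns [slope, intercept])
d1_hat, d0_hat = np.polyfit(z, x, deg=1)
x_hat = d0_hat + d1_hat * z                     # fitted values of x

# stage 2: least squares regression of y on the fitted values
theta1_2sls, theta0_2sls = np.polyfit(x_hat, y, deg=1)

# covariance-ratio form of the IV estimate
theta1_iv = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

print("2SLS slope :", theta1_2sls)
print("IV estimate:", theta1_iv)
```

With one instrument and one regressor, as in this sketch, the two numbers agree up to floating-point error.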
we compare two cases: KNOWN {𝛿0, 𝛿1} and UNKNOWN {𝛿0, 𝛿1}

Assume we KNOW the values of {𝛿0, 𝛿1}. Thus:

  • 𝑥𝑖𝑡𝑟𝑢𝑒 = 𝛿0 + 𝛿1𝑧𝑖

plug this into the original regression model:

  • 𝑦𝑖 = 𝜃0 + 𝜃1𝑥𝑖𝑡𝑟𝑢𝑒 + 𝑒𝑖

because the values of {𝛿0, 𝛿1} are KNOWN, 𝑥𝑖𝑡𝑟𝑢𝑒 is a function of 𝑧𝑖 only and contains none of 𝜀𝑖. Since 𝐶𝑜𝑣(𝑧𝑖,𝑒𝑖) = 0, we get 𝐶𝑜𝑣(𝑥𝑖𝑡𝑟𝑢𝑒,𝑒𝑖) = 𝛿1𝐶𝑜𝑣(𝑧𝑖,𝑒𝑖) = 0. Thus there is no correlation between 𝑒𝑖 and 𝑥𝑖𝑡𝑟𝑢𝑒, and the endogeneity problem does not arise

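In reality, the values of {𝛿0, 𝛿1} are UNKNOWN and are estimated with {𝛿0ˆ, 𝛿1ˆ} from the stage 1 regression. Thus:

  • 𝑥̂𝑖 = 𝛿0ˆ + 𝛿1ˆ𝑧𝑖 # {𝛿0ˆ, 𝛿1ˆ} are computed from the sample, so they depend on 𝜀𝑖

plugging into the original regression model:

  • 𝑦𝑖 = 𝜃0 + 𝜃1𝑥̂𝑖 + 𝑒𝑖

because the values of {𝛿0, 𝛿1} are UNKNOWN, 𝑥̂𝑖 contains 𝜀𝑖 through the estimates: because of sampling error {𝛿0ˆ, 𝛿1ˆ} ≠ {𝛿0, 𝛿1} and 𝑥̂𝑖 ≠ 𝑥𝑖𝑡𝑟𝑢𝑒. In a finite sample the sample covariance between 𝜀𝑖 and 𝑧𝑖 is not exactly 0, and 𝜀𝑖 is correlated with 𝑒𝑖 because 𝑥𝑖 is endogenous. Thus there is some correlation between 𝑒𝑖 and 𝑥̂𝑖, which biases the estimate of 𝜃1. This correlation shrinks as the sample size grows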
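
The difference between the KNOWN and UNKNOWN cases can be illustrated with a small Monte Carlo sketch. Everything here is an assumption made for illustration: the true first-stage values `d0`, `d1`, the shared factor `u` that links 𝜀𝑖 and 𝑒𝑖, the sample size of 20, and the helper `sample_cov`. The sketch averages the sample covariance with 𝑒𝑖 over many small samples, once for 𝑥𝑖𝑡𝑟𝑢𝑒 and once for 𝑥̂𝑖.

```python
import numpy as np

rng = np.random.default_rng(3)
reps, n = 20_000, 20                       # many small samples
d0, d1 = 0.5, 1.0                          # assumed true first-stage parameters

z = rng.normal(size=(reps, n))
u = rng.normal(size=(reps, n))             # shared factor => Cov(eps, e) != 0
eps = u + 0.5 * rng.normal(size=(reps, n)) # first-stage error
e = u + 0.5 * rng.normal(size=(reps, n))   # error of the original model
x = d0 + d1 * z + eps

def sample_cov(a, b):
    """Row-wise sample covariance: one value per simulated sample."""
    a_c = a - a.mean(axis=1, keepdims=True)
    b_c = b - b.mean(axis=1, keepdims=True)
    return (a_c * b_c).sum(axis=1) / (a.shape[1] - 1)

# stage 1 estimates, computed separately in each sample
d1_hat = sample_cov(z, x) / sample_cov(z, z)
d0_hat = x.mean(axis=1) - d1_hat * z.mean(axis=1)

x_true = d0 + d1 * z                               # KNOWN {d0, d1}
x_hat = d0_hat[:, None] + d1_hat[:, None] * z      # UNKNOWN, estimated

mean_cov_true = sample_cov(x_true, e).mean()
mean_cov_hat = sample_cov(x_hat, e).mean()

print("average Sample-Cov(x_true, e):", mean_cov_true)
print("average Sample-Cov(x_hat, e) :", mean_cov_hat)
```

In this setup the first average stays near 0, while the second is clearly positive, which is the finite-sample correlation between 𝑒𝑖 and 𝑥̂𝑖 described above.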

 consistency

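an explanation of why the IV Estimate 𝜃1ˆ of population parameter 𝜃1 is consistent

assume we have:

  • 𝑦𝑖 = 𝜃0 + 𝜃1𝑥𝑖 + 𝑒𝑖 # regression model with endogenous 𝑥𝑖
  • 𝐶𝑜𝑣(𝑧𝑖,𝑥𝑖) ≠ 0 and 𝐶𝑜𝑣(𝑧𝑖,𝑒𝑖) = 0 # 𝑧𝑖 is a good instrumental variable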

then the IV Estimate 𝜃1ˆ of population parameter 𝜃1 is consistent, in other words:

  • 𝑝𝑙𝑖𝑚𝑛→∞ 𝜃1ˆ = 𝜃1

PROOF

first let's take the definition of the IV Estimate:

  • 𝜃1ˆ = 𝑆𝑎𝑚𝑝𝑙𝑒-𝐶𝑜𝑣(𝑧𝑖,𝑦𝑖) / 𝑆𝑎𝑚𝑝𝑙𝑒-𝐶𝑜𝑣(𝑧𝑖,𝑥𝑖)

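as 𝑛→∞ the sample covariances converge in probability to the population covariances (law of large numbers), and because 𝐶𝑜𝑣(𝑧𝑖,𝑥𝑖) ≠ 0 the ratio converges as well, so we have: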

  • 𝑝𝑙𝑖𝑚𝑛→∞ 𝜃1ˆ = 𝐶𝑜𝑣(𝑧𝑖,𝑦𝑖) / 𝐶𝑜𝑣(𝑧𝑖,𝑥𝑖)
  • 𝑝𝑙𝑖𝑚𝑛→∞ 𝜃1ˆ = 𝐶𝑜𝑣(𝑧𝑖,𝜃0 + 𝜃1𝑥𝑖 + 𝑒𝑖) / 𝐶𝑜𝑣(𝑧𝑖,𝑥𝑖) # because 𝑦𝑖 = 𝜃0 + 𝜃1𝑥𝑖 + 𝑒𝑖
  • 𝑝𝑙𝑖𝑚𝑛→∞ 𝜃1ˆ = [𝐶𝑜𝑣(𝑧𝑖,𝜃0) + 𝜃1𝐶𝑜𝑣(𝑧𝑖,𝑥𝑖) + 𝐶𝑜𝑣(𝑧𝑖,𝑒𝑖)] / 𝐶𝑜𝑣(𝑧𝑖,𝑥𝑖) # by properties of covariance
  • 𝑝𝑙𝑖𝑚𝑛→∞ 𝜃1ˆ = [0 + 𝜃1𝐶𝑜𝑣(𝑧𝑖,𝑥𝑖) + 𝐶𝑜𝑣(𝑧𝑖,𝑒𝑖)] / 𝐶𝑜𝑣(𝑧𝑖,𝑥𝑖) # covariance with a constant is 0
  • 𝑝𝑙𝑖𝑚𝑛→∞ 𝜃1ˆ = [0 + 𝜃1𝐶𝑜𝑣(𝑧𝑖,𝑥𝑖) + 0] / 𝐶𝑜𝑣(𝑧𝑖,𝑥𝑖) # 𝐶𝑜𝑣(𝑧𝑖,𝑒𝑖) = 0 by condition of good instrumental variable
  • 𝑝𝑙𝑖𝑚𝑛→∞ 𝜃1ˆ = 𝜃1𝐶𝑜𝑣(𝑧𝑖,𝑥𝑖) / 𝐶𝑜𝑣(𝑧𝑖,𝑥𝑖)
  • 𝑝𝑙𝑖𝑚𝑛→∞ 𝜃1ˆ = 𝜃1

hence, proved
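
The result can also be observed numerically. The following is a minimal sketch, assuming a simulated data-generating process with a good instrument; the helper `iv_estimate`, the sample sizes, and the parameter values are illustrative choices, not taken from this page.

```python
import numpy as np

def iv_estimate(z, x, y):
    """IV estimate of the slope: Sample-Cov(z, y) / Sample-Cov(z, x)."""
    return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

rng = np.random.default_rng(4)
theta0, theta1 = 1.0, 2.0                  # assumed true population parameters

errors = {}
for n in (100, 10_000, 1_000_000):
    z = rng.normal(size=n)                 # good instrument
    u = rng.normal(size=n)                 # shared factor => x is endogenous
    x = 0.8 * z + u + rng.normal(size=n)
    e = u + rng.normal(size=n)
    y = theta0 + theta1 * x + e
    errors[n] = abs(iv_estimate(z, x, y) - theta1)
    print(f"n = {n:>9}: |IV estimate - theta1| = {errors[n]:.4f}")
```

The absolute error typically shrinks as 𝑛 grows, although a single run at a small 𝑛 can land close to 𝜃1 by chance.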
