# AQRM part 2: causal analysis when endogeneity is likely

- Created by: charlie
- Created on: 21-05-17 15:04

exogeneity holds (regressor uncorrelated with ui)

OLS will estimate a CAUSAL relationship between outcome and regressors; OLS is unbiased + consistent (and BLUE if the other Gauss-Markov assumptions also hold)

1 of 51

exogeneity doesn't hold/ endogeneity (regressor correlated with ui)

OLS will estimate a NON-CAUSAL relationship between outcome and regressors; OLS is biased + inconsistent

2 of 51

(4) reasons for endogeneity

(1) OVB (2) Measurement error (3) Simultaneity bias (4) Selection bias

3 of 51

reasons for endogeneity: (1) OVB

excluding relevant variables that are correlated with X's

4 of 51

reasons for endogeneity: (1) OVB (upwards bias/ overestimated)

SAME SIGNS: some of the effect of OV is mistakenly attributed to X, so the coef on X is too high (occurs when +ve/+ve or -ve/-ve, e.g. cov(OV,Y)>0 and important (coef not 0), and cov(OV,X)>0)

5 of 51

reasons for endogeneity: (1) OVB (downwards bias/ underestimated)

DIFFERENT SIGNS: the effect of OV is mistakenly attributed to X with the opposite sign, so the coef on X is too low (occurs when +ve/-ve or -ve/+ve, e.g. cov(OV,Y)>0 and important (coef not 0), and cov(OV,X)<0)
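
The sign rule for OVB can be checked with a small simulation. This is only a sketch: the true coefficients (1.0 on X, 2.0 on OV), the 0.8 loading and the sample size are made-up values, and `ols_slope` is a hypothetical helper written for this example.

```python
import random

def ols_slope(x, y):
    """Slope from a simple regression of y on x (with intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def short_regression_slope(cov_sign, n=20000, seed=0):
    """Regress Y on X alone while the omitted variable OV also drives Y."""
    rng = random.Random(seed)
    ov = [rng.gauss(0, 1) for _ in range(n)]
    # X is correlated with OV; cov_sign sets the sign of cov(OV, X).
    x = [cov_sign * 0.8 * o + rng.gauss(0, 1) for o in ov]
    # Assumed true model: effect of X is 1.0, effect of OV is +2.0.
    y = [1.0 * xi + 2.0 * o + rng.gauss(0, 1) for xi, o in zip(x, ov)]
    return ols_slope(x, y)

slope_same_signs = short_regression_slope(+1)  # OV coef > 0, cov(OV, X) > 0
slope_diff_signs = short_regression_slope(-1)  # OV coef > 0, cov(OV, X) < 0
print(slope_same_signs, slope_diff_signs)
```

With same signs the estimated slope should sit above the true value of 1.0 (upward bias), and with different signs below it (downward bias).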

6 of 51

reasons for endogeneity: (2) measurement error

OLS will not estimate B2 consistently if Xi is likely to be measured with error (e.g. current income used as a proxy for permanent income/ asking people the no. of cigarettes they smoke per week)
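
A rough illustration of why measurement error in X breaks consistency, assuming classical measurement error (noise unrelated to the true X and to ui). The true slope of 2.0 and the noise variances are invented for the sketch; under these assumptions the slope on the mismeasured X is pulled towards zero.

```python
import random

def ols_slope(x, y):
    """Slope from a simple regression of y on x (with intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(1)
n = 20000
x_true = [rng.gauss(0, 1) for _ in range(n)]        # what we would like to observe
y = [2.0 * x + rng.gauss(0, 1) for x in x_true]     # assumed true B2 = 2.0
x_obs = [x + rng.gauss(0, 1) for x in x_true]       # what we actually observe

slope_true = ols_slope(x_true, y)  # uses the correctly measured regressor
slope_obs = ols_slope(x_obs, y)    # uses the regressor measured with error
print(slope_true, slope_obs)
```

Here var(true X) = var(noise) = 1, so the slope on the observed X should settle near 2.0 × 1/(1+1) = 1.0 however large the sample gets.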

7 of 51

reasons for endogeneity: (3) simultaneity bias

Y and X are determined simultaneously, each depending on the other (a single OLS regression doesn't take this into account), e.g. supply + demand/ police spending + crime

8 of 51

reasons for endogeneity: (4) selection bias (lecture example)

sample isn't representative of population (relationship between Y and X for sampled individuals is different to that for unsampled), e.g. people who still attend a lecture after drinking are those least affected by alcohol (they self-select, so the most important data are missed)

9 of 51

(2) solutions to endogeneity

(1) randomisation in OLS (2) use of instrumental variables (IV)

10 of 51

solutions to endogeneity: (1) Randomisation in OLS

randomly assigning X (by some device) makes X uncorrelated with all other characteristics (other regressors and omitted variables in ui); on avg each group gets the same mix, so in large samples everything else is the same on avg and variation in outcome is due only to X

11 of 51

solutions to endogeneity: (1) Randomisation: RCT healthcare example

treatment effect captured by including randomly assigned dummy variable =1 if treated (=0 if control)
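
As a sketch of the dummy-variable idea: with a randomly assigned 0/1 treatment, the OLS coefficient on the dummy is just the difference in mean outcomes between treated and control. The data-generating numbers below (effect 1.5, an unobserved `health` term in ui) are assumptions for illustration only.

```python
import random

def mean(values):
    return sum(values) / len(values)

def ols_slope(x, y):
    """Slope from a simple regression of y on x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(2)
n = 10000
health = [rng.gauss(0, 1) for _ in range(n)]   # unobserved, sits in ui
d = [rng.randint(0, 1) for _ in range(n)]      # random assignment: 1 = treated
# Assumed true treatment effect: 1.5
y = [5.0 + 1.5 * di + 2.0 * h + rng.gauss(0, 1) for di, h in zip(d, health)]

coef_on_dummy = ols_slope(d, y)
diff_in_means = (mean([yi for yi, di in zip(y, d) if di == 1])
                 - mean([yi for yi, di in zip(y, d) if di == 0]))
print(coef_on_dummy, diff_in_means)
```

Because `d` is drawn independently of `health`, leaving `health` out of the regression does not bias the coefficient on the dummy.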

12 of 51

solutions to endogeneity: (1) Randomisation: RCT (5) limitations on human subjects

(1) unethical to force people (2) only run on people choosing to participate (3) impossible to strictly assign the variable (control group may obtain treatment by other means) (4) measures effect of having the 'option', not of actual treatment (5) limited duration of experiments

13 of 51

solutions to endogeneity: (2) IV (Z) reason

helps identify the causal relationship, as Z lets us isolate some exogenous variation in X

14 of 51

solutions to endogeneity: (2) IV (2) validity assumptions

(1) RELEVANCE: Z must be correlated with X (can directly test) (2) EXOGENEITY: Z must not affect Y directly (only through X) + must not be correlated with omitted variables that determine Y (ui) (can't directly test)

15 of 51

solutions to endogeneity: (2) IV 2SLS (stage 1)

(1) regress X on Z and obtain fitted values; this separates the 2 sources of variation in X (exogenous from Z, endogenous) and creates a fitted value that only captures the exogenous variation in X (coming from Z)

16 of 51

solutions to endogeneity: (2) IV 2SLS (stage 2)

(2) [predict X(hat) from 1st stage] then regress Y on X(hat); the coef on X(hat) estimates the causal effect of X on Y (only using variation in X that is unrelated to ui)
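
The two stages can be sketched by hand for one endogenous X and one instrument Z. Everything numeric here (true effect 0.5, the strength of Z, the way X depends on ui) is an assumption made up for the example, and the manual second stage is shown only for its coefficient, not its standard errors.

```python
import random

def mean(values):
    return sum(values) / len(values)

def ols_slope(x, y):
    """Slope from a simple regression of y on x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(3)
n = 20000
z = [rng.gauss(0, 1) for _ in range(n)]   # instrument: unrelated to u
u = [rng.gauss(0, 1) for _ in range(n)]   # error term of the outcome equation
# X is endogenous: it depends on u as well as on Z.
x = [0.7 * zi + 0.8 * ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
y = [1.0 + 0.5 * xi + ui for xi, ui in zip(x, u)]  # assumed true effect: 0.5

ols_estimate = ols_slope(x, y)            # contaminated by corr(X, u)

# Stage 1: regress X on Z and keep the fitted values.
d2 = ols_slope(z, x)
d1 = mean(x) - d2 * mean(z)
x_hat = [d1 + d2 * zi for zi in z]

# Stage 2: regress Y on X(hat).
tsls_estimate = ols_slope(x_hat, y)
print(ols_estimate, tsls_estimate)
```

In this setup plain OLS should land well above 0.5, while the second-stage coefficient should be close to it.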

17 of 51

solutions to endogeneity: (2) IV 2SLS (1 command corrects s.e)

manually computing the two stages in Stata gives incorrect s.e., as stage 2 doesn't take into account that X(hat) was itself estimated using OLS; a single Stata command corrects this: ivregress 2sls lnearn (S= sm sf siblings) female wexp...

18 of 51

solutions to endogeneity: (2) IV (4) additional issues

(1) how do we test endogeneity of X? (2) RELEVANCE: need to test strength of IV (3) EXOGENEITY: need to test exogeneity of IV (4) what if multiple endogenous regressors?

19 of 51

IV: with additional exogenous variables: IV assumptions

(1) RELEVANCE (different): Z must be correlated with X after accounting for W (stage 1 coef on Z, D2, cannot = 0 holding W constant) (2) EXOGENEITY (same)

20 of 51

IV: with additional exogenous variables: 2SLS

(stage 1) include W in regression (stage 2) include W in regression

21 of 51

IV: with multiple instruments + additional exogenous variables: IV assumptions

(1) RELEVANCE (different): at least one IV must be correlated with X after accounting for W (cannot have D2=D3=0) (2) EXOGENEITY (different): all IVs must be uncorrelated with ui
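
For reference, one way to write the two stages with two instruments and one additional exogenous variable. The cards only name D2, D3 and B2; the symbols D1, D4, B1, B3, Z1, Z2 and vi are notation assumed here to complete the equations.

```latex
\begin{align}
X_i &= D_1 + D_2 Z_{1i} + D_3 Z_{2i} + D_4 W_i + v_i
      && \text{(stage 1)} \\
Y_i &= B_1 + B_2 \hat{X}_i + B_3 W_i + u_i
      && \text{(stage 2)}
\end{align}
```

Relevance is then the statement that D2 and D3 are not both zero, and B2 is the coefficient of interest.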

22 of 51

IV: with multiple instruments + additional exogenous variables: 2SLS

(stage 1) include all IV and W in regression (stage 2) include just fitted value and W in regression

23 of 51

IV: RELEVANCE testing for weak or strong instrument in stata

test in (stage 1) of 2SLS... (manual) regress 1st stage of 2SLS + test coef D2=D3=0 + check F-stat (quick) ivregress 2sls + estat firststage + check F-stat

24 of 51

IV: RELEVANCE weak IV

F-stat<10 (stage 1 F-stat on the IVs), 2SLS estimator can be far from the true parameter value

25 of 51

IV: RELEVANCE strong IV

F-stat>10, 2SLS estimator will be closer to true parameter value (stronger for larger sample sizes)
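
A hedged sketch of the rule of thumb with a single instrument, where the stage 1 F-stat is just the squared t-stat on Z. It assumes homoskedastic errors; `first_stage_f` is a hypothetical helper, and the coefficients 0.5 and 0.01 are invented to give one strong and one weak instrument.

```python
import random

def first_stage_f(z, x):
    """F-stat for H0: coef on Z = 0 in a regression of X on Z (one instrument)."""
    n = len(z)
    mz, mx = sum(z) / n, sum(x) / n
    szz = sum((zi - mz) ** 2 for zi in z)
    d2 = sum((zi - mz) * (xi - mx) for zi, xi in zip(z, x)) / szz
    d1 = mx - d2 * mz
    rss = sum((xi - d1 - d2 * zi) ** 2 for zi, xi in zip(z, x))
    var_d2 = (rss / (n - 2)) / szz   # homoskedastic variance of the slope
    return d2 ** 2 / var_d2          # with one restriction, F = t squared

rng = random.Random(4)
n = 1000
z = [rng.gauss(0, 1) for _ in range(n)]
x_strong = [0.5 * zi + rng.gauss(0, 1) for zi in z]   # Z clearly moves X
x_weak = [0.01 * zi + rng.gauss(0, 1) for zi in z]    # Z barely moves X

f_strong = first_stage_f(z, x_strong)
f_weak = first_stage_f(z, x_weak)
print(f_strong, f_weak)
```

With several instruments the same idea applies, but the F-stat has to test all IV coefficients jointly (D2=D3=0), which this one-instrument helper does not do.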

26 of 51

IV: EXOGENEITY testing validity of IV: multiple IV identification

UNDER ID (no. of IV < endogenous reg. cannot use 2SLS) EXACT ID ( no. of IV = endogenous reg. can use 2SLS) OVER ID (no. of IV > endogenous reg. can use 2SLS + indirectly test exogeneity)

27 of 51

IV: EXOGENEITY testing validity of IV: multiple IV Sargan Test

H0: All IV valid (exogenous) H1: At least one IV is invalid, ivregress 2sls + estat overid + check p-value of Sargan score (if p is below the chosen significance level, reject H0)

28 of 51

IV: Randomisation

assign the IV randomly, as this makes the exogeneity assumption easier to defend

29 of 51

IV: Randomisation ITT effect (causal effect of being ASSIGNED treatment)

difference in E(Y|Z) when randomised dummy Z=1 and Z=0 (e.g. difference in marks when sent email link for video, and not sent link for video)

30 of 51

IV: Randomisation ITT effect problem

UNDERESTIMATES causal effect (ignores non-compliance of individuals (e.g. some who received video won't watch it + some who don't receive video may find it by some other means))

31 of 51

IV: Randomisation causal effect of ACTUAL treatment

ITT effect (effect of being assigned random IV on Y) DIVIDED BY effect of being assigned random IV on X (coef of stage 1 of 2SLS)
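
The ratio can be sketched with a simulated version of the video example. The compliance rates (70% watch if sent the link, 10% find it anyway) and the 10-mark effect of watching are assumptions for illustration, not figures from the course.

```python
import random

def mean(values):
    return sum(values) / len(values)

rng = random.Random(5)
n = 20000
z = [rng.randint(0, 1) for _ in range(n)]   # 1 = randomly sent the link
# Non-compliance: not everyone sent the link watches, some others watch anyway.
x = [1 if rng.random() < (0.7 if zi == 1 else 0.1) else 0 for zi in z]
y = [50.0 + 10.0 * xi + rng.gauss(0, 5) for xi in x]   # marks

def diff_by_assignment(values):
    """Mean of values for Z=1 minus mean of values for Z=0."""
    assigned = [v for v, zi in zip(values, z) if zi == 1]
    not_assigned = [v for v, zi in zip(values, z) if zi == 0]
    return mean(assigned) - mean(not_assigned)

itt = diff_by_assignment(y)           # effect of being ASSIGNED on marks
first_stage = diff_by_assignment(x)   # effect of being assigned on watching
effect_of_watching = itt / first_stage
print(itt, first_stage, effect_of_watching)
```

Under these assumptions the ITT should come out around 6 marks, smaller than the 10-mark effect of actually watching, and dividing by the stage 1 difference (about 0.6) scales it back up.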

32 of 51

natural experiment (quasi-experiment)

when randomisation (RCT) can't be conducted, exploit a 'natural' event as a source of randomness that makes endogeneity unlikely (e.g. natural event/ unexpected event/ reform/ regulation)

33 of 51

Regression Discontinuity (RD): definition

a situation (natural experiment) in which treatment D depends on an observed continuous variable Q (running variable)

34 of 51

Regression Discontinuity (RD): features

cut-off point (q0): get treatment above it (D=1 if Q>q0) and don't get treatment below it (D=0 if Q<q0)

35 of 51

Sharp RD: (2) features (e.g. effect of alc restrictions on mortality)

treatment status D is a deterministic function of Q (strictly enforced at cut-off q0)/ D is a discontinuous function of Q (100% jump at cut-off: either able to buy or not)

36 of 51

Sharp RD: OLS model characteristics

.

37 of 51

Sharp RD: problems

only estimates a short-term effect (other factors begin to affect Y further past the cut-off)/ needs a large sample size (which introduces a range of other factors contributing to Y)

38 of 51

Sharp RD: OVB taken care of by controlling for trend f(Q)

unobservables will be correlated with D, but D is determined solely by the running variable Q (age in the example), so controlling for the trend in Q controls for them (e.g. can't choose when you turn 21, so it can be controlled for)
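
A minimal sketch of a sharp RD with a linear trend, assuming a cut-off at age 21, a downward trend in Y of 2 per year and a jump of 8 at the cut-off (all invented numbers). The coefficient on D in a regression of Y on D and Q is obtained here by partialling Q out of both Y and D, which gives the same coefficient as the multiple regression.

```python
import random

def mean(values):
    return sum(values) / len(values)

def ols_slope(x, y):
    """Slope from a simple regression of y on x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residuals(x, y):
    """Residuals from a simple regression of y on x (with intercept)."""
    b = ols_slope(x, y)
    mx, my = mean(x), mean(y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]

rng = random.Random(6)
n = 20000
q0 = 21.0
q = [rng.uniform(19.0, 23.0) for _ in range(n)]   # running variable: age
d = [1 if qi > q0 else 0 for qi in q]             # sharp rule: D = 1 if Q > q0
y = [100.0 - 2.0 * (qi - q0) + 8.0 * di + rng.gauss(0, 3)
     for qi, di in zip(q, d)]

# Coefficient on D controlling for the linear trend in Q.
rd_effect = ols_slope(residuals(q, d), residuals(q, y))

# Naive comparison of means above vs below the cut-off, ignoring the trend.
naive = (mean([yi for yi, di in zip(y, d) if di == 1])
         - mean([yi for yi, di in zip(y, d) if di == 0]))
print(rd_effect, naive)
```

The naive difference mixes the jump with the underlying trend in Q, whereas the trend-adjusted coefficient should recover something close to the assumed jump of 8.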

39 of 51

Sharp RD/ Fuzzy RD: non-linear relationship

include another term (quadratic in f(Q)) if the general trend is likely not linear; never know the exact coef as don't know the precise trend relationship

40 of 51

Sharp RD: cumulative effects

include another term (Q - q0) that captures how many units past the cut-off you are (treatment effect will now be a combination of D at cut-off + D past cut-off); only credible if no other influential factors past cut-off

41 of 51

Sharp RD: (3) important points

(1) study doesn't tell us about any other policy changes (2) validity depends on willingness to extrapolate away from cut-off (larger sample) (3) should be no other discontinuities at cut-off

42 of 51

Fuzzy RD: (2) features (e.g. effect of raising school leaving age on age at which people leave education)

(1) treatment status D isn't a deterministic function of Q (not strictly enforced/ indirect effect) (2) probability/ intensity of treatment jumps at cut-off (not 100% change as still have option)

43 of 51

Fuzzy RD: (stage 1) estimate 2 linear models using OLS

use D (dummy for being past the cut-off) to see how much it affects the regressor of interest X

44 of 51

Fuzzy RD: (stage 2) use D in 2SLS

D (being past the cut-off) is now used as the IV: RELEVANT (correlated with X) + EXOGENOUS (only related to Y through X), and the regressor of interest X plays the role of the TREATMENT

45 of 51

Fuzzy RD: assumptions

(1) trend captures all relevant differences before + after cut-off (2) IV exogenous (after adjusting for running variable, will only affect Y through X) (3) no other factors affecting D (other policies)

46 of 51

Diff-in-diff: features

observe 2 groups over time (treatment/comparison)/ comparison group is different from treatment group + is untreated in both periods/ treatment group experiences the policy change

47 of 51

Diff-in-diff: assumptions

DON'T ASSUME: the two groups are identical. DO ASSUME: both would have followed the same trend if left untreated (parallel trends)

48 of 51

Diff-in-diff: treatment effect equation (diff-in-diff)

removes confounding factors on treatment/ control group that would be assumed to affect both groups (TREND), leaving just the policy change
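
As a sketch, the diff-in-diff can be computed from the four group-by-period means. The numbers below (a fixed gap of 5 between groups, a common trend of 3, a policy effect of 2) are assumptions chosen so the three comparisons give visibly different answers.

```python
import random

rng = random.Random(7)

def outcome(treat_group, post):
    """Assumed model: group gap 5, common time trend 3, policy effect 2."""
    return (10.0 + 5.0 * treat_group + 3.0 * post
            + 2.0 * treat_group * post + rng.gauss(0, 1))

# 5000 observations in each of the four (group, period) cells.
cells = {(g, t): [outcome(g, t) for _ in range(5000)]
         for g in (0, 1) for t in (0, 1)}
m = {key: sum(vals) / len(vals) for key, vals in cells.items()}

# Change over time in treatment group minus change in comparison group.
did = (m[(1, 1)] - m[(1, 0)]) - (m[(0, 1)] - m[(0, 0)])

before_after = m[(1, 1)] - m[(1, 0)]    # treatment group only: effect + trend
cross_section = m[(1, 1)] - m[(0, 1)]   # post period only: effect + group gap
print(did, before_after, cross_section)
```

Only the double difference should come out near the assumed policy effect of 2; the before/after comparison picks up the trend and the between-group comparison picks up the fixed gap.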

49 of 51

Diff-in-diff: treatment effect equation bias

(1) don't compare directly between years, as many other factors change over time (2) don't compare directly between groups, as they will be systematically different

50 of 51

Diff-in-diff: OLS estimation equation

constant + group dummy (=1 for treatment group: fixed difference between groups) + period dummy (=1 after policy change: common time trend) + interaction of the two dummies (treatment effect)
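
Written out, the estimation equation might look as follows. The names Treat, Post and the coefficient labels B1 to B4 are assumed notation, not taken from the cards.

```latex
\begin{equation}
Y_{it} = B_1 + B_2\,\mathit{Treat}_i + B_3\,\mathit{Post}_t
       + B_4\,(\mathit{Treat}_i \times \mathit{Post}_t) + u_{it}
\end{equation}
```

Here B4, the coefficient on the interaction, is the diff-in-diff estimate of the treatment effect.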

51 of 51
