Frisch-Waugh-Lovell Theorem: Animated

Update: The code for these animations is available here.

The Frisch-Waugh-Lovell theorem states that in a multivariate regression of y on x_1 and x_2, the coefficient on x_2, which is \beta_2, is exactly the same as the slope you get from a two-step procedure: first regress y and x_2 each on x_1 separately, then regress the residuals of y on the residuals of x_2.
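In symbols, the statement can be sketched roughly as follows, with every line fit by OLS. The names a_0, a_1, c_0, c_1, r_y and r_{x_2} are notation introduced here purely for illustration.

```latex
\begin{align*}
y   &= \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon && \text{(full regression)} \\
y   &= a_0 + a_1 x_1 + r_y                               && \text{(residualize } y \text{ on } x_1\text{)} \\
x_2 &= c_0 + c_1 x_1 + r_{x_2}                           && \text{(residualize } x_2 \text{ on } x_1\text{)} \\
r_y &= \beta_2 \, r_{x_2} + \varepsilon                  && \text{(residuals on residuals)}
\end{align*}
```

The estimate of \beta_2 in the last line is the same as in the first line, and so are the fitted residuals \varepsilon.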

The point of this post is not to explain the FWL theorem in linear algebraic detail, or explain why it’s useful (basically, it’s a fundamental intuition about what multivariate regression does and what it means to “partial out” the effect of one regressor from another). If you want to learn more about that, plenty of good explanations are a quick search away.

The point of this post is to simply provide an animation of this theorem. I find that the explanations of this theorem are often couched in lots of linear algebra, and it may be hard for some people to understand what’s going on exactly. I hope this animation can help with that.

Our Data

import numpy as np
import pandas as pd
import statsmodels.api as sm

np.random.seed(42069)

df = pd.DataFrame({'x1': np.random.uniform(0, 10, size=50)})
df['x2'] = 4.9 + df['x1'] * 0.983 + 2.104 * np.random.normal(0, 1.35, size=50)
df['y'] = 8.643 - 2.34 * df['x1'] + 3.35 * df['x2'] + np.random.normal(0, 1.65, size=50)
df['const'] = 1

model = sm.OLS(
    endog=df['y'],
    exog=df[['const', 'x1', 'x2']]
).fit()

model.summary()

The output of the above:

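                            OLS Regression Results
==============================================================================
Dep. Variable:                     y   R-squared:                        0.977
Model:                           OLS   Adj. R-squared:                   0.976
Method:                Least Squares   F-statistic:                      997.5
Date:               Sat, 26 Dec 2020   Prob (F-statistic):            3.22e-39
Time:                       17:11:39   Log-Likelihood:                 -95.281
No. Observations:                 50   AIC:                              196.6
Df Residuals:                     47   BIC:                              202.3
Df Model:                          2
Covariance Type:           nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const          9.4673      0.546     17.337      0.000       8.369      10.566
x1            -2.2003      0.128    -17.213      0.000      -2.458      -1.943
x2             3.1931      0.081     39.647      0.000       3.031       3.355
==============================================================================
Omnibus:                       0.120   Durbin-Watson:                    1.914
Prob(Omnibus):                 0.942   Jarque-Bera (JB):                 0.279
Skew:                         -0.095   Prob(JB):                         0.870
Kurtosis:                      2.687   Cond. No.                          27.3
==============================================================================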

The Animation

Here is what happens when we regress y and x_2 each on x_1 to factor it out, and then run a univariate regression of the y residuals on the x_2 residuals.

(The animation takes a few seconds, so you might need to wait for it to restart to get the full effect.)

Notice that the slope in the final block ends up equaling 3.1931, which is the coefficient for x_2 in the multivariate regression.

Getting the coefficient for x_1 is more interesting. In the multivariate regression, the coefficient \beta_1 is negative even though x_1 is positively correlated with y. What gives? Well, the following animation helps to show where that comes from:

You can mostly see here what’s happening: after we take out the effect of x_2 on x_1, what we’re left with is a negative relationship between y and x_1. Put another way: there is a negative correlation between y and the residuals from the regression x_1 \sim x_2.
