Regression Metrics — Study Notes

Letty Wu
2 min readJan 27, 2021

Residuals — we can plot the residuals and look for heteroscedasticity and outliers. A homoscedastic residual plot looks like a flat, even band around zero:

[Figure: example of a homoscedastic residual plot]
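A quick way to make such a plot yourself (a minimal sketch with numpy and matplotlib; the data is made up, and the noise is deliberately constant so the residuals come out homoscedastic):

```python
import numpy as np
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so this runs headless
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 100)
y = 3 * x + 5 + rng.normal(0, 1, 100)  # constant noise => homoscedastic

# Fit a simple line and compute residuals = actual - predicted
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (slope * x + intercept)

# Homoscedastic residuals form a flat band around zero with no funnel shape
plt.scatter(x, residuals)
plt.axhline(0, color="red")
plt.xlabel("x")
plt.ylabel("residual")
plt.savefig("residuals.png")
```

If the band fans out (wider spread for larger x or larger fitted values), that is the funnel shape of heteroscedasticity.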

Mean Absolute Error (MAE): the goal is to get MAE as close to 0 as possible:

  • Pros: Represents the average absolute distance from the predicted value; In the original units of Y; Not heavily affected by outliers.
  • Cons: Depends on scale of Y; Punishes all errors with the same “severity”.
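A minimal numpy sketch of MAE (the toy arrays are made up for illustration):

```python
import numpy as np

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

# MAE = mean of the absolute residuals, in the original units of y
mae = np.mean(np.abs(y_true - y_pred))
print(mae)  # 0.5
```

Because every error contributes linearly, a single large outlier moves MAE far less than it moves the squared-error metrics below.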

Sum of Squared Errors (SSE), also called the Residual Sum of Squares (RSS): the basis for several other loss/optimization functions.
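SSE is just the sum of squared residuals; MSE, RMSE, and R² below are all built from it. A sketch with the same kind of toy data:

```python
import numpy as np

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

# SSE / RSS = sum of squared residuals
sse = np.sum((y_true - y_pred) ** 2)
print(sse)  # 1.5
```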

Mean Squared Error (MSE): the goal is to get MSE as close to 0 as possible:

  • Pros: Very common; Represents the average squared distance from the predicted value; Punishes outliers severely; Coincides directly with the metric used to fit an OLS model.
  • Cons: Can be heavily affected by outliers; Not in the original units of Y; Depends on scale of Y; Uninterpretable to humans.
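MSE is the SSE divided by the number of observations. A minimal numpy sketch (toy data):

```python
import numpy as np

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

# MSE = SSE / n, i.e. the mean of the squared residuals
mse = np.mean((y_true - y_pred) ** 2)
print(mse)  # 0.375
```

Note the units: if y is in dollars, MSE is in dollars squared, which is why it is hard to interpret directly.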

Root Mean Squared Error (RMSE): the goal is to get RMSE as close to 0 as possible:

  • Pros: Common; Represents (approximately) average distance from the predicted value; In the original units of Y.
  • Cons: Can be heavily affected by outliers; Depends on scale of Y; Only somewhat interpretable.
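Taking the square root of MSE brings the metric back into the original units of y. A sketch with the same toy data:

```python
import numpy as np

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

# RMSE = sqrt(MSE), back in the original units of y
rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
print(rmse)  # ~0.612
```

The square root undoes the units problem of MSE, but the squaring inside still means outliers dominate the value.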

Coefficient of Determination, R²: the goal is to get R² as close to 1 as possible:

  • score=0: Model explains none of the variability of the response data around its mean.
  • score=1: Model explains all the variability of the response data around its mean.
  • Pros: Easy interpretation (An R² value of 0.8 means that 80% of the variability in y is explained by the x-variables in our model); Common metric; Does not depend on the scale of Y; Works with more than just linear regression.
  • Cons: As you add more variables, R² will never decrease (with linear regression, adjusted R² handles this better); The “% of variability explained” reading ONLY holds for linear regression — outside linear regression, values below 0 are possible.
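R² compares the model's SSE to the total sum of squares around the mean: R² = 1 − SSE/SST. A minimal numpy sketch (toy data; a constant model that always predicts the mean would score 0, and a model worse than the mean scores below 0):

```python
import numpy as np

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

# R² = 1 - SSE/SST: fraction of variance around the mean that the model explains
sse = np.sum((y_true - y_pred) ** 2)
sst = np.sum((y_true - y_true.mean()) ** 2)
r2 = 1 - sse / sst
print(r2)  # ~0.949
```

Because SST is fixed by the data, R² does not depend on the scale of y, which is why it is comparable across targets in different units.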
