Once you have a regression line, how do you know it actually fits? A residual is the gap between an actual data point and what the line predicted: how wrong the model was for that point.
Positive residual: the point is above the line, so the model under-predicted (e.g. predicted 70, scored 75 → +5).
Negative residual: the point is below the line, so the model over-predicted (e.g. predicted 80, scored 74 → −6).
For a least-squares line, the residuals always sum to 0 (up to rounding): the over- and under-predictions balance out. Handy as a check.
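The zero-sum property can be checked numerically. The sketch below fits a least-squares line to a small made-up data set (the hours and scores are hypothetical, not taken from the example in these notes) and adds up the residuals. The helper name `fit_line` is just for illustration.

```python
# Hypothetical data: hours studied and test scores (made up for illustration).
hours = [2, 4, 5, 7, 9]
scores = [48, 55, 62, 68, 80]

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(hours, scores)

# Residual = actual - predicted, one per data point.
residuals = [y - (slope * x + intercept) for x, y in zip(hours, scores)]

# The total should be 0, apart from tiny floating-point rounding.
total = sum(residuals)
print(f"sum of residuals = {total:.10f}")
```

If the total is clearly not 0, the slope or intercept has probably been miscalculated.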
Plot the residuals against x, with a zero line through the middle:
Random scatter around zero → a linear model is appropriate. ✓
A clear pattern (curve, fan shape) → the relationship is non-linear, so a straight line isn't the best model.
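To see what a "clear pattern" looks like in numbers, here is a minimal sketch using made-up points that lie on a curve (y = x²). For these five points the least-squares line works out to ŷ = 4x − 2. The function name `predict` is just for illustration.

```python
# Made-up points on the curve y = x**2 (hypothetical, for illustration only).
xs = [0, 1, 2, 3, 4]
ys = [x ** 2 for x in xs]

def predict(x):
    """Least-squares line for these five points: y-hat = 4x - 2."""
    return 4 * x - 2

# Residual = actual - predicted.
residuals = [y - predict(x) for x, y in zip(xs, ys)]
signs = ["+" if r > 0 else "-" if r < 0 else "0" for r in residuals]

print(residuals)       # [2, -1, -2, -1, 2]
print(signs)           # ['+', '-', '-', '-', '+']
print(sum(residuals))  # 0
```

The residuals still sum to 0, but they run positive, negative, positive from left to right: a U shape rather than a random scatter. That pattern, not the sum, is what shows the straight line is a poor fit.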
Using ŷ = 4.5x + 38, find Cal's residual. Cal studied x = 6 hours and scored 67.
• Residual = actual − predicted, in that order (not predicted − actual).
• Read the residual plot's shape, not the original scatter, to judge the fit.
• A sum of residuals near 0 is a sign your line has been calculated correctly.
• A point far from the zero line is a possible outlier, so mention it.
Left: residuals scattered randomly around the zero line, so the straight-line model is fine. Right: residuals form a clear curve, so the data is really non-linear.
The residual is the vertical gap from the actual point to the line: actual minus predicted.
Don't read ahead yet; just have a go in your head:
Cal: x = 6, scored 67. Predicted = 4.5(6) + 38 = 65. Residual = 67 − 65 = +2.
Brooke: x = 4, scored 54. Predicted = 4.5(4) + 38 = ?. Residual = 54 − ? = ?
Dana: x = 8, scored 72. Find Dana's residual, and say if Dana is above or below the line. Check below.
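If you would rather check your arithmetic with code, the sketch below wraps the line ŷ = 4.5x + 38 in two small functions (the names `predict` and `residual` are just for illustration). It reproduces Cal's residual; swap in Brooke's or Dana's numbers to check your own answers.

```python
def predict(hours):
    """Predicted score from the regression line y-hat = 4.5x + 38."""
    return 4.5 * hours + 38

def residual(hours, actual_score):
    """Residual = actual - predicted, in that order."""
    return actual_score - predict(hours)

# Cal studied 6 hours and scored 67.
cal = residual(6, 67)
print(cal)  # 2.0, positive, so Cal is above the line
```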
A residual plot shows the points forming a clear downward-then-upward curve. What does this tell you about the model?
In one sentence, out loud: what does a residual plot tell you that the correlation alone doesn't?