A regression line is the line of best fit through a scatterplot. It gives you an equation, so you can predict y from any x. You won't draw it by hand: the calculator finds it. Your job is to read what it means and use it.
ŷ ("y-hat") = the predicted value. For a line ŷ = mx + c, the book gives the formulas the calculator uses: m = r·(sy/sx) and c = ȳ − m·x̄, where sx and sy are the standard deviations of x and y.
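If you want to see those two formulas doing the work, here is a minimal sketch. The data values and the helper names (correlation, regression_line) are made up for illustration and are not from the book; your calculator may compute the line differently internally but should give the same m and c.

```python
from statistics import mean, stdev

# Hypothetical data, invented for illustration only (not from the notes).
hours = [2, 4, 5, 7, 8, 10]          # x: study hours
scores = [48, 55, 62, 68, 75, 83]    # y: test score %

def correlation(xs, ys):
    """Pearson's r: sample covariance divided by the product of sample SDs."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return s_xy / (stdev(xs) * stdev(ys))

def regression_line(xs, ys):
    """Return (m, c) for y-hat = m*x + c, using m = r*(sy/sx) and c = y_bar - m*x_bar."""
    r = correlation(xs, ys)
    m = r * stdev(ys) / stdev(xs)
    c = mean(ys) - m * mean(xs)
    return m, c

m, c = regression_line(hours, scores)
print(f"y-hat = {m:.2f}x + {c:.2f}")
```

Because c = ȳ − m·x̄, the line always passes through the mean point (x̄, ȳ).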
Example: study hours (x) vs test score % (y) gives ŷ = 4.5x + 38, with data covering x = 2 to 10 hours.
Interpolation = predicting inside the data range → reliable. Extrapolation = predicting outside it → risky, because the pattern may not hold.
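The same check can be written as a short sketch. It uses the study-hours line and data range from the example above; the function name predict and the constant names are hypothetical, chosen just for this illustration.

```python
# Line and data range taken from the study-hours example in these notes.
GRADIENT, INTERCEPT = 4.5, 38
X_MIN, X_MAX = 2, 10

def predict(x):
    """Return (predicted score, 'interpolation' or 'extrapolation')."""
    y_hat = GRADIENT * x + INTERCEPT
    kind = "interpolation" if X_MIN <= x <= X_MAX else "extrapolation"
    return y_hat, kind

for hours in (5, 12):
    y_hat, kind = predict(hours)
    print(f"x = {hours}: y-hat = {y_hat} ({kind})")
```

The arithmetic is identical in both cases. What changes is how much you should trust the answer.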
r² (coefficient of determination) = just square r. "r² = 0.98" means 98% of the variation in y is explained by the linear relationship with x.
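To see why squaring r gives a "share of variation explained", here is a small sketch. The data is hypothetical (invented for illustration), and the variable names are my own. It compares r² with 1 − (leftover variation around the line) / (total variation in y); for a least-squares line the two should agree.

```python
from statistics import mean

# Hypothetical data, invented for illustration only.
hours = [2, 4, 5, 7, 8, 10]
scores = [48, 55, 62, 68, 75, 83]

x_bar, y_bar = mean(hours), mean(scores)
s_xx = sum((x - x_bar) ** 2 for x in hours)
s_yy = sum((y - y_bar) ** 2 for y in scores)   # total variation in y
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(hours, scores))

r = s_xy / (s_xx * s_yy) ** 0.5
r_squared = r ** 2

# Least-squares line, then the variation in y it leaves unexplained.
m = s_xy / s_xx
c = y_bar - m * x_bar
ss_res = sum((y - (m * x + c)) ** 2 for x, y in zip(hours, scores))
explained = 1 - ss_res / s_yy

print(f"r = {r:.4f}, r^2 = {r_squared:.4f}, explained share = {explained:.4f}")
```

Note that squaring removes the sign, so r² alone cannot tell you whether the relationship is positive or negative.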
• Always interpret in context: name the variables and units. "Gradient is 4.5" gets no marks on its own.
• Check whether x = 0 is realistic before trusting the intercept.
• Predicting far outside the data is extrapolation, so flag it as less reliable.
• r² is a percentage of variation explained, not the same as r.
The line sits as close as possible to all the points. The little orange gaps are residuals (actual minus predicted); the line minimises the total of their squares.
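A rough way to convince yourself of the "minimises the total of their squares" idea: compute that total for the least-squares line, then nudge the line and watch the total go up. The data and the function name sum_squared_residuals are hypothetical, made up for this sketch.

```python
# Hypothetical data, invented for illustration only.
hours = [2, 4, 5, 7, 8, 10]
scores = [48, 55, 62, 68, 75, 83]

def sum_squared_residuals(m, c):
    """Total of (actual - predicted)^2 for the line y-hat = m*x + c."""
    return sum((y - (m * x + c)) ** 2 for x, y in zip(hours, scores))

# Least-squares gradient and intercept for this data.
n = len(hours)
x_bar, y_bar = sum(hours) / n, sum(scores) / n
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(hours, scores))
s_xx = sum((x - x_bar) ** 2 for x in hours)
m = s_xy / s_xx
c = y_bar - m * x_bar

best = sum_squared_residuals(m, c)

# Nudge the gradient or intercept and compare the total with the best one.
nudges = [(0.5, 0), (-0.5, 0), (0, 2), (0, -2)]
for dm, dc in nudges:
    worse = sum_squared_residuals(m + dm, c + dc)
    print(f"nudge m by {dm}, c by {dc}: total is larger? {worse > best}")
```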
Predicting inside the green band is interpolation (reliable). Outside, in red, is extrapolation (less reliable).
Don't read the answer yet. Have a go in your head first:
Predict for x = 7: ŷ = 4.5(7) + 38 = ? + 38 = ?
Answer: ŷ = 4.5(7) + 38 = 31.5 + 38 = 69.5. (x = 7 is inside 2 to 10, so it's reliable interpolation.)
Using ŷ = 4.5x + 38, how many study hours are predicted to give a score of 80? Solve for x. Check below.
The data covers x = 2 to 10 hours. A teacher uses ŷ = 4.5x + 38 to predict the score for someone studying 15 hours. Should they trust it?
In one sentence, out loud: what do the gradient and intercept each tell you in context? If you can say it, you've got it.