Correlation describes the relationship between two variables plotted on a scatter graph.
Types of correlation:
- Positive correlation: as one variable increases, so does the other — points slope upward left to right
- Negative correlation: as one variable increases, the other decreases — points slope downward
- No correlation: no visible pattern — points scattered randomly
Strength of correlation:
- Strong: points close to a straight line
- Weak: points loosely scattered around a trend
- Perfect: all points lie exactly on a line (rare in real data)
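The correlation coefficient $r$ (a value between $-1$ and $1$) summarises both direction and strength:
$$\text{Positive: } r > 0, \quad \text{Negative: } r < 0, \quad \text{No correlation: } r \approx 0$$
The closer $|r|$ is to 1, the stronger the linear correlation; $r = \pm 1$ means perfect correlation.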
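As a rough illustration of how the sign of $r$ matches the direction of the trend, the sketch below computes $r$ from paired data, assuming $r$ here means Pearson's product-moment correlation coefficient. The function name `pearson_r` and the data sets are made up for this example.

```python
import math

def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for paired data."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, invented for illustration only
hours = [1, 2, 3, 4, 5]
rising = [2, 4, 5, 4, 5]     # tends to increase with hours
falling = [10, 8, 6, 4, 2]   # decreases exactly linearly with hours

r_pos = pearson_r(hours, rising)
r_neg = pearson_r(hours, falling)
print(f"rising:  r = {r_pos:.2f}")
print(f"falling: r = {r_neg:.2f}")
```

The first data set gives a positive $r$ below 1 (an upward trend with some scatter), while the second lies exactly on a downward line and so gives $r = -1$.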
Important: correlation does not imply causation — two variables may be correlated due to a third (confounding) variable.
The line of best fit passes through the mean point $(\bar{x}, \bar{y})$. When calculated by least squares, it minimises the sum of the squared vertical distances of the points from the line.
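One common way to make "line of best fit" precise is least-squares regression; the notes do not name a method, so treat that choice as an assumption. The sketch below fits such a line to made-up data and checks that it passes through the mean point. The function name `least_squares_line` and the data are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (gradient, intercept) of the least-squares line y = m*x + c."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are equal; gradient is undefined")
    gradient = sxy / sxx
    # Choosing the intercept this way forces the line through the mean point
    intercept = mean_y - gradient * mean_x
    return gradient, intercept

# Hypothetical data, invented for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

m, c = least_squares_line(xs, ys)
mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)

print(f"line: y = {m:.2f}x + {c:.2f}")
print(f"mean point: ({mean_x}, {mean_y})")
print(f"line value at mean x: {m * mean_x + c:.2f}")
```

When drawing a line of best fit by eye, the same check applies: plot the mean point first, then pivot the ruler about it until the points are balanced on either side.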
Common error: confusing correlation with causation, or drawing a line of best fit that does not pass through the mean point.
