Hypothesis Testing for Correlation

What is a hypothesis test for correlation?

You can use a t-test to test whether there is a linear correlation between two normally distributed variables

If specifically testing for positive (or negative) linear correlation then a one-tailed test is used
If testing for any linear correlation then a two-tailed test is used

A sample will be taken and the raw data will be given

You might be asked to calculate the PMCC (Pearson's product-moment correlation coefficient)
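
For reference, the PMCC of a sample of $n$ data pairs can be written as shown here. The symbols $x_i$, $y_i$, $\bar{x}$ and $\bar{y}$ (the data values and their sample means) are notation assumed for this sketch rather than taken from the note, and in the exam the GDC would normally carry out this calculation for you.

$$
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}
$$

The value of $r$ always lies between $-1$ and $1$, with values close to $0$ indicating little or no linear correlation in the sample.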

What are the steps for a hypothesis test for correlation?

STEP 1: Write the hypotheses
- H₀: ρ = 0
  - Clearly state that ρ represents the population correlation coefficient between the two variables
  - In words this means there is no linear correlation between the two variables
- H₁: ρ < 0, H₁: ρ > 0 or H₁: ρ ≠ 0

STEP 2: Calculate the p-value or the PMCC

Choose the t-test for linear regression on your GDC
Enter the data as two lists into your GDC
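
To see roughly what the GDC is doing in this step, here is a minimal Python sketch, assuming the standard test statistic t = r√((n − 2)/(1 − r²)) with n − 2 degrees of freedom. The function names `pmcc` and `t_statistic` and the small data set are hypothetical, chosen only for illustration.

```python
import math

def pmcc(xs, ys):
    """Sample correlation coefficient r for two paired lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

def t_statistic(r, n):
    """Test statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Hypothetical illustration data (not from the revision note)
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]

r = pmcc(xs, ys)
t = t_statistic(r, len(xs))
print(f"r = {r:.3f}, t = {t:.3f}")
```

The GDC then converts the t-value into a p-value using the t-distribution, which is the number you compare with the significance level in STEP 3.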

STEP 3: Decide whether there is evidence to reject the null hypothesis

If the p-value < significance level then reject H₀
If the absolute value of the PMCC is bigger than the absolute value of the critical value then reject H₀

If you are expected to use the PMCC you will be given the critical value in the exam
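
The two decision rules above can be sketched as a pair of small Python functions. The function names and the example numbers are hypothetical and serve only to illustrate the comparisons.

```python
def reject_h0_by_p_value(p_value, significance_level):
    """Reject H0 when the p-value is below the significance level."""
    return p_value < significance_level

def reject_h0_by_critical_value(r, critical_value):
    """Reject H0 when |r| exceeds |critical value|."""
    return abs(r) > abs(critical_value)

# Hypothetical values for illustration only
print(reject_h0_by_p_value(0.03, 0.05))         # True: 0.03 < 0.05
print(reject_h0_by_critical_value(0.45, 0.60))  # False: 0.45 is not greater than 0.60
```

Both rules should lead to the same decision when the critical value corresponds to the same significance level and number of tails as the p-value.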

STEP 4: Write your conclusion

If you reject H₀ then there is evidence to suggest that...

There is a negative linear correlation between the two variables (for H₁: ρ < 0)
There is a positive linear correlation between the two variables (for H₁: ρ > 0)
There is a linear correlation between the two variables (for H₁: ρ ≠ 0)

If you do not reject H₀ then there is insufficient evidence to suggest that...

There is a negative linear correlation between the two variables (for H₁: ρ < 0)
There is a positive linear correlation between the two variables (for H₁: ρ > 0)
There is a linear correlation between the two variables (for H₁: ρ ≠ 0)

Worked example

Jessica wants to test whether there is any linear correlation between the distance she runs in a day, $d$ km, and the amount of sleep she has the night after her run, $t$ hours. Over the period of a month she takes a random sample of 9 days and records the results in the table.

Distance ( $d$ km)	1.2	2.3	1.5	1.3	2.5	1.8	1.9	2.0	1.1
Sleep ( $t$ hours)	7.9	8.1	7.6	7.3	8.1	8.4	7.8	7.9	6.8

Write down null and alternative hypotheses that Jessica can use for her test.

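H₀: ρ = 0
H₁: ρ ≠ 0
where ρ is the population correlation coefficient between $d$ and $t$
The test is two-tailed because Jessica is testing for any linear correlation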

Perform the hypothesis test for linear correlation using a 5% significance level. Clearly state your conclusion.

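Enter the two lists into the GDC and run a two-tailed t-test for linear regression
PMCC: r ≈ 0.693
p-value ≈ 0.038
0.038 < 0.05 so reject H₀
There is sufficient evidence at the 5% significance level to suggest that there is a linear correlation between the distance Jessica runs and the amount of sleep she has the night after her run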
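
The test in part (b) can also be reproduced without a GDC. The following Python sketch uses Jessica's data from the table and assumes the usual statistic t = r√((n − 2)/(1 − r²)) with n − 2 degrees of freedom. The helper names `t_pdf` and `two_tailed_p_value` are hypothetical, and the p-value is approximated by numerical integration (Simpson's rule) rather than by whatever method a particular GDC uses.

```python
import math

distance = [1.2, 2.3, 1.5, 1.3, 2.5, 1.8, 1.9, 2.0, 1.1]  # d, in km
sleep = [7.9, 8.1, 7.6, 7.3, 8.1, 8.4, 7.8, 7.9, 6.8]     # t, in hours
n = len(distance)

# PMCC of the sample
mean_d = sum(distance) / n
mean_s = sum(sleep) / n
s_ds = sum((d - mean_d) * (s - mean_s) for d, s in zip(distance, sleep))
s_dd = sum((d - mean_d) ** 2 for d in distance)
s_ss = sum((s - mean_s) ** 2 for s in sleep)
r = s_ds / math.sqrt(s_dd * s_ss)

# Test statistic under H0: rho = 0
df = n - 2
t_stat = r * math.sqrt(df / (1 - r ** 2))

def t_pdf(x, df):
    """Probability density of the t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p_value(t_value, df, steps=10000):
    """Approximate P(|T| > |t_value|) using Simpson's rule on [0, |t_value|]."""
    a = abs(t_value)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3  # area between -|t| and |t|
    return 1 - central_area

p_value = two_tailed_p_value(t_stat, df)
reject_h0 = p_value < 0.05  # 5% significance level

print(f"r = {r:.3f}, t = {t_stat:.3f}, p-value = {p_value:.4f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

Running this gives a PMCC of roughly 0.69 and a p-value below 0.05, so H₀ is rejected and there is evidence of a linear correlation between distance run and hours of sleep.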

Hypothesis Testing for Correlation (DP IB Maths: AI HL)

Revision Note