Unbiased Estimates (DP IB Maths: AI HL)

Revision Note

Author: Dan

Expertise: Maths

Unbiased Estimates

What is an unbiased estimator of a population parameter?

  • An estimator is a random variable that is used to estimate a population parameter
    • An estimate is the value produced by the estimator when a sample is used
  • An estimator is called unbiased if its expected value is equal to the population parameter
    • An estimate from an unbiased estimator is called an unbiased estimate
    • This means that if many samples are taken, the mean of the resulting unbiased estimates will tend towards the population parameter
  • The sample mean is an unbiased estimate for the population mean
    • $\bar{x} = \dfrac{\sum x}{n}$
  • The sample variance is not an unbiased estimate for the population variance
    • $s_n^2 = \dfrac{\sum (x - \bar{x})^2}{n} = \dfrac{\sum x^2}{n} - \bar{x}^2$
    • On average the sample variance will underestimate the population variance
    • As the sample size increases the sample variance gets closer to the unbiased estimate
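
The two claims above (the sample mean is unbiased, while the sample variance with divisor n underestimates on average) can be illustrated with a small simulation. This is only a sketch: the normal population with mean 10 and standard deviation 2, the sample size of 5 and the number of trials are all assumptions chosen for illustration, not values from the note.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

MU = 10.0          # assumed population mean
SIGMA = 2.0        # assumed population standard deviation (variance 4)
N = 5              # assumed (small) sample size
TRIALS = 20_000    # number of independent samples drawn

sample_means = []
sample_variances = []
for _ in range(TRIALS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    sample_means.append(statistics.fmean(sample))
    # pvariance divides by n, so it matches s_n^2 in the note
    sample_variances.append(statistics.pvariance(sample))

avg_mean = statistics.fmean(sample_means)
avg_biased_var = statistics.fmean(sample_variances)

print(f"average of sample means:     {avg_mean:.3f} (population mean {MU})")
print(f"average of sample variances: {avg_biased_var:.3f} (population variance {SIGMA**2})")
```

With these assumed values the average of the sample means should land close to 10, while the average of the sample variances should land noticeably below 4, near (n − 1)/n × 4 = 3.2.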

What are the formulae for unbiased estimates of the mean and variance of a population?

  • A sample of n data values ($x_1, x_2, \ldots, x_n$) can be used to find unbiased estimates for the mean and variance of the population
  • An unbiased estimate for the mean μ of a population can be calculated using
    • $\bar{x} = \dfrac{\sum x}{n}$
  • An unbiased estimate for the variance σ² of a population can be calculated using
    • $s_{n-1}^2 = \dfrac{n}{n-1}\, s_n^2$
    • This is given in the formula booklet
    • This can also be written as $s_{n-1}^2 = \dfrac{\sum (x - \bar{x})^2}{n-1}$
      • Notice that dividing by $n$ gives a biased estimate but dividing by $n - 1$ gives an unbiased estimate
  • Different calculators can use different notations for $s_{n-1}^2$
    • $\sigma_{n-1}^2$, $s^2$ and $\hat{s}^2$ are notations you might see
    • You may also see the square roots of these
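
The formulae above can be sketched in a few lines of Python. The function name `unbiased_estimates` and the data values are made up for illustration; the cross-check relies on `statistics.variance` from the standard library, which uses the n − 1 divisor.

```python
import statistics


def unbiased_estimates(data):
    """Return (x_bar, s_{n-1}^2) for a list of sample values."""
    n = len(data)
    x_bar = sum(data) / n
    # Sample variance with divisor n (s_n^2 in the note)
    s_n_sq = sum((x - x_bar) ** 2 for x in data) / n
    # Formula-booklet conversion: s_{n-1}^2 = n / (n - 1) * s_n^2
    s_n1_sq = n / (n - 1) * s_n_sq
    return x_bar, s_n1_sq


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample, for illustration only
x_bar, s_n1_sq = unbiased_estimates(data)

print("unbiased estimate of the mean:    ", x_bar)
print("unbiased estimate of the variance:", s_n1_sq)
# Cross-check against the library routine that divides by n - 1
print("statistics.variance gives:        ", statistics.variance(data))
```

For this sample the sum of squared deviations is 32, so the divisor-n variance is 4 and the unbiased estimate is 32/7.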

Is $s_{n-1}$ an unbiased estimate for the standard deviation?

  • Unfortunately $s_{n-1}$ is not an unbiased estimate for the standard deviation of the population
  • It is better to work with the unbiased estimate of the variance rather than the standard deviation
  • There is not a formula for an unbiased estimate for the standard deviation that works for all populations
    • Therefore you will not be asked to find one in your exam

How do I show the sample mean is an unbiased estimate for the population mean?

  • You do not need to learn this proof
    • It is simply here to help with your understanding
  • Suppose the population of X has mean μ and variance σ²  
  • Take a sample of n observations
    • $X_1, X_2, \ldots, X_n$
    • $E(X_i) = \mu$
  • Using the formula for the expected value of a linear combination of random variables:

$$
\begin{aligned}
E(\bar{X}) &= E\left(\frac{X_1 + X_2 + \dots + X_n}{n}\right) \\
&= \frac{E(X_1) + E(X_2) + \dots + E(X_n)}{n} \\
&= \frac{\mu + \mu + \dots + \mu}{n} \\
&= \frac{n\mu}{n} \\
&= \mu
\end{aligned}
$$

  • As $E(\bar{X}) = \mu$ this shows the formula will produce an unbiased estimate for the population mean
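
The result can be checked exactly for a small case by listing every possible sample. In this sketch the population is assumed to be a single roll of a fair die and the sample size is 2; both are illustrative choices, not part of the note.

```python
from fractions import Fraction
from itertools import product

faces = [1, 2, 3, 4, 5, 6]   # assumed population: one roll of a fair die
n = 2                        # assumed sample size

# Population mean, kept as an exact fraction
mu = Fraction(sum(faces), len(faces))

# Every ordered sample of size n is equally likely
samples = list(product(faces, repeat=n))

# E(X-bar): average of the sample mean over all equally likely samples
expected_sample_mean = sum(Fraction(sum(s), n) for s in samples) / len(samples)

print("population mean:", mu)
print("E(sample mean): ", expected_sample_mean)
```

Both values come out as 7/2, matching the proof above for this particular population.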

Why is there a divisor of n-1 in the unbiased estimate for the variance?

  • You do not need to learn this proof
    • It is simply here to help with your understanding
  • Suppose the population of X has mean μ and variance σ²  
  • Take a sample of n observations
    • $X_1, X_2, \ldots, X_n$
    • $E(X_i) = \mu$
    • $\mathrm{Var}(X_i) = \sigma^2$
  • Using the formula for a linear combination of independent variables:

$$
\begin{aligned}
\mathrm{Var}(\bar{X}) &= \mathrm{Var}\left(\frac{X_1 + X_2 + \dots + X_n}{n}\right) \\
&= \frac{\mathrm{Var}(X_1) + \mathrm{Var}(X_2) + \dots + \mathrm{Var}(X_n)}{n^2} \\
&= \frac{\sigma^2 + \sigma^2 + \dots + \sigma^2}{n^2} \\
&= \frac{n\sigma^2}{n^2} \\
&= \frac{\sigma^2}{n}
\end{aligned}
$$

  • It can be shown that $E(\bar{X}^2) = \mu^2 + \dfrac{\sigma^2}{n}$
    • This comes from rearranging $\mathrm{Var}(\bar{X}) = E(\bar{X}^2) - \left[E(\bar{X})\right]^2$
  • It can be shown that $E(X^2) = E(X_i^2) = \mu^2 + \sigma^2$
    • This comes from rearranging $\mathrm{Var}(X) = E(X^2) - \left[E(X)\right]^2$
  • Using the formula for a linear combination of independent variables:

$$
\begin{aligned}
E(S_n^2) &= E\left(\frac{\sum X_i^2}{n} - \bar{X}^2\right) \\
&= \frac{\sum E(X_i^2)}{n} - E(\bar{X}^2) \\
&= \frac{\sum (\mu^2 + \sigma^2)}{n} - \left(\mu^2 + \frac{\sigma^2}{n}\right) \\
&= \frac{n(\mu^2 + \sigma^2)}{n} - \left(\mu^2 + \frac{\sigma^2}{n}\right) \\
&= \mu^2 + \sigma^2 - \left(\mu^2 + \frac{\sigma^2}{n}\right) \\
&= \sigma^2 - \frac{\sigma^2}{n} \\
&= \frac{n\sigma^2 - \sigma^2}{n} \\
&= \frac{n-1}{n}\,\sigma^2
\end{aligned}
$$

  • As $E(S_n^2) \neq \sigma^2$ this shows that the sample variance is not unbiased
    • You need to multiply by $\dfrac{n}{n-1}$
    • $E(S_{n-1}^2) = \sigma^2$
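
The factor of (n − 1)/n can be confirmed exactly for a small case by enumerating all samples. As a sketch, the population is assumed to be one roll of a fair die and the sample size is 3; the helper name `s_n_squared` is hypothetical.

```python
from fractions import Fraction
from itertools import product

faces = [1, 2, 3, 4, 5, 6]   # assumed population: one roll of a fair die
n = 3                        # assumed sample size

# Exact population mean and variance
mu = Fraction(sum(faces), len(faces))
sigma_sq = sum((x - mu) ** 2 for x in faces) / len(faces)


def s_n_squared(sample):
    """Sample variance with divisor n, computed exactly."""
    x_bar = Fraction(sum(sample), len(sample))
    return sum((x - x_bar) ** 2 for x in sample) / len(sample)


# Every ordered sample of size n is equally likely
samples = list(product(faces, repeat=n))

# E(S_n^2): average of s_n^2 over all equally likely samples
expected_s_n_sq = sum(s_n_squared(s) for s in samples) / len(samples)
# Multiplying by n / (n - 1) gives E(S_{n-1}^2)
expected_s_n1_sq = Fraction(n, n - 1) * expected_s_n_sq

print("population variance:", sigma_sq)
print("E(S_n^2):           ", expected_s_n_sq)
print("E(S_(n-1)^2):       ", expected_s_n1_sq)
```

Exact fractions are used instead of floats so that the comparison with (n − 1)/n × σ² is an equality rather than an approximation.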

Exam Tip

  • Check the wording of the exam question carefully to determine which of the following you are given:
    • The population variance: $\sigma^2$
    • The sample variance: $s_n^2$
    • An unbiased estimate for the population variance: $s_{n-1}^2$

Worked example

The times, X minutes, spent on daily revision of a random sample of 50 IB students from the UK are summarised as follows.

$n = 50$, $\sum x = 6174$, $s_n^2 = 1384.3$

Calculate unbiased estimates of the population mean and variance of the times spent on daily revision by IB students in the UK.

[Figure: worked example solution (image not available)]
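
A minimal sketch of the calculation the question asks for, using only the summary statistics given in the question and the two formulae from this note. The variable names are illustrative, and rounding of the final answers is left to the reader.

```python
n = 50            # sample size, from the question
sum_x = 6174      # sum of the observed times, from the question
s_n_sq = 1384.3   # sample variance (divisor n), from the question

# Unbiased estimate of the population mean: x-bar = (sum of x) / n
mean_estimate = sum_x / n

# Unbiased estimate of the population variance: s_{n-1}^2 = n / (n - 1) * s_n^2
variance_estimate = n / (n - 1) * s_n_sq

print(f"unbiased estimate of the mean:     {mean_estimate:.2f} minutes")
print(f"unbiased estimate of the variance: {variance_estimate:.2f} minutes^2")
```

Here the mean estimate is 6174 ÷ 50 = 123.48 and the variance estimate is (50/49) × 1384.3, which is approximately 1412.55.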

Author: Dan

Dan graduated from the University of Oxford with a First class degree in mathematics. As well as teaching maths for over 8 years, Dan has marked a range of exams for Edexcel, tutored students and taught A Level Accounting. Dan has a keen interest in statistics and probability and their real-life applications.