Statistics (S1) - How Tos

How To revision cards on S1 Statistics including formulae, step-by-step instructions and examples.

Population and Sample Standard Deviation

Population

σ = √( ∑(x - µ)^2 / n )

• Use the population standard deviation if the question specifies that the data is the entire population
• Use the population standard deviation if all data is present (e.g. number of passengers on a bus each day of the week: 102, 353, 401, 485, 209, 314, 175)

Sample

s = √( ∑(x - x̅)^2 / (n - 1) )

• Use the sample standard deviation when the question specifies that the data is a sample
• It estimates the population standard deviation based on a sample
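
As a rough check of the two formulae, the sketch below applies both to the passenger data from the card using Python's standard `statistics` module. The variable names are illustrative only, and the sample version is shown purely for comparison, since the card treats this data as a whole population.

```python
import math
import statistics

# Daily passenger counts from the card (the whole week, so a full population).
passengers = [102, 353, 401, 485, 209, 314, 175]

# Population standard deviation: divide the squared deviations by n.
sigma = statistics.pstdev(passengers)

# Sample standard deviation: divide by n - 1 instead.
# Shown only for comparison; the card treats this data as a population.
s = statistics.stdev(passengers)

# The same population value written out from the formula.
n = len(passengers)
mean = sum(passengers) / n
sigma_by_hand = math.sqrt(sum((x - mean) ** 2 for x in passengers) / n)

print(f"population sd = {sigma:.2f}, sample sd = {s:.2f}")
```

Dividing by n - 1 always gives a slightly larger value than dividing by n, which is why the sample figure comes out above the population figure.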

Probability: Union and Intersection

• Event = set of outcomes
• Sample space = set of all possible outcomes
• Exhaustive events = a set of events that together cover all possible outcomes of the sample space
• Mutually exclusive events = 2 events that cannot occur at the same time
• Independent events = the occurrence of one event does not affect the probability of the others
• Complements = an event and the opposite of that event (e.g. A and A')
• Conditional probability = 'given', e.g. P(B|A) is the probability of B given A. A and B are independent if P(A|B) = P(A)

Union = 'or' (A or B, or both)

• Addition Law = P(A∪B) = P(A) + P(B) - P(A∩B)
• Mutually Exclusive Law = P(A∪B) = P(A) + P(B)

Intersection = 'overlap', 'and'

• Multiplication Law = P(A∩B)  = P(A) x P(B|A)
• Independent Law = P(A∩B) = P(A) x P(B)
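
The laws above can be checked on a small example. The sketch below uses a hypothetical fair six-sided die (not from the cards) with two illustrative events, and exact fractions so that the equalities hold without rounding.

```python
from fractions import Fraction

# Hypothetical example: one roll of a fair six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # event A: the roll is even
B = {4, 5, 6}   # event B: the roll is greater than 3

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))

# Addition law: P(A or B) = P(A) + P(B) - P(A and B)
p_union = prob(A | B)
addition_law = prob(A) + prob(B) - prob(A & B)

# Multiplication law: P(A and B) = P(A) x P(B|A)
p_b_given_a = Fraction(len(A & B), len(A))
multiplication_law = prob(A) * p_b_given_a

# A and B are independent only if P(A and B) = P(A) x P(B)
independent = prob(A & B) == prob(A) * prob(B)

print(p_union, addition_law, multiplication_law, independent)
```

Here the two events overlap on 4 and 6, so they are neither mutually exclusive nor independent, and the simpler 'mutually exclusive' and 'independent' laws would give the wrong answer.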

Standard Deviation Summary

• Standard deviation is a much more precise measure of spread than just the range or the interquartile range.
• For S1, there are 5 (unfortunately!) formulae that you need to be able to recognise. They will be on your formula sheet, and each is for a different type of data
• The standard deviation measures how far, on average, each piece of data lies from the mean

Symbols

σ = Sigma = population standard deviation

∑x = total of data

x̅ = sample mean

µ = population mean

n = number of pieces of data = ∑ƒ

s = sample standard deviation


Binomial Formula and Cumulative Binomial Tables

P(X = r) = (n r) x p^r x (1-p)^(n-r), where (n r) = n! ÷ ((n-r)! x r!)

Example

20 friends: what is the probability that exactly 5 will go swimming next week?     p = 1/7     n = 20     r = 5

(20 5) x (1/7)^5 x (6/7)^15 = 0.0914 (3sf)
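
The worked example can be reproduced in a few lines. This is a minimal sketch using `math.comb` for the (n r) term; the variable names are illustrative.

```python
from math import comb

n, r = 20, 5   # 20 friends, exactly 5 go swimming
p = 1 / 7      # probability that any one friend goes

# Binomial formula: (n r) x p^r x (1-p)^(n-r)
prob_exactly_5 = comb(n, r) * p**r * (1 - p) ** (n - r)

print(round(prob_exactly_5, 4))
```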

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Cumulative Binomial Tables

• This is the same concept, but instead of =, the probability uses <, >, ≤ or ≥
• Use the formula above for each value of X in the range, then add the results together

Example

p= 0.25  n= 8   X~B(8, 0.25)

P(X<3)  =  P(X≤2)  =  P(X=0) + P(X=1) + P(X=2)
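
Adding the individual terms can be sketched as follows for X ~ B(8, 0.25). The helper name `binomial_pmf` is made up for this illustration.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p), from the binomial formula."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 8, 0.25

# P(X < 3) = P(X=0) + P(X=1) + P(X=2)
p_less_than_3 = sum(binomial_pmf(x, n, p) for x in range(3))

print(round(p_less_than_3, 4))
```

The result should agree with the cumulative table entry for n = 8, p = 0.25, x = 2.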


Binomial Distribution Summary

X = number of successes = the variable; each trial is 'pass or fail'

n = number of trials (number of things), which is fixed; r = number of successes chosen from the n trials

p = probability of success

q = probability of failure (1-p)

X ~ B(n, p) = binomial probability           Bi = 2 outcomes

(n r) = nCr

Note: use either the formula or the statistics tables (each table is labelled n = [  ]). Either way, draw a number line! No graphs are drawn.

When using the tables, you may be required to do 1 - P(X ≤ x) if you are trying to find P(X > x)

or P(X ≤ x) - P(X ≤ x-1) when trying to find an exact value, P(X = x)

or P(X ≤ b) - P(X ≤ a) to find a range, P(a < X ≤ b)
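
The three table tricks can be sketched with a small cumulative function standing in for the printed table. The function name `binomial_cdf` and the choice of n = 8, p = 0.25 are assumptions made for illustration.

```python
from math import comb

def binomial_cdf(x, n, p):
    """P(X <= x) for X ~ B(n, p), i.e. the value a cumulative table lists."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))

n, p = 8, 0.25  # illustrative distribution

# 'More than': P(X > 2) = 1 - P(X <= 2)
p_more_than_2 = 1 - binomial_cdf(2, n, p)

# Exact value: P(X = 3) = P(X <= 3) - P(X <= 2)
p_exactly_3 = binomial_cdf(3, n, p) - binomial_cdf(2, n, p)

# Range: P(3 <= X <= 5) = P(X <= 5) - P(X <= 2)
p_3_to_5 = binomial_cdf(5, n, p) - binomial_cdf(2, n, p)

print(round(p_more_than_2, 4), round(p_exactly_3, 4), round(p_3_to_5, 4))
```

Marking the wanted values on a number line first makes it easier to see which table entry to subtract.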


Mean and Variance of Binomial Distribution

Mean

= µ

=  np

=  (number of trials x probability)

Variance

=  σ^2

=  np(1-p)

=  (number of trials x probability of success) x (1 - probability of success)
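
As a check that np and np(1-p) really are the mean and variance, the sketch below compares them with the values worked out directly from the probabilities. The choice n = 20, p = 1/7 is an illustrative assumption, and exact fractions are used so the comparison is exact.

```python
from fractions import Fraction
from math import comb

n, p = 20, Fraction(1, 7)   # illustrative values

# Shortcut formulae from the card.
mean = n * p
variance = n * p * (1 - p)

# The same quantities from the definitions, using every P(X = x).
pmf = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]
mean_from_definition = sum(x * px for x, px in enumerate(pmf))
variance_from_definition = (
    sum(x**2 * px for x, px in enumerate(pmf)) - mean_from_definition**2
)

print(mean, variance)
```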


Standardising a Normal Variable

z = (x - µ) / σ

x = µ + zσ

Example

mean = 50

x = 51

σ^2 = 25 (5^2), so σ = 5

Therefore, z = (51 - 50) / 5 = 1/5 = 0.2

P(Z<0.2) = 0.579 (3sf)
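
The example can be reproduced with `statistics.NormalDist` from the Python standard library (Python 3.8 or later), which plays the role of the table here. Variable names are illustrative.

```python
from statistics import NormalDist

mu = 50
variance = 25
sigma = variance ** 0.5   # standard deviation is the square root of the variance
x = 51

# Standardise: z = (x - mu) / sigma
z = (x - mu) / sigma

# P(Z < z) from the standard normal distribution N(0, 1).
p = NormalDist().cdf(z)

# Same probability without standardising, using N(50, 5^2) directly.
p_direct = NormalDist(mu, sigma).cdf(x)

print(z, round(p, 3))
```

Both routes give the same probability, which is the point of standardising: one table for Z covers every normal distribution.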


Normal Distribution Types (1)

Finding a probability using table 3, given x

z = (x - µ) / σ

Finding an x value using table 4, given a probability

x = µ + zσ

If the probability is less than 0.5, use the opposite side (e.g. if you need to find the lowest 5%, look up 95% and make z negative) because p in Table 4 cannot be less than 0.5

Setting up 2 equations with unknown µ and σ, then solving them simultaneously

Given 2 x values: write x = µ + zσ for each one, or standardise both to find the probability between the 2 x values
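
The simultaneous-equations type can be sketched as below. The question is hypothetical (the numbers 20, 50, 0.1 and 0.9 are not from the cards), and `NormalDist().inv_cdf` stands in for the Table 4 lookup.

```python
from statistics import NormalDist

# Hypothetical question (numbers are NOT from the cards):
# P(X < 20) = 0.1 and P(X < 50) = 0.9 for X ~ N(mu, sigma^2).
x1, p1 = 20, 0.1
x2, p2 = 50, 0.9

# Table 4 step: the z value with P(Z <= z) = p.
z1 = NormalDist().inv_cdf(p1)   # negative, because p1 < 0.5
z2 = NormalDist().inv_cdf(p2)

# Two equations:  x1 = mu + z1*sigma   and   x2 = mu + z2*sigma
# Subtracting them eliminates mu.
sigma = (x2 - x1) / (z2 - z1)
mu = x1 - z1 * sigma

print(f"mu = {mu:.2f}, sigma = {sigma:.2f}")
```

Subtracting the equations is the same elimination step you would do by hand; substituting σ back into either equation then gives µ.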


Normal Distribution Types (2)

Standard deviations away from the mean (Modelling Normal Distribution)

• 68% area lies in the ranges of   µ ± σ   (1 standard deviation)
• 95.4% area lies in the ranges of   µ ± 2σ   (2 standard deviations)
• 99.7% area lies in the ranges of   µ ± 3σ   (3 standard deviations)
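
These areas can be checked numerically. The sketch below works on the standard normal distribution, which is enough because standardising turns µ ± kσ into z = ±k for any normal distribution.

```python
from statistics import NormalDist

Z = NormalDist()   # standard normal, mean 0 and standard deviation 1

# Area within k standard deviations of the mean: P(-k < Z < k)
areas = {k: Z.cdf(k) - Z.cdf(-k) for k in (1, 2, 3)}

for k, area in areas.items():
    print(f"within {k} sd: {area:.3f}")
```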

Finding the sample mean

sample size n

x̅ is normally distributed with mean µ and standard deviation σ/√n

z = (x̅ - µ) / (σ/√n)

x̅ = µ + z(σ/√n)


Normal Distribution Summary

• bell-shaped
• draw graphs!
• median = mode = mean = symmetrical
• total area = 1
• high population density close to the mean
• X ~ N(µ, σ^2)

X is normally distributed with µ (mean) and σ^2 (variance)

Standard Normal Distribution

Mean = µ = 0

Standard Deviation = 1

• Z ~ N(0, 1^2)

Z is normally distributed with µ (= 0) and σ^2 (= 1)


Normal Distribution Tables

For this, we are using Table 3 and Table 4

Table 3 is to find the probability that Z (normally distributed with mean = 0 and variance = 1) is less than or equal to z, i.e. P(Z≤z)

Example

z = 1.36 (mark it on a graph): P(Z≤1.36) = 0.913 (3sf) on Table 3

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Table 4 is to find the value of z that satisfies P(Z≤z) = p

Example

p = 0.9

P(Z≤z) = 0.9

z = 1.2816 on Table 4
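
Both table lookups have a direct equivalent in `statistics.NormalDist` (Python 3.8 or later): `cdf` goes from z to a probability like Table 3, and `inv_cdf` goes from a probability back to z like Table 4. A minimal sketch reproducing the two examples:

```python
from statistics import NormalDist

Z = NormalDist()   # standard normal distribution, N(0, 1)

# Table 3 direction: given z, find P(Z <= z).
p_from_z = Z.cdf(1.36)

# Table 4 direction: given p, find z such that P(Z <= z) = p.
z_from_p = Z.inv_cdf(0.9)

print(round(p_from_z, 3), round(z_from_p, 4))
```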


The Central Limit Theorem

x̅ ~N(µ, (σ/√n)^2)

z = (x̅ - µ) / (σ/√n)

x̅ = µ + z(σ/√n)

Example

The weights of pebbles are distributed with mean 48.6g and standard deviation 8.5g. A random sample of 50 is taken.

P(x̅<49.0g)    n= 50

z  =  (49.0 - 48.6) / (8.5/√50)  =  0.33 (2dp)

P(x̅<49.0g)  =  P(Z<0.33)  =  0.629 (3sf)
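
The pebble example can be reproduced as below. This is a sketch with illustrative variable names; it shows both the table-style answer (z rounded to 2 decimal places first) and the unrounded value, which differ slightly in the third significant figure.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 48.6, 8.5, 50   # population mean, population sd, sample size
x_bar = 49.0                   # sample mean of interest

# Standard deviation of the sample mean: sigma / sqrt(n)
standard_error = sigma / sqrt(n)

# Standardise the sample mean.
z = (x_bar - mu) / standard_error

# Table-style answer: round z to 2 decimal places before looking it up.
z_rounded = round(z, 2)
p_table = NormalDist().cdf(z_rounded)

# Answer without rounding z first.
p_exact = NormalDist().cdf(z)

print(z_rounded, round(p_table, 3), round(p_exact, 4))
```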
