Statistics revision cards

Here are some statistics revision cards for the Edexcel Statistics 1 (S1) paper.

In a histogram, the total area under the bars is
PROPORTIONAL to the total frequency.

In a box plot, outliers are values more than
1.5 interquartile ranges above the upper quartile or
1.5 interquartile ranges below the lower quartile
Always refer to outliers or extreme values but not
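As a quick check of the box plot outlier rule, here is a minimal sketch in Python. The function names and the data set are made up for illustration, and the quartiles are supplied by hand rather than computed.

```python
def outlier_fences(lower_quartile, upper_quartile):
    """Return the (lower, upper) boundaries beyond which a value is an outlier."""
    iqr = upper_quartile - lower_quartile
    return lower_quartile - 1.5 * iqr, upper_quartile + 1.5 * iqr


def find_outliers(values, lower_quartile, upper_quartile):
    """List the values lying more than 1.5 IQR outside the quartiles."""
    low, high = outlier_fences(lower_quartile, upper_quartile)
    return [v for v in values if v < low or v > high]


# Hypothetical data set with lower quartile 15 and upper quartile 21.
data = [2, 14, 15, 17, 18, 20, 21, 23, 40]
print(outlier_fences(15, 21))       # (6.0, 30.0)
print(find_outliers(data, 15, 21))  # [2, 40]
```

Here the IQR is 6, so anything below 6 or above 30 counts as an outlier.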
The three properties of a normal distribution are
1. The curve is bell shaped
2. The curve is symmetrical about the mean
3. mean = mode = median

PMCC is not affected by linear coding.
So multiplying every value of x or y by the same positive number (or
indeed adding, subtracting or dividing) does not affect
the PMCC
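To see that linear coding leaves the PMCC unchanged, the sketch below computes it before and after coding. The data values and the particular codings are invented for illustration; `pmcc` is a hand-written helper using the usual Sxy / sqrt(Sxx x Syy) form.

```python
from math import sqrt


def pmcc(xs, ys):
    """Product moment correlation coefficient: Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical paired data.
x = [10, 20, 30, 40, 50]
y = [12, 25, 31, 48, 52]

# Example codings: (x - 10) / 10 and y - 12.
coded_x = [(v - 10) / 10 for v in x]
coded_y = [v - 12 for v in y]

r_raw = pmcc(x, y)
r_coded = pmcc(coded_x, coded_y)
print(abs(r_raw - r_coded) < 1e-9)  # True
```

The comparison uses a small tolerance because the two results can differ by floating point rounding.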
Positive skew: mode < median < mean
Median closer to lower quartile than upper quartile
Higher frequencies at lower values

Events are independent if they do not affect each
other's outcomes
P(A∩B) = P(A) x P(B)
Or you could say
P(A|B) = P(A) or
P(B|A) = P(B)
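The independence test can be tried out on a fair die. This is only a sketch: the events A, B and C are chosen for illustration, and `are_independent` is a made-up helper name.

```python
from fractions import Fraction


def are_independent(p_a, p_b, p_a_and_b):
    """Independent exactly when P(A and B) = P(A) x P(B)."""
    return p_a_and_b == p_a * p_b


# Fair die. A = even score {2, 4, 6}, B = score of 4 or less {1, 2, 3, 4}.
p_a = Fraction(3, 6)
p_b = Fraction(4, 6)
p_a_and_b = Fraction(2, 6)  # {2, 4}

print(are_independent(p_a, p_b, p_a_and_b))  # True
print(p_a_and_b / p_b == p_a)                # True, i.e. P(A|B) = P(A)

# C = score of 4 or more {4, 5, 6}, so A and C share {4, 6}.
p_c = Fraction(3, 6)
p_a_and_c = Fraction(2, 6)
print(are_independent(p_a, p_c, p_a_and_c))  # False
```

Using `Fraction` keeps the probabilities exact, so the equality test is safe.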
Negative skew: mean < median < mode
Median closer to upper quartile than lower quartile
Higher frequencies at higher values

When finding the lower quartile work out ¼ n
If this is a decimal use the first value after (e.g. 6.5
you would use 7th value)
If this isn't a decimal use the average of this value
and the one after (e.g. 7 you would average 7th
and 8th value)
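The position rule for the lower quartile can be sketched in Python as below. The function name and the data are invented for illustration, and the list is assumed to be sorted already.

```python
def value_at_fraction(sorted_values, fraction):
    """Apply the rule to position fraction * n in an ordered list.

    Decimal position: use the next value up.
    Whole position: average that value and the one after.
    """
    n = len(sorted_values)
    position = fraction * n
    if position != int(position):
        # e.g. position 6.5 -> 7th value, which is index 6 counting from 0
        return sorted_values[int(position)]
    k = int(position)
    # e.g. position 7 -> average of the 7th and 8th values
    return (sorted_values[k - 1] + sorted_values[k]) / 2


nine_values = [2, 14, 15, 17, 18, 20, 21, 23, 40]
eight_values = [1, 3, 5, 7, 9, 11, 13, 15]

print(value_at_fraction(nine_values, 0.25))   # 15  (position 2.25 -> 3rd value)
print(value_at_fraction(eight_values, 0.25))  # 4.0 (position 2 -> average of 2nd and 3rd)
```

Passing 0.5 instead of 0.25 applies the same rule at the ½ n position.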


A statistical model is
A statistical process to describe or
make predictions about the expected
behaviour of a real-world problem

Modelling: 7 steps! RDPECER
Step 1: Recognise a real-world problem
Step 2: Devise a statistical model
Step 3: Model used to make Predictions
Step 4: Experimental data is collected
Step 5: Comparisons made against devised model
Step 6: Evaluation: statistical concepts used to test
how well model describes real-world problem
Step 7: Refine model
Why use statistical models?
Use a histogram if the data is continuous, and if the
data is grouped unevenly.
Also can be written as
If a distribution is skewed then it has extreme values
Median/interquartile range better than mean/
standard deviation because they are not affected by
outliers. However the mean takes into account all
values!
Normal distribution not suitable for modelling skewed data

Put your answer in context
Positive correlation (not enough)
Positive linear correlation (better)
Strong positive linear correlation as temperature
It is not suitable to extrapolate (estimate
using a line of best fit outside the range of data
collected) as a linear relationship may not continue

An event is
Capital F just means the probability to the left of
and inclusive! It is the cumulative probability.
F(0.6) = P(X ≤ 0.6)
When finding the median work out ½ n
If this is a decimal use the first value after (e.g. 6.5 you
would use 7th value)
If this isn't a decimal use the average of this value and
the one after

Always remember that
F(LAST POSSIBLE VALUE) = 1
e.g. if F(x) = for x = 0,1,2,3 then a table
would be
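As a separate illustration of cumulative probability, the sketch below builds F(x) from a made-up probability distribution on x = 0, 1, 2, 3. The probabilities are invented and are not taken from the cards.

```python
from fractions import Fraction
from itertools import accumulate

# Hypothetical probabilities P(X = x), chosen so that they sum to 1.
pmf = {
    0: Fraction(1, 10),
    1: Fraction(2, 10),
    2: Fraction(3, 10),
    3: Fraction(4, 10),
}

values = sorted(pmf)
# F(x) = P(X <= x): running total of the probabilities up to and including x.
cumulative = dict(zip(values, accumulate(pmf[v] for v in values)))

for x in values:
    print(x, pmf[x], cumulative[x])

print(cumulative[3] == 1)                       # True, F(last possible value) = 1
print(cumulative[2] - cumulative[1] == pmf[2])  # True, P(X = 2) = F(2) - F(1)
```

Going the other way, each P(X = x) is the difference between neighbouring values of F.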
A discrete uniform distribution is a probability
distribution with the same probability of all outcomes
e.g. rolling a fair die
The assumption that all probabilities are equal is
Useful in theory: allows problems to be modelled
However it is not necessarily true in practice

If in a Venn diagram, you are given that
something has occurred, look only at the
Some features of a box plot are
1. Allows comparisons
2. Shows outliers
3. Indicates spread/range/IQR
4. Shows max/min/median/quartiles

To work out E(X²) just do the first bit of the variance
formula
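Assuming the expectation card refers to E(X²), the first term of Var(X) = E(X²) - (E(X))², here is a minimal sketch for a fair die. The helper name `expectation` is made up for illustration.

```python
from fractions import Fraction


def expectation(pmf, power=1):
    """Sum of x^power * P(X = x) over all outcomes."""
    return sum((x ** power) * p for x, p in pmf.items())


# Fair die: every score from 1 to 6 has probability 1/6.
die = {score: Fraction(1, 6) for score in range(1, 7)}

e_x = expectation(die)            # E(X)
e_x2 = expectation(die, power=2)  # E(X^2), the first bit of the variance formula
variance = e_x2 - e_x ** 2        # Var(X) = E(X^2) - (E(X))^2

print(e_x)       # 7/2
print(e_x2)      # 91/6
print(variance)  # 35/12
```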
