# Statistics

- Created by: Lucie
- Created on: 16-03-13 12:47

## Mean, Median and Modes

*Mode = the most frequently occurring value (also known as the modal value).*

*Median = the central value when arranged in ascending order*

*Mean = the average value (the sum of all the x values divided by n).*

*If the data consists of the whole population, the mean is written* **μ**.

*Otherwise, if the data is just a sample, the mean is written* **x̄**.
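As a quick check of the three definitions above, here is a minimal sketch using Python's standard `statistics` module. The data list is made up purely for illustration.

```python
import statistics

# Hypothetical sample of eight values, invented for illustration.
data = [2, 3, 3, 5, 7, 8, 3, 9]

mode = statistics.mode(data)      # most frequently occurring value
median = statistics.median(data)  # central value of the sorted data
mean = statistics.mean(data)      # sum of the values divided by n

print(mode, median, mean)
```

With an even number of values, the median is taken as halfway between the two central values.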

*In grouped data, the mode is given as the modal class (the class with the highest frequency, when class widths are equal).*

*To estimate the mean, assume that every value in a class sits at the mid-mark of that class, e.g. for the class 100-150 the mid-mark is 125.*

*To estimate the median, work out n/2, then count up through the cumulative frequencies to find the class in which that value lies; the median is estimated from within that class (it may be a decimal).*
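The grouped-data estimates can be sketched as follows. The frequency table is hypothetical, the classes are assumed to have equal widths, and the median step assumes linear interpolation inside the median class, which is one common way of "counting up" to a decimal answer.

```python
# Hypothetical grouped frequency table: (lower bound, upper bound, frequency).
classes = [(100, 150, 4), (150, 200, 10), (200, 250, 6)]

n = sum(f for _, _, f in classes)

# Estimated mean: treat every value in a class as sitting at the mid-mark.
estimated_mean = sum(((lo + hi) / 2) * f for lo, hi, f in classes) / n

# Modal class: the class with the largest frequency (equal widths assumed).
modal_class = max(classes, key=lambda c: c[2])[:2]

# Estimated median: find the class containing the (n/2)th value, then
# interpolate linearly within it (an assumption, see the lead-in).
target = n / 2
cumulative = 0
for lo, hi, f in classes:
    if cumulative + f >= target:
        estimated_median = lo + (target - cumulative) / f * (hi - lo)
        break
    cumulative += f

print(estimated_mean, modal_class, estimated_median)
```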

## Interquartile Range and Standard Deviation

The median is the ((n+1)/2)th value when the data is arranged in ascending order.

The lower quartile (Q1) is the median of the lower half of the data; the upper quartile (Q3) is the median of the upper half of the data.

The interquartile range is Q3 - Q1 and covers the middle 50% of the data.
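A minimal sketch of the median-of-halves method for the quartiles. The function name `quartiles` and the data are hypothetical, and leaving the middle value out of both halves when n is odd is an assumption (conventions differ between textbooks and calculators).

```python
import statistics

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]           # values below the median position
    upper = ordered[half + n % 2:]   # values above it (middle value skipped if n is odd)
    return (statistics.median(lower),
            statistics.median(ordered),
            statistics.median(upper))

data = [1, 3, 4, 6, 7, 8, 10, 12, 15]   # hypothetical data
q1, q2, q3 = quartiles(data)
iqr = q3 - q1

print(q1, q2, q3, iqr)
```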

**Standard Deviation**

For a sample, s = √( ∑(x - x̄)² / (n - 1) ), where x̄ is the sample mean.

The variance is the standard deviation squared. It is measured in squared units (e.g. cm² for data in cm), so the standard deviation is normally quoted as the measure of spread.
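The relationship between variance and standard deviation can be checked with the standard `statistics` module. The data is invented; `pstdev` divides by n (whole population) and `stdev` divides by n - 1 (sample).

```python
import statistics

data = [4, 8, 6, 5, 3, 7]   # hypothetical measurements, e.g. in cm

pop_sd = statistics.pstdev(data)         # population SD, divides by n
sample_sd = statistics.stdev(data)       # sample SD, divides by n - 1
sample_var = statistics.variance(data)   # sample variance, in squared units

print(pop_sd, sample_sd, sample_var)
```

The sample standard deviation is always slightly larger than the population version for the same data, because it divides by the smaller number n - 1.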

## Probability

**Mutually Exclusive Events** (separate events which cannot happen together)

*e.g. rolling a 1 and rolling a 6 on a single throw of a die.*

*P(A ∪ B) = P(A) + P(B)*

*(the probability of A or B)*

**Two independent events** - the probability of A happening **given that** B has already happened is unchanged:

*P(A | B) = P(A)*

*Also, if the events are independent:*

*P(A ∩ B) = P(A) × P(B)*

*(the probability that both A and B will happen)*
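Both rules can be verified by counting outcomes. This sketch assumes fair dice; the helper `prob` and the chosen events are hypothetical examples, not taken from the notes.

```python
from fractions import Fraction
from itertools import product

die = range(1, 7)   # sample space for one fair die (assumed example)

def prob(event, space):
    """Probability of an event as favourable outcomes / total outcomes."""
    space = list(space)
    return Fraction(sum(1 for outcome in space if event(outcome)), len(space))

# Mutually exclusive: "roll a 1" and "roll a 6" cannot both happen on one throw.
p_one = prob(lambda x: x == 1, die)
p_six = prob(lambda x: x == 6, die)
p_one_or_six = prob(lambda x: x in (1, 6), die)

# Independent: the results of the first and second of two dice.
two_dice = list(product(die, repeat=2))
p_first_six = prob(lambda t: t[0] == 6, two_dice)
p_second_even = prob(lambda t: t[1] % 2 == 0, two_dice)
p_both = prob(lambda t: t[0] == 6 and t[1] % 2 == 0, two_dice)

print(p_one_or_six, p_both)
```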

## Laws of Probability

**The Addition Law:**

This law holds whether or not the two events are mutually exclusive.

*P*(*A* ∪ *B*) = *P*(*A*) + *P*(*B*) - *P*(*A* ∩ *B*)

**The Multiplication Law:**

*P*(*A* ∩ *B*) = *P*(*A*) × *P*(*B* | *A*)
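As an illustration of both laws with events that are not mutually exclusive, take one fair die with A = "even" and B = "greater than 3". These events are assumptions chosen for the example.

```python
from fractions import Fraction

outcomes = set(range(1, 7))                 # one fair die (assumed example)
A = {x for x in outcomes if x % 2 == 0}     # even: {2, 4, 6}
B = {x for x in outcomes if x > 3}          # greater than 3: {4, 5, 6}

def p(event):
    """Probability of an event as (outcomes in event) / (all outcomes)."""
    return Fraction(len(event), len(outcomes))

p_union = p(A | B)                  # P(A or B)
p_intersection = p(A & B)           # P(A and B)

# P(B | A): restrict attention to the outcomes in A and count those also in B.
p_b_given_a = Fraction(len(A & B), len(A))

print(p_union, p_intersection, p_b_given_a)
```

Here P(A) + P(B) on its own would give 1, which overcounts the outcomes 4 and 6; subtracting P(A ∩ B) corrects for that.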

**Probability is measured on a scale of 0 to 1**

**If a trial can result in one of n equally likely outcomes and an event consists of r of these, then the probability is r/n**

## Binomial Distribution

*There must be:*

- *A fixed number of trials, n*
- *Two possible outcomes for each trial*
- *The same probability of each outcome on every trial*
- *Independent trials*

*P(X = r) = (nCr) × p^r × (1 - p)^(n - r)*

*Mean (μ) = np*

*Variance = np(1 - p)*

*X ~ B(n, p)*
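The binomial formula, mean and variance can be checked numerically. The values n = 10 and p = 0.3 are arbitrary choices for illustration, and `binomial_pmf` is a hypothetical helper name.

```python
from math import comb

def binomial_pmf(r, n, p):
    """P(exactly r successes in n trials), each with success probability p."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)

n, p = 10, 0.3   # hypothetical parameters

probabilities = [binomial_pmf(r, n, p) for r in range(n + 1)]

# Mean and variance computed directly from the distribution.
mean = sum(r * pr for r, pr in enumerate(probabilities))
variance = sum((r - mean) ** 2 * pr for r, pr in enumerate(probabilities))

print(mean, variance)   # compare with n*p and n*p*(1 - p)
```

`math.comb(n, r)` gives the nCr values, which are the same numbers that appear in row n of Pascal's triangle.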

*Remember Pascal's triangle!*

*Practise using binomial tables in formulae book.*

## Normal Distribution

The normal distribution is a bell-shaped curve.

**To use the normal tables, values must be standardised so that the distribution has a mean of 0 and a standard deviation of 1:**

Z = (x - μ) / σ

where μ is the mean and σ (sigma) is the standard deviation.
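Standardising a value can be sketched with `statistics.NormalDist`, which here stands in for the printed normal tables. The mean, standard deviation and x value are made-up numbers.

```python
from statistics import NormalDist

mu, sigma = 100, 15   # hypothetical population mean and standard deviation
x = 130

z = (x - mu) / sigma   # standardised value

# P(X < x) looked up on the standard normal N(0, 1), as the tables would give.
p_below = NormalDist().cdf(z)

# The same probability computed directly from N(mu, sigma^2).
p_direct = NormalDist(mu, sigma).cdf(x)

print(z, p_below, p_direct)
```

Both routes give the same probability, which is why a single table for N(0, 1) is enough.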

About 68% of all data lies within plus or minus 1 standard deviation of the mean

95% lies within plus or minus 2

99.7% lies within plus or minus 3

*X ~ N(μ, σ²) = (population mean, variance)*
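The proportion of a normal distribution lying within 1, 2 and 3 standard deviations of the mean can be computed from the standard normal. The helper `within` is a hypothetical name.

```python
from statistics import NormalDist

std_normal = NormalDist()   # mean 0, standard deviation 1

def within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

# Percentages rounded to one decimal place.
percentages = {k: round(within(k) * 100, 1) for k in (1, 2, 3)}

print(percentages)
```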

*The distribution of the means*

σ = xσn on the calculator = the SD for the whole population

s = xσn-1 on the calculator = the SD for a sample of the population

*The central limit theorem states that if the sample size n is large (over 30), then the sample means are approximately normally distributed, whatever the original distribution.*

*Also, if the population from which the sample is taken is normally distributed, the sample means will be normally distributed for any sample size.*
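The central limit theorem can be illustrated with a small simulation. This sketch assumes a uniform population on [0, 1), which is clearly not normal, and compares the spread of the sample means with σ/√n, the standard result for the standard deviation of a sample mean (not stated in the notes above). The sample size and number of samples are arbitrary.

```python
import random
import statistics

random.seed(0)   # fixed seed so the sketch is repeatable

n = 40               # size of each sample (over 30)
num_samples = 2000   # how many samples to draw

# Population: uniform on [0, 1), with mean 0.5 and SD sqrt(1/12).
sample_means = [
    statistics.mean([random.random() for _ in range(n)])
    for _ in range(num_samples)
]

mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.stdev(sample_means)
expected_sd = (1 / 12) ** 0.5 / n ** 0.5   # sigma / sqrt(n)

print(mean_of_means, sd_of_means, expected_sd)
```

A histogram of `sample_means` would show the bell shape, even though the individual values come from a flat distribution.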

## Confidence Intervals

Confidence interval = x̄ ± z × σ/√N

*This applies when x̄ is the mean of a random sample of size N from a normal distribution with an unknown mean (μ) and a known standard deviation (σ).*

*If a large sample is available:*

- *It can be used to provide a good estimate of the population standard deviation.*
- *It is safe to assume the sample mean is normally distributed.*
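A confidence interval for the mean with a known standard deviation can be sketched as below. The function name, the sample figures and the default 95% level are assumptions for illustration; the z value comes from `NormalDist().inv_cdf` rather than from printed tables.

```python
from statistics import NormalDist

def confidence_interval(x_bar, sigma, n, level=0.95):
    """Interval x_bar ± z * sigma / sqrt(n) for a known population SD."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for 95%
    margin = z * sigma / n ** 0.5
    return x_bar - margin, x_bar + margin

# Hypothetical sample: mean 50, known SD 4, sample size 64.
low, high = confidence_interval(x_bar=50.0, sigma=4.0, n=64)

print(low, high)
```

Quadrupling the sample size halves the width of the interval, because the margin depends on 1/√n.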

## Correlation

**Product moment correlation coefficient (PMCC)**

*It is always true that -1 ≤ r ≤ +1*

*r = +1: all points lie on a line with positive gradient*

*r = -1: all points lie on a line with negative gradient*

*r = 0: no linear correlation between the 2 sets of data*

*To use this you need ∑x, ∑y, ∑x², ∑y² and ∑xy.*

*Then it is entered into the equation (in AQA formulae book)*
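A minimal sketch of computing r from the summary totals, using the standard form r = Sxy / √(Sxx × Syy). The layout in a particular formulae book may differ, and the paired data here is invented.

```python
from math import sqrt

x = [1, 2, 3, 4, 5]   # hypothetical paired data
y = [2, 4, 5, 4, 5]
n = len(x)

# The summary totals needed for the formula.
sum_x, sum_y = sum(x), sum(y)
sum_xx = sum(a * a for a in x)
sum_yy = sum(b * b for b in y)
sum_xy = sum(a * b for a, b in zip(x, y))

s_xx = sum_xx - sum_x ** 2 / n
s_yy = sum_yy - sum_y ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

r = s_xy / sqrt(s_xx * s_yy)

print(r)
```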

## Regression

*y = a + bx is the regression line of y on x*

*Residuals are the vertical distances between the points on the graph and the regression line.*

*The formula for the regression line is:*

*(y - ȳ) = (Sxy / Sxx) × (x - x̄)*

*To plot this on a graph, note that the point (x̄, ȳ) lies on the regression line.*

*To predict y from a given x you need to work out ŷ (just use a calculator).*
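For comparison with the calculator values, the regression line of y on x can be worked out directly from Sxy and Sxx. The data and the helper name `predict` are hypothetical.

```python
x = [1, 2, 3, 4, 5]   # hypothetical paired data
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b = s_xy / s_xx          # gradient
a = y_bar - b * x_bar    # intercept, since (x_bar, y_bar) lies on the line

def predict(x_value):
    """Predicted y (y-hat) for a given x."""
    return a + b * x_value

# Residual = observed y minus predicted y, one per data point.
residuals = [yi - predict(xi) for xi, yi in zip(x, y)]

print(a, b, predict(3))
```

The residuals of a least-squares line always sum to zero, which is a useful check on the arithmetic.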

*To find the values of a and b:*

- *Put the numbers from the table into the calculator: mode, stat, a+bx, reg, a (write the value down), then do the same for b*

*To predict:*

- *AC, enter either the x or y value, then shift, stat, reg, x̂ or ŷ*

*Extrapolation = a prediction outside the range of the data, which is unreliable. Interpolation = a prediction within the range of the data.*
