You can clearly see the shape of the distribution without any of the original data being lost

What are histograms good for?

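Grouped continuous data, especially when the frequency table has unequal class widths, because the area of each bar (not its height) represents the frequency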

Mode is...

Easy to calculate and isn't affected by extreme values, but has no useful mathematical properties.

Median is...

Easy to calculate and isn't affected by extreme values, but has restricted use in further calculations

Mean is...

Very important for mathematical properties but can be affected by extreme values as it makes use of all values

IQR is...

Easy to calculate and is not affected by extreme values, but has no mathematical properties that are useful

Standard Deviation measures...

The spread of the data. It is affected by extreme values, but NOT by addition/subtraction in coding: only multiplying or dividing changes it.
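The effect of coding on the standard deviation can be checked numerically. This is only a sketch: the data values and the helper name `code` are made up for illustration, and the coding is assumed to have the form y = (x - shift) / scale.

```python
from statistics import pstdev

def code(values, shift, scale):
    """Apply the coding y = (x - shift) / scale to every value (hypothetical helper)."""
    return [(x - shift) / scale for x in values]

data = [12, 15, 17, 20, 26]  # made-up illustrative data

sd_original = pstdev(data)
sd_shifted = pstdev(code(data, 10, 1))  # subtract 10 only
sd_scaled = pstdev(code(data, 10, 2))   # subtract 10, then divide by 2

print(sd_original, sd_shifted, sd_scaled)
```

Subtracting 10 leaves the standard deviation unchanged, while dividing by 2 halves it.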

NEGATIVE SKEW

MODE > MEDIAN > MEAN and Q2 - Q1 > Q3 - Q2
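The quartile rule above can be turned into a quick check. This is a minimal sketch: the function name `skew_from_quartiles` and the example quartiles are assumptions for illustration, and the positive-skew case is simply the reverse inequality.

```python
def skew_from_quartiles(q1, q2, q3):
    """Classify skew by comparing the two halves of the box (hypothetical helper)."""
    lower = q2 - q1  # distance from lower quartile to median
    upper = q3 - q2  # distance from median to upper quartile
    if lower > upper:
        return "negative"
    if upper > lower:
        return "positive"
    return "symmetrical"

# Made-up quartiles: the median sits closer to Q3 than to Q1
print(skew_from_quartiles(10, 18, 22))
```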

What do box plots show?

The range, IQR, median and skewness, so are good for comparing

How do you know when data is a DISCRETE RANDOM VARIABLE, X?

X takes separate numerical values, each with its own probability, and the probabilities add up to 1

E(aX+b) =

aE(X) + b

Var(aX+b) =

a^2 Var(X)
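Both rules, for E(aX+b) and Var(aX+b), can be verified on a small example. The distribution below and the values a = 3, b = -4 are made up for illustration, and the helper names are hypothetical. Exact fractions are used so the comparison is not blurred by rounding.

```python
from fractions import Fraction

def expectation(dist):
    """E(X) = sum of x * P(X = x)."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Var(X) = sum of P(X = x) * (x - E(X))^2."""
    mu = expectation(dist)
    return sum(p * (x - mu) ** 2 for x, p in dist.items())

def transform(dist, a, b):
    """Distribution of aX + b (assumes a != 0 so the values stay distinct)."""
    return {a * x + b: p for x, p in dist.items()}

# Made-up distribution: the probabilities add up to 1
X = {1: Fraction(1, 5), 2: Fraction(1, 2), 3: Fraction(3, 10)}
a, b = 3, -4
Y = transform(X, a, b)

print(expectation(Y) == a * expectation(X) + b)  # rule for E(aX+b)
print(variance(Y) == a ** 2 * variance(X))       # rule for Var(aX+b)
```

Note that b shifts the mean but does not appear in the variance at all.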

A DISCRETE UNIFORM DISTRIBUTION is...

When each value has an equal probability. E.g. the score on a fair die.
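As a sketch of the fair-die example, assuming a six-sided die with faces 1 to 6, the distribution and its mean and variance can be built directly from the definition:

```python
from fractions import Fraction

faces = range(1, 7)           # assumed: a standard six-sided die
p = Fraction(1, len(faces))   # every face has the same probability, 1/6
dist = {x: p for x in faces}

mean = sum(x * q for x, q in dist.items())              # E(X)
var = sum(q * x ** 2 for x, q in dist.items()) - mean ** 2  # E(X^2) - E(X)^2

print(mean, var)  # 7/2 35/12
```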

What is a summary statistic?

Mean and SD or Median and IQR.

Choose mean and SD when...

The data isn't skewed and there are no extreme values, because the mean and SD make use of all the data

Choose median and IQR when...

When the data is skewed and extreme values are present
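The choice between the two pairs of summary statistics can be illustrated by adding one extreme value to a data set. The numbers are made up, and the quartiles come from Python's `statistics.quantiles`, whose default method may differ slightly from a textbook's quartile convention.

```python
from statistics import mean, median, quantiles

def iqr(values):
    """Interquartile range using the library's default quartile method."""
    q1, _, q3 = quantiles(values, n=4)
    return q3 - q1

clean = [21, 23, 24, 25, 27, 28, 30]         # made-up data, no extreme values
with_outlier = [21, 23, 24, 25, 27, 28, 300]  # same data, one extreme value

for name, data in (("clean", clean), ("with outlier", with_outlier)):
    print(name, "mean:", mean(data), "median:", median(data), "IQR:", iqr(data))
```

The mean is dragged upwards by the extreme value, while the median and IQR stay the same.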

Forming a statistical model...

1. Observe real world problem. 2. Devise a statistical model and collect data. 3. Compare observed outcomes against expected outcomes and test model. 4. Refine model if necessary and test again.

PROS/CONS of a statistical model...

PROS - quick, cheap, enables predictions. CONS - does not replicate the real world situation in every detail.

What is a statistical experiment?

A test/investigation/process adopted for collecting data to provide evidence for/against a hypothesis

4 Features of a normal distribution

1. Bell-shaped curve, 2. Symmetrical about the mean, 3. mean = median = mode, 4. About 95% of the data lies within 2 SDs of the mean
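The 95% figure in feature 4 can be checked with the standard library's `NormalDist`. This is a sketch: the function name `proportion_within` and the example mean and SD are assumptions for illustration.

```python
from statistics import NormalDist

def proportion_within(k, mu=0.0, sigma=1.0):
    """Proportion of a normal distribution lying within k SDs of the mean."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

print(round(proportion_within(2), 4))         # standard normal
print(round(proportion_within(2, 50, 8), 4))  # made-up mean 50 and SD 8
```

The proportion is the same whatever the mean and SD are, which is why the rule can be quoted for any normal distribution.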

The independent variable is...

EXPLANATORY - it is independent of the other variable and is plotted along the x axis

The dependent variable is...

RESPONSE - it is dependent on the independent variable and is plotted along the y axis.

