# Statistics


- Created by: Kate Anning
- Created on: 03-06-14 17:21

a statistical model/experiment

a test in which you collect data to provide evidence for/against a hypothesis

1 of 32

an event

a subset of possible outcomes of an experiment

2 of 32

statistical model process

1) OBSERVE the real world 2) DEVISE a statistical model 3) Use the model to PREDICT outcomes 4) COLLECT data 5) COMPARE the data with the predictions/hypothesis 6) TEST how well the model describes reality 7) REFINE the model if necessary

3 of 32

Advantages of statistical models

1) It simplifies the real world 2) It allows you to predict future outcomes 3) It is quicker and easier than using real data

4 of 32

Disadvantage of statistical models

The model doesn't replicate the real world in every detail

5 of 32

Comment on the assumption that A and B are independent

If A and B affect each other, they are not independent

6 of 32
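
One way to make the independence comment concrete is the standard multiplication rule: A and B are independent exactly when P(A and B) = P(A) x P(B). The sketch below checks that rule for events on a fair die; the function name `is_independent` and the example events are illustrative assumptions, not part of the original card.

```python
from fractions import Fraction

def is_independent(a, b, sample_space):
    """Return True if P(A and B) == P(A) * P(B), assuming equally likely outcomes."""
    n = len(sample_space)
    p_a = Fraction(len(a & sample_space), n)
    p_b = Fraction(len(b & sample_space), n)
    p_ab = Fraction(len(a & b & sample_space), n)
    return p_ab == p_a * p_b

# Hypothetical example: one roll of a fair six-sided die
die = {1, 2, 3, 4, 5, 6}
even = {2, 4, 6}
at_most_four = {1, 2, 3, 4}
at_most_three = {1, 2, 3}

print(is_independent(even, at_most_four, die))   # True
print(is_independent(even, at_most_three, die))  # False
```

Exact fractions are used so that the equality test is not upset by floating-point rounding.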

Type of data in a normal distribution

Type of data: continuous

7 of 32

Discrete uniform distribution

Type of data: discrete - every outcome has the same probability

8 of 32

Shape of distribution

Describe the skew (positive, negative or symmetrical). Usually back this up by comparing the mean with the median, or by comparing Q3 - Q2 with Q2 - Q1. You could also say 'most of the data is at the lower/upper end'

9 of 32
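
The quartile comparison mentioned on this card can be sketched as a small helper. It assumes the usual convention that Q3 - Q2 > Q2 - Q1 indicates positive skew and the reverse indicates negative skew; the function name `describe_skew` is hypothetical.

```python
def describe_skew(q1, q2, q3):
    """Classify skew by comparing the upper and lower quartile gaps."""
    upper_gap = q3 - q2  # distance from median to upper quartile
    lower_gap = q2 - q1  # distance from lower quartile to median
    if upper_gap > lower_gap:
        return "positive skew"
    if upper_gap < lower_gap:
        return "negative skew"
    return "symmetrical"

print(describe_skew(10, 12, 20))  # positive skew
print(describe_skew(10, 18, 20))  # negative skew
print(describe_skew(10, 15, 20))  # symmetrical
```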

Reason to justify the use of a histogram to represent data

The variable (eg: height) is continuous, and the data is grouped

10 of 32

Describe the main feature of a histogram

The area of each bar is proportional to the frequency of that group

11 of 32
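
Because area is proportional to frequency, the bar heights are frequency densities (frequency divided by class width). The sketch below illustrates this with made-up classes; the data and the name `frequency_density` are assumptions for illustration only.

```python
def frequency_density(lower, upper, frequency):
    """Height of a histogram bar: frequency divided by class width."""
    return frequency / (upper - lower)

# Hypothetical grouped data: (lower bound, upper bound, frequency)
classes = [(0, 10, 5), (10, 30, 8), (30, 35, 4)]

for lower, upper, freq in classes:
    height = frequency_density(lower, upper, freq)
    area = height * (upper - lower)  # area recovers the frequency
    print(lower, upper, height, area)
```

Multiplying each height by its class width gives back the frequency, which is the sense in which area represents frequency.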

Main features of comparison for box plots

1) Outliers 2) Skew 3) IQR 4) Range 5) Median - make comparison then put into context

12 of 32

An outlier

1) A value which is very different from the rest of the data 2) Should be treated with caution 3) It is usually identified as anything above Q3 + 1.5(Q3 - Q1), or below Q1 - 1.5(Q3 - Q1)

13 of 32
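
The outlier rule on this card can be sketched as follows. The quartiles are passed in directly because different courses and software compute quartiles slightly differently; the function names and example values are illustrative assumptions.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return (lower, upper) limits: Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def find_outliers(data, q1, q3):
    """List the values lying strictly outside the fences."""
    lower, upper = outlier_fences(q1, q3)
    return [x for x in data if x < lower or x > upper]

# Hypothetical data with quartiles Q1 = 10 and Q3 = 20 supplied by the user
print(outlier_fences(10, 20))                       # (-5.0, 35.0)
print(find_outliers([-8, 3, 12, 18, 34, 40], 10, 20))  # [-8, 40]
```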

When to use the median

If the data is skewed or contains outliers, because the median is barely affected by extreme values, whereas the mean is pulled up/down by particularly high/low data

14 of 32

Effect on the mean if we add similar data

Adding/removing similar data will improve/reduce the validity of the mean/standard deviation, but may not affect their value

15 of 32

Effect on the mean if we add outliers

An added outlier pulls the mean up/down towards it and increases the standard deviation; removing an outlier has the opposite effect

16 of 32

Effect on the mean if we code the data

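Affects the mean/standard deviation in the same way as for discrete random variables: E(aX+b) = aE(X) + b and Var(aX+b) = a^2 Var(X), so the standard deviation is multiplied by |a| and is unaffected by b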

17 of 32
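
A small numerical check of the coding rules E(aX+b) = aE(X) + b and Var(aX+b) = a^2 Var(X), using the standard library. The data and the coding y = 3x + 5 are made up for illustration.

```python
import math
from statistics import mean, pstdev

data = [2, 4, 6, 8]   # hypothetical raw data
a, b = 3, 5           # hypothetical coding: y = 3x + 5
coded = [a * x + b for x in data]

# The mean is scaled by a and shifted by b
print(mean(data), mean(coded))                               # 5 20
# The standard deviation is scaled by |a| only; b has no effect
print(math.isclose(pstdev(coded), abs(a) * pstdev(data)))    # True
```

`pstdev` is the population standard deviation (dividing by n), which matches the usual summary-statistics formula for a data set.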

Main features of a normal distribution

1) Bell-shaped curve 2) The curve is asymptotic to the horizontal (z) axis 3) Symmetrical about the mean 4) mean = mode = median 5) About 68% of the data lies within mean +/- 1 sd 6) About 95% of the data lies within mean +/- 2 sd 7) Almost all of the data (about 99.7%) lies within mean +/- 3 sd

18 of 32
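
The percentages quoted for 1, 2 and 3 standard deviations can be checked with `statistics.NormalDist` (available in Python 3.8 and later). This is only a quick verification sketch; the helper name `within_k_sd` is an assumption.

```python
from statistics import NormalDist

def within_k_sd(k):
    """Probability that a normal variable lies within k sd of its mean."""
    z = NormalDist()  # standard normal: mean 0, sd 1
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(k, round(within_k_sd(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```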

Why is the normal distribution a suitable model

1) The data is continuous (height, weight, length, width etc) 2) Data is clustered around a central value 3) Data is not skewed

19 of 32

Why is the normal distribution not a suitable model

1) Data is skewed 2) Data is bimodal

20 of 32

Say whether a linear regression model is suitable

The PMCC is close to 1 or -1, so the data lies close to a straight line and a linear regression model is suitable OR the PMCC is close to 0, so the data does not lie close to a straight line and a linear regression model is not suitable

21 of 32
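
To see what "close to 1/-1" means in practice, the PMCC can be computed from the summary quantities Sxy, Sxx and Syy as r = Sxy / sqrt(Sxx * Syy). The sketch below uses made-up data, and the function name `pmcc` is an assumption.

```python
import math

def pmcc(xs, ys):
    """Product moment correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]      # hypothetical x values
ys = [2, 4, 5, 4, 5]      # hypothetical y values
print(round(pmcc(xs, ys), 3))  # 0.775
```

Points lying exactly on a line with positive gradient give r = 1, and on a line with negative gradient give r = -1.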

Interpret the PMCC

The PMCC is positive/negative, so the regression line has a positive/negative gradient. As (name of x thing) increases, (name of y thing) increases/decreases

22 of 32

Explain why x is the independent (explanatory) variable

Because it influences y

23 of 32

Explain why y is the dependent (response) variable

Because it is influenced by x

24 of 32

Interpret the value of a in the regression line (the 'y' intercept)

a is the value of (what y is) when (what x is) is zero

25 of 32

Why this value may be unrealistic

(x-what x is) being zero is well outside the range of the data

26 of 32

Interpret the value of b in the regression (the gradient of the line)

For each one-unit increase in (what x is, with its unit), (what y is) increases/decreases by b (the value, with its unit)

27 of 32
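
The intercept a and gradient b discussed on these cards come from the least-squares formulas b = Sxy / Sxx and a = y_bar - b * x_bar for the line y = a + bx. The sketch below computes them for made-up data; the function name `regression_line` is an assumption.

```python
def regression_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx            # gradient: change in y per unit increase in x
    a = y_bar - b * x_bar    # intercept: predicted y when x is zero
    return a, b

xs = [1, 2, 3, 4, 5]   # hypothetical x values
ys = [2, 4, 5, 4, 5]   # hypothetical y values
a, b = regression_line(xs, ys)
print(round(a, 2), round(b, 2))  # 2.2 0.6
```

Here each one-unit increase in x is associated with an increase of about 0.6 in y, and the line predicts about 2.2 when x is zero.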

What the b value allows you to do

Allows you to work out how much extra y will be gained when x increases by a certain amount

28 of 32

Is the regression line suitable to make a prediction for this value of x?

Only if the value lies within the range of the original x data (interpolation). If it lies outside that range (extrapolation), there is no evidence that the model still applies

29 of 32
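
The point about staying inside the range of the data can be built into a prediction helper. This is a minimal sketch: the function name `predict`, the choice to raise an error, and the example numbers are all assumptions for illustration.

```python
def predict(x, a, b, xs):
    """Predict y = a + b*x, refusing to extrapolate beyond the observed x range."""
    if not (min(xs) <= x <= max(xs)):
        raise ValueError("x is outside the range of the data (extrapolation)")
    return a + b * x

observed_x = [1, 2, 3, 4, 5]   # hypothetical x data used to fit the line
a, b = 2.2, 0.6                # hypothetical intercept and gradient

print(round(predict(3.5, a, b, observed_x), 2))  # 4.3

try:
    predict(12, a, b, observed_x)
except ValueError as err:
    print(err)
```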

Effect of adding/removing data or coding on the PMCC/regression line

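The PMCC is not affected by linear coding (with positive scale factors). The gradient of the regression line is not affected by adding/subtracting a constant, but it does change if x or y is multiplied/divided by a constant. The intercept of the regression line is affected by coding. The PMCC, gradient and intercept are all affected by adding/removing outliers from the data set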

30 of 32
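
A quick numerical check of how coding interacts with the PMCC and the regression line. Note that subtracting a constant from x leaves the gradient alone but changes the intercept, while dividing y by a constant rescales the gradient; the PMCC is unchanged in both cases. The data, the codings and the helper name `summary` are made up for illustration.

```python
import math

def summary(xs, ys):
    """Return (r, a, b): the PMCC and the line y = a + b*x."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    b = sxy / sxx
    return sxy / math.sqrt(sxx * syy), y_bar - b * x_bar, b

xs = [101, 102, 103, 104, 105]   # hypothetical raw x
ys = [20, 40, 50, 40, 50]        # hypothetical raw y

r, a, b = summary(xs, ys)
r_shift, a_shift, b_shift = summary([x - 100 for x in xs], ys)   # code x by subtracting 100
r_scale, a_scale, b_scale = summary(xs, [y / 10 for y in ys])    # code y by dividing by 10

print(round(r, 3), round(r_shift, 3), round(r_scale, 3))  # PMCC unchanged by both codings
print(round(b, 3), round(b_shift, 3), round(b_scale, 3))  # gradient: same after shift, scaled after division
print(round(a, 3), round(a_shift, 3), round(a_scale, 3))  # intercept changes under both codings
```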

