Statistics 01

Types of Data

Nominal:

Unrelated categories
No numerical relationship/order
Example = What type of pet do you own?

Ordinal:

Has an order or sequence
Cannot do maths with it
Example = How is your health? (good, reasonable, bad)

Scale:

Goes in a specific order
Can do maths with it
Example = What is your height?

Measures of Central Tendency

Mode = Most

Most frequent number
One mode = Unimodal
Two modes = Bimodal

Median = Middle

Sort the data from lowest to highest, find the middle
Cannot have more than one median; with an even number of scores, take the value midway between the two middle scores

Mean = Average

Add all the scores together and divide by how many there are

What are Outliers?

Extreme scores (outliers) can distort the description of the data, especially the mean
Use the mode or median instead, or remove the extreme values
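
A minimal sketch of the three measures and of how one extreme score affects them. The scores are invented for illustration and are not from the notes; the example relies only on Python's standard statistics module.

```python
from statistics import mean, median, multimode

# Hypothetical scores, invented for illustration
scores = [2, 3, 3, 4, 5, 5, 5, 6]
with_outlier = scores + [60]  # the same scores plus one extreme value

print("Mode(s):", multimode(scores))   # most frequent value(s)
print("Median:", median(scores))       # midway between the two middle scores
print("Mean:", mean(scores))           # sum of scores divided by how many there are

# The outlier pulls the mean up a long way but barely moves the median
print("Median with outlier:", median(with_outlier))
print("Mean with outlier:", mean(with_outlier))
```

With an even number of scores the median lands between the two middle values, which is the "mid value" rule above.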

Which data to use?

Nominal = Mode

Ordinal = Mode/Median

Scale = Any

Measures of Dispersion

Range

Highest value - Lowest value
Dispersion of a score - How much spread there is

Variance

Sum of squared differences from the mean divided by n-1
Working out variance = Work out the mean, Subtract it from each score and square the result, Add up all the squared differences, Divide by the number of scores minus one (n-1).

Standard Deviation

The square root of Variance
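
The range, variance and standard deviation steps above can be sketched as follows. The scores are hypothetical and the function name sample_variance is chosen here for illustration; the result is checked against the standard library's own n-1 variance.

```python
from math import sqrt
from statistics import stdev, variance

def sample_variance(scores):
    """Sum of squared differences from the mean, divided by n - 1."""
    m = sum(scores) / len(scores)                    # work out the mean
    squared_diffs = [(x - m) ** 2 for x in scores]   # subtract it and square
    return sum(squared_diffs) / (len(scores) - 1)    # divide by n - 1

scores = [2, 4, 4, 6, 9]  # hypothetical scores, invented for illustration

value_range = max(scores) - min(scores)  # highest value - lowest value
var = sample_variance(scores)
sd = sqrt(var)                           # standard deviation = root of variance

print(value_range, var, round(sd, 3))
print(variance(scores), round(stdev(scores), 3))  # library equivalents
```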

Histograms & Distributions

Shows how an attribute is distributed
A histogram is not a bar chart
Plot the number or percentage of observations at each level of the measure
Different histograms - Different bins (2,3,4, or 10-20), Split by another variable
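
To illustrate how the choice of bins changes a histogram, here is a small sketch that counts observations per bin. The heights are invented and histogram_counts is a hypothetical helper name, not something from the notes.

```python
from collections import Counter

def histogram_counts(values, bin_width):
    """Count how many observations fall in each bin, keyed by the bin's lower edge."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

# Hypothetical heights in cm, invented for illustration
heights_cm = [152, 158, 161, 163, 165, 167, 168, 170, 171, 174, 179, 186]

narrow = histogram_counts(heights_cm, 5)   # many narrow bins
wide = histogram_counts(heights_cm, 10)    # fewer, wider bins

print(narrow)
print(wide)
```

The same data gives a different picture with each bin width, but the counts always add up to the number of observations.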

Known attributes of the normal distribution:

Symmetrical around the mean
Mean, median, and mode are equal
Bell shaped curve

Normal Distribution:

Population data is often assumed to be normally distributed
This means that if we pull out a sample from a population, its mean is likely to be somewhere around the population mean
For this reason, we can use sample means as an estimate of population means

Asking questions about people

Associations

Do more intelligent people have more Facebook friends?
Is there a relationship between study hours and exam results?

Differences

Do those who break the law have a higher level of extraversion?
Are there sex differences in IQ?
These are questions about differences in populations

Testing the Null Hypothesis

Null = true

We don't know what the population looks like
If the null is true, then it's highly likely that both of our means would come out somewhere in the middle
So we might get a difference, but be unable to reject the null as our data could easily have come from populations with no difference
If the null is true, it's highly unlikely that our means would come out at the extremes
So if they do then we have to conclude that the null is not true and reject it as a model of the data

How do we do this?

We never know whether the sample mean is higher than, lower than or the same as the population mean
Inferential statistics use the size of the sample difference, variability in the sample data and number of participants to tell us...

"The probability of getting the observed or more extreme results, given that the null hypothesis is true"

Likelihood = Probability

100% chance = 'p = 1.00'

50% chance = 'p = .50'

10% chance = 'p = .10'

If p > .05 then the difference is not significant, because the chance of getting a difference at least this large from two identical populations is more than 5%

5% chance = 'p = .05'

1% chance = 'p = .01'

<5% chance = 'p < .05'

If p < .05 then the difference is significant, because the chance of getting a difference at least this large from two identical populations is less than 5%

Two outcomes: Significant/Not significant

Significant

In our sample, we get a big difference between the two sample means with low variance:
The likelihood of getting this data from a population with no real difference would be very low (p < .05 - less than 5%)
So it's unlikely enough that the populations are the same that we can reject the null
Our difference is significant and is evidence that there is a difference in the population

Not Significant

In our sample, we get a small difference between the two sample means, with high variance:
The likelihood of getting this data from a population with no real difference would be high (p>.05 - more than 5%)
So it's possible the population means are the same, and we fail to reject the null
Our difference is not significant, so we have no evidence that there is a difference in the population

Z-Scores

What is a Z-score?

A particular value expressed as the number of standard deviations that it lies away from the mean
Example = Mean (10); SD (2); Your score (8); Z-score (-1)
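
The worked example above as a one-line calculation. The function name z_score is chosen here for illustration.

```python
def z_score(score, mean, sd):
    """Number of standard deviations that a score lies away from the mean."""
    return (score - mean) / sd

# Values from the example: mean 10, SD 2, your score 8
z = z_score(8, 10, 2)
print(z)
```

A negative result means the score is below the mean; here it sits exactly one standard deviation below.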

What if your Z-score is not clearly shown on the histogram?

Using look-up tables
For positive z-scores, read off the probability of obtaining that z-score or below
For negative z-scores, look up the positive value and take "1 -" that probability to get the probability of obtaining that negative z-score or lower
Example for negative - Z-score (-1.52); table value for 1.52 is P = 0.936; "1 - 0.936 = 0.064" - 6.4% chance of scoring z of -1.52 or below
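
As a cross-check on the look-up table, the same probabilities can be computed directly. This sketch assumes Python 3.8 or later, where the standard library provides statistics.NormalDist; the variable names are illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# What a look-up table lists for z = 1.52: probability of that z-score or below
table_value = standard_normal.cdf(1.52)

# The "1 -" rule for the negative z-score of -1.52
p_below_negative = 1 - table_value

# Asking for the negative z-score directly gives the same answer
direct = standard_normal.cdf(-1.52)

print(round(table_value, 3), round(p_below_negative, 3), round(direct, 3))
```

The rule works because the normal distribution is symmetrical: the area below -1.52 equals the area above +1.52.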

What else can we do with Z-scores?

Example = Calculate the probability of a score falling between a Z-score of -2 and a Z-score of 1
Below 1 = 84.1%, Below -2 = 2.3% (Below 1 - Below -2 = 81.8%)
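
The "between two z-scores" calculation can be sketched the same way: take the area below the upper z-score and subtract the area below the lower one. Again this assumes statistics.NormalDist from the Python standard library.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, SD 1

below_upper = z.cdf(1)    # probability of a z-score of 1 or below
below_lower = z.cdf(-2)   # probability of a z-score of -2 or below
between = below_upper - below_lower

print(f"{below_upper:.1%} - {below_lower:.1%} gives about {between:.1%}")
```

Small differences in the last decimal place come from rounding the two table values before subtracting.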

Choosing Inferentials

The test you want depends on:

The type of data you have
Whether you are looking for a difference or relationship between variables
How many conditions you have
Whether the data for those conditions come from different groups of people (between subjects) or the same people (within subjects)
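
The four questions above can be read as a decision rule. The sketch below covers only the three tests described in these notes; the function name and argument labels are hypothetical, and the chi-squared branch is inferred from the sex and smoking example later in the notes rather than stated as a rule.

```python
def choose_test(data_type, goal, conditions, design):
    """Return the matching test from these notes, or None if it is not covered.

    data_type: "nominal", "ordinal" or "scale"
    goal: "difference" or "relationship"
    conditions: how many conditions there are
    design: "between" (different people) or "within" (same people)
    """
    if data_type == "nominal" and goal == "relationship":
        return "chi-squared"  # assumption: inferred from the reporting example
    if data_type == "scale" and goal == "difference" and conditions == 2:
        if design == "between":
            return "independent t-test"
        if design == "within":
            return "paired t-test"
    return None  # other combinations need tests not covered here

print(choose_test("scale", "difference", 2, "between"))
print(choose_test("scale", "difference", 2, "within"))
print(choose_test("nominal", "relationship", 2, "between"))
```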

Independent t-test

Independent t-test:

Scale data
Looking for a difference
Two conditions
Between subjects

What matters?

The size of the difference - Bigger difference = more likely to be significant
The variance within each group - Smaller variance = more likely to be significant difference

T-test

A t-test essentially compares the difference between conditions with the variability within conditions
Difference between group means ÷ Variability within groups (the standard error of the difference) = t
Big difference between groups ÷ Small variability within groups = Big t value

All you need for a t-test

Hypothesis
Scale data from two groups
For each: Mean, variance, number of values

Calculating the t-test

Collect sample data
Test the null hypothesis
How likely is it that we would get the observed sample difference from a population in which the null hypothesis was true?
If it's very unlikely (p < .05), then we can reject the null
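
A minimal sketch of an independent t-test worked from the three ingredients listed above (mean, variance and number of values for each group). It uses the pooled-variance form, which is one common version and assumes the two groups have similar variances; the numbers are invented for illustration.

```python
from math import sqrt

def independent_t(mean1, var1, n1, mean2, var2, n2):
    """Pooled-variance independent t-test from summary statistics.

    Returns the t value and its degrees of freedom.
    """
    df = (n1 - 1) + (n2 - 1)
    # Combine the two within-group variances, weighted by their own df
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    # Standard error of the difference between the two means
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df

# Hypothetical groups: means 5.0 and 3.0, variance 2.5 each, 5 people each
t, df = independent_t(5.0, 2.5, 5, 3.0, 2.5, 5)
print(round(t, 3), df)
```

The t value and DF are then looked up in a t table to get the p-value; that step is not reproduced here.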

Degrees of Freedom

Degrees of freedom come up with most statistics
Calculation varies by statistic
Broadly a measure of sample size

If n = 10 in each condition...

For independent t-test

DF = (n1 - 1)+(n2 - 1)
DF = (10-1)+(10-1)
DF = 9+9
DF = 18

For paired t-test

DF = n - 1
DF = (10 - 1)
DF = 9

Assumptions of the Independent t-test

Types of variable, Random sampling, Normal distribution, Homogeneity of variance

Types of variable

IV must be categorical
DV must be scale

Random sampling

Quasi-random selection from the population
Not truly random, but no bias in allocation to groups or inclusion in experiment
No participant can be in both conditions

Normal Distribution

The DV should be normally distributed in each group
Much of our rationale depends on this

Homogeneity of Variance

The two groups should have similar variances

Paired t-test

Average size of change for each individual
Don't have to worry about individual differences
Every value is hooked up to its equivalent in the other condition
Dealing with differences between values
Mean score change ÷ variability of change (its standard error) = t

Calculating a paired sample t-test

Mean difference = condition 1 - condition 2
Did everyone get exactly the same difference?
Did difference vary widely?
Lots of variance means we can't be sure that the difference will go in the same direction in the population
Look up on table, t & DF

Reporting t-test - Cabers were thrown significantly further when contestants wore trainers (M = 11.70, SD = 3.86) than when they wore high heels (M = 5.00, SD = 2.40), t(9) = 3.87, p < .01
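
The paired calculation can be sketched as below: work out each person's difference, then divide the mean difference by its standard error. The scores are invented for illustration (they are not the caber data) and paired_t is a hypothetical function name.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(condition1, condition2):
    """Paired t-test: mean difference divided by its standard error.

    Scores must be in the same participant order in both lists.
    Returns the t value and its degrees of freedom (n - 1).
    """
    differences = [a - b for a, b in zip(condition1, condition2)]
    n = len(differences)
    # How much the differences vary from person to person
    standard_error = stdev(differences) / sqrt(n)
    return mean(differences) / standard_error, n - 1

# Hypothetical scores for four participants in two conditions
t, df = paired_t([5, 7, 9, 11], [4, 5, 6, 7])
print(round(t, 3), df)
```

If everyone had exactly the same difference, the standard error would be zero and t could not be calculated, so the sketch assumes the differences vary at least a little.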

Assumptions of the Paired t-test

Types of variable, random sampling, normal distribution, homogeneity of variance

Types of variable

IV must be categorical
DV must be scale

Random sampling

Quasi-random selection from the population
Not truly random, but no bias in inclusion in experiment
Every participant must be in both conditions

Normal distribution

The differences should be normally distributed

Homogeneity of variance

The two conditions should have similar variances

The steps in Chi-Squared

1. Calculate frequencies (observed values) - Add up how many in each combination

2. Calculate frequencies we would expect if the null is true (expected values)

3. Calculate how far observed are from expected (χ²)

4. Calculate DF

5. Look up critical χ² in look-up table
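
Steps 1 to 4 above can be sketched as follows, using the usual formulas: expected count = row total x column total ÷ N, and χ² = sum of (observed - expected)² ÷ expected. The table of counts is invented for illustration, and step 5 (the look-up) is left to the table.

```python
def chi_squared(observed):
    """Chi-squared statistic, DF and expected counts for a table of frequencies."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Step 2: counts we would expect if the null is true
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    # Step 3: how far observed counts are from expected counts
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    # Step 4: (number of rows - 1) x (number of columns - 1)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected

# Step 1: hypothetical observed frequencies for a 2 x 2 table
observed = [[10, 20], [20, 10]]
chi2, df, expected = chi_squared(observed)
print(round(chi2, 3), df, expected)
```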

Chi Squared - DF

Calculate χ²

We need to think about how the observed values differ from the expected values

Calculate Degrees of Freedom

For a 2 x 2 table: (number of columns - 1) x (number of rows - 1) = (2 - 1) x (2 - 1) = 1 x 1 = 1

Reporting your χ² results -

Analysis using a Chi-Square test shows no significant relationship between sex and smoking, χ²(1, N = 50) = 0.927, p = .34

Assumptions of Chi Squared

Random sampling, Sample size, and expected cell count

Random sampling

The sample data is a random sample from the population

Sample size

The test assumes a sufficiently large sample. If a chi-squared test is run on a small sample, it can give an inaccurate inference

Expected cell count

Adequate expected cell counts. Some sources require 5 or more, and others 10 or more. A common rule is 5 or more in all cells of a 2-by-2 table and 5 or more in 80% of cells in larger tables, but no cells with zero expected count
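
The common rule above can be written as a quick check on a table of expected counts. This is a sketch of that one rule only (stricter sources ask for 10 or more), and expected_counts_ok is a hypothetical helper name.

```python
def expected_counts_ok(expected):
    """Check the common rule for expected cell counts.

    2 x 2 table: every cell must be 5 or more.
    Larger tables: at least 80% of cells must be 5 or more.
    Any table: no cell may have an expected count of zero.
    """
    cells = [e for row in expected for e in row]
    if any(e == 0 for e in cells):
        return False
    share_at_least_5 = sum(e >= 5 for e in cells) / len(cells)
    is_two_by_two = len(expected) == 2 and all(len(row) == 2 for row in expected)
    if is_two_by_two:
        return share_at_least_5 == 1.0
    return share_at_least_5 >= 0.8

# Hypothetical expected counts, invented for illustration
print(expected_counts_ok([[15, 15], [15, 15]]))     # 2 x 2, all cells 5 or more
print(expected_counts_ok([[4, 16], [16, 24]]))      # 2 x 2, one cell below 5
print(expected_counts_ok([[6, 7, 8], [9, 10, 4]]))  # 2 x 3, 5 of 6 cells are 5 or more
```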
