
Descriptive statistics

Psychology aims to answer big questions - can't test every person on Earth so solution is to take sample and generalise

Variables = things that vary: 

Categorical/nominal (divided into categories)

Discrete/ordinal (ranked low to high)

Continuous/ scale (measured on scale) 



Categorical (nominal)

For example: gender or hair colour, political party, smoker/non-smoker

- Equivalent items given same name or number

- Data points (people) counted in each category = the frequency



Discrete (ordinal)

- Data have logical order but no scaling

- Does not tell us size/difference between labels

e.g. position in race, the gap between 1st and 2nd could have been seconds or minutes


Continuous (scale)

- Logical scaling

- Tells us about difference between values

- sometimes divided into ratio/interval

e.g. difference between 1kg and 2kg is same as between 45kg and 46kg (1kg) 


Transforming data

Data can be transformed into different variables 

e.g. for depression scores

Categorical: Depression group A, B, C

Ordinal: not depressed (0-15), mildly depressed (16- 30), very depressed (31-45)

Continuous: Depression score on scale from 0-45 for example

Ordinal often rescaled to nominal or continuous because more types of analysis can be performed


Central Tendency

Guess regarding what is typical of our data (averages)

Mean: sum of all scores divided by number of scores

Median: order data numerically, middle value = median

Mode: most frequently occurring value


  • Mode rarely used
  • Mean preferred over median because takes into account all scores, BUT influenced by outliers, therefore median used when scores are not evenly spread
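
The three averages can be sketched in Python as below. The scores are made up purely for illustration; the second half adds one extreme score to show why the median is used when scores are not evenly spread.

```python
from statistics import mean, median, mode

# Hypothetical scores, chosen only for illustration
scores = [2, 3, 3, 4, 5]

mean_score = mean(scores)      # sum of all scores / number of scores
median_score = median(scores)  # middle value after ordering
mode_score = mode(scores)      # most frequently occurring value

# One extreme score (an outlier) pulls the mean up sharply,
# while the median barely moves
with_outlier = scores + [50]
mean_outlier = mean(with_outlier)
median_outlier = median(with_outlier)

print(mean_score, median_score, mode_score)
print(mean_outlier, median_outlier)
```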

Measures of dispersion: range and variance

Related to spread of values

Range = highest score minus lowest score. Simplest method, but dependent on who is selected in sample and only takes into account highest and lowest score.

Variance - how much each score deviates from mean

Involves calculating sum of squared deviations

  • calculate mean
  • subtract mean from each score
  • square each deviation score
  • add all squared scores 

BUT depends on number of people in sample, so we divide sum of squared deviations by number in sample minus one (= variance)

Problem that it is difficult to interpret, e.g. unit would be in IQ squared, so we take the square root of the variance to eliminate this - this is called Standard Deviation (SD)
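
The steps for variance and SD can be followed one at a time in a short Python sketch. The scores are invented for illustration; the last two lines check the hand calculation against Python's built-in `statistics.variance` and `statistics.stdev`, which also divide by n - 1.

```python
import math
from statistics import mean, stdev, variance

# Hypothetical scores, chosen only for illustration
scores = [2, 4, 4, 4, 5, 5, 7, 9]

m = mean(scores)                              # 1. calculate mean
deviations = [x - m for x in scores]          # 2. subtract mean from each score
squared = [d ** 2 for d in deviations]        # 3. square each deviation
sum_of_squares = sum(squared)                 # 4. add all squared scores

# Divide by number in sample minus one, then square root for SD
sample_variance = sum_of_squares / (len(scores) - 1)
sd = math.sqrt(sample_variance)

# Cross-check against the standard library
assert abs(sample_variance - variance(scores)) < 1e-9
assert abs(sd - stdev(scores)) < 1e-9
```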


Reporting Descriptive statistics

- report participants (N=...) 

- report mean

- report SD and range in brackets (SD =..., Range = ...)


Inferential statistics


How typical value and spread of scores relate: within the spread of scores, where is the typical value?

Normal distribution = bell-shaped curve, scores evenly distributed around mean, but spread can be smaller or bigger

  • most analyses rely on data being more or less normally distributed
  • most variables tend to be normally distributed (most scores near the average and fewer at the extremes)

- in normal distribution nearly all observations lie within 3 SD (σ) of mean

  • 68% within 1σ
  • 95% within 1.96σ (approx. 2σ)
  • 99.7% within 3σ 

68-95-99.7 rule

e.g. can use this to estimate the range in which the IQ score of a randomly selected person in the population is likely to fall
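
The percentages in the rule can be checked with Python's `statistics.NormalDist`. The IQ figures (mean 100, SD 15) are an assumption used for illustration and are not given in these notes.

```python
from statistics import NormalDist

standard = NormalDist(mu=0, sigma=1)  # standard normal distribution


def proportion_within(k):
    """Proportion of observations lying within k SDs of the mean."""
    return standard.cdf(k) - standard.cdf(-k)


within_1 = proportion_within(1)       # roughly 68%
within_196 = proportion_within(1.96)  # roughly 95%
within_3 = proportion_within(3)       # roughly 99.7%

# Assumed IQ scale (mean 100, SD 15): the range that should contain
# about 95% of randomly selected people
iq_mean, iq_sd = 100, 15
lower = iq_mean - 1.96 * iq_sd
upper = iq_mean + 1.96 * iq_sd

print(round(within_1, 4), round(within_196, 4), round(within_3, 4))
print(lower, upper)
```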


Confidence interval

We can't be 100% certain when predicting a person's score so we calculate confidence interval

95% confidence interval = Mean +/- 1.96 x σ (1.96 is the value for a 95% interval)


BUT we do not normally know the population mean and population SD, can use mean and standard deviation from sample to make predictions

Predicting population mean and SD

- use sample mean as predictor for population mean

- SD predictor takes into account sample size (SD divided by square root of n) = Standard error

- population mean will stay the same when sample size increases

- the standard error gets smaller as sample size gets bigger, so the interval gets smaller (therefore the more precisely you can estimate the population mean)
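
A minimal sketch of a 95% confidence interval for the population mean, built from a sample mean and standard error. The sample values are invented, and the sketch follows these notes in using 1.96 as the multiplier.

```python
import math
from statistics import mean, stdev

# Hypothetical sample of scores, for illustration only
sample = [98, 102, 100, 96, 104]

n = len(sample)
sample_mean = mean(sample)                 # predictor for population mean
sample_sd = stdev(sample)                  # sample SD (divides by n - 1)
standard_error = sample_sd / math.sqrt(n)  # SD divided by square root of n

# Mean +/- 1.96 x standard error
margin = 1.96 * standard_error
ci_lower = sample_mean - margin
ci_upper = sample_mean + margin

print(sample_mean, standard_error, ci_lower, ci_upper)
```

Because the standard error divides by the square root of n, a larger sample with the same SD gives a narrower interval.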


Why 95%?

- we want to be as certain as possible, but want confidence interval to be as small as possible

95% = best compromise


Experimental design


  • Formulate question
  • Formulate hypotheses 
  • Choose variables
  • Choose research design
  • Choose analysis
  • Answer research question

Step 1 - research question

- Is there a relationship between smoking and cardiovascular disease? (Relationship)

- Does cognitive training help reduce effects of amnesia (training vs. no training)? (Difference)

Step 2 - hypotheses

- goal is to falsify null hypothesis

- what is the expected answer to question based on past research?

- Experimental/research hypothesis: There is a relationship/difference between...

- Null hypothesis (taken as default hypothesis): There is no relationship/... does not increase...


Mutually exclusive hypotheses

Relationship questions:

Null: There is no relationship between ... and ...

Research: There is a relationship between ... and ...

Difference questions:

Null: There is no difference between ... and ...

Research: There is a difference between ... and ...

Direction of results is inferred based on later results

(e.g. smoking increases risk of heart disease)



Step 3 - Variables

- choose style of measurement e.g. questionnaire vs naturalistic observation

Independent variable - the one you change

  • e.g. placing people into groups, fixed categories (gender etc.), difference in behaviour between people

Dependent - variable you observe changes in 

  • In changeable categories, differences in behaviour between people

any variable could be IV or DV, dependent on direction, could be unclear in relationship questions

Confounding variables = variables other than the IV that could influence the DV. To ensure that effects on DV are solely due to IV we control everything else as much as possible, and if any variables will influence the DV, try to keep them the same across groups


Research design

Step 4 - research design

Between subjects - 

Compare performance of different participants, each assigned to one condition

DV = mean score of condition A vs Condition B

Within-Subjects - 

Compare performance of same participants across all conditions

DV = for each person take difference between scores in condition A and B, then take the average of these differences

Within-subjects design is preferable as less chance of confounding factors and fewer people are needed




Carry-over effect - difficult to unlearn or forget 

Order effect - effect of presentation of conditions one after the other

Counterbalancing - 

Half of participants first in Condition A, then Condition B

Other half vice versa


Hypothesis testing

- In statistics it is agreed that if the probability of obtaining our sample mean (if null hypothesis is true) is less than 5%, it is so unlikely that the population mean is 0 that we can reject our null hypothesis

Therefore can accept our experimental hypothesis


Independent t-test

Two types of t-test

Independent = Between-subjects design

Paired = Within-subjects design

think independent and paired groups

- t-value is used as a way of standardising original scores (e.g. mean difference) in order to compare them

If our t-value is bigger than our critical t-value the mean difference will fall outside confidence interval for null hypothesis - therefore can reject it
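
A minimal sketch of the independent t-test calculation, assuming equal variances (pooled variance). The two groups are invented for illustration. The critical value 2.306 is the standard two-tailed table value for df = 8 at the 5% level and would need to be looked up again for any other df.

```python
import math
from statistics import mean, variance

# Hypothetical scores for two independent groups
group_a = [5, 6, 7, 8, 9]
group_b = [1, 2, 3, 4, 5]

n_a, n_b = len(group_a), len(group_b)
df = n_a + n_b - 2  # deduct one from each group

# Pool the two sample variances, weighted by each group's df
pooled_variance = (
    (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
) / df

# Standard error of the difference between the two means
standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))

# t-value = mean difference standardised by its standard error
t_value = (mean(group_a) - mean(group_b)) / standard_error

# Critical t for df = 8, two-tailed, 5% level (from a t-table)
critical_t = 2.306
reject_null = abs(t_value) > critical_t

print(t_value, df, reject_null)
```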



Assumptions of independent t-test

  • That data is approx. normally distributed (if not, use a non-parametric test) 
  • Spread of scores (SD) is relatively equal - if not use 'equal variances not assumed'
  • equal variances are assumed if Levene's test significance is larger than 0.05; if it is smaller than 0.05, use 'equal variances not assumed'

Reporting Independent T-tests

1. IV and DV - which groups did you compare on what measure

2. Means and SD

3. Whether difference between groups is significant: t(df) = (t-value), p = (p-value)

4. Interpretation of result if significant - e.g. can reject null hypothesis and assume... 


Degrees of freedom (df)

df = number of scores that are free to vary 


  • one person's salary (mean salary = 30k)
  • their salary must be 30k 
  • df = 0 because no freedom to vary


  • two salaries, mean = 30k 
  • can freely choose one person's salary, other must be adjusted to make 30k 
  • e.g. one is 20k, other must be 40k to bring mean to 30
  • df = 1

typically df = number of people (n) - 1

because independent t-test is two groups, we must deduct one from each group (df = n1 + n2 - 2)


Type 1 and type 2 errors

Type 1 - Reject null hypothesis but there is no effect (false positive)

Type 2 - Do not reject null hypothesis but should have (false negative)


Type 1 - false positive

Type 2 - false negative


Paired t-test

- Within groups, differences design, takes individual differences into account

- t-value must be larger than critical value for null hypothesis to be rejected

Assumptions - the data is normally distributed (if not - non-parametric tests) 
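
A minimal sketch of the paired t-test calculation: take each person's difference score, then standardise the mean difference by its standard error. The before and after scores are invented for illustration.

```python
import math
from statistics import mean, stdev

# Hypothetical scores for the same five people in two conditions
before = [10, 12, 14, 16, 18]
after = [12, 15, 15, 19, 21]

# One difference score per person (takes individual differences into account)
differences = [a - b for a, b in zip(after, before)]

n = len(differences)
mean_difference = mean(differences)
standard_error = stdev(differences) / math.sqrt(n)

t_value = mean_difference / standard_error
df = n - 1  # one group of difference scores, so n - 1

print(t_value, df)
```

The resulting t-value is then compared with the critical t-value for that df, as described above.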



Correlation

- Occurs in relationship design, when DV AND IV are continuous

- Correlation does not show causality - it measures a linear relationship

Positive correlation:

  • when score on variable A increases, score on B also increases

Negative correlation:

  • When Variable A increases, Variable B decreases


- on a scatterplot, each dot = one person; each person has a score on the x and y axis


Strength of correlation

  • The more data resembles a straight line = stronger correlation
  • Closer correlation is to 1/-1 the stronger it is


Very strong - 0.7-0.99 (or negative)

Strong - 0.4-0.69

Moderate - 0.3 - 0.39

Weak - 0.01 - 0.29
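
A short sketch that computes Pearson's r by hand and labels its strength using the bands listed above. The data are invented, and the cut-offs simply treat each band's lower bound as the threshold, which is an assumption about how values such as 0.695 should be handled.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


def strength_label(r):
    """Label strength from the size of r, ignoring its sign."""
    size = abs(r)
    if size >= 0.7:
        return "very strong"
    if size >= 0.4:
        return "strong"
    if size >= 0.3:
        return "moderate"
    if size > 0:
        return "weak"
    return "none"


# Hypothetical scores: each position is one person's score on A and B
variable_a = [1, 2, 3, 4, 5]
variable_b = [2, 4, 5, 4, 5]

r = pearson_r(variable_a, variable_b)
label = strength_label(r)
print(round(r, 4), label)
```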


Significance testing

df = sample size - 2


Assumptions:

  • Both variables approx. normally distributed
  • Linear relationship
  • Check for extreme outliers

- In the SPSS correlation table, a value of 1 is the correlation of one variable with itself - only need to look at one corner

- Pearson correlation (coefficient r) = strength of correlation, between -1 and 1 

- if p is below 0.05, a correlation this strong would be unlikely if the two variables were NOT correlated, so the correlation is significant

Reporting results:

Results show a ...... correlation between ... and ... r(df) = ..., p = .... Therefore... 

If there is no correlation - don't write down direction


Shared Variance (R²)

- Tells us the proportion of variance shared between two variables, option in between correlation and regression (attributing causality)

  • If r = 1, the spread of y is entirely explained by spread in x, and vice versa
  • Formula = r squared (x by 100 for percentage) 


- association can be non-linear 

- even if r value is low, can be quadratic, important to plot data
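
A small sketch of both points: shared variance as r squared times 100, and a perfectly quadratic relationship whose r is nevertheless 0. The data are invented, and `pearson_r` is a hand-written helper rather than part of any package.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


def shared_variance_percent(r):
    """Shared variance: r squared, multiplied by 100 for a percentage."""
    return r ** 2 * 100


# r = 0.5 means only a quarter of the variance is shared
quarter = shared_variance_percent(0.5)

# y depends entirely on x, but the relationship is quadratic,
# so the linear correlation comes out as zero
x = [-2, -1, 0, 1, 2]
y_quadratic = [value ** 2 for value in x]
r_quadratic = pearson_r(x, y_quadratic)

print(quarter, r_quadratic)
```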


Chi Squared

For relationship data when the IV and DV are categorical

  • e.g. relationship between gender and autism
  • This leads us to make inferences about likelihood e.g. how likely is it that a boy has autism
  • A significant chi-squared result means that belonging to a specific category of the first variable affects your likelihood of belonging to a category of the second variable

- may take percentages instead of actual figures if there are different group sizes

- compare observed values to expected values (what we'd expect figures to be if no relationship, if values were divided proportionally in table) - if observed values are very different from expected values, this suggests a relationship

Chi-squared calculates differences between observed and expected values of each cell, larger chi-squared value = more likely to reject null hypothesis


Degrees of freedom (df)

df = (number of rows -1) x (number of columns -1)
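
A minimal sketch of the chi-squared calculation for a 2 x 2 table. The observed counts are invented. Expected values are worked out as row total x column total / overall total, which is the standard formula for Pearson's chi-squared rather than something spelled out in these notes.

```python
# Hypothetical observed counts: rows = first variable, columns = second
observed = [
    [10, 20],
    [20, 10],
]

row_totals = [sum(row) for row in observed]
column_totals = [sum(column) for column in zip(*observed)]
total = sum(row_totals)

# Expected count for each cell if there were no relationship
expected = [
    [row_total * column_total / total for column_total in column_totals]
    for row_total in row_totals
]

# Sum over all cells of (observed - expected) squared / expected
chi_squared = sum(
    (o - e) ** 2 / e
    for observed_row, expected_row in zip(observed, expected)
    for o, e in zip(observed_row, expected_row)
)

# df = (number of rows - 1) x (number of columns - 1)
df = (len(observed) - 1) * (len(observed[0]) - 1)

print(expected, chi_squared, df)
```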


Assumptions and reporting results

  • No more than 25% of cells should have expected values of less than 5
  • no individual cell should have expected value less than 1

Reporting results:

e.g. In this sample ...% of boys had conduct disorder compared to ...% of girls. A Pearson's chi-square test shows a significant relationship, χ²(df, n = ...) = ..., p = ...

Suggests that boys are more likely to have conduct disorder than girls. 


Non-parametric tests


  • used when data is not normally distributed
  • uses ranks
  • less powerful


Parametric tests:

  • used when data is normally distributed
  • uses actual scores
  • more powerful

Testing normal distribution


Histogram:

- can check visually for normal distribution

- does it look like a normal distribution? Unlikely you will ever get a perfect one 

- Keep in mind sample size - small sample may not look like normal distribution 

Q-Q plot:  

- if data normally distributed, points would fall on a straight line based on observed and expected values


Shapiro-Wilk test:

- if Shapiro-Wilk is not significant (p > 0.05) = data can be treated as normally distributed



Ranking data

- Disregard actual value of score and just focus on its relation to other scores

- Median is most appropriate descriptive statistic to use; when the middle falls between two values, take their mean, e.g. if the middle two values are 4 and 5, median = 4.5



Mann-Whitney U test

Independent t-test - compares means of two independent groups

Mann-Whitney - compares mean ranks of two independent groups

- we average rankings of same scores, then calculate mean rank for each group
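
The ranking step can be sketched as below: all scores from both groups are ranked together, tied scores share the average of their ranks, and a mean rank is taken per group. The scores and the `average_ranks` helper are invented for illustration, and the sketch stops before calculating U itself.

```python
def average_ranks(values):
    """Rank values from lowest (1) upwards; tied values share their mean rank."""
    ordered = sorted(values)
    rank_of = {}
    for value in set(ordered):
        first = ordered.index(value) + 1          # rank of first occurrence
        count = ordered.count(value)              # how many scores are tied
        rank_of[value] = first + (count - 1) / 2  # mean of the tied ranks
    return [rank_of[value] for value in values]


# Hypothetical liking scores for two independent groups
group_1 = [3, 5, 7]
group_2 = [1, 3, 4]

# Rank everyone together, then split the ranks back into groups
ranks = average_ranks(group_1 + group_2)
ranks_1 = ranks[:len(group_1)]
ranks_2 = ranks[len(group_1):]

mean_rank_1 = sum(ranks_1) / len(ranks_1)
mean_rank_2 = sum(ranks_2) / len(ranks_2)

print(ranks, mean_rank_1, mean_rank_2)
```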

Reporting results: 

Brand one was given a higher average liking (Median = ...) compared to brand two (Median = ...). A Mann-Whitney U test showed brand one was liked significantly more (U = ..., p = ...)



Wilcoxon signed-rank test

Parametric test (paired t-test):

  • compare means of two paired groups (within subjects)


Non-parametric test (Wilcoxon):

  • Compare mean ranks of those who did better in condition 1 to mean ranks of those who did better in condition 2

- calculate before and after difference and make them all positive values, ignore any ties (no difference between before and after scores), then rank 

- average duplicates
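
A sketch of the ranking steps for paired data: take the before and after differences, drop ties, rank the absolute differences with duplicates averaged, then total the ranks separately for people who improved and people who got worse. The scores and the `average_ranks` helper are invented for illustration, and the sketch stops before converting the rank sums into Z.

```python
def average_ranks(values):
    """Rank values from lowest (1) upwards; tied values share their mean rank."""
    ordered = sorted(values)
    rank_of = {}
    for value in set(ordered):
        first = ordered.index(value) + 1          # rank of first occurrence
        count = ordered.count(value)              # how many values are tied
        rank_of[value] = first + (count - 1) / 2  # mean of the tied ranks
    return [rank_of[value] for value in values]


# Hypothetical scores for the same five people before and after
before = [10, 12, 9, 15, 11]
after = [12, 12, 8, 19, 13]

differences = [a - b for a, b in zip(after, before)]

# Ignore ties (no difference between before and after)
non_zero = [d for d in differences if d != 0]

# Rank the differences as positive values, averaging duplicates
ranks = average_ranks([abs(d) for d in non_zero])

# Total the ranks separately by direction of change
positive_rank_sum = sum(r for d, r in zip(non_zero, ranks) if d > 0)
negative_rank_sum = sum(r for d, r in zip(non_zero, ranks) if d < 0)

print(ranks, positive_rank_sum, negative_rank_sum)
```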

Reporting Results:

Significant difference between before (Median = ...) and after (Median = ...), Z = ..., p = ...



Spearman's Rho


Pearson's r:

  • correlation between exact scores of two continuous variables


Spearman's rho:

  • Correlation between ranked scores of continuous variables

- again only look at top right box in SPSS output
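
A minimal sketch of the difference between the two coefficients: Spearman's rho is computed here as Pearson's r applied to the ranked scores. The data and both helper functions are invented for illustration.

```python
import math


def average_ranks(values):
    """Rank values from lowest (1) upwards; tied values share their mean rank."""
    ordered = sorted(values)
    rank_of = {}
    for value in set(ordered):
        first = ordered.index(value) + 1
        count = ordered.count(value)
        rank_of[value] = first + (count - 1) / 2
    return [rank_of[value] for value in values]


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


# Hypothetical scores: y always rises with x, but not in a straight line
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

pearson = pearson_r(x, y)                                  # exact scores
spearman = pearson_r(average_ranks(x), average_ranks(y))   # ranked scores

print(pearson, spearman)
```

Because the ranks line up perfectly, rho is 1 even though the relationship between the exact scores is not perfectly linear.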

Reporting results:

Spearman's correlation showed a ... relationship between ... and ..., rs = ..., p = ...

This shows that... 
