Normality Testing, Distribution Fitting and Contingency Tables

  • Created by: rosieevie
  • Created on: 08-01-18 13:25

Distribution Testing

Normality testing and goodness-of-fit testing both test data against distributions

Several specific tests have been developed - conceptually they do not differ, as all test experimental data against a theoretically expected distribution


Testing for Normality

Parametric tests assume normally distributed data and equal variances

  • Small sample size is often the reason for non-normal distributions

Compare data to normal distribution to determine what test to use:

  • H1 - data differs from normal (non-parametric test)
  • H0 - no difference between data and normal (parametric test)

Several (non-identical) tests to test normality:

  • Kolmogorov-Smirnov - simplest and most commonly used
    • Compares two sets of data to determine whether they come from same distribution
    • One-sample - compares experimental data with expected distributions
    • Two-sample - compares two sets of experimental data to determine similarity
  • Anderson-Darling - more powerful and gives more weight to tails
  • Shapiro-Wilk

Which test to use depends on data = prior research necessary

Often good idea to use visual fit of data to confirm test results
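
As a rough illustration of the one-sample Kolmogorov-Smirnov idea above, the sketch below computes the D statistic (the largest gap between the data's empirical cumulative distribution and a fitted normal curve) using only the Python standard library. The data sets, the function names and the 1.36 / sqrt(n) cutoff are assumptions made for this example and are not part of the original notes; in practice a statistics package would normally report a p-value directly.

```python
from math import log, sqrt
from statistics import NormalDist, mean, stdev


def ks_statistic(sample, dist):
    """Largest vertical gap between the sample's empirical CDF and dist's CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        cdf = dist.cdf(x)
        # The empirical CDF steps from (i - 1)/n up to i/n at x, so check both sides.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d


def ks_normality_check(sample):
    """Return (D, cutoff) for a normal curve fitted to the sample."""
    fitted = NormalDist(mean(sample), stdev(sample))
    # Assumption: 1.36 / sqrt(n) is the usual large-sample 5% cutoff for a fully
    # specified reference distribution. Estimating the mean and standard
    # deviation from the same data makes this comparison conservative.
    return ks_statistic(sample, fitted), 1.36 / sqrt(len(sample))


# Hypothetical data: 200 evenly spaced quantiles of a normal distribution
# (normal by construction) and of an exponential distribution (right-skewed).
n = 200
probs = [(i - 0.5) / n for i in range(1, n + 1)]
normal_like = [NormalDist(50, 5).inv_cdf(p) for p in probs]
skewed = [-log(1 - p) for p in probs]

for name, data in [("normal_like", normal_like), ("skewed", skewed)]:
    d, cutoff = ks_normality_check(data)
    verdict = "reject H0" if d > cutoff else "do not reject H0"
    print(f"{name}: D = {d:.3f}, cutoff = {cutoff:.3f} -> {verdict}")
```

A D value above the cutoff points towards H1 (data differs from normal); a histogram or Q-Q plot of the same data is still a useful visual check.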


Testing for Equal Variance

Used to check whether two samples have equal variances (homogeneity of variance)

Levene's test most common:

  • H1 - two variances different (non-parametric test)
  • H0 - no difference between variances (parametric test)
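
To make the idea concrete, here is a minimal sketch of Levene's W statistic written with the Python standard library. It uses absolute deviations from each group's mean; the sample values, the function name and the tabulated critical value (4.60 for F with 1 and 14 degrees of freedom at the 5% level) are assumptions chosen for this example only.

```python
from statistics import mean


def levene_statistic(*groups):
    """Levene's W, built from absolute deviations around each group's mean."""
    k = len(groups)
    total_n = sum(len(g) for g in groups)
    # z[i][j] = |x_ij - mean of group i|
    z = []
    for g in groups:
        centre = mean(g)
        z.append([abs(x - centre) for x in g])
    z_means = [mean(zi) for zi in z]
    grand_mean = sum(sum(zi) for zi in z) / total_n
    # Spread of the group-level mean deviations around the grand mean.
    between = sum(len(zi) * (zm - grand_mean) ** 2 for zi, zm in zip(z, z_means))
    # Spread of individual deviations around their own group mean.
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_means) for v in zi)
    return (total_n - k) / (k - 1) * between / within


# Hypothetical samples: similar centres, very different spreads.
narrow = [10.1, 9.9, 10.0, 10.2, 9.8, 10.1, 9.9, 10.0]
wide = [12.0, 8.0, 10.5, 7.0, 13.0, 9.0, 11.5, 6.5]

# Assumption: tabulated 5% critical value of F with (k - 1, N - k) = (1, 14) df.
F_CRIT_1_14 = 4.60

w = levene_statistic(narrow, wide)
print(f"W = {w:.2f}; variances differ at the 5% level: {w > F_CRIT_1_14}")
```

W is compared with an F distribution on (k - 1, N - k) degrees of freedom, so a large W supports H1 (variances differ).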

Goodness-of-fit Testing

Tests against any distribution - how expected differs from actual

Two main tests - chi-squared and G-test

  • Always choose one and stick with it

G-test - use where observed frequencies of various categories and expected proportions for those categories weren't derived from data themselves

Chi-squared goodness-of-fit

  • Present data in table - observed and expected frequencies for various categories
  • Ensure none of expected values <1
  • Expected values derived from a distribution e.g. Poisson, negative binomial, flat, ratio
  • H0 - observed and expected frequencies are no different
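
The steps above can be sketched in a few lines of Python. The example assumes a hypothetical 3:1 expected ratio and made-up observed counts, and compares both the chi-squared and G statistics with the tabulated 5% critical value for 1 degree of freedom (3.841); none of these numbers come from the original notes.

```python
from math import log


def chi_squared(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def g_statistic(observed, expected):
    """G = 2 * sum(observed * ln(observed / expected))."""
    # Categories with an observed count of 0 contribute nothing to the sum.
    return 2 * sum(o * log(o / e) for o, e in zip(observed, expected) if o > 0)


# Hypothetical counts for two categories, tested against a 3:1 ratio.
observed = [770, 230]
ratio = [3, 1]

total = sum(observed)
expected = [total * r / sum(ratio) for r in ratio]  # 750 and 250

# Rule from the notes: no expected value should be below 1.
if min(expected) < 1:
    raise ValueError("an expected value is below 1")

# Tabulated 5% critical value for chi-squared with (categories - 1) = 1 df.
CRITICAL_5PCT_DF1 = 3.841

x2 = chi_squared(observed, expected)
g = g_statistic(observed, expected)
print(f"X2 = {x2:.3f}, G = {g:.3f}, critical value = {CRITICAL_5PCT_DF1}")
print("reject H0" if x2 > CRITICAL_5PCT_DF1 else "do not reject H0")
```

Both statistics are compared with the same chi-squared distribution, which is why they usually lead to the same conclusion.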

Contingency Tables

Used when explanatory and response variables are categorical

In simplest scenario, explanatory and response variables both have two options

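Common contingency table tests are the X2 and G-tests

If the X2 or G statistic is greater than the critical value (i.e. the p-value is below the significance level), then there is a significant difference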

The p-value is more accurate the larger the sample size (higher numbers), so check the expected values:

  • No expected value should be <1 - otherwise categories should be combined
  • No more than 20% of expected values should be <5 - otherwise use Fisher exact test
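
A minimal sketch of how the expected values and the X2 statistic for a contingency table might be computed, together with the two expected-value checks. The tables and function names are hypothetical, and no continuity correction is applied.

```python
def expected_counts(table):
    """Expected count for each cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_squared_statistic(table):
    """X2 summed over every cell of the table (no continuity correction)."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


def check_expected(table):
    """Return (any expected value < 1, share of expected values < 5)."""
    flat = [e for row in expected_counts(table) for e in row]
    return any(e < 1 for e in flat), sum(e < 5 for e in flat) / len(flat)


# Hypothetical 2x2 table: rows = explanatory groups, columns = response.
table = [[30, 10],
         [15, 25]]

below_1, share_below_5 = check_expected(table)
if below_1:
    print("Combine categories: an expected value is below 1")
elif share_below_5 > 0.2:
    print("Use Fisher exact test: too many expected values below 5")
else:
    # Degrees of freedom = (rows - 1) * (columns - 1) = 1 for a 2x2 table,
    # so the tabulated 5% critical value is 3.841.
    x2 = chi_squared_statistic(table)
    print(f"X2 = {x2:.3f}; significant at 5%: {x2 > 3.841}")
```

With a small table such as [[3, 1], [1, 3]] every expected value is 2, so the second check fails and the Fisher exact test would be the better choice.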

Fisher exact test - calculates p-value in 2x2 table directly = more accurate with smaller sample sizes (lower numbers)

  • Uses factorials

Factorial - product of an integer and all positive integers below it e.g. 3! = 3x2x1 = 6
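
To show where the factorials come in, the sketch below computes the exact probability of a 2x2 table with cells a, b, c, d as (a+b)!(c+d)!(a+c)!(b+d)! / (a! b! c! d! n!), then sums the probabilities of all tables with the same row and column totals that are no more likely than the observed one. This is one common way to define the two-sided p-value; the example table and function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def table_probability(a, b, c, d):
    """Exact probability of one 2x2 table, given its row and column totals."""
    n = a + b + c + d
    numerator = (factorial(a + b) * factorial(c + d)
                 * factorial(a + c) * factorial(b + d))
    denominator = (factorial(a) * factorial(b) * factorial(c)
                   * factorial(d) * factorial(n))
    return Fraction(numerator, denominator)


def fisher_exact_two_sided(a, b, c, d):
    """Sum the probabilities of all tables no more likely than the observed one."""
    row1, col1, n = a + b, a + c, a + b + c + d
    observed = table_probability(a, b, c, d)
    p_value = Fraction(0)
    # x is the top-left cell; the fixed totals determine the other three cells.
    for x in range(max(0, row1 + col1 - n), min(row1, col1) + 1):
        p = table_probability(x, row1 - x, col1 - x, n - row1 - col1 + x)
        if p <= observed:
            p_value += p
    return p_value


# Hypothetical small table where expected values are too low for X2 or G.
p = fisher_exact_two_sided(3, 1, 1, 3)
print(f"p = {p} = {float(p):.4f}")
```

Using Fraction keeps the arithmetic exact, which avoids rounding problems when comparing table probabilities.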

