Choice of Tests

  • Created by: rosieevie
  • Created on: 08-01-18 21:52

The Process of Science

  • Observe the world
  • Develop theory, run exploratory simulations (develop hypotheses)
  • Define statistical models to test theoretical predictions
  • Run experiments
  • Run statistical models
  • Draw conclusions

Science needs breakthroughs at all these levels


Why do we need Statistics?

Need to infer existence of predictable patterns in space/time, based on samples from real world and given assumptions about nature of data

With stats you can interpret biological meaning, identify associations/patterns and infer causes of variation

With careless use of stats you will mess up data, confuse biological understanding, fail to see patterns and identify wrong causes

Data on behaviour, ecology and evolution need statistical treatment because variation occurs and individuals are inherently unpredictable

Simplest way to identify patterns and make predictions is to convincingly reject the null hypothesis that everything is random
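
One way to picture rejecting a null hypothesis of randomness is an exact permutation test. This is a minimal sketch, not a method named in these notes. The function name `permutation_p_value` and both data sets are hypothetical, invented for illustration. If group labels were assigned at random, every relabelling of the pooled values would be equally likely, so P is the share of relabellings at least as extreme as the observed one.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference between group means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)
    extreme = 0
    count = 0
    # Enumerate every way of relabelling the pooled values into two groups.
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        # Small tolerance guards against floating-point rounding.
        if diff >= observed - 1e-12:
            extreme += 1
        count += 1
    return extreme / count


# Hypothetical measurements, invented purely for illustration.
control = [4.1, 3.8, 4.4, 4.0, 3.9]
treated = [5.2, 5.6, 5.1, 5.9, 5.4]
p = permutation_p_value(control, treated)
print(f"P = {p:.4f}")
```

The two made-up groups do not overlap, so only 2 of the 252 possible relabellings are as extreme as the observed one, and the null hypothesis of randomness would be rejected at P<0.05.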


Biological Knowledge and Contradictory Issues - TB

How best to reduce the prevalence of TB in cattle - disease hard to identify and causes farmers to kill entire herds to prevent spread

  • Badgers sustain endemic infection of TB
  • Transmit it to cattle
  • 1975-1997 >20,000 badgers were culled on ad-hoc basis as part of British TB control policy = conflict between conservation and farmer groups

Nationwide experiment was completed to establish cause of TB outbreaks


TB and Badgers - 'Dry' Experiments

Simulation of alternatives to reactive culling conducted - determine if best method of prevention was extermination or vaccination of badgers

Effects on regional prevalence of TB in badgers following vaccinations of badgers - predicted to reduce herd breakdowns of TB

  • Cheap
  • Non-invasive
  • No oral vaccine yet available
  • Slow to take effect

Reactive culling is already operational:

  • Immediate effect
  • Popular w/ farmers
  • Expensive
  • Invasive
  • Untested consequences

TB and Badgers - 'Wet' Krebs Experiments

Suggested by John Krebs - chairman of the Food Standards Agency

Investigate if badgers are reservoir for TB

National-scale field manipulation of badger numbers - test H1 = badgers cause TB in cattle

Reactive and no culling trial areas superimposed over 1998 testing intervals for cattle in areas with different incidences of TB (annual testing throughout the trial)

Experimental manipulation - remove badgers in replicate regions and compare TB incidence to control sites

Model - TB = Region + Treatment + Region*Treatment

Threshold of probability was P<0.05

F-statistic of ANOVA used
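
The model above has the shape of a two-way ANOVA with an interaction term. The sketch below shows how the F-statistics for such a model can be computed in a balanced design. It is not the trial's actual analysis: the function name `two_way_anova`, the variable `tb_incidence` and every number in it are hypothetical, and both factors are treated as fixed, which is an assumption.

```python
from statistics import mean


def two_way_anova(cells):
    """Balanced two-way ANOVA with interaction (both factors fixed).

    cells[i][j] holds the replicate observations for level i of factor A
    (e.g. region) and level j of factor B (e.g. treatment). Every cell
    must contain the same number of replicates, and at least two.
    """
    a, b, n = len(cells), len(cells[0]), len(cells[0][0])
    grand = mean(y for row in cells for cell in row for y in cell)
    mean_a = [mean(y for cell in row for y in cell) for row in cells]
    mean_b = [mean(y for row in cells for y in row[j]) for j in range(b)]
    mean_ab = [[mean(cell) for cell in row] for row in cells]

    ss = {
        "A": b * n * sum((m - grand) ** 2 for m in mean_a),
        "B": a * n * sum((m - grand) ** 2 for m in mean_b),
        "AxB": n * sum(
            (mean_ab[i][j] - mean_a[i] - mean_b[j] + grand) ** 2
            for i in range(a) for j in range(b)
        ),
        "error": sum(
            (y - mean_ab[i][j]) ** 2
            for i in range(a) for j in range(b) for y in cells[i][j]
        ),
    }
    df = {"A": a - 1, "B": b - 1, "AxB": (a - 1) * (b - 1),
          "error": a * b * (n - 1)}
    ms_error = ss["error"] / df["error"]
    # Each F is the effect mean square divided by the error mean square.
    f = {k: (ss[k] / df[k]) / ms_error for k in ("A", "B", "AxB")}
    return ss, df, f


# Hypothetical TB incidence: 3 regions x 2 treatments (no cull, cull) x 3 replicates.
tb_incidence = [
    [[8.0, 9.0, 7.5], [10.0, 11.5, 10.5]],
    [[5.0, 4.5, 6.0], [7.0, 6.5, 8.0]],
    [[12.0, 11.0, 13.0], [13.5, 15.0, 14.0]],
]
ss, df, f = two_way_anova(tb_incidence)
for term in ("A", "B", "AxB"):
    print(term, "F =", round(f[term], 2), "on", df[term], "and", df["error"], "d.f.")
```

Each F would then be compared with the critical value of the F distribution for its degrees of freedom at the chosen threshold (here P<0.05).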


TB and Badgers - Results

Significant effect of culling despite differences between regions in TB incidence

Culling increases incidence

  • Destabilisation of territorial grounds surrounding culled sites
  • Creates vacuum in social groups
  • Prompts long-distance movement of surviving badgers
  • Outbreaks caused often by individual badgers entering cattle housing - better husbandry techniques required

Choosing Statistical Tests

3 main kinds of test:

  • 1 sample of frequencies divided into classes
    • Chi-squared or G-test
    • Test for goodness of fit to theoretical distribution
    • Test for independence between two categories in a contingency table
  • 1 sample measuring two variables on continuous scales
    • Correlation or regression
    • Test for association between variables
    • Test for cause and effect between variables
  • Two or more samples measuring a continuous response
    • ANOVA
    • Used when taking several samples which allows testing for causes of variation against different alternatives
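
As an illustration of the first kind of test, here is a minimal sketch of a chi-squared test for independence in a contingency table. The function name `chi_squared_independence` and the counts are hypothetical, and no continuity correction is applied. The critical value 3.841 is the standard chi-squared threshold for 1 degree of freedom at P = 0.05.

```python
def chi_squared_independence(table):
    """Pearson chi-squared statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Hypothetical counts: rows = two habitats, columns = species present / absent.
counts = [[30, 10], [15, 25]]
chi2, df = chi_squared_independence(counts)

CRITICAL_05_DF1 = 3.841  # chi-squared critical value, 1 d.f., P = 0.05
print(f"chi-squared = {chi2:.3f}, d.f. = {df}")
print("reject independence" if chi2 > CRITICAL_05_DF1 else "cannot reject independence")
```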

Non-Parametric or Parametric Statistics

Non-parametric tests robust - rough but reliable estimate, work on data with unknown underlying distribution

Parametric in preference - more powerful and versatile

Limitations of non-parametric statistics:

  • Test hypotheses but don't always give estimates for parameters of interest
  • Cannot test two-way interactions, or categorical combined with continuous effects
  • Each test works in a different way, with its own quirks and no grand scheme
  • Not always available for situations of moderate complexity

Advantages of parametric statistics:

  • More powerful - use actual data
  • Flexible - cope with incomplete data and correlated effects
  • Test two-way interactions and categorical combined with continuous effects
  • Built around single theme - ANOVA = grand scheme and single framework
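
The contrast between "robust" and "use actual data" can be seen in a small sketch. It uses the Mann-Whitney U statistic, which is equivalent to the Wilcoxon rank-sum statistic and depends only on the ordering of the values. The function name `mann_whitney_u` and all three samples are hypothetical.

```python
from statistics import mean


def mann_whitney_u(sample_a, sample_b):
    """U for sample_a: count of (a, b) pairs with a > b; ties count 0.5."""
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Hypothetical samples; the second version of b has one extreme value.
a = [2.1, 2.5, 2.8, 3.0, 3.3]
b = [3.1, 3.6, 3.9, 4.2, 4.8]
b_outlier = [3.1, 3.6, 3.9, 4.2, 48.0]

# The rank-based statistic ignores how extreme the largest value is...
print(mann_whitney_u(a, b), mann_whitney_u(a, b_outlier))
# ...whereas the difference in means, which uses the actual data, shifts a lot.
print(mean(b) - mean(a), mean(b_outlier) - mean(a))
```

The rank-based statistic is identical for both versions of the second sample, while the difference in means changes substantially. This is the trade-off listed above: robustness on one side, use of the full information in the data on the other.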

Wilcoxon and Fisher

Frank Wilcoxon 1892-1965

  • Invented Wilcoxon test
  • Contributed to pyrethrin-based insecticide development
  • Wanted simple and easier ways to test insecticide effectiveness
  • Great human - greatness from diversity

Sir Ronald Fisher 1890-1962

  • Invented experimental design and ANOVA
  • Prof of Eugenics at UCL
  • Fascist tendencies but terrible eyesight prevented him acting on them
  • Founded modern synthesis of Mendelian genetics with Darwinian evolution
  • Great statistician - greatness from unity

Analysis of Variance

ANOVA uses statistical models, simplest of which is

Y = X + ε

Variation along y-axis (Y) | Accounted for by (=) | Variation along x-axis (X) | Plus (+) | Residual variation (ε)


Hypothesis

Use statistics to fit models to data

  • Data are sacrosanct (sacred) - never fit data to models
  • Do compare alternative models to test competing statistical hypotheses

Reject the null model (H0) in favour of a test model - need a refutable null model and interesting test model

  • Null model = no true pattern
  • Hypotheses concern truth, not significance

Accept statistical model only on basis of rejecting simpler alternative w/ some acceptably small probability e.g. P<0.05

"A good hypothesis is a falsifiable hypothesis" - Karl Popper

Science proceeds by falsification of simple hypotheses in favour of more informative alternatives


Data Plotting

Always plot data first!

  • Need to see what data looks like - transformation potential

Can calibrate observed data from predicted data


Transforming Data

Transform data if necessary to meet parametric assumptions

  • Not cheating
  • Only use if it makes sense biologically

Example - growth can be plotted against the inverse of body weight so that the relationship is linear
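
A minimal sketch of this kind of transformation follows. It assumes, purely for illustration, that growth = 0.5 + 4 / weight. That relationship, the data and the function name `fit_line` are hypothetical and not taken from these notes. Plotted against weight the relationship is curved, but against 1/weight it is a straight line that ordinary least squares can recover.

```python
def fit_line(xs, ys):
    """Ordinary least-squares (intercept, slope) for y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical, noise-free data generated from growth = 0.5 + 4 / weight.
weights = [1.0, 2.0, 4.0, 5.0, 8.0, 10.0]
growth = [0.5 + 4.0 / w for w in weights]

# Transform the predictor, then fit a straight line to the transformed data.
inverse_weights = [1.0 / w for w in weights]
intercept, slope = fit_line(inverse_weights, growth)
print(f"growth = {intercept:.2f} + {slope:.2f} * (1 / weight)")
```

With real data the fit would not be exact, and a plot of the residuals would show whether the transformed relationship is acceptably linear.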

Use correct tools for job:

  • Excel = data management
  • R = statistics
    • R - language and environment for statistical computing and graphics

Why do we need stats?

  • Apply statistics to samples to predict populations of a distributed variable
