Distributions, Means and Variance

  • Created by: rosieevie
  • Created on: 06-01-18 18:02

Statistics

Three types of stats:

  • Descriptive - deal w/ pattern and properties of data
    • Frequently plotted to aid data interpretation
    • Necessary for making decisions on inferential statistics
    • Organising, summarizing, describing
  • Inferential - methodology to test null hypothesis
    • Evidence-based analysis - make statements and assumptions about population from sample
    • Generalizing
  • Correlational - relationships
    • Correlation does not mean causation

Descriptive Statistics

Uses summaries - simple but useful as they can raise questions that can be turned into a research hypothesis

Describe what data looks like and look at trends

Mode - most frequently occurring score

  • Quick and easy
  • Unaffected by extreme values
  • Only useful with discrete data
  • Can be strongly affected by a non-representative subgroup

Median - middle value

  • Unaffected by extreme values
  • Not restricted to discrete data
  • Only considers order, values are ignored
  • Can change median by changing data structure

Mean - arithmetic average, see Means below
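
The three measures can be compared on a small data set. This is a minimal sketch using Python's standard `statistics` module; the scores are made-up values chosen to include one extreme observation, not data from these notes.

```python
import statistics

# Hypothetical scores; 40 is a deliberately extreme value.
scores = [2, 3, 3, 4, 5, 6, 40]

mode = statistics.mode(scores)      # most frequently occurring score
median = statistics.median(scores)  # middle value once sorted
mean = statistics.mean(scores)      # arithmetic average

# Same data with the extreme value dropped, for comparison.
median_without_extreme = statistics.median(scores[:-1])
mean_without_extreme = statistics.mean(scores[:-1])

print("mode:", mode, "median:", median, "mean:", mean)
print("without extreme value - median:", median_without_extreme,
      "mean:", round(mean_without_extreme, 2))
```

Dropping the extreme value moves the mean far more than the median, and leaves the mode untouched.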

Means

  • Very sensitive measure - considers all information
  • Group means can be combined (weighted by group size) to give overall mean
  • Very sensitive to extreme values
  • Not useful if data are not normally distributed
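
Combining group means only gives the overall mean if each group mean is weighted by its group size. A minimal sketch, where `combined_mean` is a hypothetical helper and both groups are made-up values:

```python
def combined_mean(group_means, group_sizes):
    """Overall mean from group means, weighting each by its group size."""
    total_n = sum(group_sizes)
    weighted_sum = sum(m * n for m, n in zip(group_means, group_sizes))
    return weighted_sum / total_n

# Hypothetical groups of unequal size.
group_a = [4, 6, 8]   # mean 6
group_b = [10, 12]    # mean 11

means = [sum(g) / len(g) for g in (group_a, group_b)]
sizes = [len(g) for g in (group_a, group_b)]

overall = combined_mean(means, sizes)
direct = sum(group_a + group_b) / len(group_a + group_b)

print(overall, direct)
```

Simply averaging the two group means (6 and 11) would give 8.5, which differs from the true overall mean because the groups are different sizes.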

Notation of population means:

  • X or Y = individual values (observations)
  • N = no. individuals
  • μ (or X̄) = mean

Notation of sample means:

  • x or y = individual values (observations)
  • n = no. individuals
  • x̄ (x-bar) = mean

Measures of Variability

Range - difference between smallest and largest observation

  • Uses two data points - limited use and dependent on outliers

Variance/standard deviation = better measure of variability

Variability - deals with departure from mean (deviation)

  • Longer deviation lines (from each point to the mean) on graph = more variability
  • Use sum of squares (SS) to avoid issues with negative departures
  • SS = Σ(y - ȳ)²
  • SS dependent on sample size so divide by no. samples for mean squared deviation - (Σ(y - ȳ)²)/n

To calculate variance we divide SS by degrees of freedom (n - 1); take the square root of the variance to get standard deviation (SD)

SD gives an idea of how tightly data points cluster around the mean (smaller SD = more consistent data)
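
The steps from sum of squares to standard deviation can be worked through in code. This is a sketch on made-up values; it assumes the sample variance uses n - 1 degrees of freedom, and checks the result against Python's `statistics` module.

```python
import math
import statistics

# Hypothetical sample.
y = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(y)
y_bar = sum(y) / n                       # sample mean

# Sum of squares: squaring stops negative and positive departures cancelling.
ss = sum((v - y_bar) ** 2 for v in y)

mean_sq_dev = ss / n                     # mean squared deviation
variance = ss / (n - 1)                  # SS divided by degrees of freedom
sd = math.sqrt(variance)                 # SD is the square root of the variance

print("SS:", ss, "variance:", round(variance, 3), "SD:", round(sd, 3))
```

`statistics.variance(y)` and `statistics.stdev(y)` should return the same variance and SD, since they also divide by n - 1.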

Distribution Patterns

Negatively skewed (long tail to left) - mode highest value, mean lowest value, median in between

Normal - mean, median and mode all same value

Positively skewed (long tail to right) - mode lowest value, mean highest value, median in between

Normality of distribution determines which stats test is used
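
The ordering of mode, median and mean under skew can be checked on small examples. A minimal sketch; both data sets are made up, and `summary` is a hypothetical helper.

```python
import statistics

def summary(data):
    """Return (mode, median, mean) for a list of numbers."""
    return (statistics.mode(data),
            statistics.median(data),
            statistics.mean(data))

# Hypothetical data with a long tail to the right (positive skew).
positively_skewed = [1, 1, 1, 2, 2, 3, 4, 10]
# Hypothetical data with a long tail to the left (negative skew).
negatively_skewed = [0, 6, 7, 8, 8, 9, 9, 9]

pos = summary(positively_skewed)
neg = summary(negatively_skewed)

print("positive skew (mode, median, mean):", pos)
print("negative skew (mode, median, mean):", neg)
```

In the positively skewed set the mean is pulled towards the long right tail; in the negatively skewed set it is pulled to the left.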

Normal (Gaussian) Distributions

Displayed graphically where x-axis (or z-axis) = measured variable and y-axis = frequency

Central limit theorem - means of large samples follow a normal distribution around the population mean, whatever the shape of the population; many natural populations are also approximately normal

Normal distributions important - determine which measure of central tendency (mean/mode/median) and which measure of variability to use

Determine further choices of statistical analysis - parametric or non-parametric

Calculate mean and SD from z-axis

Normally distributed sample = about 68% of data falls w/in 1 SD of mean (narrower curve = lower variability)
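
The share of a normal distribution lying close to the mean can be computed rather than memorised. A small sketch using `statistics.NormalDist` from the Python standard library (Python 3.8+), on the standard normal curve with mean 0 and SD 1:

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, SD 1.
std_normal = NormalDist(mu=0, sigma=1)

# Proportion of the distribution within 1 SD and within 2 SD of the mean.
within_1sd = std_normal.cdf(1) - std_normal.cdf(-1)
within_2sd = std_normal.cdf(2) - std_normal.cdf(-2)

print(round(within_1sd, 4), round(within_2sd, 4))
```

The same proportions hold for any normal distribution, because they depend only on distance from the mean measured in SDs.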

ALWAYS TEST NORMALITY FIRST BEFORE OTHER TESTS

95% Confidence Intervals

For a normal distribution, 95% of all data points lie w/in 1.96 SD of mean

Determines if population is 'natural' or not

Useful - tells probability that a certain value belongs to a population with a certain mean
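
The 1.96 figure and the resulting 95% range can be reproduced with the standard library. This is a sketch only: `interval_95` is a hypothetical helper, and the mean of 50 and SD of 5 are made-up values.

```python
from statistics import NormalDist

# z value that leaves 2.5% in each tail of a normal distribution.
z = NormalDist().inv_cdf(0.975)

def interval_95(mean, sd):
    """Range expected to hold 95% of values from a normal population."""
    return mean - z * sd, mean + z * sd

# Hypothetical population mean and SD.
low, high = interval_95(mean=50.0, sd=5.0)

def inside(value):
    """True if the value falls within the 95% range."""
    return low <= value <= high

print(round(z, 2), round(low, 1), round(high, 1))
print(inside(55), inside(62))
```

A value outside the range is not impossible for that population, only unlikely (it would occur less than 5% of the time).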

Binomial Distributions

= discrete distribution of no. successes in a fixed no. trials (distributed to one side when p is far from 0.5)

  • Each of n trials has two possible outcomes (success/failure)
  • Each trial has same probability of success (p)
  • Probability of failure is 1 - p
  • n trials are independent
  • Mean = np

Binomial random variable X = number of successes in n trials

  • Approximates normal distribution when n is large
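
The binomial probabilities and the mean = np property can be checked directly. A minimal sketch using the standard binomial formula; `binomial_pmf` is a hypothetical helper, and n = 10, p = 0.3 are made-up values.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

# Hypothetical example: 10 trials, 30% chance of success each.
n, p = 10, 0.3
probs = [binomial_pmf(k, n, p) for k in range(n + 1)]

# Mean of the distribution, computed from the probabilities.
mean = sum(k * pr for k, pr in enumerate(probs))

print(round(sum(probs), 6), round(mean, 6), n * p)
```

With p = 0.3 the probabilities pile up at low numbers of successes; trying p = 0.5 gives a symmetric shape.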

Poisson Distribution

Occurs in data w/ many 0 values = non-normal distribution

Applies when:

  • Event is something that can be counted in whole numbers
  • Occurrences are independent
  • Average frequency of occurrence in time period is known and possible to count how many events have occurred
  • Mean is same as variance

Used as model for number of events in given time period or sampling unit e.g. fleas on a rat
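
The mean = variance property can be verified numerically from the Poisson formula. A sketch under stated assumptions: `poisson_pmf` is a hypothetical helper, the average of 2 events is a made-up value, and the sum is cut off at 60 events because the probabilities beyond that are negligible.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k events when the average count is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

# Hypothetical average count, e.g. 2 fleas per rat.
lam = 2.0
counts = range(60)   # truncated: probabilities beyond this are negligible
probs = [poisson_pmf(k, lam) for k in counts]

mean = sum(k * pr for k, pr in zip(counts, probs))
variance = sum((k - mean) ** 2 * pr for k, pr in zip(counts, probs))

print(round(probs[0], 3), round(mean, 6), round(variance, 6))
```

With a low average count, a sizeable share of observations are zeros, which is why such data do not look normal.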
