# Experimental design and analysis

What is probability?
How likely it is that an event will happen
1 of 119
How is probability expressed?
As a number between 0 and 1 (a p-value is one example)
2 of 119
What are the minimum and maximum values a probability can take?
0 and 1
3 of 119
What does the sum of the probabilities of all possible outcomes equal?
1
4 of 119
How can you calculate probability?
number of times an event happens / number of repeats
5 of 119
What does the null hypothesis state?
That there is no relationship between two measurements
6 of 119
What does a small p-value suggest?
That there is strong evidence that the null hypothesis is wrong
7 of 119
What does a large p-value suggest?
That there is little evidence against the null hypothesis (this does not prove that the null hypothesis is correct)
8 of 119
What would a p-value above 0.05 suggest about the difference?
That the difference is not statistically significant at the 5% level and could plausibly be due to chance
9 of 119
When are two events mutually exclusive?
When they can't occur at the same time
10 of 119
What is conditional probability?
The probability of event A occurring given that event B has occurred
11 of 119
What is the rule of subtraction?
Finding the probability of event A occurring by calculating 1 - P(A not occurring)
12 of 119
What is the rule of multiplication?
Finding the probability of both events A and B occurring: P(A) x P(B given that A has occurred)
13 of 119
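
The rules of subtraction and multiplication above can be illustrated with a small sketch. The dice and card scenarios are hypothetical examples chosen for illustration; they do not come from the cards themselves.

```python
from fractions import Fraction

# Rule of subtraction: P(A) = 1 - P(not A).
# Hypothetical event A: at least one six in two rolls of a fair die.
p_no_six = Fraction(5, 6) * Fraction(5, 6)  # both independent rolls miss
p_at_least_one_six = 1 - p_no_six

# Rule of multiplication: P(A and B) = P(A) * P(B given A).
# Hypothetical events: drawing two aces in a row from a 52-card deck
# without replacement.
p_first_ace = Fraction(4, 52)
p_second_ace_given_first = Fraction(3, 51)
p_two_aces = p_first_ace * p_second_ace_given_first

print(p_at_least_one_six)  # 11/36
print(p_two_aces)          # 1/221
```
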
What does statistical inference mean?
Drawing conclusions about a population based on data from a sample
14 of 119
What is random sampling?
Sampling when the probability of including an individual from a population is the same for all individuals and is independent of other samples
15 of 119
What is biased sampling?
Sampling when the probability of an individual being picked to be sampled from a population is increased or decreased due to its characteristics
16 of 119
What is non-independent sampling?
Sampling when the probability of an individual being picked from a population to be sampled is dependent on other individuals from the population
17 of 119
What is a type I error?
When you reject a null hypothesis which is actually correct. You think there's a statistically significant difference when there isn't
18 of 119
When are type I errors most commonly made?
When many tests are carried out on the same data (multiple testing)
19 of 119
When would you use a Bonferroni correction?
To reduce type I errors when an experiment is testing many hypotheses
20 of 119
What does the Bonferroni correction do?
Tests each hypothesis at a stricter significance level: 0.05 / number of hypotheses
21 of 119
Why is the Bonferroni correction needed?
Because the probability of getting at least one significant result increases as the number of hypotheses increases
22 of 119
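
A minimal sketch of why the correction matters, assuming the tests are independent (an assumption made here for the arithmetic, not something the cards state). The function names are hypothetical.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level after the Bonferroni correction."""
    return alpha / n_tests


n_tests = 10
# Without correction, the chance of at least one type I error is about 40%.
print(round(familywise_error_rate(0.05, n_tests), 3))  # 0.401
# With correction, each hypothesis is tested at 0.05 / 10.
print(round(bonferroni_threshold(0.05, n_tests), 4))   # 0.005
```
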
What is a type I error also known as?
A false positive
23 of 119
What is a type II error?
When you fail to reject the null hypothesis but it's actually false. You think there's no significant difference when there is (a false negative)
24 of 119
How can you reduce the chance of making a type II error?
Ensuring your test has enough statistical power
25 of 119
What is statistical power?
The likelihood that a study will detect an effect if there is a true difference to be seen
26 of 119
What is statistical power mainly affected by?
Sample size and the size of the effect
27 of 119
What is the effect size?
How big an effect a variable has on the samples
28 of 119
Does a binomial distribution have a fixed number of repeats?
Yes
29 of 119
What are the outcomes of a binomial experiment?
Success or failure
30 of 119
What is Q in a binomial experiment?
The probability of failure (1-P)
31 of 119
What is the binomial probability?
The probability that n trials result in exactly x successes when the probability of success on each trial is P
32 of 119
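
The binomial probability described above can be sketched with the standard formula C(n, x) * P^x * Q^(n - x). The function name and the coin-toss numbers are hypothetical.

```python
from math import comb


def binomial_probability(n: int, x: int, p: float) -> float:
    """Probability of exactly x successes in n trials with success chance p."""
    q = 1 - p  # probability of failure, as defined in the cards
    return comb(n, x) * p**x * q ** (n - x)


# Hypothetical example: exactly 3 heads in 10 tosses of a fair coin.
print(binomial_probability(10, 3, 0.5))  # 0.1171875
```
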
Does a Poisson distribution have a fixed number of repeats?
No, the number of trials is not fixed (it is effectively infinite)
33 of 119
What has to be known in a Poisson experiment?
The average number of successes that occur within a specific region
34 of 119
What can a 'region' be in a Poisson distribution?
Time, length, volume
35 of 119
What is the Poisson probability?
The probability that exactly x successes will occur in a region when the mean number of successes in that region is known
36 of 119
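
A short sketch of the Poisson probability using the standard formula e^(-mean) * mean^x / x!. The function name and the example numbers are hypothetical.

```python
from math import exp, factorial


def poisson_probability(x: int, mean: float) -> float:
    """Probability of exactly x successes when the mean number is `mean`."""
    return exp(-mean) * mean**x / factorial(x)


# Hypothetical example: a mean of 2 events per hour, chance of exactly 3.
print(round(poisson_probability(3, 2.0), 4))  # 0.1804
```
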
What type of data do you use a chi-squared test for?
Nominal data that falls into particular categories
37 of 119
What does a one-factor chi-squared test do?
Compares expected values against observations of one variable
38 of 119
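
As a rough illustration of the one-factor test, the chi-squared statistic is the sum of (observed - expected)^2 / expected over the categories. The coin-toss counts below are made up for the example.

```python
def chi_squared_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 100 coin tosses, expecting 50 heads and 50 tails.
observed = [44, 56]
expected = [50, 50]
print(chi_squared_statistic(observed, expected))  # 1.44
# With 1 degree of freedom the 5% critical value is 3.84, so 1.44 would
# not be significant.
```
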
What does a two-factor chi-squared test do?
It tests two variables and identifies whether the factors are independent of each other
39 of 119
What is the null hypothesis in this test?
That there is no relationship
40 of 119
What is continuous data?
Data that are measured on a scale with units and can take any value within a range
41 of 119
What does the curve of normally distributed data depend on?
The mean and the standard deviation
42 of 119
What does the mean determine on the graph?
The location of the centre of the graph
43 of 119
What does the standard deviation determine?
The width and height of the graph
44 of 119
What is the total area under the graph equal to?
1
45 of 119
What % of the data falls within 1 standard deviation of the mean?
About 68%
46 of 119
What % of the data falls within 2 standard deviations of the mean?
About 95%
47 of 119
What % of the data falls within 3 standard deviations of the mean?
About 99.7%
48 of 119
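
The three percentages above can be checked with a small sketch. For a normal distribution, the fraction within k standard deviations of the mean is erf(k / sqrt(2)); the function name is hypothetical.

```python
from math import erf, sqrt


def fraction_within(k: float) -> float:
    """Fraction of a normal distribution within k standard deviations."""
    return erf(k / sqrt(2))


for k in (1, 2, 3):
    print(k, round(100 * fraction_within(k), 1))
# 1 68.3
# 2 95.4
# 3 99.7
```
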
What does the Shapiro-Wilk test test for?
Normality of data
49 of 119
What does it allow you to see?
50 of 119
If the P-value is below 0.05, is the null hypothesis accepted or rejected?
Rejected
51 of 119
Are the data treated as normally distributed if the P-value is more than 0.05?
Yes, there is no significant evidence that they depart from normality
52 of 119
What is the t-test used for?
To estimate the population parameters when the sample size is small or the population variance is unknown
53 of 119
What are parameters?
Numbers that summarise the data of a population, such as the mean and the standard deviation
54 of 119
Can a population distribution be directly calculated?
No, only estimated
55 of 119
What does the central limit theorem state?
That the distribution of sample means will be approximately normal as long as the sample size is large
56 of 119
If your sample size is small or you don't know the standard deviation of your population, what can you do?
Look at the distribution of the t-statistic
57 of 119
What does the standard error of the mean tell you?
It tells you how far sample means are expected to vary from the population mean
58 of 119
What is a confidence interval?
A range of estimated values that is likely to contain the population mean
59 of 119
What are confidence limits?
The upper and lower boundaries of the confidence interval
60 of 119
Is the number of standard errors used for a 95% confidence interval always the same for large samples?
Yes, 1.96, for samples of more than about 30
61 of 119
What if you have a small sample size?
The number of standard errors needs to be adjusted using the t-distribution
62 of 119
Is the number of standard errors needed for a small sample bigger or smaller than for a big sample?
Bigger
63 of 119
What is a t-value?
The difference between the sample mean and the population mean, measured in standard errors
64 of 119
What is a confidence level?
A percentage that tells you what proportion of samples you can expect to include the true population mean within their confidence intervals
65 of 119
What does a wide confidence interval suggest?
That you need a bigger sample size
66 of 119
Is it good to have a narrow confidence interval and why?
Yes because it is more precise
67 of 119
How do you calculate variance?
Squared standard deviation
68 of 119
How do you calculate the confidence interval?
The sample mean plus and minus a multiple of the standard error (1.96 x the standard error for a 95% interval with a large sample)
69 of 119
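
A minimal sketch of the calculation, assuming the standard error is the sample standard deviation divided by the square root of the sample size. The function name and measurements are hypothetical. The default multiplier of 1.96 suits large samples; the example uses 2.776, the t-table value for a 95% interval with 4 degrees of freedom, because the sample is small.

```python
from math import sqrt
from statistics import mean, stdev


def confidence_interval(sample, multiplier=1.96):
    """Return (lower, upper) confidence limits around the sample mean."""
    m = mean(sample)
    standard_error = stdev(sample) / sqrt(len(sample))
    return m - multiplier * standard_error, m + multiplier * standard_error


# Hypothetical measurements; n = 5, so use the t multiplier for 4 df.
sample = [4.8, 5.1, 5.0, 4.9, 5.2]
lower, upper = confidence_interval(sample, multiplier=2.776)
print(round(lower, 2), round(upper, 2))
```

A narrower interval for the same data would need either a larger sample or a lower confidence level.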
What does a two-sample t-test measure?
It tests whether the means of the two samples differ significantly from each other
70 of 119
What is the test statistic of this test?
t
71 of 119
When do you use ANOVA?
When you are comparing the means of more than two groups
72 of 119
What are the assumptions of ANOVA?
The variances of the errors are all the same, the errors are independent, and the errors are normally distributed
73 of 119
What does ANOVA test?
Whether the mean of a variable differs significantly among 3 or more groups
74 of 119
What do you need to calculate in ANOVA?
The grand mean and group mean
75 of 119
What is the F statistic ratio?
The between group variance / The within group variance
76 of 119
What effect does a higher between group variance have on F?
Increases it
77 of 119
What effect does a large amount of variance, due to chance, have on F?
Decreases it
78 of 119
What does a high F value suggest?
More significance: the difference is unlikely to be due to chance
79 of 119
How do you calculate sum of squares?
The sum of (data point - mean)^2 over all data points
80 of 119
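
The pieces above (grand mean, group means, sums of squares and the F ratio) can be put together in a short sketch of a one-way ANOVA. The function names and data are hypothetical, and each variance is taken as a sum of squares divided by its degrees of freedom.

```python
from statistics import mean


def sum_of_squares(values, centre):
    """Sum of (data point - centre)^2."""
    return sum((v - centre) ** 2 for v in values)


def f_statistic(groups):
    """Between-group variance divided by within-group variance."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Between groups: how far each group mean sits from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within groups: how far each data point sits from its own group mean.
    ss_within = sum(sum_of_squares(g, mean(g)) for g in groups)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical data: three groups of three measurements.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
print(round(f_statistic(groups), 2))  # 13.0
```
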
When do you use a Mann-Whitney U test?
When your data are not normally distributed and you are comparing one variable between two groups
81 of 119
What parametric test is the Mann-Whitney U test the equivalent of?
T-test
82 of 119
How do you prepare data for a Mann-Whitney U test?
Rank it in ascending order
83 of 119
What's the next step?
You compare each value in one sample with each value in the other and count how often each sample has the bigger value, giving Ua and Ub
84 of 119
Are the counts kept separate for when the X column is bigger and when the Y column is bigger?
Yes
85 of 119
What is the U value?
The smaller of Ua and Ub
86 of 119
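
One way to sketch the calculation is the direct counting method: compare every X value with every Y value and count which is bigger. The code takes U as the smaller of the two counts, which is the usual convention, and gives half a point to each side for ties. The function name and data are hypothetical.

```python
def mann_whitney_u(sample_x, sample_y):
    """Return (Ux, Uy, U) using pairwise comparisons."""
    u_x = 0.0
    u_y = 0.0
    for x in sample_x:
        for y in sample_y:
            if x > y:
                u_x += 1
            elif y > x:
                u_y += 1
            else:  # tie: half a point to each side
                u_x += 0.5
                u_y += 0.5
    return u_x, u_y, min(u_x, u_y)


# Hypothetical data: two small samples.
print(mann_whitney_u([3, 4, 6], [1, 2, 5]))  # (7.0, 2.0, 2.0)
```

The two counts always add up to the number of pairs (here 3 x 3 = 9), which is a quick check on the arithmetic.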
When do you use a Kruskal-Wallis test?
When the assumptions of ANOVA are not met: the data are not normally distributed and the variances are not equal
87 of 119
What is the null hypothesis?
The samples are from the same population
88 of 119
Does the Kruskal-Wallis test involve ranking data too?
Yes
89 of 119
What is the test statistic?
H
90 of 119
What is the Kruskal-Wallis test used for?
Comparing three or more independent groups on one factor
91 of 119
What is the limitation of the Kruskal-Wallis test?
It only tells you whether or not there is a difference among the groups. It doesn't tell you which groups differ
92 of 119
What does two-way ANOVA test?
It compares the mean difference between groups that have multiple factors
93 of 119
What types of variables must two-way ANOVA include?
Two nominal variables and one measurement variable
94 of 119
What do you test?
What effect each nominal variable has on the measurement variable
95 of 119
When would you transform data?
When it is not normally distributed
96 of 119
How do you transform data?
Apply a transformation, for example a log transformation to base 10 (log10)
97 of 119
Are you able to do statistical tests on transformed data?
Yes
98 of 119
What does a correlation (r) measure?
The extent to which two variables change together
99 of 119
What is the measurement of a correlation?
The correlation coefficient r
100 of 119
What does r range from?
-1 to +1
101 of 119
What would -1 and +1 tell you about the line?
It's a completely straight line where all points are on the line
102 of 119
What does a correlation look for?
A linear effect
103 of 119
What unit does r have?
No units
104 of 119
How do you calculate a correlation?
Calculate the sum of squares for the X values and for the Y values, and the sum of cross products, then divide the sum of cross products by the square root of (sum of squares of X x sum of squares of Y)
105 of 119
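
A minimal sketch of the correlation coefficient using the standard Pearson formula: the sum of cross products divided by the square root of the product of the two sums of squares. The function name and data are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Correlation coefficient r from sums of squares and cross products."""
    mean_x, mean_y = mean(xs), mean(ys)
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    # Sum of cross products: how X and Y deviate from their means together.
    sp_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sp_xy / sqrt(ss_x * ss_y)


# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(round(r, 4))       # 0.7746
print(round(r ** 2, 2))  # 0.6
```
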
What is r^2?
The coefficient of determination
106 of 119
How many samples does a test need to give a significant r?
At least 3
107 of 119
What is regression?
It calculates the 'best' straight line through the points in order to describe the relationship between the two variables
108 of 119
What is the regression line?
A line that minimises the sum of squared deviations of the points from the line
109 of 119
What is the intercept symbolised as?
c
110 of 119
What is the intercept?
The point at which the line crosses the y-axis (the value of y when x is 0)
111 of 119
What is the slope symbolised as?
b
112 of 119
What is the slope?
113 of 119
How do you calculate the intercept?
The mean of Y - (b x the mean of X)
114 of 119
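
The intercept formula above can be sketched together with the slope. The slope formula used here (sum of cross products divided by the sum of squares of X) is the standard least-squares result and is an assumption, since the cards do not state it. The function name and data are hypothetical.

```python
from statistics import mean


def regression_line(xs, ys):
    """Return (b, c) for the least-squares line y = b * x + c."""
    mean_x, mean_y = mean(xs), mean(ys)
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    sp_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sp_xy / ss_x         # slope (assumed standard formula)
    c = mean_y - b * mean_x  # intercept: mean of Y - (b x mean of X)
    return b, c


# Hypothetical data.
b, c = regression_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(b, 2), round(c, 2))  # 0.6 2.2
```
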
What statistical test do you use to test the significance of regression?
Pearson's correlation
115 of 119
Can you use data that are not normally distributed in regression?
No, you must first transform the data
116 of 119
What do you do if the data do show a relationship but it's not linear?
Use a Spearman's rank correlation test
117 of 119
What type of data are linear models used on?
Normally distributed data
118 of 119
What type of data are generalised linear models used on?
Non-normal data/ count data
119 of 119
