# Statistics

What is a population?
the complete collection (universe) of units or individuals about which information is being recorded
1 of 92
What is a sample?
a subset of the population: a proportion of its units, selected to represent the whole
2 of 92
What is a variable?
a characteristic of individuals that differs from one individual to another
3 of 92
What is an observation?
a single measurement of a variable recorded for a particular individual
4 of 92
What is random sampling?
used when the characteristics of the study site are approximately homogeneous; every unit has an equal chance of selection
5 of 92
What is systematic sampling?
sampling at even/regular temporal or spatial intervals, e.g. along an environmental gradient, so that variability is captured
6 of 92
What is stratified sampling?
used when significant heterogeneity exists; even weighting must be given to each subset (stratum)
7 of 92
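
A minimal sketch of the three sampling schemes above, using only Python's standard library. The plot numbers, stratum names and sample sizes are made up for illustration.

```python
import random

def simple_random_sample(units, k, seed=0):
    """Random sampling: every unit has an equal chance of selection."""
    rng = random.Random(seed)
    return rng.sample(units, k)

def systematic_sample(units, step, start=0):
    """Systematic sampling: take every `step`-th unit, e.g. along a transect."""
    return units[start::step]

def stratified_sample(strata, k_per_stratum, seed=0):
    """Stratified sampling: draw the same number of units from each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, k_per_stratum)
            for name, members in strata.items()}

# Hypothetical study site: 20 numbered plots in two habitat types.
plots = list(range(1, 21))
strata = {"woodland": plots[:10], "grassland": plots[10:]}

random_pick = simple_random_sample(plots, 5)
systematic_pick = systematic_sample(plots, step=4)   # plots 1, 5, 9, 13, 17
stratified_pick = stratified_sample(strata, 2)
```
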
What is accuracy?
how close a sample estimate is to the true population value (the smaller the difference, the more accurate)
8 of 92
What is precision?
the ability of a measurement to be consistently reproduced
9 of 92
What is bias?
a systematic deviation from the population parameter of interest
10 of 92
What are discontinuous (discrete) variables?
usually integers, fractions not possible, counts (frequencies)
11 of 92
What are continuous variables?
can take a value at any point along an uninterrupted scale (allowing fine subdivision), e.g. length, mass and temperature
12 of 92
What is nominal measurement?
categorically discrete data with no inherent order; the most elementary scale of measurement; identifies categories, which must be mutually exclusive; categories can be coded (replaced by numbers) but the codes have no numerical meaning; summarised as counts (frequencies)
13 of 92
What is ordinal data?
pre-defined; refers to quantities that exhibit a natural ordering (ranking); incorporates classification and labelling; observations are arranged from highest to lowest; arithmetic is not possible; cannot state that intervals between values are equal
14 of 92
What is interval data?
intervals between values are equal; recognising the distance between units gives mathematical power (the ability to add and subtract), but interval scale variables have no absolute zero
15 of 92
What is ratio data?
interval data with an absolute, natural (unique and non-arbitrary) zero point; all arithmetic procedures are possible (power); the highest form of measurement, which incorporates all lesser scales (complexity)
16 of 92
What is a bar chart?
shows discrete categories (nominal or ordinal data); lengths of bars are proportional to the size of the category represented; the x axis has no scale as it represents categories; the y axis has units of measurement
17 of 92
What is a histogram?
shows continuous (interval or ratio) data; areas of bars are proportional to the size of the category represented; both x and y axes have a scale; categories (classes) are used to avoid gaps in the data and ease interpretation; when class widths differ, bar width is adjusted proportionally
18 of 92
More on histograms: what is univariate analysis?
concerned with describing the distribution of a single variable, potentially through visual or quantitative means; basic but essential
19 of 92
What are averages?
a set of descriptive measures of the centre of a distribution, used very frequently in everyday life: mean, median and mode
20 of 92
What do the median and IQR do?
give a useful measure of the centre and spread of observations; informative when the data do not follow a normal distribution, when the scale is ordinal (not interval/ratio), or when the sample size is small
21 of 92
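
A small sketch of the median and IQR using Python's `statistics` module. The data are invented, and other software may use slightly different quartile conventions, so treat the quartile values as one common definition rather than the only one.

```python
import statistics

# Hypothetical small sample with one extreme value (30).
counts = [2, 4, 4, 5, 7, 9, 30]

median = statistics.median(counts)               # middle value of the sorted data
q1, q2, q3 = statistics.quantiles(counts, n=4)   # quartile cut points
iqr = q3 - q1                                    # spread of the middle half
mean = statistics.mean(counts)                   # pulled upwards by the extreme value

print(median, iqr, round(mean, 2))
```

The mean is dragged towards the outlier while the median and IQR are not, which is why they suit skewed or small samples.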
What does standard deviation do?
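characterises the spread (width of a distribution) around the central value (mean); the higher the SD, the greater the spread; expressed in the units of measurement; on a normal curve the points of inflection (where the curve changes between concave and convex) lie one SD either side of the mean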
22 of 92
SD equation
$s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$; in a normal distribution about 68% of all cases lie within 1 SD of the mean, and the baseline is divided into equal segments
23 of 92
What is the variance?
the mean squared deviation from the mean (the standard deviation squared), $s^2 = \dfrac{\sum (x - \bar{x})^2}{n - 1}$; always positive; cannot be used for nominal/ordinal data
24 of 92
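
The sample variance and standard deviation formulas above can be worked through by hand as follows. The values are invented, and the result is checked against the standard library's `statistics.variance` and `statistics.stdev`, which also divide by n - 1.

```python
import math
import statistics

values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # hypothetical measurements

n = len(values)
mean = sum(values) / n                                # x-bar
sum_of_squares = sum((x - mean) ** 2 for x in values) # sum of squared deviations
variance = sum_of_squares / (n - 1)                   # s^2
sd = math.sqrt(variance)                              # s

# Cross-check against the library implementations.
assert math.isclose(variance, statistics.variance(values))
assert math.isclose(sd, statistics.stdev(values))
```
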
How do sample statistics and population parameters differ?
sample statistics: $\bar{x}$ and $s^2$; population parameters: $\mu$ (mu) and $\sigma^2$ (sigma squared, the variance)
25 of 92
What is the coefficient of variation?
allows comparison of variables measured in different units or with different orders of magnitude; the ratio of the SD to the mean, expressed as a %; only for ratio data
26 of 92
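
A short sketch of the coefficient of variation. The two data sets are hypothetical and chosen so that they differ by two orders of magnitude but have the same relative spread.

```python
import statistics

def coefficient_of_variation(values):
    """SD as a percentage of the mean (only meaningful for ratio data)."""
    return statistics.stdev(values) / statistics.mean(values) * 100

lengths_mm = [10.0, 12.0, 14.0]          # small numbers, in millimetres
masses_g = [1000.0, 1200.0, 1400.0]      # large numbers, in grams

cv_lengths = coefficient_of_variation(lengths_mm)
cv_masses = coefficient_of_variation(masses_g)
print(round(cv_lengths, 2), round(cv_masses, 2))
```

The SDs differ by a factor of 100, but the CVs match, so the two variables are equally variable relative to their means.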
What is skewness?
a measure of symmetry in a distribution, characterising its shape; zero skewness occurs when mean = median and the distribution is symmetrical around the mean
27 of 92
When is skewness negative and when is it positive?
negative = most values are concentrated to the right of the mean with extreme values to the left; positive = most values are concentrated to the left with extreme values to the right
28 of 92
What is kurtosis?
for symmetrical data it is a measure of peakedness, recognised visually or quantitatively; increasing it is a movement of probability mass from the shoulders of a distribution to its centre and tails
29 of 92
what does mesokurtic look like?
like a normal bell shaped curve
30 of 92
What does leptokurtic look like?
a tall, sharply peaked bell-shaped curve
31 of 92
What is the name for a gently sloping, wide curve?
platykurtic
32 of 92
what is the first moment?
central value (mean); in certain cases the median/mode
33 of 92
what is the second moment?
a measure of spread (variability) around the central value (SD/variance)
34 of 92
what is the third moment?
skewness: a non-dimensional measure of symmetry (not linear in calculation; weaker than the first and second moments)
35 of 92
What is the fourth moment?
kurtosis: a measure of peakedness (non-dimensional)
36 of 92
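
The four moments can be sketched numerically. This uses the simple moment-ratio definitions (third and fourth central moments divided by powers of the second), which is an assumption: statistical packages often apply small-sample corrections or report kurtosis minus 3. The data are invented.

```python
def central_moment(values, k):
    """Average of (x - mean)^k over the data."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** k for x in values) / len(values)

def skewness(values):
    """Third moment scaled by the spread: 0 for a symmetrical distribution."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def kurtosis(values):
    """Fourth moment scaled by the spread: about 3 for a normal (mesokurtic) curve."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2

symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]   # extreme value to the right

print(skewness(symmetric))      # zero: symmetrical around the mean
print(skewness(right_skewed))   # positive: tail to the right
print(kurtosis(symmetric))      # below 3: flat, platykurtic shape
```
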
what are the conditions for parametric data?
data must be interval/ratio scale, data must be normally distributed, number of observations should exceed 30
37 of 92
what are the conditions for non parametric data?
all data that are nominal/ordinal, interval/ratio data that are not normally distributed, small sample size (less than 30)
38 of 92
What determines whether the test should be parametric or non-parametric?
scale of measurement, degree of normality, sample size
39 of 92
what is subjective(personal) probability?
numerical, but based on the judgement of the individual: how likely you think it is for an event to occur
40 of 92
what is theoretical probability?
assumes that all n possible outcomes are equally likely; the sample space (n) is known; based on logical reasoning and controlled experiment
41 of 92
what is experimental probability?
(empirical, relative frequency) probability based on observation: actual measurements that have been collected (a sample)
42 of 92
what is sample space?
all the possible outcomes of an experiment
43 of 92
what is an event?
any subset of a sample space- simple and compound events
44 of 92
what is the equation of probability?
number of outcomes corresponding to event over total number of outcomes
45 of 92
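
A minimal sketch of the probability equation above for a theoretical case, assuming a fair six-sided die so that all outcomes are equally likely. The die and the "even number" event are illustrative choices.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}                          # all possible outcomes
event_even = {o for o in sample_space if o % 2 == 0}       # a compound event

def probability(event, space):
    """Outcomes corresponding to the event over the total number of outcomes."""
    return Fraction(len(event & space), len(space))

p_even = probability(event_even, sample_space)
print(p_even)   # 1/2
```
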
What are mutually exclusive events?
there is zero possibility of the 2 events occurring together at the same time
46 of 92
what are independent events?
events that have no influence on each other
47 of 92
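
Mutually exclusive and independent events are easy to confuse, so here is a small sketch using two tosses of a fair coin (an illustrative assumption). Independence is checked with the product rule P(A and B) = P(A) × P(B).

```python
from fractions import Fraction
from itertools import product

space = set(product("HT", repeat=2))   # sample space for two coin tosses

def probability(event):
    return Fraction(len(event), len(space))

first_head = {o for o in space if o[0] == "H"}
first_tail = {o for o in space if o[0] == "T"}
second_head = {o for o in space if o[1] == "H"}

# Mutually exclusive: the events share no outcomes, so they cannot occur together.
mutually_exclusive = len(first_head & first_tail) == 0

# Independent: knowing the first toss tells you nothing about the second.
independent = (probability(first_head & second_head)
               == probability(first_head) * probability(second_head))
```
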
What is standardisation?
putting values on a common scale so that groups or variables can be compared fairly (the same conditions for each); in statistics, converting raw values to z-scores, z = (x - mean) / SD
48 of 92
Z-scores beyond + or - 1.96 will occur what % of the time, and what raw value of x does this correspond to?
5%; x = mean of x +/- 1.96 × the standard deviation
49 of 92
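
A sketch of z-scores and the ±1.96 rule using `statistics.NormalDist` (Python 3.8+). The mean of 50 and SD of 10 are made-up values.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """How many standard deviations x lies from the mean."""
    return (x - mean) / sd

mean, sd = 50.0, 10.0                  # hypothetical population values

# Raw values that correspond to z = -1.96 and z = +1.96.
lower = mean - 1.96 * sd
upper = mean + 1.96 * sd

# Share of a normal distribution lying beyond +/-1.96 (both tails together).
std_normal = NormalDist()              # mean 0, SD 1
tail_share = 2 * (1 - std_normal.cdf(1.96))
print(lower, upper, round(tail_share, 3))
```
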
What are the types of hypothesis? H0 =
the null hypothesis: there is no statistically significant difference between sample and population
50 of 92
What does H1 mean?
the alternative hypothesis: there is a statistically significant difference between samples a, b, c
51 of 92
What happens if the test statistic falls to the left or right of the critical value?
if it falls to the left (inside the region of acceptance) you accept H0; if it falls to the right (inside the region of rejection) you reject H0
52 of 92
What are type 1 and type 2 errors?
type 1: rejecting the null hypothesis when it is in fact true; type 2: not rejecting the null hypothesis when the alternative hypothesis is true
53 of 92
what does confidence level mean?
expressed on a scale from 0% to 100%; e.g. at a 99.9% confidence level there is a 0.1% probability that the results are due to chance
54 of 92
the higher the n the...
lower the probability that the result could have occurred due to chance
55 of 92
What is the Kolmogorov-Smirnov test?
compares a sample with a theoretical (hypothesised) distribution; it uses ordinal data and involves calculating cumulative frequencies
56 of 92
What is the Shapiro-Wilk test?
it is a test of normality; it tests the hypothesis that the sample came from a normally distributed population
57 of 92
What is the critical z value in a one-tailed test and a two-tailed test (at the 5% level)?
one-tailed is - or + 1.645 and two-tailed is - or + 1.96
58 of 92
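
The critical z values can be recovered from the standard normal distribution rather than memorised. This sketch assumes a 5% significance level and uses `NormalDist.inv_cdf` from the standard library.

```python
from statistics import NormalDist

alpha = 0.05                      # assumed significance level
std_normal = NormalDist()         # mean 0, SD 1

# One-tailed: all 5% sits in a single tail.
one_tailed = std_normal.inv_cdf(1 - alpha)

# Two-tailed: 2.5% in each tail.
two_tailed = std_normal.inv_cdf(1 - alpha / 2)

print(round(one_tailed, 3), round(two_tailed, 3))
```
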
what is the T test?
tests the difference between a sample mean and a population mean, e.g. age distribution vs population and income distribution vs population
59 of 92
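
A hedged sketch of the one-sample t statistic, t = (sample mean - population mean) / (s / √n). The ages and the population mean of 35 are invented, and the critical value would still need to be looked up in a t table for n - 1 degrees of freedom.

```python
import math
import statistics

def one_sample_t(sample, population_mean):
    """t statistic for the difference between a sample mean and a population mean."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.mean(sample) - population_mean) / standard_error

ages = [34, 36, 38, 40, 42]            # hypothetical sample
t_stat = one_sample_t(ages, 35.0)      # hypothetical population mean
degrees_of_freedom = len(ages) - 1
print(round(t_stat, 3), degrees_of_freedom)
```
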
What do we need to do before doing our t test?
pre-select the significance level; afterwards, report the test statistic and its significance
60 of 92
What is ANOVA?
analysis of variance: compares the means of two or more samples (any number of sample means)
61 of 92
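
A minimal sketch of the one-way ANOVA F statistic: the between-sample variance divided by the within-sample variance, each computed as a sum of squares over its degrees of freedom. The three samples are invented, and the F value would be compared with a critical value from F tables.

```python
def one_way_f(groups):
    """F = between-group mean square / within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)

    # Sum of squares between groups: group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Sum of squares within groups: values around their own group mean.
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

samples = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]   # hypothetical samples
f_stat = one_way_f(samples)
print(round(f_stat, 2))
```
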
What does a positive/negative correlation mean?
positive means that as one variable increases the other also increases (direct); negative means that as one variable increases the other decreases (inverse)
62 of 92
What does correlation mean in common usage and in statistics?
in common usage, correlation refers to any type of relationship between objects and events; in statistics, it is a quantitative relationship
63 of 92
What is PMCC?
product moment correlation coefficient
64 of 92
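
A sketch of the product moment correlation coefficient, written out from its definition (covariation of x and y divided by the square root of the product of their sums of squares). The data are invented to give a perfect direct and a perfect inverse relationship.

```python
import math

def pearson_r(x, y):
    """Product moment correlation coefficient, ranging from -1 to +1."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
r_direct = pearson_r(x, [2, 4, 6, 8, 10])    # y rises with x
r_inverse = pearson_r(x, [10, 8, 6, 4, 2])   # y falls as x rises
print(r_direct, r_inverse)
```
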
What is the correlation coefficient called when all the points (the whole population) are sampled?
the absolute correlation coefficient, a population parameter (rho)
65 of 92
Non-parametric tests: most common, simpler to understand, inferential and relational, fewer assumptions, less powerful; employed when parametric tests are not appropriate
66 of 92
What is an inferential test?
e.g. the one-way chi-square test: tests for differences in nominal/ordinal data between a sample and a population, $\chi^2 = \sum \dfrac{(\text{observed} - \text{expected})^2}{\text{expected}}$
67 of 92
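
A small sketch of the one-way chi-square statistic. The observed and expected frequencies are invented, and the result would be compared with a critical value for (number of categories - 1) degrees of freedom.

```python
def chi_square(observed, expected):
    """Sum over categories of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [30, 50, 20]    # hypothetical sample frequencies
expected = [25, 50, 25]    # hypothetical population-based expectations

stat = chi_square(observed, expected)
degrees_of_freedom = len(observed) - 1
print(stat, degrees_of_freedom)   # 2.0 2
```
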
How do you work out an expected value in a chi-square test?
row total times column total / overall total
68 of 92
what is a two way chi test?
expected values are proportional to sample size, assuming the distribution between samples is equal; tests between 2 or more samples; usually nominal data but can be ordinal
69 of 92
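
The expected-value rule (row total × column total / overall total) can be sketched for a two-way table. The 2 × 2 table of counts is hypothetical.

```python
def expected_counts(table):
    """Expected frequency for each cell: row total * column total / overall total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]

# Rows are two samples, columns are two categories (invented counts).
table = [[20, 30],
         [30, 20]]

expected = expected_counts(table)
chi2 = sum((o - e) ** 2 / e
           for obs_row, exp_row in zip(table, expected)
           for o, e in zip(obs_row, exp_row))
print(expected, chi2)
```
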
What is the Mann-Whitney U test?
compares two samples using the ranks of their observations; sample sizes may be equal or unequal; only a limited number of observations is required, though N may also be large
70 of 92
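
A hedged sketch of how the Mann-Whitney U statistic is built from ranks. The two samples are invented; tied values share the average of their ranks, and the resulting U would be compared with a tabulated critical value.

```python
def average_ranks(values):
    """Rank values from smallest (1) upwards; ties share the mean rank."""
    positions = {}
    for i, v in enumerate(sorted(values), start=1):
        positions.setdefault(v, []).append(i)
    return [sum(positions[v]) / len(positions[v]) for v in values]

def mann_whitney_u(a, b):
    """Smaller of the two U values for samples a and b."""
    ranks = average_ranks(list(a) + list(b))   # rank the pooled observations
    rank_sum_a = sum(ranks[:len(a)])
    u_a = rank_sum_a - len(a) * (len(a) + 1) / 2
    u_b = len(a) * len(b) - u_a
    return min(u_a, u_b)

sample_a = [1, 2, 3]        # every value lower than sample_b
sample_b = [4, 5, 6, 7]
u = mann_whitney_u(sample_a, sample_b)
print(u)   # 0.0: no overlap between the samples
```
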
What is the name of the test that is a non-parametric counterpart to ANOVA, compares three or more sample means, and applies ranking to observations?
Kruskal-Wallis test
71 of 92
Benefits and limitations of the Kruskal-Wallis test?
relational, suitable for ordinal, interval and ratio data but unhelpful scatter plot
72 of 92
What is the Spearman rank correlation coefficient?
data for both variables are ranked and the statistic is based on the squared differences between the ranks; ranges from +1 to -1
73 of 92
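
A sketch of Spearman's rank correlation using the squared rank differences, r_s = 1 - 6Σd² / (n(n² - 1)). This shortcut formula assumes there are no tied values, and the data are invented.

```python
def ranks(values):
    """Rank from smallest (1) to largest; assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman(x, y):
    """Spearman rank correlation from squared differences between ranks."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

x = [10, 20, 30, 40, 50]
y = [1, 3, 2, 5, 4]          # mostly rises with x

r_s = spearman(x, y)
r_inverse = spearman(x, [5, 4, 3, 2, 1])
print(round(r_s, 3), r_inverse)
```
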
What tests are used as non-parametric tests?
chi-squared, Mann-Whitney, Kruskal-Wallis, Spearman's rank correlation coefficient
74 of 92
what tests are used in parametric tests?
t test and analysis of variance (ANOVA), number of observations must exceed 30, Pearson's correlation for relationships, linear and non-linear regression, interval/ratio data, normally distributed data
75 of 92
what does regression allow?
to make a numerical prediction of one variable by reference to another -explanatory power
76 of 92
what are the assumptions for simple linear regression?
data must be continuous, interval or ratio, normally distributed with no skewness, with more than 30 observations
77 of 92
what is the equation of a straight line?
y=mx+c
78 of 92
What is the least squares criterion?
the best-fit line describing the relationship between two variables is the one that minimises the sum of the squared distances from the points to the line
79 of 92
what is the residual?
difference between observed and predicted value of y
80 of 92
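
The regression cards above (y = mx + c, least squares, residuals) can be tied together in one sketch. The slope and intercept use the standard least squares formulas; the five data points are invented.

```python
def fit_line(x, y):
    """Least squares slope m and intercept c for y = m*x + c."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    m = sxy / sxx
    c = mean_y - m * mean_x
    return m, c

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

m, c = fit_line(x, y)
predicted = [m * a + c for a in x]
residuals = [obs - pred for obs, pred in zip(y, predicted)]  # observed - predicted
print(round(m, 2), round(c, 2))
```

For a least squares line the residuals always sum to zero, which makes a quick check on the arithmetic.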
how to work out variance?
sum of squares/degrees of freedom
81 of 92
if the test statistic exceeds the critical value then what do you do
reject the null hypothesis
82 of 92
what are the two sources of error in the regression model
standard error of the residuals and the standard error of the prediction
83 of 92
what is homoscedasticity?
the residuals have equal variance (an equal scatter of points) either side of the regression line, all along the line
84 of 92
what is heteroscedasticity?
the variance of the residuals is unequal along the line, so the regression equation may be unstable
85 of 92
what is autocorrelation?
correlation between the elements of a series and others from the same series separated from them by a given interval.
86 of 92
what is the name of the test to calculate auto correlation?
Durbin-Watson d statistic, sum of successive squared differences/sum of squared residuals
87 of 92
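
A minimal sketch of the Durbin-Watson d statistic as described above. The residual series are invented; values of d near 2 suggest no autocorrelation, values towards 0 suggest positive autocorrelation, and values towards 4 suggest negative autocorrelation.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences / sum of squared residuals."""
    successive = sum((residuals[t] - residuals[t - 1]) ** 2
                     for t in range(1, len(residuals)))
    return successive / sum(e ** 2 for e in residuals)

alternating = [1.0, -1.0, 1.0, -1.0]   # sign flips every step
clustered = [1.0, 1.0, -1.0, -1.0]     # neighbouring residuals are alike

d_alternating = durbin_watson(alternating)
d_clustered = durbin_watson(clustered)
print(d_alternating, d_clustered)   # 3.0 1.0
```
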
Where can your Durbin-Watson statistic fall?
in the region of rejection, the region of acceptance, or the indeterminate zone
88 of 92
difference between non linear regression and linear regression?
linear is a straight line; non-linear is a curved graph (a straight line does not represent the relationship); non-linear regression addresses complex equations and non-uniform change
89 of 92
how does y=mx+c change for curves?
uses e (exponential); b is either a multiplication factor or a power, and a is either multiplicative or additive; e.g. $y = aX^b$, where b = degree of curvature and a = location of the curve
90 of 92
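
One common way to fit the power curve y = aX^b is to take logs of both sides, giving log y = log a + b log X, which is a straight line that ordinary least squares can handle. The sketch below assumes all values are positive and uses invented data generated from a = 3, b = 2.

```python
import math

def fit_power_curve(x, y):
    """Estimate a and b in y = a * x**b by linear regression on the logs."""
    log_x = [math.log(v) for v in x]
    log_y = [math.log(v) for v in y]
    n = len(x)
    mean_lx, mean_ly = sum(log_x) / n, sum(log_y) / n
    b = (sum((u - mean_lx) * (v - mean_ly) for u, v in zip(log_x, log_y))
         / sum((u - mean_lx) ** 2 for u in log_x))   # slope = power b
    a = math.exp(mean_ly - b * mean_lx)              # intercept = log a
    return a, b

x = [1.0, 2.0, 4.0, 8.0]
y = [3.0, 12.0, 48.0, 192.0]    # exactly 3 * x**2

a, b = fit_power_curve(x, y)
print(round(a, 3), round(b, 3))
```
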
What is SPSS?
a statistical software package, used here for analysing non-linear functions
91 of 92
changes in predictor variable not matched by uniform changes in y, what type of regression?
non-linear, multiple forms of non linear relationships
92 of 92
