what is a population?

the universe (complete collection) of units or individuals about which information is being recorded

what is a sample?

a subset of the population, made up of a proportion of its units and taken to represent the whole

what is a variable?

characteristics of individuals which differ from one individual to another

what is an observation?

a single measurement of a variable recorded for a particular individual, e.g. of its behaviour

What is random sampling?

used when characteristics of the study site are approximately homogeneous, every unit has an equal chance of selection

What is systematic sampling?

samples taken at even (regular) temporal or spatial intervals, e.g. along an environmental gradient, so that variability is captured

What is stratified sampling?

used when significant heterogeneity exists, the population is divided into subsets (strata) and even weighting must be given to each subset

what is accuracy?

how close sample estimates are to the true population value (the smaller the difference, the higher the accuracy)

What is precision?

the ability of a measurement to be consistently reproduced

what is bias?

a systematic deviation from the population parameter of interest

what are discontinuous (discrete) variables?

usually integers, fractions not possible, counts (frequencies)

What are continuous variables?

can take a value at any point along an uninterrupted scale (allowing fine subdivision), e.g. length, mass and temperature

What is nominal measurement?

categorically discrete data with no inherent order, the most elementary scale of measurement, identifies categories, categories must be mutually exclusive, can be coded (replaced by numbers) but the codes have no numerical meaning, summarised as counts (frequencies)

What is ordinal data?

predefined categories, refers to quantities that exhibit a natural ordering (ranking), incorporates classification and labelling, observations can be arranged from highest to lowest, arithmetic is not possible, not possible to state that intervals between values are equal

what is interval data?

intervals between successive values are equal, recognition of the distance between units leads to mathematical power, ability to add and subtract, but interval scale variables have no absolute zero

What is ratio data?

interval data with an absolute, natural (unique and non-arbitrary) zero point, all arithmetic procedures are possible (power), highest form of measurement, which incorporates all lesser scales (complexity)

what is a bar chart?

discrete categories (nominal or ordinal data), lengths of bars are proportional to size of category represented, x axis has no scale as it represents categories, y axis has units of measurement

What is a histogram?

continuous data (interval or ratio, i.e. scale data), areas of bars are proportional to size of category represented, both x and y axes have a scale, categories (classes) are used to avoid gaps in the data and ease interpretation, when class sizes differ, bar width is adjusted proportionally

more on histograms: what is univariate analysis?

concerned with describing the distribution of a single variable, potentially described through visual or quantitative means, basic but essential

what are averages?

a set of descriptors (parameters) used very frequently in everyday life: mean, median and mode

what do the median and IQR do?

useful measures of the center (median) and spread (IQR) of observations, preferred when data do not follow a normal distribution, when the scale is ordinal (not interval/ratio), or when sample size is small
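
A minimal sketch of why the median and IQR suit non-normal data, using Python's standard `statistics` module. The sample values are made up for illustration, and the "inclusive" quantile method is one of several conventions, chosen here as an assumption.

```python
import statistics

# Hypothetical small, skewed sample (illustrative values only).
values = [2, 3, 3, 4, 5, 6, 40]

mean = statistics.mean(values)
median = statistics.median(values)

# quantiles(n=4) returns the three quartile cut points Q1, Q2, Q3.
q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

print("mean:", mean)      # pulled upwards by the extreme value 40
print("median:", median)  # stays near the bulk of the data
print("IQR:", iqr)
```

The single extreme value shifts the mean well away from most observations, while the median and IQR barely react to it.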

What does standard deviation do?

Characterises the spread (width of a distribution) around the central value (mean), higher SD means greater spread, expressed in the units of measurement, on a normal curve the points of inflection (where the curve changes between concave and convex) lie 1 SD either side of the mean

sd equation

$s = \sqrt{\sum (x - \bar{x})^2 / (n - 1)}$, for a normal distribution about 68% of all cases lie within 1 SD of the mean, and the baseline is divided into equal segments of 1 SD

What is the variance?

the mean squared deviation from the mean (the square of the standard deviation), $s^2 = \sum (x - \bar{x})^2 / (n - 1)$, always positive, cannot be used for nominal/ordinal data
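
A short worked example of the sample variance and standard deviation, computed step by step with the n - 1 divisor and cross-checked against Python's `statistics` module. The five sample values are hypothetical.

```python
import math
import statistics

# Hypothetical sample (illustrative values only).
sample = [4.0, 8.0, 6.0, 5.0, 7.0]

n = len(sample)
x_bar = sum(sample) / n                              # sample mean
sum_of_squares = sum((x - x_bar) ** 2 for x in sample)
variance = sum_of_squares / (n - 1)                  # s^2, divisor n - 1
sd = math.sqrt(variance)                             # s, same units as the data

print("mean:", x_bar)
print("variance:", variance)
print("standard deviation:", sd)

# The standard library uses the same n - 1 definition.
print(math.isclose(variance, statistics.variance(sample)))
```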

how do sample statistics and population parameters differ?

sample: mean $\bar{x}$ and variance $s^2$; population: mean $\mu$ (mu) and variance $\sigma^2$ (sigma squared)

what is the coefficient of variation?

allows comparison of variables measured in different units or with different orders of magnitude, expressed as %, only for ratio data, ratio of SD to the mean
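
A small sketch of the coefficient of variation as 100 x SD / mean. The function name and both data sets are hypothetical, chosen so that two variables in different units and magnitudes show the same relative spread.

```python
import math
import statistics

def coefficient_of_variation(values):
    """Return the sample SD as a percentage of the mean (ratio data only)."""
    return 100 * statistics.stdev(values) / statistics.mean(values)

# Hypothetical measurements in different units and orders of magnitude.
lengths_mm = [100, 110, 120, 130, 140]
masses_g = [1.0, 1.1, 1.2, 1.3, 1.4]

cv_length = coefficient_of_variation(lengths_mm)
cv_mass = coefficient_of_variation(masses_g)

print(f"CV of lengths: {cv_length:.2f}%")
print(f"CV of masses:  {cv_mass:.2f}%")
```

The SDs differ by a factor of 100, but the CVs agree because each SD is scaled by its own mean.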

What is skewness?

a measure of symmetry in a distribution, characterising its shape, zero skewness occurs when mean = median and the distribution is symmetrical around the mean

negative skewness happens when, and positive when?

negative = most values are concentrated to the right of the mean with extreme values to the left, positive = most values are concentrated to the left with extreme values to the right
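
A sketch of the sign of skewness, assuming the simple moment coefficient (third central moment divided by the second central moment to the power 1.5, both with divisor n). Statistical packages often apply a small-sample adjustment, so their values may differ slightly. The data are made up.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (divisor n)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

# Hypothetical data sets (illustrative values only).
symmetric = [1, 2, 3, 4, 5]
right_tail = [1, 1, 2, 2, 3, 10]        # extreme value to the right
left_tail = [-x for x in right_tail]    # mirror image, extreme value to the left

print("symmetric: ", skewness(symmetric))
print("right tail:", skewness(right_tail))   # positive
print("left tail: ", skewness(left_tail))    # negative
```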

what is kurtosis?

for symmetrical data it is a measure of peakedness, recognised visually or quantitatively, increasing it is a movement of probability mass from the shoulders of a distribution to its center and tails
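
A sketch of kurtosis as a number, assuming the common "excess kurtosis" convention (fourth central moment over the squared second central moment, minus 3, so a normal curve scores about 0). Both data sets are hypothetical.

```python
def excess_kurtosis(values):
    """m4 / m2**2 - 3, using divisor n for both central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return m4 / m2 ** 2 - 3

# Hypothetical symmetrical data sets (illustrative values only).
flat = [1, 2, 3, 4, 5]                        # mass spread evenly
peaked = [0, 0, 0, 0, 0, 0, 0, 0, -5, 5]      # mass in the center and tails

print("flat:  ", excess_kurtosis(flat))       # negative
print("peaked:", excess_kurtosis(peaked))     # positive
```

Under this convention a negative value indicates a flatter-than-normal shape and a positive value a more peaked, heavier-tailed one.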

what does mesokurtic look like?

like a normal bell shaped curve

what does leptokurtic look like?

a tall, sharply peaked bell shaped curve

what is the name for a gently sloping, wide one?

platykurtic

what is the first moment?

central value (mean), in certain cases the median/mode

what is the second moment?

a measure of spread (variability) around the central value (SD/variance)

what is the third moment?

skewness, a measure of symmetry, non-dimensional (not linear in calculation), weaker than the first and second moments

what is the fourth moment?

kurtosis, a measure of peakedness (non-dimensional)

what are the conditions for parametric data?

data must be interval/ratio scale, data must be normally distributed, number of observations should exceed 30

what are the conditions for non parametric data?

all data that are nominal/ordinal, interval/ratio data that are not normally distributed, small sample size (less than 30)

what determines whether the test should be parametric or non parametric?

scale of measurement, degree of normality, sample size

what is subjective(personal) probability?

numerical, but based on the judgement of the individual, how likely you think it is for an event to occur

what is theoretical probability?

assumes that all n possible outcomes are equally likely, known sample space (n), based on logical reasoning and controlled experiment

what is experimental probability?

(empirical, relative frequency) probability based on observation, i.e. actual measurements that have been collected (a sample)

what is sample space?

all the possible outcomes of an experiment

what is an event?

any subset of a sample space, events can be simple or compound

what is the equation of probability?

number of outcomes corresponding to event over total number of outcomes

what are mutually exclusive events?

two events that cannot occur together at the same time (zero probability of both occurring)

what are independent events?

events that have no influence on each other
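
The probability cards above can be tied together with a short sketch. It assumes two fair six-sided dice as the experiment, and the event names are hypothetical. Exact fractions are used so the results are not blurred by rounding.

```python
from fractions import Fraction
from itertools import product

# Sample space: every outcome of rolling two fair dice (36 equally likely pairs).
sample_space = set(product(range(1, 7), repeat=2))

def probability(event):
    """Number of outcomes corresponding to the event over total number of outcomes."""
    return Fraction(len(event), len(sample_space))

# Events are subsets of the sample space.
first_is_six = {o for o in sample_space if o[0] == 6}
second_is_six = {o for o in sample_space if o[1] == 6}
first_is_one = {o for o in sample_space if o[0] == 1}

# Mutually exclusive: the first die cannot show 1 and 6 at the same time.
p_exclusive_both = probability(first_is_one & first_is_six)

# Independent: P(A and B) equals P(A) * P(B).
p_joint = probability(first_is_six & second_is_six)
p_product = probability(first_is_six) * probability(second_is_six)

print("P(first is 6):", probability(first_is_six))
print("P(first is 1 and first is 6):", p_exclusive_both)
print("P(both are 6):", p_joint, "product of separate probabilities:", p_product)
```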

What is standardisation?

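puts measurements on a common footing so that each person or group is treated fairly (each group has the same conditions), in statistics values are standardised by converting them to z scores, $z = (x - \bar{x}) / s$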

z scores beyond + or - 1.96 will occur what % of the time, and what does that mean for the actual raw value of x?

5% of the time (2.5% in each tail), x = mean of x +/- 1.96 standard deviations
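
A quick check of the 5% figure using the standard normal distribution from Python's `statistics` module. The mean and standard deviation used for the raw values are hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, SD 1

# Proportion of z scores falling outside +/- 1.96 (both tails together).
outside = 2 * (1 - standard_normal.cdf(1.96))
print(f"outside +/-1.96: {outside:.4f}")

# Converting back to raw values: x = mean +/- 1.96 * SD.
mean, sd = 50.0, 10.0  # hypothetical values for illustration
lower = mean - 1.96 * sd
upper = mean + 1.96 * sd
print(f"95% of cases are expected between {lower:.1f} and {upper:.1f}")
```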

what are the types of hypothesis? H0 =

the null hypothesis: there is no statistically significant difference between sample and population

what does H1 mean?

the alternative hypothesis: there is a statistically significant difference between samples a, b, c

what happens if the test statistic falls to the left or right of the critical value?

if it falls to the left (inside the region of acceptance) you accept H0, if it falls to the right (in the rejection region) you reject H0

what are type 1 and type 2 errors?

type 1: rejecting the null hypothesis when it is in fact true; type 2: not rejecting the null hypothesis when the alternative hypothesis is true

what does confidence level mean?

a confidence level of 0% means a 100% probability that the results are due to chance, a confidence level of 99.9% means a 0.1% probability that the results are due to chance

the higher the n the...

lower the probability that the result could have occurred due to chance

what is the Kolmogorov-Smirnov test?

compares a sample with a theoretical (hypothesised) distribution, it uses ordinal data and involves calculation of cumulative frequencies

what is the Shapiro-Wilk test?

it is a test of normality, uses a hypothesis test to assess whether the sample came from a normally distributed population

what are the critical z values for a one tailed test and a two tailed test?

at the 5% significance level, one tailed is - or + 1.645 and two tailed is - or + 1.96
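
The critical z values can be recovered from the inverse of the standard normal cumulative distribution. This is a minimal sketch assuming a 5% significance level, which is an assumption made for the example.

```python
from statistics import NormalDist

alpha = 0.05  # assumed significance level
standard_normal = NormalDist()

# One tailed: all of alpha sits in a single tail.
one_tailed = standard_normal.inv_cdf(1 - alpha)

# Two tailed: alpha is split equally between the two tails.
two_tailed = standard_normal.inv_cdf(1 - alpha / 2)

print(f"one tailed critical z: +/-{one_tailed:.3f}")
print(f"two tailed critical z: +/-{two_tailed:.3f}")
```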

what is the T test?

tests the difference between a sample mean and a population mean, e.g. age distribution vs population and income distribution vs population

what do we need to do before doing our t test?

preselect our significance level, then report the test statistic and the significance

what is ANOVA?

analysis of variance, extends the comparison of two sample means to any number of samples, i.e. compares the means of several samples

what does a positive/negative correlation mean?

positive means that as one variable increases the other also increases (direct), negative means that as one variable increases the other decreases (inverse)

what is the common usage of correlation?

in common usage correlation refers to any type of relationship between objects and events, in statistics it refers to a quantitative relationship

what is PMCC?

product moment correlation coefficient

what is it called when all the points (the whole population) are sampled?

the absolute (population) correlation coefficient, a parameter (rho, $\rho$)

advantages and disadvantages of non parametric test?

advantages: most common, simpler to understand, both inferential and relational tests, fewer assumptions, employed when parametric tests are not appropriate; disadvantage: less powerful

what is an inferential test?

e.g. the one way chi square test, tests for differences in nominal/ordinal data between a sample and a population, $\chi^2 = \sum (O - E)^2 / E$ (O = observed, E = expected), data can be nominal

how to work out expected value in a chi square test?

row total times column total / overall total
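
A worked sketch of the expected values and the chi square statistic for a small contingency table. The observed counts are hypothetical, and the statistic is the usual sum of (observed - expected)^2 / expected over all cells.

```python
# Hypothetical observed counts: rows are samples, columns are categories.
observed = [
    [10, 20],
    [30, 40],
]

row_totals = [sum(row) for row in observed]
column_totals = [sum(col) for col in zip(*observed)]
overall_total = sum(row_totals)

# Expected count for each cell = row total * column total / overall total.
expected = [
    [r * c / overall_total for c in column_totals]
    for r in row_totals
]

# Chi square = sum over all cells of (observed - expected)^2 / expected.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

print("expected:", expected)
print(f"chi square: {chi_square:.4f}")
```

The result would then be compared with the critical value for (rows - 1) x (columns - 1) degrees of freedom.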

what is a two way chi square test?

tests for differences between 2 or more samples, expected values are proportional to sample size and assume the distribution between samples is equal, data are nominal but can be ordinal

what is the Mann-Whitney U test?

comparison of two samples (medians rather than means), equal/unequal sample sizes, only a limited number of observations required but can be used with large N, based on ranking of observations

what is the name of the test which is a non parametric counterpart to ANOVA, compares three or more samples, and applies ranking to observations?

Kruskal-Wallis test

benefits and limitations of the Kruskal-Wallis test?

relational, suitable for ordinal, interval and ratio data but unhelpful scatter plot

what is the Spearman rank correlation coefficient?

data for both variables are ranked and the statistic is based on the squared differences between the ranks, ranges from +1 (perfect positive correlation) to -1 (perfect negative correlation)
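
A minimal sketch of Spearman's rank correlation using the standard formula r_s = 1 - 6 * sum(d^2) / (n(n^2 - 1)), where d is the difference between paired ranks. It assumes there are no tied values, and the data and helper names are hypothetical.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n (largest). Assumes no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def spearman(x, y):
    """Spearman rank correlation for paired data without ties."""
    n = len(x)
    rank_x, rank_y = ranks(x), ranks(y)
    sum_d_squared = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))

# Hypothetical paired observations (illustrative values only).
x = [10, 20, 30, 40, 50]
y = [1, 3, 2, 5, 4]

r_s = spearman(x, y)
print("Spearman r_s:", r_s)
```

With tied values, average ranks are normally assigned and this shortcut formula is no longer exact.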

which tests are non parametric?

chi squared, Mann-Whitney, Kruskal-Wallis, Spearman's rank correlation coefficient

which tests are parametric?

t test and analysis of variance (ANOVA), number of observations must exceed 30, Pearson's correlation for relationships, linear and non linear regression, interval/ratio data, normally distributed data

what does regression allow?

to make a numerical prediction of one variable by reference to another (explanatory power)

what are the assumptions for simple linear regression?

data must be continuous, interval or ratio, normally distributed, no skewness, more than 30 observations

what is the equation of a straight line?

y=mx+c

what is the least squares criterion?

the best fit line describing the relationship between two variables is the one that minimises the sum of the squared vertical distances (residuals) from the points to the line

what is the residual?

difference between observed and predicted value of y
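
A short sketch of fitting y = mx + c by least squares and then computing the residuals. It uses the standard closed-form estimates (slope = sum of cross-products over sum of squares of x), and the data points are hypothetical.

```python
# Hypothetical paired observations (illustrative values only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares estimates of the gradient m and intercept c.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
m = s_xy / s_xx
c = y_bar - m * x_bar

# Residual = observed y minus predicted y.
predicted = [m * xi + c for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]

print(f"y = {m:.2f}x + {c:.2f}")
print("residuals:", [round(r, 2) for r in residuals])
print("sum of squared residuals:", round(sum(r ** 2 for r in residuals), 4))
```

For a least squares line with an intercept, the residuals sum to zero apart from rounding error.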

how to work out variance?

sum of squares/degrees of freedom

if the test statistic exceeds the critical value then what do you do

reject the null hypothesis

what are the two sources of error in the regression model

standard error of the residuals and the standard error of the prediction

what is homoscedasticity?

the residuals have equal variance along the regression line, i.e. the scatter of points either side of the line is constant

what is heteroscedasticity?

the variance of the residuals is not constant along the line, so the regression equation may be unstable

what is autocorrelation?

correlation between the elements of a series and others from the same series separated from them by a given interval.

what is the name of the test used to detect autocorrelation?

Durbin-Watson d statistic, sum of squared successive differences of the residuals / sum of squared residuals
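
A minimal sketch of the d statistic computed directly from a list of residuals. The function name and both residual series are hypothetical, chosen to show one alternating and one trending pattern.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences over sum of squared residuals."""
    successive = sum(
        (residuals[t] - residuals[t - 1]) ** 2
        for t in range(1, len(residuals))
    )
    total = sum(e ** 2 for e in residuals)
    return successive / total

# Hypothetical residual series (illustrative values only).
alternating = [1, -1, 1, -1]    # neighbours have opposite signs
trending = [2, 1, 0, -1, -2]    # neighbours are similar

print("alternating:", durbin_watson(alternating))
print("trending:   ", durbin_watson(trending))
```

Values near 2 suggest no autocorrelation, values towards 0 suggest positive autocorrelation and values towards 4 suggest negative autocorrelation.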

where can your statistic fall?

in the region of rejection (reject H0), the region of acceptance, or the indeterminate region

difference between non linear regression and linear regression?

linear is a straight line, non linear is a curve (a straight line does not represent the relationship), non linear regression addresses more complex equations / change that is not uniform

how does y=mx+c change for curves?

uses e (exponential) or a power, b is either a multiplication factor or a power and a is either a multiplicative or an additive constant, e.g. $y = aX^b$, where b = degree of curvature and a = location of curve
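
One common way to fit a power curve y = aX^b, sketched here as an assumption rather than the method these notes have in mind, is to take logs of both sides so that log y = log a + b log X becomes a straight line. The "true" a and b below are hypothetical and only generate illustrative data.

```python
import math

# Hypothetical curve parameters, used only to generate illustrative data.
a_true, b_true = 2.0, 1.5
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [a_true * x ** b_true for x in xs]

# Linearise: log y = log a + b * log x, then fit a straight line by least squares.
log_x = [math.log(x) for x in xs]
log_y = [math.log(y) for y in ys]
n = len(xs)
mean_lx = sum(log_x) / n
mean_ly = sum(log_y) / n

b_hat = sum(
    (lx - mean_lx) * (ly - mean_ly) for lx, ly in zip(log_x, log_y)
) / sum((lx - mean_lx) ** 2 for lx in log_x)
a_hat = math.exp(mean_ly - b_hat * mean_lx)

print(f"estimated a (location): {a_hat:.3f}")
print(f"estimated b (curvature): {b_hat:.3f}")
```

This only works when all x and y values are positive, since the logarithm is undefined otherwise.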

what is SPSS?

software for statistical analysis, including fitting non linear functions

changes in predictor variable not matched by uniform changes in y, what type of regression?

non-linear, there are multiple forms of non linear relationship

