Statistics Key Words 9-1


Raw Data

Data just as it has been collected

1 of 77

Quantitative data

Numerical observations or measurements such as 10, 5.5 or 39cm

2 of 77

Qualitative data

Non-numerical observations such as blue or cat

3 of 77

Continuous data

Quantitative data which can take any value on a continuous scale such as length or mass

4 of 77

Discrete data

Quantitative data which can only take particular values such as shoe size or number of pets

5 of 77

Categorical data

Can be sorted into non-overlapping categories

6 of 77

Bivariate data

Involves pairs of related data, e.g. height and weight

7 of 77

Multivariate data

Involves sets of three or more related data values, e.g. a plant's colour, leaf size and height

8 of 77

Ordinal data

Can be written in order or can be given a numerical rating scale

9 of 77

Primary data

Data collected by, or for, the person who is going to use it

10 of 77

Secondary data

Has been collected by someone else, e.g. websites, newspapers etc.

11 of 77

Advantages of Primary Data

- collection method known

- accuracy known

- can be very specific questions

12 of 77

Disadvantages of Primary Data

- Time consuming

- Expensive to collect

13 of 77

Advantages of Secondary Data

- easy to obtain

- cheap to obtain

- data from organisations can be more reliable

14 of 77

Disadvantages of Secondary Data

- method of collection unknown

- data might be out of date

- may contain mistakes

- may be unreliable

- may be difficult to find specific answers

15 of 77


Population

Everything or everybody that could possibly be involved in an investigation, e.g. a delivery company wants information about the number of miles travelled by its lorries. Its population would be all the company's lorries.

16 of 77


Census

A survey or investigation with data taken from every member of a population

17 of 77


Sample

You can take a sample from a population. It contains information about part of the population and can be used to draw conclusions about the whole population.

18 of 77

Advantages of a Census

- unbiased

- accurate

- takes whole population into account

19 of 77

Disadvantages of a Census

- time-consuming

- expensive

- difficult to ensure the whole population is used

- lots of data to handle

20 of 77

Advantages of a sample

- cheaper

- less time consuming

- less data to be considered

21 of 77

Disadvantages of a sample

- not completely representative of the population

- may be biased

22 of 77

Sampling Units

people or items that are to be sampled

23 of 77

Sampling Frame

is a list of all sampling units

24 of 77

Capture Recapture

size of first capture / total population = number tagged in second capture / size of second capture
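
The proportion above can be rearranged to estimate the total population. The sketch below is only an illustration: the function name, argument names and example figures are made up, not taken from the card.

```python
def estimate_population(first_capture, second_capture, tagged_in_second):
    """Estimate the total population N by rearranging
    first_capture / N = tagged_in_second / second_capture."""
    if tagged_in_second <= 0:
        raise ValueError("need at least one tagged individual in the second capture")
    return first_capture * second_capture / tagged_in_second


# Illustrative figures: 40 tagged at first, then 50 caught, 10 of them tagged.
estimate = estimate_population(first_capture=40, second_capture=50, tagged_in_second=10)
print(estimate)  # 200.0
```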

25 of 77

Capture recapture Assumptions

- the population hasn't changed

- the probability of being caught is equal for each individual

- marks/tags are not lost and are always recognisable

- the sample size is large enough to be representative of the population

26 of 77

Random Sampling

every member of the population has an equal chance of being picked

Advantage: it's fair and unbiased

Disadvantage: needs large sample size

27 of 77

Judgement Sampling

Non-random sampling where you use your judgement to select a sample that is representative of the population

28 of 77

Opportunity sampling

Non-random sampling where you use the people available at the time

29 of 77

Cluster Sampling

Non-random sampling used when the population naturally splits into groups (clusters)

30 of 77

Systematic sampling

Non-random sampling where you choose a random starting point from the sampling frame and then select at regular intervals, e.g. every 5th person
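
A minimal sketch of the idea, assuming the sampling frame is held as a Python list. The names `systematic_sample`, `frame` and `interval` are illustrative only.

```python
import random


def systematic_sample(frame, interval, rng=random):
    """Pick a random starting point within the first interval,
    then take every interval-th item from the sampling frame."""
    start = rng.randrange(interval)  # random index from 0 to interval - 1
    return frame[start::interval]


# Illustrative frame of 100 numbered people, sampling every 5th person.
frame = list(range(1, 101))
sample = systematic_sample(frame, 5, random.Random(1))
print(len(sample))  # 20
```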

31 of 77

Quota Sampling

Non-random sampling where you group the population by characteristics like hair colour or gender and interview a set number (quota) of people from each group

32 of 77

Stratified sampling

Sampling which contains members of each group (stratum) in proportion to the size of the group, with the members of each group chosen at random
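
As a rough illustration of "in proportion to the size of the group", the sketch below works out how many members to take from each group. The function name and the year-group figures are invented for the example, and rounding can make the totals drift by one in less tidy cases.

```python
def stratified_allocation(strata_sizes, sample_size):
    """Number to sample from each group = sample size x group size / population size."""
    total = sum(strata_sizes.values())
    return {name: round(sample_size * size / total) for name, size in strata_sizes.items()}


# Illustrative school of 300 pupils, sample of 30.
allocation = stratified_allocation({"Year 7": 120, "Year 8": 100, "Year 9": 80}, 30)
print(allocation)  # {'Year 7': 12, 'Year 8': 10, 'Year 9': 8}
```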

33 of 77

How to decide which sampling method to use

- biased?

- Sensible sample size?

- Quick and easy?

- Expensive?

34 of 77

Direct observation

Collecting primary data systematically by recording events as you observe them

35 of 77

data collection sheet

a table or tally chart for recording results

36 of 77

explanatory/independent variable

what you change

37 of 77

response/dependent variable

the one that is affected

38 of 77

extraneous variable

variables you are not interested in but which could change the results

39 of 77


Reliability

If repeating the experiment gives you very similar results, they are likely to be reliable

40 of 77


Simulation

You can use simulation to model random real-life events to help you predict what would actually happen. Simulation can be easier and cheaper than analysing real data.
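
A small sketch of a simulation, assuming the real-life event being modelled is rolling a fair six-sided dice. The function name and the fixed seed are illustrative choices so that the run can be repeated.

```python
import random


def simulate_dice(rolls, seed=0):
    """Simulate rolling a fair dice and count how often each face appears."""
    rng = random.Random(seed)  # seeded so the simulation is repeatable
    counts = {face: 0 for face in range(1, 7)}
    for _ in range(rolls):
        counts[rng.randint(1, 6)] += 1
    return counts


counts = simulate_dice(600)
# Proportion of simulated rolls that showed a six.
proportion_of_sixes = counts[6] / 600
print(counts, proportion_of_sixes)
```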

41 of 77

laboratory experiments

conducted in controlled environments

advantage: easy to be replicated, you can control extraneous variables

disadvantage: test subjects may behave differently

42 of 77

field experiments

carried out in the subject's everyday environment, with one or more variables controlled

advantages: more likely to reflect real-life behaviour

disadvantages: can't control some extraneous variables

43 of 77

natural experiments

carried out in the subject's everyday environment with no variables controlled

advantages: more likely to reflect real-life behaviour

disadvantages: can't control any variables, harder to replicate

44 of 77


Questionnaire

Set of questions designed to obtain data

45 of 77


Respondent

Person completing a questionnaire

46 of 77

Open and Closed Questions

Open: has no suggested answers

Closed: gives answers to choose from

47 of 77

Pilot survey

Conducted on a small sample to test the design and methods of the survey

49 of 77

Closed questionnaires

often involve an opinion scale. The problem with an opinion scale is that most people answer somewhere in the middle

50 of 77


Interview

Advantages: interviewer can explain questions, respondent can explain answers, high response rate.

Disadvantages: respondents may be less honest, can take a long time, sample size is smaller, respondents could try to impress the interviewer

51 of 77

Anonymous Questionnaire

advantages: more likely to be honest, takes much less time, large sample size, less bias

disadvantages: respondent may not understand a question, lower response rate because questions can be skipped

52 of 77

anomalous data value

a value that does not fit the pattern of the data

53 of 77

Cleaning data

Identifying and either correcting or removing inaccurate data values or extreme values, and removing units or other symbols from data. You decide what to do with the data.

54 of 77

Control group

A group that does not receive the treatment, used as a comparison to test the effectiveness of the treatment

55 of 77

Matched pairs

Two groups of people are used to test the effects of a particular factor. Each person is paired with someone similar to them in the other group (e.g. same hair colour, intelligence, gender etc.).

Advantage: control extraneous variables

Disadvantage: finding matched pairs

56 of 77


Hypothesis

An idea that can be tested by collecting and analysing data

57 of 77

Factors in designing investigations

- time

- cost

- ethical issues

- confidentiality

- convenience

- how to select population/ sample

- how to deal with non-response

- how to deal with unexpected results

58 of 77

Two-Way Tables

Shows information in two categories

59 of 77

Composite Bar Chart

Each bar shows how the frequency for that category is made up from different component groups. The total frequencies and the frequencies of each component group can be compared.

60 of 77

Comparative Pie Charts

Can be used to compare two sets of data. The areas of the two circles should be in the same ratios as the two total frequencies. To compare the total frequencies, compare the areas. To compare proportions, compare the individual angles.
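
Because the areas must be in the ratio of the total frequencies, the radii are in the ratio of the square roots of the totals. The sketch below shows that calculation; the function name and figures are illustrative only.

```python
import math


def second_radius(first_radius, first_total, second_total):
    """Radius of the second pie chart so that
    area 1 : area 2 = first_total : second_total."""
    return first_radius * math.sqrt(second_total / first_total)


# Illustrative: first chart has radius 3 cm for 100 items, second represents 400 items.
radius = second_radius(3.0, 100, 400)
print(radius)  # 6.0
```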

61 of 77

Index number

index number = price/base year price x 100
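
A quick worked example of the formula, using made-up prices. The function name is illustrative.

```python
def index_number(price, base_year_price):
    """index number = price / base year price x 100"""
    return 100 * price / base_year_price


# Illustrative: an item cost 50 in the base year and costs 60 now.
print(index_number(60, 50))  # 120.0
```

An index number of 120 means the price is 20% higher than in the base year.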

62 of 77

retail price index

rate of change of prices in everyday life (e.g. mortgage, food, heating)

63 of 77

consumer price index

same as retail price index but does not include mortgage payments

64 of 77

gross domestic product

the value of goods and services a country produces within a time period

65 of 77

weighted index number

weighted index number = current weighted mean price/base year weighted mean price x 100
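
A sketch of the calculation, assuming each item has a weight and the weighted mean price is the sum of weight x price divided by the sum of the weights. The function names, prices and weights are invented for the example.

```python
def weighted_mean(prices, weights):
    """Weighted mean = sum of (weight x price) / sum of weights."""
    return sum(w * p for p, w in zip(prices, weights)) / sum(weights)


def weighted_index_number(current_prices, base_prices, weights):
    """current weighted mean price / base year weighted mean price x 100"""
    return 100 * weighted_mean(current_prices, weights) / weighted_mean(base_prices, weights)


# Illustrative: two items with weights 3 and 1.
index = weighted_index_number(current_prices=[3, 5], base_prices=[2, 4], weights=[3, 1])
print(index)  # 140.0
```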

66 of 77

chain base index number

chain base index number = price/last year's price x 100
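
Unlike a fixed base year, each year here is compared with the year before. A minimal sketch with made-up yearly prices (the function name is illustrative):

```python
def chain_base_indices(prices):
    """For each year after the first: price / previous year's price x 100."""
    return [100 * current / previous for previous, current in zip(prices, prices[1:])]


# Illustrative prices for three consecutive years.
print(chain_base_indices([50, 60, 66]))  # [120.0, 110.0]
```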

67 of 77

crude rate

crude rate = number of (deaths/births/etc.)/total population x1000

68 of 77

standard population

standard population = number in age group/total population x 1000

69 of 77

standardised rate

standardised rate = crude rate for the age group/1000 x standard population
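
The three rate formulas fit together as in the sketch below. The function names and figures are illustrative, and the crude rate passed to the last function is assumed to be the one for a single age group. The standardised rates for each age group are usually then added together to give an overall figure.

```python
def crude_rate(events, population):
    """number of deaths/births/etc. / total population x 1000"""
    return 1000 * events / population


def standard_population(number_in_age_group, total_population):
    """number in age group / total population x 1000"""
    return 1000 * number_in_age_group / total_population


def standardised_rate(age_group_crude_rate, std_population):
    """crude rate / 1000 x standard population"""
    return age_group_crude_rate * std_population / 1000


# Illustrative: 20 deaths among 4000 people in one age group of a town,
# and that age group makes up 250,000 of a 1,000,000 reference population.
rate = crude_rate(20, 4000)                      # 5.0 per 1000
std_pop = standard_population(250_000, 1_000_000)  # 250.0
print(standardised_rate(rate, std_pop))          # 1.25
```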

70 of 77


Experimental probability (relative frequency)

number of trials where the event happens / total number of trials

71 of 77

relative risk

risk for those in that group/ risk for those not in the group
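
A small worked example, assuming the risk for each group is estimated as the number of people affected divided by the size of the group. The function name and figures are made up.

```python
def relative_risk(affected_in_group, group_size, affected_outside, outside_size):
    """risk for those in the group / risk for those not in the group"""
    risk_in_group = affected_in_group / group_size
    risk_outside = affected_outside / outside_size
    return risk_in_group / risk_outside


# Illustrative: 20 of 100 people in the group are affected, 10 of 100 outside it.
rr = relative_risk(20, 100, 10, 100)
print(rr)  # 2.0
```

A relative risk of 2 means the event is twice as likely for those in the group.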

72 of 77


For a set of mutually exclusive, exhaustive events that cover all possible outcomes, the probabilities must add up to 1

73 of 77

binomial distribution


n = the number of trials

p= the probability of success

the coefficients follow the pattern of Pascal's triangle
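
The link to Pascal's triangle can be seen by computing the probabilities directly. This sketch uses the standard binomial probability formula, which is not written out on the card; the function name and the choice of n = 3, p = 0.5 are illustrative.

```python
from math import comb


def binomial_probability(n, p, r):
    """P(exactly r successes in n trials) = nCr x p^r x (1 - p)^(n - r)."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)


# The coefficients nCr for n = 3 are 1, 3, 3, 1: a row of Pascal's triangle.
coefficients = [comb(3, r) for r in range(4)]
probabilities = [binomial_probability(3, 0.5, r) for r in range(4)]
print(coefficients)   # [1, 3, 3, 1]
print(probabilities)  # [0.125, 0.375, 0.375, 0.125]
```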

74 of 77

seasonal effect

real value - value from the trend line

75 of 77

Pearson's product moment correlation coefficient

measures the strength of linear correlation between bivariate data.

takes values between -1 and 1
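
One common way to calculate the coefficient is from the sums of squared deviations, sketched below. The formula is the standard one rather than anything given on the card, and the function name and data are illustrative.

```python
from math import sqrt


def pmcc(xs, ys):
    """r = Sxy / sqrt(Sxx x Syy), using deviations from the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Illustrative data lying exactly on a straight line with positive gradient.
r = pmcc([1, 2, 3, 4], [2, 4, 6, 8])
print(r)  # 1.0
```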

76 of 77

standardised score

(x - mean) / standard deviation
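
A short worked example with made-up figures. Note that the subtraction is done before the division. The function name is illustrative.

```python
def standardised_score(x, mean, standard_deviation):
    """(x - mean) / standard deviation"""
    return (x - mean) / standard_deviation


# Illustrative: a mark of 70 in a test with mean 60 and standard deviation 5.
print(standardised_score(70, 60, 5))  # 2.0
```

A score of 2 means the value is two standard deviations above the mean.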

77 of 77



