- Created by: pandyay.15
- Created on: 20-01-20 14:26
Statistics and Mechanics
1 of 86
2 of 86
What is a population?
The whole set of items that are of interest
3 of 86
What is a census?
A census observes or measures every member of a population. An advantage is that it should give a completely accurate result but a disadvantage is that it is time consuming
4 of 86
What is a sample?
A selection of observations taken from a subset of the population. An advantage is that it is quick and there is less data to process but a disadvantage is that it may not be as accurate
5 of 86
What are the individual units of a population?
Sampling units
6 of 86
What do sampling units form?
A sampling frame
7 of 86
What are the three types of random sampling?
Simple random; systematic and stratified sampling
8 of 86
What is simple random sampling?
One where every sample of size n has an equal chance of being selected. An advantage is that it has no bias but a disadvantage is that it is not suitable when the population size is large
9 of 86
What is systematic sampling?
The required elements are chosen at regular intervals from an ordered list. An advantage is that it is suitable for large populations but a disadvantage is that it can introduce bias if the sampling frame is not random
10 of 86
What is stratified sampling?
The population is divided into mutually exclusive strata and a random sample is taken from each. An advantage is that it accurately reflects the population structure but a disadvantage is that the population must be classified into distinct strata
11 of 86
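The three random sampling methods above can be sketched in Python. This is only an illustration: the sampling frame of 100 numbered units, the two strata and the function names are all hypothetical, and the stratified version assumes each stratum is sampled in proportion to its size.

```python
import random

def simple_random_sample(frame, n, rng):
    # Every possible sample of size n is equally likely to be chosen.
    return rng.sample(frame, n)

def systematic_sample(frame, n, rng):
    # Take every k-th unit from the ordered list, starting at a random
    # position within the first k units.
    k = len(frame) // n
    start = rng.randrange(k)
    return [frame[start + i * k] for i in range(n)]

def stratified_sample(strata, n, rng):
    # Take a random sample from each stratum, sized in proportion to
    # that stratum's share of the population (an assumption of this sketch).
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        size = round(n * len(members) / total)
        sample[name] = rng.sample(members, size)
    return sample

rng = random.Random(0)             # fixed seed so the sketch is repeatable
frame = list(range(1, 101))        # hypothetical sampling frame
strata = {"year 12": list(range(60)), "year 13": list(range(60, 100))}

srs = simple_random_sample(frame, 10, rng)
sys_sample = systematic_sample(frame, 10, rng)
strat = stratified_sample(strata, 10, rng)
```

With 60 units in one stratum and 40 in the other, a sample of 10 takes 6 from the first and 4 from the second.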
What are the two types of non-random sampling?
Quota and opportunity sampling
12 of 86
What is quota sampling?
A researcher selects a sample that reflects the characteristics of the whole population. An advantage is that it allows for easy comparison between groups but a disadvantage is that it can introduce bias
13 of 86
What is opportunity sampling?
Consists of taking the sample from people who are available at the time the study is carried out and fit the criteria. An advantage is that it is inexpensive but a disadvantage is that it is unlikely to provide a representative sample
14 of 86
What is quantitative data?
Variables or data associated with numerical observations
15 of 86
What is qualitative data?
Variables or data associated with non-numerical observations
16 of 86
What is a continuous variable?
A variable that can take any value in a given range
17 of 86
What is a discrete variable?
A variable that can take only specific values in a given range
18 of 86
When data is presented in a grouped frequency table what are the groups known as?
They are more commonly known as classes
19 of 86
What are class boundaries?
Class boundaries tell you the maximum and minimum values that belong in each class
20 of 86
What is the midpoint?
The average of the class boundaries
21 of 86
What is the class width?
The difference between the upper and lower class boundaries
22 of 86
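A short Python sketch of the midpoint and class width definitions above. The class boundaries are hypothetical and chosen only for illustration.

```python
# Hypothetical class boundaries (lower, upper) from a grouped frequency table.
classes = [(0, 10), (10, 20), (20, 40)]

# Midpoint: the average of the two class boundaries.
midpoints = [(lower + upper) / 2 for lower, upper in classes]

# Class width: upper class boundary minus lower class boundary.
widths = [upper - lower for lower, upper in classes]

print(midpoints)  # [5.0, 15.0, 30.0]
print(widths)     # [10, 10, 20]
```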
Measures of location and spread
23 of 86
What is the mode?
The value or class that occurs most often
24 of 86
What is the median?
The middle value when the data values are put in order
25 of 86
What is the formula for the mean?
x̄ = Σx / n (the sum of the data values divided by the number of data values)
26 of 86
What is the formula for the lower quartile?
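Q1 is the (n/4)th value in the ordered data. For listed data, if n/4 is a whole number, take the point halfway between that value and the next one; otherwise round n/4 up and take that value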
27 of 86
What is the formula for the upper quartile?
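Q3 is the (3n/4)th value in the ordered data. For listed data, if 3n/4 is a whole number, take the point halfway between that value and the next one; otherwise round 3n/4 up and take that value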
28 of 86
What is the range?
The difference between the largest and smallest values in the data set
29 of 86
What is the interquartile range?
The difference between the upper quartile and lower quartile: Q3-Q1
30 of 86
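The measures above can be checked on a small data set. This Python sketch assumes a common textbook convention for listed data: find position n/4 (or 2n/4, 3n/4) in the ordered list, take the point halfway to the next value if the position is a whole number, and round up otherwise. The data and the helper name `value_at_quarter` are hypothetical.

```python
def value_at_quarter(ordered, quarters):
    """Return the value at position quarters * n / 4 in an ordered list.

    Assumed convention: a whole-number position gives the point halfway
    between that value and the next one; otherwise the position is rounded up.
    """
    n = len(ordered)
    whole, remainder = divmod(quarters * n, 4)
    if remainder == 0:
        return (ordered[whole - 1] + ordered[whole]) / 2
    return ordered[whole]  # rounding up gives position whole + 1, index whole

data = sorted([7, 2, 15, 4, 10, 5, 12, 8])  # hypothetical data set, n = 8

mean = sum(data) / len(data)
q1 = value_at_quarter(data, 1)
median = value_at_quarter(data, 2)
q3 = value_at_quarter(data, 3)
data_range = data[-1] - data[0]
iqr = q3 - q1

print(mean, q1, median, q3, data_range, iqr)  # 7.875 4.5 7.5 11.0 13 6.5
```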
What is the interpercentile range?
The difference between the values for two given percentiles
31 of 86
What is the formula for variance?
Variance = Σx²/n − (Σx/n)², that is, the mean of the squared data values minus the square of the mean
32 of 86
What is the formula for standard deviation?
Standard deviation = The square root of variance
33 of 86
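A minimal Python sketch of the variance and standard deviation formulas above, using "mean of the squares minus square of the mean". The data set is hypothetical.

```python
import math

def variance(data):
    # Mean of the squared values minus the square of the mean.
    n = len(data)
    return sum(x * x for x in data) / n - (sum(data) / n) ** 2

def standard_deviation(data):
    # Standard deviation is the square root of the variance.
    return math.sqrt(variance(data))

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data set

print(variance(data))            # 4.0
print(standard_deviation(data))  # 2.0
```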
If data is coded using the formula y = (x − a)/b, what are the mean and standard deviation of the coded data?
ȳ = (x̄ − a)/b. Standard deviation of the coded data = standard deviation of the original data / b
34 of 86
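The coding rules can be verified numerically. In this Python sketch the data and the constants a = 100 and b = 10 are hypothetical; `pstdev` from the standard library is used because it divides by n, matching the variance formula in this set.

```python
from statistics import mean, pstdev

a, b = 100, 10                        # hypothetical coding constants
x = [110, 120, 130, 140, 150]         # hypothetical original data
y = [(value - a) / b for value in x]  # coded data: y = (x - a) / b

# Predicted by the coding rules, without touching the coded values.
coded_mean = (mean(x) - a) / b
coded_sd = pstdev(x) / b

# Calculated directly from the coded values, for comparison.
direct_mean = mean(y)
direct_sd = pstdev(y)
```

Both routes give the same mean and standard deviation, which is why coding can simplify a calculation without changing the conclusion.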
Representations of data
35 of 86
What is a common definition of an outlier?
Any value that is either greater than Q3 + k(Q3 − Q1) or less than Q1 − k(Q3 − Q1), where k is a constant given in the question (often 1.5)
36 of 86
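A Python sketch of the outlier rule above. The quartiles, the data and the default k = 1.5 are assumptions made for illustration; a question would normally state the value of k.

```python
def outlier_limits(q1, q3, k=1.5):
    # Values below the lower limit or above the upper limit are outliers.
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def find_outliers(data, q1, q3, k=1.5):
    low, high = outlier_limits(q1, q3, k)
    return [x for x in data if x < low or x > high]

data = [3, 12, 15, 18, 22, 40]   # hypothetical data set
limits = outlier_limits(10, 20)  # hypothetical quartiles Q1 = 10, Q3 = 20
outliers = find_outliers(data, 10, 20)

print(limits)    # (-5.0, 35.0)
print(outliers)  # [40]
```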
When drawing a boxplot what is the order of each line?
Lowest value; lower quartile; median; upper quartile; highest value
37 of 86
What is the process of removing anomalies from a data set?
It is known as cleaning the data
38 of 86
In a histogram, what is the formula for the area of a bar?
Area of bar = k × frequency, where k is a constant
39 of 86
What is the formula for frequency density when k = 1?
If k = 1, then frequency density = frequency/ class width
40 of 86
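A Python sketch of frequency density for the case k = 1, where the area of each bar equals its frequency. The classes and frequencies are hypothetical.

```python
# Hypothetical grouped data: class boundaries and frequencies.
classes = [(0, 10), (10, 20), (20, 40)]
frequencies = [5, 8, 7]

def frequency_density(frequency, lower, upper):
    # With k = 1: frequency density = frequency / class width.
    return frequency / (upper - lower)

densities = [
    frequency_density(f, lower, upper)
    for f, (lower, upper) in zip(frequencies, classes)
]

# Bar area = frequency density x class width, which recovers the frequency.
areas = [d * (upper - lower) for d, (lower, upper) in zip(densities, classes)]

print(densities)  # [0.5, 0.8, 0.35]
```

The widest class has the second-largest frequency but the smallest density, which is why bar height alone cannot be read as frequency when class widths differ.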
What forms a frequency polygon?
Joining the middle of the top of each bar in a histogram with equal class widths forms a frequency polygon
41 of 86
When comparing data sets what can you comment on?
A measure of location and a measure of spread
42 of 86
43 of 86
What is bivariate data?
Data which has pairs of values for two variables
44 of 86
What is correlation?
It describes the nature of the linear relationship between two variables
45 of 86
Which variable is plotted on which axis?
The independent variable is usually on the x-axis and the dependent variable is usually on the y-axis
46 of 86
What is meant by the term causal relationship?
Two variables have a causal relationship if a change in one variable causes a change in the other
47 of 86
What is the regression line of y on x?
y = a + bx
48 of 86
What does the coefficient b tell us?
It tells us the change in y for every unit change in x
49 of 86
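The cards above give only the form y = a + bx. As an illustration, this Python sketch finds a and b with the standard least-squares formulas b = Sxy/Sxx and a = ȳ − b x̄, which are not stated in this set and are an assumption here; the data is hypothetical.

```python
def regression_line(xs, ys):
    # Least-squares regression line of y on x, returned as (a, b)
    # for the equation y = a + b * x.
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return a, b

xs = [1, 2, 3, 4, 5]  # hypothetical independent variable
ys = [2, 4, 5, 4, 5]  # hypothetical dependent variable

a, b = regression_line(xs, ys)
# b is approximately 0.6: y rises by about 0.6 for each unit increase in x.
# a is approximately 2.2: the predicted value of y when x = 0.
```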
50 of 86
What is an experiment?
A repeatable process that gives rise to a number of outcomes
51 of 86
What can a Venn diagram be used for?
To represent events graphically. Frequencies or probabilities can be placed in the regions of the Venn diagram
52 of 86
What is an event?
A collection of one or more outcomes
53 of 86
What is a sample space?
The set of all possible outcomes
54 of 86
On a Venn diagram what does intersection of A and B mean?
The region that satisfies both A and B (the overlap in the middle)
55 of 86
On a Venn diagram what does union of A and B mean?
The region that satisfies A or B or both (everything inside the two circles)
56 of 86
On a Venn diagram what does complement of A mean?
The region that satisfies not A (everything outside circle A, including the part of B that does not overlap A)
57 of 86
What is meant by mutually exclusive?
When events have no outcomes in common (On a diagram this means the two circles are separate and have no middle)
58 of 86
What is meant by independent?
When one event has no effect on another, so P(A and B) = P(A) × P(B)
59 of 86
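The Venn diagram regions map directly onto set operations. This Python sketch assumes a fair six-sided die, so every outcome is equally likely; the events A, B and C are hypothetical. Independence is checked with the standard product rule P(A and B) = P(A) × P(B).

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die (assumed)
A = {2, 4, 6}                      # hypothetical event: an even number
B = {1, 2}                         # hypothetical event: less than 3
C = {1, 3}                         # hypothetical event: 1 or 3

def probability(event):
    # Valid only because all outcomes are assumed equally likely.
    return Fraction(len(event), len(sample_space))

intersection = A & B               # both A and B: {2}
union = A | B                      # A or B or both: {1, 2, 4, 6}
complement_A = sample_space - A    # not A: {1, 3, 5}

# Mutually exclusive: no outcomes in common.
mutually_exclusive = len(A & C) == 0

# Independent: P(A and B) equals P(A) x P(B).
independent = probability(A & B) == probability(A) * probability(B)

print(mutually_exclusive, independent)  # True True
```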
What is a tree diagram used for?
To show the outcomes of two or more events happening in succession
60 of 86
61 of 86
What is a random variable?
A variable whose value depends on the outcome of a random event
62 of 86
What is the term for the range of values that a random variable can take?
The sample space
63 of 86
What is meant by the term discrete?
The variable is discrete if it can only take certain numerical values
64 of 86
What is meant by the term random?
The variable is random if its outcome is not known until the experiment is carried out
65 of 86
What does a probability distribution do?
It fully describes the probability of any outcome in the sample space
66 of 86
How are random variables represented?
By upper case letters like X or Y
67 of 86
How are particular values the random variable can take represented?
By lower case letters like x or y
68 of 86
What does P(X = x) mean?
The probability that the random variable takes a particular value x
69 of 86
What is the term used when all of the probabilities are the same?
Discrete uniform distribution
70 of 86
What is the formula for binomial distribution?
P(X = r) = (number of trials choose number of successes) × (probability of success) to the power of the number of successes × (probability of failure) to the power of the number of failures
71 of 86
What is the numerical formula for binomial distribution?
P(X = r) = nCr × p^r × (1 − p)^(n − r)
72 of 86
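A minimal Python sketch of the binomial formula above, using `math.comb` (Python 3.8 or later) for nCr. The values n = 10, p = 0.5 and r = 3 are hypothetical.

```python
from math import comb

def binomial_pmf(r, n, p):
    # P(X = r) = nCr * p^r * (1 - p)^(n - r)
    return comb(n, r) * p**r * (1 - p) ** (n - r)

prob = binomial_pmf(3, 10, 0.5)
print(prob)  # 0.1171875
```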
What does a cumulative probability function tell you?
The sum of all the individual probabilities up to and including the given value of x, that is, P(X ≤ x)
73 of 86
How can binomial distribution be written in a question?
X ~ B (n,p)
74 of 86
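For X ~ B(n, p), the cumulative probability P(X ≤ x) is the sum of the individual binomial probabilities from 0 up to x. A short Python sketch, with hypothetical values n = 10 and p = 0.5:

```python
from math import comb

def binomial_cdf(x, n, p):
    # P(X <= x): add P(X = r) for r = 0, 1, ..., x.
    return sum(comb(n, r) * p**r * (1 - p) ** (n - r) for r in range(x + 1))

print(binomial_cdf(3, 10, 0.5))  # 0.171875
```

A probability such as P(X ≥ 4) then follows as 1 − P(X ≤ 3).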
75 of 86
What is a hypothesis?
A statement made about the value of a population parameter
76 of 86
What is a test statistic?
The result of the experiment or the statistic that is calculated from the sample
77 of 86
What is the null hypothesis?
The hypothesis that you assume to be correct
78 of 86
What is the alternative hypothesis?
It tells you about the parameter if your assumption is shown to be wrong
79 of 86
What is the difference between one-tailed and two-tailed tests?
In a one-tailed test the alternative hypothesis uses either < or > (not both), so there is one critical region. In a two-tailed test the alternative hypothesis uses ≠, so there are two critical regions, one at each end of the distribution
80 of 86
What is meant by the significance level?
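The given probability threshold (for example 5%) that you measure against: if the probability of a result at least as extreme as the one observed is below it, you reject the null hypothesis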
81 of 86
What is meant by a critical region?
A region of the probability distribution which, if the test statistic falls within it, would cause you to reject the null hypothesis
82 of 86
What is meant by a critical value?
The first value to fall inside of the critical region
83 of 86
What is the acceptance region?
The region where we accept the null hypothesis
84 of 86
What is meant by the actual significance level?
The probability of incorrectly rejecting the null hypothesis
85 of 86
When using the significance level for a two-tailed test what do you need to consider?
You need to halve the significance level, so that the critical region at each end of the distribution uses half of the total
86 of 86
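The hypothesis testing ideas above can be tied together with a small example. This Python sketch assumes X ~ B(10, 0.5) under the null hypothesis and a two-tailed test at the 5% significance level, so each tail uses 2.5%; these values and the function names are hypothetical.

```python
from math import comb

def pmf(r, n, p):
    return comb(n, r) * p**r * (1 - p) ** (n - r)

def lower_critical_value(n, p, alpha):
    # Largest c with P(X <= c) <= alpha, or None if there is no such c.
    cumulative, critical = 0.0, None
    for r in range(n + 1):
        cumulative += pmf(r, n, p)
        if cumulative > alpha:
            break
        critical = r
    return critical

def upper_critical_value(n, p, alpha):
    # Smallest c with P(X >= c) <= alpha, or None if there is no such c.
    tail, critical = 0.0, None
    for r in range(n, -1, -1):
        tail += pmf(r, n, p)
        if tail > alpha:
            break
        critical = r
    return critical

n, p, significance = 10, 0.5, 0.05
tail_level = significance / 2  # two-tailed: halve the level for each end

lower = lower_critical_value(n, p, tail_level)
upper = upper_critical_value(n, p, tail_level)

# Actual significance level: probability of landing in the critical region
# when the null hypothesis is true.
actual = sum(pmf(r, n, p) for r in range(n + 1) if r <= lower or r >= upper)

print(lower, upper, actual)  # 1 9 0.021484375
```

Here the critical region is X ≤ 1 or X ≥ 9, and the actual significance level of about 2.1% is below the 5% target because the distribution is discrete.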