# Maths (Applied)

Statistics and Mechanics
Chapter 1
Data Collection
What is a population?
The whole set of items that are of interest
What is a census?
A census observes or measures every member of a population. An advantage is that it should give a completely accurate result but a disadvantage is that it is time consuming
What is a sample?
A selection of observations taken from a subset of the population. An advantage is that it is quick and there is less data to process but a disadvantage is that it may not be as accurate
What are the individual units of a population?
Sampling units
What do sampling units form?
A sampling frame
What are the three types of random sampling?
Simple random, systematic and stratified sampling
What is simple random sampling?
One where every sample of size n has an equal chance of being selected. An advantage is that it is free of bias but a disadvantage is that it is not suitable when the population is large
What is systematic sampling?
The required elements are chosen at regular intervals from an ordered list. An advantage is that it is suitable for large populations but a disadvantage is that it can introduce bias if the sampling frame is not random
What is stratified sampling?
The population is divided into mutually exclusive strata and a random sample is taken from each. An advantage is that it accurately reflects the population structure but a disadvantage is that the population must be classified into distinct strata
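The three random sampling methods above can be sketched in Python. This is a minimal illustration, not a prescribed method: the function names, the `seed` parameter and the proportional allocation used for the strata are assumptions made for the example.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Pick n sampling units so that every sample of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

def systematic_sample(frame, n, seed=None):
    """Take units at regular intervals from an ordered list,
    starting at a random position within the first interval."""
    k = len(frame) // n            # sampling interval
    rng = random.Random(seed)
    start = rng.randrange(k)       # random start among the first k units
    return [frame[start + i * k] for i in range(n)]

def stratified_sample(strata, n, seed=None):
    """Take a random sample from each stratum.
    `strata` maps a stratum name to the list of units in it.
    Assumption: each stratum contributes in proportion to its size."""
    rng = random.Random(seed)
    total = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        size = round(n * len(units) / total)
        sample[name] = rng.sample(units, size)
    return sample
```

For example, with a sampling frame of 100 numbered units, `systematic_sample(list(range(1, 101)), 10)` picks every 10th unit after a random start.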
What are the two types of non-random sampling?
Quota and opportunity sampling
What is quota sampling?
A researcher selects a sample that reflects the characteristics of the whole population. An advantage is that it allows for easy comparison between groups but a disadvantage is that it can introduce bias
What is opportunity sampling?
Consists of taking the sample from people who are available at the time the study is carried out and fit the criteria. An advantage is that it is inexpensive but a disadvantage is that it is unlikely to provide a representative sample
What is quantitative data?
Variables or data associated with numerical observations
What is qualitative data?
Variables or data associated with non-numerical observations
What is a continuous variable?
A variable that can take any value in a given range
What is a discrete variable?
A variable that can take only specific values in a given range
When data is presented in a grouped frequency table what are the groups known as?
They are more commonly known as classes
What are class boundaries?
Class boundaries tell you the maximum and minimum values that belong in each class
What is the midpoint?
The average of the class boundaries
What is the class width?
The difference between the upper and lower class boundaries
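As a quick check of the midpoint and class width definitions, here is a small Python sketch. The function names and the example class with boundaries 10 and 20 are assumptions made for illustration.

```python
def midpoint(lower_boundary, upper_boundary):
    """Average of the two class boundaries."""
    return (lower_boundary + upper_boundary) / 2

def class_width(lower_boundary, upper_boundary):
    """Difference between the upper and lower class boundaries."""
    return upper_boundary - lower_boundary

# A class with boundaries 10 and 20 (illustrative values)
print(midpoint(10, 20))      # 15.0
print(class_width(10, 20))   # 10
```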
Chapter 2
What is the mode?
The value or class that occurs most often
What is the median?
The middle value when the data values are put in order
What is the formula for the mean?
x̄ = (sum of the data values) / (number of data values), that is x̄ = Σx / n
What is the formula for the lower quartile?
Q1 is the data value at position n/4 in the ordered data
What is the formula for the upper quartile?
Q3 is the data value at position 3n/4 in the ordered data
What is the range?
The difference between the largest and smallest values in the data set
What is the interquartile range?
The difference between the upper quartile and lower quartile: Q3-Q1
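The quartile positions n/4 and 3n/4 can be turned into values with a short Python sketch. It assumes one common convention for discrete data, which is not stated in these notes: if the position is a whole number, take the midpoint of that value and the next one, otherwise round the position up. The function names are hypothetical.

```python
import math

def value_at_position(ordered, position):
    """Return the value at a 1-based position in ordered data.
    Assumed convention: whole-number position -> midpoint of that
    value and the next; otherwise round the position up."""
    if position == int(position):
        p = int(position)
        return (ordered[p - 1] + ordered[p]) / 2
    return ordered[math.ceil(position) - 1]

def quartiles(data):
    """Return (Q1, median, Q3) using positions n/4, n/2 and 3n/4."""
    ordered = sorted(data)
    n = len(ordered)
    q1 = value_at_position(ordered, n / 4)
    q2 = value_at_position(ordered, n / 2)
    q3 = value_at_position(ordered, 3 * n / 4)
    return q1, q2, q3

def interquartile_range(data):
    q1, _, q3 = quartiles(data)
    return q3 - q1
```

Other conventions (for example linear interpolation for grouped data) give slightly different quartiles, so the results here may not match a calculator exactly.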
What is the interpercentile range?
The difference between the values for two given percentiles
What is the formula for variance?
Variance = (sum of each data value squared) / (number of data values) - ((sum of the data values) / (number of data values)) squared, that is Σx²/n - (Σx/n)²
What is the formula for standard deviation?
Standard deviation = The square root of variance
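The variance formula above ("mean of the squares minus the square of the mean") translates directly into Python. This is a minimal sketch with made-up data; the function names are assumptions.

```python
import math

def variance(data):
    """Mean of the squares minus the square of the mean."""
    n = len(data)
    mean_of_squares = sum(x * x for x in data) / n
    square_of_mean = (sum(data) / n) ** 2
    return mean_of_squares - square_of_mean

def standard_deviation(data):
    """Square root of the variance."""
    return math.sqrt(variance(data))

data = [2, 4, 4, 4, 5, 5, 7, 9]     # illustrative values
print(variance(data))               # 4.0
print(standard_deviation(data))     # 2.0
```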
If data is coded using the formula y = (x - a)/b, what are the mean and standard deviation of the coded data?
ȳ = (x̄ - a)/b, and the standard deviation of the coded data = (standard deviation of the original data)/b
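The effect of coding can be checked numerically. The sketch below codes a made-up data set with assumed values a = 100 and b = 10, then compares the mean and standard deviation of the coded data with the results above. It uses `statistics.pstdev`, the population standard deviation, to match the variance formula in these notes.

```python
import math
import statistics

def code_data(data, a, b):
    """Apply the coding y = (x - a) / b to every data value."""
    return [(x - a) / b for x in data]

x = [110, 120, 130, 140, 150]       # illustrative values
a, b = 100, 10                      # assumed coding constants
y = code_data(x, a, b)

# Mean of coded data equals (mean of original data - a) / b
print(math.isclose(statistics.mean(y), (statistics.mean(x) - a) / b))   # True
# Standard deviation of coded data equals original standard deviation / b
print(math.isclose(statistics.pstdev(y), statistics.pstdev(x) / b))     # True
```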
Chapter 3
Representations of data
What is a common definition of an outlier?
Any value that is: Either greater than Q3 + k(Q3 - Q1) or less than Q1 - k(Q3 - Q1)
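The outlier rule above can be sketched in Python. The value of k is given in the question in practice; the default k = 1.5 used here is only a commonly used choice and is an assumption, as are the function names and data.

```python
def outlier_limits(q1, q3, k=1.5):
    """Return the (lower, upper) limits Q1 - k(Q3 - Q1) and Q3 + k(Q3 - Q1)."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def find_outliers(data, q1, q3, k=1.5):
    """Return the values lying strictly outside the two limits."""
    low, high = outlier_limits(q1, q3, k)
    return [x for x in data if x < low or x > high]

print(outlier_limits(10, 20))                        # (-5.0, 35.0)
print(find_outliers([3, 12, 18, 36, -6], 10, 20))    # [36, -6]
```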
When drawing a boxplot what is the order of each line?
Lowest value; lower quartile; median; upper quartile; highest value
What is the process of removing anomalies from a data set?
It is known as cleaning the data
What is the formula for the area of the bar?
Area of bar = k x Frequency
What is the formula for frequency density when k = 1?
If k = 1, then frequency density = frequency/ class width
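For a histogram with k = 1, the frequency density of each class can be computed as follows. The grouped data in this sketch is invented for illustration and the function name is an assumption.

```python
def frequency_density(frequency, lower_boundary, upper_boundary):
    """Frequency divided by class width (the case k = 1)."""
    return frequency / (upper_boundary - lower_boundary)

# (lower boundary, upper boundary, frequency) for each class, illustrative values
classes = [(0, 10, 5), (10, 14, 12), (14, 20, 9)]

for lower, upper, freq in classes:
    density = frequency_density(freq, lower, upper)
    # With k = 1 the area of each bar (width x density) equals the frequency
    print(lower, upper, density, (upper - lower) * density)
```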
What forms a frequency polygon?
Joining the middle of the top of each bar in a histogram with equal class widths forms a frequency polygon
When comparing data sets what can you comment on?
A measure of location and a measure of spread
Chapter 4
Correlation
What is bivariate data?
Data which has pairs of values for two variables
What is correlation?
It describes the nature of the linear relationship between two variables
Which variable is plotted on which axis?
The independent variable is usually on the x-axis and the dependent variable is usually on the y-axis
What is meant by the term causal relationship?
Two variables have a causal relationship if a change in one variable causes a change in the other
What is the regression line of y on x?
y = a + bx
What does the coefficient b tell us?
It tells us the change in y for every unit change in x
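To see where a and b come from, here is a sketch that fits y = a + bx by least squares. The formulas b = Sxy/Sxx and a = ȳ - b x̄ are the standard least-squares results but are not given in these notes, so treat them as an assumption here. In practice a calculator usually provides a and b.

```python
def regression_line(xs, ys):
    """Return (a, b) for the least-squares regression line y = a + bx."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx            # change in y for each unit change in x
    a = y_bar - b * x_bar      # value of y when x = 0
    return a, b

a, b = regression_line([1, 2, 3, 4], [3, 5, 7, 9])   # illustrative data
print(a, b)   # 1.0 2.0
```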
Chapter 5
Probability
What is an experiment?
A repeatable process that gives rise to a number of outcomes
What can a Venn diagram be used for?
To represent events graphically. Frequencies or probabilities can be placed in the regions of the Venn diagram
What is an event?
A collection of one or more outcomes
What is a sample space?
The set of all possible outcomes
On a Venn diagram what does intersection of A and B mean?
The region that satisfies both A and B (the overlap in the middle)
On a Venn diagram what does union of A and B mean?
The region that satisfies A or B (Everything inside the two circles)
On a Venn diagram what does complement of A mean?
The region that satisfies not A (everything outside circle A, including the part of B that does not overlap A)
What is meant by mutually exclusive?
When events have no outcomes in common (On a diagram this means the two circles are separate and have no middle)
What is meant by independent?
When whether one event happens has no effect on the probability of the other
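The Venn diagram ideas above map onto Python sets. This sketch assumes one roll of a fair die as the experiment and uses the standard test for independence, P(A and B) = P(A) × P(B), which is not stated in these notes. All names are illustrative.

```python
from fractions import Fraction

sample_space = set(range(1, 7))     # outcomes of one roll of a fair die

def probability(event):
    """Equally likely outcomes: favourable outcomes / all outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))

def mutually_exclusive(a, b):
    """True when the events have no outcomes in common."""
    return len(a & b) == 0

def independent(a, b):
    """True when P(A and B) = P(A) x P(B)."""
    return probability(a & b) == probability(a) * probability(b)

even = {2, 4, 6}
odd = {1, 3, 5}
at_most_two = {1, 2}

print(even & at_most_two)             # intersection: {2}
print(even | at_most_two)             # union: {1, 2, 4, 6}
print(sample_space - even)            # complement of even: {1, 3, 5}
print(mutually_exclusive(even, odd))  # True
print(independent(even, at_most_two)) # True
```

Note that `even` and `odd` are mutually exclusive but not independent: knowing one happened tells you the other did not.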
What is a tree diagram used for?
To show the outcomes of two or more events happening in succession
Chapter 6
Statistical distributions
What is a random variable?
A variable whose value depends on the outcome of a random event
What is the term for the range of values that a random variable can take?
Sample space
What is meant by the term discrete?
The variable is discrete if it can only take certain numerical values
What is meant by the term random?
If the outcome is not known until the experiment is carried out
What does a probability distribution do?
It fully describes the probability of any outcome in the sample space
How are random variables represented?
By upper case letters like X or Y
How are particular values the random variable can take represented?
By lower case letters like x or y
What does P(X = x) mean?
The probability that the random variable takes a particular value x
What is the term used when all of the probabilities are the same?
Discrete uniform distribution
What is the formula for binomial distribution?
P(X = r) = (number of trials choose number of successes) × (probability of success) to the power of the number of successes × (probability of failure) to the power of the number of failures
What is the numerical formula for binomial distribution?
P(X = r) = nCr × p^r × (1 - p)^(n - r)
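The numerical formula can be evaluated with Python's `math.comb`, which computes nCr. This is a minimal sketch; the function name `binomial_pmf` and the example values n = 4, p = 0.5 are assumptions.

```python
from math import comb

def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p): nCr x p^r x (1 - p)^(n - r)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

# Probability of exactly 2 successes in 4 trials with p = 0.5
print(binomial_pmf(2, 4, 0.5))                           # 0.375
# The probabilities over all possible values of r sum to 1
print(sum(binomial_pmf(r, 4, 0.5) for r in range(5)))    # 1.0
```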
What does a cumulative probability function tell you?
P(X ≤ x): the sum of all the individual probabilities for values up to and including the given value x
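A cumulative binomial probability is just a running total of individual probabilities, as this sketch shows. The function name `binomial_cdf` and the example values are assumptions made for illustration.

```python
from math import comb

def binomial_cdf(x, n, p):
    """P(X <= x) for X ~ B(n, p): add up P(X = r) for r = 0, 1, ..., x."""
    return sum(comb(n, r) * p**r * (1 - p)**(n - r) for r in range(x + 1))

# P(X <= 1) for X ~ B(4, 0.5) is P(X = 0) + P(X = 1) = 1/16 + 4/16
print(binomial_cdf(1, 4, 0.5))        # 0.3125
# P(X >= 2) can then be found as 1 - P(X <= 1)
print(1 - binomial_cdf(1, 4, 0.5))    # 0.6875
```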
How can binomial distribution be written in a question?
X ~ B (n,p)
Chapter 7
Hypothesis Testing
What is a hypothesis?
A statement made about the value of a population parameter
What is a test statistic?
The result of the experiment or the statistic that is calculated from the sample
What is the null hypothesis?
The hypothesis that you assume to be correct
What is the alternative hypothesis?
It tells you about the parameter if your assumption is shown to be wrong
What is the difference between one-tailed and two-tailed tests?
In a one-tailed test the alternative hypothesis uses either < or > (not both), so there is one critical region. In a two-tailed test the alternative hypothesis uses ≠, so there are two critical regions, one at each end of the distribution
What is meant by the significance level?
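The given threshold probability you measure against: if the probability of a result at least as extreme as the test statistic, assuming the null hypothesis is true, is below it, you reject the null hypothesis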
What is meant by a critical region?
A region of the probability distribution which, if the test statistic falls within it, would cause you to reject the null hypothesis
What is meant by a critical value?
The first value to fall inside of the critical region
What is the acceptance region?
The region where we accept the null hypothesis
What is meant by the actual significance level?
The probability of incorrectly rejecting the null hypothesis
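The critical value and actual significance level for a one-tailed test can be found by accumulating binomial probabilities. The sketch below assumes a lower-tail test (alternative hypothesis of the form p < some value) on X ~ B(n, p). The function names and the example n = 20, p = 0.5 at the 5% level are illustrative assumptions.

```python
from math import comb

def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def lower_tail_critical_value(n, p, alpha):
    """Return (c, actual) where the critical region is X <= c.
    c is the largest value with P(X <= c) <= alpha under the null
    hypothesis, and actual = P(X <= c) is the actual significance level.
    Returns (None, 0.0) if even P(X = 0) exceeds alpha."""
    critical, actual = None, 0.0
    cumulative = 0.0
    for c in range(n + 1):
        cumulative += binomial_pmf(c, n, p)
        if cumulative > alpha:
            break
        critical, actual = c, cumulative
    return critical, actual

c, actual = lower_tail_critical_value(20, 0.5, 0.05)
print(c)                   # 5, so the critical region is X <= 5
print(round(actual, 4))    # 0.0207
```

The actual significance level (about 2.07%) is below the 5% target because X is discrete: including X = 6 would push the probability above 5%.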
When using the significance level for a two-tailed test what do you need to consider?
You need to halve the significance level, so that half of it is used at each end (tail) of the distribution
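Halving the significance level for a two-tailed test can be sketched as follows, again for a binomial model X ~ B(n, p). The function names and the example n = 20, p = 0.5 at the 5% level are assumptions made for illustration.

```python
from math import comb

def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def two_tailed_critical_region(n, p, alpha):
    """Return (lower, upper, actual): the critical region is
    X <= lower or X >= upper, with at most alpha / 2 in each tail.
    actual is the combined probability of both tails.
    A tail value is None if no value fits within alpha / 2."""
    half = alpha / 2

    lower, lower_prob, total = None, 0.0, 0.0
    for c in range(n + 1):              # build the lower tail upwards from 0
        total += binomial_pmf(c, n, p)
        if total > half:
            break
        lower, lower_prob = c, total

    upper, upper_prob, total = None, 0.0, 0.0
    for c in range(n, -1, -1):          # build the upper tail downwards from n
        total += binomial_pmf(c, n, p)
        if total > half:
            break
        upper, upper_prob = c, total

    return lower, upper, lower_prob + upper_prob

lower, upper, actual = two_tailed_critical_region(20, 0.5, 0.05)
print(lower, upper)        # 5 15
print(round(actual, 4))    # 0.0414
```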
