A Level Statistics: Unit 4 - Bivariate Data

  • Bivariate Data
    • PMCC
      • The product moment correlation coefficient (PMCC) measures the strength of the linear association between two variables and is denoted by r. If you draw a line of best fit through the data, r indicates how closely the data points cluster around that line.
      • Scale
        • The value of the PMCC is a score between -1.0 and +1.0
        • A negative value implies a negative correlation
        • A positive value implies a positive correlation
        • The closer the value is to zero, the weaker the correlation is
      • Product moment correlation should be used if:
        • the variables are either interval or ratio measurements (i.e. numerical values which are not already ranked)
        • each variable is approximately normally distributed
        • there is a linear relationship between the two variables, i.e. the points on a scatter diagram lie roughly along a straight line
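
As an illustration of the PMCC described above, here is a minimal sketch that computes r from the sums Sxy, Sxx and Syy. The function name `pmcc` and the data (hours revised against test score) are made up for the example; they are not part of the original notes.

```python
from math import sqrt

def pmcc(xs, ys):
    """Return the product moment correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

# Hypothetical data: hours revised and test score for five students
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]

r = pmcc(hours, score)
print(round(r, 3))  # a positive value below 1: a fairly strong positive correlation
```

A value of r near +1 or -1 would mean the points lie close to a straight line; a value near zero would mean little linear association.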
    • SRCC
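      • Ranking
        • Always rank from smallest to largest (unless otherwise stated)
        • If two or more values are equal they MUST share the mean of the ranks they would otherwise occupy (e.g. two values tied for 2nd and 3rd place each get rank 2.5)
        • The final rank should be equal to n (unless the largest values are tied)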
      • Spearman’s rank correlation coefficient (SRCC) is the PMCC calculated on the ranks of the two variables
      • Scale
        • The value of the SRCC is a score between -1.0 and +1.0
        • A negative value implies a negative correlation between the ranks
        • A positive value implies a positive correlation between the ranks
        • The closer the value is to zero, the weaker the correlation is
      • Spearman’s rank should be used if:
        • the variables recorded are already ranks/scores
        • there is a non-linear relationship between the variables (as seen from a scatter diagram)
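
The ranking rules and the "PMCC of the ranks" idea above can be sketched as follows. The helper names `average_ranks`, `pmcc` and `srcc` and the data are hypothetical, chosen only for this illustration. The example uses a curved but steadily increasing relationship (y = x cubed) to show a case where the ranks agree perfectly although the points are not on a straight line.

```python
from math import sqrt

def average_ranks(values):
    """Rank from smallest (1) to largest; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover every value tied with the one at position i
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = ((i + 1) + (j + 1)) / 2  # mean of the rank positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pmcc(xs, ys):
    """Product moment correlation coefficient for paired data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

def srcc(xs, ys):
    """Spearman's rank correlation: the PMCC of the two sets of ranks."""
    return pmcc(average_ranks(xs), average_ranks(ys))

print(average_ranks([10, 20, 20, 30]))  # [1.0, 2.5, 2.5, 4.0]

# Hypothetical non-linear data: y increases with x, but along a curve
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
print(srcc(x, y), pmcc(x, y))  # SRCC is 1; PMCC is a little below 1
```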
    • Least Squares Regression
      • The least squares regression line is the statistical name for the line of best fit drawn on a scatter graph. It is the straight line through the data for which the sum of the squares of the vertical distances between the points and the line is as small as possible.
      • y = a + bx
      • The gradient (b) shows us how much the variable on the y axis increases for every increase of one unit in the variable on the x axis
        • A positive gradient implies a positive correlation between the two variables
        • A negative gradient implies a negative correlation between the two variables
      • The y-intercept (a) tells us the value of the variable on the y axis when the variable on the x axis is zero.
      • Residuals
        • A residual is the difference between the observed y value of a data point plotted on a scatter graph and its estimated value on the least squares regression line (residual = observed value - estimated value).
        • A positive residual means the observed data point is above the estimate from the regression line.
        • A negative residual means the observed data point is below the estimate from the regression line.
        • The larger the size of a residual (positive or negative), the further the point is from the regression line and the less well the value ‘fits’ with the rest of the data. The more values there are with large residuals, the weaker the correlation will be.
        • Values with large residuals in comparison to the rest of the data points are seen as outliers.
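
To make the regression line y = a + bx and its residuals concrete, here is a minimal sketch using the standard results b = Sxy / Sxx and a = (mean of y) - b × (mean of x). The function names `least_squares` and `residuals` and the data are assumptions made for this example only.

```python
def least_squares(xs, ys):
    """Return (a, b) for the least squares regression line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    b = s_xy / s_xx            # gradient
    a = mean_y - b * mean_x    # y-intercept: the line passes through (mean_x, mean_y)
    return a, b

def residuals(xs, ys, a, b):
    """Residual = observed y minus the value estimated by the regression line."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]

# Hypothetical data: hours revised and test score for five students
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]

a, b = least_squares(hours, score)
res = residuals(hours, score, a, b)

print(a, b)  # roughly a = 2.2 and b = 0.6
for x, e in zip(hours, res):
    position = "above" if e > 0 else "below"
    print(x, round(e, 2), position)  # sign of the residual: above or below the line
```

For these made-up data the line works out to about y = 2.2 + 0.6x, and the residuals add up to zero (apart from rounding), which is a property of any least squares line fitted with an intercept.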
    • It is common for the SRCC to be stronger than the PMCC, since the ranks of the two variables can agree closely (or exactly) even when the actual values do not lie close to a straight line
