PSYC214 - Weeks 11-17 - Correlations and Regressions

  • Correlation and partial correlation
  • Linear regression
  • Multiple regression
  • Hierarchical and Stepwise regression
  • Interactions and Polynomial regression analyses
  • Logistic Regression
  • Rank Data

Correlation and Partial Correlation

Correlations: how two variables are related.
Scattergrams plot two values for each case in the analysis as a single point located from the two axes.

Z scores: standardised scores that make it easier to compare the values of one variable directly with those of another
Mean = 0, standard deviation = 1.
(http://www.uth.tmc.edu/uth_orgs/educ_dev/oser/ZSCORE.GIF)

Pearson Product Moment Correlation: r_xy = Σ(z_x × z_y) / (n - 1)
r² = proportion of variance in one variable that can be predicted if the other is known
df = n - 2 (n = number of pairs of data)
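
A minimal sketch of the z-score formula above, assuming the sample standard deviation (n - 1 in the denominator) is used for the z scores. The function names and the data are made up for illustration and are not from the course.

```python
import math

def z_scores(values):
    """Convert raw scores to z scores (mean 0, sample SD 1)."""
    n = len(values)
    mean = sum(values) / n
    # Sample standard deviation: divide the sum of squares by n - 1.
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def pearson_r(xs, ys):
    """Pearson r: sum of the z-score products divided by n - 1."""
    zx, zy = z_scores(xs), z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / (len(xs) - 1)

# Hypothetical data: hours revised and marks for five students.
hours = [1, 2, 3, 4, 5]
marks = [2, 4, 5, 4, 5]

r = pearson_r(hours, marks)
df = len(hours) - 2          # n - 2, where n is the number of pairs
print(round(r, 3), round(r ** 2, 3), df)   # r is about 0.775, r² about 0.6
```

Squaring r gives the proportion of variance in one variable predicted by the other.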


Correlation and Partial Correlation: Inflation and Suppression

Inflation: a correlation is inflated when two variables covary and both contribute to the dependent variable in the same direction

  • inflated by outliers with very high scores on both variables
  • inflated if only the high and low scores are analysed

Suppression: two variables covary and both contribute to the dependent variable, but they do so in opposite directions

Inflation and suppression are features of how variables interact in the world


Partial Correlation

Used when experimental control is not possible. Statistical techniques remove the effect of one or more variables from the correlation, clarifying the underlying relationships.

df = n - 2 ( - 1 for every variable removed)
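
A hedged sketch of removing one variable (z) from the correlation between x and y, using the standard first-order partial correlation formula. The formula is not stated in these notes, so treat it as an assumption about what the statistics package does; the correlation values and sample size are made up.

```python
import math

def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y with the effect of z removed."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator

# Hypothetical correlations between three variables.
r_partial = partial_r(r_xy=0.5, r_xz=0.6, r_yz=0.7)

n = 40                     # hypothetical number of cases
df = n - 2 - 1             # n - 2, minus 1 for the one variable removed
print(round(r_partial, 2), df)   # about 0.14 with df = 37
```

Here most of the original 0.5 correlation was carried by z, so the partial correlation is much smaller.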


Linear Regression

Predicts scores on one variable from scores on the other.
It fits a line to the data points, for predicting y from x.

Calculates the relationship between the y and the x values.

Formula for line of best fit: y = mx + c
m is the slope: m = change in y / change in x; c is the constant (the intercept, the value of y when x = 0)

Linear regression formula: y' = mx + c
y' = the predicted value of y

Method of least squares: fits the best straight line to the data by finding the line that gives the lowest sum of the squared deviations of the y scores from the line.
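
A minimal sketch of the least squares calculation for y' = mx + c, using the usual closed-form solution (slope = sum of cross-products / sum of squares of x). The function name and data are illustrative only.

```python
def least_squares(xs, ys):
    """Return slope m and constant c of the least-squares line y' = mx + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations, and sum of squared x deviations.
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    m = sp / ss_x
    c = mean_y - m * mean_x   # the line passes through (mean_x, mean_y)
    return m, c

# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

m, c = least_squares(xs, ys)
predicted = m * 6 + c        # y' for a new case with x = 6
print(round(m, 2), round(c, 2), round(predicted, 2))   # 0.6 2.2 5.8
```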

Straight line: linear regression
Curved line: polynomial regression


Multiple Regression

Best prediction of a dependent variable from a set of independent variables.

  • estimate the relative importance of your variables
  • control for some variables
  • predict scores combining variables

Predicts the dependent variable y, using more than one independent variable

Multiple R: correlation between the values predicted by the multiple regression equation and the observed values of the dependent variable
For the best estimate use as few variables as possible & as many cases as possible.

R²: the proportion of variance in the dependent variable predicted by the equation

df = k for the regression and n - k - 1 for the residual, n = number of cases, k = number of independent variables

Adjusted R²: estimate of the R² if the same regression formula was used for a new set of participants, lower than R²
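
A short sketch of one widely used adjustment, R²adj = 1 - (1 - R²)(n - 1)/(n - k - 1). This formula is an assumption (the notes do not say which adjustment the course uses), and the numbers are made up.

```python
def adjusted_r_squared(r2, n, k):
    """Shrink R² to allow for the number of predictors (k) and cases (n)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Hypothetical result: R² = 0.50 from 3 predictors.
few_cases = adjusted_r_squared(0.50, n=20, k=3)
many_cases = adjusted_r_squared(0.50, n=200, k=3)
print(round(few_cases, 3), round(many_cases, 3))
```

The shrinkage is larger with few cases and many variables, which is why the notes recommend as few variables and as many cases as possible.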


Hierarchical Regression

  • can control for variables distorting the results
  • one or more variables added to the regression at a time
    once these are added and controlled for, the effect of the remaining variables can be evaluated

Stage 1: select the dependent variable, then the independent variable(s) to control
Stage 2: bring in the variables you are studying

R² change: significant if the additional variables significantly improve the prediction
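
A hedged sketch of how the R² change between Stage 1 and Stage 2 is usually tested, with the standard F-change formula. The formula is not given in these notes, and the R² values and sample size are invented for illustration.

```python
def f_change(r2_stage1, r2_stage2, n, k_stage2, k_added):
    """F ratio for the improvement in R² when k_added variables are entered.

    n        = number of cases
    k_stage2 = number of independent variables in the Stage 2 model
    k_added  = how many of those were added at Stage 2
    """
    gain_per_variable = (r2_stage2 - r2_stage1) / k_added
    error_per_df = (1 - r2_stage2) / (n - k_stage2 - 1)
    return gain_per_variable / error_per_df

# Hypothetical study: a control variable gives R² = 0.20,
# adding the variable of interest raises it to 0.30 (n = 103).
f = f_change(0.20, 0.30, n=103, k_stage2=2, k_added=1)
df = (1, 103 - 2 - 1)
print(round(f, 2), df)   # compare F against tables with these df
```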


Stepwise Regression

Allows you to identify the minimum set of independent variables that together significantly predict the dependent variable.

  • variables are entered into the regression one by one
    program selects ordering of the variables

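1st variable: highest correlation with the dependent variable
Once selected, its effect is semi-partialled out, and the next variable entered is the one with the highest semi-partial correlation with the dependent variable.

If variables are not significantly contributing to the regression their effects are removed.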

  • -ve, gives a minimal set of predictors, but underlying relationships may be more complicated
  • -ve, do not assume causal relationships, you can get chance effects

Interactions

Interactions: the effect of one variable may be different at different levels of the other
Significant: if the R² change is significant when the product of the two variables is entered into the regression

Standard Error of the Mean (SEM): standard deviation of the distribution of sample means drawn from the population you sampled from

  • indicates the accuracy of your findings
  • low SEMs are desirable
    • use z scores to lower standard errors, which are otherwise usually high
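
A minimal sketch of estimating the SEM from a single sample, assuming the usual estimate SEM = SD / √n with the sample standard deviation. The scores are made up.

```python
import math

def standard_error_of_mean(values):
    """Estimate the SEM as the sample SD divided by the square root of n."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return sd / math.sqrt(n)

# Hypothetical scores from eight participants.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
sem = standard_error_of_mean(scores)
print(round(sem, 3))   # about 0.756

# Repeating the same scores four times gives a sample four times as large
# with a similar spread, so the SEM comes out at roughly half.
sem_larger_sample = standard_error_of_mean(scores * 4)
```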

SE(R²): standard deviation of the distribution of R²
SE(B): standard deviation of the distribution of B


Polynomial Regression Analyses

Find the best fit to curved data as well as straight line (linear) fits.

Quadratic equation: y = m1x² + m2x + c
Cubic equation: y = m1x³ + m2x² + m3x + c
(each term has its own coefficient)

For data first try to fit a linear equation
If this is not significant, a quadratic equation may be a better fit, and so on.
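
A rough sketch of trying a linear fit first and then a quadratic fit, comparing R². It solves the least-squares normal equations with plain Gaussian elimination, which is an illustrative choice rather than what a statistics package does internally. All names and data are made up, and a real analysis would test the R² change for significance.

```python
def solve(a, b):
    """Solve the linear system a·x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                      # back substitution
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x

def fit_polynomial(xs, ys, degree):
    """Least-squares coefficients, ordered from the constant up to x**degree."""
    size = degree + 1
    a = [[sum(x ** (i + j) for x in xs) for j in range(size)] for i in range(size)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(size)]
    return solve(a, b)

def r_squared(xs, ys, coeffs):
    """Proportion of variance in y predicted by the fitted equation."""
    mean_y = sum(ys) / len(ys)
    predicted = [sum(c * x ** p for p, c in enumerate(coeffs)) for x in xs]
    ss_residual = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_residual / ss_total

# Hypothetical U-shaped data (exactly y = x²).
xs = [-2, -1, 0, 1, 2]
ys = [4, 1, 0, 1, 4]

linear = fit_polynomial(xs, ys, 1)
quadratic = fit_polynomial(xs, ys, 2)
r2_linear = r_squared(xs, ys, linear)
r2_quadratic = r_squared(xs, ys, quadratic)
print(round(r2_linear, 3), round(r2_quadratic, 3))
```

With these data the straight line predicts nothing, while the quadratic fits perfectly.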


Logistic Regression

Combines categorical data analysis with an ANOVA-style additive model. To calculate expected frequencies in categorical data multiplication is used, whereas addition is used in ANOVA. Converting to log scores turns the multiplication into addition, which allows the additive process.

  • can be used to predict dependent variables with 3+ categories
  • used for modelling binary outcome measures
  • regressed onto explanatory measures: continuous and categorical

Proportions and Odds

Proportion: divide each score by the column total
Odds of passing in one category: divide the pass proportion by the fail proportion
Odds ratio (a measure of effect size): odds of passing in group 1 / odds of passing in group 2
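
A small worked sketch of the three definitions above, using made-up pass/fail counts for two groups. The function name and the counts are illustrative only.

```python
def odds_of_passing(passes, fails):
    """Odds = pass proportion divided by fail proportion within one group."""
    total = passes + fails          # the column total for this group
    pass_proportion = passes / total
    fail_proportion = fails / total
    return pass_proportion / fail_proportion

# Hypothetical counts: group 1 has 30 passes and 10 fails,
# group 2 has 20 passes and 20 fails.
odds_group1 = odds_of_passing(30, 10)    # 0.75 / 0.25
odds_group2 = odds_of_passing(20, 20)    # 0.50 / 0.50
odds_ratio = odds_group1 / odds_group2

print(odds_group1, odds_group2, odds_ratio)   # 3.0 1.0 3.0
```

An odds ratio of 3 means the odds of passing are three times higher in group 1 than in group 2; an odds ratio of 1 would mean no difference.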


Logistic Regression

Problems with the linear regression of a binary variable:

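  • ends of the regression line extend to minus and plus infinity, so predicted values fall outside the 0 to 1 range
  • breaches homoscedasticity assumptions

Logit model:

  • converts the bounded outcome (0 to 1) to log odds, which are unbounded, so a linear model can be fitted
  • smooths out asymmetry
  • forces a straight line into a function that is curvilinear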
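
A minimal sketch of the logit transformation described above: the logit is the log of the odds, and its inverse turns any value on the straight-line (log odds) scale back into a probability between 0 and 1. The slope and constant used here are made up.

```python
import math

def logit(p):
    """Log odds of a probability p (0 < p < 1); the result is unbounded."""
    return math.log(p / (1 - p))

def inverse_logit(log_odds):
    """Convert log odds back to a probability between 0 and 1."""
    return 1 / (1 + math.exp(-log_odds))

# A hypothetical straight line on the log odds scale: log odds = 0.8x - 2.
xs = [-10, -5, 0, 5, 10]
probabilities = [inverse_logit(0.8 * x - 2) for x in xs]

# However extreme x gets, the predicted probability stays inside 0 to 1,
# tracing an S-shaped (curvilinear) function rather than a straight line.
print([round(p, 3) for p in probabilities])
```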

Statistics Using Ranks

Scores are put in rank order, from lowest to highest, so cases can be compared by rank rather than by raw score.
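
A short sketch of converting raw scores to ranks, assuming the common convention that tied scores share the average of the ranks they occupy. That convention is not stated in these notes; the function name and data are made up.

```python
def average_ranks(values):
    """Rank scores from lowest (rank 1) to highest; ties get the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Find the end of the run of tied scores starting at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1   # mean of the 1-based positions i+1 to j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

# Hypothetical scores: the two 20s tie for ranks 3 and 4, so each gets 3.5.
print(average_ranks([10, 20, 20, 5]))   # [2.0, 3.5, 3.5, 1.0]
```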

Why rank data?

  • Data are ordinal
  • Cannot assume normal distribution

Types of rank order tests:

  • 1 sample
    • between: Kolmogorov-Smirnov
    • within: Wilcoxon signed ranks
  • 2 sample
    • between: Mann-Whitney, Wilcoxon rank sum
    • within: Wilcoxon signed ranks
  • K sample
    • between: Kruskal-Wallis
    • within: Friedman's ANOVA
