Non-parametric stats

Rank transformation

-          Generic way to deal with heavily skewed or otherwise difficult data

-          Assign 1 to lowest value, 2 to next lowest etc

-          Tied values each receive the average of the ranks they would otherwise occupy (e.g. two values tied for ranks 2 and 3 both get 2.5)
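The ranking rule above can be sketched in a few lines of plain Python. This is a minimal illustration, not how any particular stats package implements it; the function name `average_ranks` and the sample values are made up for the example.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

print(average_ranks([10, 20, 20, 50]))  # [1.0, 2.5, 2.5, 4.0]
```

The two values of 20 would occupy ranks 2 and 3, so each gets 2.5.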

Effects of rank transformation

-          Flattens any distribution into evenly spaced ranks and removes gaps between ranges of outcome values

-          Moves outliers closer to the centre of the data

-          Hence deals with the most serious issues in non-normal data: modality, skew, outliers

-          E.g. a log-normal variable and its log-transformed (normal) version get identical rank scores, because the log transformation preserves order
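A small sketch of these effects, using made-up right-skewed values and a simple helper `ranks` (a hypothetical name, valid only for data without ties). It shows that a strictly increasing transformation such as the log leaves the ranks unchanged, and that the outlier ends up just one rank away from its neighbour.

```python
import math

def ranks(values):
    """1-based ranks for data without ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for position, index in enumerate(order, start=1):
        out[index] = position
    return out

skewed = [1.2, 3.5, 2.0, 250.0, 8.1]       # made-up sample with one outlier
logged = [math.log(v) for v in skewed]      # log is strictly increasing

ranks_skewed = ranks(skewed)
ranks_logged = ranks(logged)
print(ranks_skewed)                  # [1, 3, 2, 5, 4]
print(ranks_skewed == ranks_logged)  # True
```

The outlier 250.0 is about 240 units from the next value on the raw scale but only one step away (rank 5 vs rank 4) after ranking.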

Non-parametric correlation methods

-          Spearman’s rank

o   For non-normal but continuous data (interval and ratio)

o   Can also be used for ordinal data with many levels.

o   Works on rank transformations

-          Kendall’s tau

o   Specifically designed for discrete data (i.e whole numbers only) and ordinal data

o   E.g. rating on Likert scales

Spearman's rank correlation

-          Step 1: rank transformation of the x and y variables

-          Step 2: Pearson’s r correlation on the transformed variables

-          Stats software will perform step 1 implicitly; you don’t need to do the rank transformation first.

-          Correlation coefficient denoted by the Greek letter ρ (rho) or rs

-          Rho has the same range (-1, 1) and interpretation as Pearson’s r

    rho = cov(rank(x), rank(y)) / (sd(rank(x)) * sd(rank(y)))

-          Where rank(x) and rank(y) denote the rank-transformed variables x and y, respectively.
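The two steps can be sketched as follows: rank both variables, then compute Pearson's r on the ranks. The helper names (`average_ranks`, `pearson_r`, `spearman_rho`) and the data are assumptions made for this illustration; in practice a stats package does this for you.

```python
import math

def average_ranks(values):
    """1-based ranks; ties get the mean of the ranks they occupy."""
    s = sorted(values)
    return [s.index(v) + 1 + (s.count(v) - 1) / 2 for v in values]

def pearson_r(x, y):
    """Pearson's product-moment correlation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(x, y):
    # Step 1: rank transformation; step 2: Pearson's r on the ranks.
    return pearson_r(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]   # made-up data
y = [2, 1, 4, 3, 5]
rho = spearman_rho(x, y)
print(round(rho, 3))  # 0.8
```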

Spearman's rho vs Pearson's r

-          Pearson’s r assesses the linear relationship

-          Spearman’s rho assesses any monotonic relationship, whether linear or not

-          Similar results if x and y are near-normal – use Pearson

-          Results differ if the data are not normal (e.g. outliers) – use Spearman
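To illustrate the difference, the sketch below compares both coefficients on made-up data where y increases with x but along a strongly curved (exponential) line. The helpers `ranks` and `pearson_r` are hypothetical names written for this example.

```python
import math

def ranks(values):
    """1-based ranks for data without ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for position, index in enumerate(order, start=1):
        out[index] = position
    return out

def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5, 6]
y = [math.exp(v) for v in x]          # monotonic but far from linear

r = pearson_r(x, y)                   # linear association on the raw values
rho = pearson_r(ranks(x), ranks(y))   # Spearman: Pearson's r on the ranks
print(round(rho, 3))  # 1.0
print(r < rho)        # True
```

Because the relationship is perfectly monotonic, rho is 1, while Pearson's r falls below 1 because the points do not lie on a straight line.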

Kendall's tau rank correlation

-          Step 1: rank transformation of the x and y variables

-          Step 2: count concordant and discordant pairs

-          Concordant pair: ranks for both observations (x-y pairs i and j) agree (xi > xj and yi > yj or xi < xj and yi < yj)

-          Discordant pair: ranks for both observations (x-y pairs i and j) disagree (xi > xj and yi < yj, or xi < xj and yi > yj)

-          Correlation coefficient denoted by the Greek letter τ (tau)

-          Total number of pairs is n(n-1)/2, used as scaling factor

-          Tau has range (-1, 1), same interpretation as r and rho

    Tau = (number of concordant pairs - number of discordant pairs) / (n(n-1)/2)
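The pair counting can be sketched directly. This is a minimal version with no adjustment for ties; the function name `kendall_tau_a` and the data are made up. Comparing raw values gives the same ordering as comparing ranks, so the explicit ranking step is skipped here.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, no tie adjustment."""
    concordant = discordant = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        direction = (xi - xj) * (yi - yj)
        if direction > 0:      # x and y order the pair the same way
            concordant += 1
        elif direction < 0:    # x and y order the pair in opposite ways
            discordant += 1
    n = len(x)
    return (concordant - discordant) / (n * (n - 1) / 2)

x = [1, 2, 3, 4, 5]   # made-up data
y = [2, 1, 4, 3, 5]
tau = kendall_tau_a(x, y)
print(tau)  # 0.6
```

Here 8 of the 10 pairs are concordant and 2 are discordant, so tau = (8 - 2) / 10.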

Kendall's tau rank correlation

- Concordant – the observation with the larger x value also has the larger y value

- Discordant – the observation with the larger x value has the smaller y value

Accounting for ties

-          Ties: Pairs which are neither concordant nor discordant, i.e. where the ranks for x or for y within a pair are equal.

-          Tau-A: No adjustment for ties. With ties present, the attainable range of tau is narrower than (-1, 1)

-          Tau-B:  Adjustment for ties to keep range of tau in (-1, 1).  Easier to interpret. Standard method used in most stats packages

-          Tau-C:  Another way of adjusting for ties. Not frequently used.
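A sketch of the difference between tau-a and tau-b, assuming the usual tau-b definition in which the scaling factor is reduced by the number of pairs tied on x and on y. The function name and the Likert-style ratings are made up for the illustration.

```python
import math
from itertools import combinations

def kendall_tau_a_and_b(x, y):
    """Return (tau_a, tau_b); tau_b rescales for pairs tied on x or on y."""
    concordant = discordant = tied_x = tied_y = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        if xi == xj:
            tied_x += 1
        if yi == yj:
            tied_y += 1
        direction = (xi - xj) * (yi - yj)
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) / 2
    tau_a = (concordant - discordant) / n_pairs
    tau_b = (concordant - discordant) / math.sqrt((n_pairs - tied_x) * (n_pairs - tied_y))
    return tau_a, tau_b

# Two raters who agree perfectly, but with tied ratings (made-up data).
rater_1 = [1, 1, 2, 2, 3, 3]
rater_2 = [1, 1, 2, 2, 3, 3]
tau_a, tau_b = kendall_tau_a_and_b(rater_1, rater_2)
print(tau_a)  # 0.8
print(tau_b)  # 1.0
```

Even with perfect agreement, tau-a stops at 0.8 because 3 of the 15 pairs are tied; tau-b reaches 1.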

Kendall's tau vs Spearman's rho

-      Both typically yield similar results

-          Tau is more robust for significance testing in smaller samples

-          Tau is particularly suitable for discrete and ordinal data

Non-parametric tests for group differences

-          Wilcoxon signed-rank test

o   For paired samples

o   Alternative to paired t test

-          Mann-Whitney U

o   For independent samples

o   Alternative to independent samples t test

-          Binomial test

o   For binomial data

Wilcoxon signed-rank test

-          Frank Wilcoxon (1945): signed-rank method

o   You have n pairs of observations (x1i, x2i)

o   For each pair, calculate:

§         absolute value of the difference (never negative): |x1i - x2i| or abs(x1i - x2i)

§      sign of the difference (+ or -): sgn(x1i - x2i)

o     exclude pairs with x1i - x2i = 0

o        reduce the sample size by the number of excluded pairs to get nr

o   rank the remaining pairs by absolute difference and multiply each rank by its sign

o   Sum of signed ranks gives W (Wilcoxon test statistic)
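The procedure above can be sketched as follows. The function names and the before/after scores are made up for the illustration, and ties in the absolute differences are given average ranks.

```python
def average_ranks(values):
    """1-based ranks; ties get the mean of the ranks they occupy."""
    s = sorted(values)
    return [s.index(v) + 1 + (s.count(v) - 1) / 2 for v in values]

def wilcoxon_signed_rank(x1, x2):
    """Return (W, n_r): the sum of signed ranks and the reduced sample size."""
    diffs = [a - b for a, b in zip(x1, x2) if a != b]   # exclude zero differences
    n_r = len(diffs)
    ranks = average_ranks([abs(d) for d in diffs])       # rank by absolute difference
    w = sum(r if d > 0 else -r for r, d in zip(ranks, diffs))
    return w, n_r

before = [10, 12, 9, 15, 11, 14]   # made-up paired scores
after = [12, 11, 9, 19, 14, 19]
w, n_r = wilcoxon_signed_rank(before, after)
print(w, n_r)  # -13.0 5
```

The differences are -2, 1, 0, -4, -3, -5; the zero is dropped (nr = 5) and the signed ranks -2, +1, -4, -3, -5 sum to W = -13.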

Wilcoxon test statistic

-          Similar to the t statistic, W follows a known distribution under the null hypothesis, which approaches a normal distribution as n increases

-          W and the reduced sample size nr can then be used to derive the Z statistic and significance (p)

-          Stats program will do all this for you and give you W, Z and p
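For illustration, one common large-sample approximation treats the sum of signed ranks W as normal with mean 0 and variance nr(nr+1)(2nr+1)/6 under the null hypothesis. The sketch below assumes that form, ignores tie and continuity corrections, and uses hypothetical values for W and nr; software output may differ slightly.

```python
import math

def wilcoxon_z(w, n_r):
    """Normal approximation for the sum of signed ranks W (no tie correction)."""
    sd_w = math.sqrt(n_r * (n_r + 1) * (2 * n_r + 1) / 6)
    z = w / sd_w                                       # mean of W under H0 is 0
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))     # two-tailed normal p value
    return z, p_two_sided

z, p = wilcoxon_z(w=-100, n_r=20)   # hypothetical W and n_r
print(round(z, 2))  # -1.87
```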

The Mann-Whitney U test

-          Developed by Wilcoxon (1945) and by Mann and Whitney (1947); also known as the Wilcoxon rank-sum test

-          For independent samples

-          Rank-sum method:

o        you have two samples with sample sizes n1, n2

o   Combine both samples and rank all values

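o        Add up the ranks which came from each sample to find the sums of ranks r1, r2

o   Convert each rank sum to a U value: U1 = r1 - n1(n1+1)/2 and U2 = r2 - n2(n2+1)/2

o   Use the smaller of U1 and U2 as the U test statistic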
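A minimal sketch of the rank-sum method, assuming the usual conversion from a rank sum to U (U = r - n(n+1)/2 for each sample). The function names and the two samples are made up for the illustration.

```python
def average_ranks(values):
    """1-based ranks; ties get the mean of the ranks they occupy."""
    s = sorted(values)
    return [s.index(v) + 1 + (s.count(v) - 1) / 2 for v in values]

def mann_whitney_u(sample_1, sample_2):
    """Return (U, U1, U2), where U is the smaller of U1 and U2."""
    n1, n2 = len(sample_1), len(sample_2)
    ranks = average_ranks(list(sample_1) + list(sample_2))  # rank the combined data
    r1 = sum(ranks[:n1])            # rank sum for sample 1
    r2 = sum(ranks[n1:])            # rank sum for sample 2
    u1 = r1 - n1 * (n1 + 1) / 2
    u2 = r2 - n2 * (n2 + 1) / 2
    return min(u1, u2), u1, u2

group_a = [3, 5, 8, 9]        # made-up independent samples
group_b = [1, 2, 4, 6, 7]
u, u1, u2 = mann_whitney_u(group_a, group_b)
print(u, u1, u2)  # 5.0 15.0 5.0
```

The rank sums are r1 = 25 and r2 = 20, and U1 + U2 = 20 = n1 * n2, which is a useful check on the arithmetic.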

Mann-Whitney U test interpretation

-          The smaller U can range from 0 (complete separation between the groups, most likely to reject the null hypothesis) to n1 * n2 / 2 (no separation between the groups, fail to reject H0)

-          U, n1 and n2 are used to derive Z

-          Z is normally distributed for larger samples

-          Z used to derive significance (p), all usually included in stats program output.

-          Having numerous tied ranks leads to problems

o   Complex correction to standard deviation of U in stats software

o   Not an issue if only few ties are present
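As a rough illustration of how Z follows from U, the sketch below uses the standard normal approximation without the tie correction mentioned above: under the null hypothesis U has mean n1*n2/2 and variance n1*n2*(n1+n2+1)/12. The input values are hypothetical and far too small for the approximation to be trusted; they only show the arithmetic.

```python
import math

def mann_whitney_z(u, n1, n2):
    """Normal approximation for U, without correction for tied ranks."""
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))   # two-tailed normal p value
    return z, p_two_sided

z, p = mann_whitney_z(u=5, n1=4, n2=5)   # hypothetical U and sample sizes
print(round(z, 2))  # -1.22
```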

Advantages of ranked group difference tests

-          They do not require normally distributed data

-          Ranking method provides robustness against outliers

-          Also applicable to ordinal data, unlike t tests

-          But if you have normally distributed data, use a t test: it has more power

-          Non-parametric tests that do not use rankings:

o   For 2 sets of binomial data use binomial test

o   Chi square or McNemar for nominal data
