# Non-parametric stats

- Created by: Sam_dearnx
- Created on: 24-01-17 21:11

## Rank transformation

- Generic way to deal with heavily skewed or otherwise difficult data

- Assign rank 1 to the lowest value, 2 to the next lowest, and so on

- Tied values each get the average of the ranks they would otherwise occupy
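The ranking rule above, including average ranks for ties, can be sketched in plain Python. The function name `average_ranks` and the example values are made up for illustration.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


scores = [3.1, 0.2, 3.1, 250.0, 1.7]   # hypothetical data with one tie and one outlier
print(average_ranks(scores))           # [3.5, 1.0, 3.5, 5.0, 2.0]
```

The two tied values would have taken ranks 3 and 4, so both receive 3.5. The outlier 250.0 simply becomes rank 5.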

## Effects of rank transformation

- Will flatten any distribution and remove gaps between outcome value ranges.

- Moves outliers closer to the centre of the data

- Hence deals with the most serious issues in non-normal data: modality, skew, outliers

- E.g. log-normal data and the same data after a log transformation (which makes it normal) have identical rank scores, because ranks are unchanged by any monotonic transformation
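A small sketch of these effects, using made-up right-skewed values: the ranks are the same before and after a log transformation, and the extreme value ends up only one rank step from its neighbour. The helper `ranks` assumes there are no ties.

```python
import math


def ranks(values):
    # Simple ranking for data without ties: 1 = smallest value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


skewed = [1.2, 850.0, 3.4, 0.7, 42.0, 9.9]   # hypothetical right-skewed data
logged = [math.log(v) for v in skewed]        # monotonic transformation

print(ranks(skewed))   # [2, 6, 3, 1, 5, 4]
print(ranks(logged))   # [2, 6, 3, 1, 5, 4]
```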

## Non-parametric correlation methods

- Spearman’s rank

o For non-normal but continuous data (interval and ratio)

o Can also be used for ordinal data with many levels.

o Works on rank transformations

- Kendall’s tau

o Specifically designed for discrete data (i.e. whole numbers only) and ordinal data

o E.g. rating on Likert scales

## Spearman's rank correlation

- Step 1: rank transformation of the x and y variables

- Step 2: Pearson’s r correlation on the transformed variables

- Stats software performs step 1 implicitly, so you don’t need to do the rank transformation yourself first.

- Correlation coefficient denoted by the Greek letter ρ (rho) or r_{s}

- Rho has the same range (-1, 1) and interpretation as Pearson’s r

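- Formula: r_{s} = cov(rank(x), rank(y)) / (sd(rank(x)) × sd(rank(y))), i.e. Pearson’s r applied to the ranks, where rank(x) and rank(y) denote the rank-transformed variables x and y, respectively.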
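The two steps can be sketched in plain Python as a rank transformation followed by Pearson's r. The function names and data are illustrative, and the ranking helper is a compact quadratic-time version meant only for small examples.

```python
import math


def average_ranks(values):
    # Ties share the mean of the positions they occupy in the sorted data.
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) - 1) / 2 + 1 for v in values]


def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    # Step 1: rank both variables. Step 2: Pearson's r on the ranks.
    return pearson_r(average_ranks(x), average_ranks(y))


x = [10, 20, 30, 40, 50]
y = [2.5, 1.0, 8.0, 4.0, 100.0]   # hypothetical data with one extreme value
rho = spearman_rho(x, y)
print(round(rho, 3))               # 0.8
```

The extreme value 100.0 contributes only as rank 5, so it has no more influence than any other largest value would.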

## Spearman's rho vs Pearson's r

- Pearson’s r assesses the linear relationship

- Spearman’s rho assesses any monotonic relationship, whether linear or not

- Similar results if x and y are near-normal – use Pearson

- Different if not normal (outliers) – use Spearman
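To illustrate the difference, here is a sketch with an invented relationship that is strictly increasing but strongly curved (y = exp(x)). Pearson's r falls clearly below 1 because the relationship is not linear, while the rank-based coefficient is 1 because the ordering of x and y is identical. The helper `simple_ranks` assumes there are no ties.

```python
import math


def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def simple_ranks(values):
    # Ranks for data without ties: 1 = smallest value.
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


x = list(range(1, 9))
y = [math.exp(v) for v in x]          # monotonic but far from linear

r = pearson_r(x, y)                                   # linear association
rho = pearson_r(simple_ranks(x), simple_ranks(y))     # monotonic association
print(round(r, 2), round(rho, 2))
```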

## Kendall's tau rank correlation

- Step 1: rank transformation of the x and y variables

- Step 2: count concordant and discordant pairs

- Concordant pair: ranks for both observations (x-y pairs i and j) agree (x_{i} > x_{j} and y_{i} > y_{j} or x_{i} < x_{j} and y_{i} < y_{j})

- Discordant pair: ranks for both observations (x-y pairs i and j) differ (x_{i} > x_{j} and y_{i} < y_{j} or x_{i} < x_{j} and y_{i} > y_{j})

- Correlation coefficient denoted by the Greek letter τ (tau)

- Total number of pairs is n(n-1)/2, used as scaling factor

- Tau has range (-1,1), same interpretation as r and rho

Tau = (number of concordant pairs - number of discordant pairs) / (n(n-1)/2)
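A minimal sketch of the pair counting, with no adjustment for ties. The function name and data are illustrative. Comparing the raw values gives the same signs as comparing their ranks, so the explicit rank transformation is skipped here.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    concordant = discordant = 0
    # Visit every pair of observations (i, j) exactly once.
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        product = (xi - xj) * (yi - yj)
        if product > 0:
            concordant += 1      # x and y order the pair the same way
        elif product < 0:
            discordant += 1      # x and y order the pair in opposite ways
    n = len(x)
    return (concordant - discordant) / (n * (n - 1) / 2)


x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]              # hypothetical data: two swapped neighbours
tau = kendall_tau_a(x, y)
print(tau)                       # 0.6  (8 concordant, 2 discordant, 10 pairs)
```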

## Accounting for ties

- Ties: pairs which are neither concordant nor discordant, i.e. where the ranks for x or y in a pair do not differ.

- Tau-A: No adjustment for ties. Tied pairs still count in the denominator n(n-1)/2, so with ties tau cannot reach -1 or 1 and its achievable range is narrower than (-1, 1)

- Tau-B: Adjustment for ties to keep range of tau in (-1, 1). Easier to interpret. Standard method used in most stats packages

- Tau-C: Another way of adjusting for ties. Not frequently used.
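The sketch below contrasts Tau-A with the usual Tau-B formula, (C - D) / sqrt((C + D + Tx)(C + D + Ty)), where C and D are the concordant and discordant counts, Tx counts pairs tied only on x, and Ty counts pairs tied only on y. The function name and the Likert-style data are invented for illustration.

```python
import math
from itertools import combinations


def kendall_tau_a_and_b(x, y):
    concordant = discordant = ties_x_only = ties_y_only = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        dx, dy = xi - xj, yi - yj
        if dx == 0 and dy == 0:
            continue                     # tied on both variables: counted nowhere
        elif dx == 0:
            ties_x_only += 1
        elif dy == 0:
            ties_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    n = len(x)
    tau_a = (concordant - discordant) / (n * (n - 1) / 2)
    # Note: the denominator is zero if x or y is constant; not handled in this sketch.
    denominator = math.sqrt((concordant + discordant + ties_x_only)
                            * (concordant + discordant + ties_y_only))
    tau_b = (concordant - discordant) / denominator
    return tau_a, tau_b


x = [1, 2, 2, 3, 3, 4]                   # hypothetical ratings with ties
y = [1, 1, 2, 2, 3, 3]
tau_a, tau_b = kendall_tau_a_and_b(x, y)
print(round(tau_a, 3), round(tau_b, 3))  # 0.667 0.801
```

There are no discordant pairs in this example, yet Tau-A stays well below 1 because the five tied pairs remain in its denominator. Tau-B is closer to 1.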

## Kendall's tau vs Spearman's rho

- Both typically yield similar results

- Tau is more robust for significance testing in smaller samples

- Tau is particularly suitable for discrete and ordinal data

## Non-parametric tests for group differences

- Wilcoxon signed-rank test

o For paired samples

o Alternative to paired t test

- Mann-Whitney U

o For independent samples

o Alternative to independent samples t test

- Binomial test

o For binomial data

## Wilcoxon signed-rank test

- Frank Wilcoxon (1945): signed-rank method

o You have n paired observations (x_{1i}, x_{2i})

o For each pair, calculate:

§ absolute difference (always positive): |x_{1i} - x_{2i}|, i.e. abs(x_{1i} - x_{2i})

§ sign of difference (+ or -): sgn(x_{1i} - x_{2i})

o Exclude pairs with x_{1i} - x_{2i} = 0

o Reduce the sample size by the number of excluded pairs to get n_{r}

o rank pairs by absolute difference and multiply rank with sign

o Sum of signed ranks gives W (Wilcoxon test statistic)
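The steps above can be sketched as follows, taking W to be the sum of signed ranks as in these notes. The function name and the before/after measurements are hypothetical. Some software reports a related statistic instead, such as the sum of the positive ranks only.

```python
def signed_rank_w(x1, x2):
    # Differences per pair, dropping pairs with no difference.
    diffs = [a - b for a, b in zip(x1, x2) if a != b]
    n_r = len(diffs)                              # reduced sample size

    abs_sorted = sorted(abs(d) for d in diffs)

    def average_rank(value):
        # Ties share the mean of the positions they occupy.
        return abs_sorted.index(value) + (abs_sorted.count(value) - 1) / 2 + 1

    # Rank by absolute difference, then attach the sign of the difference.
    w = sum((1 if d > 0 else -1) * average_rank(abs(d)) for d in diffs)
    return w, n_r


before = [125, 130, 118, 140, 135, 128, 122, 131]   # hypothetical paired data
after = [118, 132, 110, 129, 135, 120, 121, 125]
w, n_r = signed_rank_w(before, after)
print(w, n_r)                                        # 24.0 7
```

One pair has a zero difference and is excluded, so n_r is 7. The two differences of 8 are tied and share rank 5.5.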

## Wilcoxon test statistic

- Similar to t test, W follows a known distribution that approximates normal as n increases

- W and the reduced sample size n_{r} can then be used to derive a Z statistic and significance (p)

- Stats program will do all this for you and give you W, Z and p
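As a rough sketch of what the software does for larger samples: with W taken as the sum of signed ranks, W has mean 0 and variance n_r(n_r+1)(2n_r+1)/6 under the null hypothesis, which gives a Z statistic and a two-sided p value. This version assumes no tie or continuity correction, and the input values are hypothetical. Real packages often use exact tables for small samples.

```python
import math


def wilcoxon_z(w, n_r):
    # Standard deviation of the signed-rank sum under the null hypothesis.
    sigma = math.sqrt(n_r * (n_r + 1) * (2 * n_r + 1) / 6)
    z = w / sigma
    # Two-sided p value from the standard normal distribution.
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided


z, p = wilcoxon_z(24, 7)     # hypothetical W and reduced sample size
print(round(z, 2), round(p, 3))
```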

## The Mann-Whitney U test

- Developed by Wilcoxon (1945) and by Mann and Whitney (1947); also known as the Wilcoxon rank-sum test

- For independent samples

- Rank-sum method:

o you have two samples with sample sizes n1, n2

o Combine both samples and rank all values

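o Add up the ranks which came from each sample to find the rank sums r1, r2

o Convert each rank sum to a U value: U1 = r1 - n1(n1+1)/2 and U2 = r2 - n2(n2+1)/2, so that U1 + U2 = n1 * n2

o Use the smaller of U1 and U2 as the U test statistic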
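A minimal sketch of the rank-sum method, using the common conversion from rank sums to U values, U1 = r1 - n1(n1+1)/2 and U2 = r2 - n2(n2+1)/2. The function name and the two samples are made up for illustration.

```python
def mann_whitney_u(sample1, sample2):
    combined = sorted(sample1 + sample2)

    def average_rank(value):
        # Rank within the combined data; ties share the mean of their positions.
        return combined.index(value) + (combined.count(value) - 1) / 2 + 1

    r1 = sum(average_rank(v) for v in sample1)    # rank sum of sample 1
    r2 = sum(average_rank(v) for v in sample2)    # rank sum of sample 2
    n1, n2 = len(sample1), len(sample2)
    u1 = r1 - n1 * (n1 + 1) / 2
    u2 = r2 - n2 * (n2 + 1) / 2
    return min(u1, u2), u1, u2


group_a = [12, 15, 11, 18, 14]                    # hypothetical independent samples
group_b = [22, 19, 25, 17, 21, 24]
u, u1, u2 = mann_whitney_u(group_a, group_b)
print(u, u1, u2)                                  # 1.0 1.0 29.0
```

U1 counts the pairs in which a value from the first sample exceeds a value from the second. Here only 18 > 17 does, so U1 = 1, and U1 + U2 = n1 * n2 = 30.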

## Mann-Whitney U test interpretation

- The smaller U can range from 0 (complete separation between the groups, strongest evidence against the null hypothesis) to n1 * n2 / 2 (complete overlap between the groups, no evidence against H0)

- U, n1 and n2 are used to derive Z

- Z is normally distributed for larger samples

- Z used to derive significance (p), all usually included in stats program output.

- Having numerous tied ranks leads to problems

o Complex correction to standard deviation of U in stats software

o Not an issue if only few ties are present
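The normal approximation can be sketched as follows. Under the null hypothesis U has mean n1 * n2 / 2 and standard deviation sqrt(n1 * n2 * (n1 + n2 + 1) / 12). This sketch assumes no ties, so it leaves out the tie correction mentioned above, and the input values are hypothetical.

```python
import math


def mann_whitney_z(u, n1, n2):
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)   # no tie correction
    z = (u - mean_u) / sd_u
    # Two-sided p value from the standard normal distribution.
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided


z, p = mann_whitney_z(1, 5, 6)    # hypothetical U with n1 = 5, n2 = 6
print(round(z, 2), round(p, 3))
```

With samples this small the approximation is crude, and statistics packages would typically use an exact distribution instead.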

## Advantages of ranked group difference tests

- They do not require normally distributed data

- Ranking method provides robustness against outliers

- Also applicable to ordinal data, unlike t tests

- But if the data are normal, use a t test: it has more power

- Non-parametric tests that do not use rankings:

o For 2 sets of binomial data use binomial test

o Chi-square or McNemar test for nominal data
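As an illustration of a test that uses no ranking, here is a sketch of one common form of the binomial test: an exact two-sided test of an observed number of successes against a hypothesised success probability. The function name, the default probability of 0.5 and the example counts are assumptions made for illustration.

```python
from math import comb


def binomial_test_two_sided(successes, n, p=0.5):
    def prob(k):
        # Probability of exactly k successes in n trials.
        return comb(n, k) * p**k * (1 - p)**(n - k)

    observed = prob(successes)
    # Sum the probabilities of all outcomes no more likely than the observed one.
    # The small tolerance guards against floating-point rounding.
    return sum(prob(k) for k in range(n + 1) if prob(k) <= observed * (1 + 1e-9))


p_value = binomial_test_two_sided(9, 10)   # hypothetical: 9 successes in 10 trials
print(round(p_value, 4))                    # 0.0215
```

For 9 successes out of 10 the outcomes counted are 0, 1, 9 and 10 successes, giving 22/1024.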
