
Wilcoxon signed-rank test


Title: Wilcoxon signed-rank test  
Author: World Heritage Encyclopedia
Language: English
Subject: Sign test, Mann–Whitney U test, Rank correlation, Pair programming, Location test
Collection: Nonparametric Statistics, Statistical Tests, U-Statistics
Publisher: World Heritage Encyclopedia

The Wilcoxon signed-rank test is a non-parametric statistical hypothesis test used to compare two related samples, matched samples, or repeated measurements on a single sample, to assess whether their population mean ranks differ (i.e. it is a paired difference test). It can be used as an alternative to the paired Student's t-test (also known as the t-test for matched pairs or the t-test for dependent samples) when the population cannot be assumed to be normally distributed.[1]

The Wilcoxon signed-rank test is not the same as the Wilcoxon rank-sum test, although both are nonparametric and involve summation of ranks.


  • History
  • Assumptions
  • Test procedure
  • Example
  • Effect size
  • See also
  • References
  • External links
    • Implementations


History

The test is named for Frank Wilcoxon (1892–1965), who, in a single paper, proposed both it and the rank-sum test for two independent samples (Wilcoxon, 1945).[2] The test was popularized by Sidney Siegel (1956) in his influential textbook on non-parametric statistics.[3] Siegel used the symbol T for a value related to, but not the same as, W. In consequence, the test is sometimes referred to as the Wilcoxon T test, and the test statistic is reported as a value of T.


Assumptions

  1. Data are paired and come from the same population.
  2. Each pair is chosen randomly and independently.
  3. The data are measured at least on an ordinal scale (cannot be nominal).

Test procedure

Let N be the sample size, the number of pairs. Thus, there are a total of 2N data points. For i = 1, ..., N, let x_{1,i} and x_{2,i} denote the measurements.

H_0: the differences between the pairs follow a distribution that is symmetric around zero.
H_1: the differences between the pairs do not follow a distribution that is symmetric around zero.
  1. For i = 1, ..., N, calculate |x_{2,i} - x_{1,i}| and \sgn(x_{2,i} - x_{1,i}), where \sgn is the sign function.
  2. Exclude pairs with |x_{2,i} - x_{1,i}| = 0. Let N_r be the reduced sample size.
  3. Order the remaining N_r pairs from smallest absolute difference to largest absolute difference, |x_{2,i} - x_{1,i}|.
  4. Rank the pairs, starting with the smallest as 1. Ties receive a rank equal to the average of the ranks they span. Let R_i denote the rank.
  5. Calculate the test statistic W
    W = \sum_{i=1}^{N_r} [\sgn(x_{2,i} - x_{1,i}) \cdot R_i], the sum of the signed ranks.
  6. Under the null hypothesis, W follows a specific distribution with no simple expression. This distribution has an expected value of 0 and a variance of \frac{N_r(N_r + 1)(2N_r + 1)}{6}.
    W can be compared to a critical value from a reference table.[1]
    The two-sided test consists in rejecting H_0, if |W| \ge W_{critical, N_r}.
  7. As N_r increases, the sampling distribution of W converges to a normal distribution. Thus,
    For N_r \ge 10, a z-score can be calculated as z = \frac{W}{\sigma_W}, \sigma_W = \sqrt{\frac{N_r(N_r + 1)(2N_r + 1)}{6}}.
    If |z| > z_{critical} then reject H_0 (two-sided test)
    Alternatively, one-sided tests can be carried out with either the exact or the approximate distribution. A p-value can also be calculated.
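
The steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a reference implementation; the function names (sign, average_ranks, signed_rank_test) are hypothetical, and the data are those of the article's worked example. Following step 7, the sketch returns a z-score only when N_r is at least 10.

```python
import math

def sign(v):
    """Sign function: -1, 0 or 1."""
    return (v > 0) - (v < 0)

def average_ranks(values):
    """Rank values from 1 upward; tied values get the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # mean of the 1-based positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def signed_rank_test(x1, x2):
    """Return (W, N_r, z); z is None when N_r < 10 (use a table of critical values instead)."""
    diffs = [b - a for a, b in zip(x1, x2) if b != a]          # steps 1-2: drop zero differences
    n_r = len(diffs)
    ranks = average_ranks([abs(d) for d in diffs])             # steps 3-4: rank absolute differences
    w = sum(sign(d) * r for d, r in zip(diffs, ranks))         # step 5: sum of signed ranks
    sigma_w = math.sqrt(n_r * (n_r + 1) * (2 * n_r + 1) / 6)   # step 6: standard deviation under H_0
    z = w / sigma_w if n_r >= 10 else None                     # step 7: normal approximation
    return w, n_r, z

# Data from the article's worked example.
x1 = [110, 122, 125, 120, 140, 124, 123, 137, 135, 145]
x2 = [125, 115, 130, 140, 140, 115, 140, 125, 140, 135]
w, n_r, z = signed_rank_test(x1, x2)
print(w, n_r, z)  # 9.0 9 None
```

Because one pair has a zero difference, N_r is 9 here, so the sketch defers to the table of critical values rather than the normal approximation.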

The T statistic used by Siegel is the smaller of the two sums of ranks of a given sign; in the example given below, therefore, T would equal 3+4+5+6=18. Low values of T are required for significance. As the example below shows, T is easier to calculate by hand than W, and the test is equivalent to the two-sided test described above (the distribution of the statistic under H_0 has to be adjusted accordingly).
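
The relationship between T and W can be written out explicitly. In the sketch below, T^{+} and T^{-} are labels introduced here (they are not the article's notation) for the sums of the ranks belonging to positive and negative differences, and S is the total rank sum.

```latex
T^{+} + T^{-} = S = \frac{N_r(N_r + 1)}{2}, \qquad W = T^{+} - T^{-}
\quad\Longrightarrow\quad
T = \min(T^{+}, T^{-}) = \frac{S - |W|}{2}.
% Example values: N_r = 9, so S = 45; with |W| = 9 this gives T = (45 - 9)/2 = 18.
```

Since averaging tied ranks leaves the total S unchanged, the identity should also hold when ties are present.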

Excluding zeros has been criticised as not statistically justified, and as potentially leading to large calculation errors. A more stable method has been proposed:[4]

  • Calculate W = \sum_{i=1}^{N} [\sgn(x_{2,i} - x_{1,i}) \cdot R_i], (assume sgn(0) = 0)
  • Calculate sampling probabilities \pi^+ = P(x_{2,i} > x_{1,i}), \pi^- = P(x_{2,i} < x_{1,i}), \pi^0 = P(x_{2,i} = x_{1,i})
  • For {N \ge 10} use normal approximation {Z = \frac{4W - N(N+1)}{\sqrt{\frac{2N(N+1)(2N+1)}{3}(\pi^+ + \pi^- - (\pi^+ - \pi^-)^2)}}}.

(Note that this value is undefined if either \pi^+ = 1 or \pi^- = 1: i.e. if all samples show positive effect or all samples show negative effect. This is not the case with the test statistic as originally defined.)
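
The reason the value is undefined in these cases can be seen by substituting into the variance factor of the denominator. The following is only a short check of that factor, under the assumption that the three probabilities sum to one.

```latex
\pi^{+} = 1 \;\Rightarrow\; \pi^{-} = 0:\qquad
\pi^{+} + \pi^{-} - (\pi^{+} - \pi^{-})^{2} = 1 + 0 - (1 - 0)^{2} = 0,
\\
\pi^{-} = 1 \;\Rightarrow\; \pi^{+} = 0:\qquad
\pi^{+} + \pi^{-} - (\pi^{+} - \pi^{-})^{2} = 0 + 1 - (0 - 1)^{2} = 0.
```

In both cases the expression under the square root is zero, so Z involves a division by zero.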


Example

                          x_{2,i} - x_{1,i}
  i    x_{2,i}   x_{1,i}   \sgn   \text{abs}
  1      125       110       1        15
  2      115       122      -1         7
  3      130       125       1         5
  4      140       120       1        20
  5      140       140       0         0
  6      115       124      -1         9
  7      140       123       1        17
  8      125       137      -1        12
  9      140       135       1         5
 10      135       145      -1        10

Ordered by absolute difference:

                          x_{2,i} - x_{1,i}
  i    x_{2,i}   x_{1,i}   \sgn   \text{abs}   R_i   \sgn \cdot R_i
  5      140       140       0         0
  3      130       125       1         5       1.5        1.5
  9      140       135       1         5       1.5        1.5
  2      115       122      -1         7       3         -3
  6      115       124      -1         9       4         -4
 10      135       145      -1        10       5         -5
  8      125       137      -1        12       6         -6
  1      125       110       1        15       7          7
  7      140       123       1        17       8          8
  4      140       120       1        20       9          9

sgn is the sign function, \text{abs} is the absolute value, and R_i is the rank. Notice that pairs 3 and 9 are tied in absolute value. They would be ranked 1 and 2, so each gets the average of those ranks, 1.5.
N_r = 10 - 1 = 9, |W| = |1.5+1.5-3-4-5-6+7+8+9| = 9.
Since |W| = 9 < W_{critical, N_r = 9} = 35 (\alpha = 0.05, two-sided), we fail to reject H_0.
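
For a sample this small, the table lookup can be cross-checked by enumerating the null distribution directly. The sketch below assumes that, under H_0, each rank independently carries a positive or negative sign with probability 1/2, and it conditions on the observed (tied) ranks; the function name exact_two_sided_p is hypothetical. Published tables are usually computed for untied ranks, so the results may differ slightly from table-based critical values.

```python
from itertools import product

def exact_two_sided_p(ranks, w_observed):
    """Exact P(|W| >= |w_observed|) over all 2^n equally likely sign assignments."""
    threshold = abs(w_observed)
    hits = 0
    total = 0
    for signs in product((-1, 1), repeat=len(ranks)):
        w = sum(s * r for s, r in zip(signs, ranks))
        total += 1
        if abs(w) >= threshold - 1e-9:  # small tolerance for floating-point sums
            hits += 1
    return hits / total

# Ranks of the nine non-zero differences in the example (pairs 3 and 9 are tied).
ranks = [1.5, 1.5, 3, 4, 5, 6, 7, 8, 9]
p_value = exact_two_sided_p(ranks, 9)
print(p_value)  # well above 0.05, in line with failing to reject H_0
```

Enumeration visits 2^{N_r} sign assignments, so it is only practical for small samples; for larger N_r the normal approximation from the test procedure is the usual choice.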

Effect size

To compute an effect size for the signed-rank test, one can use the rank correlation.

If the test statistic W is reported, Kerby (2014) has shown that the rank correlation r is equal to the test statistic W divided by the total rank sum S, or r = W/S.[5] Using the above example, the test statistic is W = 9. The sample size of 9 has a total rank sum of S = (1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9) = 45. Hence, the rank correlation is 9/45, so r = .20.

If the test statistic T is reported, an equivalent way to compute the rank correlation is with the difference in proportion between the two rank sums, which is the Kerby (2014) simple difference formula.[5] To continue with the current example, the sample size is 9, so the total rank sum is 45. T is the smaller of the two rank sums, so T is 3 + 4 + 5 + 6 = 18. From this information alone, the remaining rank sum can be computed, because it is the total sum S minus T, or in this case 45 - 18 = 27. Next, the two rank-sum proportions are 27/45 = 60% and 18/45 = 40%. Finally, the rank correlation is the difference between the two proportions (.60 minus .40), hence r = .20.
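
Both routes to the rank correlation can be checked numerically. The sketch below uses hypothetical function names and the values from the example above; it is meant as an illustration of the two formulas rather than a general-purpose routine.

```python
def rank_correlation_from_w(w, n_r):
    """r = W / S, where S = N_r (N_r + 1) / 2 is the total rank sum."""
    total = n_r * (n_r + 1) / 2
    return w / total

def rank_correlation_from_t(t, n_r, positive_sum_is_larger=True):
    """Simple difference formula: difference between the two rank-sum proportions.

    t is the smaller rank sum; the larger one is recovered as S - t.
    The sign of r follows the direction of the larger rank sum.
    """
    total = n_r * (n_r + 1) / 2
    larger = total - t
    r = larger / total - t / total
    return r if positive_sum_is_larger else -r

r_from_w = rank_correlation_from_w(9, 9)   # 9 / 45
r_from_t = rank_correlation_from_t(18, 9)  # 27/45 - 18/45
print(round(r_from_w, 2), round(r_from_t, 2))  # 0.2 0.2
```

Because T alone does not record which sign had the smaller rank sum, the direction has to be supplied separately, which is what the positive_sum_is_larger argument stands in for here.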

See also

  • Mann–Whitney–Wilcoxon test (the variant for two independent samples)
  • Sign test (like the Wilcoxon signed-rank test, but without the assumption that the differences are symmetrically distributed around the median, and without using the magnitude of the differences)


References

  1. Lowry, Richard. "Concepts & Applications of Inferential Statistics". Retrieved 24 March 2011.
  2. Wilcoxon, Frank (Dec 1945). "Individual comparisons by ranking methods". Biometrics Bulletin 1 (6): 80–83.
  3. Siegel, Sidney (1956). Non-parametric statistics for the behavioral sciences. New York: McGraw-Hill. pp. 75–83.
  4. Ikewelugo Cyprian Anaene Oyeka (Apr 2012). "Modified Wilcoxon Signed-Rank Test". Open Journal of Statistics: 172–176.
  5. Kerby, D. S. (2014). "The simple difference formula: An approach to teaching nonparametric correlation". Innovative Teaching 3, article 1. doi:10.2466/11.IT.3.1.

External links

  • Wilcoxon Signed-Rank Test in R
  • Example of using the Wilcoxon signed-rank test
  • An online version of the test
  • A table of critical values for the Wilcoxon signed-rank test


Implementations

  • ALGLIB includes an implementation of the Wilcoxon signed-rank test in C++, C#, Delphi, Visual Basic, etc.
  • The free statistical software R includes an implementation of the test as wilcox.test(x, y, paired=TRUE), where x and y are vectors of equal length.
  • GNU Octave implements various one-tailed and two-tailed versions of the test in the wilcoxon_test function.
  • SciPy includes an implementation of the Wilcoxon signed-rank test in Python.