COMPUTE Kendall's Tau for the data: x: 5, 3, 7, 6, 2; y: 7, 8, 9, 6, 4
GEMINI - Kendall's Tau is a non-parametric measure of the strength and direction of association between two variables. It is particularly useful when you want to know whether the ranking of data points for one variable is similar or dissimilar to the ranking for another variable.
Here are the steps involved in computing Kendall's Tau:
Order the Data:
First, sort the data by the values of one of the variables (let's say x) in ascending order, keeping each y with its x. This data set has no ties in x. If it did, pairs tied in x would count as neither concordant nor discordant, and a library routine such as scipy.stats.kendalltau accounts for that case.
Your original data:
x: 5, 3, 7, 6, 2
y: 7, 8, 9, 6, 4
Sorted by x:
| x | y |
|---|---|
| 2 | 4 |
| 3 | 8 |
| 5 | 7 |
| 6 | 6 |
| 7 | 9 |
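The sorting step can be sketched in a few lines of Python. This is only an illustration: the variable names `x`, `y`, and `pairs` are chosen here and are not part of the original answer.

```python
x = [5, 3, 7, 6, 2]
y = [7, 8, 9, 6, 4]

# Keep each x together with its y, then sort the pairs by x (ascending).
# Python's sort is stable, so pairs with equal x would keep their original order.
pairs = sorted(zip(x, y), key=lambda p: p[0])

print(pairs)  # [(2, 4), (3, 8), (5, 7), (6, 6), (7, 9)]
```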
Identify Pairs of Observations:
You then consider all possible pairs of observations. For each pair, you'll compare their relative order for both x and y. Since we've already sorted by x, for any two pairs $(x_i, y_i)$ and $(x_j, y_j)$ where $x_i < x_j$, we only need to look at the corresponding y values.
Classify Pairs as Concordant, Discordant, or Tied:
For each pair of observations $(x_i, y_i)$ and $(x_j, y_j)$ (where $x_i < x_j$ due to sorting):
Concordant (C): If $y_i < y_j$. This means the ranks of both variables move in the same direction.
Discordant (D): If $y_i > y_j$. This means the ranks of the variables move in opposite directions.
Tied: If $y_i = y_j$. These pairs are not counted as either concordant or discordant in the calculation of the basic Kendall's Tau-a. More advanced versions (like Kendall's Tau-b, which scipy.stats.kendalltau computes) adjust the denominator for ties in either X or Y to provide a more robust measure.
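The classification rule can be written as a small helper. The function name `classify_pair` is hypothetical, introduced only for illustration. It uses the sign of the product of the x and y differences, so it does not require the data to be sorted first, and it labels a pair as tied if either x or y is tied.

```python
def classify_pair(p, q):
    """Classify two (x, y) observations as 'concordant', 'discordant', or 'tied'."""
    dx = q[0] - p[0]
    dy = q[1] - p[1]
    product = dx * dy
    if product > 0:      # x and y move in the same direction
        return "concordant"
    if product < 0:      # x and y move in opposite directions
        return "discordant"
    return "tied"        # a tie in x, in y, or in both


print(classify_pair((2, 4), (3, 8)))  # concordant
print(classify_pair((3, 8), (5, 7)))  # discordant
```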
Let's go through the sorted pairs (2,4), (3,8), (5,7), (6,6), (7,9):
Compare (2,4) with subsequent pairs:
(2,4) vs (3,8): 4<8 -> Concordant
(2,4) vs (5,7): 4<7 -> Concordant
(2,4) vs (6,6): 4<6 -> Concordant
(2,4) vs (7,9): 4<9 -> Concordant (Total 4 Concordant from this row)
Compare (3,8) with subsequent pairs:
(3,8) vs (5,7): 8>7 -> Discordant
(3,8) vs (6,6): 8>6 -> Discordant
(3,8) vs (7,9): 8<9 -> Concordant (Total 1 Concordant, 2 Discordant from this row)
Compare (5,7) with subsequent pairs:
(5,7) vs (6,6): 7>6 -> Discordant
(5,7) vs (7,9): 7<9 -> Concordant (Total 1 Concordant, 1 Discordant from this row)
Compare (6,6) with subsequent pairs:
(6,6) vs (7,9): 6<9 -> Concordant (Total 1 Concordant from this row)
Count Concordant (C) and Discordant (D) Pairs:
Sum up all the concordant and discordant pairs:
Total Concordant (C) = 4 + 1 + 1 + 1 = 7
Total Discordant (D) = 2 + 1 = 3
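The row-by-row tally can be checked with a short loop over all pairs. This is a sketch that assumes the data is already sorted by x with no ties in x; the names `sorted_pairs`, `concordant`, and `discordant` are illustrative.

```python
from itertools import combinations

# Observations already sorted by x, as in the table.
sorted_pairs = [(2, 4), (3, 8), (5, 7), (6, 6), (7, 9)]

concordant = 0
discordant = 0
# combinations() yields each pair once, earlier observation first,
# so x_i < x_j holds and only the y values need comparing.
for (xi, yi), (xj, yj) in combinations(sorted_pairs, 2):
    if yi < yj:
        concordant += 1
    elif yi > yj:
        discordant += 1

print(concordant, discordant)  # 7 3
```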
Apply the Kendall's Tau Formula:
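The basic formula for Kendall's Tau (τ), valid when there are no ties so that $C + D$ equals the total number of pairs $n(n-1)/2$, is:
$$\tau = \frac{C - D}{C + D}$$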
Where:
C is the number of concordant pairs.
D is the number of discordant pairs.
Using the counts from your data:
$$\tau = \frac{7 - 3}{7 + 3} = \frac{4}{10} = 0.4$$
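Putting the steps together, a minimal end-to-end sketch follows. It assumes tie-free data, as in this example. The function name `kendall_tau_no_ties` is hypothetical and is not a library API.

```python
from itertools import combinations


def kendall_tau_no_ties(x, y):
    """Kendall's Tau as (C - D) / (C + D); assumes no ties in x or y."""
    concordant = 0
    discordant = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        sign = (xj - xi) * (yj - yi)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)


tau = kendall_tau_no_ties([5, 3, 7, 6, 2], [7, 8, 9, 6, 4])
print(tau)  # 0.4
```

If SciPy is installed, scipy.stats.kendalltau(x, y) should report the same statistic for this data, since Tau-b reduces to the basic formula when there are no ties.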
This value of 0.4 indicates a moderate positive association between the rankings of your x and y variables. A value of +1 indicates perfect agreement in ranking, -1 indicates perfect disagreement, and 0 indicates no association.