Odds Ratio and Yule's Q
Introduction
The odds ratio is an important option for testing and quantifying the
association between two raters making dichotomous ratings. It
should probably be used more often with agreement data than it currently
is.
The odds ratio can be understood with reference to a 2×2
cross-classification table:

Cross-classification frequencies for binary ratings by two raters

                 Rater 2
Rater 1        +        -
   +           a        b      a + b
   -           c        d      c + d
             a + c    b + d    Total

By definition, the odds ratio, OR, is

OR = \frac{[a/(a+b)] / [b/(a+b)]}{[c/(c+d)] / [d/(c+d)]},     (1)

but this reduces to

OR = \frac{a/b}{c/d},     (2)

or, as OR is usually calculated,

OR = \frac{ad}{bc}.     (3)

The last equation shows that OR is equal to the simple cross-product
ratio of a 2×2 table.
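As a small check of Equations (1) to (3), the following Python sketch computes OR both from the definition and from the cross-product form. The function name odds_ratio and the cell counts are hypothetical, chosen only for illustration.

```python
def odds_ratio(a, b, c, d):
    """Cross-product odds ratio ad/(bc) of a 2x2 table, as in Equation (3)."""
    if b == 0 or c == 0:
        raise ZeroDivisionError("OR is undefined when b or c is zero")
    return (a * d) / (b * c)

# Hypothetical cell counts (not taken from any real study).
a, b, c, d = 40, 10, 5, 45

# Equation (1): ratio of the two conditional odds, written out in full.
or_eq1 = ((a / (a + b)) / (b / (a + b))) / ((c / (c + d)) / (d / (c + d)))

# Equation (3): cross-product ratio.
or_eq3 = odds_ratio(a, b, c, d)

print(or_eq1, or_eq3)
```

The two values agree up to floating-point rounding, which is the point of the reduction from Equation (1) to Equation (3).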
Intuitive explanation
The concept of "odds" is familiar from gambling. For instance, one
might say the odds in favor of a particular horse winning a race are
"3 to 1"; this means the probability of the horse winning is 3 times
the probability of its not winning.
In Equation (2), both the numerator and denominator are odds. The
numerator, a/b, gives the odds of a positive versus negative rating by
Rater 2 given that Rater 1's rating is positive. The denominator, c/d,
gives the odds of a positive versus negative rating by Rater 2 given
that Rater 1's rating is negative.
OR is the ratio of these two odds, hence its name, the odds
ratio. It indicates how much the odds of Rater 2 making a
positive rating increase for cases where Rater 1 makes a positive
rating.
This alone would make the odds ratio a potentially useful way to assess
association between the ratings of two raters. However, it has some
other appealing features as well. Note that:
OR = \frac{a/b}{c/d} = \frac{a/c}{b/d} = \frac{d/b}{c/a} = \frac{d/c}{b/a} = \frac{ad}{bc}.
From this we see that the odds ratio can be interpreted in various ways.
Generally, it shows the relative increase in the odds of one rater
making a given rating, given that the other rater made the same
rating. The value is invariant regardless of whether one is
concerned with a positive or negative rating, or which rater is the
reference and which the comparison.
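The invariance just described can be checked numerically. The sketch below, using hypothetical counts and an illustrative helper named odds_ratio, recomputes OR after exchanging the two raters and after exchanging the positive and negative labels.

```python
def odds_ratio(a, b, c, d):
    """Cross-product odds ratio ad/(bc) of a 2x2 table."""
    return (a * d) / (b * c)

# Hypothetical cell counts: rows are Rater 1 (+, -), columns are Rater 2 (+, -).
a, b, c, d = 40, 10, 5, 45

original = odds_ratio(a, b, c, d)

# Exchanging the raters transposes the table, so b and c trade places.
raters_swapped = odds_ratio(a, c, b, d)

# Exchanging the "+" and "-" labels for both raters reverses rows and columns.
labels_swapped = odds_ratio(d, c, b, a)

print(original, raters_swapped, labels_swapped)
```

All three values are the same because each rearrangement leaves the products ad and bc unchanged.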
The odds ratio can be interpreted as a measure of the magnitude of
association between the two raters. The concept of an odds ratio is
also familiar from other statistical methods (e.g., logistic
regression).
Yule's Q
OR can be transformed to a -1 to 1 scale by converting it to Yule's Q
(or a slightly different statistic, Yule's Y).
For example, Yule's Q is

Q = \frac{OR - 1}{OR + 1}.

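A minimal Python sketch of this transformation follows. It uses the algebraically equivalent form Q = (ad - bc)/(ad + bc), obtained by multiplying the numerator and denominator of (OR - 1)/(OR + 1) by bc; this form stays defined when b or c is zero. The function name yules_q and the counts are hypothetical.

```python
def yules_q(a, b, c, d):
    """Yule's Q for a 2x2 table.

    Equivalent to (OR - 1)/(OR + 1) with OR = ad/(bc), but written as
    (ad - bc)/(ad + bc) so that it remains defined when b or c is zero.
    """
    ad, bc = a * d, b * c
    if ad + bc == 0:
        raise ZeroDivisionError("Q is undefined when ad and bc are both zero")
    return (ad - bc) / (ad + bc)

# Hypothetical tables.
strong = yules_q(40, 10, 5, 45)      # OR = 36
none = yules_q(10, 10, 10, 10)       # OR = 1, no association
perfect = yules_q(30, 0, 0, 20)      # no disagreements

print(strong, none, perfect)
```

An OR of 1 maps to Q = 0, and a table with no disagreements maps to Q = 1.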
Log-odds ratio
It is often more convenient to work with the log of the odds ratio than
with the odds ratio itself. The formula for the standard error of
log(OR) is very simple:
s_{log(OR)} = \sqrt{1/a + 1/b + 1/c + 1/d}.
Knowing this standard error, one can easily test the significance of
log(OR) and/or construct confidence intervals. The former is
accomplished by calculating:
z = log(OR) / s_{log(OR)},
and referring to a table of the cumulative distribution of the standard
normal curve to determine the p-value associated with z.
Confidence limits are calculated as:
log(OR) ± z_{L} × s_{log(OR)},
where z_{L} is the z value defining the appropriate confidence limits,
e.g., z_{L} = 1.645 or 1.96 for a two-sided 90% or 95% confidence
interval, respectively.
Confidence limits for OR may be calculated as:
exp[log(OR) ± z_{L} × s_{log(OR)}].
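The calculations in this section can be sketched in Python using only the standard library. The function name log_or_inference, its z_l parameter, and the counts are hypothetical; the two-sided p-value is obtained from the standard normal distribution via math.erfc rather than a printed table, and natural logarithms are assumed.

```python
import math

def log_or_inference(a, b, c, d, z_l=1.96):
    """Log odds ratio, its standard error, z test, and confidence limits.

    Assumes all four cell counts are positive; z_l is the z value that
    defines the confidence limits (1.96 for a two-sided 95% interval).
    """
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = log_or / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    lower, upper = log_or - z_l * se, log_or + z_l * se
    return {
        "log_or": log_or,
        "se": se,
        "z": z,
        "p_value": p_value,
        "log_or_ci": (lower, upper),
        "or_ci": (math.exp(lower), math.exp(upper)),
    }

# Hypothetical cell counts.
result = log_or_inference(40, 10, 5, 45)
for name, value in result.items():
    print(name, value)
```

Exponentiating the limits for log(OR) gives limits for OR that are not symmetric about the point estimate, which is expected.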
Alternatives are to estimate confidence intervals by the nonparametric
bootstrap (for description, see the Raw agreement
indices page) or to construct exact confidence intervals by
considering all possible distributions of the cases in a 2×2
table.
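One way the bootstrap alternative might look is sketched below. This is a percentile bootstrap under stated assumptions, not necessarily the procedure described on the Raw agreement indices page: cases are resampled with replacement from the observed table, and 0.5 is added to every cell of each resample so that resamples with an empty cell still give a finite value. The function name, the number of resamples, and the seed are all hypothetical.

```python
import math
import random

def bootstrap_or_ci(a, b, c, d, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence limits for the odds ratio."""
    rng = random.Random(seed)
    n = a + b + c + d
    log_ors = []
    for _ in range(n_boot):
        # Resample n cases with replacement; each case falls in cell
        # a, b, c, or d with probability proportional to the observed count.
        counts = [0, 0, 0, 0]
        for cell in rng.choices([0, 1, 2, 3], weights=[a, b, c, d], k=n):
            counts[cell] += 1
        # Assumption: add 0.5 to each cell to avoid division by zero.
        ra, rb, rc, rd = (x + 0.5 for x in counts)
        log_ors.append(math.log((ra * rd) / (rb * rc)))
    log_ors.sort()
    alpha = 1 - level
    lo_idx = math.floor((alpha / 2) * (n_boot - 1))
    hi_idx = math.ceil((1 - alpha / 2) * (n_boot - 1))
    return math.exp(log_ors[lo_idx]), math.exp(log_ors[hi_idx])

# Hypothetical cell counts.
lo, hi = bootstrap_or_ci(40, 10, 5, 45)
print(lo, hi)
```

Fixing the seed makes the result reproducible; a different seed or number of resamples will give slightly different limits.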
Once one has used log OR or OR to assess association between raters, one
may then also perform a test of marginal homogeneity, such as the McNemar test.
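For reference, a minimal sketch of the McNemar test is given here. It assumes the usual chi-square form without continuity correction, (b - c)^2 / (b + c) on 1 degree of freedom, where b and c are the two disagreement cells of the 2×2 table. The function name and counts are hypothetical.

```python
import math

def mcnemar_test(b, c):
    """McNemar chi-square statistic (no continuity correction) and p-value.

    b and c are the off-diagonal (disagreement) counts of the 2x2 table.
    """
    if b + c == 0:
        raise ValueError("test is undefined when there are no disagreements")
    chi2 = (b - c) ** 2 / (b + c)
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical disagreement counts.
chi2, p_value = mcnemar_test(10, 5)
print(chi2, p_value)
```

Only the disagreement cells enter the test, so it addresses marginal homogeneity and is separate from the association measured by OR.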
Pros and Cons: the Odds Ratio
Pros
 The odds ratio is very easily calculated.
 Software for its calculation is readily available, e.g., SAS
PROC FREQ and SPSS CROSSTABS.
 It is a natural, intuitively acceptable way to express magnitude
of association.
 The odds ratio is linked to other statistical methods.

Cons
 If the underlying trait is continuous, the value of OR depends on the
level of each rater's threshold for a positive rating. That is not
ideal, as it implies the basic association between raters changes if
their thresholds change. Under certain distributional assumptions
(so-called "constant association" models), this problem can be
eliminated, but the assumptions introduce extra complexity.
 While the odds ratio can be generalized to ordered-category
data, this again introduces new assumptions and complexity. (See the
Loglinear, association, and quasi-symmetry models page.)

Extensions and alternatives
Extensions
 More than two categories. In an N×N table (where N > 2), one might
collapse the table into various 2×2 tables and calculate log(OR) or OR
for each. That is, for each rating category k = 1, ..., N, one would
construct the 2×2 table for the cross-classification of Level k vs. all
other levels for Raters 1 and 2, and calculate log(OR) or OR. This
assesses the association between raters with respect to the Level k vs.
not-Level k distinction.
This method is probably more appropriate for nominal ratings than
for ordered-category ratings. In either case, one might consider instead
using loglinear, association, or quasi-symmetry models.
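The collapsing step for an N×N table can be sketched as follows. The function names and the 3×3 table of counts are hypothetical; rows are taken to be Rater 1's categories and columns Rater 2's.

```python
def collapse_to_2x2(table, k):
    """Collapse an NxN table to the 2x2 table for Level k vs. all other levels.

    Returns (a, b, c, d): both raters chose k; only Rater 1 chose k;
    only Rater 2 chose k; neither chose k.
    """
    total = sum(sum(row) for row in table)
    a = table[k][k]
    b = sum(table[k]) - a                     # rest of row k
    c = sum(row[k] for row in table) - a      # rest of column k
    d = total - a - b - c
    return a, b, c, d

def category_odds_ratios(table):
    """Odds ratio for each Level k vs. not-Level k distinction."""
    ratios = []
    for k in range(len(table)):
        a, b, c, d = collapse_to_2x2(table, k)
        ratios.append((a * d) / (b * c))
    return ratios

# Hypothetical 3x3 cross-classification.
table = [
    [20, 5, 2],
    [4, 15, 6],
    [1, 5, 22],
]
ratios = category_odds_ratios(table)
print(ratios)
```

Each entry of the result describes agreement on one category only, so the values need not be similar across categories.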

Multiple raters. For more than two raters, a possibility is to
calculate log(OR) or OR for all pairs of raters. One might then report,
say, the average value and range of values across all rater pairs.
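A sketch of the all-pairs approach for several raters is shown below. The rater names and the 0/1 ratings are hypothetical, and the function simply raises an error if any pair of raters produces a 2×2 table with an empty cell.

```python
from itertools import combinations
import math

def pairwise_log_odds_ratios(ratings):
    """log(OR) for every pair of raters.

    ratings maps a rater name to a list of 0/1 ratings of the same cases,
    in the same order for every rater.
    """
    results = {}
    for r1, r2 in combinations(ratings, 2):
        pairs = list(zip(ratings[r1], ratings[r2]))
        a = pairs.count((1, 1))
        b = pairs.count((1, 0))
        c = pairs.count((0, 1))
        d = pairs.count((0, 0))
        if 0 in (a, b, c, d):
            raise ValueError(f"empty cell for raters {r1} and {r2}")
        results[(r1, r2)] = math.log((a * d) / (b * c))
    return results

# Hypothetical binary ratings of ten cases by three raters.
ratings = {
    "A": [1, 1, 1, 1, 0, 0, 0, 0, 1, 0],
    "B": [1, 1, 1, 0, 0, 0, 0, 1, 1, 0],
    "C": [1, 1, 0, 1, 0, 0, 1, 0, 1, 0],
}
log_ors = pairwise_log_odds_ratios(ratings)
values = list(log_ors.values())
mean_log_or = sum(values) / len(values)
print(mean_log_or, min(values), max(values))
```

Averaging on the log scale is a choice made here for illustration; one could equally summarize the OR values themselves.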
Alternatives
Given data by two raters, the following alternatives to the odds ratio
may be considered.

In a 2×2 table, there is a close relationship between the odds ratio
and loglinear modeling. The latter can be used
to assess both association and marginal homogeneity.

Cook and Farewell (1995) presented a model that considers formal
decomposition of a 2×2 table into independent components which reflect
(1) the odds ratio and (2) marginal homogeneity.

The tetrachoric and polychoric correlations are
alternatives when one may assume that ratings are based on a latent
continuous trait which is normally distributed. With more than two
rating categories, extensions of the polychoric correlation are
available with more flexible distributional assumptions.

Association and quasi-symmetry models can be used
for N×N tables, where ratings are nominal or ordered-categorical.
These methods are related to the odds ratio.

When there are more than two raters, latent
trait and latent class
models can be used. A particular type of
latent trait model called the Rasch model is related to the odds ratio.
References
Either of the books by Agresti is an excellent starting point.
Agresti A. Categorical data analysis. New York: Wiley, 1990.
Agresti A. An introduction to categorical data analysis. New York:
Wiley, 1996.
Bishop YMM, Fienberg SE, Holland PW. Discrete multivariate analysis:
theory and practice. Cambridge, Massachusetts: MIT Press, 1975.
Cook RJ, Farewell VT. Conditional inference for subject-specific
and marginal agreement: two families of agreement measures.
Canadian Journal of Statistics, 1995, 23, 333-344.
Fleiss JL. Statistical methods for rates and proportions, 2nd Ed. New
York: John Wiley, 1981.
Khamis H. Association, measures of. In Armitage P, Colton T (eds.),
The Encyclopedia of Biostatistics, Vol. 1, pp. 202-208. New York:
Wiley, 1998.
Somes GW, O'Brien KF. Odds ratio estimators. In Kotz S, Johnson NL
(eds.), Encyclopedia of statistical sciences, Vol. 6, pp. 407-410. New
York: Wiley, 1988.
Sprott DA, Vogel-Sprott MD. The use of the log-odds ratio to
assess the reliability of dichotomous questionnaire data.
Applied Psychological Measurement, 1987, 11, 307-316.
Last updated: 21 August 2006 (added counter)
(c) 2006
John Uebersax PhD