Entropy as a Measure of Homogeneity in Categorical Grouping
Analysis
Erkki Latosaari and Ilkka Virtanen
Abstract
The paper deals with the concept of Shannon's entropy from the
point of view of statistics. Entropy is considered as a measure of
dispersion for a categorical variable. Of special interest in the paper is
the case where the classes or categories of the variable have been
aggregated to form homogeneous (with respect to the class
frequencies) groups. The total entropy of the variable is divided into
two components, the entropy between the groups and the entropy
within the groups. This division forms the basis for analyzing the
homogeneity of the aggregated groups. Further, an entropy-based test
statistic, viz. Kullback's information statistic, is introduced to carry
out homogeneity tests for the groups in the case of sample data. The
grouping procedure is illustrated with an application to the Finnish
representative elections.
Key words: categorical variables, entropy decomposition,
grouping analysis, information statistic, measure of
homogeneity.
(Proceedings of the University of Vaasa, Research Papers,
No. 96, 1983, 44 p.)
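
The decomposition described in the abstract can be illustrated with a short sketch. It is not taken from the paper: the function names, the sample frequencies, and the grouping below are hypothetical. The sketch uses natural logarithms and the standard identity that total entropy equals the entropy of the group shares (between groups) plus the share-weighted average of the entropies inside each group (within groups). The statistic 2I is written in its common form 2 * sum(n_i * ln(n_i / e_i)), with equal expected frequencies inside a group assumed as the homogeneity hypothesis; the paper may define the test differently.

```python
import math


def entropy(probs):
    """Shannon entropy in nats; zero probabilities contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def decompose(counts, groups):
    """Split the total entropy of class frequencies into between-group
    and within-group components.

    counts: class frequencies
    groups: lists of class indices that partition the classes
    Returns (total, between, within).
    """
    n = sum(counts)
    total = entropy([c / n for c in counts])
    shares = []
    within = 0.0
    for idx in groups:
        n_g = sum(counts[i] for i in idx)
        share = n_g / n
        shares.append(share)
        if n_g > 0:
            # Entropy inside the group, weighted by the group's share.
            within += share * entropy([counts[i] / n_g for i in idx])
    between = entropy(shares)
    return total, between, within


def information_statistic(counts, idx):
    """2I = 2 * sum(n_i * ln(n_i / e_i)) for one group.

    Assumption: under homogeneity every class in the group has the same
    expected frequency e_i = n_g / k.
    """
    n_g = sum(counts[i] for i in idx)
    expected = n_g / len(idx)
    return 2 * sum(
        counts[i] * math.log(counts[i] / expected) for i in idx if counts[i] > 0
    )


# Hypothetical frequencies for five classes, aggregated into two groups.
counts = [30, 28, 32, 5, 5]
groups = [[0, 1, 2], [3, 4]]

total, between, within = decompose(counts, groups)
print(f"total={total:.4f} between={between:.4f} within={within:.4f}")
for idx in groups:
    print(idx, f"2I={information_statistic(counts, idx):.4f}")
```

Under the equal-frequency assumption, 2I for a group with k classes and n_g observations equals 2 * n_g * (ln k - H_g), where H_g is the entropy inside the group. A group whose entropy is close to its maximum ln k therefore yields a small statistic, which is the sense in which a low within-group entropy deficit indicates homogeneity.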