Level of measurement

In statistics and quantitative research methodology, various attempts have been made to classify variables (or types of data) and thereby develop a taxonomy of levels of measurement or scales of measure. Perhaps the best known are those developed by the psychologist Stanley Smith Stevens. He proposed four types: nominal, ordinal, interval, and ratio.




Typology

Stevens proposed his typology in a 1946 Science article titled "On the theory of scales of measurement".[1] In that article, Stevens claimed that all measurement in science was conducted using four different types of scales that he called "nominal," "ordinal," "interval," and "ratio," unifying both "qualitative" data (described by his "nominal" type) and "quantitative" data (described, to varying degrees, by the rest of his scales). The concept of scale types later received the mathematical rigour that it lacked at its inception with the work of the mathematical psychologists Theodore Alper (1985, 1987), Louis Narens (1981a, b), and R. Duncan Luce (1986, 1987, 2001). As Luce (1997, p. 395) wrote:

S. S. Stevens (1946, 1951, 1975) claimed that what counted was having an interval or ratio scale. Subsequent research has given meaning to this assertion, but given his attempts to invoke scale type ideas it is doubtful if he understood it himself . . . no measurement theorist I know accepts Stevens' broad definition of measurement . . . in our view, the only sensible meaning for 'rule' is empirically testable laws about the attribute.

Nominal scale

The nominal type differentiates between items or subjects based only on their names or (meta-)categories and other qualitative classifications they belong to; thus dichotomous data involves the construction of classifications as well as the classification of items. Discovery of an exception to a classification can be viewed as progress. Numbers may be used to label the categories, but these numbers carry no numerical value and stand in no numerical relationship to one another.

Examples of these classifications include gender, nationality, ethnicity, language, genre, style, biological species, and form.[2][3] In a university one could also use hall of affiliation as an example.

Nominal scales were often called qualitative scales, and measurements made on qualitative scales were called qualitative data. However, the rise of qualitative research has made this usage confusing.

Mathematical operations

Set membership, classification, categorical equality, and equivalence are all operations which apply to objects of the nominal type.

Central tendency

The mode, i.e. the most common item, is allowed as the measure of central tendency for the nominal type. On the other hand, the median, i.e. the middle-ranked item, makes no sense for the nominal type of data since ranking is meaningless for the nominal type.


Percentage

Percentages can be used to determine or develop a comparison of the classifications.
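The statistics permitted for nominal data can be illustrated with a short sketch. The genre labels and the numeric codes below are hypothetical; the point is only that the mode and the percentage breakdown depend on category membership, so relabeling the categories with arbitrary numbers leaves them unchanged.

```python
from collections import Counter

# Hypothetical nominal observations: labels with no inherent order.
genres = ["jazz", "rock", "jazz", "folk", "rock", "jazz"]


def mode_and_percentages(labels):
    """Return the most common label and each label's share in percent."""
    counts = Counter(labels)
    total = len(labels)
    mode_label, _ = counts.most_common(1)[0]
    percentages = {label: 100.0 * n / total for label, n in counts.items()}
    return mode_label, percentages


mode_label, percentages = mode_and_percentages(genres)
print(mode_label)
print(percentages)

# Arbitrary numeric codes (an assumption for illustration) are just new names:
# the mode of the coded data is the code of the original mode.
codes = {"jazz": 7, "rock": 3, "folk": 42}
coded_mode, coded_percentages = mode_and_percentages([codes[g] for g in genres])
print(coded_mode == codes[mode_label])
```

Computing a median or mean of the numeric codes would run without error, but the result would change with every relabeling, which is why those statistics are not considered meaningful here.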

Ordinal scale

The ordinal type allows for rank order (1st, 2nd, 3rd, etc.) by which data can be sorted, but still does not allow for relative degree of difference between them. Examples include, on one hand, dichotomous data with dichotomous (or dichotomized) values such as 'sick' vs. 'healthy' when measuring health, 'guilty' vs. 'innocent' when making judgments in courts, 'wrong/false' vs. 'right/true' when measuring truth value, and, on the other hand, non-dichotomous data consisting of a spectrum of values, such as 'completely agree', 'mostly agree', 'mostly disagree', 'completely disagree' when measuring opinion.

Central tendency

The median, i.e. middle-ranked, item is allowed as the measure of central tendency; however, the mean (or average) as the measure of central tendency is not allowed. The mode is allowed.
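As a minimal sketch of the median for ordinal data, the following example ranks the opinion categories mentioned above and returns the middle-ranked response. The response data are hypothetical, and using the lower of the two middle items for an even number of responses is a choice made here, not part of the typology.

```python
import statistics

# Ordered response categories, lowest to highest agreement.
LEVELS = [
    "completely disagree",
    "mostly disagree",
    "mostly agree",
    "completely agree",
]
RANK = {label: position for position, label in enumerate(LEVELS)}


def ordinal_median(responses):
    """Return the middle-ranked response.

    median_low always returns an observed value, so no averaging of
    ranks (which would assume equal spacing) takes place.
    """
    ranks = [RANK[r] for r in responses]
    return LEVELS[statistics.median_low(ranks)]


# Hypothetical survey responses.
responses = [
    "mostly agree",
    "completely agree",
    "mostly disagree",
    "mostly agree",
    "completely disagree",
]
middle = ordinal_median(responses)
print(middle)
```

Only the order of the categories is used; the integer ranks serve as sort keys and are never added or averaged.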

In 1946, Stevens observed that psychological measurement, such as measurement of opinions, usually operates on ordinal scales; thus means and standard deviations have no validity, but they can be used to get ideas for how to improve operationalization of variables used in questionnaires. Most psychological data collected by psychometric instruments and tests, measuring cognitive and other abilities, are ordinal, although some theoreticians have argued they can be treated as interval or ratio scales. However, there is little prima facie evidence to suggest that such attributes are anything more than ordinal (Cliff, 1996; Cliff & Keats, 2003; Michell, 2008).[4] In particular,[5] IQ scores reflect an ordinal scale, in which all scores are meaningful for comparison only.[6][7][8] There is no absolute zero, and a 10-point difference may carry different meanings at different points of the scale.[9][10]

Interval scale

The interval type allows for the degree of difference between items, but not the ratio between them. Examples include temperature with the Celsius scale, which has an arbitrarily-defined zero point (the freezing point of a particular substance under particular conditions), date when measured from an arbitrary epoch (such as AD) and direction measured in degrees from true or magnetic north. Ratios are not allowed since 20 °C cannot be said to be "twice as hot" as 10 °C, nor can multiplication/division be carried out between any two dates directly. However, ratios of differences can be expressed; for example, one difference can be twice another. Interval type variables are sometimes also called "scaled variables", but the formal mathematical term is an affine space (in this case an affine line).
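The difference between ratios of values and ratios of differences can be checked numerically. The sketch below converts three hypothetical temperatures from Celsius to Fahrenheit, which is an affine change of scale (new unit, new arbitrary zero). The ratio of two temperatures changes under the conversion, while the ratio of two differences does not.

```python
def celsius_to_fahrenheit(c):
    """Affine change of scale: a different unit and a different zero."""
    return c * 9.0 / 5.0 + 32.0


# Hypothetical temperatures in degrees Celsius.
a_c, b_c, c_c = 10.0, 20.0, 40.0
a_f, b_f, c_f = (celsius_to_fahrenheit(t) for t in (a_c, b_c, c_c))

# Ratio of values: depends on where the zero happens to be.
ratio_c = b_c / a_c
ratio_f = b_f / a_f

# Ratio of differences: the zero cancels and the unit divides out.
diff_ratio_c = (c_c - b_c) / (b_c - a_c)
diff_ratio_f = (c_f - b_f) / (b_f - a_f)

print(ratio_c, ratio_f)
print(diff_ratio_c, diff_ratio_f)
```

Here 20 °C is "twice" 10 °C only in the Celsius numbers; in Fahrenheit the same two temperatures are 68 and 50. The statement that one gap is twice the other holds on both scales.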

Central tendency and statistical dispersion

The mode, median, and arithmetic mean are allowed to measure central tendency of interval variables, while measures of statistical dispersion include range and standard deviation. Since one can only divide by differences, one cannot define measures that require ratios of the values themselves, such as the coefficient of variation. More subtly, while one can define moments about the origin, only central moments are meaningful, since the choice of origin is arbitrary. One can define standardized moments, since ratios of differences are meaningful. The coefficient of variation is excluded because it divides the standard deviation, which is (the square root of) a central moment, by the mean, which is a moment about the origin.
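A small numerical check makes this concrete. The sketch below uses hypothetical Celsius readings and their Fahrenheit equivalents; the helper names are illustrative. A standardized moment (here the third, i.e. skewness) comes out the same on both scales, whereas the coefficient of variation does not.

```python
import statistics


def coefficient_of_variation(xs):
    """Standard deviation divided by the mean (a moment about the origin)."""
    return statistics.pstdev(xs) / statistics.fmean(xs)


def standardized_third_moment(xs):
    """Mean of cubed z-scores; built only from differences and their ratios."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)


# Hypothetical readings in degrees Celsius and the same readings in Fahrenheit.
celsius = [10.0, 12.0, 15.0, 21.0, 30.0]
fahrenheit = [c * 9.0 / 5.0 + 32.0 for c in celsius]

cv_c = coefficient_of_variation(celsius)
cv_f = coefficient_of_variation(fahrenheit)
skew_c = standardized_third_moment(celsius)
skew_f = standardized_third_moment(fahrenheit)

print(cv_c, cv_f)      # differ: the result depends on the arbitrary zero
print(skew_c, skew_f)  # agree up to floating-point rounding
```

The standard deviation itself changes only by the unit factor (9/5 here), which is why it remains a usable dispersion measure on an interval scale.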

Ratio scale

The ratio type takes its name from the fact that measurement is the estimation of the ratio between a magnitude of a continuous quantity and a unit magnitude of the same kind (Michell, 1997, 1999). A ratio scale possesses a meaningful (unique and non-arbitrary) zero value. Most measurement in the physical sciences and engineering is done on ratio scales. Examples include mass, length, duration, plane angle, energy and electric charge. Ratios are allowed because having a non-arbitrary zero point makes it meaningful to say, for example, that one object has "twice the length" of another (= is "twice as long"). Very informally, many ratio scales can be described as specifying "how much" of something (i.e. an amount or magnitude) or "how many" (a count). The Kelvin temperature scale is a ratio scale because it has a unique, non-arbitrary zero point called absolute zero.

Central tendency and statistical dispersion

The geometric mean and the harmonic mean are allowed to measure the central tendency, in addition to the mode, median, and arithmetic mean. The studentized range and the coefficient of variation are allowed to measure statistical dispersion. All statistical measures are allowed because all necessary mathematical operations are defined for the ratio scale.
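For comparison with the interval case, the following sketch computes the additional ratio-scale statistics for a set of hypothetical lengths and checks that a change of unit (metres to centimetres) leaves the coefficient of variation and the ratio between two values unchanged. It relies on `statistics.geometric_mean`, which is available in Python 3.8 and later.

```python
import statistics


def coefficient_of_variation(xs):
    """Standard deviation divided by the mean."""
    return statistics.pstdev(xs) / statistics.fmean(xs)


# Hypothetical lengths, first in metres and then in centimetres.
lengths_m = [2.0, 4.0, 8.0]
lengths_cm = [100.0 * x for x in lengths_m]

geometric = statistics.geometric_mean(lengths_m)
harmonic = statistics.harmonic_mean(lengths_m)

cv_m = coefficient_of_variation(lengths_m)
cv_cm = coefficient_of_variation(lengths_cm)

# "Four times as long" is true whichever unit is used.
ratio_m = lengths_m[2] / lengths_m[0]
ratio_cm = lengths_cm[2] / lengths_cm[0]

print(geometric, harmonic)
print(cv_m, cv_cm)
print(ratio_m, ratio_cm)
```

Because the only permissible change on a ratio scale is multiplication by a positive constant, the zero stays fixed and these quantities carry over from one unit to another.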

Debate on typology

While Stevens' typology is widely adopted, it is still being challenged by other theoreticians, particularly in the cases of the nominal and ordinal types (Michell, 1986).[11]

Duncan (1986) objected to the use of the word measurement in relation to the nominal type, but Stevens (1975) said of his own definition of measurement that "the assignment can be any consistent rule. The only rule not allowed would be random assignment, for randomness amounts in effect to a nonrule". However, so-called nominal measurement involves arbitrary assignment, and the "permissible transformation" is the substitution of any number for any other. This is one of the points made in Lord's (1953) satirical paper "On the Statistical Treatment of Football Numbers".[12]

The use of the mean as a measure of the central tendency for the ordinal type is still debatable among those who accept Stevens' typology. Many behavioural scientists use the mean for ordinal data, anyway. This is often justified on the basis that the ordinal type in behavioural science is in fact somewhere between the true ordinal and interval types; although the interval difference between two ordinal ranks is not constant, it is often of the same order of magnitude.

For example, applications of measurement models in educational contexts often indicate that total scores have a fairly linear relationship with measurements across the range of an assessment. Thus, some argue that so long as the unknown interval difference between ordinal scale ranks is not too variable, interval scale statistics such as means can meaningfully be used on ordinal scale variables. Statistical analysis software such as SPSS requires the user to select the appropriate measurement class for each variable. This is intended to keep users from inadvertently running meaningless analyses (for example, a correlation analysis with a nominal-level variable).
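The concern behind the debate over means can be shown with a deliberately extreme sketch. The two groups and the recoding below are hypothetical. The recoding is strictly increasing, so it preserves every rank order, yet it reverses which group has the higher mean while leaving the comparison of medians intact.

```python
import statistics

# Hypothetical ordinal scores (1 = lowest category, 5 = highest).
group_a = [1, 1, 5]
group_b = [2, 2, 2]

# A strictly increasing recoding: same order, different spacing.
recode = {1: 1, 2: 4, 3: 5, 4: 6, 5: 7}


def compare(xs, ys, stat):
    """Return 1 if stat(xs) > stat(ys), -1 if smaller, 0 if equal."""
    sx, sy = stat(xs), stat(ys)
    return (sx > sy) - (sx < sy)


recoded_a = [recode[x] for x in group_a]
recoded_b = [recode[x] for x in group_b]

mean_before = compare(group_a, group_b, statistics.mean)
mean_after = compare(recoded_a, recoded_b, statistics.mean)
median_before = compare(group_a, group_b, statistics.median)
median_after = compare(recoded_a, recoded_b, statistics.median)

print(mean_before, mean_after)      # the sign flips
print(median_before, median_after)  # the sign is unchanged
```

When the spacing between categories is roughly even, a recoding this distorting is ruled out, which is the situation the argument for using means on ordinal data has in view.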

The Rasch model (Rasch, 1960) provides a theoretical basis and justification for obtaining interval-level measurements from counts of observations such as total scores on assessments.
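In its usual dichotomous form, the Rasch model gives the probability of a correct response as exp(θ − b) / (1 + exp(θ − b)), where θ is a person's ability and b is an item's difficulty. The sketch below, with hypothetical parameter values and illustrative function names, shows the sense in which the resulting scale behaves like an interval scale: a fixed difference in ability produces the same difference in log-odds on every item.

```python
import math


def rasch_probability(theta, b):
    """Probability of a correct response for ability theta and difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def log_odds(p):
    """Natural logarithm of the odds p / (1 - p)."""
    return math.log(p / (1.0 - p))


def log_odds_gap(theta_high, theta_low, b):
    """Difference in log-odds of success between two persons on one item."""
    return (log_odds(rasch_probability(theta_high, b))
            - log_odds(rasch_probability(theta_low, b)))


# Two hypothetical persons one unit apart, compared on an easy and a hard item.
gap_easy = log_odds_gap(1.0, 0.0, b=-1.0)
gap_hard = log_odds_gap(1.0, 0.0, b=2.0)

print(gap_easy, gap_hard)  # both approximately 1.0
```

The example does not estimate anything from data; fitting abilities and difficulties to observed scores is a separate step that is not shown here.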

Another issue is derived from Nicholas R. Chrisman's article "Rethinking Levels of Measurement for Cartography",[13] in which he introduces an expanded list of levels of measurement to account for various measurements that do not necessarily fit with the traditional notions of levels of measurement. Measurements bound to a range and repeating (like degrees in a circle, clock time, etc.), graded membership categories, and other types of measurement do not fit Stevens' original work, leading to the introduction of six new levels of measurement, for a total of ten: (1) Nominal, (2) Graded membership, (3) Ordinal, (4) Interval, (5) Log-Interval, (6) Extensive Ratio, (7) Cyclical Ratio, (8) Derived Ratio, (9) Counts and finally (10) Absolute. The extended levels of measurement are rarely used outside of academic geography.
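Why repeating measurements sit awkwardly in the four-level scheme can be seen from a small sketch with two hypothetical compass bearings. The arithmetic mean treats degrees as points on a line and gives a result pointing the wrong way; a circular mean, computed here from the averaged sine and cosine components, respects the wrap-around at 360°. The circular mean is a standard technique from directional statistics and is used only as an illustration; it is not taken from Chrisman's article.

```python
import math


def circular_mean_degrees(angles):
    """Mean direction of angles in degrees, returned in the range [0, 360)."""
    sin_sum = sum(math.sin(math.radians(a)) for a in angles)
    cos_sum = sum(math.cos(math.radians(a)) for a in angles)
    return math.degrees(math.atan2(sin_sum, cos_sum)) % 360.0


def angular_distance(a, b):
    """Smallest absolute difference between two directions, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


# Two hypothetical bearings, both close to north.
bearings = [350.0, 10.0]

arithmetic = sum(bearings) / len(bearings)   # points due south
circular = circular_mean_degrees(bearings)   # points north, up to rounding

print(arithmetic)
print(angular_distance(circular, 0.0))
```

Because of floating-point rounding, the circular mean may print as a value just below 360 rather than exactly 0, which is why the example reports its angular distance from north.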

Scale types and Stevens' "operational theory of measurement"

The theory of scale types is the intellectual handmaiden to Stevens' "operational theory of measurement", which was to become definitive within psychology and the behavioral sciences, despite Michell's characterization of it as being quite at odds with measurement in the natural sciences (Michell, 1999). Essentially, the operational theory of measurement was a reaction to the conclusions of a committee established in 1932 by the British Association for the Advancement of Science to investigate the possibility of genuine scientific measurement in the psychological and behavioral sciences. This committee, which became known as the Ferguson committee, published a Final Report (Ferguson, et al., 1940, p. 245) in which Stevens' sone scale (Stevens & Davis, 1938) was an object of criticism:

…any law purporting to express a quantitative relation between sensation intensity and stimulus intensity is not merely false but is in fact meaningless unless and until a meaning can be given to the concept of addition as applied to sensation.

That is, if Stevens' sone scale genuinely measured the intensity of auditory sensations, then evidence for such sensations as being quantitative attributes needed to be produced. The evidence needed was the presence of additive structure – a concept comprehensively treated by the German mathematician Otto Hölder (Hölder, 1901). Given that the physicist and measurement theorist Norman Robert Campbell dominated the Ferguson committee's deliberations, the committee concluded that measurement in the social sciences was impossible due to the lack of concatenation operations. This conclusion was later rendered false by the discovery of the theory of conjoint measurement by Debreu (1960) and independently by Luce & Tukey (1964). However, Stevens' reaction was not to conduct experiments to test for the presence of additive structure in sensations, but instead to render the conclusions of the Ferguson committee null and void by proposing a new theory of measurement:

Paraphrasing N.R. Campbell (Final Report, p.340), we may say that measurement, in the broadest sense, is defined as the assignment of numerals to objects and events according to rules (Stevens, 1946, p.677).

Stevens was greatly influenced by the ideas of another Harvard academic, the Nobel laureate physicist Percy Bridgman (1927), whose doctrine of operationism Stevens used to define measurement. In Stevens' definition, for example, it is the use of a tape measure that defines length (the object of measurement) as being measurable (and so by implication quantitative). Critics of operationism object that it confuses the relations between two objects or events with properties of one of those objects or events (Hardcastle, 1995; Michell, 1999; Moyer, 1981a,b; Rogers, 1989).

The Canadian measurement theorist William Rozeboom (1966) was an early and trenchant critic of Stevens' theory of scale types.

Notes

  1. ^ Stevens, S. S. (1946). "On the Theory of Scales of Measurement". Science 103 (2684): 677–680.
  2. ^ Nominal measures are based on sets and depend on categories, à la Aristotle.
  3. ^ "Invariably one came up against fundamental physical limits to the accuracy of measurement. ... The art of physical measurement seemed to be a matter of compromise, of choosing between reciprocally related uncertainties. ... Multiplying together the conjugate pairs of uncertainty limits mentioned, however, I found that they formed invariant products of not one but two distinct kinds. ... The first group of limits were calculable a priori from a specification of the instrument. The second group could be calculated only a posteriori from a specification of what was done with the instrument. ... In the first case each unit [of information] would add one additional dimension (conceptual category), whereas in the second each unit would add one additional atomic fact.", – pp. 1–4: MacKay, Donald M. (1969), Information, Mechanism, and Meaning, Cambridge, MA: MIT Press, ISBN 0-262-63-032-X
  4. ^ Lord, Frederic M.; Novick, Melvin R.; Allan Birnbaum (1968). Statistical Theories of Mental Test Scores. Reading (MA): Addison-Wesley. p. 21.
  5. ^ Sheskin, David J. (2007). Handbook of Parametric and Nonparametric Statistical Procedures (Fourth ed.). Boca Raton (FL): Chapman & Hall/CRC. p. 3.  
  6. ^ Mussen, Paul Henry (1973). Psychology: An Introduction. Lexington (MA): Heath. p. 363.  
  7. ^ Truch, Steve (1993). The WISC-III Companion: A Guide to Interpretation and Educational Intervention. Austin (TX): Pro-Ed. p. 35.  
  8. ^  
  9. ^ Eysenck, Hans (1998). Intelligence: A New Look. New Brunswick (NJ):  
  10. ^  
  11. ^ Velleman, Paul F.; Wilkinson, Leland (1993). "Nominal, Ordinal, Interval, and Ratio Typologies Are Misleading". The American Statistician (American Statistical Association) 47 (1): 65–72.  
  12. ^ Lord, Frederic M. (December 1953). "On the Statistical Treatment of Football Numbers". American Psychologist 8 (12): 750–751. 
  13. ^ Chrisman, Nicholas R. (1998). Rethinking Levels of Measurement for Cartography. Cartography and Geographic Information Science, vol. 25 (4), pp. 231-242

References


  • Alper, T. M. (1985). A note on real measurement structures of scale type (m, m + 1). Journal of Mathematical Psychology, 29, 73–81.
  • Alper, T. M. (1987). A classification of all order-preserving homeomorphism groups of the reals that satisfy finite uniqueness. Journal of Mathematical Psychology, 31, 135–154.
  • Briand, L. & El Emam, K. & Morasca, S. (1995). On the Application of Measurement Theory in Software Engineering. Empirical Software Engineering, 1, 61–88. [On line]
  • Cliff, N. (1996). Ordinal Methods for Behavioral Data Analysis. Mahwah, NJ: Lawrence Erlbaum. ISBN 0-8058-1333-0
  • Cliff, N. & Keats, J. A. (2003). Ordinal Measurement in the Behavioral Sciences. Mahwah, NJ: Erlbaum. ISBN 0-8058-2093-0
  • Lord, F. M. (1953). On the statistical treatment of football numbers. American Psychologist, 8, 750–751. Reprinted in Haber, A., Runyon, R. P., & Badia, P. (1970), Readings in Statistics, Ch. 3, Reading, MA: Addison–Wesley; and in Maranell, G. M. (Ed.) (2007), Scaling: A Sourcebook for Behavioral Scientists, Ch. 31, pp. 402–405, New Brunswick, NJ & London: Aldine Transaction.
  • Hardcastle, G. L. (1995) S. S. Stevens and the origins of operationism. Philosophy of Science 62:404–424.
  • Lord, F. M., & Novick, M.R. (1968). Statistical theories of mental test scores. Reading, MA: Addison–Wesley.
  • Luce, R. D. (1986). Uniqueness and homogeneity of ordered relational structures. Journal of Mathematical Psychology, 30, 391–415.
  • Luce, R. D. (1987). Measurement structures with Archimedean ordered translation groups. Order, 4, 165–189.
  • Luce, R. D. (1997). Quantification and symmetry: commentary on Michell 'Quantitative science and the definition of measurement in psychology'. British Journal of Psychology, 88, 395–398.
  • Luce, R. D. (2000). Utility of uncertain gains and losses: measurement theoretic and experimental approaches. Mahwah, N.J.: Lawrence Erlbaum.
  • Luce, R. D. (2001). Conditions equivalent to unit representations of ordered relational structures. Journal of Mathematical Psychology, 45, 81–98.
  • Luce, R. D. & Tukey, J.W. (1964). Simultaneous conjoint measurement: a new scale type of fundamental measurement. Journal of Mathematical Psychology, 1, 1–27.
  • Michell, J. (1986). Measurement scales and statistics: a clash of paradigms. Psychological Bulletin, 100(3), 398–407.
  • Michell, J. (1997). Quantitative science and the definition of measurement in psychology. British Journal of Psychology, 88, 355–383.
  • Michell, J. (1999). Measurement in Psychology – A critical history of a methodological concept. Cambridge: Cambridge University Press.
  • Michell, J. (2008). Is psychometrics pathological science? Measurement – Interdisciplinary Research & Perspectives, 6, 7–24.
  • Narens, L. (1981a). A general theory of ratio scalability with remarks about the measurement-theoretic concept of meaningfulness. Theory and Decision, 13, 1–70.
  • Narens, L. (1981b). On the scales of measurement. Journal of Mathematical Psychology, 24, 249–275.
  • Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests. Copenhagen: Danish Institute for Educational Research.
  • Rozeboom, W.W. (1966). Scaling theory and the nature of measurement. Synthese, 16, 170–233.
  • Stevens, S. S. (1951). Mathematics, measurement and psychophysics. In S. S. Stevens (Ed.), Handbook of experimental psychology (pp. 1–49). New York: Wiley.
  • Stevens, S. S. (1975). Psychophysics. New York: Wiley.
  • von Eye, A. (2005). Review of Cliff and Keats, Ordinal measurement in the behavioral sciences. Applied Psychological Measurement, 29, 401–403.
