Multidimensional Item Response Theory (MIRT) extends the unidimensional IRT model to more than one dimension. Because of its complexity, it has so far been used mainly for research rather than operationally in the assessment industry. Once used operationally, MIRT will allow psychometricians to analyze assessment items along more than one dimension.

For example, a mathematics item may require students to have multiple abilities, such as (1) understanding the question correctly, (2) transforming the question into an equation, and (3) having the computational skill to solve the equation. Depending on the literature referenced, a student's ability goes by several names, such as latent trait, construct, or dimension; for our purposes, it is called a dimension. In the example, all three abilities, or dimensions, would be required to answer the question correctly. While most current psychometric IRT models allow the analysis of only one of those dimensions, MIRT allows the analysis of all three simultaneously.

It has been more than 50 years since Frederic Lord published "A Theory of Test Scores," one of the most influential works in the history of IRT.[1] Since then, IRT models have been used extensively in the assessment industry to report student scores for yearly progress reports. To apply an IRT model to a test design, at least three assumptions need to be met:

  1. local independence: a student’s response to one item does not affect his or her responses to the other items, once ability is accounted for
  2. unidimensionality of latent traits: a single ability underlies a student’s responses to all of the items
  3. monotonicity: a student’s probability of correctly responding to items increases, or at least does not decrease, as the student’s ability increases

These assumptions are crucial to IRT modeling. Since IRT is a statistical representation of the relationship between test items and student ability, all three assumptions must hold in order to interpret a student’s test score adequately.
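The monotonicity assumption in particular can be made concrete with the two-parameter logistic (2PL) model, a common unidimensional IRT model. A minimal sketch in Python (the item parameters here are illustrative, not taken from any real assessment):

```python
import numpy as np

def p_correct_2pl(theta, a, b):
    """2PL item response function: probability of a correct response
    given ability theta, discrimination a, and difficulty b."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

# Probabilities for one item (a=1.2, b=0.0) across a range of abilities.
thetas = np.linspace(-3, 3, 7)
probs = p_correct_2pl(thetas, a=1.2, b=0.0)

# Monotonicity: the probability of a correct response never decreases
# as ability increases.
assert np.all(np.diff(probs) > 0)
```

Note that at theta equal to the item difficulty b, the 2PL probability is exactly 0.5, which is one way the difficulty parameter is interpreted.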

In addition, many applications have been developed on top of IRT, such as equating, linking, Differential Item Functioning (DIF), and standard setting. All of these applications presume that the item parameters are calibrated as accurately as possible, which in turn requires all three assumptions listed above to be met. The most commonly violated of the three is unidimensionality, the assumption that the assessment measures a single aspect of a student's ability. Violating it can produce large standard errors in the item parameter estimates.
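One common heuristic for spotting a unidimensionality violation is to inspect the eigenvalues of the inter-item correlation matrix: a single dominant eigenvalue is consistent with one dimension, while several dominant eigenvalues suggest more. A hedged sketch using simulated responses (the two-dimension setup and all numbers are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n_students, n_items = 2000, 10

# Simulate two independent abilities; the first 5 items load on the
# first dimension and the last 5 on the second (purely illustrative).
theta = rng.normal(size=(n_students, 2))
loadings = np.zeros((n_items, 2))
loadings[:5, 0] = 1.0
loadings[5:, 1] = 1.0

# Generate dichotomous (0/1) responses through a logistic link.
logits = theta @ loadings.T
responses = (rng.random((n_students, n_items))
             < 1.0 / (1.0 + np.exp(-logits))).astype(int)

# Eigenvalues of the inter-item Pearson correlation matrix,
# sorted from largest to smallest.
eigvals = np.sort(np.linalg.eigvalsh(np.corrcoef(responses.T)))[::-1]
print(eigvals[:3])  # two dominant eigenvalues hint at two dimensions
```

In practice, dimensionality assessment for dichotomous items is usually done with purpose-built tools (for example, factor analysis of tetrachoric correlations); this sketch only illustrates the idea.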

MIRT has been introduced to overcome that violation. However, several factors need to be considered before adopting MIRT in test design and score reporting. Since MIRT is a statistical model of the relationship between test items and students, having correct prior information about the following factors will reduce estimation error:

  1. number of dimensions
  2. definition of the dimensions
  3. structure of the dimensions
  4. correlation between dimensions
  5. different types of configuration of dimensions
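To make these factors concrete, here is a sketch of the compensatory multidimensional 2PL model, one common MIRT formulation. The item's discrimination vector encodes which dimensions the item loads on (its structure), and theta is the student's ability vector; all parameter values below are illustrative:

```python
import numpy as np

def p_correct_m2pl(theta, a, d):
    """Compensatory multidimensional 2PL: probability of a correct
    response given ability vector theta, discrimination vector a,
    and intercept d. High ability on one dimension can compensate
    for low ability on another."""
    return 1.0 / (1.0 + np.exp(-(np.dot(a, theta) + d)))

# A two-dimensional item loading on both dimensions.
a = np.array([1.0, 0.8])   # nonzero entries mark the dimensions measured
d = -0.2                   # intercept (related to item difficulty)

# Compensation: strength on dimension 1 offsets weakness on dimension 2.
print(p_correct_m2pl(np.array([1.0, -1.0]), a, d))  # 0.5
print(p_correct_m2pl(np.array([0.0, 0.0]), a, d))
```

The choices the list above describes show up directly in this function: the length of theta is the number of dimensions, the pattern of nonzero entries in a is the structure, and the joint distribution assumed for theta carries the correlations between dimensions.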

My next blog post in this series on MIRT will cover these factors one by one.

[1] Lord, F. M. (1952). A theory of test scores. Psychometric Monographs, No. 7.