Thursday, 21 April 2011

Mokken scaling and invariant item ordering (IIO)

From Guttman to Mokken scaling
My first encounter with scaling came while I was developing the Edinburgh Feeding Evaluation in Dementia (EdFED) Scale, when Ian Atkinson, then of The University of Edinburgh, suggested using Guttman scaling to see if the items formed a hierarchy. The items in the scale, which had to be collapsed into categories, formed a Guttman scale, and this was replicated. Guttman scaling was also used to develop the Caring Dimensions Inventory (CDI-25).

Subsequently, Ian Deary, still at The University of Edinburgh and instrumental in helping me carry out a multivariate analysis of the EdFED and also the CDI, met someone at a conference who suggested we should be using Mokken scaling. We gathered more EdFED data and, combining this with existing data, I discovered that 6 behavioural items from the EdFED formed a Mokken scale. The EdFED has since been translated into Chinese and the psychometric properties, including Mokken scaling, replicated in a

Further applications of Mokken scaling

More recently, with several colleagues, I have been engaged in applying Mokken scaling to a range of psychological instruments including: the NEO-FFI; the GHQ-30; the EPI; the Townsend ADL scale; the Oxford Happiness Inventory; the CORE-OM; the DSSI/sAD; and the Religious Involvement Inventory and the Spiritual Well-Being Scale. However, in the process of publishing the above work, it emerged that my understanding of invariant item ordering (IIO) was incomplete. The concept had first been drawn to my attention in the process of revising the NEO-FFI paper and I was directed to a review of IIO by Sijtsma & Junker (1996). The situation was compounded by the inclusion in MSP for Windows Version 5.0 of a method for estimating IIO that uses the diagnostics for the double-monotonicity model (DMM), namely the non-intersection of item step response functions (ISRFs). This only applies to dichotomous items, where the ISRFs are the same as the item response functions (IRFs). My error was pointed out by Rob Meijer and, with Ian Deary, I replied (doi:10.1016/j.paid.2009.11.025). It is clear from the last few pages of the MSP for Windows 5.0 manual that the methods for estimating IIO in polytomous items were still being developed.
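The distinction between ISRFs and IRFs can be written out in standard item response theory notation. This is only a sketch: the symbols (an item X_i scored 0 to m, a latent trait theta, and k items) are assumptions introduced here for illustration and are not taken from the MSP manual or the papers cited.

```latex
\begin{align*}
  \text{ISRF:}\quad & P(X_i \ge x \mid \theta), \qquad x = 1, \dots, m \\
  \text{IRF:}\quad  & E(X_i \mid \theta) = \sum_{x=1}^{m} P(X_i \ge x \mid \theta) \\
  \text{Dichotomous case } (m = 1):\quad & E(X_i \mid \theta) = P(X_i \ge 1 \mid \theta) = P(X_i = 1 \mid \theta) \\
  \text{IIO:}\quad  & E(X_1 \mid \theta) \le E(X_2 \mid \theta) \le \dots \le E(X_k \mid \theta)
                      \quad \text{for all } \theta
\end{align*}
```

With one step per item the ISRF and the IRF are the same curve, so non-intersecting ISRFs give an invariant ordering directly. With several steps per item each IRF is a sum of ISRFs, and the sums for two items can cross even when none of the individual step functions do.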

At around this time, a method for estimating IIO had been developed and was available in the R Project for Statistical Computing (‘R’), specifically the Mokken Scaling Analysis (MSA) in R.  The application of R and the estimation of IIO in polytomously scored items is explained in a landmark paper by Ligtvoet et al (2010) and expounded on further in relation to our recent applications of Mokken scaling by Sijtsma et al (2011).  For anyone in any doubt about what IIO is and how it can be estimated, these papers are obligatory reading.

What is IIO?
According to Ligtvoet et al (2010), IIO is 'An item ordering that is the same for all respondents' and 'the assumption of an IIO is both omnipresent and implicit in the application of many tests, questionnaires, and inventories', but also 'IIO research is new, and experience on how to interpret results has to accumulate as more applications become available.' Ligtvoet et al (2010), like Sijtsma et al (2011), show that even though the DMM applies to the ISRFs, it may not necessarily apply to the resulting IRFs. Using the MSA in R, Ligtvoet et al (2010) show how IIO can be estimated in a set of items and also how the accuracy of the IIO can be estimated using Htrans (analogous to Loevinger's coefficient H). Htrans 'expresses the degree to which the scores of respondents have the same ordering as the item totals.' The larger the value of Htrans, the more accurate the IIO; this accuracy arises from IRFs that are far apart.
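As a rough illustration of the analogy between H and Htrans, here is a minimal Python sketch for dichotomous (0/1) items only. It assumes the classical formulation in which H is the ratio of summed observed covariances to summed maximum covariances given the marginals, and Htrans is the same coefficient computed on the transposed data matrix, so that respondents take the role of items. The function names and the toy data are hypothetical; this is not the MSA in R, and the treatment of polytomous items in Ligtvoet et al (2010) is not reproduced here.

```python
from itertools import combinations


def _cov_and_max(x, y):
    """Observed covariance of two 0/1 vectors and the largest
    covariance possible given their marginal proportions."""
    n = len(x)
    px = sum(x) / n
    py = sum(y) / n
    pxy = sum(a * b for a, b in zip(x, y)) / n
    return pxy - px * py, min(px, py) - px * py


def loevinger_h(variables):
    """Scalability coefficient H over all pairs of 0/1 vectors."""
    num = den = 0.0
    for x, y in combinations(variables, 2):
        cov, cov_max = _cov_and_max(x, y)
        num += cov
        den += cov_max
    if den == 0:
        raise ValueError("H is undefined: no pair has any possible covariance")
    return num / den


def h_items(rows):
    """H for the items: the variables are the columns of the data matrix."""
    return loevinger_h(list(zip(*rows)))


def h_trans(rows):
    """Htrans sketch: the same coefficient on the transposed matrix,
    so the variables are the respondents (rows)."""
    return loevinger_h([list(r) for r in rows])


# Toy data (made up): rows are respondents, columns are items
# ordered from most to least often endorsed.
perfect = [
    [1, 0, 0],
    [1, 1, 0],
    [1, 1, 0],
    [1, 1, 1],
    [0, 0, 0],
]
# Same data plus one respondent who endorses only the rarest item.
with_error = perfect + [[0, 0, 1]]

print(h_items(perfect), h_trans(perfect))
print(h_items(with_error), h_trans(with_error))
```

In the perfectly ordered toy data both coefficients reach their maximum; adding a single respondent whose responses run against the item ordering pulls both below it.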

Two points arise:

First, the combination of serendipity, good colleagues and the will to act on the advice of people you trust can lead to new discoveries. The road to Mokken scaling was illuminated by Ian Atkinson and Ian Deary, and the most recent papers have arisen from the willingness of colleagues (too numerous to mention, but all acknowledged through co-authorship in the papers referred to above) to share their data and allow secondary analyses. Rob Meijer's comments on our work and the willingness of L Andries van der Ark to walk me through the use of MSA in R have been instrumental in deepening my understanding of IIO, Mokken scaling and item response theory. Their generosity in collaborating with me and several colleagues on a paper on Mokken scaling (under review) has been a lesson to me in how science works.

Second, a new and powerful method for investigating the psychometric properties of questionnaires is now available. Its application to existing questionnaires is providing some interesting insights into old databases. However, its potential in the development of new questionnaires remains to be explored.


  1. Where do item analysis and IIO, Mokken scaling, and item response theory meet in investigating the psychometric properties of a questionnaire? Do both approaches yield similar results?

  2. Belal - at one level, these methods are all in the same domain: Mokken scaling is a kind of item response theory and IIO is the most important property to establish in Mokken scaling. It is therefore very important in establishing the psychometric properties of a questionnaire and can be complementary to other methods falling broadly under classical test theory, such as factor analysis and Cronbach's alpha. The reliability of a Mokken scale can be estimated and the resulting scales can be tested against others for various types of construct validity. The advantage is that, with item response theory, you know more about how items relate to each other in a hierarchy and not just in their covariance; thanks for the comment - Roger