by Dr Andrew Johnson, Cortney Hanna and Rachael Smyth
The videos on this site introduce students to the fundamentals of test construction and psychometrics. They begin with the basics of measurement, descriptive statistics, and item writing, and build to more advanced topics such as factor analysis and item response theory. Where appropriate, students are shown how to conduct these analyses in R, an open-source statistical programming language.
Please contact Andrew Johnson with any comments, concerns, or technical support queries.
By the end of this module, students will:
In this topic, we will explore the basics of measurement, and the comparison of apples and oranges...
The most basic task that we can undertake is to describe a set of data. In this lesson, we explore the concepts of central tendency (mean, median, and mode) and dispersion (range, standard deviation, and coefficient of variation).
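As a rough illustration of these descriptive statistics, here is a minimal sketch. The lessons themselves use R; Python's standard library is used here only for illustration, and the scores are invented for the example.

```python
import statistics

# Hypothetical test scores, invented for illustration.
scores = [2, 4, 4, 5, 7, 8]

# Central tendency
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Dispersion
score_range = max(scores) - min(scores)
sd = statistics.stdev(scores)   # sample standard deviation (n - 1 denominator)
cv = sd / mean                  # coefficient of variation: SD relative to the mean

print(mean, median, mode, score_range, round(sd, 3), round(cv, 3))
```

The coefficient of variation is only meaningful for ratio-scale data with a non-zero mean, which is why it is computed last and kept separate from the other measures.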
Comparing apples and oranges is easier when we can conduct our comparisons on a common scale. In this lesson, we discuss the basics of standardizing variables, using z-scores, T-scores, and percentiles.
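The three standardizations named above can be sketched as follows. This is an illustrative Python sketch rather than the R code used in the lessons; the function names are hypothetical, and the percentile rank uses the "percentage of scores at or below x" definition, which is only one of several conventions.

```python
import statistics

def z_scores(values):
    """Rescale values to mean 0 and (sample) standard deviation 1."""
    m = statistics.mean(values)
    s = statistics.stdev(values)
    return [(v - m) / s for v in values]

def t_scores(values):
    """Rescale z-scores to mean 50 and standard deviation 10."""
    return [50 + 10 * z for z in z_scores(values)]

def percentile_rank(values, x):
    """Percentage of values at or below x (one common definition)."""
    return 100 * sum(v <= x for v in values) / len(values)

# Hypothetical scores, invented for illustration.
scores = [2, 4, 4, 5, 7, 8]
print([round(z, 2) for z in z_scores(scores)])
print([round(t, 1) for t in t_scores(scores)])
print(round(percentile_rank(scores, 5), 1))
```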
It is hard to imagine undertaking any sort of test validation without correlation coefficients: most of our efforts at evaluating reliability and validity depend on some form of correlation. In this lesson, we introduce the concept of correlation, touching upon the most common coefficients: the Pearson product-moment correlation coefficient, the Spearman rank-order correlation coefficient, and the point-biserial correlation coefficient.
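One way to see how these three coefficients relate is that the Spearman coefficient is a Pearson correlation computed on ranks, and the point-biserial coefficient is a Pearson correlation in which one variable is coded 0/1. The following Python sketch (illustrative only; the lessons use R, and the helper names are hypothetical) makes that explicit.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank-order correlation: Pearson applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def point_biserial(group, y):
    """Point-biserial correlation: Pearson with a 0/1 grouping variable."""
    return pearson(group, y)
```

For example, `spearman([1, 2, 3, 4], [1, 4, 9, 16])` is 1 up to floating-point rounding, because the relationship is perfectly monotonic even though it is not linear.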
Correlation coefficients are the obvious tools for estimating reliability and validity, but measurement principles also apply to the coefficients themselves: the properties of our tests and measures affect the correlations that we observe. In this lesson, we start with a consideration of the "latent variable" (or "construct") that we intend to measure. We also introduce the notion of "disattenuation", with particular attention to the effects of unreliability and constrained variability on the observed correlation between two variables.
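The two corrections mentioned here can be sketched with their standard textbook formulas. This Python sketch is illustrative only: the function names are hypothetical, and the range-restriction formula shown is the common correction for direct restriction (often called Thorndike's Case 2), which may not be the variant covered in the lesson.

```python
import math

def disattenuate(r_xy, rel_x, rel_y):
    """Correct an observed correlation for unreliability in both measures.

    r_xy  : observed correlation between X and Y
    rel_x : reliability of X
    rel_y : reliability of Y
    """
    return r_xy / math.sqrt(rel_x * rel_y)

def correct_range_restriction(r, sd_unrestricted, sd_restricted):
    """Estimate the correlation in the unrestricted population.

    Uses the direct range-restriction correction, where the ratio of
    unrestricted to restricted standard deviations drives the adjustment.
    """
    u = sd_unrestricted / sd_restricted
    return r * u / math.sqrt(1 - r ** 2 + (r ** 2) * (u ** 2))

# Hypothetical values, invented for illustration.
print(round(disattenuate(0.36, 0.81, 0.64), 3))
print(round(correct_range_restriction(0.30, 20.0, 10.0), 3))
```

Both corrections can only raise the magnitude of the observed correlation, and a disattenuated value above 1 is a sign that the reliability estimates are too low.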
Almost anything that is worth measuring will be predicted by multiple factors, and so the concept of multiple correlation will be critical to our ability to understand the nature of tests and measures. In this lesson, we introduce the idea of multiple correlation, and with it, the concepts of part and partial correlation.
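For the simplest case of one criterion y and two predictors x and z, partial, part (semipartial), and multiple correlation can all be computed from the three pairwise correlations. The Python sketch below is illustrative only (the lessons use R), and the function names are hypothetical.

```python
import math

def partial_r(r_xy, r_xz, r_yz):
    """Correlation of x and y with z removed from both."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

def part_r(r_xy, r_xz, r_yz):
    """Part (semipartial) correlation: z removed from x only."""
    return (r_xy - r_xz * r_yz) / math.sqrt(1 - r_xz ** 2)

def multiple_r_squared(r_xy, r_yz, r_xz):
    """Squared multiple correlation of y with the two predictors x and z."""
    return (r_xy ** 2 + r_yz ** 2 - 2 * r_xy * r_yz * r_xz) / (1 - r_xz ** 2)

# Hypothetical pairwise correlations, invented for illustration.
r_xy, r_xz, r_yz = 0.50, 0.30, 0.40
print(round(partial_r(r_xy, r_xz, r_yz), 3))
print(round(part_r(r_xy, r_xz, r_yz), 3))
print(round(multiple_r_squared(r_xy, r_yz, r_xz), 3))
```

A useful check on these formulas is that the squared multiple correlation equals the squared correlation of y with z plus the squared part correlation of x: the part correlation is the unique contribution of x over and above z.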
Although it has been overshadowed by advances in modern psychometrics, an understanding of classical test theory remains an important part of understanding the impact of measurement quality on scientific investigations. In this lesson, we introduce classical test theory, along with critical assumptions of the theory, and the importance of these assumptions for our understanding of reliability and validity.
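The core idea of classical test theory, as usually stated, is that an observed score is the sum of a true score and a random error (X = T + E), with errors that average zero and are uncorrelated with true scores. Under those assumptions, reliability is the ratio of true-score variance to observed-score variance. The following Python simulation is a minimal sketch of that idea; the means, standard deviations, and sample size are invented for illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the simulation is repeatable
n = 20000

# Assumed population values: true-score SD of 10, error SD of 5.
true_scores = [rng.gauss(50, 10) for _ in range(n)]
errors = [rng.gauss(0, 5) for _ in range(n)]

# Observed score = true score + error.
observed = [t + e for t, e in zip(true_scores, errors)]

# Reliability as the proportion of observed variance due to true scores.
reliability = statistics.pvariance(true_scores) / statistics.pvariance(observed)
print(round(reliability, 3))
```

With these assumed values, the theoretical reliability is 100 / (100 + 25) = 0.80, and the simulated estimate should land close to that figure.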
Arguably one of the most common topics of discussion in any introductory course on measurement, the concept of reliability lies at the heart of any psychometric evaluation of tests and measures. Starting from a discussion of the standard error of measurement, we explore practical issues in the development of reliable measures, and introduce common terms in the assessment of reliability. We conclude with a consideration of various sources of unreliability within a measure, including the importance of the number of items on a test.
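Two standard formulas connect the ideas in this lesson: the standard error of measurement, and the Spearman-Brown formula for the effect of test length on reliability. The Python sketch below is illustrative only (the lessons use R), with hypothetical function names and invented example values.

```python
import math

def standard_error_of_measurement(sd, reliability):
    """SEM: the expected spread of observed scores around a true score."""
    return sd * math.sqrt(1 - reliability)

def spearman_brown(reliability, length_factor):
    """Predicted reliability when a test is lengthened by length_factor.

    A length_factor of 2 means doubling the number of comparable items.
    """
    k = length_factor
    return k * reliability / (1 + (k - 1) * reliability)

# Hypothetical scale: SD of 15 and reliability of 0.91.
print(round(standard_error_of_measurement(15, 0.91), 2))

# Doubling a test whose reliability is 0.50.
print(round(spearman_brown(0.50, 2), 3))
```

The Spearman-Brown prediction assumes that the added items are comparable in quality to the existing ones, so it gives an upper guide rather than a guarantee.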
A measure that consistently measures the wrong thing is not particularly useful in any scientific context, and so the topic of validity is critical to our understanding of the quality of a measure. In this lesson, we discuss construct validity, and how this concept fits into a discussion of other forms of validity, including face validity, content validity, and discriminative validity.
A measure is only as good as the items that make it up. In this lesson, we discuss basic item writing principles, and learn how to identify "good" (and "bad") items within our measures.
Despite our best efforts at creating high quality items for our measures, it is likely that some items will be "better" than others. In this lesson, we discuss a number of methods that may be employed to evaluate tests and measures, as well as the items that make them up.
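The description above does not name specific methods, so the following is only an assumption about what such an evaluation might include: item difficulty (the proportion of respondents answering correctly) and Cronbach's alpha as an index of internal consistency. This is a minimal Python sketch with hypothetical function names and invented response data.

```python
import statistics

# Hypothetical responses: rows are respondents, columns are items (1 = correct).
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]

def item_difficulties(rows):
    """Proportion of respondents answering each item correctly."""
    n_items = len(rows[0])
    return [sum(row[i] for row in rows) / len(rows) for i in range(n_items)]

def cronbach_alpha(rows):
    """Cronbach's alpha from item variances and total-score variance."""
    k = len(rows[0])
    item_vars = [statistics.variance([row[i] for row in rows]) for i in range(k)]
    total_var = statistics.variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

print(item_difficulties(responses))
print(round(cronbach_alpha(responses), 3))
```

Items with difficulty values very close to 0 or 1 tell us little about differences between respondents, which is one reason difficulty is usually inspected alongside a consistency index.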
When we consider tests (whether they are tests of ability, or tests of personality), we usually speak as though they are measuring a single thing. This assumption of unidimensionality is frequently untenable, and should be tested using some form of factor analysis. In this lesson, we introduce the concepts of principal components and principal axis factoring, and how methods of data reduction can be used to better understand the dimensionality of our tests and measures.
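A principal components analysis starts from the eigenvalues and eigenvectors of the correlation matrix, and the size of the first eigenvalue relative to the number of variables gives a quick sense of how much a single dimension accounts for. The sketch below extracts only the first component using power iteration, in plain Python; this is a swapped-in teaching technique rather than the routine the lessons use in R, and the correlation matrix is invented for illustration.

```python
import math

def first_component(corr, iterations=200):
    """Largest eigenvalue and its unit eigenvector, by power iteration.

    Assumes corr is a symmetric correlation matrix (list of lists) and that
    the first variable has a non-zero weight on the leading component.
    """
    n = len(corr)
    v = [1.0] + [0.0] * (n - 1)           # arbitrary starting vector
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v'Cv gives the eigenvalue for the unit vector v.
    eigenvalue = sum(
        v[i] * sum(corr[i][j] * v[j] for j in range(n)) for i in range(n)
    )
    return eigenvalue, v

# Hypothetical correlation matrix for three items.
corr = [
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]
eigenvalue, weights = first_component(corr)
proportion = eigenvalue / len(corr)       # share of total variance explained
print(round(eigenvalue, 3), round(proportion, 3))
```

Because the eigenvalues of a correlation matrix sum to the number of variables, dividing the first eigenvalue by that number gives the proportion of variance attributable to the first component.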
Modern psychometrics, including item response theory, has begun to supplant classical test theory in many educational contexts. As the name implies, item response theory focuses on the individual item, and so is better able to evaluate how well each item within a test discriminates among the individuals in our sample. In this lesson, we introduce the concept of item response theory, and discuss the application of this method to the evaluation of both ability and personality tests.
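The description above does not say which item response model the lesson covers. As an assumption, the sketch below uses the two-parameter logistic (2PL) model, a common introductory choice, in which each item has a discrimination parameter a and a difficulty parameter b. It is written in Python for illustration only, with hypothetical function names.

```python
import math

def prob_correct(theta, a, b):
    """2PL model: probability of a correct (or keyed) response.

    theta : the person's standing on the latent trait
    a     : item discrimination (slope)
    b     : item difficulty (location)
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information the item provides at a given theta."""
    p = prob_correct(theta, a, b)
    return a * a * p * (1.0 - p)

# Two hypothetical items with the same difficulty but different discrimination.
for a in (0.5, 2.0):
    probs = [round(prob_correct(theta, a, 0.0), 3) for theta in (-1.0, 0.0, 1.0)]
    print(a, probs, round(item_information(0.0, a, 0.0), 3))
```

A more discriminating item has a steeper curve around its difficulty and therefore provides more information about respondents whose trait level is near b.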