Learning Module in Test Construction

The videos presented within this site will introduce students to the fundamentals of test construction and psychometrics. Videos will include the basics of measurement, descriptive statistics, and item writing, and will build to more advanced topics such as factor analysis and item response theory. Where appropriate, students will be provided with instruction as to how these analyses are conducted within R, an open-source statistical programming language.

Please contact Andrew Johnson with any comments, concerns, or technical support queries.

Learning Objectives

By the end of this module, students will:

understand classical test theory as it applies to the critical evaluation of item and scale quality
be able to critically evaluate existing test items, and create new items, in a way that minimizes bias
recognize the limitations of classical test theory, and understand its relationship to item response theory (IRT)
be able to compare the relative merits of a classical test construction approach with an IRT approach, and select the method that is most appropriate for the measure that they wish to evaluate, the data that they have at their disposal, and the audience to which they must present their conclusions
be able to comfortably use R for the calculation of basic descriptive statistics, and for factor analysis, classical item analysis, and IRT model fitting

Intro to Measurement

In this topic, we will explore the basics of measurement, and the comparison of apples and oranges...

Describing Data

The most basic task that we can undertake is to describe a set of data. In this lesson, we explore the concepts of central tendency (mean, median, and mode) and dispersion (range, standard deviation, and coefficient of variation).
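As a rough illustration of these quantities, the following is a minimal sketch in Python (the videos themselves work in R). The score values are hypothetical, and the sample (n - 1) form of the standard deviation is an assumption about which version is intended.

```python
import statistics

# Hypothetical test scores for five students (illustrative only)
scores = [12, 15, 15, 18, 20]

# Central tendency
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Dispersion
score_range = max(scores) - min(scores)
sd = statistics.stdev(scores)  # sample standard deviation (n - 1 denominator)
cv = sd / mean                 # coefficient of variation: SD relative to the mean

print(mean, median, mode, score_range, round(sd, 3), round(cv, 3))
```

The coefficient of variation is unitless, which is what makes it useful for comparing the spread of measures that are scored on different scales.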

The Normal Distribution

Comparing apples and oranges is easier when we can conduct our comparisons on a common scale. In this lesson, we discuss the basics of standardizing variables, using z-scores, T-scores, and percentiles.
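The conversions named here can be sketched in a few lines of Python (an illustration only; the videos use R). The function names are hypothetical, the T-score convention of mean 50 and standard deviation 10 is assumed, and the percentile step assumes that scores are normally distributed.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Express a raw score as standard deviations from the mean."""
    return (x - mean) / sd

def t_score(z):
    """Rescale a z-score to a mean of 50 and a standard deviation of 10."""
    return 50 + 10 * z

def percentile_from_z(z):
    """Percentage of a normal distribution falling at or below z."""
    return 100 * NormalDist().cdf(z)

# Hypothetical example: a raw score of 115 on a scale with mean 100 and SD 15
z = z_score(115, mean=100, sd=15)
t = t_score(z)
p = percentile_from_z(z)
print(z, t, round(p, 2))
```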

Intro to Correlation

It is hard to imagine undertaking any sort of test validation without correlation coefficients, because most of our efforts at evaluating reliability and validity depend upon one. In this lesson, we introduce the concept of correlation, touching upon the most common correlation coefficients: the Pearson product-moment correlation coefficient, the Spearman rank-order correlation coefficient, and the point-biserial correlation coefficient.
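The three coefficients are closely related, which a short Python sketch can make concrete (illustrative only; the videos use R). Here the Spearman coefficient is computed as a Pearson correlation on ranks, and the point-biserial coefficient as a Pearson correlation in which one variable is coded 0/1. The data and function names are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank-order correlation: Pearson applied to the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical data: hours studied, test score, and a pass/fail item (0/1)
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]
item = [0, 1, 1, 1, 1]

r_pearson = pearson(hours, score)
r_spearman = spearman(hours, score)
r_point_biserial = pearson(item, score)  # dichotomous item vs. continuous score
print(round(r_pearson, 3), round(r_spearman, 3), round(r_point_biserial, 3))
```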

Correlation and Test Construction

Beyond their obvious use in estimating reliability and validity, correlation coefficients are themselves shaped by the properties of the tests and measures being correlated. In this lesson, we start with a consideration of the "latent variable" (or "construct") that we intend to measure. We also introduce the notion of "disattenuation", with particular attention to how unreliability and constrained variability (restriction of range) weaken the observed correlation between two variables.
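Two standard corrections illustrate the idea; the Python sketch below is an illustration and is not taken from the videos, which use R. The first is the classical correction for attenuation. The second is the correction for direct range restriction often labelled Thorndike's Case II, which is an assumption about the form of restriction involved. Function names and input values are hypothetical.

```python
from math import sqrt

def disattenuate(r_xy, r_xx, r_yy):
    """Estimate the correlation between true scores from the observed
    correlation r_xy and the reliabilities r_xx and r_yy of the two measures."""
    return r_xy / sqrt(r_xx * r_yy)

def correct_range_restriction(r, u):
    """Correct r for direct restriction of range on one variable, where
    u = (unrestricted SD) / (restricted SD) of that variable."""
    return (r * u) / sqrt(1 - r ** 2 + (r ** 2) * (u ** 2))

# Hypothetical values: observed r = .42 with reliabilities of .70 and .80
r_true = disattenuate(0.42, 0.70, 0.80)

# Hypothetical values: observed r = .30 in a sample with half the SD of the population
r_unrestricted = correct_range_restriction(0.30, u=2.0)
print(round(r_true, 3), round(r_unrestricted, 3))
```

Both corrections can only increase the magnitude of the observed correlation, which is why they are best treated as estimates of what the correlation would be under ideal measurement conditions.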

Multiple Correlation

Almost anything that is worth measuring will be predicted by multiple factors, and so the concept of multiple correlation will be critical to our ability to understand the nature of tests and measures. In this lesson, we introduce the idea of multiple correlation, and with it, the concepts of part and partial correlation.
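For the two-predictor case, all three quantities can be computed directly from the pairwise correlations. The Python sketch below is illustrative only (the videos use R); the function names and correlation values are hypothetical.

```python
from math import sqrt

def partial_r(r_xy, r_xz, r_yz):
    """Correlation of x and y with z removed from both."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

def part_r(r_xy, r_xz, r_yz):
    """Part (semipartial) correlation: z is removed from x only."""
    return (r_xy - r_xz * r_yz) / sqrt(1 - r_xz ** 2)

def multiple_r_squared(r_y1, r_y2, r_12):
    """Squared multiple correlation of y with two predictors."""
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

# Hypothetical correlations: criterion y with predictors 1 and 2
r_y1, r_y2, r_12 = 0.5, 0.4, 0.3

r_squared = multiple_r_squared(r_y1, r_y2, r_12)
sr_2 = part_r(r_y2, r_12, r_y1)      # predictor 2 with predictor 1 removed from it
pr_2 = partial_r(r_y2, r_12, r_y1)   # predictor 1 removed from both
print(round(r_squared, 4), round(sr_2, 4), round(pr_2, 4))
```

One useful check is that the squared multiple correlation equals the squared correlation with the first predictor plus the squared part correlation of the second, which is the sense in which a part correlation measures unique contribution.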

Classical Test Theory

Although it has been overshadowed by advances in modern psychometrics, an understanding of classical test theory remains an important part of understanding the impact of measurement quality on scientific investigations. In this lesson, we introduce classical test theory, along with critical assumptions of the theory, and the importance of these assumptions for our understanding of reliability and validity.

Reliability

Arguably one of the most common topics of discussion in any introductory course on measurement, the concept of reliability lies at the heart of any psychometric evaluation of tests and measures. Starting from a discussion of the standard error of measurement, we explore practical issues in the development of reliable measures, and introduce common terms in the assessment of reliability. We conclude with a consideration of various sources of unreliability within a measure, including the importance of the number of items on a test.
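Two formulas commonly associated with these ideas are the standard error of measurement and the Spearman-Brown prophecy formula, which relates reliability to test length. The Python sketch below is illustrative only (the videos use R), and assuming that these are the formulas intended here is a guess; the function names and values are hypothetical.

```python
from math import sqrt

def standard_error_of_measurement(sd, reliability):
    """SEM: the expected spread of observed scores around a true score."""
    return sd * sqrt(1 - reliability)

def spearman_brown(reliability, length_factor):
    """Predicted reliability when a test is lengthened (or shortened) by
    length_factor, assuming the added items are parallel to the originals."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

# Hypothetical scale: SD of 15 and reliability of .91
sem = standard_error_of_measurement(15, 0.91)

# Hypothetical test with reliability .60, doubled in length
doubled = spearman_brown(0.60, 2)
print(round(sem, 2), round(doubled, 2))
```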

Validity

A measure that consistently measures the wrong thing is not particularly useful in any scientific context, and so the topic of validity is critical to our understanding of the quality of a measure. In this lesson, we discuss construct validity, and how this concept fits into a discussion of other forms of validity, including face validity, content validity, and discriminative validity.

Writing Good Items

At the end of the day, a measure is only as good as the items that make it up. In this lesson, we discuss basic item writing principles, and learn how to identify "good" (and "bad") items within our measures.

Item Analysis

Despite our best efforts at creating high quality items for our measures, it is likely that some items will be "better" than others. In this lesson, we discuss a number of methods that may be employed to evaluate tests and measures, as well as the items of which they are composed.
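Three statistics that often appear in a classical item analysis are item difficulty (the proportion answering correctly), the corrected item-total correlation, and Cronbach's alpha. The Python sketch below is illustrative only (the videos use R), and whether the lesson covers exactly these statistics is an assumption. The response matrix is hypothetical.

```python
from math import sqrt
from statistics import pvariance

# Hypothetical responses: rows are people, columns are items (1 = correct)
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
    [1, 1, 1],
]

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

n_items = len(responses[0])
items = [[person[i] for person in responses] for i in range(n_items)]
totals = [sum(person) for person in responses]

# Item difficulty: proportion of people answering each item correctly
difficulty = [sum(item) / len(item) for item in items]

# Corrected item-total correlation: each item against the total of the other items
corrected_r = [
    pearson(item, [t - x for t, x in zip(totals, item)]) for item in items
]

# Cronbach's alpha from the item variances and the total-score variance
alpha = (n_items / (n_items - 1)) * (
    1 - sum(pvariance(item) for item in items) / pvariance(totals)
)
print(difficulty, [round(r, 3) for r in corrected_r], round(alpha, 3))
```

Removing the item from the total before correlating avoids inflating the item-total correlation, which matters most on short scales such as this one.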

Factor Analysis

When we consider tests (whether they are tests of ability, or tests of personality), we usually speak as though they are measuring a single thing. This assumption of unidimensionality is frequently untenable, and should be tested using some form of factor analysis. In this lesson, we introduce the concepts of principal components and principal axis factoring, and how methods of data reduction can be used to better understand the dimensionality of our tests and measures.
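One informal way to look at dimensionality is to ask how much of the total variance the first principal component of the item correlation matrix accounts for. The Python sketch below is illustrative only (the videos use R) and estimates the largest eigenvalue by power iteration, which is a simplification chosen to avoid external libraries rather than a description of how statistical software performs the analysis. The correlation matrix is hypothetical, and the starting vector assumes that the items are positively intercorrelated.

```python
def matvec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def largest_eigenvalue(matrix, iterations=200):
    """Estimate the largest eigenvalue of a symmetric matrix by power iteration."""
    v = [1.0] * len(matrix)
    for _ in range(iterations):
        w = matvec(matrix, v)
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Rayleigh quotient; v has unit length after the loop
    return sum(a * b for a, b in zip(v, matvec(matrix, v)))

# Hypothetical correlation matrix: three items that all correlate .50
correlations = [
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]

first_eigenvalue = largest_eigenvalue(correlations)

# The eigenvalues of a correlation matrix sum to the number of variables
proportion_explained = first_eigenvalue / len(correlations)
print(round(first_eigenvalue, 3), round(proportion_explained, 3))
```

A first component that accounts for most of the variance is consistent with a single dominant dimension, although it does not by itself establish unidimensionality.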

Item Response Theory

Modern psychometrics, including item response theory, has begun to supplant classical test theory in many educational contexts. As the name implies, item response theory is better able to evaluate the ability of individual items within a test to discriminate amongst individuals within our sample. In this lesson, we introduce the concept of item response theory, and discuss the application of this method to the evaluation of both ability and personality tests.
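The idea of an item "discriminating" among individuals can be made concrete with the two-parameter logistic (2PL) model, one common IRT model. The Python sketch below is illustrative only (the videos use R); treating the 2PL as the model of interest is an assumption, and the function names and parameter values are hypothetical.

```python
from math import exp

def probability_correct(theta, a, b):
    """2PL item characteristic curve: the probability of a correct (or keyed)
    response for a person at trait level theta, given item discrimination a
    and item difficulty b."""
    return 1.0 / (1.0 + exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Item information under the 2PL: highest where theta is close to b."""
    p = probability_correct(theta, a, b)
    return a ** 2 * p * (1 - p)

# Hypothetical item: discrimination 1.5, difficulty 0.0
a, b = 1.5, 0.0
for theta in (-2.0, -1.0, 0.0, 1.0, 2.0):
    p = probability_correct(theta, a, b)
    info = item_information(theta, a, b)
    print(theta, round(p, 3), round(info, 3))
```

A larger discrimination value makes the curve steeper around the item's difficulty, so the item separates people just below that trait level from people just above it more sharply.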

--- Web design by Julie Whitehead, Instructional Designer, WesternU ---
