A comparison of IRT model combinations for assessing fit in a mixed-format elementary school science test
Abstract
Open-ended and multiple-choice questions are commonly placed on the same test; however, there is ongoing debate about the effects of mixing item types on test and item statistics. This study compares model- and item-fit statistics in a mixed-format test in which multiple-choice and constructed-response items are used together. In this 25-item fourth-grade science test administered to 2351 students in 35 schools in Turkey, items were calibrated separately and concurrently using different IRT models. An important aspect of this study is that the effect of the calibration method on model and item fit is investigated with real data. First, the 1-, 2-, and 3-Parameter Logistic (1PL, 2PL, 3PL) models were used to calibrate the dichotomously scored items, while the Graded Response Model and the Generalized Partial Credit Model were used to calibrate the open-ended ones. Then, combinations of dichotomous and polytomous models were employed in concurrent calibration. Model comparisons revealed that the combination of the 3PL model and the Graded Response Model produced the best fit statistics. © IEJEE.
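To make the two best-fitting models concrete, the sketch below illustrates (in Python with NumPy, not the authors' actual estimation software) the item response functions involved: the 3PL model for a dichotomous multiple-choice item and the Graded Response Model for a polytomous constructed-response item. The parameter values in the usage example are hypothetical, chosen only for illustration.

```python
import numpy as np

def p_3pl(theta, a, b, c):
    """3PL probability of a correct response to a dichotomous item.
    a: discrimination, b: difficulty, c: pseudo-guessing lower asymptote."""
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

def p_grm(theta, a, thresholds):
    """Graded Response Model category probabilities for a polytomous item.
    thresholds: increasing boundary difficulties b_1 < ... < b_m.
    Returns m + 1 probabilities, one per score category 0..m."""
    # Cumulative probabilities P*(X >= k), with P*(X >= 0) = 1 and
    # P*(X >= m + 1) = 0; category probability is the difference
    # P(X = k) = P*(X >= k) - P*(X >= k + 1).
    stars = np.array(
        [1.0]
        + [1.0 / (1.0 + np.exp(-a * (theta - b))) for b in thresholds]
        + [0.0]
    )
    return -np.diff(stars)

# Hypothetical parameter values for illustration only.
theta = 0.0
p_correct = p_3pl(theta, a=1.2, b=0.5, c=0.2)   # bounded between c and 1
p_cats = p_grm(theta, a=1.5, thresholds=[-1.0, 0.5])  # sums to 1
```

Under separate calibration these functions would be fit to the dichotomous and polytomous items independently; under concurrent calibration both item types are estimated together on a common latent scale, which is what the model-combination comparisons in this study evaluate.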