Linear and Separate Calibration Methods of Equating Continuous Assessment Scores of Public and Private Secondary Schools
Abstract
This study determined the mean ability estimates of students in public and private secondary schools when their scores were equated through separate calibration and linear equating, and it examined which of the two equating methods is more efficient. The study adopted a descriptive research design. The population comprised the 24,874 candidates who registered for and sat the June/July 2016 National Examinations Council (NECO) mathematics examination in Ogun State. A sample of 1,139 candidates was selected from public and private schools using a multi-stage sampling procedure. The research instruments were secondary data sources, and the data were analyzed using Multidimensional Item Response Theory (MIRT) together with the separate calibration and linear equating methods. The results showed that private school candidates' ability scores (x̅ = -0.001, SD = 0.961 under both methods) were higher than public school candidates' ability scores (x̅ = -0.865, SD = 1.058 and x̅ = -0.626, SD = 0.970 under separate calibration and linear equating, respectively), and that the difference in the ability estimates of examinees from public and private schools was significant under each equating method (t(1138) = -14.431, p < 0.05 and t(1138) = -10.876, p < 0.05, respectively). The results further showed that the linear equating method was more efficient.
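For readers less familiar with the two procedures named in the title, the sketch below illustrates them in Python on synthetic data. It is a minimal illustration only: the sample sizes and moments, the anchor-item values, the function names linear_equate and mean_sigma_link, and the use of numpy/scipy are assumptions made for demonstration, not the authors' actual analysis pipeline (which estimated abilities via MIRT calibration of NECO item responses). Mean/sigma linking is shown here as one standard way to place separately calibrated parameters on a common scale.

import numpy as np
from scipy import stats

def linear_equate(x, mu_x, sigma_x, mu_y, sigma_y):
    """Linear equating: map score x from form X onto the scale of
    form Y by matching means and standard deviations:
    y* = (sigma_y / sigma_x) * (x - mu_x) + mu_y."""
    return sigma_y / sigma_x * (x - mu_x) + mu_y

def mean_sigma_link(b_anchor_x, b_anchor_y):
    """Separate calibration with mean/sigma linking: estimate the
    slope A and intercept B that place form-X parameters on the
    form-Y scale, using the anchor items' difficulty estimates
    from the two independent calibrations."""
    A = np.std(b_anchor_y, ddof=1) / np.std(b_anchor_x, ddof=1)
    B = np.mean(b_anchor_y) - A * np.mean(b_anchor_x)
    return A, B

rng = np.random.default_rng(42)

# Synthetic ability estimates for illustration only; the study's
# real estimates came from MIRT calibration.
theta_public = rng.normal(-0.6, 1.0, 1139)
theta_private = rng.normal(0.0, 0.95, 1139)

# Linear equating demo: place hypothetical form-X raw scores on a
# form-Y scale whose mean and SD are taken to be 52.0 and 10.0.
raw_x = rng.normal(48.0, 12.0, 1139)
raw_on_y = linear_equate(raw_x, raw_x.mean(), raw_x.std(ddof=1), 52.0, 10.0)

# Separate calibration demo: hypothetical anchor-item difficulties
# from two independent calibration runs of the same items.
b_x = rng.normal(0.2, 1.1, 20)       # anchor items, form-X run
b_y = 0.9 * b_x - 0.15               # same items, form-Y run
A, B = mean_sigma_link(b_x, b_y)
theta_public_linked = A * theta_public + B  # public abilities on the common scale

# Compare group means on the common scale; ttest_rel mirrors the
# t(1138) degrees of freedom reported in the abstract.
t_stat, p_val = stats.ttest_rel(theta_public_linked, theta_private)
print(f"t({len(theta_public) - 1}) = {t_stat:.3f}, p = {p_val:.4f}")

In the common-item design the abstract implies, the slope and intercept would be estimated from anchor items shared across administrations; characteristic-curve alternatives such as Stocking-Lord linking are also widely used for the separate calibration step.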
Article Details
Authors who publish in this journal retain full copyright and can publish their work under a Creative Commons Attribution-ShareAlike 4.0 International License, which allows others to share and adapt the work with proper attribution to the authors and a link to the license. Authors may also enter into separate, non-exclusive distribution agreements, such as posting to institutional repositories or publishing in books, with acknowledgment of the original publication. Additionally, authors are encouraged to share their work online to increase citations. The journal allows third parties to share and adapt the work with appropriate credit, a link to the license, and an indication of changes made. For more information on Open Access, see The Effect of Open Access.