Hello everybody!
I am currently trying to implement a computerized adaptive test (CAT) for reasoning. After reading a lot, I am still unsure about some of the steps for item bank calibration, so I thought I could get some advice here.
What I did:
- developed items using an item generator
- set up a (sequential) online test to collect initial data from the target population
- collected data in 10 blocks, each consisting of 16 different items and 4 anchor items
- cleaned the data (no missing values, no click-throughs, etc.)

This leaves data from about 100 to 400 participants for every block.
What I plan on doing now:
- test for Rasch model fit in every block
- test for DIF and remove bad items from every block
- chain-link the remaining items using a common-item non-equivalent groups design
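To make the linking step concrete, here is a minimal sketch of mean-mean linking under the Rasch model, where putting a block on the reference scale only requires adding a constant. It assumes the same anchor items appear in both the reference block and the new block and that their difficulties have already been estimated separately. The function names, item labels, and difficulty values are made up for illustration; this is one simple linking method, not necessarily the best one for this design.

```python
from statistics import mean


def mean_mean_shift(anchor_ref, anchor_new):
    """Return the constant to add to new-block difficulties.

    Both arguments map item label -> estimated Rasch difficulty.
    Only items present in both mappings are used as anchors.
    """
    common = sorted(set(anchor_ref) & set(anchor_new))
    if not common:
        raise ValueError("no common anchor items between the two blocks")
    # Under the Rasch model the scales differ only by a shift,
    # so the mean anchor difference estimates that shift.
    return mean(anchor_ref[item] - anchor_new[item] for item in common)


def link_block(block_difficulties, shift):
    """Move every difficulty of a block onto the reference scale."""
    return {item: b + shift for item, b in block_difficulties.items()}


# Hypothetical estimates (logits): anchors A1-A4, one new item N1.
ref_anchors = {"A1": -0.50, "A2": 0.00, "A3": 0.40, "A4": 1.10}
new_block = {"A1": -0.80, "A2": -0.30, "A3": 0.10, "A4": 0.80, "N1": 0.25}

shift = mean_mean_shift(ref_anchors, new_block)
linked = link_block(new_block, shift)
print(round(shift, 2), round(linked["N1"], 2))
```

Chaining would repeat this block by block, each time using the already linked block as the reference. With only 4 anchors per block, a single drifting anchor can move the shift noticeably, so checking the individual anchor differences before averaging seems sensible.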
This procedure should result in a calibrated item bank that I could then feed to my CAT, right? Would you do it the same way or is something wrong or missing? What would be suitable variables for testing for DIF? I thought about gender, age and level of education.
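To make the DIF question concrete, here is a minimal sketch of one common check, the Mantel-Haenszel common odds ratio for a single item, with participants matched on a stratum such as a total-score band. It is only an illustration of what a grouping variable (e.g. gender) would be used for, not necessarily the method to use with a Rasch calibration. The function names, the record layout, and the counts are assumptions made up for the example.

```python
import math
from collections import defaultdict


def mantel_haenszel_odds_ratio(records):
    """Common odds ratio across strata for one item.

    records: iterable of (stratum, group, correct) where group is
    "ref" or "focal" and correct is 0 or 1.
    """
    # Per stratum: [ref correct, ref wrong, focal correct, focal wrong]
    tables = defaultdict(lambda: [0, 0, 0, 0])
    for stratum, group, correct in records:
        offset = 0 if group == "ref" else 2
        tables[stratum][offset + (0 if correct else 1)] += 1

    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in tables.values():
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("odds ratio undefined for these counts")
    return numerator / denominator


def expand(stratum, ref_correct, ref_wrong, focal_correct, focal_wrong):
    """Turn cell counts into individual records (helper for the example)."""
    return (
        [(stratum, "ref", 1)] * ref_correct
        + [(stratum, "ref", 0)] * ref_wrong
        + [(stratum, "focal", 1)] * focal_correct
        + [(stratum, "focal", 0)] * focal_wrong
    )


# Hypothetical counts for one item in two score strata.
records = expand("low", 8, 2, 4, 6) + expand("high", 9, 1, 6, 4)

alpha = mantel_haenszel_odds_ratio(records)
# ETS delta scale: negative values mean the item is harder for the focal group.
delta = -2.35 * math.log(alpha)
print(round(alpha, 2), round(delta, 2))
```

A categorical variable such as gender fits this directly. Age or level of education would first have to be split into two groups (or compared pairwise), which is a choice that affects the result.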
Thank you very much for your advice!