Here are some non-speculative snippets of information about how the NBME makes USMLE Step 1 tests: Article 1 Article 2
"Dozens of test forms are used, with examinees randomly assigned to forms. Test sessions are scheduled for eight hours… Sections and items within sections are presented in random order."
Indices used to quantify question difficulty and quality:
- item difficulty (P value) - calculated as the proportion of examinees who responded to the item correctly
- logit transform of the item difficulty - calculated as -log[p / (1 - p)], where p is the item difficulty (P value)
- index of item discrimination: the item-total (biserial) correlation - the correlation between the item (scored 0/1 for incorrect/correct) and the reported total score
- r-to-z transformation of the biserial correlation - commonly used to correct for nonlinearities in the magnitude of correlation coefficients
- mean response time in seconds
- mean of the natural logs of response times
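If it helps to see how these indices fit together, here's a rough Python sketch of how they could be computed for a single item. This is not the NBME's code. The function names and the sample data are made up for illustration, the log in the logit is assumed to be the natural log, and the biserial is computed with the standard textbook formula (difference in mean total score between correct and incorrect responders, scaled by the standard deviation and the normal-curve ordinate), which may differ in detail from what the articles used.

```python
import math
from statistics import NormalDist, mean, pstdev


def item_difficulty(responses):
    """P value: proportion of examinees who answered correctly (0/1 responses)."""
    return sum(responses) / len(responses)


def logit_difficulty(p):
    """-log[p / (1 - p)], natural log assumed; undefined when p is 0 or 1."""
    return -math.log(p / (1 - p))


def biserial(responses, totals):
    """Item-total biserial correlation between 0/1 responses and total scores."""
    p = item_difficulty(responses)
    q = 1 - p
    mean_correct = mean(t for r, t in zip(responses, totals) if r == 1)
    mean_incorrect = mean(t for r, t in zip(responses, totals) if r == 0)
    sd = pstdev(totals)  # population standard deviation of the total scores
    # Height of the standard normal density at the point that cuts off area p.
    std_normal = NormalDist()
    y = std_normal.pdf(std_normal.inv_cdf(p))
    return (mean_correct - mean_incorrect) / sd * (p * q) / y


def fisher_z(r):
    """r-to-z transformation, atanh(r); only defined for -1 < r < 1."""
    return math.atanh(r)


def time_indices(times_sec):
    """Mean response time in seconds and mean of the natural logs of the times."""
    return mean(times_sec), mean(math.log(t) for t in times_sec)


# Hypothetical data for one item answered by eight examinees.
responses = [1, 1, 1, 1, 0, 0, 0, 0]
totals = [250, 230, 220, 240, 210, 240, 200, 230]
times = [60, 75, 90, 45, 120, 80, 100, 70]

p = item_difficulty(responses)
r_bis = biserial(responses, totals)
print("P value:", p)
print("logit:", logit_difficulty(p))
print("biserial:", r_bis)
print("Fisher z:", fisher_z(r_bis))
print("mean time, mean log time:", time_indices(times))
```

One caveat: with small samples the biserial can come out above 1, in which case the r-to-z step fails, so a sketch like this is only meaningful on reasonably large groups of examinees.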
In article 2, the placement of experimental (unscored) questions within a block was interesting: of the two experimental items, one was included at a random position in the section and the other was always the last question in the block.
Just thought I'd share some of the info I found while obsessing over Step 1.