Understanding Scaled Scores
The ABPMR administers a different version of the Part I examination each year. Because these versions differ slightly in difficulty, the number-correct scores of Candidates tested with different versions cannot be compared directly. A statistical process called equating is therefore conducted to put scores from all test versions on the same scale. To aid score interpretation, this scale is usually quite different from the number-correct scale. For example, the Part I scale, which ranges roughly from 200 to 800, was defined so that Candidates who tested for the first time in 1998 had an average scaled score of 500. Each possible number-correct score on the version of the exam administered in 1998 had a corresponding scaled score. Note that these scores are neither percentages nor constructed on a curve where a certain percentage of Candidates would pass or fail.
Procedurally, scaling is simply a pair of conversions. For each test, the first conversion, equating, converts the number correct on the current exam version to the equivalent number correct on the version administered in 1998. The second conversion then translates 1998 number-correct scores into scaled scores. The 1998 exam thus serves as the reference for all subsequent exams.
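The pair of conversions can be sketched as two table lookups. This is only an illustration: the table names and every number in them are hypothetical, since the actual equating and scaling tables are not given here.

```python
# Hypothetical conversion tables; all values are made up for illustration.

# Step 1 (equating): number correct on the current version
# -> equivalent number correct on the 1998 version.
EQUATE_TO_1998 = {200: 204, 201: 205, 202: 206}

# Step 2 (scaling): number correct on the 1998 version -> scaled score.
SCALE_FROM_1998 = {204: 498, 205: 501, 206: 504}


def scaled_score(number_correct: int) -> int:
    """Apply the two conversions in order: equate, then scale."""
    equated = EQUATE_TO_1998[number_correct]
    return SCALE_FROM_1998[equated]
```

With these made-up tables, a Candidate with 201 correct on the current version is treated as having 205 correct on the 1998 version, which corresponds to a scaled score of 501.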
The 1998 exam also served as the basis for setting the initial passing standard. Expert review of the 1998 exam by a panel representing a broad constituency of physiatrists established the pass-fail point, called the "cut score," through a process called standard setting. The cut score represents the standard of performance required to pass the examination. Those who meet or exceed the standard pass the exam. The conversion of number-correct to scaled scores was chosen so that the cut score mapped to a scaled score of 405. Because the equating process ensures that performances on new versions of the exam are converted to equivalent performances on the 1998 exam, the passing score can remain constant even as the exam changes slightly from version to version.
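A minimal sketch of how a constant scaled cut score can coexist with different number-correct requirements on different versions. Only the cut score of 405 comes from the text; the version names and all table values are hypothetical.

```python
CUT_SCORE = 405  # scaled passing score stated in the text

# Hypothetical number-correct -> scaled-score tables for two exam versions.
# In this made-up example version_b is slightly harder, so fewer correct
# answers are needed to reach the same scaled score.
HYPOTHETICAL_SCALED = {
    "version_a": {180: 402, 181: 405, 182: 408},
    "version_b": {177: 402, 178: 405, 179: 408},
}


def passes(version: str, number_correct: int) -> bool:
    """A Candidate passes when the scaled score meets or exceeds the cut score."""
    return HYPOTHETICAL_SCALED[version][number_correct] >= CUT_SCORE
```

In this illustration, 181 correct on version_a and 178 correct on version_b both map to a scaled score of 405 and both pass, while one fewer correct answer on either version falls just below the standard.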
Content Area Scoring
Examination scores are not just a matter of pass or fail. The scores are also structured to reveal Candidates' strengths and weaknesses by content area.
Domain section scores range from 1 to 10. This 1-to-10 point range is intentionally narrow, because the section scores are often based on a small number of test items. Scores based on such small numbers of items should be interpreted with caution, because they may not reflect how a Candidate would perform on a larger sample of items.
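One way to see why small item counts call for caution is the textbook binomial standard error of a proportion correct. This is a general statistical illustration, not a description of how the ABPMR computes section scores; the function name and the example item counts are assumptions.

```python
import math


def standard_error(proportion_correct: float, n_items: int) -> float:
    """Binomial standard error of an observed proportion correct on n_items."""
    return math.sqrt(proportion_correct * (1 - proportion_correct) / n_items)


# Same observed proportion correct, very different precision.
se_short_section = standard_error(0.7, 10)   # about 0.145
se_long_section = standard_error(0.7, 100)   # about 0.046
```

Under this simple model, a Candidate answering 70% of 10 items correctly has an uncertainty of roughly 14 percentage points around that figure, about three times the uncertainty of the same result on 100 items.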