# Statistical Analysis of Medical Board Exams

#### 1stGenrationDoc

##### Junior Member
10+ Year Member
7+ Year Member

Over the years I have spoken with several professors, consultants, and individuals involved with the scoring of medical board exams (it amazed me how some people believe there is a 'curve' while others say there is a 'minimum requirement' needed to pass). While the actual methods are not released, everyone tends to agree that a multiple-choice exam shouldn't take 6-8 weeks to score.

First of all...let me inform you that the exam is NOT meant to 'test your knowledge', as has been emphasized to me several times over the years. Does that surprise you? It did me at first, until I was informed that they do not need to test medical students' knowledge because that is what medical school does! So, in your opinion, if the exam does not test one's knowledge, then what exactly is it testing? If it WERE testing your knowledge, the exam would not need to be timed.

Secondly, statisticians love numbers...but not necessarily common sense, which is unfortunate since these are the people who decide who passes and fails. Statisticians try to make board exams 'FAIR' by keeping them standardized. If medical board exams use statistics to make them 'standardized', 'equal', or 'fair', then why does each test (and even each section) vary so much in the number of words used?! Doesn't this introduce statistical error?

Additionally, it is interesting to note that performance on standardized exams has been shown to have no correlation with performance as a physician; in fact, a few years ago it was published that there was actually an inverse correlation between performance on these exams and performance in the third year of medical school.

So my final question to you all is: if these exams do not test our knowledge, aren't 'fair', and have no correlation with how one will perform as a physician, why are physicians required to take them?

#### turkeyjerky

10+ Year Member
Lol at the idea that Step 1 performance is actually inversely correlated with third-year performance...man, where do people come up with this stuff

#### bbydoc

Step 1 is an excellent tool for residency programs to compare applicants.
You are right that its purpose is not merely to test knowledge, but knowledge testing is a big part of it.


#### 1stGenrationDoc

##### Junior Member
10+ Year Member
7+ Year Member
Do I catch a hint of cynicism? Don't believe me...look for yourself!

While I had initially included about 10 journal articles for your convenience, I finally decided to just post a link to one that summarizes most of them in a meta-analysis-like format.

(Read it all...or scroll to page 45-46)

Go ahead and look up journal articles yourself, but be CAREFUL if you find positive correlations, because those show MCAT correlating with USMLE correlating with yet another standardized exam. PERFORMANCE has a negative correlation with these exams.

#### turkeyjerky

10+ Year Member
> Do I catch a hint of cynicism? Don't believe me...look for yourself!
>
> While I had initially included about 10 journal articles for your convenience, I finally decided to just post a link to one that summarizes most of them in a meta-analysis-like format.
>
> (Read it all...or scroll to page 45-46)
>
> Go ahead and look up journal articles yourself, but be CAREFUL if you find positive correlations, because those show MCAT correlating with USMLE correlating with yet another standardized exam. PERFORMANCE has a negative correlation with these exams.

I don't know what you're talking about, dude. On page 9 of that paper it notes numerous mild-to-moderate positive correlations between NBME exams and clinical performance. Methinks you don't know what "inversely" means.


#### 1stGenrationDoc

##### Junior Member
10+ Year Member
7+ Year Member
DUDE

Im sorry that this information upsets and/or frustrates you and can see how you may be displacing or projecting those feelings on to me; but do not fret as I do not personally take any offense (as I can understand why this would bother you personally).

I searched for the information you suggested on page 9, and I think you may have misinterpreted it yourself, as you didn't include all the information. This would be considered a logical fallacy of the faulty-generalization type known as cherry picking. I also believe there may be an informal fallacy present known as appeal to ridicule.

I will instead quote material from the journal article (pages 8-9) for everyone's convenience to counter your argument, material which actually comes directly from the USMLE! So, to make you feel better, please know that you picked a good example!

> ...excellent meta-analysis featuring data from the USMLE. In this review, Hamdy et al (2006) conclude: "The studies included in the review and meta-analysis provided statistically significant mild to moderate correlations between medical school assessment measurements and performance in internship and residency. Basic science grades and clinical grades can predict residency performance." The authors also concluded that, as might be hoped, performance on similar measurement instruments is better correlated than performance on different instruments. So NBME II scores correlate well with NBME III scores, medical school clerkship grades correlate well with supervisor ratings of residents, and OSCE scores correlate well with supervisor ratings of residents, when similar constructs are assessed.

To clarifythe information data from the USMLE/NBME find mild to moderate correlations between medical school assessment measurements (i.e. basic science grades and clinical grades) and performance in internship and residency.

It also emphasizes what I posted earlier, in that performance on similar instruments is better correlated than performance on different instruments. So those who do well on a standardized exam will do well on other standardized exams, while those who do well on clinical rotations will do well on supervisor ratings as residents. This makes logical sense, doesn't it? What exactly does taking a timed, standardized, multiple-guess test have to do with being a physician anyway? Again, I state that these exams are NOT testing one's knowledge, but I still wonder exactly what they are testing?!

One professor suggested to me that they are testing 'conformity', or your ability to think the same. Is that good or bad, though? As an example, I will give you a true story of an Internal Medicine resident who was extremely knowledgeable, bright, and whom many even believed 'smarter' than faculty members; yet he failed his Step 3 exam three times!!! His problem...he was not 'conforming' and always got the answer wrong because of things 'normal' residents wouldn't think of; but once he took the exam 'answering the questions like everyone else did'...he did amazingly well! So is it good or bad to be a conformist? That is the question.

Thank you for playing...I think this is becoming a worthwhile debate.

#### MediumDef

10+ Year Member
5+ Year Member
"2.2.1 There is clear evidence from a variety of sources that performance on national licensing exams is a moderate predictor of performance in later clinical practice, by a variety of measures and outcomes."

This is the conclusion the authors seem to have come up with. I admit that I didn't peruse all 100+ pages, but, like others, I would be interested to see where you found that better performance on standardized exams correlated with poorer performance on clinical rotations.


#### 1stGenrationDoc

##### Junior Member
10+ Year Member
7+ Year Member
It is TRUE that this is a direct quote from the paper, but unfortunately, this section conclusion (on page 10) is followed by a paragraph that starts with the word "However":

> However, this is information which relates to assessment performance only, usually through cognitive knowledge measures, or less frequently through skills measures. A separate body of work has looked at the third element of Bloom's taxonomy, which in this context might be called professionalism. This is addressed in Section 4.

Assessment performance is measured by (you've guessed it) yet another standardized exam.

Furthermore, this was not the ONLY conclusion listed, as there were a total of 18 (EIGHTEEN) such section conclusions throughout the paper (all listed on pages 3 and 4). Preceding these conclusions is a paragraph that gives each a qualitative description based on effect size: those below 0.3 are described as 'low' or 'small', those around 0.5 as 'medium' or 'moderate', and those around 0.8 as 'high', 'good', or 'strong'. These section conclusions are then followed by 5 General Conclusions (page 4) and a list of 5 Recommendations (page 5) for an "evidence based, acceptable, highly defensible" selection approach made by the authors.
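The paper's qualitative convention above can be sketched as a tiny helper. This is only an illustration: the function name is mine, and the 0.65 boundary between 'moderate' and 'strong' is my own interpolation between the paper's "around 0.5" and "around 0.8", which are fuzzy by design.

```python
def describe_correlation(r):
    """Map a correlation coefficient to the qualitative labels described above.

    Below ~0.3 -> 'small'; around ~0.5 -> 'moderate'; around ~0.8 -> 'strong'.
    The 0.65 cut point is an assumed midpoint, not taken from the paper.
    """
    magnitude = abs(r)          # sign only indicates direction, not strength
    if magnitude < 0.3:
        return "small"
    if magnitude < 0.65:
        return "moderate"
    return "strong"
```

So the "mild to moderate" correlations the thread keeps arguing about sit mostly in the 0.3-0.5 band, well short of the ~0.8 the paper would call "strong".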

Also, to clarify, I did NOT say that better performance on standardized exams correlated with poorer performance on clinical rotations. Many individuals who do extremely well on these standardized exams are extremely smart, as they have somehow figured out how to think like a robot: after information is input, they can easily output a conforming answer. What I did say was that I found it interesting to know that performance on standardized exams has been shown to have no correlation with [the] performance of a physician; and in fact, a few years ago it was published that there was actually an inverse correlation between performance on these exams and performance in the third year of medical school.

Where did I get this information??? As you know, journal articles are limitless, so I think the easiest thing would be to kindly refer you to anywhere between pages 46-97 of this document and pick any journal article that answers your question more specifically. In statistics, we generally try to PROVE ourselves wrong instead of looking for something to PROVE ourselves right. It might seem counterintuitive, but it is actually impossible to prove yourself right! One must only accept truth when one fails to prove the null hypothesis.

#### MossPoh

##### Textures intrigue me
10+ Year Member
7+ Year Member
And how exactly does one test performance as a physician relative to their board scores? Do they send out surveys and collect data on every single physician at the time they took the boards, and again 10 years later? 15 years later? 20? Do they analyze the relative riskiness of treating each patient to tease out the ones who just do the basic and easy stuff? Do they factor in personal and life factors that can decrease performance? What about relative time studied for boards versus score? Personal matters during that time? The temperature of the testing facility and relative comfort of the person?

There are far too many variables to simply make it an "x = y, y = z, therefore x = z" type of equation.

I'm sorry, but if you held all things equal and gave me two applicants identical in every way, with one testing in the 90th percentile and the other in the 50s, I'm still going to take the one in the 90th. The thing is, nothing is ever held equal. There are too many factors and angles to view things from.
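The "x = y, y = z, therefore x = z" chain above fails for correlations too: correlation is not transitive. A minimal stdlib-only simulation (hypothetical variables, not data from any study in the thread) makes the point by building Y from two independent pieces:

```python
import random
from statistics import fmean

def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    mx, my = fmean(xs), fmean(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(42)
n = 5000
x = [rng.gauss(0, 1) for _ in range(n)]   # e.g. "exam score"
z = [rng.gauss(0, 1) for _ in range(n)]   # independent of x
y = [a + b for a, b in zip(x, z)]         # y shares variance with both

r_xy = pearson(x, y)   # ~0.7: x tracks y
r_yz = pearson(y, z)   # ~0.7: y tracks z
r_xz = pearson(x, z)   # ~0.0: yet x says nothing about z
```

Two links of roughly 0.7 each, and the ends of the chain are uncorrelated; which is exactly why "A correlates with B, B correlates with C" licenses no conclusion about A and C.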

#### bbydoc

"DUDE&#8230;"

Again, I state that these exams are NOT testing ones knowledge, but I still wonder exactly what they are testing?!

One professor suggested to me that they are testing &#8216;conformity' or your ability to think the same. Is that good or bad though? As an example, I will give you a true story of an Internal Medicine resident who was extremely knowledgeable, bright, and many even believed &#8216;smarter' than faculty members; yet he failed his Step 3 exam three times!!! His problem&#8230;he was not &#8216;conforming' and was always got the answer wrong because of things &#8216;normal' residents wouldn't think of; but once he took the exam &#8216;answering the questions like everyone else did'&#8230;he did amazing! So is it good or bad to be a conformist? That is the question.
My comment earlier on was indeed sarcasm.
Being in the UK system, I was quite familiar with the content of the paper, as it has been widely discussed and is still being discussed among my peers and in medical journals; after all, by the time I graduate the selection process will most certainly have changed.

This is an excellent point (above, in bold) and was one of the main reasons why the UK has rejected introducing a standardized test for selection purposes, as it would lead to conformity (i.e., all medical schools would try to teach the knowledge required to excel on the standardized test).
In the UK they appreciate the variety of graduates produced by different medical schools, and these graduates are deemed qualified enough when they pass their medical school exams.

#### turkeyjerky

10+ Year Member
Let me ask you a question: have you actually taken a USMLE or shelf exam, or is this all based on hearsay from disgruntled professors who have something against standardized exams?

#### jaguar1

IMHO, standardized testing is obviously required and necessary for benchmarking performance. However, when questions are made convoluted so that they confuse you between two or three choices purely on the basis of wording, it feels like "they are out to fail you". So you reapply, and they make more money.
I mean, seriously, I don't mind Step 2 questions that straightforwardly test whether or not you know your material.

Lastly, this was more or less a rant, but it is the reality, so no whining; you've got to man up to this test and face it!


#### 1stGenrationDoc

##### Junior Member
10+ Year Member
7+ Year Member
Hmpf...for some reason, SDN didn't email me when there were new posts on this thread. Anyways...

To start, I would like to say that this thread was NEVER intended to get people so upset or angry, but rather just to get a feeling for what people on SDN thought about the board exams (since SDN is 'known' as the place to go for 'gunners' training for the exam or 'lost souls' who feel they have nowhere else to turn...not really the 'in-between people'). This is WHY it is such a good group for a debate.

While I do understand that strong opinions equal strong emotions, I would appreciate it if individuals who would like to continue contributing to this debate kept 'ridicule' and 'name-calling' out of their opinions. I apologize if you feel the responses I have written made you feel attacked, but please trust that they were never intended to be taken as such, and for that I do sincerely apologize.

Sadly, I do not have time to reply to all the questions and comments posted since my last diatribe, but will soon.

#### turkeyjerky

10+ Year Member
> Hmpf...for some reason, SDN didn't email me when there were new posts on this thread. Anyways...
>
> To start, I would like to say that this thread was NEVER intended to get people so upset or angry, but rather just to get a feeling for what people on SDN thought about the board exams (since SDN is 'known' as the place to go for 'gunners' training for the exam or 'lost souls' who feel they have nowhere else to turn...not really the 'in-between people'). This is WHY it is such a good group for a debate.
>
> While I do understand that strong opinions equal strong emotions, I would appreciate it if individuals who would like to continue contributing to this debate kept 'ridicule' and 'name-calling' out of their opinions. I apologize if you feel the responses I have written made you feel attacked, but please trust that they were never intended to be taken as such, and for that I do sincerely apologize.
>
> Sadly, I do not have time to reply to all the questions and comments posted since my last diatribe, but will soon.

Hmm...I think someone is projecting (sorry--on psych right now)

#### Dirt

10+ Year Member
7+ Year Member
I will sum this up for you.

Residencies need a cheap, easy way to evaluate everyone on the same standardized scale, because if, as you say, medical school is what tests your knowledge, each school does it differently and residency programs have no way of knowing whether it is reliable. A multiple-choice test is, simply put, easy and reliable.

While Step 1 may not be perfect at telling who will be a great doc, it not only tests your "medical knowledge", it tests an important skill: your ability to digest information, process it, and come to a conclusion in a set amount of time. This skill is highly necessary in the majority of clinical medicine. For example, a patient walks in your door and tells you X, Y, and Z. You have to digest this information, think about what he/she may have, think about what tests you would need to order, and think about what treatments you may need to prescribe, all in 15 minutes. This is not all that different from a Step 1 test question. The major difference is that Step 1 often tests rare things, whereas in the real world you often treat the same common things over and over again.

As was mentioned before, there are far too many variables that go into being a good clinician to isolate and thus research. Any research that attempts to do this is going to fall victim to confounding a million times over.

Lastly, everyone knows that Step 1 is not a perfect judge of what makes a good doctor, which is why they consider many other things when you apply to residency. If ERAS were just a name and a board score I could see your beef, but really this just sounds like sour grapes from someone who didn't do well.

> In statistics, we generally try to PROVE ourselves wrong instead of looking for something to PROVE ourselves right. It might seem counterintuitive, but it is actually impossible to prove yourself right! One must only accept truth when one fails to prove the null hypothesis.

That's actually an "appeal to ignorance" fallacy, not statistics. In statistics, nothing is ever proven, neither the alternative nor the null hypothesis - you can't "prove" yourself wrong any more than you can "prove" yourself right. If p < 0.05, the null hypothesis is considered to be unlikely, not proven wrong. The threshold of 0.05 itself is arbitrary.

Coming back to this thread, several others have cited evidence that step 1 scores are a mild/moderate predictor of clinical performance. You reject that argument because it is measured by a mix of both knowledge-based and skill-based assessments, and then go tell your audience to find the information supporting your argument for themselves. Not very persuasive.

> As an example, I will give you a true story of an Internal Medicine resident who was extremely knowledgeable, bright, and whom many even believed 'smarter' than faculty members; yet he failed his Step 3 exam three times!!! His problem...he was not 'conforming' and always got the answer wrong because of things 'normal' residents wouldn't think of; but once he took the exam 'answering the questions like everyone else did'...he did amazingly well! So is it good or bad to be a conformist? That is the question.
Going back to statistics, I'll just say "n=1."

I'll give you another true story: there's an internal medicine resident I know who did amazingly well and was believed to be 'smarter' than many faculty members. He got high scores on all his Step exams.

Not an interesting story, is it? That's because it's not at all uncommon.

> I did NOT say that "better performance on standardized exams correlated with poorer performance on clinical rotations"...What I did say was that..."there was actually an inverse correlation with performance of these exams with performance in the third year of medical school."

I'm not sure I can make the contradiction any clearer than that (at least, not without walking everyone through a needless description of inverse correlation).

> While I do understand that strong opinions equal strong emotions, I would appreciate it if individuals who would like to continue contributing to this debate kept 'ridicule' and 'name-calling' out of their opinions. I apologize if you feel the responses I have written made you feel attacked, but please trust that they were never intended to be taken as such, and for that I do sincerely apologize.

As others have already pointed out, your comments reveal an almost textbook pattern of projection. If you go to the start of this thread and re-read, I think you'll see that no one else is nearly as riled up as you on this subject. The use of boldface, capital letters, the dismissive "thank you for playing", and "I can understand why this would bother you personally" (as if you knew the poster personally) - this doesn't seem like a purely academic exercise to you.

