Questions about the USMLE

So just out of curiosity, what are your thoughts on making the MCAT pass/fail and why?

The problem with that approach would be defining a passing threshold. The MCAT is different things to different medical schools. The USMLE was built to be one thing to the licensing boards: yes or no.

I would be in favor of not differentiating MCAT scores above a certain percentile.
 

Would the AAMC recommendation of 500 = pass work? I'm guessing not but was curious because apparently AAMC believed 500+ scores are correlated with med school board pass rates?

Also slightly related:

Going a little reductio ad absurdum, aren't we?

If Step 1 went P/F tomorrow there would still be all the other elements of ERAS that program directors already use to make decisions: LORs, the MSPE, clerkship grades, Step 2 CK, etc.

If Step 1 were made P/F, would that impose further pressure on Step 2 CK since it's scored? So a possible outcome could be Step 1 dedicated periods being replaced by a dedicated period for Step 2 CK, but there's also a risk of people prioritizing Step 2 CK over clinical years?

I'm wondering whether making preclinicals unranked pass/fail was really a good idea after all. At least graded preclinicals forced students to concentrate on doing well in class and not just brush it off completely for Step 1.
 
Lawper said:
Would the AAMC recommendation of 500 = pass work? I'm guessing not but was curious because apparently AAMC believed 500+ scores are correlated with med school board pass rates?

The dividing line between an acceptable board passage rate and an unacceptable one would remain somewhat arbitrary.

Lawper said:
If Step 1 were made P/F, would that impose further pressure on Step 2 CK since it's scored? So a possible outcome could be Step 1 dedicated periods being replaced by a dedicated period for Step 2 CK, but there's also a risk of people prioritizing Step 2 CK over clinical years?

Step 2 CK is a very different exam, and the preparation for it is completely unlike Step 1. If anything, it’s probably a better measure than Step 1, because the content is more clinically relevant. It has historically been undervalued because it was not required to sit for it before residency application season.

Lawper said:
I'm wondering whether making preclinicals unranked pass/fail was really a good idea after all. At least graded preclinicals forced students to concentrate on doing well in class and not just brush it off completely for Step 1.

It’s a trade off, but I’ll take a relatively collaborative and collegial environment over the old days.
 
Going a little reductio ad absurdum, aren't we?

If Step 1 went P/F tomorrow there would still be all the other elements of ERAS that program directors already use to make decisions: LORs, the MSPE, clerkship grades, Step 2 CK, etc.
Not at all, going off what our deans tell us here.

We have no preclinical grades or ranking in the MSPE, no AOA (until after the match), and are advised not to take step 2 until after applying to residency unless we got a low step1 we need to compensate for. We're told the norm is to have strong letters, much like med school admissions where a lukewarm LoR is a very bad thing that shouldn't be common. And while our school does try to keep clinical Honors to a minority each rotation so it means something, I know some peers (e.g. Yale) give a large majority Honors. If I recall correctly, I even once read a Yale student say the most common transcript to graduate with is full Honors.

If you switch step1 to pass/fail and aren't allowed to know school names, what's truly left on an app like what my school produces? The bolded stuff is accounted for, and as for the "etc.", the only things I can come up with that are actually used much are:

1) do we know of you personally from a rotation you did with us, or personally know someone vouching for you, and
2) did you adequately fluff up your research/extracurricular entries

Neither of those sounds like something I want controlling people's futures.
 

I don't know where the idea of blinding to school name is coming from. If it has been offered as a serious proposal please link me to a source, I would be interested to see it.

LOR's do provide useful information, which is why they are the second most commonly cited factor in granting an interview, with an average rating of 4.2 (vs. 4.1 for Step 1).

The MSPE may contain no preclinical grades or rank, but the large majority contain comparative information, in line with the AAMC recommendations. This may be why the MSPE is the third most commonly cited factor in granting an interview, with an average rating of 4.0 (vs. 4.1 for Step 1).

If Step 1 goes P/F, the optional nature of Step 2 would vanish overnight. It has already become the fourth most commonly cited factor for granting an interview, with an average rating of 4.0 (vs. 4.1 for Step 1).

And so on.

[Attached screenshot: Screen Shot 2019-04-10 at 5.50.15 PM.png]


Here are the next 14:
[Attached screenshot: Screen Shot 2019-04-14 at 7.06.41 AM.png]


You also have to bear in mind that there are 5,000 residency programs, so there are 5,000 idiosyncratic ways of evaluating applications.

The other potential way to limit Step 1's use in the application process would be to address the phenomenon of over-application, but ERAS seems to have no stomach for limiting its revenue growth.

Lastly, the only thing Yale is representative of is Yale.
 
You can't address over application without screwing some people over. Some people match at #30 or lower.

The match rate for US allopathic seniors is the same now as it was before over-application became so widespread. Someone who matches at #30 or lower today would have matched at #15 or lower 20 years ago, because there were fewer interviews and shorter rank lists. It's still a zero sum game.
 
I could imagine a future in which medical school is as selective as it is now, and people show they are capable of practicing any field intellectually by getting into school and passing. Then in the end they select their residency and their residency selects them based on their interests and abilities rather than sorting the high step 1 scorers into the highly reimbursed specialties.

What if the person who just passes med school were just as likely to do ortho as to do FM? And what if the person who is junior AOA were as likely to go into pediatrics as dermatology? Just because that’s what peeled their banana, what they were good at and liked? You’d probably need to blind residency programs to school name and even have a more rational healthcare payment model, but I’m dreaming big.
 
Would making Step 1 P/F nearly destroy US citizens' chances at matching from an IMG?
 
Probably, but the path to residency for IMGs has been narrowing every single year, simply because the number of medical school graduates is increasing faster than the number of residency slots. Unfortunately, there are still many students who take the IMG gamble. It might seem callous, but personally I think it would be better if something happened to discourage any more people from going down that predatory pipeline.
 
Oh boy. Lots to get caught up on this thread for me. Sorry, some of these quotes are from Page 1

The impetus for change is the fact that what was originally intended to be a test of minimal foundational medical knowledge has evolved into an all-consuming monster, one that is causing significant harm to medical education and student health. In the past couple of years everyone has reached broad consensus that it's a serious problem, and the time has come to do something about it.

The fact that the exam is designed to test minimal knowledge for licensing, and that residency programs are now misusing it, is often stated. IMHO, this is ridiculous. Regardless of how it was initially developed, using it for something else is fine. Does a better step score = a better resident? More on that later. But there's no reason an exam can't fulfill two separate goals at the same time.

Whether "everyone has reached a broad consensus" is debatable, since no one has talked to program directors at all about this. IM programs that take mostly IMG's depend on Step scores, as the rest of the elements of their applications that you reference in other posts are mostly useless.

Only happy medium I can think of would be a broad-strokes categorization like High Pass (top 1/3rd), Pass (most of the remaining 2/3rds), and Fail. That would be a nice step down from the insane competitiveness of striving for the top ~15%, without making it worthless to work harder than the bare minimum in preclinical years. It would suck to narrowly miss the cutoff, but then again someone in the high 230s wasn't going to be very competitive for stuff like Derm and Ortho prior to the score change anyways.

This is being discussed -- either quartiles, or some other categorization. The "problem" as you mention is that you create an artificial cutoff -- a 234 is a "Pass" and a 235 is a "Pass with Honors", yet the two scores are really the same. One could argue that if PD's screen out everyone with a S1 score less than 240 then this really isn't any worse, but some people will win and some will lose with this type of system. I could live with it.

Capping would work I think only if there is a guarantee of matching. It's not really fair to make applicant A who has below-average stats only apply to 10 programs or whatever, since the more programs he applies to, the better his chance of matching. Not sure how you do that though, unless you did it like the military, where you list your top two choices of specialty and then your ranked list of locations, and someone takes you.

Capping applications is very controversial; it would mean that applicants would need to pick their applications very carefully. There have been suggestions floated that the number of apps be unlimited, but interviews be capped. And, if any limit on apps was placed there would be an exemption for couples -- I expect multiple medical students would "say" they are couples matching just to get more apps, then not couples match in the NRMP.

Residency programs were capable of choosing highly qualified applicants in the decades before Step 1 became the be-all and end-all. The current arms race is largely a product of 1.) over-application to residency programs, and 2.) the further commercialization of the exam/exam prep industry.

I believe that they could make Step 1 P/F tomorrow and it would not have much effect on the outcome of the match. Programs would simply shift to other criteria that they already evaluate: Step 2 CK, clerkship grades, elective grades, shelf exams, sub-I grades, audition rotations, the MSPE, faculty LOR's, etc. Step 1 has never been the only piece of the puzzle, it's just the easiest one to obsess over.

I'd add 3) the growth of social media driving this insane behavior. Which is why I don't think that changing S1 to P/F will fix anything. Medical students will pick the next thing -- probably S2 -- and just focus on that. Students will have "first aid for S2" on day one, and this insane focus on exam scores will stretch into the clinical years.

Fifteen years ago the average Step 1 score was 217, so an upper 230's score was not just slightly above average. Also, the exam did not report any specific score above a 245.

In the 1990's Part 1 was retired and USMLE Step 1 debuted, and thus began the slow but inexorable march toward our current state. The death knell occurred when program directors came under pressure to have 80% board passage rates in order to maintain ACGME accreditation, with Step 1 being seen as a good predictor of success on the specialty boards.

The key difference between the current situation and 10+ years ago isn't that Step 1 has really changed, or even that scores have increased. It's that residency programs no longer have the manpower to read all the applications they receive, so they use Step scores to whittle the number down to a manageable level. After that the rest of the package becomes more important.

This is somewhat anecdotal, but consistent across multiple program directors I know personally. The general feeling is that Step 1 scores have no real predictive power in terms of differentiating good residents from bad ones. One told me that the program's best residents are the ones with lower scores (meaning 220-240), because they tended to have strong interpersonal skills and could therefore function effectively in the clinical environment.

Scores above 245 were definitely reported. I took S1 >20 years ago, my score was above 245.

Another "fact" mentioned by student leaders is that step scores don't correlate with resident performance. This simply isn't true. Of course, nothing is a perfect predictor and some residents with high step scores have interpersonal problems, or lack team leadership, etc. But in general, residents with higher scores in my program do better than those with lower scores.

The third "fact" I'm often told is that as long as someone has a step score >211 they are guaranteed to pass the boards. This is ridiculously false.

The research game has also blown up into a monster of its own. For the most competitive specialties the AVERAGE number of posters/pubs on ERAS is 15-18. My friends aiming for things like academic neurosurg spend all their time finding ways to fluff up their research list, like doing a million mindless chart review data extractions for middle authorship, or presenting the same stupid small summer project at 8 different conferences. Having just a couple good longitudinal involvements in projects is perceived as killing your chances. Research years are also becoming a lot more common.

These numbers are up SEVERAL FOLD from just ten years ago. It's all madness and clearly a far cry from actual value added to their candidacy.

This is driven IMHO by statistics, much like application inflation. If I was told that the average number of apps last year was 12 and I had to apply, you know how many apps I'd submit? Probably 14-15. I like being above average. Lake Wobegon and all that. So then next year the average is 14. Surprise! Same thing happens with everything else -- everyone is striving to be above average, and averages increase. It's inflation.

We're really hampered by the fact that the NBME has never divulged its system for generating a three digit score. Step 1 is an odd test for a number of reasons. One of them is that its ostensible purpose is to evaluate whether a taker has a minimum fund of medical knowledge. It would therefore make sense for Step 1 to be a criterion-referenced test. But it's not. It's a norm-referenced test, built around the notion that 5-6% of US allopathic students should fail it on the first attempt.

I'm mixed on this. I agree the USMLE hasn't published its scoring rubric. I also agree that there should be a criterion cutoff for passing. But the fact that ~5% fail each year doesn't prove the exam is norm-referenced -- one would expect roughly the same percentage of people each year to fall short of a fixed criterion. Or, perhaps, the criterion increases over time as there is more to know.

I suspect a significant driver, which happens to be more recent, is the proliferation of self-assessment exams and the inception of Reddit. The combination of these two seems to work even the most stoic student into a lather.

Totally agree. Which raises the question of whether changing S1 to P/F really fixes the problem, or only treats the symptom.
If Step 1 goes P/F, the optional nature of Step 2 would vanish overnight. It has already become the fourth most commonly cited factor for granting an interview, with an average rating of 4.0 (vs. 4.1 for Step 1).

Totally agree. So the exam insanity would just shift to S2. And, now there's only one exam. If you screw up S1, you can always take S2 early and try to do better. If S1 is P/F, you get one chance at S2, and by the time you have a score you're probably already applying.

Would making Step 1 P/F nearly destroy US citizens' chances at matching from an IMG?

No. The number of spots and the number of applicants remain the same. IMG's will still get spots; there are more spots than US grads.

probably but the path to residency for IMGs has been narrowing every single year simply because the number of medical school graduates is increasing faster than the number of residency slots.

This isn't true, at least not yet. We might be at an inflection point now, though.

-----

Overall, I understand why student leaders are frustrated with the current state of S1. Med students see S1 as their primary focus, and anything that isn't directly related to S1 is blown off. Anything non-Step-targeted that medical schools try to plug into their curriculum fails. And there's this sense (somewhat correct) that the S1 result defines their future.

Making S1 quartiled (or similar) might help, although would students really study less? How would you be certain you were in the top 25%?
Making S1 P/F will just put that pressure on S2. All students will be required to pass S2 prior to applications, and that score will become paramount.
Make both S1 and S2 P/F, and programs will add specialty-specific exams. Which will be an added cost, and more studying and stress.

Medical schools could report true quartiles in ERAS -- that might take the pressure off S1. But you're not allowed to put 65% of your students in the first "quartile", which plenty of schools do. Just telling us all of your students have met their milestones / EPA's / whatever and that we need to do a "holistic" review is impractical.

Dealing with app inflation is a whole topic on its own.

... and I'm off to a forum with the NBME. Should be a party!
 
@aProgDirector @Med Ed

What's your prediction for what actually exists in five years? Is there critical mass for a change to either pass fail or quartiles? Or is this all a pipe dream that will get lip service at the most, a la the MCAT recentering to 500 but still reporting percentiles?
 
The inflation of scores is insane; a score correction is probably needed. Something similar to a stock market correction.
 
aProgDirector said:
The fact that the exam is designed to test minimal knowledge for licensing, and that residency programs are now misusing it, is often stated. IMHO, this is ridiculous. Regardless of how it was initially developed, using it for something else is fine. Does a better step score = a better resident? More on that later. But there's no reason an exam can't fulfill two separate goals at the same time.

I think a reason does exist that makes it difficult for the exam to serve two masters. The USMLE, like the NBME exams before it, was designed for maximum discrimination around the pass/fail point with a negligible rate of false positives. This is at odds with the design of an achievement exam, which would provide discrimination across a broader array of scores.

aProgDirector said:
Whether "everyone has reached a broad consensus" is debatable, since no one has talked to program directors at all about this.

Broad consensus has been reached in UME land, at least. We've seen a spike in exam delays over the last 2 years, with record numbers of students already planning MPH/MBA years as buffers. We've seen record numbers of students getting put on antidepressants and anxiolytics. And while it may not be happening at every institution, the amount of conversation generated by this phenomenon has also spiked. It's a shame that the barriers between UME and GME communication seem to be as sturdy as ever.

aProgDirector said:
IM programs that take mostly IMG's depend on Step scores, as the rest of the elements of their applications that you reference in other posts are mostly useless.

COMLEX to the rescue?

aProgDirector said:
Scores above 245 were definitely reported. I took S1 >20 years ago, my score was above 245.

Interesting. I took it 15-20 years ago and even confirmed the 245 thing with a colleague who took it my year. I wonder if the NBME experimented with a different score reporting system for a year or two and then abandoned it.

aProgDirector said:
But in general, residents with higher scores in my program do better than those with lower scores.

As the saying goes, ask 5,000 program directors a question and you'll get 5,001 opinions.

aProgDirector said:
The third "fact" I'm often told is that as long as someone has a step score >211 they are guaranteed to pass the boards. This is ridiculously false.

They might be parroting this Sheriff of Sodium blog post. I see the 210-211 number as fairly arbitrary.

aProgDirector said:
Which raises the question of whether changing S1 to P/F really fixes the problem, or only treats the symptom.

I don't think the problem is fundamentally fixable. There will always be at least one point of major stress that drives the system into dysfunction, so we will forever be solving some issues while simultaneously creating others. The only thing I am certain of is that the current situation is untenable, and something has to give.

aProgDirector said:
... and I'm off to a forum with the NBME. Should be a party!

Hope they were serving some good stuff! Lord knows they have the $$$.
 
Do you think the changes would go into effect for 2020 test takers?

@aProgDirector @Med Ed

What's your prediction for what actually exists in five years? Is there critical mass for a change to either pass fail or quartiles? Or is this all a pipe dream that will get lip service at the most, a la the MCAT recentering to 500 but still reporting percentiles?

As mentioned above, the NBME plans an announcement for May/June 2019. No point in worrying about this until we hear.

I think a reason does exist that makes it difficult for the exam to serve two masters. The USMLE, like the NBME exams before it, was designed for maximum discrimination around the pass/fail point with a negligible rate of false positives. This is at odds with the design of an achievement exam, which would provide discrimination across a broader array of scores.
Is there actually a statement from the NBME that there's a focus on discrimination at the P/F point? Because, as I mentioned above, the exam doesn't appear to be structured that way. If it was, there should be a big skew towards the higher end of the scale -- it would be easier to get all of the questions correct, since you don't care about discrimination at that part of the curve.

Broad consensus has been reached in UME land, at least. We've seen a spike in exam delays over the last 2 years, with record numbers of students already planning MPH/MBA years as buffers. We've seen record numbers of students getting put on antidepressants and anxiolytics. And while it may not be happening at every institution, the amount of conversation generated by this phenomenon has also spiked. It's a shame that the barriers between UME and GME communication seem to be as sturdy as ever.

You get no argument from me that the system is broken and can be improved. But this solution doesn't actually fix the problem. All that will happen is pushing the problem onto GME. We will then create a high stakes exam, and it easily could be worse. Imagine if it's only given a few times a year? Maybe surgery would have an exam only in a few cities and everyone has to travel there to test their knot tying skills, much like CS now. Would that really be better?

I think there are reasonable communication links between UME and GME. In this case, they were completely bypassed.

And around half of residents in IM are IMG's. There is no UME for them. Who is looking out for their interests? The ECFMG? Not really, they are a licensing / credentialing agency also, and don't want to get into the business of student evals.

As the saying goes, ask 5,000 program directors a question and you'll get 5,001 opinions.

Agree, although at the PD meeting, there was almost universal concern about this USMLE change. Some of that is people being worried about change, and not the change itself

They might be parroting this Sheriff of Sodium blog post. I see the 210-211 number as fairly arbitrary.

The study quoted in that blog is problematic. First, it's only a single program over a few years, so extrapolating it nationally isn't reasonable. Second, much of the data isn't really spelled out. Last, the statistics seem biased to me: they say that the relationship between S1 and ABIM is modest with rho=0.59, and between S2 and S1 is good with rho=0.56. I don't fully understand the stats, but these seem similar. They also did a logistic regression and say that S1 only explains 32% of the variance -- honestly, I think that's useful. Perhaps most importantly, if they looked at S1 and S2 as independent variables in a regression that's problematic, since performance on S1 probably predicts S2. Details in the paper aren't clear on how they dealt with this.
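On the "modest rho" vs. "32% of variance" point, it may help to note that those two numbers are roughly the same fact stated two ways: squaring a correlation coefficient gives the share of variance explained. A quick sanity check, using only the figures quoted above (and glossing over the fact that r-squared strictly applies to Pearson's r, which Spearman's rho only approximates):

```python
# Sanity check on the quoted statistics (values taken from the discussion
# above, not re-derived from the underlying study).
rho_s1_abim = 0.59                      # reported S1-to-ABIM correlation
variance_explained = rho_s1_abim ** 2   # ~0.35

# ~35% is in the same ballpark as the "32% of the variance" figure, so a
# "modest" correlation and "only explains 32% of variance" are not really
# two independent criticisms.
print(f"rho = {rho_s1_abim}, rho^2 = {variance_explained:.2f}")
```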

I don't think the problem is fundamentally fixable. There will always be at least one point of major stress that drives the system into dysfunction, so we will forever be solving some issues while simultaneously creating others. The only thing I am certain of is that the current situation is untenable, and something has to give.

I don't disagree, but you're "giving" on the wrong thing. You could easily make the situation much worse.

Hope they were serving some good stuff! Lord knows they have the $$$.

Mostly they were listening to us yell at them.

My bottom line is this: GME programs need standardized assessments of students. Ideally, this needs to be a national standard so that no matter which school you go to, we have some sense of your performance. Although imperfect, the USMLE does that. I don't buy the argument that the exam wasn't designed for this, so we can't use it. I'm happy to be convinced otherwise. As mentioned above, many of the talking points in this discussion are based upon tiny studies, with people cherry picking the numbers / statements that bolster their point of view.

Switching to a full P/F USMLE will not fix your problems. All of the stress currently on S1 will transfer to S2. Programs will demand a S2 score by Sept 15th to review applications. Students will demand a "dedicated" period for S2. Since many schools are moving to an 18 month basic sci, followed by 12 months of clerkships, the plan was to have another 6 months of clinical time with a chance to do other clerkships, etc. That's going to partially turn into study time for S2.

If the USMLE changes to quartiles (either 4 quartiles and fail, or 1st Q, 2+3 Q, 4th Q, and fail -- I've heard both are being discussed), this could easily hurt students. I currently have a USMLE cutoff -- there's a number that if you're below that, you do not get an interview. There's a second number that if you're above my floor but below it, we consider you for an interview depending on the rest of your application, including a willingness to interview you in anticipation of a S2 score before ranking. We're flexible. If USMLE switches to quartiles, we will not interview anyone in the 4th Q, because we won't be able to tell those really low scorers (i.e. the "less than 211" in the study above) from those in the 4Q but perhaps OK.

Let's actually fix the problem. Let's find a way to decrease application inflation. Let's also address the fact that applying to residency programs will be stressful, and nothing we do to the application process will fix that. Changing USMLE to P/F will just increase the pressure on other metrics. Medical schools switching to P/F clerkships is also not helping. Not all students are created equal. There's a push in UME to set some minimum competency floor, and then say that if everyone you graduate is above that floor, that's a "good assessment". That won't work for us, and if medical schools won't address the problem then GME will -- which is really how we got into this problem in the first place.

Last, although I support student and resident wellness, I think we do need to be careful. I heard from clerkship directors that many clerkships have shifted so that students get all weekends off. This improves their evaluations; students think the clerkship is better. Wellness is a balance. We can't remove all of the stress of Med Ed. That's just the way it is. Even on this thread, you can see that many students want to keep USMLE scoring. Removing the scoring is not a simple intervention, and may have downstream impacts that are worse, or may harm IMG or DO students.
 
I feel if it goes P/F, the number of dual or more degree wielding residents will begin to rise steadily.
 
Really?? Why would a residency director care about an MBA or MPH, aren't they just planning to squeeze maximal clinical hours out of you? Not counting the big name academic centers that build research time into the training, those I can see finding value in your extra year of experience in research/stats
 
Is there actually a statement from the NBME that there's a focus on discrimination at the P/F point?

Is there a statement from the NBME that Step 1 is a validated instrument for anything other than medical licensure? Here is what the NBME states on the InCUS website:

[Attached screenshot: Screen Shot 2019-04-17 at 6.20.20 PM.png]


aProgdirector said:
Because, as I mentioned above, the exam doesn't appear to be structured that way. If it was, there should be a big skew towards the higher end of the scale -- it would be easier to get all of the questions correct, since you don't care about discrimination at that part of the curve.

I am no psychometrician, but I have had to learn enough to be dangerous (literally). The most reliable point biserial indices come from harder questions. Not coincidentally, I think, the percentage correct needed to pass Step 1 reportedly hovers around 50-55%. That's a hard test, especially considering how much effort is put into preparation by very smart people.

Don't get me wrong, I am not discounting that Step 1 can discriminate in ranges above the passing threshold, but this takes us back to the NBME's statistical black box. You can look through Pubmed and find many attempts to correlate NBME/Step scores with subsequent assessments. Some find associations, others don't. It's a tough nut to crack.
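The point about harder questions yielding more reliable point biserial indices can be illustrated with a toy simulation. This is my own sketch, not the NBME's actual methodology: it just shows that an item answered correctly by ~50% of simulated examinees correlates more strongly with the total score than an item answered correctly by ~95% of them.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000
ability = rng.normal(size=n)                      # latent examinee ability
total = ability + rng.normal(scale=0.5, size=n)   # noisy total test score

def point_biserial(item, score):
    """Correlation between a 0/1 item response and a continuous score."""
    return float(np.corrcoef(item, score)[0, 1])

hard_item = (ability > 0).astype(float)           # ~50% answer correctly
easy_item = (ability > -1.65).astype(float)       # ~95% answer correctly

# The harder item (p near 0.5) shows the higher point-biserial index,
# i.e., it discriminates better among examinees.
print(point_biserial(hard_item, total), point_biserial(easy_item, total))
```

The same logic is why a test built for discrimination near the pass point would lean on items that roughly half of minimally competent takers get right.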

aProgDirector said:
My bottom line is this: GME programs need standardized assessments of students. Ideally, this needs to be a national standard so that no matter which school you go to, we have some sense of your performance. Although imperfect, the USMLE does that.

I understand and appreciate your dilemma, but it seems that the emerging bottom line for UME is this: the status quo is untenable.
 

Don't know how far apart we are on this point. I agree that the USMLE's primary purpose is for licensing, and then there are secondary purposes. I don't read the above to suggest that those secondary uses are invalid. The lack of transparency by the NBME prevents us from knowing for certain. The ability of USMLE performance (on any Step) to correlate with ABIM performance is unclear; most of the evidence in my view suggests there is some correlation. It's difficult, since low performance on S1 might correlate best with PGY-1 ITE, and residents scoring low on that exam might be put in a remediation program that ultimately helps them, so you might expect that the correlation of S1 to ABIM might be weaker. But that doesn't mean it's unimportant, and I'd rather not have to remediate my residents.

Your stats knowledge is well above mine.

This question (for IM) is actually "easy" to answer, theoretically. The NBME has all of the USMLE and IM ITE data. If they were to share it with the ABIM, we would have a complete record set and would be able to answer this question with maximal certainty. But it's like gun violence research -- lots of people don't want to look at it for all sorts of reasons.


I understand and appreciate your dilemma, but it seems that the emerging bottom line for UME is this: the status quo is untenable.

I hear you. But from our view, a P/F Step exam (especially if this was done to all steps) is untenable. That's the problem.
 
Really?? Why would a residency director care about an MBA or MPH, aren't they just planning to squeeze maximal clinical hours out of you? Not counting the big name academic centers that build research time into the training, those I can see finding value in your extra year of experience in research/stats

Because it's something. When you take away step scores, there's really not a lot left.
 