I've long been an advocate for making Step 1 a true pass/fail test. I think its use and importance have been a net negative for medical education, as students have wisely shirked many of the curriculum elements designed to make them better doctors in order to focus on the things that will help them do well on what is currently a very important test. Make it strict pass/fail and all of that goes away.
Ideally, all of the Steps would be strict pass/fail, since that is and always was their purpose: to assure a minimum level of competence in physicians before granting them a license to practice.
But wouldn't this make it hard to stratify applicants to residency?
It would eliminate the use of one stratification tool that is arguably terrible at doing so. Every study trying to correlate Step scores with residency performance has shown it's a poor predictor. Two scores must differ by 16 points to even be statistically significant! Yes, a 240 and a 255 could represent identical knowledge, with the difference attributable purely to scaling error. This is straight out of the NBME's own publications, yet people always seem surprised when I say it.
So how do programs stratify applicants? Who cares?! Frankly, I don't think the NBME/FSMB or the faculties of medical schools are responsible for figuring out how residency programs select applicants.
Why not let programs figure it out for themselves? Every specialty board already has its own exam material it uses to award board certification as well as to test its residents at each year of training. Why not let them design their own aptitude tests if they feel it's necessary? Presumably, the baseline knowledge you want in a future pediatrician is slightly different from what you want in a future OBGYN, or in a future internist versus a future surgeon. Our field has even been trialing something akin to how large companies hire: a telephone behavioral interview scored by a third party to help stratify applicants. This is a big pain in the butt for applicants, but at least there's a large body of evidence in the business world suggesting it actually helps hire better people. It certainly has to be better than stratifying based on who better understood the electron transport chain 2-3 years ago. Couple this with a field-specific aptitude and knowledge test and now I can stratify my applicants in a much more meaningful way.
Truthfully, we would probably end up interviewing the same group of applicants even if we didn't have Step 1 scores. Now THAT would make an interesting study! A little crossover design where the same faculty review prospective applicants both with and without their Step scores, and you see if you still get the same group. My gut instinct tells me there would be a vast amount of overlap.
It blows my mind that the AAMC and LCME are not leaning heavily on the NBME to make this kind of change.