I fail to understand why step matters so much....

Goro · Jun 11, 2023

Ho0v-man said:
. One personal statement about a rotting possum corpse dropped someone to the bottom.

Wait, what?????

gyngyn · Jun 11, 2023

Goro said:
Wait, what?????

And you thought it was only my school!

Ho0v-man · Jun 11, 2023

Goro said:
Wait, what?????

Someone gave a very vivid description of a possum decaying in the sun on a hot day. Thankfully, we didn’t end up recruiting Dexter Morgan.

eteshoe · Jun 12, 2023

Ho0v-man said:
Someone gave a very vivid description of a possum decaying in the sun on a hot day. Thankfully, we didn’t end up recruiting Dexter Morgan.

Come on - you now Dexter would be too smart to put all that on paper 😉

Franzd'Epinay · Jun 13, 2023

gyngyn said:
Yes, it's the new normal. Step 2 is the new Step 1.
It was the predictable outcome of making Step 1 P/F.
Now there is only one bite at the apple and it comes too late to change course if there is an unfortunate score.

I don't think Step 2 could seriously have been considered a "second bite at the apple", especially at the height of Step 1 madness, especially for competitive specialties.

If there's going to be a high-stakes test for residency stratification purposes, I would much rather have that be something similar to Step 2 and 3 than Step 1; so, although I don't think this was the intention of the Step 1 change, I do agree with the result.

A better scenario would be to make all licensing tests pass-fail, then create a "new" test purely for residency stratification purposes that allows for retakes. I put "new" in quotes because, practically speaking, this would just be a Step 2/3 rip-off. Alternatively, we just keep using Step 2, allow for retakes, and let state boards figure out what to do in the rare case where someone gets a passing score on Step 2 followed by a retake where they fail. Perhaps this would be as easy as saying "your most recent Step 2 score must be in the passing range".

Med Ed · Jun 13, 2023

Franzd'Epinay said:
A better scenario would be to make all licensing tests pass-fail, then create a "new" test purely for residency stratification purposes that allows for retakes. I put "new" in quotes because, practically speaking, this would just be a Step 2/3 rip-off.

From a purely psychometric standpoint, it would be reasonable to create a new exam that is designed to stratify. The current step exams are designed to yield a yes/no result around a central question of minimal competency, which is a different goal.

Of course, this would require someone having to build and administer this new exam, which would be expensive, time-consuming, and add testing and financial burdens to medical students. The question of whether or not the resulting stratification is actually meaningful would likely persist, as well.

Franzd'Epinay · Jun 13, 2023

Med Ed said:
From a purely psychometric standpoint, it would be reasonable to create a new exam that is designed to stratify. The current step exams are designed to yield a yes/no result around a central question of minimal competency, which is a different goal.

They provide a score and a percentile, though. I know their original intent was to be criterion-based, but the scoring definitely allows for stratification at this time.

Med Ed said:
Of course, this would require someone having to build and administer this new exam, which would be expensive, time-consuming, and add testing and financial burdens to medical students. The question of whether or not the resulting stratification is actually meaningful would likely persist, as well.

I think the natural answer to this would be the NBME with its decades of experience in building these tests.

The cynic in me says that the NBME just makes a "new" test which is just a Frankenstein of Step 2+3 to rake in more money.

The hopeful in me says that the NBME keeps Step 2 scored, allows for retakes, then the individual state licensing boards just deal with the fact that some people will have multiple Step 2 scores.

VA Hopeful Dr · Jun 13, 2023

Franzd'Epinay said:
They provide a score and a percentile, though. I know their original intent was to be criterion-based, but the scoring definitely allows for stratification at this time.

I think the natural answer to this would be the NBME with its decades of experience in building these tests.

The cynic in me says that the NBME just makes a "new" test which is just a Frankenstein of Step 2+3 to rake in more money.

The hopeful in me says that the NBME keeps Step 2 scored, allows for retakes, then the individual state licensing boards just deal with the fact that some people will have multiple Step 2 scores.

State boards don't care about number of step scores as long as none are failures. Even then, most allow more than 1 fail per step before they care.

Here's what my state has to say about it:
For the United States Medical Licensing Examination or the Comprehensive Osteopathic Medical
Licensing Examination, or the Medical Council of Canada Qualifying Examination, the applicant
shall pass all steps within ten years of passing the first taken step. The results of the first three takings
of each step examination must be considered by the board. The board may consider the results from a
fourth taking of any step; however, the applicant has the burden of presenting special and compelling
circumstances why a result from a fourth taking should be considered.

So basically if you haven't passed a Step exam by the 4th try you can't get licensed. But if you take it 6 times to try and get the highest score possible, the state doesn't care as long as you passed one of the first 4 times.

Franzd'Epinay · Jun 13, 2023

VA Hopeful Dr said:
State boards don't care about number of step scores as long as none are failures. Even then, most allow more than 1 fail per step before they care.

Here's what my state has to say about it:
For the United States Medical Licensing Examination or the Comprehensive Osteopathic Medical
Licensing Examination, or the Medical Council of Canada Qualifying Examination, the applicant
shall pass all steps within ten years of passing the first taken step. The results of the first three takings
of each step examination must be considered by the board. The board may consider the results from a
fourth taking of any step; however, the applicant has the burden of presenting special and compelling
circumstances why a result from a fourth taking should be considered.

So basically if you haven't passed a Step exam by the 4th try you can't get licensed. But if you take it 6 times to try and get the highest score possible, the state doesn't care as long as you passed one of the first 4 times.

Right, but I think they haven't cared because the NBME makes you stop once you pass.

In my mind "Fail -> Pass" (or even "Fail -> Fail -> Pass") is much more straightforward to interpret from a licensing point of view than "Pass -> Fail" ( or "Pass -> Fail -> Pass"). With the former you can say "This person had some learning deficiencies, which they shored up and are now proficient to practice medicine"; with the latter you'd have to say something like "this person achieved proficiency, then... lost proficiency, but they're still good to go anyways."

Like I said, maybe I'm overthinking these rare edge cases.

Med Ed · Jun 13, 2023

Franzd'Epinay said:
They provide a score and a percentile, though. I know their original intent was to be criterion-based, but the scoring definitely allows for stratification at this time.

This statement sort of gets at the original point of this thread: the fact that scores and percentiles are generated as a byproduct of a pass/fail exam does not make them meaningful.

The passing threshold for the step exams is set using something called the Modified-Angoff method, which is a criterion-referenced approach.

Franzd'Epinay · Jun 13, 2023

Med Ed said:
the fact that scores and percentiles are generated as a byproduct of a pass/fail exam does not make them meaningful.

Is the issue that the questions are not good enough to accurately assess knowledge, or is there an issue with the scoring method itself that inherently increases the error in a score?

I admit I don't know enough about standardized testing methodology--I just assumed that the NBME practices were the best we could do given their long history.

VA Hopeful Dr · Jun 13, 2023

Franzd'Epinay said:
Right, but I think they haven't cared because the NBME makes you stop once you pass.

In my mind "Fail -> Pass" (or even "Fail -> Fail -> Pass") is much more straightforward to interpret from a licensing point of view than "Pass -> Fail" ( or "Pass -> Fail -> Pass"). With the former you can say "This person had some learning deficiencies, which they shored up and are now proficient to practice medicine"; with the latter you'd have to say something like "this person achieved proficiency, then... lost proficiency, but they're still good to go anyways."

Like I said, maybe I'm overthinking these rare edge cases.

Is that a new thing, because that definitely wasn't the case when I was in med school?

DOVinciRobot · Jun 13, 2023

VA Hopeful Dr said:
Is that a new thing, because that definitely wasn't the case when I was in med school?

Yeah you can’t retake a Step exam after you’ve passed it.

Franzd'Epinay · Jun 13, 2023

VA Hopeful Dr said:
Is that a new thing, because that definitely wasn't the case when I was in med school?

From the USMLE FAQs:

"If you pass a Step, you are not allowed to retake it, except to comply with certain state board requirements which have been previously approved by USMLE governance."

As to how recent it is--I'm not sure, but this was definitely the policy 10+ years ago when I took them.

deleted1139416 · Jun 13, 2023

VA Hopeful Dr said:
Is that a new thing, because that definitely wasn't the case when I was in med school?

I couldn’t fathom taking it twice. Even if I were unhappy with my score.

VA Hopeful Dr · Jun 13, 2023

Franzd'Epinay said:
From the USMLE FAQs:

"If you pass a Step, you are not allowed to retake it, except to comply with certain state board requirements which have been previously approved by USMLE governance."

As to how recent it is--I'm not sure, but this was definitely the policy 10+ years ago when I took them.

Must have been right after me then, I'm 15 years out from Step 1 and we could take it again if we wanted (basically no one did though).

GoSpursGo · Jun 13, 2023

Franzd'Epinay said:
I don't think Step 2 could seriously have been considered a "second bite at the apple", especially at the height of Step 1 madness, especially for competitive specialties.

If there's going to be a high-stakes test for residency stratification purposes, I would much rather have that be something similar to Step 2 and 3 than Step 1; so, although I don't think this was the intention of the Step 1 change, I do agree with the result.

A better scenario would be to make all licensing tests pass-fail, then create a "new" test purely for residency stratification purposes that allows for retakes. I put "new" in quotes because, practically speaking, this would just be a Step 2/3 rip-off. Alternatively, we just keep using Step 2, allow for retakes, and let state boards figure out what to do in the rare case where someone gets a passing score on Step 2 followed by a retake where they fail. Perhaps this would be as easy as saying "your most recent Step 2 score must be in the passing range".

Whatever the problem is with the current USMLE system, the answer cannot possibly be "make another high stakes test."

CJhooper123 · Jun 13, 2023

GoSpursGo said:
Whatever the problem is with the current USMLE system, the answer cannot possibly be "make another high stakes test."

It's tough to imagine, but if step 1 remains P/F it might be worth it for those aiming for competitive specialties.

Imagine spending thousands of dollars setting up away rotations before taking step 2, and getting a very low score. People who don't match into competitive specialties often forfeit thousands of dollars taking research years to improve their applications.

GoSpursGo · Jun 13, 2023

CJhooper123 said:
It's tough to imagine, but if step 1 remains P/F it might be worth it for those aiming for competitive specialties.

Imagine spending thousands of dollars setting up away rotations before taking step 2, and getting a very low score. People who don't match into competitive specialties often forfeit thousands of dollars taking research years to improve their applications.

Yeah but presumably you'd spend thousands of dollars taking this specialty-specific exam. I can't imagine the specialty-specific exam would be any sooner than current Step 2 timeframe, so it's not like you could make useful decisions based on that information. And there just aren't enough months before ERAS opens to allow for yet another dedicated study period, plus sub-I, plus away rotations.

For all of the above reasons I think it is a net negative that Step 1 is now P/F, but now that we are here I think applicants who choose to shoot for a competitive specialty just have to embrace a certain level of risk. If risk makes you uncomfortable, then pick a different specialty. Again--not that I am saying this is by any means FAIR, but I'm not sure there is a good alternative.

deleted1139416 · Jun 13, 2023

GoSpursGo said:
Yeah but presumably you'd spend thousands of dollars taking this specialty-specific exam. I can't imagine the specialty-specific exam would be any sooner than current Step 2 timeframe, so it's not like you could make useful decisions based on that information. And there just aren't enough months before ERAS opens to allow for yet another dedicated study period, plus sub-I, plus away rotations.

For all of the above reasons I think it is a net negative that Step 1 is now P/F, but now that we are here I think applicants who choose to shoot for a competitive specialty just have to embrace a certain level of risk. If risk makes you uncomfortable, then pick a different specialty. Again--not that I am saying this is by any means FAIR, but I'm not sure there is a good alternative.

The timing of VSAS with Step 2 scoring is crap. Even though there are many schools that are switching to 1.5 preclinical curriculum and students are taking Step 2 approximately 6-7 months before ERAS is due, this still isn't enough time to switch audition rotations because those usually get locked in around this time anyways.

LucidSplash · Jun 13, 2023

VA Hopeful Dr said:
Must have been right after me then, I'm 15 years out from Step 1 and we could take it again if we wanted (basically no one did though).

I took Step 1 summer of 2008. At that time it you couldn’t retake it if you passed. I also don’t recall any kerfluffle about a policy change and I feel like that’s something that would have been a hot topic of conversation and much complaining from various students. So I suspect the policy has been around since at least 2006.

The MCAT you could take a million times.

GoSpursGo · Jun 13, 2023

LucidSplash said:
I took Step 1 summer of 2008. At that time it you couldn’t retake it if you passed. I also don’t recall any kerfluffle about a policy changed and I feel like that’s something that would have been a hot topic of conversation and much complaining from various students. So I suspect the policy has been around since at least 2006.

The MCAT you could take a million times.

Basically, you're old @VA Hopeful Dr 🤣

LucidSplash · Jun 13, 2023

GoSpursGo said:
Basically, you're old @VA Hopeful Dr 🤣

Now now, I’m saying I’m equally as old. 😂 2008 puts me at 15 years out too! Not trying to throw shade at my fellow “seasoned” docs. 😂

GoSpursGo · Jun 13, 2023

LucidSplash said:
Now now, I’m saying I’m equally as old. 😂 2008 puts me at 15 years out too! Not trying to throw shade at my fellow “seasoned” docs. 😂

Lol, I can't say much--I think mine was 2011 🤣 But by then there was definitely no ambiguity, you only got one shot, and I actually was unaware that at one point you could have retaken the exam if you were a masochist.

VA Hopeful Dr · Jun 13, 2023

LucidSplash said:
I took Step 1 summer of 2008. At that time it you couldn’t retake it if you passed. I also don’t recall any kerfluffle about a policy change and I feel like that’s something that would have been a hot topic of conversation and much complaining from various students. So I suspect the policy has been around since at least 2006.

The MCAT you could take a million times.

I found a thread here from 2008 saying you couldn't retake it, so I'm guessing I heard wrong at the time.

Also, get off my lawn!

VA Hopeful Dr · Jun 13, 2023

LucidSplash said:
Now now, I’m saying I’m equally as old. 😂 2008 puts me at 15 years out too! Not trying to throw shade at my fellow “seasoned” docs. 😂

Yeah but you did like eleventy billion PGY years so in doctor years I'm older.

Med Ed · Jun 14, 2023

Franzd'Epinay said:
Is the issue that the questions are not good enough to accurately assess knowledge, or is there an issue with the scoring method itself that inherently increases the error in a score?

I admit I don't know enough about standardized testing methodology--I just assumed that the NBME practices were the best we could do given their long history.

Nor am I a psychmetrician, so take this with a grain of salt. But, if you're building an exam to assess a minimum level of knowledge, then the basic question for each exam item is "will a minimally competent test-taker get this right?" The exam only needs to be long enough to provide statistical heft to that analysis, and the passing threshold set to minimize false positives.

If, on the other hand, you want to build and exam that can statistially differentiate between two individuals with similar knowledge and test-taking abilities, that's a different situation. In the last year Step 1 was scored the standard deviation was 19, which is pretty large. In order to reduce that you're probably going to have to make the exam much longer, essentially adding power until getting 90% of the items correct is statistically different than getting 88% of the items correct.

And, as I said earlier, the question of whether or not this stratification is meaningful would persist. If you look at correlation studies between step scores and specialty board passage (more multiple choice exams), the curves flatten as you go up the score scale. Carmody examined this back in 2019. Ultimately it seems the exams are good at predicting future problems for low-scorers, but aren't very good at predicting anything useful for high-scorers.

NotAProgDirector · Jun 14, 2023

Whether or not higher scores predict anything (other than how people are likely to do on future exams) is unclear.

The performance distribution on all tests like this is very steep around the mean. Small changes in absolute performance will herald large changes in percentiles. The same is true for the MCAT also. The difference with the MCAT is that people with scores at the mean or below often don't get into med school at all. So when you look at all the higher scorers, the absolute differences tend to be a bit bigger. MCAT doesn't release absolute percentages as far as I know but I expect you'd find the same thing.

Regarding psychometrics as described by @Med Ed, although in general that's true and often argued by the USMLE as a reason not to use scores, it's also not really applicable because the USMLE isn't a test designed to assess minimum knowledge. If you really want a minimum knowledge test, you create it such that most people will get 100% of the questions correct. A written driver's ed test is a good example of this. It's designed so that if you know the basic material, you get everything correct. And by inference, the passing cut off tends to be relatively high. Another example are the innumerable online HR modules I need to complete each year -- each has a test, I need to score 90% to pass, and getting 100% is usually very easy. (Pointless aside, I am really annoyed when these tests have a minimum pass of 90% but only have 5 questions)

That's not the USMLE exam design. The USMLE is designed as a general knowledge test with the mean in the middle. The minimum pass level is theoretically picked to define minimum necessary knowledge. But the score clearly represents the taker's knowledge as measured on a MCQ test.

Not to nit pick, but the standard deviation doesn't tell you whether a score of 250 is different from 260. That's approximated by the standard error of measurement, which is much smaller (about 9 I think). And even that doesn't say that scores within 9 points are "indistinguishable" -- unless you want to make that statement with 66+% certainty.

GoSpursGo · Jun 14, 2023

Anakinmemer said:
My point was not whether a 250 vs 260 is statistically different, I'm sure it is. My point was that statistically different might mean absolutely nothing if we were to find out the true value. If I were a PD I would not care whether someone got 5 more questions right on a 300 question exam, regardless of its statistical significance, because the real world meaning of that is close to 0. When I go to practice clinical medicine I would use that same principle and not chose a diabetes drug that reduces A1C by an additional .001% when that clearly has no real effect vs other factors of the drug. When we report scores as they are now with percentiles, we are encouraging PD's to draw conclusions about the score that might not be true. Why not report an equated percent correct without percentiles and let them draw their own conclusion? If a PD sees 85% vs 82% and doesn't really care for the difference, doesn't that mean it's silly to show 260 vs 250 and encourage them to conclude the 260 is a superior student because they are 26 percentile places higher?

We need to start showing how bunched up students are in terms of raw performance because 26 percentile places might be very few questions. It's like they're hiding information that would make people take the test less seriously. I am all for standardized testing, but if everyone starts doing well you can't game the system by making some people look bad for being in the 5th percentile even if they are barely doing worse in terms of raw questions correct than someone in the 25th percentile.

Reveal the raw data and let people draw conclusions for themselves.

A score should look like:
260- 80th percentile- In 2022 students who scored a 260 got an average of 270/318 questions correct across all forms
250-54th percentile- In 2022 students who scored a 250 got an average of 255/318 questions correct across all forms

With this data you at least give program directors the chance to say hey I don't really care about an X question difference. Right now they don't have a choice but to blindly trust that higher is better without knowing the true difference.

(I made the raw numbers up as an example)

Even if what you are saying is correct and score differences are meaningless, how would you concretely suggest PDs stratify a bunch of applicants whose applications are otherwise also very similar? Because that’s what it always comes back to for me—complaining about the system doesn’t help if you don’t have a realistic suggestion for something better.

deleted1139416 · Jun 14, 2023

I also want to point out that the most competitive specialties (except for dermatology) get the lowest number of applicants. For example, the average plastic surgery program gets less than 100 and ENT/Uro both get 300-400. But there is a very strong self selection bias there since applicants with weak scores don't even bother applying.

NotAProgDirector · Jun 14, 2023

My apologies, I didn't really address your point.

I completely agree with you. For sure, the difference between percentiles in the middle of the pack is going to be very small. Although we can assess whether they are "statistically" different, that doesn't mean they have any practical difference - outcomes like this are common when the n in the population is very large.

I disagree that it would change anyone's behavior. People like thinking in percentiles, and that's likely to persist no matter what you do. Unless they report only the raw score, with no percentiles at all -- and that's unlikely.

In any case, reporting raw scores isn't feasible because not everyone takes the same exam and some exam forms may be harder than others. The SoS explored this in depth: Breaking the magic: the USMLE three-digit score

This also explains why you don't get your score immediately. They need to assess the group performance before they can score your exam.

Med Ed · Jun 14, 2023

NotAProgDirector said:
Regarding psychometrics as described by @Med Ed, although in general that's true and often argued by the USMLE as a reason not to use scores, it's also not really applicable because the USMLE isn't a test designed to assess minimum knowledge.

The USMLE is designed to give medical licensing boards a binary yes/no answer regarding an individual's possession of a minimum level of medical knowledge. It is expressly made for that purpose. All other uses of the score are secondary.

Med Ed · Jun 14, 2023

NotAProgDirector said:
Not to nit pick, but the standard deviation doesn't tell you whether a score of 250 is different from 260. That's approximated by the standard error of measurement, which is much smaller (about 9 I think). And even that doesn't say that scores within 9 points are "indistinguishable" -- unless you want to make that statement with 66+% certainty.

This is what I get for posting while sleep deprived!

NotAProgDirector · Jun 15, 2023

Med Ed said:
The USMLE is designed to give medical licensing boards a binary yes/no answer regarding an individual's possession of a minimum level of medical knowledge. It is expressly made for that purpose. All other uses of the score are secondary.

This is a USMLE talking point. They say it over and over. It's simply not true. As I mentioned before, if they want to design a test that really tests minimal knowledge, they should do so by building one that has a minimum pass around 85% of the questions correct, and the most common score would be 100%.

Put another way: yes, the USMLE uses the test to determine minimum competence. Using it to assess general medical knowledge is a secondary use. But doing so is completely statistically valid. Whether it reflects anything other than ability to pick the right answer on an MCQ test is an open question. But the USMLE stating that programs shouldn't use it because it wasn't designed for that is silly. At least from a psychmetric viewpoint.

deleted1139416 · Jun 15, 2023

NotAProgDirector said:
This is a USMLE talking point. They say it over and over. It's simply not true. As I mentioned before, if they want to design a test that really tests minimal knowledge, they should do so by building one that has a minimum pass around 85% of the questions correct, and the most common score would be 100%.

Put another way: yes, the USMLE uses the test to determine minimum competence. Using it to assess general medical knowledge is a secondary use. But doing so is completely statistically valid. Whether it reflects anything other than ability to pick the right answer on an MCQ test is an open question. But the USMLE stating that programs shouldn't use it because it wasn't designed for that is silly. At least from a psychmetric viewpoint.

I 100% agree with you!! There is a reason why they made Step 2 much more difficult when Step 1 became P/F.

Franzd'Epinay · Jun 15, 2023

HipiMochi said:
I 100% agree with you!! There is a reason why they made Step 2 much more difficult when Step 1 became P/F.

I wasn't aware of that. Was there a sharp drop in Step 2 scores after Step 1 became pass/fail?

bGMx · Jun 15, 2023

Anakinmemer said:
I think expanding ethics to 15% of the test was a play at adding a CARS like section that can't be as well studied for so to say...

Which is nuts because ethics is entirely too complex, nuanced and individualized to be tested.

SurfingDoctor · Jun 15, 2023

Step scores matter because ChatGPT can write your personal statement and letters of recommendation and all the other fluff better than you can.

randommedstudent24 · Jun 16, 2023

HipiMochi said:
I 100% agree with you!! There is a reason why they made Step 2 much more difficult when Step 1 became P/F.

Where is the data on this?

Med Ed · Jun 16, 2023

NotAProgDirector said:
This is a USMLE talking point. They say it over and over. It's simply not true. As I mentioned before, if they want to design a test that really tests minimal knowledge, they should do so by building one that has a minimum pass around 85% of the questions correct, and the most common score would be 100%.

What would be the unintended consequences to such an approach?

NotAProgDirector said:
Put another way: yes, the USMLE uses the test to determine minimum competence. Using it to assess general medical knowledge is a secondary use. But doing so is completely statistically valid. Whether it reflects anything other than ability to pick the right answer on an MCQ test is an open question. But the USMLE stating that programs shouldn't use it because it wasn't designed for that is silly. At least from a psychmetric viewpoint.

What statistically valid secondary use is the NBME saying you should avoid?

Med Ed · Jun 16, 2023

randommedstudent24 said:
Where is the data on this?

Good question. The pass rate on Step 2 has actually crept up to 99% for first time MD takers. For DO first takers it's 97%.

ChordaEpiphany · Jun 16, 2023

Anakinmemer said:
My suggestion is to include the raw data and let PD's decide for themselves how to interpret it. There's nothing inherently wrong with the exam or the fact that it is used to stratify. I'm not suggesting to make it p/f. My problem is with how scores are reported in a way that encourages conclusions that might not be true.

Is there any good reason not to reveal the raw data?

The issue is that there is something inherently wrong with the exam when used for stratification. The test-to-test variability is incredibly high, especially when compared to aptitude tests like the SATs or MCATs. You can take 10 predictive practice tests and still have a predicted score range of ~30 points (e.g., 250 +/- 15). It's meant to maximize accurate prediction right around a passing score, 209. So when most people are scoring in the 240 range on average, it's already nearing the ceiling of the exam.

Because of this high variability, PDs can really only meaningfully separate candidates into three groups with any degree of statistical certainty, 210-230, 230-250, and 250+. PDs already know this, which is why step cutoffs tend to be pretty low and they don't put as much stock in 255 vs. 265. Most actions by academic faculty in medicine absolutely baffle me and make me question if they have any grasp on statistics whatsoever, but somehow they get this one right.

It's also just a bad exam. I'm convinced half of the "what is the best thing to say to the patient?" questions on USMLE/NBME exams are written by radiologists. If they could just hire whoever does quality control at UWorld they might have an exam with statistical relevance that people actually respect.

HipiMochi said:
I 100% agree with you!! There is a reason why they made Step 2 much more difficult when Step 1 became P/F.

Is there any proof of this, or is step 2 just harder for people who took step 1 P/F? As someone who took step 1 scored but is now going through rotations with students who took it P/F, I've noticed wildly different study habits in this group of students. I'm not saying that's a bad thing either, because step exams always emphasized the wrong thing (i.e., obscure details over concepts). However, students I work with now are way, way less focused on minutiae and generally operate at a lower level of content mastery. Again, not saying that's a bad thing. This profession has needed to shift away from knowledge and shift towards interpersonal skills, leadership, and business/management for at least 20 years.

NotAProgDirector · Jun 16, 2023

I'm not certain I understand what you're asking.

Med Ed said:
What would be the unintended consequences to such an approach?

I assume you're asking: what would the unintended consequences be if they changed the USMLE to have a raw score of 85% to pass? It would be similar to reporting a pass/fail score only. Fail would remain a negative as it is today. Pass would be uninterpretable other than knowing that you passed. Since most people would get 95-100% of the questions correct, there would be absolutely no discrimination at that level of performance. There would be a slight difference from just P/F, as those scoring 85-95% would likely be considered differently than those scoring >=95%. Perhaps students wouldn't bother studying very much for the exam - similar to concerns raised about S1 being P/F. Is that what you're getting at?

Med Ed said:
What statistically valid secondary use is the NBME saying you should avoid?

Again, not sure what you're asking. I'm saying that a USMLE score of 250 shows that you "know more as assessed on an MCQ test" than people with a 240, and those more than those with a 230. The NBME seems to think that I should just treat anyone with a score higher than passing the same? This makes no sense to me at all. Again, I completely agree that a higher score on the USMLE doesn't necessarily predict that someone will be a better doctor/resident. But to state that it doesn't represent anything seems incorrect.

ChordaEpiphany said:
The test-to-test variability is incredibly high, especially when compared to aptitude tests like the SATs or MCATs.

Based on what? How certain are we that the SAT doesn;t have ranges like this? And the MCAT has a smaller range because the overall score range is smaller. We can fix that with the USMLE if we want -- simply divide the score by 10 and report that. Round it to a whole number if you wish. Now, scores will range from 16-28, pass will be a 20, and inter test variability will be 1.5. Does that make it better?

ChordaEpiphany said:
You can take 10 predictive practice tests and still have a predicted score range of ~30 points (e.g., 250 +/- 15).

Who says that these predictive practice tests are actually reflective of the test? Honestly, I think this is the biggest scam of all. The NBME should not be in the business of selling practice exams for it's own high stakes exam. This is all sorts of wrong.

2021Doctor · Jun 17, 2023

NotAProgDirector said:
I'm not certain I understand what you're asking.

I assume you're asking: what would the unintended consequences be if they changed the USMLE to have a raw score of 85% to pass? It would be similar to reporting a pass/fail score only. Fail would remain a negative as it is today. Pass would be uninterpretable other than knowing that you passed. Since most people would get 95-100% of the questions correct, there would be absolutely no discrimination at that level of performance. There would be a slight difference from just P/F, as those scoring 85-95% would likely be considered differently than those scoring >=95%. Perhaps students wouldn't bother studying very much for the exam - similar to concerns raised about S1 being P/F. Is that what you're getting at?

Again, not sure what you're asking. I'm saying that a USMLE score of 250 shows that you "know more as assessed on an MCQ test" than people with a 240, and those more than those with a 230. The NBME seems to think that I should just treat anyone with a score higher than passing the same? This makes no sense to me at all. Again, I completely agree that a higher score on the USMLE doesn't necessarily predict that someone will be a better doctor/resident. But to state that it doesn't represent anything seems incorrect.

Based on what? How certain are we that the SAT doesn;t have ranges like this? And the MCAT has a smaller range because the overall score range is smaller. We can fix that with the USMLE if we want -- simply divide the score by 10 and report that. Round it to a whole number if you wish. Now, scores will range from 16-28, pass will be a 20, and inter test variability will be 1.5. Does that make it better?

Who says that these predictive practice tests are actually reflective of the test? Honestly, I think this is the biggest scam of all. The NBME should not be in the business of selling practice exams for it's own high stakes exam. This is all sorts of wrong.

Do you agree that the group of students entering this year's match cycle that have high numeric Step 1 scores (such as those who may have delayed a year for research or other reasons) will have an advantage over those students with Step 1 scores of "PASS"?

Ho0v-man · Jun 17, 2023

NotAProgDirector said:
I'm not certain I understand what you're asking.

I assume you're asking: what would the unintended consequences be if they changed the USMLE to have a raw score of 85% to pass? It would be similar to reporting a pass/fail score only. Fail would remain a negative as it is today. Pass would be uninterpretable other than knowing that you passed. Since most people would get 95-100% of the questions correct, there would be absolutely no discrimination at that level of performance. There would be a slight difference from just P/F, as those scoring 85-95% would likely be considered differently than those scoring >=95%. Perhaps students wouldn't bother studying very much for the exam - similar to concerns raised about S1 being P/F. Is that what you're getting at?

Again, not sure what you're asking. I'm saying that a USMLE score of 250 shows that you "know more as assessed on an MCQ test" than people with a 240, and those more than those with a 230. The NBME seems to think that I should just treat anyone with a score higher than passing the same? This makes no sense to me at all. Again, I completely agree that a higher score on the USMLE doesn't necessarily predict that someone will be a better doctor/resident. But to state that it doesn't represent anything seems incorrect.

Based on what? How certain are we that the SAT doesn;t have ranges like this? And the MCAT has a smaller range because the overall score range is smaller. We can fix that with the USMLE if we want -- simply divide the score by 10 and report that. Round it to a whole number if you wish. Now, scores will range from 16-28, pass will be a 20, and inter test variability will be 1.5. Does that make it better?

Who says that these predictive practice tests are actually reflective of the test? Honestly, I think this is the biggest scam of all. The NBME should not be in the business of selling practice exams for it's own high stakes exam. This is all sorts of wrong.

It’s always baffled me that people actually pretend students with higher board scores don’t know more than students with lower scores.

deleted1139416 · Jun 17, 2023

Ho0v-man said:
It’s always baffled me that people actually pretend students with higher board scores don’t know more than students with lower scores.

That's not really what they are trying to say. The problem with comparing low and high board scores is that:

1) There is a range of error so that a 240 vs. 250 are not that different from each other due to the ranges of error overlapping, yet this range represents the 35 - 60 percentile and most residency PD's will absolutely treat those two scores differently
2) Certain school curricula have inherent advantages (6 week dedicated period, board study rotations, etc) versus schools where students only get a 2 week dedicated
3) Boards do not test on other skills, such as history taking, communication skills, teamwork, etc. All of these are important for patient outcomes.
4) Boards are broad, not deep. Amassing a large amount of knowledge needed to do well on Step doesn't necessarily translate to having critical thinking and problem solving skills. A friend of mine was the top bioengineering student in his undergraduate class and got a 521 on the MCAT, yet his step scores are average because he doesn't do well with memorizing every little detail. But he wiped the floor with me on rotations because he is BRILLIANT.

I do agree that boards are more than just a score, and serve as a proxy for work ethic and ability to think and reason and learn information quickly. All of these are important skills, so I am not saying that boards are worthless. But I still strongly believe that there is a lot of issues with how much of a role they play in residency selection.

lacrossegirl420 · Jun 17, 2023

I agree Step shouldn’t be everything, but IMO it definitely has value and should be valued highly. Doctors need a lot of knowledge and tests are probably the best way to objectively assess that, even if they’re not perfect

NotAProgDirector · Jun 17, 2023

2021Doctor said:
Do you agree that the group of students entering this year's match cycle that have high numeric Step 1 scores (such as those who may have delayed a year for research or other reasons) will have an advantage over those students with Step 1 scores of "PASS"?

No. I don't think it will matter much, really. Programs that are focused on USMLE scores will just use Step 2.

Also, there's this theme here on SDN that somehow USMLE scores are the key factor in evaluating applicants. I doubt this is true for most programs. Some fields it likely will have a bigger impact. I expect that for most programs, they may have a score below which they don't invite people, a borderline range where they look at the rest of the application, and a high enough score where the USMLE is no longer a disqualifying feature and the decision to invite is based upon the rest of the application.

HipiMochi said:
That's not really what they are trying to say. The problem with comparing low and high board scores is that:

1) There is a range of error so that a 240 vs. 250 are not that different from each other due to the ranges of error overlapping, yet this range represents the 35 - 60 percentile and most residency PD's will absolutely treat those two scores differently

It is true that the standard error of measurement of the USMLE is around 6, so a 240 and 250 "overlap" if you +/- the SE. But on average, the person getting the 250 has a better performance than the person getting the 240. Although it's possible that their actual performance is equal and the person with the 250 just had a "good day" and the 240 had a "bad day", it's more likely that the 250 represents a better performance. Programs are willing to accept much less than a 95% certainty.

HipiMochi said:
2) Certain school curricula have inherent advantages (6 week dedicated period, board study rotations, etc) versus schools where students only get a 2 week dedicated

Certainly true, but there's 2-3 years to study for these exams. Theoretically, you're learning all the material all along.

HipiMochi said:
3) Boards do not test on other skills, such as history taking, communication skills, teamwork, etc. All of these are important for patient outcomes.

Agreed. Presumably that's what clinical grades / performance are supposed to measure.

HipiMochi said:
4) Boards are broad, not deep. Amassing a large amount of knowledge needed to do well on Step doesn't necessarily translate to having critical thinking and problem solving skills. A friend of mine was the top bioengineering student in his undergraduate class and got a 521 on the MCAT, yet his step scores are average because he doesn't do well with memorizing every little detail. But he wiped the floor with me on rotations because he is BRILLIANT.

Which is why USMLE should be part of application review.

HipiMochi said:
I do agree that boards are more than just a score, and serve as a proxy for work ethic and ability to think and reason and learn information quickly. All of these are important skills, so I am not saying that boards are worthless. But I still strongly believe that there is a lot of issues with how much of a role they play in residency selection.

I think their importance is overstated here. Sure, if you get a 203 on S2, your chances of getting ortho are minimal. But for most applicants to most fields, a decent score is all you need. The step score insanity is driven mostly by student neuroticism, not reality.

deleted1139416 · Jun 17, 2023

NotAProgDirector said:
No. I don't think it will matter much, really. Programs that are focused on USMLE scores will just use Step 2.

Also, there's this theme here on SDN that somehow USMLE scores are the key factor in evaluating applicants. I doubt this is true for most programs. Some fields it likely will have a bigger impact. I expect that for most programs, they may have a score below which they don't invite people, a borderline range where they look at the rest of the application, and a high enough score where the USMLE is no longer a disqualifying feature and the decision to invite is based upon the rest of the application.

It is true that the standard error of measurement of the USMLE is around 6, so a 240 and 250 "overlap" if you +/- the SE. But on average, the person getting the 250 has a better performance than the person getting the 240. Although it's possible that their actual performance is equal and the person with the 250 just had a "good day" and the 240 had a "bad day", it's more likely that the 250 represents a better performance. Programs are willing to accept much less than a 95% certainty.

Certainly true, but there's 2-3 years to study for these exams. Theoretically, you're learning all the material all along.

Agreed. Presumably that's what clinical grades / performance are supposed to measure.

Which is why USMLE should be part of application review.

I think their importance is overstated here. Sure, if you get a 203 on S2, your chances of getting ortho are minimal. But for most applicants to most fields, a decent score is all you need. The step score insanity is driven mostly by student neuroticism, not reality.

I totally agree with what you said. It also seems like programs use a lot of other factors (IMG status, DO status, geographic status and preference signaling) to screen out applicants in lieu of/in addition to a usmle score cutoff.

Ho0v-man · Jun 17, 2023

HipiMochi said:
That's not really what they are trying to say. The problem with comparing low and high board scores is that:

1) There is a range of error so that a 240 vs. 250 are not that different from each other due to the ranges of error overlapping, yet this range represents the 35 - 60 percentile and most residency PD's will absolutely treat those two scores differently
2) Certain school curricula have inherent advantages (6 week dedicated period, board study rotations, etc) versus schools where students only get a 2 week dedicated
3) Boards do not test on other skills, such as history taking, communication skills, teamwork, etc. All of these are important for patient outcomes.
4) Boards are broad, not deep. Amassing a large amount of knowledge needed to do well on Step doesn't necessarily translate to having critical thinking and problem solving skills. A friend of mine was the top bioengineering student in his undergraduate class and got a 521 on the MCAT, yet his step scores are average because he doesn't do well with memorizing every little detail. But he wiped the floor with me on rotations because he is BRILLIANT.

I do agree that boards are more than just a score, and serve as a proxy for work ethic and ability to think and reason and learn information quickly. All of these are important skills, so I am not saying that boards are worthless. But I still strongly believe that there is a lot of issues with how much of a role they play in residency selection.

So what? Cutoffs exist because it’s too much work to sift through apps otherwise. That doesn’t change unless you limit the amount of apps. And as I discussed previously, there’s very little difference in the majority of apps besides scores.

A 32 and a 35 were treated very differently on the old mcat and that could literally be 3 questions. Similarly, a 508 and 512 were treated differently when I applied and that could be the difference between 4 questions. People repeat years in med school because they failed by a question or two. A line has to be drawn somewhere.

If we’re using personal anecdotes then allow me to convey my own. It’s very common for the med students who form the best ddx and tx plans on wards to also happen to have top quartile class rank and high board scores. Are there exceptions to the rule? Sure. But for the most part knowing more stuff is generally a preferred trait.

I fail to understand why step matters so much....

Full Member

Alta California

Full Member

IM PSTP Resident

Full Member

Full Member

Full Member

Senior Member

Full Member

Full Member

Full Member

Senior Member

General Surgery PGY-5

Full Member

deleted1139416

Senior Member

SDN Chief Administrator

PGY3 Ortho

SDN Chief Administrator

deleted1139416

#CallVascular

SDN Chief Administrator

#CallVascular

SDN Chief Administrator

Senior Member

Senior Member

Full Member

Pastafarians Unite!

SDN Chief Administrator

deleted1139416

Pastafarians Unite!

Full Member

Full Member

Pastafarians Unite!

deleted1139416

Full Member

He moʻolelo ia e hoʻopau ai i ka moʻolelo holoʻoko

"Good news, everyone"

Full Member

Full Member

Full Member

Full Member

Pastafarians Unite!

Membership Revoked

Full Member

deleted1139416

Full Member

Pastafarians Unite!

deleted1139416

Full Member

Similar threads