EVMS MD critique of NBME

All I'm seeing is whining. A gripe with Step 2 CS is understandable, but Step 1 and Step 2 CK are validated and offer an opportunity to help stratify an enormous applicant pool. Everyone has access to the same materials; it's on you to deliver. If you can't, that's okay...if you can't and whine about it, get over it. LORs read so similarly they are useless unless they are negative...dean's letters are intentionally written to maximize match rates...boards are currently the only measure of objective performance that can be used to compare an applicant from North Dakota to someone overseas. Is it fair that one-day performance has a huge impact on your match success? Probably not...but that's life. Find a better way...until then, quit whining.

I always felt that using USMLE scores to stratify applicants was not the intent and spirit of the exam. It is simply an exam used for LICENSING. It really shouldn't even be used by schools for promotion. It has simply morphed into something totally different from its initial goal.

For those of you who say "well, how will we stratify applicants?" I'll tell you how! Use the dean's letter. Read comments about applicants. Interview. Choosing an applicant because of high scores on the USMLE will make you choose the wrong person more often than not.

Most programs do not rank just based on Step...they may interview based on your score, but once you get to the interview, your score takes a back seat to your perceived fit, interview day, and personality. Is it still included in the rank? Sure...but to suggest someone is matched just based on their score is flat out wrong.
 
What if everyone gets a 250/250 on their boards? Then what? Add another layer of exams to stratify that applicant pool further?
 
There is no rule from the NBME, ACGME, AHA, or even the AARP that is forcing PDs to place emphasis on Step score. They can currently read every application in its entirety to evaluate fit and stratify accordingly. Nothing is stopping them. They can treat Step as a P/F exercise in the current system.
 
Statistically improbable...the mean will continue to be around 230, with a standard deviation around 20 and a standard error around 6. When everyone magically performs the same, we can have a conversation about the utility of using the exam as part of the screening process.
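Just to put rough numbers on "statistically improbable" -- a minimal sketch, assuming scores are approximately normal with the ~230 mean and ~20 standard deviation quoted above, and using an illustrative cohort of 25,000 examinees (that cohort size is an assumption, not an official figure):

```python
import math

def normal_cdf(x, mu, sigma):
    """Probability that a Normal(mu, sigma) score falls below x."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

MEAN, SD = 230, 20                      # rough Step 1 distribution quoted above
N_EXAMINEES = 25_000                    # illustrative cohort size (assumption)

p_one = 1 - normal_cdf(250, MEAN, SD)   # chance a single examinee scores >= 250
print(f"P(one examinee >= 250) ~ {p_one:.1%}")               # about 16%

# Chance that *every* examinee clears 250 (work in log10 to avoid underflow)
log10_all = N_EXAMINEES * math.log10(p_one)
print(f"P(all {N_EXAMINEES} examinees >= 250) ~ 10^{log10_all:.0f}")   # ~10^-20000
```

In other words, even under generous assumptions, an entire cohort at 250+ is not a scenario anyone needs to plan for.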
 
Just replying to the idea of a specialty-specific exam. I’ve thought about this a lot, and I keep coming to the conclusion that it would be a lot better for a number of reasons:

1) Renewed investment by students in their medical education, rather than attempts to hack a single high-stakes exam.

2) An exam that is potentially better at comparing applicants than the POS that is Step 1, where a 250 and a 235 are not significantly different. We instinctively think those scores are in entirely different leagues, but the NBME data say the difference does not reach the threshold of statistical significance (see the quick check after this list).


3) Incoming interns/residents with better field-relevant knowledge. Imagine a world where students put the same effort into studying what we want them to know rather than a lot of basic science stuff that, while important foundationally, may largely comprise material that is less relevant.
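The quick check promised in point 2, as a sketch: it assumes the roughly 8-point standard error of difference that comes up later in this thread (the exact figure the NBME publishes may differ):

```python
# Sketch: is a 250 vs. a 235 a statistically meaningful gap?
SED = 8                        # assumed standard error of difference (see later posts)
score_a, score_b = 250, 235

threshold_95 = 1.96 * SED      # two scores must differ by more than ~1.96 * SED
gap = abs(score_a - score_b)   # observed gap of 15 points

print(f"gap = {gap}, 95% threshold ~ {threshold_95:.1f}")
print("distinguishable" if gap > threshold_95 else "not statistically distinguishable")
```

With those assumptions, the 15-point gap falls just inside the ~16-point threshold, which is the point being made above.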


I’d like to see the USMLE continue to report detailed scores to students and their schools. The one issue I can see with field-specific exams is that the delay in taking them doesn’t allow much time to prepare backup plans. Giving people and their schools scaled data would help someone in the 10th percentile nationally realize that it may be risky applying only to a highly competitive field. Having national comparative data would help schools properly advise their students.

I don’t mind that students have a high-stakes, high-pressure exam. The current setup has had the side effect of destroying student investment in their own curricula.

I think residency programs could also adopt other objective screening methods. There has been some flirtation with telephone-based behavioral-question screens that are scored by an independent company, with the scores given to programs. This is pretty standard in corporate interviews and, in the HR literature, has proven highly predictive of job performance. Add something like that to a field-specific exam and you’ve got some powerful objective data.

Regardless, I’m encouraged that the winds of change are starting to stir!

Correct me if I am wrong, but if the standard error is 6, then any score that is 12 or more points different is statistically significantly different.
 
Unless it’s changed, they quote a standard error of difference of 8 points. It’s slightly different from the standard error of measurement.
 
Standard error is a broader term...standard error could refer to measurement or to difference. 6 is the standard error of measurement of the test, which reflects the precision of the test itself. The standard error of difference is currently around 8, which reflects the distance of significance between two scores. This is why 250 is the magical score...it's two standard errors of difference above the average score, plus some cushion (229 mean...8 standard error of difference...229 + 16 = 245). PDs can know for sure that student A (250) did better than the average medical student (229) on the exam. Though at the end of the day, I'm sure higher is just better.
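A small sketch of that arithmetic, assuming the quoted SEM of 6 and mean of 229; the square-root-of-two relationship between SEM and SED is the standard one when two independent scores, each carrying the same measurement error, are compared:

```python
import math

SEM = 6        # standard error of measurement quoted above (precision of one score)
MEAN = 229     # quoted mean Step 1 score

# Comparing two independent scores, each measured with the same SEM:
SED = SEM * math.sqrt(2)
print(f"standard error of difference ~ {SED:.1f}")   # ~8.5, consistent with the ~8 above

# A score is confidently above the mean once it clears about two SEDs:
print(f"mean + 2 * 8 = {MEAN + 2 * 8}")              # 245, so 250 clears it with cushion
```

That is the whole argument for 250 as the cutoff where a PD can say "definitely above average" rather than "probably above average."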
 
Don't let the naysayers convince you: higher is always better. We all have to admit this. 250 is such a feat of "grandeur," especially when the 250'er and the 229'er both had two years to prepare for their exam. It's not like they didn't know that their Step 1 would make or break them. We all know this!

 
This is simply not true.

Literally all dean's letters say something like this. And every LOR says the person is in the top 5% of everyone they have ever worked with.

Your definition of data and mine differ.

Does IM do milestones? I know they’re done in Peds, but I’m not sure if it was an ACGME-wide thing or just the Peds part. Do you think if the LCME started doing milestones that schools would report them accurately? Obviously it is in the school’s best interest for the student to match well, so there’s a conflict of interest, hence my curiosity.
 

That's interesting, but I don't know how much it helps me. What it says is that some residents run into trouble; if that trouble is communication/professionalism, then the chance of recovery is not good. If the problem is medical knowledge/ITE scores (which they report as two separate things, but which are presumably correlated) or efficiency, then the chance of remediation is better.

This is not terribly surprising. Of all the deficiencies to have, medical knowledge is the easiest to fix. And efficiency usually gets better with time and training.

So if only I could tell who had communication problems prior to matching. If only there were some standardized test on communication... oh wait...

And, if your point was that I should take people regardless of their exam scores because I can often successfully remediate that, remember that my goal is to not have to remediate anyone -- then I can put my full energy into creating a better program!

Anyway, I think there are better articles out there:
Comprehensive Assessment of Struggling Learners Referred to a Graduate Medical Education Remediation Program. - PubMed - NCBI

Looking at the PubMed search for "medical resident remediation", there appears to be a bunch of pubs by the EM folks that I have not looked at yet.

My dean's letter didn't say all great things.
Nowadays it's pretty much required to be boarded, so the test is geared towards passing everyone within reason. Anesthesia has an 80% pass rate on orals, which is quite low. But 80 percent is a decent number of people. And the ones who don't pass don't really take it seriously.

I'm not sure what knowledge base you're basing your assessment on that those who don't pass "don't take it seriously." I've graduated residents over the last 10+ years. I get a chance to work with them and watch their study habits for 3 years, see their ITE scores each year, and then their ABIM performance. Most of my residents pass. Some number have failed. A few didn't take it seriously -- "I've never failed anything before, I'm sure I'll do fine" is the usual refrain. But most take it very seriously, recognizing it's a huge problem if they fail -- and they still failed. I'm just one program, so I can't really generalize to everyone. But neither can you, and I think my experience base is likely more extensive than yours.

Regarding MSPEs, they are very hard to assess. Some schools state that they put all comments, unedited, into their MSPEs. Some pick and choose -- if something seems "out of place," they leave it out. But maybe that one person was the one telling the truth? And commonly there's a bunch of nice-sounding comments, and then one concern. And often the concern is vague and hidden: "XYZ's presentations started off as somewhat disorganized, but by the end of the rotation they were at the level expected of a medical student." What does this mean? Does it mean that the student is fine? Or did they start off terrible, and end up just barely good enough to pass? At my own school, I can tell you that either is possible with a statement like this.

What if everyone gets a 250/250 on their boards? Then what? Add another layer of exams to stratify that applicant pool further?

This is an interesting comment. It's of course very unlikely that everyone gets a 250. But if Step is changed to P/F, this is EXACTLY what happens. Everyone who doesn't fail gets a Pass, and all the "scores" are the same. If that happens, the options are to add another exam with a score (which might be specialty specific), or I'll just need to make decisions based on something else. It would mean that grades might hold more weight -- no longer could you get an HP in IM and a 250 on S2 and get an interview; now if you get an HP and a Pass on S2, you don't get an interview (because I put more weight on the grade). Note that I'm not saying this is good or bad -- all we're doing is shuffling the deck. I only have a maximum number of IV slots, and more applicants than can fit. I need to decide, in some way, whom to interview. Take away the USMLE, and I'll have to use something else. And it might be "school reputation." Or whether I know someone at your school I can call to find out the "real story." Or any number of other factors that are not under your control. Personally, I'd rather have something under your control. But, like, that's just my opinion?

Does IM do milestones? I know they’re done in Peds, but I’m not sure if it was an ACGME-wide thing or just the Peds part. Do you think if the LCME started doing milestones that schools would report them accurately? Obviously it is in the school’s best interest for the student to match well, so there’s a conflict of interest, hence my curiosity.

Yes, we do. And honestly, I think they are completely useless. It looks like Peds did a much better job with their milestone development. There are already EPAs for graduating medical students (and EPAs are supposed to "fix all the problems with milestones." Which they won't.) But ultimately what I need is some sense of how well each student did. Rather than a single scale, it would be nice to know academically (i.e., exams), clinically (taking care of patients), and "other stuff," which might be research, community engagement, administrative work, etc. (pick something, you can't do it all). But those last two are very difficult to measure and compare.
 
That's certainly fair. And I had forgotten about EPAs. I looked at them extensively last year when I was helping build a curriculum for fourth year students, but I don't remember any scales associated with them. I also haven't been in the med school environment for a number of years, so I'm not sure what the students are being told now.

In the CCCs that I've sat in on (and the ones I've been told about...), all those factors are looked at. We look at milestone performance, we look at ITE scores, and we look at evaluation comments (I guess not so much the 'other' stuff, except that they are on target to meet the graduation requirement of a scholarly project of some variety). Maybe if schools made an effort to put together an MSPE that did something similar (e.g., progress on EPAs, subjective comments from rotations, and shelf exam scores), it would be more useful to look at. I know most Peds programs offer interviews before the MSPE is out because they don't see it as super useful in the screening process, and my impression is that IM programs are the same. That indicates to me that schools need to work on the MSPE to make it functional for residencies again. But, again, they have a conflict of interest because they want their students to place well, so 'hiding' some of that information is to their benefit.

I don't know a solution, just thinking about other things.
 
If medical schools created CCCs, evaluated all students on EPAs on a scale, and then reported that to us in the MSPE, along with a distribution of where all students fell, that would be super helpful. The amount of work this would be for 100 students in a class would be enormous, and how you would actually do it is unknown.

The Med School EPA website is here, in case anyone wants to look: Publications and Presentations - Core EPAs - Initiatives - AAMC

The abridged toolkit is the easiest one to look at. The scales for assessment are at the end.

But my understanding is that these EPAs are what's expected of graduating medical students. I think we'll just be told "this student meets all EPAs," which is the same as P/F, so we are back to where we started.
 
What are people's thoughts on the emphasis on research, especially in competitive programs and specialties?
I think 1-2 research projects definitely shows commitment to the field and also allows for building connections, but I think it's being over-emphasized, especially with people taking gap years to do research. I think fields would be better off requiring people to do a sub-internship type of deal during the gap year to prove their mettle and commitment to the field, rather than wasting time doing mindless data collection and chart reviewing.

I guess my point is that not just Step 1 but other parameters of residency applications also need to be looked at, one of them being research. Third-year grades (the clinical evaluations aspect of them) are also something that I think needs to be changed, especially the idea of making them pass or fail, but that's another matter, which good test takers will like and bad test takers will dislike.
 
Upper-tier programs like to think they are training the future leaders in the field. Research is a flaming hoop you have to jump through to let them think you'll meet that standard. It also helps people identify whether they might actually want to do it long term.

I think it's way over-emphasized personally, but that's just me.
 
https://www.ajog.org/article/S0002-9378(17)32811-9/pdf
So no, you're not aware of any data to back up that claim. Intuitively it makes sense that if you *only* go by Step score you could end up with inferior physicians, but I would be really surprised if the idea that the higher scorer is the wrong choice "more often than not" held up.
 