We all know clerkship grading is highly variable, but I hadn't seen it quantified before:

Just because you don't have normalized/z-score based clerkships at your school doesn't mean nobody else does either

I'm not even in medical school lol. I know some schools do that, and in the unlikely event I find myself in a position to choose: I will surely avoid them.

The issue is that grades, especially clerkship evals which are by definition subjective, are always going to cause these problems because of the exact reasons I've mentioned. People are always going to do things differently with grading, and as a direct result comparing them just isn't useful if there's no standard by which to do so. But program directors seem to be doing it anyway. It doesn't matter if we're talking high school, college or med school. As for how to fix it? Pass/Fail grading. I imagine the LCME could require that all rotations be pass/fail graded so just go with that I guess.

I wouldn't want to grade students on an A-F scale for the same reason I don't want to review my Uber driver. They got me where I needed to go, so should I give them 5 stars? If I don't, it's going to hurt them, so 5 stars it is. Does the Uber driver who avoided an accident when somebody ran a red light deserve only a 5-star review when 5 stars are handed out like candy? The issue with grading isn't practical, it's fundamental. We as a society need to get over ourselves and stop rating everything and each other.
 
Interested to hear @NickNaylor thoughts on this actually - would you be ok with grading per the rubric but then having the clerkship director adjust so you're even with the other evaluators?

Is having your 3 entered as a 4 by them instead of you more acceptable?

I'm not opposed to this kind of idea in principle, but it begs the question of what the purpose of a grading rubric is to begin with if it's simply going to be ignored. If there's some way to evaluate a student's actual clinical performance in a standardized way - maybe OSCEs could serve as a "check" against the provided clinical evaluations, I don't know - then I'm not opposed to it. But arbitrarily saying "this guy gives too many high passes, he's wrong" with no correlation to the actual thing being evaluated makes little sense. If there's no third-party assessment of the student, simply changing grades assumes the conclusion based on nothing. What if I'm actually right and all of the other faculty who gave the student "4s," to use your example, are wrong? How do we deal with that? Should the student's grade be corrected downward? Or is this a one-way-elevator type of situation? With enough data and enough thought put into it, I don't think that attempting to standardize scores based on a specific evaluator's quirks - both inflationary and deflationary - is unreasonable.

To answer your last question, it could be acceptable if based on something other than "this guy gives too few honors, those are rookie numbers, let's pump those up." But in a hypothetical scenario where I get to feel good about submitting 3s that are arbitrarily changed to 4s by the clerkship directors with no actual reason, no, that would not be acceptable to me. Again, you are assuming in your questions that I am grading incorrectly and that my grades are invalid - that the "true" grade, whatever that means, is higher than how I'm rating my students. What if that's actually not the case? What do we do then? If some kind of perfect system were to be implemented, would you be ok if the net result was grade deflation?
 
Medicine is subjective the whole way through from training to attendinghood. You're being analyzed in subjective, unfair, incomplete ways that can have tangible effects on your career. Better to wise up to it sooner rather than later.

If you're going to be a stiff grader as a resident or attending, the hope is that with every student you are giving them a chance at honors by building them up and giving them the feedback they need to materially improve. I've come across hardasses who also weren't keen on student engagement, and it's hard to be enthusiastic about the state of clinical grading in situations like that.
 
I again think we're on the same page here, except I think rubrics are worthless. For a rubric to be worth something, the PD reading our applications would have to reference our grades against the rubric.

We all know that doesn't happen. What does happen is that PDs compare how we did relative to our peers, and how that compares to other applicants.

So yes, right now in the modern system, I think it's wrong to grade to the rubric. I know at face value that seems like nonsense. But when the function of the grade is to compare to other med students, and not to compare against the rubric, then grades should be assigned (or adjusted) by comparison to other med students, and not the rubric.
 
So in this scenario, would grading that's based entirely on the shelf score be preferable to you all? My school does things that way, in an H/P/F setting. From what I understand, there is one score cutoff set for honors and another for pass (I don't think they adjust them based on the % of students who honor or pass or whatever). I haven't started M3 yet but have gone back and forth about how I feel about this system... On the one hand, it makes grading more objective, and I don't have to worry about an evaluator tanking my grade, beyond having their comments show up on my MSPE. On the other hand, it seems weird to totally dismiss actual performance on the wards. Thoughts from seasoned M3s?
My school does something where you need to honor both the eval and the shelf to effectively honor.
The nice thing is that you have multiple people evaluating you. So if I do get a 3 bomber it doesn't matter, because I have 4 other people who gave me 5s. I know this might be sacrilegious, but I am honestly a little more of a fan of the rotations where the program director has more control over the final grade, so even if you are short one of the 5 criteria to honor they can bump you up, or take you down, regardless of the numerical totals of the evaluations on the Likert scale.
 
At my sites, practically everyone gives out 5s. So if you get a single 3 bomber, even the other four evals won't save you. Pretty sad tbh.
 
I'm not sure what the threshold is for you to get Honors. 23/25 = 92%.
 
Fair enough, and I actually don't disagree with your perspective, though IMO simply grading students in comparison to one another would make evaluations even more subjective if there were no criteria at all with which to evaluate them.
 
One of my recent clerkships was structured such that Honors required that you be above the cohort average in all areas (evals, shelf and SPs). So yeah, if you got a 93 but the average was 95, hope you didn't intend to match that specialty...
 
According to previous histograms, our honors cutoffs let about 10-20 percent of the class honor. The cutoffs are pre-established, and if the whole cohort meets them, they all get honors; the school will just end up adjusting the cutoffs for the next cohort. It also leads to clear expectations about what the threshold is, so there are no grade appeals afterwards. It might have to do with our large class size.

Edit: I don't think this excludes people in my example, because a good chunk of our class is not gunning for honors; they just want to get by. And theoretically people are being dragged down in your example as well, because you aren't the only person getting three-bombed here. The only real risk has been getting a bunch of three bombers in a row. Even that can be mitigated if you know beforehand and can add a few more residents to the evals.
 
Bingo on the bolded. After my first clerkship my friends and I learned to game the system and scope out the good evaluators or do damage control on bad evaluators. In the worst cases - someone wanted to go into the specialty and wasn't certain their evaluator was gonna 5bomb them - it's even reached the point of explicitly telling the evaluator about the 4.5/5 average.

As long as PDs use grades to compare students against one another instead of against the rubric, that's going to be the safe route to Honors. It's let me Honor every rotation since, while I slack off more. Completely messed up system that myself and others are succeeding in for all the wrong reasons.
 
After reading through the thread so far, I had a few reflections on my own experience grading medical student secondary applications and MMI interviews.

1. As secondary application graders, we received objective reports on how our grading compared to every other grader in the pool. We received stats on our overall average grade for each question, how many times the applications we graded went to further review because of an outlying score (either ours or the other grader's), which direction our outliers tended to be, and what percentage of the time we had an outlying score. I took this role very seriously and was under the impression that I was grading to the letter of the criteria. After we received our first report, I found that I was right around the average grader on most questions, except one: in about 10% of applications reviewed, I was giving a low outlier score for that question, and my overall scores for responses to it were below average. This made me check in with that scoring criteria to make sure I was reading it correctly, because if I am the negative outlier, perhaps it is me who is interpreting the criteria incorrectly or in a biased way. Maybe reports like this would be helpful for those evaluating medical students.

2. Everyone is going to approach grading guidelines with their own bias. That is why it is important to eliminate it from criteria as much as possible. My school has a grading scale based on meeting/exceeding/falling below "expectations". Everyone is going to come to the table with a different interpretation of what expectations for a medical student are, informed by many different things. I did an away rotation and I was floored by the grading criteria when I finished - it was based on actions! Not some nebulous expectations, but the student never/sometimes/often/always doing an action. I think a grading system rooted in what the student actually does or does not do under observation is far superior.

3. I don't understand why schools try to obscure grading into 1/2/3/4/5 or 100/90/80/70/60. Just put the actual grade and let attendings use their judgement as to whether they think the student is Honors/High Pass/Pass/Fail for a given criterion, so we can all communicate in the same language. A resident who was an IMG (unfamiliar with our grading system) told me that she gave everyone 3s because her understanding was that that was a pretty great grade! Not knowing, of course, that for us that is the equivalent of a Pass/70%.

4. The criteria themselves. On one hand, grading guidelines are important to make sure students are being assessed on the things that institutions want them to be assessed on (professionalism, ability, aptitude, growth, etc.) and not on things they don’t want them assessed on. But as a grader I found that some of the institutional guidelines did not capture the full scope of student responses. There were times when students were exceptional in ways that the criteria did not give me latitude to grade them on. There were times during interviews when I had to change the order or wording of questions because it was clear the admin guidelines were not working. I was unwilling to continue to hurt students on the guidelines if the follow-up questions were not adequately getting them there. With both of these issues, I would typically put this in the student's comments and discuss it with the admin in charge, who at the very least seemed appreciative. I would hope that attendings would take the time to take issues like this up with admin, at least once. We know you don't have a dog in the fight and it's onerous and might lead to nothing, but please try for us. You have a better chance than we do at making some actual changes.
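
For anyone curious what the grader report in point 1 might look like in practice, here is a rough sketch. Everything in it is my own assumption for illustration: the `grader_report` function, the two-graders-per-application setup, and the rule that a gap of 2 or more points counts as an outlier. It is not how any actual admissions office computes it.

```python
from collections import defaultdict

def grader_report(scores, gap=2):
    """Summarize each grader's mean score and outlier rates.

    scores: list of (application_id, grader, score); assumes exactly
    two graders per application. A pair is flagged when the two scores
    differ by at least `gap` (a hypothetical threshold).
    """
    by_app = defaultdict(list)
    for app, grader, score in scores:
        by_app[app].append((grader, score))

    stats = defaultdict(lambda: {"n": 0, "total": 0, "low": 0, "high": 0})
    for pairs in by_app.values():
        for grader, score in pairs:
            stats[grader]["n"] += 1
            stats[grader]["total"] += score
        (g1, s1), (g2, s2) = pairs
        if abs(s1 - s2) >= gap:
            # Record which grader was on the low side and which on the high side.
            low, high = (g1, g2) if s1 < s2 else (g2, g1)
            stats[low]["low"] += 1
            stats[high]["high"] += 1

    return {
        g: {
            "mean": s["total"] / s["n"],
            "low_rate": s["low"] / s["n"],
            "high_rate": s["high"] / s["n"],
        }
        for g, s in stats.items()
    }

# Made-up example data: grader A tends to be the low outlier.
scores = [
    (1, "A", 2), (1, "B", 4),
    (2, "A", 4), (2, "B", 4),
    (3, "A", 3), (3, "C", 5),
    (4, "B", 4), (4, "C", 5),
]
report = grader_report(scores)
```

A report like this doesn't say who is right, only who is consistently on one side of the disagreements, which is about all the feedback I described was doing.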
 
Honestly I feel like a lot of people should get honors and a lot pass and a small percentage fail if they’re a sociopath or something.

A lot of people will think someone is worse just because they don’t have a personality similar to their own, and they misconstrue that to mean the person isn’t good. I just honor everybody because the system sucks, and just because I wouldn’t hang out with you after work doesn’t mean I should stand in the way of your specialty choice. The 3rd year goal should be to figure out what you wanna do with your life, and objective stuff should drive the rest. Tests aren’t perfect but at least they're objective.

Yes and no. I've given plain passes to people (never failed anyone), but you have to really suck to get a plain pass from me. In one particular case, the student routinely showed up late, always tried to leave early, never went to see his own patients unless I told him to, tried to get out of doing notes, and even challenged me on a treatment plan (and by challenge, I mean instead of writing his note on a different patient, he began asking other attendings what they would do. Fortunately, none of my colleagues entertained his nonsense/splitting behavior). His knowledge base was actually decent, but his performance was unbelievably bad and that matters. I think some of my colleagues may have failed him just for being a tool.
 
Agreed. You need to standardize yourself as a grader. I give a lot of high passes just for being competent. True stars get honors from me and I've had a few. There was one student who was planning to go into general surgery, but on her psych rotation with me, she was an absolute star. She'd get there early without being told to (it's psych, we're chill and I don't care if you show up after me), would read up on all her patients and all the patients I'm following alone, would look up things on UpToDate and when I'd come in, she'd say something like "you probably know this, but in case you don't, I found literature on so and so." Usually, it would be stuff I already knew and when I told her that, she'd still read up on it for her own knowledge. She would study at night and come in with legitimate questions the next day (not just questions she'd ask to feign interest). I'd often let her leave early, but she never once asked or assumed she could. She was there to get the work done and I learned quickly to trust her interview and exam because she was so thorough and she knew her stuff. Her notes lacked a lot of detail we expect in psych, but that's a minor thing that's easily corrected and she was great at receiving and implementing feedback. If I could have given her triple honors, I would have.

There were less impressive students I also gave honors to. Basically, I'm looking for someone who is willing to learn and is decent to be around. If you're acting like I'm sending you to the death chamber by asking you to see a patient or stay past 5 pm to complete a late consult, chances are that won't be received well. Personality matters.
 
Yea obviously if they’re crazy it’s on them. Most people are not like this though.
 
Since the point hasn’t been made yet, but is well documented in the literature and was the motivating rationale for at least UCSF to switch to P/F clerkship grading, I'm going to post these papers on how current clerkship grading schemes also disproportionately disadvantage URMs:


and before anyone says some whack stuff about Step 1 scores or something here’s a study from UWash that accounts for that:



This is of course not an exhaustive literature review, but when your grade is better predicted by your rotation site or ethnicity than by any academically relevant factor, obviously there’s a big problem.

I don’t think P/F would work for most schools, but one could theoretically develop an evaluation system that both provides meaningful differential feedback, performance review AND doesn’t rely on producing a one-dimensional output like a “letter” grade.

Don’t want to go into it, but plenty of literature in psych, business, education journals on performance evaluation to back that up. We would first have to start with a hard cap on residency applications or no one will have the time to properly apply these methods.
 
Interesting, unless I'm reading this incorrectly, their findings include Female > Male as well as White = URM > ORM (non-URM minority).

So the most disadvantaged person in their clerkship grading would be...an Asian male? Surprising.
 
In the multivariate model for UW that is right, but it's worth commenting that variation comes from many things that most of us would consider irrelevant (clerkship site, which rotation, etc.; relevant odds table attached).

[Attachment: odds table from the UW multivariate model]


I would caution against *over*-interpreting this because of the variables involved and the fact that this is a single institution (e.g. in the UCSF paper their specific findings are a bit different, although the conclusion is essentially the same, and inter-institutional differences are difficult to account for in this kind of study). The bigger takeaway is essentially the same as in OP: a variety of subjective biases (likely many of them unconscious) and uneven practices lead to high variability in clerkship grading, reducing its effectiveness as a metric for both med students and PDs.
 
The Canadian system is looking better and better.
 
can that even be implemented in US

Sure? These aren’t laws of nature. The question is whether the levers are in place to pressure the people with the power to change these things into actually doing so. Step 1 going P/F is a big source of pressure because PDs will be hungry for ways to differentiate applicants. Again, it’s a similar choice as before: do nothing and things continue to not make any sense, or find something new and try it in spite of uncertainty.
 
I could make a Grade Adjuster 2000 in excel in about ten minutes. There's no excuse not to do this. Glad to hear it's common practice at your school.
Sorry, I know this quote is from page 1 and is a bit old. It's not so easy to do this, at least not well. If a grader gave everyone Honors, would you be OK with changing all of their grades to High Pass? Because mathematically that's what you should do. Or does your adjuster only adjust upwards, at which point we should probably just give everybody Honors. Which is pointless.
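
To make the "mathematically that's what you should do" point concrete, here is a minimal sketch of one naive adjuster. The `center_on_pool` function and the numbers are made up for illustration, and it assumes the adjustment is simply shifting each grader's scores so that their average matches the pooled average. A real scheme could be built differently (a z-score version, for example, would not even be defined for a grader who gives everyone the same score, since their standard deviation is zero).

```python
from statistics import mean

def center_on_pool(scores_by_grader):
    """Shift each grader's scores so that grader's mean equals the pooled mean."""
    pooled = mean(s for scores in scores_by_grader.values() for s in scores)
    return {
        grader: [round(s - mean(scores) + pooled, 2) for s in scores]
        for grader, scores in scores_by_grader.items()
    }

# Hypothetical data on a 5-point scale.
raw = {
    "lenient": [5, 5, 5, 5],  # gives everyone the top score
    "strict": [3, 4, 3, 4],
}
adjusted = center_on_pool(raw)
# lenient -> [4.25, 4.25, 4.25, 4.25]
# strict  -> [3.75, 4.75, 3.75, 4.75]
```

Every student of the lenient grader gets pulled down, whether or not they earned the 5, and half of the strict grader's students now outrank them. That is the symmetric version; an upward-only version just inflates everything.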
Still not sure why the rubric isn't adjusted relative to MS3 performance. Like why is a 5 set to be the level of a resident?
I agree, but can tell you with more than 20 years of experience, no matter what you put on a 5 point rubric, everyone gets 4 or 5 with a scattering of 3's.

I agree the situation is frustrating. Clinical evals are not well suited to summative evaluations. In residency everything is formative -- there is no "grade", we just tell you what you're doing well and what you need to work on. But you need some sort of summative evaluation as a student because, somehow, residency programs need to pick their residents. It's a competitive process -- if grades were to go away (i.e. P/F), then something else would need to replace it.

One thing that we now do in our program is group assessments -- faculty get together once a month and review all of the residents who were on service for the month. We get much more useful information this way -- faculty might be worried about submitting "a problem" to our evaluation system, but they are quite willing to talk about it in a group, and if others disagree then we let it drop at that.
 
Doesn't have to be that complicated. Just select that cell reporting "=AVERAGE" and change it to "=MEDIAN". Boom, now the lone 3-bomber isn't tanking anyone anymore.

If the result of that change is that everyone has 5/5 Honors, so be it. Congrats, your school is producing nothing but impressive students. Ignoring low-outlying evaluators or rotation sites to protect a fake distribution is not a better option.
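
Same idea outside of Excel, as a quick sketch using the four-5s-and-one-3 example from earlier in the thread (the scores are illustrative, not real data):

```python
from statistics import mean, median

# Five evaluations on a 5-point scale: four 5s and one "3 bomber".
evals = [5, 5, 5, 5, 3]

avg = mean(evals)    # 4.6: the single 3 drags the average down
med = median(evals)  # 5: the single 3 has no effect on the median
```

The median only moves once low scores make up half of the evals or more, so it ignores a lone outlier in either direction, not just low ones.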
 
Thanks for your perspective @NickNaylor. Good to hear from a freshly minted attending who was a M3 not too long ago. Also nice to see you around these parts after all these years.

That being said, when you were a M3,

1) Did you think your clinical clerkship evals were fair?
2) Did you feel that the weight assigned to your shelf grades was fair? I remember you mentioning years ago that you felt that they were too subjective, with too narrow standard deviations.

3) What are your thoughts on transitioning to P/F clinical grades now that you're an attending and evaluate applicants? How do you view applicants to your program coming from Top 20 schools w/ P/F clinical grading such as Yale? For instance, our former IM PD would throw out Yale applications for that reason since he felt that they weren't quite as clinically strong as others.
 
I switched one of our rotations to a different site so I could avoid a harsh grader. Had to commute an hour and a half every day. My other classmates who didn't think ahead got stuck with this particular preceptor and it's now reflected on their MSPE (2 out of 3 got horrible comments). I'm hoping the school will take care of it before apps go out, but it's a crappy situation nonetheless.

I'm not sure what the solution is. I'm just glad I'm done with the nightmare that is 3rd year.
 
There's one more curse to grading: Even if you get good grades, you always keep doubting yourself because the process is so subjective and your evals and shelf scores come down to luck.

For instance, I honored IM and Peds and scored 90%+ on both shelves. But my clinical evals in IM were purely a function of the luck of the draw and in getting phenomenally nice attendings. In Peds, the order of my subrotations was set up in such a way (due to luck) that I started out in a specialized area and then ended on the broadest area. This was to my advantage since as my knowledge grew through studying, so did my apparent competency to evaluators. My friends on that same Peds rotation with the inverse subrotation schedule as me got absolutely hammered.

My shelf scores are purely due to knowing how to "hack" the logic of a test and recognizing buzzwords from UWorld. Whenever I'm working up patients, I am painfully slow because I keep doubting myself on my differential and whether or not I missed something in the chart. I feel that I can easily nail MC questions, but struggle when I have a live patient in front of me to think of a Ddx based on common and likely things rather than weird obscure stuff. I'm freaking out atm as I am now transitioning into 4th year and have audition SubI's coming up, because I simply don't feel confident in my abilities and feel like I'm a fraud.
 
It feels like across schools, 3rd year = finding the right attendings to get as many honors as possible.
 
Part of the third year game is learning how to read your evaluators. Did I just get a vibe that this person is going to three-bomb me or screw me on the eval? OK, I’m going to select another person where I didn’t get that vibe. Sometimes you don’t jibe with people or you don’t shine in front of people. The key to third year is realizing both situations and trying to mitigate them with enough good impressions.

There are absolutely instances where I got threes and another person in my cohort got 5s from the same preceptor. There wasn’t really much performance difference between the two of us; the other person just had better rapport with the preceptor.
Interesting, unless I'm reading this incorrectly, their findings include Female > Male as well as White = URM > ORM (non-URM minority).

So the most disadvantaged person in their clerkship grading would be...an Asian male? Surprising.
No, URMs were worse off.
“Female participants, younger students, and those with higher USMLE Step 1 scores and final clerkship exam scores consistently received higher final clerkship grades.”

The racial disparities decreased after adjusting for Step 1 but didn’t completely disappear. However, they did not simultaneously adjust for clerkship grade and Step 1 in the analysis of race.

I wonder what my grades would have been if I was white, younger and female.

Edit: I meant clerkship Exam instead of clerkship grade.
 
Doesn't have to be that complicated. Just select that cell reporting "=AVERAGE" and change it to "=MEDIAN". Boom, now the lone 3-bomber isn't tanking anyone anymore.
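
A quick sketch of why that swap helps, using made-up numbers. The scores, the 1-5 scale, and the helper name `summarize` are all assumptions for illustration, not anything from a real grading sheet:

```python
from statistics import mean, median

def summarize(scores):
    """Return (mean, median) for a list of evaluator scores."""
    return mean(scores), median(scores)

# Hypothetical evals on a 1-5 scale: four evaluators give 5s, one gives a 3.
scores = [5, 5, 5, 5, 3]
avg, mid = summarize(scores)

print(f"mean={avg}, median={mid}")  # mean=4.6, median=5
```

The median only shrugs off low graders while they are a minority of a student's evaluators; if half or more of the evals are low, it moves too.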

If the result of that change is that everyone has 5/5 Honors, so be it. Congrats, your school is producing nothing but impressive students. Ignoring low-outlying evaluators or rotation sites to protect a fake distribution is not a better option.

We agree to disagree, then. Everyone getting honors is another huge problem. If this became widespread, then grades would be ignored and we would select based upon something else.
 
Perhaps. But it is well known that schools giving out mostly honors are helping their students since most PDs don't bother to adjust for % of students getting honors.
 
Everyone getting Honors is problematic because you aren't stratifying. Everyone getting Honors except those with the bad luck to get a certain site/preceptor is equally problematic, because it means you are stratifying by chance instead of by real performance differences.

What do you disagree with in that statement? Because if the latter really feels like a better option to you just because it generates you a range of grades, then so should rolling dice or drawing names out of a hat.
 
I just want to point out how insane this situation is for surgical applicants in particular. The handful of failed GS, uro, ortho, and ENT matches at my school that I’ve talked to over the past few years had perfect apps except for high passes on their surgery rotations secondary to evals or shelf. Funnily enough, most of them SOAPed into another surgical specialty.

It’s to the point where people don’t even consider themselves competitive surgical applicants with 250+ Step 1s and don’t want to commit to their field until 10 weeks after they’ve finished their surgery rotation, when evals come out.
 
Thanks for your perspective @NickNaylor. Good to hear from a freshly minted attending who was a M3 not too long ago. Also nice to see you around these parts after all these years.

That being said, when you were a M3,

1) Did you think your clinical clerkship evals were fair?
2) Did you feel that the weight assigned to your shelf grades was fair? I remember you mentioning years ago that you felt that they were too subjective with too narrow standard deviations

3) What are your thoughts on transitioning to P/F clinical grades now that you're an attending and evaluate applicants? How do you view applicants to your program coming from Top 20 schools w/ P/F clinical grading such as Yale? For instance, our former IM PD would throw out Yale applications for that reason since he felt that they weren't quite as clinically strong as others.

1) At the time I didn’t, but in retrospect I think that’s just because I had no idea what I should be doing or what I was being graded on. I didn’t receive clear criteria for evaluations, what was expected with respect to shelf performance, etc. - the only information we got was how grades were determined proportionally (e.g., 70% evals, 30% shelf grade). I never saw clearly described criteria like we get for the students at my current institution. Maybe they existed and I just wasn’t aware of them - I’m not sure - but knowing how I was being graded would’ve been helpful. Just for the record, I generally did well on my rotation evals, so it’s not like I’m taking a “good for me but not for thee” position here.

2) I thought that the shelf was dumb and I still think it’s dumb. I generally honored evals but did poorly on shelf exams. My experience was that the shelf exams simply brought my overall clerkship score down. That personal experience and attendant bias aside, I think it’s absurd that a single exam can account for a substantial portion of a clerkship grade compared to the actual time spent, you know, doing clerkship. And from the perspective of someone trying to evaluate residency applicants, the shelf exam has extremely limited utility, and comments about a student’s performance are much more valuable. I recognize the need for an “objective” and “standardized” assessment, but in my own opinion this should simply be a P/F type of thing.

3) I have no problem transitioning to P/F grades because grades are nearly useless because of how inflated they are. When 30-40% of people on a clerkship are getting honors, 40-50% are getting high pass, and the rest pass, using a grade as a general, summative assessment of a student’s performance is pointless. I agree with @aProgDirector in that formative assessments - which is what you primarily see in the MSPE - are much more valuable. I would fully support a transition to P/F grades because it can’t possibly be any more useless than the current system.
 
2) I thought that the shelf was dumb and I still think it’s dumb. I generally honored evals but did poorly on shelf exams. My experience was that the shelf exams simply brought my overall clerkship score down. That personal experience and attendant bias aside, I think it’s absurd that a single exam can account for a substantial portion of a clerkship grade compared to the actual time spent, you know, doing clerkship. And from the perspective of someone trying to evaluate residency applicants, the shelf exam has extremely limited utility, and comments about a student’s performance are much more valuable. I recognize the need for an “objective” and “standardized” assessment, but in my own opinion this should simply be a P/F type of thing.

My school sets pretty reasonable cut-offs for the shelf exams to get Honors. You need 78 on Surgery or 80 on IM to be eligible, then the rest is determined by your evals. I think this works well, testing baseline knowledge without having students feel pressured to go home and study all the time. I know some schools set crazy expectations like >90th percentile to be eligible which I can see as being disruptive to clinical education.
 
We agree to disagree, then. Everyone getting honors is another huge problem. If this became widespread, then grades would be ignored and we would select based upon something else.

But what's wrong with everyone who earns honors (based on pre-determined metrics) getting honors? You can certainly make it harder to earn honors, but if everyone does (and chances are if you make it harder, not everyone will meet that threshold), I don't see a problem with giving them all honors.
 
I think he's talking about across the med student population. That would make differentiation based on clinical performance impossible.
 
I just want to point out how insane this situation is for surgical applicants in particular. The handful of failed GS, uro, ortho, and ENT matches at my school that I’ve talked to over the past few years had perfect apps except for high passes on their surgery rotations secondary to evals or shelf. Funnily enough, most of them SOAPed into another surgical specialty.

It’s to the point where people don’t even consider themselves competitive surgical applicants with 250+ Step 1s and don’t want to commit to their field until 10 weeks after they’ve finished their surgery rotation, when evals come out.
I agree that there are countless problems with clerkship grades but this is kind of a far leap in logic. I really don't think anybody's going unmatched solely because they got an HP instead of an H in surgery.
 
I think he's talking about across the med student population. That would make differentiation based on clinical performance impossible.

But I don't think it would make it impossible. You could make it harder: for instance, to get honors, you need to get an 85 on the shelf and score in the top percentile of your classmates in these particular areas. That would still ensure that not everyone got honors, but the best of the best got honors, no matter how many of them there were. Not everyone is going to score an 85 and not everyone is going to score in the top percentile of the metrics, whatever they may be. That would take at least some of the subjectivity out of the grading.
 
I don't disagree. aPD was responding to efle saying that in the hypothetical event that literally everyone or most got honors, you would not be able to compare applicants practically. Even if you removed the 3-bombers, the majority still wouldn't get honors at most schools. That's why I would agree with @efle's proposal. The only people that would get screwed are the ones that screwed themselves, not the ones that refused to play the game out of principle.
 
3) I have no problem transitioning to P/F grades because grades are nearly useless because of how inflated they are. When 30-40% of people on a clerkship are getting honors, 40-50% are getting high pass, and the rest pass, using a grade as a general, summative assessment of a student’s performance is pointless. I agree with @aProgDirector in that formative assessments - which is what you primarily see in the MSPE - are much more valuable. I would fully support a transition to P/F grades because it can’t possibly be any more useless than the current system.

Thanks Nick - completely agree w/ you here
 
I just want to point out how insane this situation is for surgical applicants in particular. The handful of failed GS, uro, ortho, and ENT matches at my school that I’ve talked to over the past few years had perfect apps except for high passes on their surgery rotations secondary to evals or shelf. Funnily enough, most of them SOAPed into another surgical specialty.

It’s to the point where people don’t even consider themselves competitive surgical applicants with 250+ Step 1s and don’t want to commit to their field until 10 weeks after they’ve finished their surgery rotation, when evals come out.
My ortho roomie was losing his mind over a High Pass in Medicine. Not even Surgery, Medicine! Apparently a lot of big names want to see Honors on both now? I guess when you have 100 applicants per seat and most of them rocking a 250+, you can put all the additional filters in place and still have more than enough to interview.

1) At the time I didn’t, but in retrospect I think that’s just because I had no idea what I should be doing or what I was being graded on. I didn’t receive clear criteria for evaluations, what was expected with respect to shelf performance, etc. - the only information we got was how grades were determined proportionally (e.g., 70% evals, 30% shelf grade). I never saw clearly described criteria like we get for the students at my current institution. Maybe they existed and I just wasn’t aware of them - I’m not sure - but knowing how I was being graded would’ve been helpful. Just for the record, I generally did well on my rotation evals, so it’s not like I’m taking a “good for me but not for thee” position here.
Knowing the criteria makes it even easier to Honor and even less satisfying to do so. I've also been doing well but I know it's 100% because I'm asking the right people for evals, not because I'm a better clerk. It's dumb, and I'm hype to see Pass/Fail clerkships catching on at the big names that can afford to experiment with it.

I don't disagree. aPD was responding to efle saying that in the hypothetical event that literally everyone or most got honors, you would not be able to compare applicants practically. Even if you removed the 3-bombers, the majority still wouldn't get honors at most schools. That's why I would agree with @efle's proposal. The only people that would get screwed are the ones that screwed themselves, not the ones that refused to play the game out of principle.
My school weights their absurdly inflated evals 3x heavier than the shelf, and has no minimum shelf cutoffs. Unless you're scoring absolute crap on shelves, like consistently bottom quartile, the 3-bomber landmines are the only thing between you and the H.
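
To put rough numbers on that, here is a minimal sketch of a 3:1 eval-to-shelf weighting. Everything in it is an assumption for illustration: the function name `composite`, the 1-5 eval scale, the conversion to a 0-100 scale, and the example scores. None of it is my school's actual formula:

```python
def composite(eval_avg, shelf_score, eval_weight=3, shelf_weight=1, scale_max=5):
    """Weighted clerkship score on a 0-100 scale.

    Assumes eval_avg is the mean evaluation on a 1..scale_max scale
    and shelf_score is already on a 0-100 scale.
    """
    eval_pct = eval_avg * 100 / scale_max
    total_weight = eval_weight + shelf_weight
    return (eval_weight * eval_pct + shelf_weight * shelf_score) / total_weight

# Hypothetical student with a mediocre shelf score of 70.
all_fives = composite(eval_avg=5.0, shelf_score=70)  # evals 5, 5, 5, 5
one_three = composite(eval_avg=4.5, shelf_score=70)  # evals 5, 5, 5, 3

print(all_fives, one_three)  # 92.5 85.0
```

With the shelf held fixed, the single 3 costs 7.5 points here, which is more than a 20-point swing on the shelf would cost under the same weights.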
 
Everyone getting Honors is problematic because you aren't stratifying. Everyone getting Honors except those with the bad luck to get a certain site/preceptor is equally problematic, because it means you are stratifying by chance instead of by real performance differences.

What do you disagree with in that statement? Because if the latter really feels like a better option to you just because it generates you a range of grades, then so should rolling dice or drawing names out of a hat.
My practical experience at my school is that the grades in the Medicine clerkship, the only thing I really have any good assessment of at all, seem relatively accurate. The best students, the ones I'd really want to have in our program, get Honors. The perfectly-fine-but-not-stellar students get HP. The not-very-good-I-hope-they-go-somewhere-else students also get HP. The really-terrible-I-wouldn't-wish-them-on-anyone students get Pass. Perhaps your school is different. Which one of us is an outlier is unknown.

My ortho roomie was losing his mind over a High Pass in Medicine. Not even Surgery, Medicine! Apparently a lot of big names want to see Honors on both now? I guess when you have 100 applicants per seat and most of them rocking a 250+, you can put all the additional filters in place and still have more than enough to interview.
If all clerkship grading went away, then what exactly would your roomie have to stand out from the crowd?
 
I’m just thankful that psych at my school is fairly easy and people do quite well if they want to and put in the work. The true learning of a specialty happens in residency, and I couldn’t imagine getting someone who grades disproportionately harder, which would drive my interest away and hurt my prospects. Not that I’m interested in psych for sure, but it’s a matter of principle.

The right thing to do is to collaborate with other faculty and agree and abide by an equitable distribution, but I’m sure that’s a hassle so why bother?

Therein lies the problem
 