Data: let's find out if we can learn anything from the 2019-2020 IM Applicant Spreadsheet


Lucca

sup nerds, read on for some potentially spicy data related to step 1, med school prestige, top IM residency applicants, etc (ur too cool to care about those things but lets face it you're going to post about it anyway so might as well have some graphs to point to).

so every year, residency applicants on the internet use spreadsheets to anonymously share information about interview invitations, experiences etc. Well, these spreadsheets also happen to have a wealth of information on the applicants themselves, but the information has never, to my knowledge, been presented in a particularly helpful way. But Lucca, you might say, don't we have the NRMP Charting the Outcomes data, the best possible source of residency related information available to anyone???? Yes! That's completely right. But this is SDN, where we also have endless arguments about things the AAMC (and probably no one else) doesn't care about, like how much the prestige of your med school matters. After doing some data cleaning I made up some graphs to try to extract some basic information from this snapshot of 171 anonymous IM applicants.

caveats and disclaimers:
-yes, this data is anonymous and comes from the internet so obviously grain of salt and all that
-this data (n=171) is a small fraction of the ~11,000 total US IM applicants so it's a snapshot. Think of it like shotgun sequencing a cup of water from a lake to get an idea of whats going on in the lake, but certainly not a thorough census of all life in the lake ecosystem.
-data cleaning: ppl often had funky answers to the school ranking question like Mid-Low tier (?????) so I've lumped everything into just 3 buckets: Low, Mid, Top. DO and IMG were automatically sorted into the "Low" category. "Top" was reserved for people claiming Top tier or T20. Everything else was sorted into "Mid". N/A answers from the original spreadsheet data were removed for the sake of simplicity.
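
For anyone who wants to redo the bucketing, here is a rough Python sketch of the rules in the data cleaning caveat. The exact strings people typed into the spreadsheet aren't reproduced in this post, so the accepted spellings and the `bucket_tier` name are hypothetical. Sending self-described "low tier" answers to Low is also an assumption, since the rule as written only names DO and IMG for that bucket.

```python
from typing import Optional


def bucket_tier(raw: Optional[str]) -> Optional[str]:
    """Map a free-text school ranking answer to Low / Mid / Top.

    Returns None for answers that should be dropped (N/A or blank).
    The accepted spellings below are illustrative guesses, not the
    actual values found in the spreadsheet.
    """
    if raw is None:
        return None
    answer = raw.strip().lower()
    if answer in {"", "n/a", "na"}:
        return None  # N/A rows are removed
    if answer in {"do", "img"}:
        return "Low"  # DO and IMG go straight to Low
    if answer in {"low", "low tier"}:
        return "Low"  # assumption: self-described low tier stays Low
    if answer in {"top", "top tier", "t20", "top 20"}:
        return "Top"  # Top is reserved for Top tier / T20 claims
    return "Mid"  # everything else, including "Mid-Low tier"


if __name__ == "__main__":
    for example in ["DO", "T20", "Mid-Low tier", "N/A"]:
        print(example, "->", bucket_tier(example))
```

Matching on exact strings instead of prefixes is deliberate here: something like "Top 50" should fall through to Mid rather than get swept into Top.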

Now that the boring part is over let's look at some graphs.
here's the overall step 1 distributions by school tier:

q1RjRtZ.png


obviously the folks on the IM spreadsheet tend to have pretty high Step 1s with the median floating around 240! Definitely not representative of the overall IM applicant population, but something good to keep in mind. There is also no meaningful difference between any of these distributions as far as Step 1 is concerned.


First, let's look at II to App Yield vs Step 1. Points are individual applicants. Density contours added to aid in visualization. App yield is defined as # of IIs / # of apps sent out.
UQAECoz.png


These density maps are almost completely overlapping, with a clear trend for higher yields with increasing step 1 score. This suggests that looking at all potential IM programs in aggregate, Step 1 is a better predictor of overall App Yield compared to School Ranking. Matches up with the received wisdom passed down on Reddit/SDN. Neat.
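
App yield as defined above (# of IIs / # of apps sent out) is simple enough to sketch in a few lines of Python. The function names and the example rows are made up for illustration and are not values from the spreadsheet.

```python
from collections import defaultdict
from statistics import median


def app_yield(interview_invites: int, apps_sent: int) -> float:
    """App yield = number of interview invites / number of apps sent."""
    if apps_sent <= 0:
        raise ValueError("apps_sent must be positive")
    if interview_invites < 0:
        raise ValueError("interview_invites cannot be negative")
    return interview_invites / apps_sent


def median_yield_by_tier(rows):
    """rows: iterable of (tier, interview_invites, apps_sent) tuples."""
    by_tier = defaultdict(list)
    for tier, invites, apps in rows:
        by_tier[tier].append(app_yield(invites, apps))
    return {tier: median(values) for tier, values in by_tier.items()}


if __name__ == "__main__":
    # Hypothetical applicants, NOT rows from the spreadsheet.
    example_rows = [("Low", 10, 40), ("Low", 20, 40), ("Top", 15, 20)]
    print(median_yield_by_tier(example_rows))
```

Since the denominator is apps sent, two applicants with the same number of IIs can have very different yields, which is worth keeping in mind when comparing tiers that apply to different numbers of programs.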

yKhI5gz.png


That said, it looks like applicants from higher ranked medical schools tend to send fewer applications (in some cases less than half as many) than those from lower ranked medical schools. The number of IMGs in this dataset is small enough that I don't think IMGs are skewing the low tier distribution.

Hmmm...but what happens when we zoom in on just the applicants who received an II to one of the Top 8 IM residencies as defined by the Top 100 rankings on the spreadsheet itself? Let's see what the Step 1 distributions look like for those programs as represented by applicants on the spreadsheet who reported receiving an invitation to one or more such places.

lBXqkkr.png


BWH = Brigham and Women's, MGH = Mass General, JHH = Hopkins. Only main-sites were included in this analysis, so places like UCSF-Fresno or Hopkins-Bayview were not counted with JHH or UCSF.

The Step 1 distributions for applicants reporting IIs at the T8 IM residencies are remarkably similar! Medians floating near 250 with short tails on the 25th percentile end and longer tails on the 75th percentile end, with the notable exceptions of Columbia and UCSF which appear to be more symmetrically distributed.

What does the school tier representation look like now that we've zoomed in on these so-called "top" programs?

78duBlB.png


Schools from all tiers appear to be pretty equally distributed across all T8 IM interviewees! We can more directly compare the difference in prevalence from the overall pool like this:

YgBdyTF.png


In other words, low and top tier schools are slightly overrepresented in the t8 Interviewee dataset compared to the overall IM spreadsheet applicant pool dataset; mid-tiers slightly underrepresented.
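
The over/under-representation comparison boils down to subtracting each tier's share of the overall pool from its share of the T8 interviewee group. A minimal Python sketch, with hypothetical function names and toy tier labels standing in for the real spreadsheet rows:

```python
from collections import Counter


def tier_shares(tiers):
    """Fraction of applicants in each tier."""
    counts = Counter(tiers)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one applicant")
    return {tier: count / total for tier, count in counts.items()}


def share_difference(subset, overall):
    """Subset share minus overall share for every tier in the overall pool.

    Positive = overrepresented in the subset, negative = underrepresented.
    """
    subset_shares = tier_shares(subset)
    overall_shares = tier_shares(overall)
    return {
        tier: subset_shares.get(tier, 0.0) - share
        for tier, share in overall_shares.items()
    }


if __name__ == "__main__":
    # Toy data, NOT the spreadsheet's actual tier counts.
    overall_pool = ["Low"] * 4 + ["Mid"] * 4 + ["Top"] * 2
    t8_group = ["Low"] * 2 + ["Mid"] * 1 + ["Top"] * 2
    for tier, diff in share_difference(t8_group, overall_pool).items():
        print(f"{tier}: {diff:+.0%}")
```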

Have the step 1 distributions changed for applicants receiving IIs to T8 IM programs?

tAWxR6E.png


Unsurprisingly, the distributions have jumped up on the Step 1 scale compared to the overall pool. Somewhat surprisingly, the distribution from the "Low" tier population is a bit lower than those in the mid and top tier brackets. This might challenge the oft-repeated wisdom that applicants from "lower tier" medical schools will have to score higher on Step 1 than their Mid and Top tier counterparts to break into the more elite academic IM programs. Given the similarity between these distributions, the better answer might simply be: Score high, period.

So myth debunked! there is no bias for more prestigious medical schools in IM resident selection. Well...let's not be too hasty.

What happens when we look at App Yield vs. Step 1 in the t8 IM cohort? After all, we can assume that these applicants regardless of what their medical school ranking is or isn't (or whatever their Step 1 scores are!) are the cream of the crop in many different dimensions of their CV.

fCrDTJG.png


It appears that in this cohort of so-called "elite" IM applicants, students from higher ranked schools tend to, on average, have much better app yields at equivalent Step 1 score levels compared to their counterparts at lower ranked schools.
U0GV4Xm.png


Furthermore, the difference between total number of apps sent by school tier here in the stratosphere of academic IM programs is much more pronounced.

Also, consider this:

VKCwOGf.png


It was far more common for applicants from higher ranked schools to receive interview invitations to more than one of the T8 IM programs.

Although the data is highly limited, I thought it was interesting how well even this small snapshot was able to reproduce most of what we take to be basic wisdom about the way Step 1 score and school ranking are used in residency selection.

discuss
 
Unless I'm misreading this the median step1 reported by this cohort is a ~243? That's insane. It really does seem like having a good score isn't enough if you're from a non-feeder to these programs. You need the score and an established name to net several interviews
 
I question whether Step scores are very important in the IM match at all. Check out a school like FAU, where 1/9 students match into a Top 40 IM program (Doximity) even though their average step score is in the 240s. Also FIU, where 3/18 students match into a top 40 also with decent Step scores (235+ if I recall). Compare this to a school like Pitt, where even though it doesn’t have super high Step scores, 80-90% of students match into a Top 40. It will probably be better to compare the match lists themselves for overall quality (even though everyone, even in the premed forums, are against it)

That’s what I’m saying, top residencies will take top people even if their scores aren’t as high. Got flamed in the premed section for saying as much.
 
Unless I'm misreading this the median step1 reported by this cohort is a ~243? That's insane. It really does seem like having a good score isn't enough if you're from a non-feeder to these programs. You need the score and an established name to net several interviews

Yup, I question whether Step scores are very important in the IM match at all. Check out a school like FAU, where 1/9 students match into a Top 40 IM program (Doximity) even though their average step score is in the 240s. Also FIU, where 3/18 students match into a top 40 also with decent Step scores (235+ if I recall). Compare this to a school like Pitt, where even though it doesn’t have super high Step scores, 80-90% of students match into a Top 40. It will probably be better to compare the match lists themselves for overall quality (even though everyone, even in the premed forums, are against it)
 
That’s what I’m saying, top residencies will take top people even if their scores aren’t as high. Got flamed in the premed section for saying as much.

Not true and I dont think the data here supports that.
 
The biggest confounding variable of this data set is that the majority of people with 250+ are also top 25% in their class while those under 230 are mostly bottom 25%. We all know how much AOA and clinical grades affect the IM match.

You can also look at any Top 20 school and see that over 80-90% of the students get into a Top 40 in doximity. For the Top 10, 80% of students get into a Top 20. This is saying that even a person with average clinical grades and average step scores is getting into top tier residencies at these top schools.
 
That's my point. It's more important to come from Harvard Med with a 240 than it is to have a 255 from VCU.
 
That’s what I’m saying, top residencies will take top people even if their scores aren’t as high. Got flamed in the premed section for saying as much.

That means you're right 😉

I think it's pretty clear that everything matters, and school name plays a major role
 
The biggest confounding variable of this data set is that the majority of people with 250+ are also top 25% in their class while those under 230 are mostly bottom 25%. We all know how much AOA and clinical grades affect the IM match.

You can also look at any Top 20 school and see that over 80-90% of the students get into a Top 40 in doximity. For the Top 10, 80% of students get into a Top 20. This is saying that even a person with average clinical grades and average step scores is getting into top tier residencies at these top schools.
just to add some numbers to this comment:

so for the t8 IM II recipient data most actually reported no AOA or N/A (14 N/A, 21 Non-AOA, 29 AOA // 45% with AOA overall ).
stratifying by school ranking ( n/a, non-AOA, AOA)

low: 7 / 11 / 3 (~14% with AOA)
mid: 1 / 5 / 17 (~74% with AOA)
top: 6 / 5 / 9 (~45% with AOA)*

*keep in mind that many top schools either dont give out AOA / rank their students or only give out AOA after ERAS is submitted.
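
The percentages above can be re-derived from the counts with a few lines of Python. The counts are the ones reported in this post, while the variable and function names are just for illustration. Note that the denominator includes the N/A responses, which is how the quoted percentages work out.

```python
# (N/A, non-AOA, AOA) counts per school tier, as reported above
aoa_counts = {
    "low": (7, 11, 3),
    "mid": (1, 5, 17),
    "top": (6, 5, 9),
}


def aoa_share(n_a: int, non_aoa: int, aoa: int) -> float:
    """Fraction with AOA, counting N/A responses in the denominator."""
    return aoa / (n_a + non_aoa + aoa)


shares = {tier: aoa_share(*counts) for tier, counts in aoa_counts.items()}

# Column sums give the overall (N/A, non-AOA, AOA) totals
totals = tuple(sum(column) for column in zip(*aoa_counts.values()))
overall_share = aoa_share(*totals)

for tier, share in shares.items():
    print(f"{tier}: {share:.0%} with AOA")
print(f"overall: {totals} -> {overall_share:.0%} with AOA")
```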

here's what the step 1 scores look like stratifying by rank and AOA. n are rly small here so any apparent difference between these distributions is most likely not meaningful, but showing it anyway in case anyone wanted to see it.
jumTCGX.png


it would be better to look at AOA vs. class rank vs. Step 1 with the full IM spreadsheet but I didnt clean the data for those variables in the full IM spreadsheet. Maybe if I'm bored next week.

but doing a quick tabulation takes no time at all and in the overall IM spreadsheet 21% had AOA, 48% didn't have AOA, 30% reported N/A so unsurprisingly AOA is overrepresented in the t8 IM data, non-AOA is underrepresented.
 
bumping this so these graphs have somewhere to live:

Comparing popularity of T30 or so IM programs as measured by frequency at the top of spreadsheet users' ROL vs. program ranking.

You will notice that even as soon as #3 on the ROL you lose almost any correlation with program ranking.
ool8Kcp.png

wj4O9SX.png

02v807Z.png



when u look at all in the top 3 on ROL together u can get a more complete impression of who "over" and "under" performs on ROL popularity compared to "rank". (overperform = above the 95% CI / gray-shaded area of the linear regression; vice versa for underperform).

DNtNDdZ.png
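
For the curious, here is a rough Python sketch of the over/under-perform idea: fit a straight line, build a confidence band around the fitted line, and flag points that land outside it. How the gray band in the figure was actually computed isn't stated, so this is only one plausible way to do it. It uses a normal approximation instead of the t distribution (which makes the band slightly too narrow for around 30 programs), and the function name and example numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean


def flag_against_band(xs, ys, level=0.95):
    """Least-squares fit of y = a + b*x, then label each point as
    'over', 'under', or 'within' the confidence band of the fitted line.
    """
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least 3 paired points")
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    intercept = y_bar - slope * x_bar
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    # Residual standard error with n - 2 degrees of freedom
    s = sqrt(sum(r * r for r in residuals) / (n - 2))
    # Assumption: normal quantile used in place of the t quantile
    z = NormalDist().inv_cdf(0.5 + level / 2)
    labels = []
    for x, r in zip(xs, residuals):
        half_width = z * s * sqrt(1 / n + (x - x_bar) ** 2 / sxx)
        if r > half_width:
            labels.append("over")
        elif r < -half_width:
            labels.append("under")
        else:
            labels.append("within")
    return labels


if __name__ == "__main__":
    # Made-up numbers, NOT program rankings or ROL counts from the spreadsheet.
    rank = [1, 2, 3, 4, 5]
    popularity = [1, 2, 9, 4, 5]
    print(flag_against_band(rank, popularity))
```

The band is narrowest near the middle of the x range and widens toward the ends, so a program at the extremes of the ranking needs a bigger miss to get flagged.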


one can speculate that a place like Cornell associated with Sloan Kettering may be more popular with applicants who are interested in going into heme/onc and gain an additional boost from being located in NYC but the main takeaway is not really all that surprising: program popularity is due to a variety of factors other than program reputation.
 