Deconstructing the US News Rankings: statistics gurus click here

MacGyver

Membership Revoked
Aug 8, 2001
Pre-emptive strike: Yes, I know I've got a lot of time on my hands to be worried about this stuff, so please don't waste everybody's time with a bunch of "you've got too much time on your hands" BS. Humor me on this.

I have downloaded the US News data for the 60 schools they published. I'm trying to recreate their rankings in an Excel spreadsheet. There are a total of 119 schools included in the rankings, but since US News doesn't publish data on all 119, that's going to be one source of error between my rankings and theirs. How big an error that introduces, I don't know.

This Excel spreadsheet includes the data published by US News, as well as a section of columns on the right side with data generated by me, attempting to simulate the US News ranking methodology.

You can download the Excel spreadsheet here (click the link below, then click the "US News Rankings" link; the spreadsheet will open. You can then save the file to your computer using "Save As" under the "File" menu.)

http://macgyver25.freewebspace.com/USnews.htm

Here's a review of the US News methodology:

http://www.usnews.com/usnews/edu/grad/rankings/about/05med_meth.php

The 125 medical schools fully accredited by the Liaison Committee on Medical Education plus the 19 schools of osteopathic medicine fully accredited by the American Osteopathic Association were surveyed for the ranking of research medical schools; 119 schools provided the data needed to calculate the research rankings based on the indicators used in the research model. The same medical and osteopathic schools were surveyed for the primary-care ranking; 119 schools provided the data needed to calculate the primary-care ranking. Both rankings are based on a weighted average of seven indicators, six of them common to both models. The research model factors in research activity; the primary-care model adds a measure of the proportion of graduates entering primary-care specialties.

Quality assessment (weighted by .40): Peer assessment surveys were conducted in the fall of 2003, asking medical and osteopathic school deans, deans of academic affairs, and heads of internal medicine or the directors of admissions to rate program quality on a scale of "marginal" (1) to "outstanding" (5). Survey populations were asked to rate program quality for both research and primary-care programs separately on a single survey instrument. The response rate was 56 percent. A research school's average score is weighted .20; the average score in the primary-care model is weighted .25. Residency program directors were also asked to rate programs using the same 5-point scale on two separate survey instruments. One survey dealt with research and was sent to a sample of residency program directors in fields outside primary care including surgery, psychiatry, and radiology. The other survey involved primary care and was sent to residency directors in those fields. The response rate for those sent the research survey was 28 percent. The response rate for those sent the primary-care survey was also 28 percent. Residency directors' opinions are weighted .20 in the research model and .15 in primary care.

Research activity (.30 in research model only): Activity was measured as the total dollar amount of National Institutes of Health research grants awarded to the medical school and its affiliated hospitals, averaged for 2002 and 2003. An asterisk indicates schools that reported only research grants to their medical school in 2003.

Primary-care rate (.30 in primary-care model only): The percentage of M.D. school graduates entering primary-care residencies in the fields of family practice, pediatrics, and internal medicine was averaged over 2001, 2002, and 2003.

Student selectivity (.20 in research model, .15 in primary-care model): This includes three components, which describe the class entering in fall 2003: mean composite Medical College Admission Test score (65 percent), mean undergraduate grade-point average (30 percent), and proportion of applicants accepted (5 percent).

Faculty resources (.10 in research model, .15 in primary-care model): Resources were measured as the ratio of full-time science and clinical faculty to full-time M.D. students in 2003.

Overall rank: The research-activity indicator had significant outliers; to avoid distortion, it was transformed using a logarithmic function. Indicators were standardized about their means, and standardized scores were weighted, totaled, and rescaled so that the top school received 100; other schools received their percentage of the top score.
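That recipe can be sketched in a few lines of Python. This is a guess at the pipeline, not US News's actual code: it assumes a plain z-score standardization, a natural log, and that a lower acceptance rate counts as better (the sign flip is my own assumption; US News publishes none of these details).

```python
import numpy as np

def research_model_scores(peer, residency, nih, mcat, gpa, accept, faculty):
    """Sketch of the research-model recipe above. Inputs are 1-D arrays
    over schools. Assumes plain z-scores, a natural log, and that a
    lower acceptance rate is better; US News specifies none of this."""
    def z(x):
        x = np.asarray(x, dtype=float)
        return (x - x.mean()) / x.std()       # population SD, like Excel STDEVP

    raw = (0.20 * z(peer)                     # peer assessment
           + 0.20 * z(residency)              # residency directors
           + 0.30 * z(np.log(nih))            # log-transformed NIH dollars
           + 0.13 * z(mcat)                   # 65% of the .20 selectivity weight
           + 0.06 * z(gpa)                    # 30% of .20
           + 0.01 * z(-np.asarray(accept, dtype=float))  # 5% of .20, flipped
           + 0.10 * z(faculty))               # faculty/student ratio
    return 100 * raw / raw.max()              # top school rescaled to 100
```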
I've tried to follow the steps they've laid out, but I'm doing something wrong, because I'm not getting the same rankings they are. I think most of it has to do with the logarithm transformation they mention. I don't know whether this is a simple log(x) or ln(x), or whether there's a scaling factor or a more complex "logarithmic" function involved. I've thought about emailing US News, but I doubt they would release this trade secret.

All you statistics/excel gurus, please download the sheet, check my numbers (right side of the worksheet shaded in blue) and tell me what I'm doing wrong.

Here's a detailed breakdown:

1) First, I took the research funding and applied a log(x) function to it. This is the step with the most uncertainty in my mind, because US News doesn't describe it in sufficient detail.

2) I used the Excel STANDARDIZE function, along with the AVERAGE and STDEVP functions, to standardize each indicator. STANDARDIZE returns negative values for numbers far enough below the mean. I'm not sure if that's a problem or not. US News says they "standardized indicators about their means," so I'm not sure there's enough detail there to reconstruct what they are doing.

3) After standardizing, I applied the weights given by US News.

4) After weighting, I totaled all indicators into a raw score. My raw scores don't agree with US News. My ranking is as follows:

Harvard
JHU
WUSTL
UPenn
Duke
UCSF
UMich
Columbia
UWashington
Stanford
Yale
Cornell
Baylor
UCLA
Mayo
Vanderbilt
UPittsburgh
UCSD
UTSW-Dallas
Emory
Case-Western
UNC
Northwestern
UChicago
Mount-Sinai

This doesn't jibe with US News, so something is wrong with my math. Download the worksheet, check it out, and let me know what I'm doing wrong.
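Rather than eyeballing the two lists, it can help to measure how far each school moved between my ranking and the official one; the schools that drift furthest point at the mishandled indicator. A hypothetical helper (not part of the spreadsheet):

```python
def rank_displacement(mine, official):
    """For each school, how far my rank is from the official one
    (positive = I ranked the school too high). The two lists must
    contain the same school names."""
    official_pos = {school: i for i, school in enumerate(official)}
    return {school: official_pos[school] - i for i, school in enumerate(mine)}
```

If, say, the big research spenders all show large positive displacements, that would finger the log transform.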
 

canada

Feb 18, 2004
Not sure if this has an effect, but on the very right you list the assessment scores at 0.2 each, for a total of 0.4, yet for selectivity you let the subcomponents add up to 1.0 instead of the total 0.2.
 

MacGyver

Originally posted by canada
Not sure if this has an effect, but on the very right you list the assessment scores at 0.2 each, for a total of 0.4, yet for selectivity you let the subcomponents add up to 1.0 instead of the total 0.2.
The methodology says that the overall selectivity is weighted 0.2. Of that 0.2, 65% is MCAT, 30% is GPA, and 5% is acceptance rate.

These are the weights I calculated:

Peer review: 0.2
Residency directors: 0.2
Research money: 0.3
Faculty: 0.1
MCAT: 0.65*.2 = 0.13
GPA: 0.30*.2 = 0.06
Accept rate: 0.05*.2 = 0.01

Those numbers should add up to 1.0, and as long as they do, then I think the weighting is correct.
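That sanity check takes only a couple of lines:

```python
# Research-model weights, with the selectivity subweights folded in
weights = {
    "peer": 0.20, "residency": 0.20, "research": 0.30, "faculty": 0.10,
    "mcat": 0.65 * 0.20,    # 0.13
    "gpa": 0.30 * 0.20,     # 0.06
    "accept": 0.05 * 0.20,  # 0.01
}
assert abs(sum(weights.values()) - 1.0) < 1e-9
```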
 

AlternateSome1

Aug 2, 2002
Attending Physician
Ok, this is a guess after a pretty quick look at your stats.

Rather than making research equal 30% of the total decision, you have multiplied the research points by .3 and just tacked those points onto the final raw score. I think you need to establish the highest in each category, mark that as 100% of that category (for instance, Harvard would have .3 points from research), then weight the others based on what percent they have compared to the top scoring school.

~AS1~
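A sketch of the scheme AS1 describes: the best school in each category gets the full category weight and everyone else a proportional share. The reflection for lower-is-better categories (like acceptance rate) is my own guess, since the suggestion only covers higher-is-better numbers.

```python
import numpy as np

def percent_of_top(values, weight, higher_is_better=True):
    """Give the best school the full category weight and every other
    school a proportional share of it."""
    x = np.asarray(values, dtype=float)
    if not higher_is_better:
        x = x.max() + x.min() - x   # reflect so the lowest value scores best
    return weight * x / x.max()
```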
 

MDEntropy

Mar 18, 2004
Attending Physician
Slight twist to Alternate

I think you need to find the mean and standard deviation for each category. Then you need to come up with a percentile rank for each school within each category (standardization; Excel can do this) before applying the weight for that category.
 

exmike

May 19, 2003
Fellow [Any Field]
Perhaps US News uses the MCAT and GPA to calculate selectivity, then uses selectivity as a single variable. That would probably have some effect.
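That idea would look something like this (the sign flip on acceptance rate is again my own assumption):

```python
import numpy as np

def z(values):
    x = np.asarray(values, dtype=float)
    return (x - x.mean()) / x.std()

def selectivity(mcat, gpa, accept):
    """Blend the three selectivity components into one variable first,
    then standardize the blend before applying the overall .20 weight."""
    composite = (0.65 * z(mcat) + 0.30 * z(gpa)
                 + 0.05 * z(-np.asarray(accept, dtype=float)))
    return z(composite)
```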
 

MacGyver

Originally posted by AlternateSome1
Ok, this is a guess after a pretty quick look at your stats.

Rather than making research equal 30% of the total decision, you have multiplied the research points by .3 and just tacked those points onto the final raw score. I think you need to establish the highest in each category, mark that as 100% of that category (for instance, Harvard would have .3 points from research), then weight the others based on what percent they have compared to the top scoring school.

~AS1~
I see what you are saying.

If I'm understanding your setup right, a school that had the highest score in EVERY category (e.g., research, MCAT, GPA, etc.) would have a score of 1.0.

Therefore, Harvard would have a perfect 0.3 score in the research funding category, whereas Mayo, with its faculty/student ratio of 11, would have a perfect 0.1 score in the faculty resources category.

I tried this idea, but the rankings are still off.

check out this version of the rankings (version #2) here:

http://macgyver25.freewebspace.com/USnews.htm
 

MacGyver

Originally posted by exmike
perhaps US News uses the MCAT and GPA to calculate the selectivity, then uses the selectivity as a single variable. that would probably have some effect.
Good point. I did what you suggested, and now the top 10 schools match the official US News ranks. There are still lots of discrepancies among the other ranks, though, so it's not there yet.

Apparently there are other things going on that don't match what US News does.
 

AlternateSome1

Ok, two more suggestions. Try weighting the peer and residency director reviews by the 5-point scale rather than by the highest score within that scale. After that, I would try rounding to different numbers of digits at different points in the calculation. If they round to 2 or 3 digits before combining into the raw score, it could alter school placements by a position or two.

~AS1~
 

AlternateSome1

A way to test how close your numbers are would be to convert your raw scores the same way the weighted sections were converted. For instance, set Harvard to 100, then calculate each school's percentage of that and round to a two-digit number. This can at least help tell how far away some schools are from where they should be, and it could explain some ties.

~AS1~

edit: This number should come up close to the USNEWS Overall Score.
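That check is easy to script, assuming the published Overall Score is just the raw score as a rounded percentage of the top school's:

```python
def overall_scores(raw):
    """Rescale raw scores so the top school reads 100 and the rest are
    rounded percentages of it, like the published Overall Score column."""
    top = max(raw.values())
    return {school: round(100 * score / top) for school, score in raw.items()}
```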
 

Yogi Bear

Oct 11, 2001
hey,

i tried analyzing the US News methodology awhile back, and I think what they may end up doing is taking the average of a bunch of mini-ranks (i.e., they'd rank MCAT from 1-125, GPA from 1-125, % admission from 1-125, etc.). From those rankings, they'd do a weighted average to calculate the total score (with the lowest average being the highest ranked, standardized to 100 points). If this is the case, the position of each school in each category matters, not the actual values, so Harvard's 1 trillion in funding only goes so far, and it's conceivable that Hopkins/WashU could become #1 with high enough MCAT/GPA. You can try this with the Excel function =RANK() (look it up under Help).

One possibility for why your numbers don't add up is that you don't have a list of all the schools. If it is indeed a series of mini-ranks (1-125) for 10 categories and you only have the top 60 listed online, a school that's ranked 50 overall may have a research ranking of 100. That's why your numbers get messed up.

Just a theory. Try implementing it.
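Yogi Bear's theory, sketched: rank each category 1..N like Excel's =RANK(), then take a weighted average of the ranks, with the lowest total winning. This is a guess at the scheme, assuming higher raw values are better in every category supplied.

```python
import numpy as np

def mini_rank_totals(categories, weights):
    """Rank each category 1..N (1 = best, higher value = better), then
    take the weighted average of the ranks. Lower total = better."""
    n = len(next(iter(categories.values())))
    total = np.zeros(n)
    for name, values in categories.items():
        x = np.asarray(values, dtype=float)
        order = (-x).argsort()            # best school first
        ranks = np.empty(n)
        ranks[order] = np.arange(1, n + 1)
        total += weights[name] * ranks
    return total
```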
 

liquid magma

Aug 15, 2002
If I were you, I would email US News directly and request an explanation. If those hosers didn't follow proper and/or published statistical methods, then this issue needs to be addressed publicly. At the very least, as a matter of journalistic integrity and credibility, they should make the specifics of their methodology public.

Very interesting...

--LM
 

Yogi Bear

Why do they factor in the ratio of faculty to students as a component, but use total NIH funding rather than a ratio of NIH dollars to students?
 

MacGyver

I've just about given up on this thing. I tried the mini-ranks idea, and this is what happened:

1) The schools below Harvard got really close to the US News overall scores.

2) However, the relative rankings were all screwed up.
 

MacGyver

I think there is something wrong with the way I'm standardizing the data. The reason I say this is that my ranks still give Harvard a huge edge over the other schools (100 vs. 84 for the nearest competitor, whereas in the official ranks the gap between Harvard and #2 is only 4 percentage points).

Specifically, the research funding category seems suspect to me. I tried both an LN and a LOG function for it, and there is still a very large gap between Harvard and the other schools. US News just says they used "a logarithmic function to transform the data." I'm not sure exactly what they mean by that.

I'm using the Excel STANDARDIZE function, which takes 3 inputs:

1) The value to be standardized
2) The average of the array
3) The standard deviation of the array

This function applies a Z-score transformation to the data and gives negative values (because the curve is centered around zero). The US News methodology says they "standardized indicators about their means."

Is that the same thing as what I'm doing? Are negative values generated by the standardization process a problem?
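Two quick observations on both questions. The choice of log base cannot be the culprit: ln(x) and log10(x) differ only by a constant factor, and z-scores are unchanged by any positive linear rescaling. And negative z-scores are normal: schools below the mean land below zero, and the final rescale against the top school absorbs the signs. A check with made-up funding numbers:

```python
import numpy as np

funding = np.array([1.2e9, 4.0e8, 2.5e8, 9.0e7])   # hypothetical NIH totals

def z(x):
    return (x - x.mean()) / x.std()

# Identical z-scores whichever log base is used:
assert np.allclose(z(np.log(funding)), z(np.log10(funding)))

# Below-mean schools get negative z-scores, by construction:
assert z(np.log(funding)).min() < 0 < z(np.log(funding)).max()
```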
 