Scoring


kedhegard

Will somebody please explain to me how this exam is scored...

There are 350 questions total, but a 250 is a killer score. Does that mean that I can miss 100 questions (almost a third of the test) and still get a 99th percentile? Not that I'm gonna do that great, but it seems to me that there must be some kind of rounding off done. Thanks for any info.
 
Nobody really knows how it is scored.
 
Jalby said:
Nobody really knows how it is scored.

Sure we do...the average score is tabulated, and that equals something like a 215. The minimum passing score is 1.5 SD below that (around 182). The mean score is given a two-digit score of 82 and the lowest passing score gets a 75 (per federal regulations).

However, we have no idea how they get to those scores...and since 50 of the 350 questions are noted as being "experimental" and not graded, that makes getting a 250+ just that much more difficult.
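
As a rough illustration of the figures quoted in this post: if the mean is about 215 and the passing score of about 182 sits 1.5 SD below it, the implied standard deviation can be backed out directly. This is only a sketch in Python. The constants are the approximate numbers from the post, not official values, and the function names are made up for the example.

```python
# Approximate figures quoted in the post (not official values).
MEAN_SCORE = 215        # "something like a 215"
PASSING_SCORE = 182     # "around 182"
SDS_BELOW_MEAN = 1.5    # passing is said to sit 1.5 SD below the mean


def implied_sd(mean, passing, sds_below):
    """Standard deviation implied by a mean and a cutoff k SDs below it."""
    return (mean - passing) / sds_below


def z_score(score, mean, sd):
    """How many SDs a score lies above (positive) or below (negative) the mean."""
    return (score - mean) / sd


sd = implied_sd(MEAN_SCORE, PASSING_SCORE, SDS_BELOW_MEAN)
print(f"implied SD: {sd:.1f} points")
print(f"a 250 would be {z_score(250, MEAN_SCORE, sd):.2f} SD above the mean")
```

Under these assumed numbers the SD comes out at 22 points, which falls inside the 15-25 range mentioned later in the thread.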
 
The dean of my medical school writes questions for the USMLE. His course notes have been money for our NBME shelf exams. It's amazing actually; he only bolds certain points throughout his notes, and a significant portion of these end up on our subject exams. I only used his notes for the anatomy and physio nbme exams, and I got a 99 raw on both.

His explanation of USMLE scoring is this: The highest score anyone has ever seen on the USMLE is 275 which must correspond to a perfect score...therefore, 275 is the number of scored items. Basically, you get 1 point for each correctly answered scored item.

Now, I noticed someone on this board mentioned that someone at their school scored 285. If this is true, then I guess my dean is wrong...this is just what he told us.
 
Take the number of questions you think you got correct plus the number of questions you might have gotten correct, minus the number you couldn't have possibly gotten correct, multiply by your mental age, then divide by your chronological age, take one half of that number, and multiply by pi.


It's so simple I can't believe I didn't think of it sooner
 
Stinger... I used your formula and got a 104...with some change for Pi.. This is about right. Stinger finally figured out the formula.. well done.. 🙂

People.... no one knows how it is scored....
 
RonaldColeman said:
His explanation of USMLE scoring is this: The highest score anyone has ever seen on the USMLE is 275 which must correspond to a perfect score...therefore, 275 is the number of scored items. Basically, you get 1 point for each correctly answered scored item.

Now, I noticed someone on this board mentioned that someone at their school scored 285. If this is true, then I guess my dean is wrong...this is just what he told us.

I am laughing very hard at this right now, although I would wager that it is not very likely that someone got a 285.

They take the average and set a mean. They go by SD to get your three-digit score, so technically a 300+ is possible, I suppose, but the test is scaled. You may get one point per correct item, but it doesn't seem likely.

edit: of course, this means that it would be possible to get a 400+, should you perform that much higher than your test group.
 
Here is my initial theory on scoring. I have no proof since Step 1 scoring is closely guarded, however I feel very confident.

1. It is commonly said that if you get about 60 percent of the questions right, you pass. This is also mentioned in First Aid.

2. There are 350 questions on Step 1.
3. An article in Journal of American Medicine admits there are experimental questions. I will look up the article when I have time at the medical school library. This makes sense because the LSAT, MCAT, SAT, and GRE all have experimental questions.
4. The GRE has an experimental section that is not graded but is used to "test" future questions.


With these factors, I propose that Step 1 has one experimental section: one block, i.e. 50 questions. 350 - 50 = 300.

1) I believe a perfect score on Step 1 is a 300. This makes sense because no one has ever heard of a score above 300, only of scores around the 280 range.
2) 60% of 300 questions is the magical 180. This is very close to what you need to pass.
3) One block is not graded but is used to "field test" questions in a simulated exam period.
4) Just like on the MCAT, I doubt the experimental section is Block 1 or Block 7, because testmakers know that Block 1 is a warm-up block and Block 7 is a fatigue block. The point of the experimental section is to closely simulate a set of questions.
5) I believe that there is no such thing as a "scaled" score on Step 1. Rather, I believe all scores are raw scores, i.e. there is no curve. Thus, your score represents how many questions you answered correctly. Furthermore, I argue, how can you have a curve when everyone has a different exam? Granted, the 300 questions come from a test bank of around 10000.
6) Because the grades are raw and not scaled, www.usmle.org warns against comparing scores between years. This is the reason that the USMLE removed the percentile score from your grade report.
7) The 182 you need to pass are the "gimme" questions on Step 1. The other questions are tougher questions that reward top students for retention of minutiae and/or "two step" ability to integrate concepts.
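
The arithmetic behind this raw-score hypothesis can be sketched in a few lines of Python. Everything here is the hypothesis from this post, not a documented scoring rule: the single 50-question experimental block, the one-point-per-question scoring, and the 60% pass fraction are all assumptions, and `raw_score` is a name invented for the example.

```python
# Assumptions taken from the hypothesis above (not official scoring rules).
TOTAL_QUESTIONS = 350
EXPERIMENTAL_QUESTIONS = 50   # hypothesized: one ungraded block
PASS_FRACTION = 0.60          # "about 60 percent right, you pass"

SCORED_QUESTIONS = TOTAL_QUESTIONS - EXPERIMENTAL_QUESTIONS


def raw_score(fraction_correct, scored=SCORED_QUESTIONS):
    """Score if each correct scored question were worth exactly one point."""
    return round(fraction_correct * scored)


print("scored questions:", SCORED_QUESTIONS)
print("score at 60% correct:", raw_score(PASS_FRACTION))
print("score at 80% correct:", raw_score(0.80))
```

Under these assumptions 60% correct gives 180, close to the quoted passing score of 182, and 80% correct gives 240.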

I'll update this when I have time. If you are interested in my MCAT theory, you can look it up under my previous posts.

Good Luck.
 
chandler742 said:
Here is my initial theory on scoring. I have no proof since Step 1 scoring is closely guarded, however I feel very confident.

using your theory:

if the usmle.org 3 block test is any indication of the real thing then:

there are 7 blocks of 50q on USMLE step 1
i average 38/50 questions right per block
you get 1 point per question answered correctly
38 x 7 = 266

yeah right.
 
Actually imtiaz, it would be 38*6 since 50 Qs are thrown out. Therefore a 228 would be your score. Basically, take your percentage and multiply by 3 to get a rough score. Thing is, NO ONE really knows. The only thing that does hold true is that the test was scaled in 1991(?) to a 200 average. Over the past 10+ yrs the average has gone up to about 216. 1 SD is 15-25 points. No one knows any more or any less. They then relate people's scores to the "test" questions they get right and wrong and go from there.
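
The two back-of-the-envelope estimates in this post (blocks times average correct, and percentage times 3) can be written out as a short Python sketch. It assumes, as the post does, that one of the seven 50-question blocks is unscored; the function names are hypothetical and this is not how the exam is known to be scored.

```python
QUESTIONS_PER_BLOCK = 50
TOTAL_BLOCKS = 7
UNSCORED_BLOCKS = 1   # assumption from the thread: 50 questions are thrown out


def estimate_from_block_average(avg_correct_per_block):
    """Average correct per block times the number of scored blocks."""
    scored_blocks = TOTAL_BLOCKS - UNSCORED_BLOCKS
    return avg_correct_per_block * scored_blocks


def estimate_from_percentage(percent_correct):
    """Rule of thumb from the post: percentage correct times 3."""
    return percent_correct * 3


avg_correct = 38                                         # 38 of 50 per block
percent = 100 * avg_correct / QUESTIONS_PER_BLOCK        # 76 percent

print(estimate_from_block_average(avg_correct))
print(estimate_from_percentage(percent))
```

Both routes give the same number because 6 scored blocks of 50 make 300 scored questions, so multiplying a percentage by 3 is the same calculation.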
 
chandler742 said:
Here is my initial theory on scoring. I have no proof since Step 1 scoring is closely guarded, however I feel very confident.




I don't care what anyone else says, I've always agreed with this same logic. However, I'm not sure I agree with the concept that a single block has all the practice questions. That skews the statistics. It'd be more accurate to spread out the 50 experimental questions over all the blocks than focus them all in one. Think about it: what if many of the experimental questions were poorly written or really difficult.. these intimidating factors would most definitely build up during that particular block and affect your performance in the entire block overall (and not necessarily indicate students' real performance on all these questions). Of course, this could happen in any block with any questions, experimental or not, but it'd be much more accurate (i.e. you wouldn't have to worry about that bias due to a lack of "random placement") to have these questions spread out amongst the rest.

Therefore, I believe they take a block of 42-44 real questions and add in 6-8 experimentals. You end up with a particular raw score out of 42-44 on each block, and this stuff is all added together to get a score out of around 300.
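
To see what this "sprinkled" variant implies for the total, here is a minimal Python sketch. It assumes seven blocks of 50 questions and a fixed number of experimental questions per block, which is a simplification of the 6-8 range proposed above; `scored_total` is a name made up for the example.

```python
BLOCKS = 7
QUESTIONS_PER_BLOCK = 50


def scored_total(experimental_per_block):
    """Scored questions on the whole exam if every block hides the same
    number of ungraded experimental questions."""
    return BLOCKS * (QUESTIONS_PER_BLOCK - experimental_per_block)


for experimental in (6, 7, 8):
    print(f"{experimental} experimental per block -> "
          f"{scored_total(experimental)} scored questions")
```

With 6 to 8 experimental questions per block the scored total ranges from 308 down to 294, which is where the "around 300" figure comes from.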
 
I believe they would be more likely to sprinkle in the experimentals, but maybe not...your version sounds better than the others I have heard, but I think that they do distribute a percentile report to the student.

I also sincerely doubt that you have to get 80% right to get a 240...once you get closer to 300, the curve has to shrink a little bit. Since it is pretty well known that the passing grade is 1.5 SD below the mean (hard to fake that), I will always believe that the test is standardized. BUT none of this explains why the mean keeps going up (i.e. they should set a mean, like the COMLEX does, at a score of 200 or 210...whatever). I think it has to do with the test administration: the USMLE is given all the time, rather than once a year like the COMLEX, so it is probably tough to standardize.
 
The poster is definitely right about this method breaking down at either extreme of the scale. There is no way that answering 80% correctly will only get you a 240.
 
How do you know for a fact? I am basing my hypothesis on the scientific method. Sure, you can argue that 80% of the questions will not equal a 240. But what is your logic? I am not singling you out just for the sake of it. I am curious how you can make such a strong statement without any theory to the contrary, instead just being dismissive because you might think it is too simplistic.



The scientific method is the best way yet discovered for winnowing the truth from lies and delusion. The simple version looks something like this:


1. Observe some aspect of the universe.
2. Invent a tentative description, called a hypothesis, that is consistent with what you have observed.
3. Use the hypothesis to make predictions.
4. Test those predictions by experiments or further observations and modify the hypothesis in the light of your results.
5. Repeat steps 3 and 4 until there are no discrepancies between theory and experiment and/or observation.

bigfrank said:
The poster is definitely right about this method breaking down at either extreme of the scale. There is no way that answering 80% correctly will only get you a 240.
 
The problem is, you have a sample set of zero...so it's all speculation and will continue to be, even after you get your score. There either is a standardized score or there isn't. If there isn't, then your method is perfect. If there is, then it falls apart at the edges. Either way, it's pretty decent.
 
Although I would love to know how they score the test, not to mention, know my score now, it is all supposition.

Chandler, I have to apologize for this post but...

How impressive a reprimand and description of the scientific method! I had no idea how to perform such a difficult task. I am sure others on this thread are very happy for the free education!
 
Idiopathic, the reason I think the average keeps going up is because like I said, there is no curve but a raw average based on the number of correct answers. An average cannot keep going up, if it is normalized or curved. This also explains why periodically they "change" the pass score. If it was normalized they wouldn't change the pass score, they would normalize the Gaussian Curve.

Secondly, the NBME is very adamant about using a certain score to compare students. They warn against using a high score for cutting people off for a residency. I believe this is because they know it isn't curved.

Lastly, I hate to say this, but I think the reason the grades keep going up is "word of mouth" about actual questions that are seen on the exam. In fact, USMLE.org states that they keep track of various message boards to see if actual questions are released, and that they would prosecute these individuals. This indirectly proves that they recycle questions.

Thus, if they recycle questions and students talk to each other, the tough questions are no longer tough. I believe there is a leak within the various medical schools. It wouldn't surprise me if some students had a list of old Step 1 exam questions in a test file.

Granted, the questions in old test files will not number 10000, but I can easily see REMEMBERED questions, the ones people stew over because they found them difficult on the actual exam, becoming water cooler gab material after the exam. Some resourceful medical students could make a list of these "tough" questions and easily have a list of 100 that is passed down to future classes.





Idiopathic said:
I believe they would be more likely to sprinkle in the experimentals, but maybe not...your version sounds better than the others I have heard, but I think that they do distribute a percentile report to the student.

I also sincerely doubt that you have to get 80% right to get a 240...once you get closer to 300, the curve has to shrink a little bit. Since it is pretty well known that the passing grade is 1.5 SD below the mean (hard to fake that), I will always believe that the test is standardized. BUT none of this explains why the mean keeps going up (i.e. they should set a mean, like the COMLEX does, at a score of 200 or 210...whatever). I think it has to do with the test administration: the USMLE is given all the time, rather than once a year like the COMLEX, so it is probably tough to standardize.
 
Lastly, I don't take it personally if people attack my theory and say it is ridiculous or stupid. Only thing that I ask is that we keep it a logical forum to exchange ideas.

I believe comments like "that can't be the case" or "that is too simplistic" do not hold any water. Similarly, 80% of the correct answers cannot be 240. Your argument is based on what?

Why not? Do you feel as though getting 80% of the questions right on the exam is too easy or too difficult?

In my mind, 80% seems very tough.
 
chandler742 said:
Idiopathic, the reason I think the average keeps going up is because like I said, there is no curve but a raw average based on the number of correct answers. An average cannot keep going up, if it is normalized or curved. This also explains why periodically they "change" the pass score. If it was normalized they wouldn't change the pass score, they would normalize the Gaussian Curve.



I agree with what you say, and I'm sure there are many more reasons for the periodic increases in mean performance. I tend to think that review texts as well as review courses are beginning to "catch up" to the USMLE, and they have started to almost predict the kind of content that, if studied, will help you score at the mean or better (take Goljan, for instance). And as many of you know, review courses and texts are absolutely HUGE nowadays and are continually and substantially updated and upgraded. Plus, as mentioned, many review sources and question banks claim to have "student input" on how they write their review questions, which basically means students are taking the exam, writing down the questions they remember, and submitting them anonymously to the review organization. So I don't think it's much of a stretch to see how the national mean steadily increases. It probably also indirectly explains why the passing scores keep going up as well.

And yes, they recycle questions. About 10 questions on my exam were either seen on the NBME assessment exams, or on one of my second year shelf exams. Not exactly a huge number, but remember the NBME has an enormous number of questions in the bank, so it's relatively substantial.
 
My assertion that 80% warrants a greater score than 240 is based on my own experiences. Isn't that what any score speculations are based on? There is nothing else on which these speculations can be based.

1. On the NBME-offered/graded practice exam (paper/pencil), I got a 255 and would bet my life I answered less than 80% correctly. This was back in March.

2. On the two NBME-offered/graded computerized online exams, I got a 700 and a 710, which is 2+ standard deviations over the mean, and I would bet I answered 80% correctly.

On #1 above, my score report specifically stated that it was scored to be accurately scaled to be a representative score for first-time test takers. Since everyone seems to "know" that a raw scale of questions answered correctly is used, I reasoned that the score is fairly representative and seems to be consistent with my other NBME-based practice exams.
 
chandler742 said:
Similarly, 80% of the correct answers cannot be 240. Your argument is based on what?

Why not? Do you feel as though getting 80% of the questions right on the exam is too easy or too difficult?

In my mind, 80% seems very tough.

It's more a gut feeling. A 240 is not even +1 SD, which means that over 15% of test takers exceed this score. I do not believe that 15% of test takers get over 80% right. I believe that 80% correlates more with a 250+. But again, it is all conjecture. Your idea is as good as any.
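
The "over 15%" figure follows from the normal curve: about 16% of a normal distribution lies more than one SD above the mean. A minimal Python check is below. It assumes scores are normally distributed, which nobody in this thread can confirm, and `fraction_above` is a name invented for the example.

```python
from statistics import NormalDist


def fraction_above(z):
    """Fraction of a standard normal distribution lying above z SDs."""
    return 1.0 - NormalDist().cdf(z)


# Share of test takers more than one SD above the mean,
# IF scores follow a normal distribution (an assumption).
print(f"{fraction_above(1.0):.1%}")
```

So any score below +1 SD would be exceeded by more than roughly 16% of test takers under that assumption.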
 