Generating a Curve is VERY difficult! Maybe this can relieve some of your stress


BerkReviewTeach
I figured that, with all the anxiety caused by the variability in test difficulty and in the topics covered on different sittings of the MCAT, a dilemma we encountered might shed some light on what the AAMC is probably facing.

First, a little history on how we generated our CBT scales. We started with a pool of passages that we had used before in the paper format, for which we have a large N-value and thus a solid curve. On each of our CBTs, we placed three to five of these passages per section, as well as some free-standing questions. When our students took the CBT exams, we collected question-by-question results and generated a curve based solely on those previously scored questions. We then compared this curve with the curve for the EXACT same questions from the paper format. It turned out that people were doing notably worse on the CBT version than on the paper version (about 7% lower in terms of raw score). We formulated all sorts of theories, but many came back to the fact that (1) people work more slowly on a computer than on paper, and (2) not being able to write next to the question hurts your performance if you cannot do the work in your head.
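For the statistically curious, here is a minimal sketch of that anchor-item comparison (Python; the data and numbers are invented for illustration, not our actual results):

```python
import numpy as np

# Hypothetical per-question results on the SAME anchor questions
# (rows = students, columns = questions, True = correct).
# Real data would come from scored answer sheets and CBT logs.
rng = np.random.default_rng(0)
paper_results = rng.random((2000, 20)) < 0.65   # ~65% correct on paper (invented)
cbt_results = rng.random((1500, 20)) < 0.58     # same items, CBT takers (invented)

paper_pct = paper_results.mean() * 100
cbt_pct = cbt_results.mean() * 100

# The gap on identical items estimates the format effect, which can
# then be applied when scaling the rest of the CBT exam.
print(f"paper: {paper_pct:.1f}%  CBT: {cbt_pct:.1f}%  "
      f"format shift: {paper_pct - cbt_pct:.1f} points")
```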

From there, we looked at the curve for the new questions and passages for which we had no previous data. Granted, these questions and passages were a bit more obscure than our field-tested ones, so we expected the results to be different. What we found was that the curve on these was pretty dang close to the field-tested curve. The sequence of the exam was altered (just once so far) to see if that had an impact. I can definitely say that the sequence of the passages affects one's performance, which supports the notion that ongoing anxieties play a crucial role in one's MCAT score.

By the time we got done analyzing the data, we realized that there were so many factors involved (from how much a student had reviewed to the time of day someone took the exam) that further analysis hit the point of diminishing returns. So we picked a date to release our curve and basically said, "WTF? Here it is." In many cases, the curve we got after much analysis was very close to the paper curve scaled down for the reduced number of questions and then shifted down by 7%. The bottom line is that you can curve anything.
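In code, that back-of-the-envelope adjustment looks something like the sketch below (Python; the question counts and cutoff are made-up examples, and the 7% is read as a relative drop in raw score):

```python
# Hypothetical example: adapt a paper raw-score cutoff to a shorter
# CBT section. All numbers are illustrative, not an actual scale.
PAPER_QUESTIONS = 77   # old paper-format section length (example)
CBT_QUESTIONS = 52     # CBT section length (example)
FORMAT_SHIFT = 0.07    # ~7% lower raw scores observed on CBT

def cbt_cutoff(paper_cutoff: int) -> int:
    """Scale a paper raw-score cutoff to the CBT section length,
    then shift it down by the observed format effect."""
    scaled = paper_cutoff * CBT_QUESTIONS / PAPER_QUESTIONS
    return round(scaled * (1 - FORMAT_SHIFT))

# e.g., if some scaled score had required 55/77 on paper:
print(cbt_cutoff(55))  # -> about 35/52 on the CBT version
```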

So now for the present:
What has amazed us this year is that after getting scores back from our students who took the CBT in April, May, and June, they are earning a higher average score (by almost a full point), but the range is larger. We typically used to see a range of about 20 to 42 with an average of 30 to 31. Now we are seeing a range of 18 to 42, but with an average of 30.9 to 31.8. We modified our course to incorporate a CBT test-taking component, but I honestly believe that the increase is likely due to exposure to a couple of WTF-type weird passages on our practice exams. Freaking out during practice exams hopefully lessens the impact on the real exam.

So here is my harebrained conclusion, based solely on speculation. What I assume is happening is that the test is curved like before, with various non-field-tested questions being tossed out of the data pool after post-exam scrutiny. They analyze the data six ways to Sunday and realize it fits a bell curve no matter how you cut it. Each separate sitting gets the same standardized results, so they are pleased with their curves. But the variability between sittings is not being accounted for, probably because they have no way of evaluating it. Unless test-takers take the exam multiple times, they have no way of seeing the impact from date to date. The reality is that with so few questions, they can't cover all of the topics, so one exam is going to be better suited than another to a given test taker's strengths.

You see it from practice exam to practice exam, with your score fluctuating based on the topics covered and how you are feeling that day. But the AAMC does not see that variation; they just see bell curve after bell curve.
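To see how that can happen, here is a toy simulation (Python; every parameter is invented) in which each sitting's score distribution looks like a textbook bell curve, yet the same person's score still swings from form to form:

```python
import numpy as np

rng = np.random.default_rng(42)
N_STUDENTS, N_SITTINGS = 10_000, 4

ability = rng.normal(30, 4, N_STUDENTS)            # each examinee's "true" level
# Topic-match luck: how well a given form suits a given test taker.
topic_luck = rng.normal(0, 2, (N_STUDENTS, N_SITTINGS))
scores = ability[:, None] + topic_luck

for s in range(N_SITTINGS):
    print(f"sitting {s + 1}: mean {scores[:, s].mean():.1f}, "
          f"sd {scores[:, s].std():.1f}")           # every sitting: a clean bell

# But an individual's scores swing from form to form:
within_person_range = scores.max(axis=1) - scores.min(axis=1)
print(f"median swing for the same person: "
      f"{np.median(within_person_range):.1f} points")
```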

All you can really do is improve your game plan for this exam. Learn from what others have encountered. The test is going to have some weird crap on it, and you have to figure out how not to freak out. Realize that some questions can be answered even when you don't really understand what they are asking. Apply POE (process of elimination) to the answers and you'll be fine in many instances.


  • Get a good monitor for the real exam. If you get a bad one, demand a change.

  • If your first passage isn't easy at first glance (i.e., if it's a weird one), skip to the second passage. While this wastes a little time, it helps with your psyche.

  • Do as much as you can in your head, especially calculations. Visualize pictures, and scribble minimalistic pictures when you need them. You get no partial credit for drawing quality.

  • Remember that the exam is curved. If a question is driving you crazy, realize that it might be tossed out if everyone is randomly guessing (see the sketch after this list). And if they keep it, the curve will be generous. When you get right down to it, the middle-difficulty questions and the difficult questions that can be analyzed using POE are the main components of the curve's spread. Do well on those and you'll do well on the exam.

  • The test has a great deal of randomness. If it doesn't go as you wish, there is a very good chance you will do better the next time. Learn from your experience.
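On the "tossed out" point: a standard way test makers flag a question where everyone is guessing is item discrimination, e.g., the point-biserial correlation between getting the item right and the score on the remaining items. Here is a toy sketch (Python; data invented, and it is only my assumption that AAMC screens items roughly this way):

```python
import numpy as np

rng = np.random.default_rng(7)
n_students, n_items = 1000, 30
ability = rng.normal(0, 1, n_students)

# Most items track ability; item 0 is pure noise (everyone guessing).
difficulty = rng.normal(0, 1, n_items)
p_correct = 1 / (1 + np.exp(-(ability[:, None] - difficulty)))
responses = rng.random((n_students, n_items)) < p_correct
responses[:, 0] = rng.random(n_students) < 0.25   # random-guessing item

total = responses.sum(axis=1)
for item in (0, 1):
    rest = total - responses[:, item]             # score on the other items
    r = np.corrcoef(responses[:, item], rest)[0, 1]
    print(f"item {item}: point-biserial vs rest = {r:.2f}")
# A near-zero correlation flags the guessing item for review or removal.
```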

I apologize to the few of you here who took our class and have heard some of these mantras/rants chanted over and over. But it's good to know the logic behind the suggestions.
 
Outstanding post.

Could the small increase in average scores from your students be due to the fact that some people have little experience doing exams on the computer and their score drops, raising the average for everyone else?
 
It could be due to a bunch of different things. I'm assuming some of it had to do with the fact that very few of our students got stuck with the crappy strobing computer screens; picking the right center was bound to have some impact. It may also have been because we forced students to do things in their heads by projecting questions on a screen in the classroom, building good test-taking habits early on. Or it could have been, as you say, a computer-familiarity factor, but don't all courses have CBTs?

Who knows why; I'm just really happy for our students. It may sound cliché, but you really get emotionally attached to your class, and when they do well, it is highly empowering.
 
I just wish they could disclose our raw scores/answers like the SAT score report does. It would make figuring out these statistics much easier.
 
For what it's worth, our raw-score averages run roughly:

PS: hardest exam 26.6/52 to easiest exam 31.4/52
VR: hardest exam 21.2/40 to easiest exam 22.9/40
BS: hardest exam 25.9/52 to easiest exam 34.0/52

Our students' comments have ranged from "our tests are harder than the real thing" to "our exams are right on the mark." I'd have to assume that their raw scores are probably in the same ballpark, if perhaps slightly higher than ours. It's all speculation, but not unreasonably so.
 
Now I wish I was back in California to enroll in the Berkeley Review course. I should have done that last year. Oh well, I guess I'll know in a few weeks if I need to reprioritize my schedule, lol.
 
I've noted previously that Berkeley Review's practice tests (I only have experience with the paper/pencil version) are the closest to the actual test of any I've tried, and they have the most accurate curve as well. Some people were surprised that I scored the exact same score on EACH SECTION of TBR exams 2-8 as I did on the actual MCAT (13, 13, 13), but I think that speaks to the similarity and the accuracy of the TBR curves. My $0.02.
 
Figured I'd bump this. If you combine this read with the AAMC reply in another thread about curves, some light might be shed on the curve they create. Perhaps we both follow the same methodology for generating a curve from previous material, given that it's standard statistical practice.
 
Thanks for the thread.

Are the exams that you curve this way also for sale on your website, or are they only used in your classroom courses?
 
Hi LaCasta,

I am so sorry, but the CBT exams are just a part of the class at this time. There are paper versions for sale on the website, but not CBT versions. The CBT versions get slightly lower raw scores than their paper equivalents, so I am a strong believer in students doing CBTs once they reach a point where they feel "as reviewed as they are probably going to get." If you are feeling reviewed, then buying the paper versions might not be the best thing. If you are still reviewing, then paper is fine, since you can mark it up and use it as a learning tool.
 