Discussion to end "curve" discussion


AttemptDOnowDPM

I once read a post in another forum giving a reason to conclude that there is no "curve" on the MCAT. By "curve" I mean that my score on DG is compared to everyone else's on DG and adjusted according to how well other people did.

It went like this....

"Every year AAMC reports averages for each section on the MCAT. If these scores were normalized based on relative scoring of each individual, don't you think the average would be the same every year? But, they're not."

This was an interesting point to me, but maybe I don't understand my statistics well enough to discern whether this person is right in his/her thinking.

What'd'y'think?
 
Originally posted by zinjanthropus
the average is the same every year

a deviation of +/- 0.5 isn't significantly different

The average among applicants and matriculants has gone up each year from 1992-2003.

AAMC MCAT Scores
 
Originally posted by thackl
Wow! What's up with the huge decline in the number of applicants?

I don't really know. I think that for a while the economy was booming and people headed toward internet-boom careers. Now, with the economy in a downswing, many have expected the number of applicants to go back up. I heard that it did, but the numbers don't support that.

I guess another factor is the continued troubles with our health system, HMOs, etc. People just don't want to deal with the hassles.
 
The fact that the average score for matriculants has increased over the past 10 years is an entirely separate issue from the scoring of the actual exam.

There is a positive trend in the data for matriculants; however, the average scaled score in each section has not changed (about 8).
 
Originally posted by zinjanthropus
The fact that the average score for matriculants has increased over the past 10 years is an entirely separate issue from the scoring of the actual exam.

There is a positive trend in the data for matriculants; however, the average scaled score in each section has not changed (about 8).

Where does your data come from? I've always heard the average among test takers is about 8 per section. I'm just not sure where those numbers are coming from.
 
http://www.aamc.org/students/mcat/examineedata/pubs.htm

Just to give you some idea about the last 10 years: there has been very little variation about the mean, with a very slight increase possibly occurring over time (0.1 - 0.5 points) in verbal and physical sciences. However, the data can't really address this trend yet, since so many variables have changed in the verbal and physical sciences sections (fewer passages on verbal, reversed order for the 2 sections). There does seem to be a modest increase in bio, from 7.9 in 1993 to 8.5 in 2003. However, if you take the mean of those 10 years, it's still about an 8. In addition, if you take the mean of the three sections over 10 years it's a little lower than 8, and in 2003 the mean for the three sections was about 8.27.

April 2003
P 8.1, V 8.2, B 8.5

April/August 1999
P 8.1, V 7.9, B 8.4

1996
P 7.9, V 7.8, B 8.3

1993
P 7.8, V 7.7, B 7.9
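
For anyone who wants to check the arithmetic, here is a minimal sketch that averages the three section means for each of the four years listed above. The numbers are copied from this post; the names `section_means` and `mean_of_sections` are just made up for the example.

```python
# Section means as quoted in the post above
# (P = physical sciences, V = verbal, B = biological sciences).
# The 2003 row is April only; the 1999 row is April/August.
section_means = {
    1993: {"P": 7.8, "V": 7.7, "B": 7.9},
    1996: {"P": 7.9, "V": 7.8, "B": 8.3},
    1999: {"P": 8.1, "V": 7.9, "B": 8.4},
    2003: {"P": 8.1, "V": 8.2, "B": 8.5},
}


def mean_of_sections(year):
    """Average the three section means for one year."""
    scores = section_means[year].values()
    return sum(scores) / len(scores)


for year in sorted(section_means):
    print(year, round(mean_of_sections(year), 2))
```

This only covers the four years posted here, not the full 10 years of AAMC data, so it cannot confirm or refute the 10-year average.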
 
There is a lag between the economy and the bottom in enrollment. IMO the 2005 applicant numbers, and maybe this year's, will go up.

Does anyone know if the passages/questions are recycled? If that were the case, they could reasonably predict the score distributions and preset the curve. Certainly, repeated use of passages back-to-back years or tests would be silly.
 
Originally posted by zinjanthropus
the average is the same every year

a deviation of +/- 0.5 isn't significantly different

Just because there is only a small deviation in the mean from year to year does not mean the MCAT is curved. The MCAT is a standardized test, so curving it would render it useless in many ways. This is because the quality of students taking it will vary over time, from year to year and from test form to test form.
If you look closely at the percentiles for different scaled scores, you will notice that they vary significantly from year to year. In some years a 10 in physical sciences may be 78-82%tile, whereas in other years or test administrations it may be 75-80%tile. The fluctuation in percentiles reflects the difference in quality of the test-takers in different years. This is exactly what happens with the SAT, LSAT, GRE, etc.
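
A small sketch of the percentile point, assuming the scaled scores are already fixed and only the mix of test-takers changes. Both cohorts below are invented data, chosen only so the numbers land in the ranges mentioned above; `percentile_rank` is a hypothetical helper, not anything the AAMC publishes.

```python
def percentile_rank(scores, value):
    """Percent of scores strictly below `value`."""
    below = sum(1 for s in scores if s < value)
    return 100.0 * below / len(scores)


# Made-up scaled scores for two administrations of 100 people each.
weaker_cohort = [6] * 20 + [8] * 40 + [9] * 22 + [10] * 10 + [12] * 8
stronger_cohort = [6] * 15 + [8] * 40 + [9] * 20 + [10] * 15 + [12] * 10

print(percentile_rank(weaker_cohort, 10))    # 82.0
print(percentile_rank(stronger_cohort, 10))  # 75.0
```

The same scaled 10 sits at a different percentile in each group. If the scale were redrawn from percentiles at every administration, a given scaled score would sit at roughly the same percentile every time.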
 
Standardized means that everyone takes the test under similar conditions, at the same time (approximately, considering time zones), and takes the same test (or in the case of the MCAT, one of the forms). How would curving a standardized test make it useless?

~AS1~
 
right, not only is the test standardized in terms of those external conditions, but it is also normalized (curved) by taking the raw data and converting it to scaled scores. The conversion works out to be roughly the same each year, while the conversion factor might vary slightly.
 
Originally posted by AlternateSome1
Standardized means that everyone takes the test under similar conditions, at the same time (approximately, considering time zones), and takes the same test (or in the case of the MCAT, one of the forms). How would curving a standardized test make it useless?

~AS1~

Standardized does not refer only to test conditions but, most importantly, to grading. This is why curving a standardized test renders it useless. An example will be instructive. If I take the exam in April and you take it in August and we both score a 10 on the biological sciences section, it is possible that our 10s will mean different things if we took it with different people. If you took it with smarter kids than I did, your curve would be more stringent than mine and thus your 10 would be stronger than mine. How is that fair? How are medical schools to differentiate between the two?
The point of standardization is to create a uniform standard by which students can be judged. A 10 on any section 4 years ago is equivalent to a 10 on the same section two years ago. Of course, different test forms have different scales due to the differences in difficulty, but the scale for any particular test form is FIXED regardless of how well the students do. This means that if 6R was a real test form and everyone taking it scored a 71/77 on Bio, then everyone would get a 13 on Bio. According to your curving method, a 71/77 for these test takers would be an 8, since 71/77 is the mean. This is clearly absurd, since a different body of students could have averaged 55/77 and that would have been an 8 for them. We are then forced to equate a 71/77 and a 55/77 on the same test form. If the AAMC were actually doing ridiculous things like this, they would have been out of business a long time ago.
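
To make the contrast concrete, here is a rough sketch of the two approaches. Only the 71/77 -> 13 pairing and the "mean becomes an 8" idea come from the post; every other cutoff in `FIXED_SCALE`, and the mean/SD form of `curved_scaled`, is an assumption invented for illustration and not the AAMC's actual conversion.

```python
from statistics import mean, pstdev

# Hypothetical fixed conversion for one form: (minimum raw score, scaled score).
# Only 71 -> 13 is taken from the post; the other cutoffs are made up.
FIXED_SCALE = [
    (75, 15), (73, 14), (71, 13), (68, 12), (64, 11), (60, 10), (55, 9),
    (50, 8), (45, 7), (40, 6), (35, 5), (30, 4), (25, 3), (20, 2), (0, 1),
]


def fixed_scaled(raw):
    """Look up the scaled score from a table set before the test is given."""
    for cutoff, scaled in FIXED_SCALE:
        if raw >= cutoff:
            return scaled


def curved_scaled(raw, cohort, center=8, spread=2):
    """Curve: pin the cohort mean to `center`, `spread` points per SD."""
    mu, sigma = mean(cohort), pstdev(cohort)
    if sigma == 0:
        return center  # everyone tied, so everyone gets the center score
    return round(center + spread * (raw - mu) / sigma)


everyone_71 = [71] * 5
averaging_55 = [45, 50, 55, 60, 65]

print(fixed_scaled(71), curved_scaled(71, everyone_71))    # 13 8
print(fixed_scaled(55), curved_scaled(55, averaging_55))   # 9 8
```

Under the fixed table, a 71 is a 13 no matter who else sat the form. Under the curve, a 71 in one room and a 55 in another both come out as an 8, which is the absurdity described above.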

P.S. It is interesting that you did not comment on the significant variation in percentile rank for various scaled scores from year to year. I wonder why.
 
But if you think about it, having all the smart kids take one form of the MCAT is highly unlikely. All the "smart" and "dumb" kids are randomly distributed across all the forms of the MCAT. And the number of people taking the MCAT in one administration is so big (N=35000) that the number of "smart" kids taking form CK and the number of "smart" kids taking form AG are going to end up being approximately equal.
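
A quick simulation of that claim, assuming forms really are handed out at random. N=35000 and the form names CK and AG are from the post; the 20% "smart" rate, the two-form setup, and the function name are assumptions picked for the sketch.

```python
import random


def strong_share_by_form(n_takers=35000, forms=("AG", "CK"),
                         strong_rate=0.2, seed=0):
    """Randomly assign test-takers to forms; return each form's 'smart' share."""
    rng = random.Random(seed)
    counts = {form: [0, 0] for form in forms}  # [smart, total]
    for _ in range(n_takers):
        form = rng.choice(forms)
        is_strong = rng.random() < strong_rate
        counts[form][0] += is_strong
        counts[form][1] += 1
    return {form: strong / total for form, (strong, total) in counts.items()}


shares = strong_share_by_form()
for form, share in shares.items():
    print(form, round(share, 3))
```

With groups this large the two shares should land very close to each other. The result only holds if assignment is truly random; it says nothing about cases where certain forms go only to certain test sites.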
 
Originally posted by mosfet
But if you think about it, having all the smart kids take one form of the MCAT is highly unlikely. All the "smart" and "dumb" kids are randomly distributed across all the forms of the MCAT. And the number of people taking the MCAT in one administration is so big (N=35000) that the number of "smart" kids taking form CK and the number of "smart" kids taking form AG are going to end up being approximately equal.

How does that refute my argument? I used extreme cases to make my point more obvious. It is a known fact that there is geographic variation in performance on standardized tests. Also, not every MCAT form is offered at every location; therefore, there could still be unintended biases in scaled scores, however slight. How do you explain the fact that the average bio score was a 7.9 in 1993 and an 8.4 in 1999? If the MCAT is curved and is thus renormalized every year, there should NOT be a .5 jump in scaled scores.
The .5 is not due to random events, as zinjanthropus was suggesting. If the variation in scores were random, the likelihood of a forward bias would be balanced by the likelihood of a backward bias, and over time there would be no variation either way. However, that is NOT the case. There has been a steady increase in the means of ALL sections. This reflects a slight increase in the knowledge base or test savvy of the students. I don't know how you can logically or mathematically justify to yourself that the MCAT or ANY standardized test is curved.
Also, I want you to explain to me why the percentiles for various scaled scores vary significantly over time. This should never happen with tests that are renormalized on every administration.
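
The direction of the changes can be checked against the four sets of section means posted earlier in the thread. This is only a sketch on four data points per section, so it shows the direction of the changes and nothing about statistical significance; the variable and function names are made up.

```python
years = [1993, 1996, 1999, 2003]
means = {
    "P": [7.8, 7.9, 8.1, 8.1],
    "V": [7.7, 7.8, 7.9, 8.2],
    "B": [7.9, 8.3, 8.4, 8.5],
}


def changes(series):
    """Differences between consecutive reported means, rounded to 0.1."""
    return [round(later - earlier, 1)
            for earlier, later in zip(series, series[1:])]


for section, series in means.items():
    print(section, changes(series))
# P [0.1, 0.2, 0.0]
# V [0.1, 0.1, 0.3]
# B [0.4, 0.1, 0.1]
```

None of the nine changes is negative, which is what you would not expect if the year-to-year differences were pure noise around a mean that is reset at every administration.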
 
There seems to be a lot of hostility building up over something that really means very little here....

Anyway, if the MCAT is not curved in some way, how come the number of questions you need to get correct varies from test to test? Yes, the grading is the same from location to location, but since everyone has different test forms, would it make sense for them to assume that a form with an average score of 4 was the same difficulty as a form with an average score of 10?

~AS1~
 
Originally posted by AlternateSome1
There seems to be a lot of hostility building up over something that really means very little here....

Anyway, if the MCAT is not curved in some way, how come the number of questions you need to get correct varies from test to test? Yes, the grading is the same from location to location, but since everyone has different test forms, would it make sense for them to assume that a form with an average score of 4 was the same difficulty as a form with an average score of 10?

~AS1~

Have you read any of the preceding posts? I said earlier, and I quote, "different test forms have different scales due to the differences in difficulty [but] the scale for any particular test form is FIXED regardless of how well the students do." Curving a test usually means that you renormalize scores to fit a distribution if the raw scores do not already do that. The MCAT does NOT do that. To do that would prevent the test from being standardized. It would mean that a 10 would mean different things on different administrations depending on the quality of the students taking it. Aren't "standards" supposed to be immutable metrics against which objects are judged? I don't know how else to explain this. Perhaps an introductory text in statistical analysis or a book on the etymology of English words would be instructive.

P.S. There is no hostility. I am just frustrated at people making claims and stubbornly defending them when they have no idea what they are talking about. They would rather "feel" they are "right" than concede an error in judgement and move on.
 
Originally posted by AlternateSome1
Ok, I guess this is semantic, but when do you suppose each form is standardized? If you set a scale for a test form after everyone takes it, isn't that the same thing as curving?

~AS1~
It is not semantic. There is a distinct difference between our respective positions. The test forms are scaled BEFORE anyone takes them. The scales for different forms are determined according to how people did on them when they were used as EXPERIMENTAL passages. That is why every section of the MCAT contains at least one experimental passage that does not count towards your score.
Once the statistical properties of various test items are determined, they are normalized against a particular standard test or tests and then the scale is determined. If the predetermined scale said 71/77 was a 13 and everyone taking the test form scored 71/77, then everyone gets a 13. The AAMC will NOT change the scale to adjust for student performance. This gives medical schools an objective way of judging science or verbal aptitude regardless of when the MCAT was taken. A 10 in 1992 is the same as a 10 in 2000.
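
One textbook way to "normalize against a standard test" is linear equating: match the mean and spread of the new form's pretest results to those of a reference form. The sketch below is only that generic technique under invented numbers. The pretest means and SDs are hypothetical, `linear_equate` is a made-up name, and nothing here is claimed to be what the AAMC does; only the 71 -> 13 pairing comes from the post.

```python
def linear_equate(raw, new_mean, new_sd, ref_mean, ref_sd):
    """Map a raw score on a new form onto the reference form's raw scale."""
    return ref_mean + ref_sd * (raw - new_mean) / new_sd


# Hypothetical pretest statistics: the new form ran 3 raw points harder
# than the reference form, with the same spread.
NEW_MEAN, NEW_SD = 52, 10
REF_MEAN, REF_SD = 55, 10

# If 71 on the reference form is a 13, then 68 on the new form equates
# to that same 71, so the new form's table would list 68 as a 13.
print(linear_equate(68, NEW_MEAN, NEW_SD, REF_MEAN, REF_SD))  # 71.0
```

The point of the sketch is the timing: everything it needs comes from pretest data, so the table can be written before anyone sits the new form, and it does not move afterwards.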
 
"The test forms are scaled BEFORE anyone takes it. The scale for different forms are determined according to how people did on them when they were used as EXPERIMENTAL passages."

Negative. Experimental passages are only that, experimental. They test new question types and styles. They don't use them again... how did you think that up?

If you have 7 passages and only one is experimental, that would mean that it would take 6 unique exams from last year to create one exam this year. The math doesn't work. Think about it. Unless you're making the preposterous claim that passages are recycled SEVEN times before they are discarded! Of course that is BS. Think before you speak.
 
Originally posted by hightrump
"The test forms are scaled BEFORE anyone takes it. The scale for different forms are determined according to how people did on them when they were used as EXPERIMENTAL passages."

Negative. Experimental passages are only that, experimental. They test new question types and styles. They don't use them again... how did you think that up?

If you have 7 passages and only one is experimental, that would mean that it would take 6 unique exams from last year to create one exam this year. The math doesn't work. Think about it. Unless you're making the preposterous claim that passages are recycled SEVEN times before they are discarded! Of course that is BS. Think before you speak.
No, buddy, YOU need to think before you speak. Each MCAT section has AT LEAST one experimental passage. The AAMC doesn't say precisely how many experimental passages they give.
Also, your argument is downright silly. If experimental passages are ONLY experimental, then how are new test forms created and normalized? The fact is, the AAMC cannot theorize about the statistical properties of a test item. They HAVE TO test it before using it. Psychometrics is not an exact science; it is a social science. Statistics is therefore paramount in psychometrics. They might not use EVERY single experimental passage, but they use MANY of them in creating tests. Also, a new test form does not contain all new passages. It contains both new and recycled passages. Test forms often share many passages. They are not completely unique in their passage content. Therefore, you don't need many experimental passages to create a new test form. The reason why the AAMC guards its test contents so jealously is that they reuse MANY test items for several administrations.
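
The disagreement over "the math" comes down to two unknowns: how many experimental passages each form carries and what fraction of a new form is reused. A back-of-envelope sketch follows, where the 7-passage, 1-experimental case is the one from the quoted post and every other number, plus the function name, is hypothetical.

```python
import math


def prior_forms_needed(scored_passages, experimental_per_form,
                       reused_fraction=0.0):
    """Earlier forms needed to pretest one new form's fresh passages."""
    new_needed = math.ceil(scored_passages * (1 - reused_fraction))
    return math.ceil(new_needed / experimental_per_form)


# 7 passages with 1 experimental leaves 6 scored passages.
print(prior_forms_needed(6, 1))        # 6: the case in the quoted post
print(prior_forms_needed(6, 1, 0.5))   # 3: half the form is reused
print(prior_forms_needed(6, 2, 0.5))   # 2: two experimental slots per form
```

Since the AAMC does not publish either number, this only shows that the "6 exams to make 1" conclusion depends on assuming a single experimental slot and no reuse.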

You must be in a mood for an ego trip today, but unfortunately for you, you are barking up the wrong tree. You have no idea what you are talking about. Read up on standardizing methods in statistics before offering your next sermon.
 
Originally posted by irie
These are all hypothetical claims. None of us know the true details of MCAT scoring. Calling someone an idiot or other name because they don't agree with your hypothetical answer is a waste of time.

I have a question. Would it work to assign a number score (0-15) to a specific percentile range? It seems like the average percentile would be 50 and each year this could be assigned a number score around 8.

These are not all hypothetical claims. The proposed details of how the MCAT is scored are somewhat hypothetical, since there are many ways of standardizing a test and the AAMC does not divulge their unique methods, but the fundamental theory remains the same.

As regards your question, percentiles fluctuate from test administration to test administration depending on the quality of students taking the test. An 8 could be anywhere from a 45%tile to a 55%tile.
 
So are the experimental passages scored? If not, that would suck if we spent time on it and didn't have time to finish a passage that will actually be scored.
 
Originally posted by Persistence101
So are the experimental passages scored? If not, that would suck if we spent time on it and didn't have time to finish a passage that will actually be scored.

Yes they are scored to determine their statistical properties but they do not count towards your score.
 
"If experimental passages are ONLY experimental then how are new test forms created and normalized?"

They are not normalized. Not perfectly, at least.

Your big retort to my observation was that they could have many experimental passages, and that they may only use some old questions in new exams. THAT GETS YOUR POINT NOWHERE.

If you have HALF experimental passages, then

A) the test is meaningless, because two people with timing difficulty may both only do half, but depending on which half they do, one guy could get a 1 and one guy could get a 15. *****ic. You cannot have 1/2 the exam not count.

That would leave the next exam takers with the old reused half to take and then a new, ungraded half.

So what happens? Well, you are still being judged against only ONE applicant pool, only now instead of judging you against your current peers they're judging you against last year's peers.
But wait... score-wise they're not. Since the people last year weren't GRADED on your questions from this year (they merely answered them), your score still cannot be compared to theirs or anyone's other than your peers', because NO ONE was ever graded on those passages before. So a 30 this year and last year are still different.

Who said anything about absolute consistency from year to year? There isn't any. Unless you administer the SAME questions many, many times, you can't have it. And to do that would be ridiculous. You think one shipment of exams being lifted undetected would forever destroy the credibility of the MCAT? No. The exams are new every year.

Every year, we all get together on the Examkrackers site and the ones of us with better memory recreate all of the questions. No one ever sees them again. People have taken 5 and 6 MCATs; no one ever sees redundancy...

You have been spouting poorly thought-out gibberish for as long as you've been at SDN....
 
Originally posted by hightrump
"If experimental passages are ONLY experimental then how are new test forms created and normalized?"

They are not normalized. Not perfectly, at least.

Your big retort to my observation was that they could have many experimental passages, and that they may only use some old questions in new exams. THAT GETS YOUR POINT NOWHERE.

All I said was that there was at least one experimental passage on each section. The AAMC staff have said this several times on their e-mcat forum. They did not get specific as to how many passages were experimental. They also made it clear that experimental passages are not scored. My point was that you don't need MANY experimental passages to create a new form, as you were claiming. I don't know why you can't get this.

Originally posted by hightrump

If you have HALF experimental passages, then

A) the test is meaningless, because two people with timing difficulty may both only do half, but depending on which half they do, one guy could get a 1 and one guy could get a 15. *****ic. You cannot have 1/2 the exam not count.

That would leave the next exam takers with the old reused half to take and then a new, ungraded half.
You are obviously not a perceptive thinker. Let's have a little thought experiment to see where your "logic" takes us. Let's assume that there was only one experimental passage and that it was placed in a test randomly. Also, let's say that student B had his experimental passage as the last passage on his test, whereas student C had his experimental passage as the first. Let's also assume that they both ran out of time and couldn't finish the last passage. Let's then assume that both students got the same percentage of questions correct, that student C got all the questions on the first passage correct, and that their tests were of the same level of difficulty. Student B will get a better scaled score than student C, because student B missed the passage that doesn't count, while student C lost credit for his spectacular performance on his first passage (which doesn't count). According to your line of thinking this would be unfair and thus it is "*****ic." Well, let me break this to you gently: THE AAMC HAS SAID THAT THERE ARE EXPERIMENTAL PASSAGES AND THAT THEY DO NOT COUNT TOWARDS YOUR SCORE. This means that the above scenario does happen and it is fair. It is fair because you are SUPPOSED to finish the test. It is NOT the job of the AAMC to help you maximize your score. Their job is to try as hard as possible to administer the test in a "standard" way. Since the likelihood of being screwed by an experimental passage is the same for EVERYONE, it is a fair test.
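
The thought experiment above can be put in numbers. This is a sketch under assumptions: 7 passages of 11 questions each (an invented split that happens to total 77), one experimental passage, and both students answering 54 questions correctly overall while leaving the last passage blank.

```python
QUESTIONS_PER_PASSAGE = 11  # hypothetical: 7 passages x 11 = 77 questions


def scored_raw(correct_by_passage, experimental_index):
    """Sum correct answers over every passage except the experimental one."""
    return sum(correct for i, correct in enumerate(correct_by_passage)
               if i != experimental_index)


# Correct answers per passage; passage 7 was never reached by either student.
student_b = [9, 9, 9, 9, 9, 9, 0]    # experimental passage is last (index 6)
student_c = [11, 9, 9, 9, 8, 8, 0]   # experimental passage is first (index 0)

print(sum(student_b), sum(student_c))   # 54 54
print(scored_raw(student_b, 6))         # 54
print(scored_raw(student_c, 0))         # 43
```

Same number of correct answers, different scored totals, purely because of where the uncounted passage fell and which passage went unfinished.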
 
Originally posted by hightrump
So what happens? Well, you are still being judged against only ONE applicant pool, only now instead of judging you against your current peers they're judging you against last year's peers.
But wait... score-wise they're not. Since the people last year weren't GRADED on your questions from this year (they merely answered them), your score still cannot be compared to theirs or anyone's other than your peers', because NO ONE was ever graded on those passages before. So a 30 this year and last year are still different.
Although the quality of students taking the test over a 10-year period may improve significantly, the quality is not likely to improve dramatically from year to year; therefore, standardizing August tests with data obtained from April in the same year and August of the previous year is not unreasonable. The accuracy of this standardizing method becomes more apparent when one considers the fact that nearly 50% of the test-takers retake the exam. The assumption of steady-state conditions at the boundaries of dynamic processes is used quite often and judiciously in the sciences. It is very often used in chemical kinetics, thermodynamics, and population dynamics with a great degree of accuracy. Another error you make is that you assume that test items are tested only once. This is not true. Testing an item more than once makes it more likely that you will accurately determine its statistical properties. Another fact you are neglecting is that scales for test forms are determined not only by how well the students who took it (experimentally) did, but also by how well they did compared to how well they could have done (theoretically) on some predetermined "standard" test or tests. That is why the MCAT is called a "standardized" test.
These are all common methods used in standardizing tests. I am NOT saying that the AAMC uses my proposed algorithms. My algorithms are all speculative. I don't know exactly how they standardize their tests, since there are many valid ways of doing it. But I know what they don't do, and that is "curve" the tests just because this administration's students found them "hard". I know this because the fundamental theory behind statistical standardization is the same no matter who does it.

Originally posted by hightrump

Who said anything about absolute consistency from year to year? There isn't any. Unless you administer the SAME questions many, many times, you can't have it. And to do that would be ridiculous. You think one shipment of exams being lifted undetected would forever destroy the credibility of the MCAT? No. The exams are new every year.

Since scaled scores for "curved" tests would be based on percentile ranking, the percentiles for particular scaled scores would be APPROXIMATELY the same OVER TIME. This is because although there might be slight fluctuations from test to test, the net backward bias would be balanced by the net forward bias. However, this is NOT the case for the MCAT as is evidenced by the monotonic increase in scaled scores in the past 10 years; therefore, the MCAT is NOT curved. I don't know how else to make this any simpler for you.
Originally posted by hightrump

Every year, we all get together on the Examkrackers site and the ones of us with better memory recreate all of the questions. No one ever sees them again. People have taken 5 and 6 MCATs; no one ever sees redundancy...
Given that there are THOUSANDS of possible test items the AAMC can use, given that for each passage there are many questions of slightly different wording (but with the same level of difficulty), and given that there are many test forms randomly given out, what do you think the likelihood is of getting at least one passage you saw in a previous administration? Out of 30000 students taking the tests, what are the odds that the EXTREMELY FEW people who got a passage or question from a previous administration they took part in will be on the EK board? Add the fact that most retakers didn't do particularly well the first time they took it, and thus are very nervous during the retake, and you will realize that it is very unlikely they are going to remember the details of a test that has 214 questions and is mentally stressful.
Originally posted by hightrump

You have been spouting poorly thought-out gibberish for as long as you've been at SDN....

Wow, you must already harbor some deep, negative feelings towards me. You probably resent me because I took you to task in past threads for your non sequiturs. I hope you can forgive me.
 