Is it possible to know everything for the MCAT? AAMC content outline

There's no randomness associated with 520+ MCAT scores. What people perceive as "randomness" is a factor that cannot be prepped for. That is, that factor is how well you've already prepared your analytical reading/analytical science skills from the classes you've taken. On the new MCAT, the number one factor that determines whether you're in the mid-90s percentiles or the 99th+ percentile is whether you can take scientific data and make reasonable inferences from it based on your background knowledge. That sort of analytical reasoning is the most important skill to have and not something that can be easily prepped for.
Only 75% of the test relies on reasoning skill; 25% is discrete. Say you, with the big brain, can always get all of that 75% right (this is easy). How can you be sure that you always get almost all of the discretes right? To always score 520+, you have to score at least 97% in every section (or, if you are assuming your CARS is bad, 99%+ in all 3 science sections). Out of those 14 discretes you cannot miss more than 2 (or 1). THAT is the randomness that people are talking about.

And this does not include random mistakes, brain farts, distractions, misread questions, etc.
 
There's no randomness associated with 520+ MCAT scores. What people perceive as "randomness" is a factor that cannot be prepped for. That is, that factor is how well you've already prepared your analytical reading/analytical science skills from the classes you've taken. On the new MCAT, the number one factor that determines whether you're in the mid-90s percentiles or the 99th+ percentile is whether you can take scientific data and make reasonable inferences from it based on your background knowledge. That sort of analytical reasoning is the most important skill to have and not something that can be easily prepped for.
One phrase: confidence interval bands. The scores literally have them. What do you think that is?
 
One phrase: confidence interval bands. The scores literally have them. What do you think that is?

Mid-90s percentiles do not overlap with 99+ percentiles in the confidence bands. For example, the confidence band for a 526 is like 522-528. I'm referring to that score bracket as similar. In other words, there is no randomness associated with getting 520+ as opposed to getting <520. The skills associated with getting a 520+ are distinct from the skills required to get, say, 515-520. Now, if you're sitting at 520, the confidence interval will overlap both subsets but that just means that you have some incomplete mix of these two distinct sets of skills.

Also, note that statistical significance does not signify actual or substantial significance. In other words, two scores that are statistically indistinguishable may actually be distinguishable. The fact that they are statistically indistinguishable only says that our method of measurement may not be capturing the full essence of each score.
 
Only 75% of the test relies on reasoning skill; 25% is discrete. Say you, with the big brain, can always get all of that 75% right (this is easy). How can you be sure that you always get almost all of the discretes right? To always score 520+, you have to score at least 97% in every section (or, if you are assuming your CARS is bad, 99%+ in all 3 science sections). Out of those 14 discretes you cannot miss more than 2 (or 1). THAT is the randomness that people are talking about.

That's not how it works. Everything is graded on a standardized curve, so how many you can get wrong depends on the difficulty of the version of the exam you get. Getting the discrete questions right has to do with understanding that 1) you cannot possibly know every single detail of every topic that's listed on the topic list and 2) you can use process of elimination and reasoning to your advantage here, too. If it's an obscure fact, then almost nobody will be getting it except a very fortunate few who happened to stumble upon it, and the curve will reflect that. If it's a fact that someone who has mastered the material should know or be able to figure out, then the curve also recognizes that and will be harsh on that question.

There's no randomness to it. There's a reason why I was able to consistently score >520 on both FLs and ended up in that range on test day. If it were random, one would expect significant fluctuations (i.e. <520 in this case) between FLs because the discretes were all different between tests.
 
Also, note that statistical significance does not signify actual or substantial significance. In other words, two scores that are statistically indistinguishable may actually be distinguishable. The fact that they are statistically indistinguishable only says that our method of measurement may not be capturing the full essence of each score.

How?
 

The specific answer would depend on the specific circumstances used by the AAMC to calculate those intervals, which I do not believe is known to anybody other than the AAMC.

But in general, many people get caught up in the concept of "statistical significance" without actually thinking about what it means. I'll illustrate using an example. Imagine you're measuring weight gain on diet A versus diet B. You find that with a sample size of 100 people, you get a weight gain of 10 +/- 2 on diet A and 12 +/- 3 on diet B. The error is rather large because outliers influence the result more when you only have a small sample size. The error bars overlap, so you would say that the difference is statistically insignificant and move on. But now increase that sample size to 1,000 or 10,000 patients. You do the same measurements, but because you have many more data points, the law of large numbers pulls your sample mean toward the actual mean, and the standard error shrinks roughly as 1/sqrt(n). So you get a weight gain of 10 +/- 0.1 on diet A and 12 +/- 0.2 on diet B. Now the difference is statistically significant. All you've done is collect more data in order to improve your confidence in the measurement.

So the question is, is there actually a difference between diet A and diet B? Well, statistical significance can't answer that question! As you've seen, the same measurement when performed over a different sample size gives two different answers. So then if statistical significance may not reflect actual significance, then what is statistical significance? Well, it's simply your confidence in your measured numbers being arbitrarily close to the actual or real numbers. If you do more measurements over a larger sample, you're going to be more confident that your measured number will be closer to the real number, right?
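A minimal sketch of the diet example, for anyone who wants to see the arithmetic. It assumes things the post does not state: that the "+/-" figures are standard errors of the mean, and that individual weight gains have standard deviations of 20 (diet A) and 30 (diet B), chosen only so that n = 100 reproduces the 10 +/- 2 and 12 +/- 3 figures. The function names and the 1.96 cutoff (the usual two-sided 5% threshold) are likewise illustrative choices.

```python
import math

def standard_error(sd, n):
    """Standard error of a sample mean: sd / sqrt(n)."""
    return sd / math.sqrt(n)

def z_for_difference(mean_a, mean_b, sd_a, sd_b, n):
    """z-score for the difference of two independent sample means,
    each measured on n people."""
    se_diff = math.sqrt(standard_error(sd_a, n) ** 2 + standard_error(sd_b, n) ** 2)
    return (mean_b - mean_a) / se_diff

# Means are from the post; the standard deviations are ASSUMED for illustration.
MEAN_A, MEAN_B = 10.0, 12.0
SD_A, SD_B = 20.0, 30.0
Z_CUTOFF = 1.96  # conventional two-sided 5% threshold

for n in (100, 1_000, 10_000):
    se_a = standard_error(SD_A, n)
    se_b = standard_error(SD_B, n)
    z = z_for_difference(MEAN_A, MEAN_B, SD_A, SD_B, n)
    verdict = "significant" if abs(z) > Z_CUTOFF else "not significant"
    print(f"n={n:>6}: A = {MEAN_A} +/- {se_a:.2f}, B = {MEAN_B} +/- {se_b:.2f}, "
          f"z = {z:.2f} ({verdict})")
```

The 2-unit gap between the diets never changes in this sketch; only n does. With these assumed spreads the error bars at n = 10,000 come out at about 0.2 and 0.3, in the same ballpark as the figures in the example, and the verdict flips from "not significant" to "significant" on sample size alone.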
 
There's no randomness to it. There's a reason why I was able to consistently score >520 on both FLs and ended up in that range on test day. If it were random, one would expect significant fluctuations (i.e. <520 in this case) between FLs because the discretes were all different between tests.

There are people who take the AAMC scored and get like a 510-512 a few days before the test and then somehow pull a 520+ on test day. The opposite has also happened. Admittedly it's more rare but I'm guessing they did a good job reviewing the few days before the test and/or they simply got a test that included topics they were strong on. There is undeniably a small luck factor with this test especially after a certain percentile.
 
There are people who take the AAMC scored and get like a 510-512 a few days before the test and then somehow pull a 520+ on test day. The opposite has also happened. Admittedly it's more rare but I'm guessing they did a good job reviewing the few days before the test and/or they simply got a test that included topics they were strong on. There is undeniably a small luck factor with this test especially after a certain percentile.

If you think so. It is my opinion that there is no luck component that will alter one's score by more than 1 point in either direction, given that one has prepared adequately by the AAMC outline (i.e. if you skip a whole topic, then don't expect to magic out the answers to discretes on that topic).
 
If you think so. It is my opinion that there is no luck component that will alter one's score by more than 1 point in either direction, given that one has prepared adequately by the AAMC outline (i.e. if you skip a whole topic, then don't expect to magic out the answers to discretes on that topic).

We are all entitled to our opinions.
 
The specific answer would depend on the specific circumstances used by the AAMC to calculate those intervals, which I do not believe is known to anybody other than the AAMC.

But in general, many people get caught up in the concept of "statistical significance" without actually thinking about what it means. I'll illustrate using an example. Imagine you're measuring weight gain on diet A versus diet B. You find that with a sample size of 100 people, you get a weight gain of 10 +/- 2 on diet A and 12 +/- 3 on diet B. The error is rather large because outliers tend to influence the result more when you only have a small sample size. Now you would say that this is statistically insignificant and move on. But now increase that sample size to 1000 or 10,000 patients. Now, you do the same measurements but because you have many more data points, by the central limit theorem, you're going to converge on the actual mean pretty hard. So you get a result of weight gain of 10 +/- 0.1 on diet A and 12 +/- 0.2 on diet B. Now these are statistically significant. All you've done is collect more data in order to improve your confidence in the measurement.

So the question is, is there actually a difference between diet A and diet B? Well, statistical significance can't answer that question! As you've seen, the same measurement when performed over a different sample size gives two different answers. So then if statistical significance may not reflect actual significance, then what is statistical significance? Well, it's simply your confidence in your measured numbers being arbitrarily close to the actual or real numbers. If you do more measurements over a larger sample, you're going to be more confident that your measured number will be closer to the real number, right?

Thanks for the detailed example illustrating the differences. But I'm still a bit confused. Aren't the confidence intervals for MCAT scores quite small (and thus more precise, with less overlap in standard error) due to the large sample size of test takers (tens of thousands)? I'm not sure why there is then a difference between statistical and actual significance. Wouldn't two statistically indistinguishable scores almost always be actually indistinguishable?
 
We are all entitled to our opinions.

Of course. But if you achieve those percentiles and still believe that luck played a substantial role in your achieving that score, then I would be more inclined to believe you.
 
Thanks for the detailed example illustrating the differences. But I'm still a bit confused. Aren't the confidence intervals for MCAT scores quite small (and thus more precise) due to the large sample size of test takers (tens of thousands)? I'm not sure why there is then a difference between statistical and actual significance. Wouldn't two statistically indistinguishable scores almost always be actually indistinguishable?

What do you define as small? Like I said, the confidence interval for 526 is from 522 to 528, a 6-point range. Many people would kill for a 6-point boost in their MCAT. I'm not sure whether that narrows out as the mean score is approached.

The point is, statistical significance says nothing about actual significance. In other words, actual significance can't be determined by the sample size - that's a completely arbitrary thing. It's either significant or not - there's no "it's only really significant if you do the measurement on 10,000 people." Therefore, since statistical significance relies on sample size, then statistical significance can't say anything about actual significance. Which brings me to the elephant in the room. How does one measure actual significance? There's no good measure for actual significance that involves number crunching. Sorry to the statisticians, but there's simply no good way. But there are other ways one can tell if an effect is actually significant. In medicine, for example, it doesn't matter whether some drug lowered some biomarker by some statistically significant amount or not. If that drug has an actual effect, then it should cure what it's supposed to be curing, biomarkers be damned. But then you run into the problem of drugs that aren't 100% effective. So then a portion of patients are cured and a portion are not. Then you have to crunch the numbers again and see if the number cured with the treatment is significantly different from the placebo.

So it's just a huge, self-perpetuating cycle and we use it because we don't have a better method of measuring actual significance. Statistical significance is a good measure of actual significance in many cases but one always has to keep in mind that it's an arbitrary measure that we use. Like all arbitrary things, it may not apply in all cases and likely does not.

Whether the MCAT score confidence intervals are actually significant would depend on how the AAMC is estimating its own error.
 
That's not how it works. Everything is graded on a standardized curve, so how many you can get wrong depends on the difficulty of the version of the exam you get. Getting the discrete questions right has to do with understanding that 1) you cannot possibly know every single detail of every topic that's listed on the topic list and 2) you can use process of elimination and reasoning to your advantage here, too. If it's an obscure fact, then almost nobody will be getting it except a very fortunate few who happened to stumble upon it, and the curve will reflect that. If it's a fact that someone who has mastered the material should know or be able to figure out, then the curve also recognizes that and will be harsh on that question.
1) You are wrong. I knew every single detail listed on the content outline so when I saw a question I could immediately tell whether it was on it or not. Do you know why? Because I didn't take any class. I studied by google. If a topic requires depth 3, I would be damn sure to be at least 6 into it. BUT I did that for fun, it should not be necessary.

2) lol. They will never have an obscure discrete with 4 obscure answer choices. People who can get the 75% reasoning questions right are smart enough to eliminate at least 1 and often 2. Then you have a 50% or 33% chance of getting each of those questions right. If you think 50%/33% will only result in "a very fortunate few," more power to you. Plus, the curve around 520 reflects that.
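To put rough numbers on the guessing argument, here is a small sketch. It assumes, as a simplification that nothing in the thread or from the AAMC confirms, that each guessed discrete is an independent guess with the same chance of being right. The count of 4 guessed questions is hypothetical; the budget of 2 misses is the figure quoted earlier in the thread.

```python
from math import comb

def prob_at_most_misses(n_guessed, p_correct, max_misses):
    """P(misses <= max_misses) when each of n_guessed questions is an
    independent guess that is right with probability p_correct."""
    p_miss = 1.0 - p_correct
    return sum(
        comb(n_guessed, k) * p_miss ** k * p_correct ** (n_guessed - k)
        for k in range(max_misses + 1)
    )

N_GUESSED = 4    # HYPOTHETICAL: discretes the test-taker has to guess on
MAX_MISSES = 2   # miss budget quoted earlier in the thread

for label, p in (("two choices left", 1 / 2), ("three choices left", 1 / 3)):
    chance = prob_at_most_misses(N_GUESSED, p, MAX_MISSES)
    print(f"{label}: P(at most {MAX_MISSES} misses in {N_GUESSED} guesses) = {chance:.3f}")
```

Under those assumptions, a handful of guesses is far from "a very fortunate few" territory on any single question, but it still leaves a real chance of blowing the miss budget. Someone who has to guess on fewer discretes shrinks that chance quickly.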
The confidence interval for 526 is from 522 to 528, a 6-point range. Many people would kill for a 6-point boost in their MCAT.
Your 6-point range may be equivalent to 6 correct answers. I posit that at around 505 (~the mean), your 6 correct answers, if spread evenly, may result in an increase of between ... 0 and 1 point. So no, the majority of people will not give anything for your 6 correct answers.

There's no randomness to it. There's a reason why I was able to consistently score >520 on both FLs and ended up in that range on test day. If it were random, one would expect significant fluctuations (i.e. <520 in this case) between FLs because the discretes were all different between tests.
N = 1, and I can find examples that nullify yours.

And yes, if I discounted CARS, I would consistently score at least 130 in all other sections. But for me to consistently score >520, I have to score at least 132 (lol)/131. That consistency is impossible due to the randomness.

And you did not get 528. If your method is of such certainty, why did you not score 528? What went so horribly wrong? Are you sure if you retake the test, you will score exactly the same in all 3 science sections? If there is no randomness (your words, not mine) why is there even a range to begin with? If you did not literally mean "no randomness," how much/ little randomness will qualify?

Edit: The thread is in relation to the AAMC content outline. Just to be clear, I'm not disputing that you or some particular person can consistently score 520+. I am saying that you can't if you don't step outside the content outline. How far outside? It depends on what "classes" you take or on having the correct "background." It's not probable if you rely on the AAMC 100%. It is not even fair.
 
The confidence bands only really show that, maybe due to random chance, you may have gotten one or two more wrong or correct, yet there is more to the story. In each section there are 59 questions (53 on CARS), and often the difference at the high end of the scale (129 vs 131 vs 132) is one question, maybe two at most, which could have been guessed, misread, lost to nerves, whatever. If someone is a consistent top-ish scorer (>514-516), there is potential on test day that they will see a huge jump or a huge decrease, or something in between. The mid-90s cannot really be differentiated from the 99th if a person is realistic about it; it would be pretty silly to think that someone who got 225/230 questions right is a genius, but boy do those kids in the 220/230 range struggle with analytical and reading skills.

On top of this, the test is just there to see if you have the aptitude to pass the boards as a student. That is it, and anything over 500 is supposed to be 'good enough.' It's not for measuring who gets their tiara as the little miss genius of America pageant winner, and to assume that it does so is incorrect.

There's no randomness associated with 520+ MCAT scores. What people perceive as "randomness" is a factor that cannot be prepped for. That is, that factor is how well you've already prepared your analytical reading/analytical science skills from the classes you've taken. On the new MCAT, the number one factor that determines whether you're in the mid-90s percentiles or the 99th+ percentile is whether you can take scientific data and make reasonable inferences from it based on your background knowledge. That sort of analytical reasoning is the most important skill to have and not something that can be easily prepped for.
 
The MCAT does not test much specialized knowledge. Knowing a little bit about everything is sufficient. Being able to pull the important information out of the passages and synthesize it with basic scientific principles will bring you much farther than knowing every fact on the list.
 
If you think so. It is my opinion that there is no luck component that will alter one's score by more than 1 point in either direction, given that one has prepared adequately by the AAMC outline (i.e. if you skip a whole topic, then don't expect to magic out the answers to discretes on that topic).
I think it is incredible that you think you would get the exact same score, down to the subsection scores, if you took multiple MCATs with the same level of preparation. That is the entire point of the CI. There is a +/-2 point band. That is four points, the difference between a 516 and a 520. You may revolve around your mean, but that doesn't mean you can't get a softball with a few passages based on your strongest prep, or fall on tough luck with some passages from topics that are difficult for you. There is an incredible amount of randomness to life in general; saying that the exam is perfect is arrogant. I haven't even delved into the personal performance issues you may face, like quality of sleep, nerves, or digestive trouble. All of those things can ultimately impact performance on a personal level, really swinging scores.
 
1) You are wrong. I knew every single detail listed on the content outline so when I saw a question I could immediately tell whether it was on it or not. Do you know why? Because I didn't take any class. I studied by google. If a topic requires depth 3, I would be damn sure to be at least 6 into it. BUT I did that for fun, it should not be necessary.

2) lol. They will never have an obscure discrete with 4 obscure answer choices. People who can get the 75% reasoning questions right are smart enough to eliminate at least 1 and often 2. Then you have a 50% or 33% chance of getting each of those questions right. If you think 50%/33% will only result in "a very fortunate few," more power to you. Plus, the curve around 520 reflects that.

That's the mindset that can get you into the 515-520 range but what puts one into the 520+ range isn't luck. I think your argument there is more self-serving than not. Does not breaking into the 520+ range mean that you weren't lucky? No. Does it mean that you didn't prep enough? No. What puts someone into 520+ isn't the level of prep they did in the months leading up to the exam but rather whether they have mastered the critical reasoning and analysis skills required to put them there in their four years of undergraduate education. That's why I don't teach chemistry - even organic - with an emphasis on memorization or content knowledge. Knowing what reagents to use to reduce an aldehyde selectively over a ketone is good but in real life, you can always look that up easily. What I care about - and what I test on - is the ability to reason based on information given. I do that because based on our experience, students who develop those skills do very well on the MCAT (i.e. getting into the 520+ range as opposed to the 515-520 range).

N = 1, and I can find examples that nullify yours.

And yes, if I discounted CARS, I would consistently score at least 130 in all other sections. But for me to consistently score >520, I have to score at least 132 (lol)/131. That consistency is impossible due to the randomness.

And you did not get 528. If your method is of such certainty, why did you not score 528? What went so horribly wrong? Are you sure if you retake the test, you will score exactly the same in all 3 science sections? If there is no randomness (your words, not mine) why is there even a range to begin with? If you did not literally mean "no randomness," how much/ little randomness will qualify?

Edit: The thread is in relation to the AAMC content outline. Just to be clear, I'm not disputing that you or some particular person can consistently score 520+. I am saying that you can't if you don't step outside the content outline. How far outside? It depends on what "classes" you take or on having the correct "background." It's not probable if you rely on the AAMC 100%. It is not even fair.

Your definition of randomness and mine are not the same. Like I said to someone else above, I'm referring to there being no luck or magic in getting a score in the 520+ range as opposed to the 515-520 range. I'm not saying that there is no fluctuation about the score itself. You're taking all this to mean that there is no variability in the score itself. There can be variability in the number - small variability. If someone with a 526 retakes the exam, he or she might score a 522 or a 528. That's variability (for a discussion on statistical analysis, see above post). But there's nothing random about that person getting a 522 as opposed to a 516. There's nothing random about getting a 520+ score versus getting a 515-520 score. That's what I'm saying.

In response to your edit, I believe this is where you can find some sense in what I'm saying. Doing everything on the content outline to the adequate depth (as you yourself admit to doing) will get you into the 515-520 range. Speak with other high scorers (520+) and you'll find that many agree with me. But what gets you into 520+ isn't on the outline. Not because it's extraneous knowledge but rather because it relies on reasoning ability. That's why people have so much trouble on CARS. You can get by with good scientific reasoning skills with excellent, over-the-top content knowledge in the science sections. But in order to do well in CARS, you have to have well-developed reasoning skills and you can't get away with anything less than that. So if reasoning is weak, then one would have to make up for that with very good content knowledge (if you have good content knowledge, you can rule out many answer choices even in data analysis questions). To do consistently well (>130 on each section), you have to have excellent reasoning skills and content knowledge. That puts you in the top.
 
The confidence bands only really show that, maybe due to random chance, you may have gotten one or two more wrong or correct, yet there is more to the story. In each section there are 59 questions (53 on CARS), and often the difference at the high end of the scale (129 vs 131 vs 132) is one question, maybe two at most, which could have been guessed, misread, lost to nerves, whatever. If someone is a consistent top-ish scorer (>514-516), there is potential on test day that they will see a huge jump or a huge decrease, or something in between. The mid-90s cannot really be differentiated from the 99th if a person is realistic about it; it would be pretty silly to think that someone who got 225/230 questions right is a genius, but boy do those kids in the 220/230 range struggle with analytical and reading skills.

On top of this, the test is just there to see if you have the aptitude to pass the boards as a student. That is it, and anything over 500 is supposed to be 'good enough.' It's not for measuring who gets their tiara as the little miss genius of America pageant winner, and to assume that it does so is incorrect.

No, actually confidence bands, by definition, show how confident the person doing the measuring is that the measurement being performed accurately measures the variable being measured, and to what extent. They say nothing about whether the people being measured are actually different in terms of something concrete. For example, one could make measurements of MCAT score between black people and white people and, using very large sample sizes, may get something like 514 +/- 0.1 for black people and 513 +/- 0.4 for white people. Now, there is a statistical difference in MCAT score between these groups of people. That's mathematical fact. But are these people actually different in terms of scientific reasoning ability or whatever the MCAT purports to measure?

That question goes into uncharted territory. An answer to that question can only be based in opinion, because nobody knows the true value of these intangibles. Is someone who got 225/230 a genius and someone who got 220/230 an idiot? No. You're thinking only at the extremes. Does someone who got a 520+ have better analytical skills than someone who got a 516? Perhaps, but that's my opinion. That's how I feel. My point here is that statistical analysis cannot tell you the answer.

While the MCAT is correlated with Step 1 scores, neither of these exams measures how good a doctor you will be. Some excellent doctors failed their Step 1 the first time around. Some bad ones passed with flying colors. Both of these exams purportedly measure analytical ability, but does analytical ability really translate into being a good doctor? No. Someone with excellent analytical ability isn't necessarily going to be a better doctor than someone with only good analytical ability. Much more goes into that. But the question I'm trying to answer here is quasi-tautological: does someone with excellent analytical ability have better analytical skills than someone with good analytical ability? Yes. But does that really matter in the grander scheme of things? No. Someone with a 520+ MCAT score isn't going to be any better a doctor than someone with a 515-520. But the former does have better analytical reasoning abilities - in my opinion.
 
I think it is incredible that you think you would get the exact same score, down to the subsection scores, if you took multiple MCATs with the same level of preparation. That is the entire point of the CI. There is a +/-2 point band. That is four points, the difference between a 516 and a 520. You may revolve around your mean, but that doesn't mean you can't get a softball with a few passages based on your strongest prep, or fall on tough luck with some passages from topics that are difficult for you. There is an incredible amount of randomness to life in general; saying that the exam is perfect is arrogant. I haven't even delved into the personal performance issues you may face, like quality of sleep, nerves, or digestive trouble. All of those things can ultimately impact performance on a personal level, really swinging scores.

The exam is not perfect. If you miss questions because of topics that are difficult for you, then perhaps spending more time on reviewing those topics would be more beneficial. The idea is to be strong (I would say at least 7/10) on all topics that could show up. At that point, there will not be more than a 1 point difference in subsection score when you take it multiple times, ceteris paribus - in my opinion. If you don't get enough sleep the night before or decide to eat curry, that's on you. If it's an unforeseen medical problem, then yes of course there will be a difference between test dates but that's not my point. If you have nerves on one exam, you'll likely have it on the next, too, so it's not like it'll affect one performance disproportionately. I'm saying that if you go into two test dates under the same conditions (i.e. similar level of sleep, preparedness, nerves, etc.) but take two different versions of the test, there would not be more than a 1 point difference (I'm talking about subsection scores - I'm not sure if that was clear) between your scores.

Now let me get into the problems with confidence intervals (I don't know why people on here seem to accept everything at face value). Think about how a confidence interval is created. You perform some measurement on some large sample of people. That's the basic premise. The AAMC is using real data - not perfect statistical data. So bias is already introduced into that data - most people who go into that data set aren't at the 7/10 strength on all topics, perhaps they got much less sleep the first time, etc. So of course at the population level, there will be variability based on individual biases. However, that doesn't mean that the confidence interval can apply to everyone. When you try to apply a confidence interval constructed from a population measurement to one individual, you have to assume that the individual's characteristics match those of the population on average. See the problem here? The confidence interval says nothing about somebody who goes into the exam at at least 7/10 strength on all topics, well-rested, and with no freak accidents. For a better and more thorough statistical analysis of the confidence interval, see above posts.
 
Everything is graded on a standardized curve so that how many you can get wrong depends on the difficulty of the version of the exam you get.

This isn't true. The MCAT is not graded on a curve; it is scaled. It is scaled to the historical scores that someone with your percentage correct has received in the past. The idea that each exam is curved is a myth.
 
This isn't true. The MCAT is not graded on a curve; it is scaled. It is scaled to the historical scores that someone with your percentage correct has received in the past. The idea that each exam is curved is a myth.

Do you understand what a curve is? I never said each exam is curved based on other test-takers that day. I believe that most people know that is untrue. Each exam is curved based on a standardized curve - that's a curve that has already been constructed based on previous results. Last year, they updated the curve once more people had taken the MCAT and each MCAT now is based on that curve. So yeah, each exam is curved. It's not curved based on what other people got on your test day but rather based on a curve that has already been set - that's how you get a scaled score.
 
Soooooooooooo. What you are saying is people have different "versions" of the test, but their scores have no relation to the difficulty/content of the test they took. I used "versions" loosely because one C/P version may have 4 O.Chem passages and 0 biochem while another has the complete opposite. Using your drugs analogy, basically, you are giving people drug X and try to assess their responses. The caveat is that there are, say, 20 ingredients in drug X and you can only pack around 6 with random proportions. Each cohort is given a different version. But you never analyze individual cohort but group all of them.
Now you go to the tail end of an arbitrarily defined reactions scale, then arbitrarily bin them into random percentiles and declare that there is a difference between the 99th percentile and the 95th-98th percentiles.
???

That's the mindset that can get you into the 515-520 range but what puts one into the 520+ range isn't luck. I think your argument there is more self-serving than not. Does not breaking into the 520+ range mean that you weren't lucky? No. Does it mean that you didn't prep enough? No. What puts someone into 520+ isn't the level of prep they did in the months leading up to the exam but rather whether they have mastered the critical reasoning and analysis skills required to put them there in their four years of undergraduate education. That's why I don't teach chemistry - even organic - with an emphasis on memorization or content knowledge. Knowing what reagents to use to reduce an aldehyde selectively over a ketone is good but in real life, you can always look that up easily. What I care about - and what I test on - is the ability to reason based on information given. I do that because based on our experience, students who develop those skills do very well on the MCAT (i.e. getting into the 520+ range as opposed to the 515-520 range).


Your definition of randomness and mine are not the same. Like I said to someone else above, I'm referring to there being no luck or magic in getting a score in the 520+ range as opposed to the 515-520 range. I'm not saying that there is no fluctuation about the score itself. You're taking all this to mean that there is no variability in the score itself. There can be variability in the number - small variability. If someone with a 526 retakes the exam, he or she might score a 522 or a 528. That's variability (for a discussion on statistical analysis, see above post). But there's nothing random about that person getting a 522 as opposed to a 516. There's nothing random about getting a 520+ score versus getting a 515-520 score. That's what I'm saying.
How did you magically conjure up those bins? I just asked you what caused the fluctuation; I couldn't care less about variability within a data set of many test takers (I don't think you took the MCAT more than once, right? Or enough times to make what may resemble a data set, hence the confusion). So why didn't you score 528? And on the versions where you scored below 526, why did you not score 526 then? Very simple questions with a similarly simple answer, starting with "R." *wink*


In response to your edit, I believe this is where you can find some sense in what I'm saying. Doing everything on the content outline to the adequate depth (as you yourself admit to doing) will get you into the 515-520 range. Speak with other high scorers (520+) and you'll find that many agree with me. But what gets you into 520+ isn't on the outline. Not because it's extraneous knowledge but rather because it relies on reasoning ability. That's why people have so much trouble on CARS. You can get by with good scientific reasoning skills with excellent, over-the-top content knowledge in the science sections. But in order to do well in CARS, you have to have well-developed reasoning skills and you can't get away with anything less than that. So if reasoning is weak, then one would have to make up for that with very good content knowledge (if you have good content knowledge, you can rule out many answer choices even in data analysis questions). To do consistently well (>130 on each section), you have to have excellent reasoning skills and content knowledge. That puts you in the top.
No. I did not have adequate depth; I am beyond deep in every single one of the topics in the outline. That was how I knew if a word, not even a topic, a word, was not covered. Speak with other high scorers? You cannot seriously think that is a good idea. What do you want me to do? Retake the test to score higher than 520 and declare myself lucky? And stop with the data analysis, man, those are easy.

I don't know, man. Since I know both of us have absolutely concrete evidence in hand, your guess is as good as mine. But let's go a bit deeper. That was what I said and what you dodged:

They will never have an obscure discrete with 4 obscure answer choices. People who can get the 75% reasoning questions correctly are smart enough to at least eliminate 1 or often 2. Then you have 50% or 33% chance of getting each of those questions correctly. If you think 50%/33% will only result in "a very fortunate few," more power to you.

If you look at the 130-132 bins of every single section you will notice that the height of each of them is about 50%-33% with respect to the previous bin. Does that mean I was right that to get +520 you had to rely on luck? Of course not. I need more data than that. BUT at least, I have some numerical basis to lean on.
What do you have to explain that?
N=1 plus that the "reasoning ability" that "required to put them there in their four years of undergraduate education." (??)

As to your CARS thing, geez I don't know. How do you explain 126/132/127/128 -> retake 128/128/130/130. Did this person suddenly lose his "reasoning ability" that "required to put them there in their four years of undergraduate education"?

I wonder what the data for CARS look like if we only look at test takers who are humanities majors? I think that the curve will shift considerably to the right. Do you wanna bet? What about CARS data for only Canadians who must rely on CARS to get into med school? I think the same will happen. Wanna bet?
 
The big separation on the exam are those passages / questions that are intended to be tough or ambiguous. The highest scorers reason through them using what they know, what they're given, and by process of elimination. A lot of us did that, the highest scorers just do it more consistently and get most of the questions that they "didn't even know" right. I think the exam is written this way on purpose. If it were strictly content based a lot of us would ace it pretty easily. In this sense I don't think the scoring is really that random. I enjoy reading everyone's take on it, though.

For reference I am not a 520+ scorer, but I did well and balanced across my sections. I also hit my wall in studying where I knew that improving my analytical skills to the highest level would take maybe another year of research or something similar for a minimal score increase. Or I would have needed to engage in a more rigorous middle school, high school, and undergrad academic load to get to the ultimate level, which wasn't really needed for me to become a doctor. No short term prep can make up for lifelong developed skills and I think that's more what @aldol16 is saying. (Feel free to correct me if I'm wrong)

I think a lot of people take the test without really being ready, get flustered, and get owned on a lot of the questions. I wonder what the scales would be like if more people studied correctly.
 
Soooooooooooo. What you are saying is people have different "versions" of the test, but their scores have no relation to the difficulty/content of the test they took. I used "versions" loosely because one C/P version may have 4 O.Chem passages and 0 biochem while another has the complete opposite. Using your drugs analogy, basically, you are giving people drug X and try to assess their responses. The caveat is that there are, say, 20 ingredients in drug X and you can only pack around 6 with random proportions. Each cohort is given a different version. But you never analyze individual cohort but group all of them.
Now you go to the tail end of an arbitrarily defined reactions scale, then arbitrarily bin them into random percentiles and declare that there is a difference between the 99th percentile and the 95th-98th percentiles.

Analogy incorrect. More like 20 ingredients in drug X but each of those ingredients has an alternative that is supposed to do the same thing (maybe in a slightly different way). Just like how each passage is supposed to measure similar things - i.e. content knowledge, reasoning from the data, reasoning beyond the text, etc. The only case where your analogy would make sense is for the limited discrete questions you get on the exam, which may not all measure the same thing across exams (although I would argue that if one is prepared for all the topics on the exam, then it doesn't matter if one gets an OChem discrete or a biochem discrete).

Also, scores have very much to do with difficulty of the version of exam. Just not in the way you think. Scores are curved based on the difficulty of that version (based on a set curve). But your scores should not be affected by you getting four passages on topics you're "weak" on because you shouldn't be "weak" in any topics. That's on you. You should gain an adequate understanding to do well especially in your weak subjects. As I said above, I'd put that at 7/10 personally.

There is at least a statistical difference between the 99th+ percentile and 95th percentile. From the curve, a 99th+ percentile score corresponds to 523 and up whereas a 95th percentile score corresponds to a 516. Those confidence intervals do not overlap. That's not a matter of opinion.
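The non-overlap claim is plain interval arithmetic, and a minimal sketch makes it easy to check. It assumes a band of plus or minus 2 points around a total score; that half-width is an assumption made here for illustration, not a figure given in this thread, and the helper names `band` and `overlaps` are invented.

```python
def band(score, half_width=2):
    """Return (low, high) for a score band. half_width=2 is an assumed value."""
    return (score - half_width, score + half_width)

def overlaps(a, b):
    """True if two closed intervals (low, high) share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

# Scores taken from the post: 516 (95th percentile) and 523 (99th+ percentile).
print(band(516), band(523), overlaps(band(516), band(523)))
```

Under that assumed width the two bands stay apart, while scores only a few points apart (say 518 and 521) would share part of their bands, which is the case the next paragraph treats as a matter of opinion.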

What is a matter of opinion is whether I think there are finer distinctions between individuals who have overlapping confidence intervals. I do.

How did you magically conjure up those bins? I just asked you what caused the fluctuation; I couldn't care less about variability within a data set of many test takers (I don't think you took the MCAT more than once, right? Or enough times to make what may resemble a data set, hence the confusion). So why didn't you score 528? And on the versions where you scored below 526, why did you not score 526 then? Very simple questions with a similarly simple answer, starting with "R." *wink*

Like I said, I do not believe that if one goes into the exam and takes it multiple times, conditions ceteris paribus, that one would get more than a 1-point deviation in each subsection. That is the only luck or "randomness" I would allow for, as I said above to another poster. You're interested in fluctuations in score. I'm not. I'm interested in whether there is any randomness with scoring 520+. As in, is a person scoring 520+ actually different from someone in the 515-520 range or is it just "luck"? I'm not arguing that there's no statistical randomness associated with measuring someone's real MCAT score. But there's nothing random that put them there as opposed to the next lowest level.

No. I did not have adequate depth; I am beyond deep in every single one of the topics in the outline. That was how I knew if a word, not even a topic, a word, was not covered. Speak with other high scorers? You cannot seriously think that is a good idea. What do you want me to do? Retake the test to score higher than 520 and declare myself lucky? And stop with the data analysis, man, those are easy.

I believe that if one has adequate reasoning abilities, one can go into much less depth and still do very well on the exam. As I said, the way you can usually tell is by any discrepancy between CARS and other subsection scores. If someone needs to go into excessive depth to do well, then that person is relying more on the small details getting them to a high score than their reasoning abilities. That shows because on CARS, there's no detail to know. There's only reasoning.

I don't know, man. Since I know both of us have absolutely concrete evidence in hand, your guess is as good as mine. But let's go a bit deeper. That was what I said and what you dodged:

If you look at the 130-132 bins of every single section you will notice that the height of each of them is about 50%-33% with respect to the previous bin. Does that mean I was right that to get +520 you had to rely on luck? Of course not. I need more data than that. BUT at least, I have some numerical basis to lean on.

I believe this has more to do with your statistical and mathematical reasoning than me. I ignored it because it didn't make sense and here's why it didn't make sense. The first problem is that if the 50% and 33% chance of getting a question right has a direct relationship with getting a 130 vs. 131 or 131 vs. 132, then one would expect each one of those successive bins to be equal in height to or half the height of the previous bin, respectively - that is, 100-50% not 50-33%. For example, 100 people are stuck between 131 and 132. Imagine that the score hinges on one question and all of those people have a 50% chance of getting that right. Then one would expect 50 to get it right and 50 to get it wrong. So the first 50 would get the 132 whereas the last 50 would get a 131. The height of those bins would be the same. Now imagine the same scenario but with a 33% chance of getting the question right. You would expect about 33 people to get it right and 67 to get it wrong. Therefore, 33 people would get the 132 and 67 would get a 131. The size of the 132 bin would be half that of the 131 bin.
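The bin arithmetic above can be checked with a quick sketch. It uses the same simplification the post makes for the sake of argument: 100 test-takers sit on the 131/132 boundary and a single guessed question decides which bin each lands in. The function name `expected_bins` is made up for illustration.

```python
from fractions import Fraction

def expected_bins(n_people, p_correct):
    """Expected head counts (lower_bin, upper_bin) when one guess decides the score."""
    upper = n_people * p_correct   # guessed right, land in the higher bin
    lower = n_people - upper       # guessed wrong, stay in the lower bin
    return lower, upper

for p in (Fraction(1, 2), Fraction(1, 3)):
    lower, upper = expected_bins(100, p)
    ratio = upper / lower          # height of the 132 bin relative to the 131 bin
    print(f"p={float(p):.2f}: 131 bin ~{float(lower):.0f}, "
          f"132 bin ~{float(upper):.0f}, ratio {float(ratio):.2f}")
```

A 50% guess gives bins of equal height and a 33% guess gives an upper bin half the height of the lower one, which is the 100-50% pattern described rather than 50-33%.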

Second, the above even gives you the benefit of the doubt that there's a direct, 1:1 relationship between getting a question right and scoring that extra point on the MCAT. The 50% drop in height of the bins from 130-132 with each successive point in the histogram only tells you that 50% fewer test takers score in that bin. That is, there are fewer and fewer people getting those scores. The 50% chance of getting a question correct after eliminating two (or two-and-a-half) answer choices would correlate directly with the drop in frequency of people getting into those top bins only if scoring on the MCAT did not take into account the curve (by curve, I mean the standardized curve that is already set, lest anybody think I'm referring to a curve that's set the day of). If it turns out that only 5-10% of test-takers get a question right (let's factor in also all the people who cannot rule out an answer choice), then that question won't be given much weight by the AAMC - or, rather, there will be more leeway on that version of the exam. So there's no 1:1 relationship between bin size and each successive point on the MCAT past 130.

Third, how much weight a question is given (reflected in the leeway allowed in a particular version of the exam) is based not on how many high-scorers get it right but rather on how many people who take the exam get it right. So if a high scorer can eliminate, say, 1 answer choice out of 4, that person now has a 33% chance of guessing correctly. But if a high-scorer only has a 33% chance of getting it right, then what chance does an average test-taker have? 15%? 10%? And what about the other half of people on the other side of the bell curve? 5%? 1%? So a tough discrete on which a high-scorer would have a 33% chance might actually only be answered correctly 5% of the time in the general test-taking population and thus not be accorded much weight by the AAMC. So then your score becomes driven even more by reasoning ability and less by content knowledge.

So no, you actually don't have good numerical bases to lean on.

As to your CARS thing, geez I don't know. How do you explain 126/132/127/128 -> retake 128/128/130/130. Did this person suddenly lose his "reasoning ability" that "required to put them there in their four years of undergraduate education"?

I wonder what the data for CARS look like if we only look at test takers who are humanities majors? I think that the curve will shift considerably to the right. Do you wanna bet? What about CARS data for only Canadians who must rely on CARS to get into med school? I think the same will happen. Wanna bet?

No, but that person probably wasn't taking the test under conditions ceteris paribus the second time around. Because that's a rather large change in CARS score. There's a reason why there exists a consensus that CARS score is most difficult to change - and that applies in both directions.

I said that someone with weak reasoning ability would have that exposed in the CARS section even if they do well in the science sections. The reverse doesn't necessarily hold. In other words, just because they do well in CARS doesn't mean that they can reason scientifically, much less have the content knowledge at adequate depth to perform that well in the sciences.

And finally, to answer your question about my own score, I did not score a 528 because although my reasoning abilities were good enough for the exam, my content knowledge in B/BC was only 6/10 at best and my P/S knowledge was 7/10. My C/P knowledge was 9-10/10 because I'm a graduate student in chemistry (the 9 is because I'm not the best at some physics). I had better uses for my time other than going over more content so that I could get a couple extra points at the top end of the scale.
 
I also scored 520+ and agree with most of what aldol has been saying. With practice FLs, my highest score at the beginning of my studying was consistently in CARS. I knew I was on the right track when I saw my other subsection scores rising, and I knew I was ready when my other subsection scores were consistently the same as or even higher than my CARS score. This test is absolutely more of a reasoning test than a content one. You can memorize the Krebs cycle or the glycolysis pathway, enzymes and all, and still not do that well in B/B if you don't have the analytical ability to make sense of the WTF passages, where they throw something completely new at you and you have to extrapolate and piece together the background knowledge that you do know
 
I agree it is a critical reasoning test, but there is a not-insignificant amount of fact recall, and the test's ability to determine one's critical thinking relative to others falls off at the extremes (maybe even past one standard deviation from the mean). E.g., a 526 is not inherently a smarter/better thinker than a 517 from the same or a different test. The whole 'confidence band' thing that the AAMC gives is not a true marker of someone's real confidence band, which may extend from 510-524 or something similar.
 
This thread has reinforced how much more useless the MCAT has become since the changes -- and it wasn't in a good place when I took it.
 
I agree it is a critical reasoning test, but there is a not-insignificant amount of fact recall, and the test's ability to determine one's critical thinking relative to others falls off at the extremes (maybe even past one standard deviation from the mean). E.g., a 526 is not inherently a smarter/better thinker than a 517 from the same or a different test. The whole 'confidence band' thing that the AAMC gives is not a true marker of someone's real confidence band, which may extend from 510-524 or something similar.

That is your opinion and you are certainly entitled to it. But we argue that getting into the 515-520 range can be done by thorough preparation. Breaking into the 520+ range requires analytical reasoning skills developed throughout your life. So in my opinion, someone who got a 526 has better analytical reasoning skills than someone who scored 517. One can parse this apart by looking at CARS as I argued earlier.

You're right - the AAMC's confidence interval says nothing about real differences. But having a real confidence interval that's 510-524 is so large of a difference that it's improbable at best. In fact, that would mean that the test is basically useless at measuring any differences between people who are competitive for medical school. In other words, one would have to argue that Harvard Med students have the same level of analytical reasoning ability as students at the lowest-ranked US med school. Would you bet that?
 
In other words, one would have to argue that Harvard Med students have the same level of analytical reasoning ability as students at the lowest-ranked US med school. Would you bet that?

Yep. You can't measure analytical reasoning entirely by one exam.

Also, if we are parsing scores, WashU and Penn students have better analytical thinking skills than Harvard students because of (statistically) significantly higher MCAT medians. After all, a 521 is better than 518!
 
You're not betting that the MCAT measures analytical reasoning to a great degree of accuracy. You're betting whether Harvard Med students, if they take an exam that does accurately measure analytical ability to arbitrary uncertainty, will score better than the others. If you wouldn't take the bet, then that validates the MCAT as at least a rough measure of analytical ability. If you'd take that bet, then you shouldn't gamble, buddy.
 
The bet has little to do with the overall debate, because you are trying to assert that 520+ group has better analytical skills than 515-520 group. So the analogy would be WashU or Penn students doing better than Harvard and Duke students on the exam, not Harvard students doing better than low-tier MD (and even then, the bet assumes that students at lower tier MD are there because of lower scores and not due to various other reasons).

Sorry to be blunt, but I think the overall argument going on here is fairly pointless (besides a few lessons from statistics that were genuinely useful). Practical considerations start to matter more after a certain point (usually around a school's matriculant median), and anyone who can score above the 80th percentile on one of the most difficult standardized exams shows good analytical reasoning ability. And I think that is what matters most, rather than asserting that someone who scored in the top 1 percent is sharper than someone who scored in the top 3 percent, when results at this extreme depend so much on many factors beyond just analytical ability and innate intelligence.
 
No, the bet was in response to someone claiming that a real person taking the exam may score anywhere between 510 and 524 and this rests on the assumption that there's no way of distinguishing between someone who scores a 510 and a 524. Last I checked, that wasn't the difference between the few percentiles at the top. You were answering a question I did not ask.

I still maintain that it takes excellent analytical ability to score 520+ whereas good preparation can get you in the 515-520 range. That's why there is even a spectrum of analytical ability among medical students at Harvard or any of the other top institutions. Those schools recognize that analytical ability doesn't necessarily correlate with being a good or competent doctor. In fact, I would argue that analytical ability is only a minor determinant of whether an individual becomes a good doctor. It thus follows that people who score 520+ won't necessarily be better doctors than those who score 515-520 or even 510+. But that doesn't mean that the former aren't better at analytical reasoning.

I agree - the argument is pointless (besides the statistics lesson that very few people will ever remember - pre-meds tend to take confidence intervals at face value and don't question them). As an interlude, this is a good summary for those who are interested: http://gradnyc.com/wp-content/uploa...p-4_Statistical-vs-Practical-Significance.pdf. A lot of times, even good scientists get caught up in the "statistical significance" of a result and forget to ask whether it has any practical significance. People who eat red meat might have a 2 +/- 0.1 percent greater chance of getting cancer (made-up statistics) and once the media gets a hold of that, the headline will be "RED MEAT CAUSES CANCER." Nobody really thinks about whether that risk, which is statistically significant, is actually significant in practice. You might take on a 2% risk of cancer just from exposure to cosmic rays from flying.

So back to the point - the argument itself is useless and mainly academic. Anybody who can score above the 80th percentile shows good analytical ability. Sure. But I think it would be a mistake to say that somebody who scores in the 81st percentile and someone who scores in the 95th percentile have similar analytical ability. I would say the second person has better analytical abilities. But it's critical to note that analytical ability has little to no correlation with how good a doctor that person will be. That's why after a certain point, there's "good enough" for medical school. Think about it in terms of height requirements to get on a roller coaster (or certain other *meaningless* lengths). After a certain height, it ceases to matter because you can get on the roller coaster whether you're 5'10" or 6'5". The difference between those heights doesn't matter for the purpose of getting on the roller coaster. But to say that there's no difference between the 5'10" person and the 6'5" person would be plain wrong. Does that height difference matter? No. But that doesn't obscure the fact that there is a height difference.
 
My only response is that earning a 520+ score is multifactorial, and everything plays a major role: long-term analytical ability, innate intelligence, thorough preparedness, confidence on test day, good sleep the night before, controlled stress/anxiety (and using it to your advantage by increasing awareness), good test day environment, and of course, luck and educated guessing. It's just difficult to emphasize any one factor to be the most important while assuming everything else is the same, and this applies for even those who consistently scored 520+ in practice tests.

But thanks for that article and discussion on statistical vs actual significance. That's pretty useful stuff
 
I respect your opinion on the matter. My opinion is just that preparedness plays a factor up to a point (as mentioned above) and the other factors do play a role (confidence, sleep, test anxiety, test day environment). I'm not saying that having just excellent analytical ability is sufficient to get you to 520+. I'm saying that excellent analytical ability is necessary to get you to 520+. That's another long point many pre-meds who do research don't understand but I'll make it short here. Necessary vs. sufficient is something good to know if you ever want to go into academic medicine in the future or perform any sort of research. It's an important difference. If something is necessary, then you can't get from point A to point B without it. If something is sufficient, then just by having it, you can get from point A to point B. There's a difference. For example, it's necessary to take the MCAT to get into medical school. In other words, you can't get into medical school without taking the MCAT (this is just a made-up example so yes, I am aware of BS/MD and various linkage programs). But taking the MCAT is not sufficient to get into medical school. In other words, you're not in just because you took the MCAT. See the difference?

So in this case, it's necessary to have excellent analytical reasoning ability to get 520+ on the MCAT. That's why it's not random or luck that people score 520+. But having excellent analytical reasoning skills is not sufficient to score 520+ on the MCAT. Various other factors do come into play. But note - and this is very important here - that these other factors should be 1) controllable if you are committed to controlling it (e.g. getting a good night's sleep unless you have sleep problems in general) and 2) consistent from one administration to the next (if you have test anxiety in September, it's probably more of an underlying problem and you'll have it in January too).
 
I agree, but to clarify, my previous point regarding 520+ being multifactorial refers to all the factors I listed being necessary to do well. I'm well aware of the differences between necessity vs sufficiency but I'm cautioning against the implied excessive emphasis on long-term analytical ability when all factors matter to various and comparable degrees.
 
I agree with your analysis of the factors with the exception of luck and randomness. For that part, we will just have to disagree. I also believe that analytical ability plays a much larger role than you think. But we will have to disagree on that as well.
 
How would you categorize educated guessing on a difficult question? Is it a subset of analytical reasoning ability?
 
Marilyn vos Savant and Albert Einstein take the MCAT 1000 times in a row. Will they score the exact same score every time?
 
Again, I am not referring to variability of score - I am referring to there being no stochastic process involved in getting a 520+ score as opposed to not getting a 520+ score. Those who get a 520+ score have excellent analytical ability. Score will vary, but as I posit, within a 1 point margin on each subsection. This is due to intangible factors like individual mood during test day or getting thrown off by unforeseeable circumstances, etc. It is not due to getting "easier" questions in areas you're strong in or getting "hard" questions in areas you're weak in. In other words, it's not about "luck" but about personal variations between days which should be better controlled. If the same individual takes the exam multiple times on separate test dates with conditions ceteris paribus (those that he/she can control) and if that individual has the adequate 7/10 depth of knowledge about the topics, then that person will score within 1 point in each subsection each time. The one-point margin is attributed to uncontrollable circumstances (e.g. if the test center decides to blast the A/C at 60 degrees or if the person gets rattled by some unforeseeable circumstance). Ideally, if all conditions remained ceteris paribus, then that person should get the same score, again assuming he/she has adequate knowledge in each topic and reasoning ability.

I've stated my position clearly above multiple times and if you choose to continue to interpret "no randomness in getting 520+" as "no variability about one's score," then I will not respond because you're trying to take away from the thrust of my point, which is that there is no randomness in getting a 520+ as opposed to a 515-520.
 
How would you categorize educated guessing on a difficult question? Is it a subset of analytical reasoning ability?

You answered your own question. Educated guessing implies that the guess is informed, or reasoned. It wouldn't be an educated guess if there was not some reasoning behind the guess. Over all the difficult questions, those with excellent analytical ability will get more correct on average than those with only good analytical ability. Those difficult questions are designed, by nature, to parse out those at the upper end of the distribution.
 
You type a lot, I will try to answer all your points.

1. There's no randomness associated with 520+ MCAT scores. What people perceive as "randomness" is a factor that cannot be prepped for. That is, that factor is how well you've already prepared your analytical reading/analytical science skills from the classes you've taken. On the new MCAT, the number one factor that determines whether you're in the mid-90s percentiles or the 99th+ percentile is whether you can take scientific data and make reasonable inferences from it based on your background knowledge. That sort of analytical reasoning is the most important skill to have and not something that can be easily prepped for.


1. You make a pretty large assumption here that everyone is able to prepare equally for the test. The vast majority of people's prep will differ based on the following factors:
a. Background.
b. Prep Material Used.
c. Time allotted for prep.
d. Motivation.
Due to the very nature of these factors and the variation associated with them, even people with similar "analytic" skills may have different competencies in the subject matter. Here are two examples:
Candidate 1: 126, 131, 130, 132
Candidate 2 (ESL): 130, 126, 131, 131
Both of the above candidates clearly have the "analytical" skills, as evidenced by 130+ scores in 3/4 subsections. Yet they somehow fail to meet your 520 Übermensch, Ayn Rand, pcmasterrace cutoff.


2.Mid-90s percentiles do not overlap with 99+ percentiles in the confidence bands. For example, the confidence band for a 526 is like 522-528. I'm referring to that score bracket as similar. In other words, there is no randomness associated with getting 520+ as opposed to getting <520. The skills associated with getting a 520+ are distinct from the skills required to get, say, 515-520. Now, if you're sitting at 520, the confidence interval will overlap both subsets but that just means that you have some incomplete mix of these two distinct sets of skills.

2. This is factually incorrect. The best kind of incorrect.
519 overlaps with 521.

3.That's the mindset that can get you into the 515-520 range but what puts one into the 520+ range isn't luck. I think your argument there is more self-serving than not. Does not breaking into the 520+ range mean that you weren't lucky? No. Does it mean that you didn't prep enough? No. What puts someone into 520+ isn't the level of prep they did in the months leading up to the exam but rather whether they have mastered the critical reasoning and analysis skills required to put them there in their four years of undergraduate education.

3. See 1. People's level of prep varies even with the skills available. The ESL and poor C/P scores underscore the fact that these folks may be just as capable in the analytic area but may not have prepared every single topic, leaving them open to knowledge deficits. Most people are not robots, and even with the best of intentions and time they may overlook a single topic that shows up. This is where chance plays a role. The second part is guessing on difficult questions. Let's say 10 questions is the difference between 129 and 132. What is the chance that a monkey clicking the mouse will be able to get all 10 correct? How about 1 correct? How about 5 correct? A large number of people take the test, and the chances that someone who reaches the 515 threshold gets enough questions correct by pure luck to reach your magic threshold of 520 are not inconsequential. I would bet that a good chunk of the 520 scorers fall into this group, considering it is a small population to begin with. Does this mean gifted scorers do not exist? No. But does chance play a role for a majority of scorers, including the 520+ crowd? Yes.
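
To put rough numbers on the monkey question, here is a minimal sketch that treats each of the 10 questions as an independent blind guess among four answer choices. Both of those are assumptions (real test-takers rarely guess completely blind), and the function name p_exactly is made up for illustration.

```python
from math import comb

def p_exactly(k, n=10, p=0.25):
    """Binomial probability of exactly k correct out of n blind guesses."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Assumption: 10 independent questions, 4 choices each, pure guessing.
p_all_ten = p_exactly(10)
p_exactly_five = p_exactly(5)
p_at_least_one = 1 - p_exactly(0)

print(f"all 10 correct:     {p_all_ten:.2e}")
print(f"exactly 5 correct:  {p_exactly_five:.4f}")
print(f"at least 1 correct: {p_at_least_one:.4f}")
```

Under these assumptions, getting at least one right is very likely, exactly five is uncommon, and all ten is close to impossible. Eliminating answer choices first raises p, which is easy to try by passing a different value.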

Furthermore, let's say a person has a firm grasp on 7/10 topics being tested on the exam and ends up with passages from the other 3/10. What do you think happens here? The person still possesses the same "analytical abilities". This is further reinforced by cases where people get a 132 in one section yet lag in the others. The "analytical" abilities are still there, yet the score is below 520.

Your definition of randomness and mine are not the same. Like I said to someone else above, I'm referring to there being no luck or magic in getting a score in the 520+ range as opposed to the 515-520 range. 4.I'm not saying that there is no fluctuation about the score itself. You're taking all this to mean that there is no variability in the score itself. There can be variability in the number - small variability. If someone with a 526 retakes the exam, he or she might score a 522 or a 528. That's variability (for a discussion on statistical analysis, see above post). But there's nothing random about that person getting a 522 as opposed to a 516. There's nothing random about getting a 520+ score versus getting a 515-520 score. That's what I'm saying.
4. Please tell me why the variability exists if they are taking the test in ideal situations?


The exam is not perfect. If you miss questions because of topics that are difficult for you, then perhaps spending more time on reviewing those topics would be more beneficial. The idea is to be strong (I would say at least 7/10) on all topics that could show up. At that point, there will not be more than a 1 point difference in subsection score when you take it multiple times, ceteris paribus - in my opinion. If you don't get enough sleep the night before or decide to eat curry, that's on you. If it's an unforeseen medical problem, then yes of course there will be a difference between test dates but that's not my point. If you have nerves on one exam, you'll likely have it on the next, too, so it's not like it'll affect one performance disproportionately. 5. I'm saying that if you go into two test dates under the same conditions (i.e. similar level of sleep, preparedness, nerves, etc.) but take two different versions of the test, there would not be more than a 1 point difference (I'm talking about subsection scores - I'm not sure if that was clear) between your scores.

5. Just by admitting the variation of 1 point per subsection above, you are admitting that a 523 scorer could be a 519, or a 521 scorer could be a 517, thus shattering the magic of 520.

6.Now let me get into the problems with confidence intervals (I don't know why people on here seem to accept everything at face value). Think about how a confidence interval is created. You perform some measurement on some large sample of people. That's the basic premise. The AAMC is using real data - not perfect statistical data. So bias is already introduced into that data - most people who go into that data set aren't at the 7/10 strength on all topics, perhaps they got much less sleep the first time, etc. So of course at the population level, there will be variability based on individual biases. However, that doesn't mean that the confidence interval can apply to everyone. When you try to apply a confidence interval constructed from a population measurement to one individual, you have to assume that the individual's characteristics match those of the population on average. See the problem here? The confidence interval says nothing about somebody who goes into the exam at at least 7/10 strength on all topics, well-rested, and with no freak accidents. For a better and more thorough statistical analysis of the confidence interval, see above posts.

6. You have no proof for the above. Let's say the CI is created by standardized test takers with measured analytical ability. Please provide me with the AAMC methodology of CI creation.
You refute this by saying 1 point variation per subsection.

Analogy incorrect. More like 20 ingredients in drug X but each of those ingredients has an alternative that is supposed to do the same thing (maybe in a slightly different way). Just like how each passage is supposed to measure similar things - i.e. content knowledge, reasoning from the data, reasoning beyond the text, etc. The only case where your analogy would make sense is for the limited discrete questions you get on the exam, which may not all measure the same thing across exams 7.(although I would argue that if one is prepared for all the topics on the exam, then it doesn't matter if one gets an OChem discrete or a biochem discrete).

Also, scores have very much to do with difficulty of the version of exam. Just not in the way you think. Scores are curved based on the difficulty of that version (based on a set curve). But your scores should not be affected by you getting four passages on topics you're "weak" on because you shouldn't be "weak" in any topics. That's on you. You should gain an adequate understanding to do well especially in your weak subjects. As I said above, I'd put that at 7/10 personally.

There is at least a statistical difference between the 99th+ percentile and 95th percentile. From the curve, a 99th+ percentile score corresponds to 523 and up whereas a 95th percentile score corresponds to a 516. Those confidence intervals do not overlap. That's not a matter of opinion.

517 and 521 do according to your variation of 1 point per subsection.

Please see variation in preparation above.

Even Newton lost a fortune in the stock market. Robots exist, but they are rarer than the population obtaining 520 scores and above. Also, who knows what a robot might freak out over before the test, leading to performance issues.

What is a matter of opinion is whether I think there are finer distinctions between individuals who have overlapping confidence intervals. I do.
Please provide a source. Also please inform the AAMC so they can get rid of them!

Like I said, I do not believe that if one goes into the exam and takes it multiple times, conditions ceteris paribus, that one would get more than a 1-point deviation in each subsection. That is the only luck or "randomness" I would allow for, as I said above to another poster. You're interested in fluctuations in score. I'm not. I'm interested in whether there is any randomness with scoring 520+. As in, is a person scoring 520+ actually different from someone in the 515-520 range or is it just "luck"? I'm not arguing that there's no statistical randomness associated with measuring someone's real MCAT score. But there's nothing random that put them there as opposed to the next lowest level.
CI bands overlap; you claimed a variation of 1 point per subsection in measurement.
 
I'm on my phone now so I'll answer some of your comments now and others more thoroughly later.

1. You make a pretty large assumption here that everyone is able to prepare equally for the test. The vast majority of people's prep will differ based on the following factors:
a. Background.
b. Prep Material Used.
c. Time allotted for prep.
d. Motivation.
Due to the very nature of these factors and the variation associated with them, even people with similar "analytic" skills may have different competencies in the subject matter. Here are two examples:
Candidate 1: 126, 131, 130, 132
Candidate 2 (ESL): 130, 126, 131, 131
Both of the above candidates clearly have the "analytical" skills, as evidenced by 130+ scores in 3/4 subsections. Yet they somehow fail to meet your 520 Übermensch, Ayn Rand, pcmasterrace cutoff.

The first candidate may have the analytical skills but may not have prepared at sufficient depth for C/P, whereas the second candidate most likely did what some do: reviewed/memorized a lot of content that gets them by on the science sections with only good analytical reasoning skills. The discrepancy shows on CARS because you can't memorize your way out of that.

The beauty of having a 7/10 level of depth on all topics criterion is that it is independent of time allotted, background, motivation, etc. No matter where each individual begins, he or she should reach that level of adequacy, whatever it takes, if he/she wants a high score.

In response to (2), last I checked, 519 was 98th percentile. I don't think anybody would say that's "mid-90s." Again, if you want to debate on statistical analysis (which is what confidence intervals are), then please read the discussion on that above first. Statistical significance is not the same as practical significance. Pre-meds tend to have difficulty distinguishing the two - I don't blame them. Many of my fellow PhD scientists also have that same problem - especially in the biological disciplines.

In response to 6+, the AAMC does not release data with respect to how it calculates confidence intervals. But use your superior analytical reasoning skills. They created the first ones based on *educated estimates* because no one had taken the test yet. They updated that data at the 1 year mark when enough people had taken the test. That's why the percentiles were updated.

The idea is that one cannot know the "true" analytical ability of test-takers unless one uses other benchmarked standards such as certain "intelligence" tests to measure it first. Therefore, the AAMC relies on real data. The fact that they updated the percentiles is consistent with this.

If you wish to believe in any number that you are fed, you are free to do so. But that is hardly scientific. My point is this: confidence intervals AS A CONSTRUCT (caps because I can't italicize on a phone) cannot speak to real or practical difference. My discussion above on statistical analysis discusses this. Confidence intervals were created in order for scientists to make rough, first-order comparisons. Since then, they have been used without question and in some cases (especially in the medical field), lead to useless recommendations at best and harmful ones at worst.

Oh, and since you find issue with my semi-arbitrary grouping of 515-520 and 520+, do you understand what a confidence interval is? It's the 95% confidence that the true value is within the interval. That percentage is completely arbitrarily chosen. As that percentage goes down, the confidence interval becomes much smaller. So, for example, a 95% confidence that the true score is 519 +/- 2 could very well translate to a 90% chance that the score is 519 +/- 1 or an 80% chance that the score is 519 +/- 0.5. The exact numbers would depend on how the AAMC calculates it but the rule is the same - as the percentage you arbitrarily pick goes down, the confidence interval shrinks. If I say there's an 80% chance you have a tumor, you would be quite worried. So if I said there's an 80% chance that your score was between 518.5 and 519.5, what would your response be?
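
As a rough illustration of how the width changes with the chosen percentage, here is a minimal sketch that assumes the measurement error is normally distributed and that 519 +/- 2 is the 95% interval. The AAMC's actual method is not public, so both are assumptions, and the helper name half_width is hypothetical.

```python
from statistics import NormalDist

def half_width(conf_level, ref_half_width=2.0, ref_level=0.95):
    """Half-width of a normal-theory interval at conf_level,
    scaled from a reference interval (default: +/-2 at 95%)."""
    z = NormalDist().inv_cdf(0.5 + conf_level / 2)
    z_ref = NormalDist().inv_cdf(0.5 + ref_level / 2)
    return ref_half_width * z / z_ref

# Assumption: normally distributed error around a score of 519.
for level in (0.95, 0.90, 0.80):
    print(f"{level:.0%} interval: 519 +/- {half_width(level):.2f}")
```

Under the normal assumption the interval does shrink as the percentage goes down, though more gradually than the round +/- 1 and +/- 0.5 figures used for illustration above.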

3-5 are more involved, so I will answer those within 24 hours once I get back on the computer.
 
Analogy incorrect. More like 20 ingredients in drug X but each of those ingredients has an alternative that is supposed to do the same thing (maybe in a slightly different way). Just like how each passage is supposed to measure similar things - i.e. content knowledge, reasoning from the data, reasoning beyond the text, etc. The only case where your analogy would make sense is for the limited discrete questions you get on the exam, which may not all measure the same thing across exams (although I would argue that if one is prepared for all the topics on the exam, then it doesn't matter if one gets an OChem discrete or a biochem discrete).
Come on, man. You cannot seriously believe this. Supposed to do the same thing? Like methanol and ethanol? You design an experiment like that and your PI will throw you out. How can you design a kinematics passage to "test the same thing" as a glucose metabolism passage? The only way is to make it so easy that a monkey can do it. Thankfully, with all its flaws, the MCAT does not stoop to that level.

Also, scores have very much to do with difficulty of the version of exam. Just not in the way you think. Scores are curved based on the difficulty of that version (based on a set curve). But your scores should not be affected by you getting four passages on topics you're "weak" on because you shouldn't be "weak" in any topics. That's on you. You should gain an adequate understanding to do well especially in your weak subjects. As I said above, I'd put that at 7/10 personally.

And you know this how? Not only is it baseless, it's also impossible because they don't reuse the exams. Difficulty is also subjective.
As for "But your scores should not be affected by you getting four passages on topics you're "weak" on because you shouldn't be "weak" in any topics."
Lmao. Basically, you are saying that if I am lucky I will score really high. And it's not even that hard to get lucky on passages. From the MCAT threads, you can see that the tests are very topical. 7/10? lol what? I don't think we use the same scale here (oh the irony). When I said the AAMC content outline required 3/10 on a topic, I meant that was the level of detail and understanding of a normal cookie-cutter undergrad. I kinda chuckled when you claimed to have 9/10 knowledge in physics/chem because I was thinking that I could open a statics textbook, pick any problem, and you could solve it. Or perhaps you can tell me, in layman's terms, why magnetic force only appears when a charge is moving within a field but not the other way around. Just don't throw big words at me like my physics TA (he was on a PhD track or something) did lolz. I judge the quality of my TAs by asking "simple" questions that I already know the answers to. Bad ones love to throw up a smoke screen 😀

But anyway, how do you determine if a version is hard or easy? Familiarity with a topic makes reading a related passage way faster and thus allows more time to do the critical thinking part. Please don't argue this. Plus, let's do a simple thought experiment. Suppose we have 2 groups of test takers with the same intellectual capacity/test taking strategies. Both are given tests with the same 10 passages - 4 difficult, 3 medium, 3 easy. The only difference is that the first group has the passages in the order easy, medium, difficult while the second has them in the order difficult, medium, easy. Which group do you think will have the higher average?

How do you determine if a question is easy or hard? By the percentage of people who answer it correctly? If so, let's try this:

Question 1: A patient with a history of cardiovascular disease is given Coumadin. After the injection of the drug, this patient, compared to the general population, most likely has increased chance of:
I. Internal bleeding
II. Blockage of blood vessels
III. Blood in urine

A. I
B. II
C. I and III
D. I, II and III

Question 2: A patient with a history of cardiovascular disease is given a blood thinning drug. After the injection of the drug, this patient, compared to the general population, most likely has an increased chance of:
I. Internal bleeding
II. Blockage of blood vessels
III. Blood in urine

A. I
B. II
C. I and III
D. I, II and III
Your answer is wrong.
They are the same question, but I bet that question 1 will have a lower % of correct responses simply because fewer people will know what Coumadin is.

There is at least a statistical difference between the 99th+ percentile and 95th percentile. From the curve, a 99th+ percentile score corresponds to 523 and up whereas a 95th percentile score corresponds to a 516. Those confidence intervals do not overlap. That's not a matter of opinion.

What is a matter of opinion is whether I think there are finer distinctions between individuals who have overlapping confidence intervals. I do.
Since we both don't know what method the AAMC used to determine this interval let's not assume things. But surely you can see some nonsense they put up for public consumption.
Let's start with some facts:
1) In terms of correct answers, it is harder to get from 125 to 126 than from 131 to 132.
2) The confidence intervals of all subsection scores are +/-1 regardless of the numerical range.
So 1+2 = BS. Plus, how the hell did they sum up the four +/-1 intervals to a +/-2 total score?
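
On the +/-1 to +/-2 question, one standard possibility is that the section errors are treated as independent, in which case they combine as a root sum of squares rather than a straight sum. The sketch below only shows that arithmetic; whether the AAMC actually does this is an assumption, since its method is not published.

```python
from math import sqrt

# Assumption: a +/-1 half-width on each of the four sections.
section_half_widths = [1.0, 1.0, 1.0, 1.0]

# Worst case: all four section errors point the same way.
worst_case = sum(section_half_widths)

# Independent errors: half-widths combine as a root sum of squares.
root_sum_square = sqrt(sum(w**2 for w in section_half_widths))

print(worst_case)       # 4.0
print(root_sum_square)  # 2.0
```

So four +/-1 bands can give a +/-2 total if the errors are independent, and a +/-4 total only if they all line up.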




Like I said, I do not believe that if one goes into the exam and takes it multiple times, conditions ceteris paribus, that one would get more than a 1-point deviation in each subsection. That is the only luck or "randomness" I would allow for, as I said above to another poster. You're interested in fluctuations in score. I'm not. I'm interested in whether there is any randomness with scoring 520+. As in, is a person scoring 520+ actually different from someone in the 515-520 range or is it just "luck"? I'm not arguing that there's no statistical randomness associated with measuring someone's real MCAT score. But there's nothing random that put them there as opposed to the next lowest level.
When you said "take it multiple times" you meant taking the same version multiple times? I mean, what? Nobody here is arguing with that angle.
............
Wait a minute! I hope I wasn't arguing with the premise that if one takes the same test - as in the same test with the same questions/order of questions - they would score the same with little randomness?????
?????
Please no!!!


I believe that if one has adequate reasoning abilities, one can go into much less depth and still do very well on the exam. As I said, the way you can usually tell is by any discrepancy between CARS and other subsection scores. If someone needs to go into excessive depth to do well, then that person is relying more on the small details getting them to a high score than their reasoning abilities. That shows because on CARS, there's no detail to know. There's only reasoning.
There's a reason why there exists a consensus that CARS score is most difficult to change - and that applies in both directions.
Okay. Gratz. You have bugged me enough with this CARS thing. Congratz.
Let's go back to your basic claim: that CARS somehow has this magical "hard to improve" quality. Please tell, what reason is that? What do you have to back that up?

Anecdotal evidence is not allowed (waste of time). The only thing we have is the distribution curve of the subsections provided by the AAMC. All of them look the same save for the ~1 lower mean and 0.1 STD of CARS.

Your statistical analysis skill must be mad because I cannot for the life of me figure it out! Was it the 0.1 STD that tipped it off?!

I mean, I don't have such mad skill in statistics, but I suggest that if we look at the CARS scores of humanities majors and, to a lesser extent, of Canadians, they will look the same as the other 3 subsections! Revolutionary idea, I know! Wanna bet on that?

I also have this earth-shattering theory that people who spend more time on one subject will generally score higher... Should I write to the NIH asking for a grant to start this project??

I believe this has more to do with your statistical and mathematical reasoning than me. I ignored it because it didn't make sense and here's why it didn't make sense. The first problem is that if the 50% and 33% chance of getting a question right has a direct relationship with getting a 130 vs. 131 or 131 vs. 132, then one would expect each one of those successive bins to be equal in height to or half the height of the previous bin, respectively - that is, 100-50% not 50-33%. For example, 100 people are stuck between 131 and 132. Imagine that the score hinges on one question and all of those people have a 50% chance of getting that right. Then one would expect 50 to get it right and 50 to get it wrong. So the first 50 would get the 132 whereas the last 50 would get a 131. The height of those bins would be the same. Now imagine the same scenario but with a 33% chance of getting the question right. You would expect about 33 people to get it right and 67 to get it wrong. Therefore, 33 people would get the 132 and 67 would get a 131. The size of the 132 bin would be half that of the 131 bin.
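
The bin arithmetic in that example can be checked with a minimal sketch. It assumes, as the example does, that a single question decides 131 vs. 132 and that everyone in the group has the same chance of getting it right; the function name expected_bins is made up for illustration.

```python
def expected_bins(n_people, p_correct):
    """Expected counts in the 131 and 132 bins if one question decides the score."""
    n_132 = n_people * p_correct
    n_131 = n_people - n_132
    return n_131, n_132

# Assumption: 100 people stuck between 131 and 132, one deciding question.
for p in (1/2, 1/3):
    n_131, n_132 = expected_bins(100, p)
    print(f"p={p:.2f}: 131 bin={n_131:.0f}, 132 bin={n_132:.0f}, ratio={n_132 / n_131:.2f}")
```

A 50% chance gives bins of equal height, and a 33% chance gives a 132 bin about half the height of the 131 bin, matching the numbers above.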

Second, the above even gives you the benefit of the doubt that there's a direct, 1:1 relationship between getting a question right and scoring that extra point on the MCAT. The 50% drop in height of the bins from 130-132 with each successive point in the histogram only tells you that 50% fewer test takers score in that bin. That is, there are fewer and fewer people getting those scores. The 50% chance of getting a question correct after eliminating two (or two-and-a-half) answer choices would correlate directly with the drop in frequency of people getting into those top bins only if scoring on the MCAT did not take into account the curve (by curve, I mean the standardized curve that is already set, lest anybody think I'm referring to a curve that's set the day of). If it turns out that only 5-10% of test-takers get a question right (let's factor in also all the people who cannot rule out an answer choice), then that question won't be given much weight by the AAMC - or, rather, there will be more leeway on that version of the exam. So there's no 1:1 relationship between bin size and each successive point on the MCAT past 130.

Third, how much weight a question is given (reflected in the leeway allowed in a particular version of the exam) is based not on how many high-scorers get it right but rather on how many people who take the exam get it right. So if a high-scorer can eliminate, say, 2 answer choices out of 5, that person now has a 33% chance of guessing correctly. But if a high-scorer only has a 33% chance of getting it right, then what chance does an average test-taker have? 15%? 10%? And what about the other half of people on the other side of the bell curve? 5%? 1%? So a tough discrete on which a high-scorer would have a 33% chance might actually only be answered correctly 5% of the time in the general test-taking population and thus not be accorded much weight by the AAMC. So then your score becomes driven even more by reasoning ability and less by content knowledge.

So no, you actually don't have good numerical bases to lean on.
But they don't differ by one question... 59 is 132. 58-57 131. 56-? is 130. So....can you do the math again?

As for the bold part, idk, man. You can try asking that tough question, in English without a translator, in, say... Fengzu, China. I don't know how smart the people there are, but I suspect that 20% of them will get it right (with 4 answer choices). Give it a try...
No, but that person probably wasn't taking the test under conditions ceteris paribus the second time around. Because that's a rather large change in CARS score.
Before I reply to this, I just want to be sure that you didn't mean that person was taking the exact CARS he took the first time.
I said that someone with weak reasoning ability would have that exposed in the CARS section even if they do well in the science sections. The reverse doesn't necessarily hold. In other words, just because they do well in CARS doesn't mean that they can reason scientifically, much less have the content knowledge at adequate depth to perform that well in the sciences.
Moving the goalposts, but anyway.
And finally, to answer your question about my own score, I did not score a 528 because although my reasoning abilities were good enough for the exam, my content knowledge in B/BC was only 6/10 at best and my P/S knowledge was 7/10. My C/P knowledge was 9-10/10 because I'm a graduate student in chemistry (the 9 is because I'm not the best at some physics). I had better uses for my time other than going over more content so that I could get a couple extra points at the top end of the scale.
So you are saying you didn't deserve that 132 but you got it anyway?! Does that mean that you were ... wait for it....LUCKY?!
 
The wall of text is getting strong here

 
Here's 3-5:

3. See 1. People's level of prep varies even with the skills available. The ESL and poor C/P scores underscore the fact that these folks may be just as capable in the analytic area but may not have prepared every single topic, leaving them open to knowledge deficits. Most people are not robots, and even with the best of intentions and time they may overlook a single topic that shows up. This is where chance plays a role. The second part is guessing on difficult questions. Let's say 10 questions is the difference between 129 and 132. What is the chance that a monkey clicking the mouse will be able to get all 10 correct? How about 1 correct? How about 5 correct? A large number of people take the test, and the chances that someone who reaches the 515 threshold and just by pure luck gets enough questions correct to reach your magic threshold of 520 are not inconsequential. I would bet a good chunk of the 520 scorers fall into this group, considering it is a small population to begin with. Does this mean gifted scorers do not exist? No, but chance plays a role for a majority of scorers, including the 520+ crowd. Yes.

Furthermore, let's say a person has a firm grasp on 7/10 topics being tested on the exam and ends up with passages from the 3/10 section. What do you think happens here? The person still possesses the same "analytical abilities". This is further reinforced where people may end up getting 132 in one section yet lag in the others. The "analytical" abilities are still there, yet the score is below 520.

If a test-taker receives a lower score because he/she has not prepped to a level of, say, 7/10 depth on all topics on the guide, then that is not luck. That is his/her fault. Just don't overlook a topic on the outline. If your only point is that someone may have overlooked a few topics on the MCAT outline because of time constraints, etc., then I do agree with you that chance will play a role in that person's score simply because of the variations in content between exams. But if your test-taker is like mine, who has the requisite 7/10 depth on all topics, then there should be no chance involved in scoring because that person will have the adequate knowledge on all of the topics tested and it will not matter which subset of the topics is actually tested on his or her version of the exam.

Okay, 10 questions is the difference between 129 and 132 in your scenario. Let's say there are five answer choices (that's what it is, right? I don't recall) for each question. And let's say that on average, these test-takers are fairly decent since they're scoring in the 129 range anyway, so they have at least good analytical reasoning abilities so I'll give them the benefit of the doubt and say they can eliminate 2/5 of the answer choices as being blatantly wrong. So they'll guess from the remaining 3 choices. Okay, so what are the chances that they'll get all ten correct? (1/3)^10, or 0.002%, or roughly 2 in 100,000 tries. How many people take the exam? ~85,000 a year now? It looks like roughly 2-3% of test-takers score 520+ (https://aamc-orange.global.ssl.fast...tion_score_percentile_ranks-update_with_n.pdf). That's 1700 to 2500 test-takers. You'd have roughly 1 to 2 people in a year (85,000 x (1/3)^10 is about 1.4) get it all right just by random guessing - for just that section. And they would have to do that in each section, meaning (1/3)^40.

But let's say they're sitting at 129 on all sections and only need an extra point. So in your scenario, let's say they need to get only 3 questions right to get that extra point. They'd need to do that on all sections in order to go from 516 to 520. That's (1/3)^12, or 2 in a million. And that's assuming they can, on average, eliminate 2 answer choices on those tough questions and randomly guess on the rest. So yes, the number of people in the 520+ range who are there because of random chance/luck is inconsequential.
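For anyone who wants to check the guessing arithmetic, here is a rough sketch using the binomial formula. The inputs are the assumptions from this scenario, not AAMC data: a 1/3 chance per question after eliminating two choices, 10 deciding questions per section, and ~85,000 test-takers a year. The function name is just for illustration.

```python
from math import comb

def p_exactly(k, n, p):
    """Binomial probability of exactly k correct guesses out of n questions,
    each guessed independently with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

p = 1 / 3                          # assumed: guessing among 3 remaining choices
all_ten = p_exactly(10, 10, p)     # same as (1/3)**10
print(f"all 10 right in one section: {all_ten:.2e}")
print(f"expected such people among 85,000: {all_ten * 85_000:.2f}")
print(f"3 right in each of 4 sections: {p**12:.2e}")

# The 'how about 1 correct? how about 5 correct?' cases from the scenario:
for k in (1, 5):
    print(f"exactly {k} of 10 right: {p_exactly(k, 10, p):.3f}")
```

Under these assumptions the all-ten case works out to between 1 and 2 people a year for a single section, and the three-per-section case is on the order of 2 in a million. Change p to 1/4 or 1/2 if you think the elimination assumption is too generous or too harsh.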

4. Please tell me why the variability exists if they are taking the test in ideal situations?

5. Just by admitting the variation of 1 point per subsection above, you are admitting that a 523 scorer could be a 519 or a 521 scorer could be a 517, thus shattering the magic of 520.

Minute variations in conditions. Even if I say that I control for all the factors I know, there are many life circumstances that cannot be foreseen and controlled for, down to not being used to the computer at your testing center, having lower energy levels due to variations in diet, etc. Basically things that don't affect the immediate test conditions but do affect one's mental condition. Also, even having seen the test already in a test setting could confer an advantage because one has gone through it once before and so may avoid some of the pitfalls. That one is hard to disentangle from differences in score. But it's hardly random chance that one is taking the MCAT for the second time.

If you talk to MCAT tutors, either on here or in real life, they will tell you it's no coincidence that people who take it once but then turn around and take it immediately again at the next test administration rarely differ in scores by that margin I mentioned. The ones who do score much better are the ones who had something catastrophic happen during their initial administration - computer failure, severe GI problems during the first exam, etc.
 
Come on, man. You cannot seriously believe this. Supposed to do the same thing? Like methanol and ethanol? You design an experiment like that and your PI will throw you out. How can you design a kinematics passage to "test the same thing" as a glucose metabolism passage? The only way is to make it so easy that a monkey can do it. Thankfully, with all its flaws the MCAT does not stoop to that level.

Methanol blinds you and ethanol makes you feel good. And kills you. Kind of different. I think I've had a bit more experience designing experiments than you have, but I'll play along (you're right, my PI did kick me out.... after I defended my dissertation). In science as well as medicine, we generally use multiple approaches to test the same thing because that improves statistical power. So if I want to see the degree of backbonding in an organometallic complex, I might use IR along with UV or Raman so that I can verify that what I'm measuring is actually the thing I want to measure. The MCAT measures things like reasoning within the text, reasoning beyond the text, etc. If you want to see what exactly each question is trying to measure, you can go through the AAMC FLs and read the description of the answer choices. The last sentence will tell you the overarching ability it's trying to measure. Sure, the content of the passage may be different but the passages are designed to measure the same abilities. The reason you have so many passages is because that improves statistical power - kind of like an MMI, actually.

And you know this how? Not only is it baseless, it's also impossible because they don't reuse the exams. Difficulty is also subjective.
As for "But your scores should not be affected by you getting four passages on topics you're "weak" on because you shouldn't be "weak" in any topics."
Lmao. Basically, you are saying that if I am lucky I will score really high. And it's not even that hard to get lucky on passages. From the MCAT threads, you can see that the tests are very topical. 7/10? lol what? I don't think we use the same scale here (oh the irony). When I said the AAMC content outline required 3/10 on a topic, I meant that was the level of detail and understanding of a normal cookie-cutter undergrad. I kinda chuckled when you claimed to have 9/10 knowledge in physics/chem because I was thinking that I can open a statics textbook and pick any problem and you can solve it. Or perhaps you can tell me, in layman's terms, why magnetic force only appears when a charge is moving within a field but not the other way around. Just don't throw big words at me like my physics TA (he was on a PhD track or something) did lolz. I judge the quality of my TAs by asking "simple" questions whose answers I already know. Bad ones love to throw up a smoke screen 😀

Here: https://students-residents.aamc.org/applying-medical-school/article/how-new-mcat-exam-scored/. Relevant quotes:

"The conversion of raw scores to scaled scores compensates for small variations in difficulty between sets of questions. The exact conversion of raw to scaled scores is not constant because different sets of questions are used on different exams. The 15-point scale tends to provide a more stable and accurate assessment of a student's abilities. Two students of equal ability would be expected to get the same scaled score, even though there might be a slight difference between the raw scores each student obtained on the test."

"While there may be small differences in the MCAT exam you took compared to another examinee, the scoring process accounts for these differences."

"How you score on the MCAT exam is not reflective of the particular exam you took—including the time of day, the test date, or the time of year—since any difference in difficulty level is accounted for when calculating your scaled scores (see above for information about scaling)."

Again, there is no luck involved if one is at a 7/10 depth on all topics on the content outline. If one is comfortable with all topics listed, then there shouldn't be any surprise when a particular topic is tested on during test day.

You are free to judge the quality of your TAs however you like. But don't get butthurt if you don't understand the words he/she is using because you don't have a PhD-level understanding of the topic. A good TA will be good at explaining the fundamentals to a layperson, yes. But in some cases, it is difficult to explain a concept to another person when there is a huge knowledge gap between the two. The more knowledgeable person assumes that the other person knows more than he/she does and hence the use of "big words." At the same time, I realize that you might have a better knowledge of topic A in physics than I do because my PhD is in chemistry. I won't pretend otherwise. But in terms of what the MCAT tests on, I felt that I had a 9/10 depth in all of the C/P topics on there.

But anyway, how do you determine if a version is hard or easy? Familiarity with a topic makes reading a related passage way faster and thus allows more time to do the critical thinking part. Please don't argue this. Plus, let's do a simple thought experiment. Suppose we have 2 groups of test takers with the same intellectual capacity/test-taking strategies. Each group is given a test with the same 10 passages - 4 difficult, 3 medium, 3 easy. The only difference is that the first group has the passages ordered easy, medium, difficult while the second has them ordered difficult, medium, easy. Which group do you think will have the higher average?

How do you determine if a question is easy or hard? By the percentage of people who answer it correctly? If so, let's try this:

Question 1: A patient with a history of cardiovascular disease is given Coumadin. After the injection of the drug, this patient, compared to the general population, most likely has increased chance of:
I. Internal bleeding
II. Blockage of blood vessels
III. Blood in urine

A. I
B. II
C. I and III
D. I, II and III

Question 2: A patient with a history of cardiovascular disease is given a blood thinning drug. After the injection of the drug, this patient, compared to the general population, most likely has an increased chance of:
I. Internal bleeding
II. Blockage of blood vessels
III. Blood in urine

A. I
B. II
C. I and III
D. I, II and III
Your answer is wrong.
They are the same question, but I bet that question 1 will have a lower % of correct responses simply because fewer people will know what Coumadin is.

MCAT passages aren't designed to test whether someone has outside knowledge. They have discretes for that. So your question (1) would not appear on the MCAT unless somewhere in the passage it says that Coumadin is a blood-thinning drug. I believe most people who have taken the MCAT would agree with this. Give a better example and I'll have a clearer picture of what you're trying to say.

Since we both don't know what method the AAMC used to determine this interval let's not assume things. But surely you can see some nonsense they put up for public consumption.
Let's start with some facts:
1) In terms of correct answers, it is harder to get from 125-126 than from 131-132.
2) The confidence intervals of all subsection scores are +/-1 regardless of the numerical range.
So 1+2 = BS. Plus, how the hell did they sum up the four +/-1 intervals to a +/-2 interval for the total score?

Total score doesn't have a +/- 2 range. At least not in my experience. The lower range of my confidence interval was 522, not 524. But I believe you just argued that the confidence intervals are BS.

When you said "take it multiple times" you meant taking the same version multiple times? I mean, what? Nobody here is arguing with that angle.
............
Wait a minute! I hope I wasn't arguing with the premise that if one takes the same test - as in the same test with the same questions/order of questions - they would score the same with little randomness?????
?????
Please no!!!

Different version, same conditions. The one-point variation is in there because of the fact that you can't dissociate the impact on your second score of just having been through an MCAT once prior.

Let's go back to your basic claim: that CARS somehow has this magical "hard to improve" quality. Please tell me, what is the reason for that? What do you have to back that up?

Anecdotal evidence is not allowed (waste of time). The only thing we have is the distribution curve of the subsections provided by the AAMC. All of them look the same save for the ~1 lower mean and 0.1 STD of CARS.

Your statistical analysis skills must be mad because I cannot for the life of me figure it out! Was it the 0.1 STD that tipped it off?!

I mean, I don't have such mad skills in statistics, but I suggest that if we look at the CARS scores of humanities majors and, to a lesser extent, of Canadians, they will look the same as the other 3 subsections! Revolutionary idea, I know! Wanna bet on that?

I also have this earth-shattering theory that people who spend more time on one subject will generally score higher... Should I write to the NIH asking for a grant to start this project??

The reason is that CARS measures only one skill: everything encompassed under analytical reasoning. You can't memorize something for the CARS section because there's nothing to memorize.

If you believe that common consensus among people who have already taken it on here is the same thing as anecdotal evidence, then I don't know what else to tell you.

Statistics can tell you nothing about how easy or hard it is to improve on CARS. It can only tell you how confident the people are who measured it that the "real" mean of test-takers is within the interval listed. There is no data on differences in subsection scores for people who have taken the CARS multiple times. So the only thing anybody can rely on is consensus. That's better than nothing, which seems to be what you have.

Why spend more time on one subject if you can spend just as much time on all four subjects? There's no limitation on time here except that which is self-imposed. If you want to spend 4 months on one subject to give yourself a 10/10 depth on all topics in that subject, then you might as well do the same for the other subjects. After a certain point (my opinion = 7/10), there are no returns.

I would try NSF. You'd probably have better luck there. But better stop talking like you're 14. Unless you're like Sheldon Cooper or something.

But they don't differ by one question... 59 is 132. 58-57 131. 56-? is 130. So....can you do the math again?

As for the bold part, idk, man. You can try asking that tough question, in English without a translator, in, say... Fengzu, China. I don't know how smart the people there are, but I suspect that 20% of them will get it right (with 4 answer choices). Give it a try...

You did the illogical analysis to begin with. You do the math again correctly and present an argument with mathematical basis on why your 50% or 20% of questions gotten right by guessing translates directly into the score distribution.

Want to tell me where you got those numbers for 59 = 132, 57-58 = 131...? The AAMC doesn't release that data because each exam is different and is curved according to difficulty of the exam itself. The quotes above from the AAMC directly confirm that. Here, I'll quote it again for ya:

"The exact conversion of raw to scaled scores is not constant because different sets of questions are used on different exams."

Before I reply to this, I just want to be sure that you didn't mean that person was taking the exact CARS he took the first time.

Different CARS, same conditions.

So you are saying you didn't deserve that 132 but you got it anyway?! Does that mean that you were ... wait for it....LUCKY?!

No. I had an adequate 7/10 depth for that section (most of that depth is from reviewing QPacks and reading well-written articles). Like I said, one can waste weeks or months going into 10/10 depth for the MCAT. But after a certain point, there's no improvement in score. That's my opinion. If you want others' as well, ask that question in the MCAT forums. It's been asked before.
 
How can you deny that it's a result of more than 1 factor, or even test-to-test variation? At the higher end of the scoring spectrum, the difference in scores comes down to fewer and fewer questions. Yes, the difference between a 515 and a 520 is probably significant in number of questions. But the difference between a 517 and a 520 is less significant. And 518 and 520, etc.

I'm not denying that variation could be due to more than one factor. I'm saying that there will be no more than +/- 1 difference in each subsection if a test-taker took the MCAT multiple times under conditions ceteris paribus. Larger variations are not due to test variation but rather the different conditions under which the test is taken.

Also, I got a 519. My three friends who have also taken the test got 524, 522, 522. The differences between us = only analytical/reasoning skills? I hardly think that's true.

The 524 took the test very early when the new MCAT was created. His test was so incredibly bio & biochem focused that he said the entire C/P and P/S had less than 2 passages each that focused on anything else. And at least 10 questions that were based on memorizing AAs.
Contrast this to the tests I've heard about this year with tremendously discrete passages on crystallography diffraction physics, benzylic synthesis etc.
On top of that I went into my test on ZERO hours of sleep and a fever.

If you went into the test with zero hours of sleep and a fever, then I would not be surprised if you took the test again immediately at the next test date and scored a 524. This is because you're not taking the test under the same conditions. Same conditions meaning that nothing is different except for the fact that time has moved on and you have done the whole process once before.

I get that a person who got a 526 probably wouldn't get below a 520 on several attempts. But to say that there are harsh cutoffs in analytical ability between 515-520 and then >520 is pretty ridiculous. That's the whole point of the confidence bands that the AAMC gives you.

The confidence bands, again, only tell you that the AAMC is 95% certain that your true score/ability lies within that range. It does not tell you whether you're practically different from somebody with an overlapping confidence band. In other words, the AAMC simply can't tell within 95% certainty that your scores are statistically different. For a brief discussion on actual vs. statistical significance, see http://gradnyc.com/wp-content/uploa...p-4_Statistical-vs-Practical-Significance.pdf.

Now, if one decreases the percentage criterion, confidence bands shrink. So there could be an 80% chance, for example, that your score is within 0.5 points of 519. In that case, there's an 80% chance that you and your buddies' scores are actually different. 80% is an arbitrarily chosen number by me - the actual number is known only to the person who actually calculates the intervals. The point is to illustrate common fallacies in assumptions about confidence intervals.
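To illustrate how the band width depends on the chosen confidence level, here is a rough sketch. It assumes normally distributed measurement error and takes the +/- 1 subsection band mentioned earlier in the thread as a 95% band; both are assumptions for illustration only, since the AAMC's actual method for calculating the bands isn't known here.

```python
from statistics import NormalDist

def band_half_width(level, half_width_95=1.0):
    """Half-width of a two-sided band at confidence `level`, assuming normal
    error and a known 95% half-width (hypothetical input, default +/- 1)."""
    std_normal = NormalDist()
    # Back out the implied standard error from the assumed 95% band.
    se = half_width_95 / std_normal.inv_cdf(0.975)
    # Two-sided band: put (1 - level) / 2 in each tail.
    return std_normal.inv_cdf(0.5 + level / 2) * se

for level in (0.95, 0.80, 0.50):
    print(f"{level:.0%} band: +/- {band_half_width(level):.2f} points")
```

Under these assumptions a +/- 1 band at 95% shrinks to roughly +/- 0.65 at 80%, so the exact figures used above are only illustrative; the direction of the effect is the point.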

In your case, my argument about no variability in score doesn't apply because if you take it again, it will be under different circumstances (i.e. with sleep, no fever). Now, if you took it again with a fever and on no sleep, then I would say that you're going to see little to no variability (+/- 1 point in each subsection from the effect of just having been through the whole process once and feeling the stressors involved).
 