I guess I don't understand why it doesn't matter when I'm trying to compare data with known figures to get the proper proportion.
Ultimately, this was out of curiosity. I'm sorry if you don't like the way I did it, but why would I bother with more detail than I have when, as is very clear, the medium of this survey already invalidates it completely?
Also, I'm not sure I agree with everyone saying that this is an improper survey because it doesn't take all MCAT scores (though, as mentioned above, I acknowledge that it is improper for other reasons). Isn't that a lot like saying a survey of African American women is invalid if you don't also survey women of all other races? Or that surveying people who make over 100k is only valid if you also survey people who make less? I was only interested in comparing scores within the top 10-15% against each other. The result only tells me that those in the top 5% are overabundant relative to the rest of the top 10-15%. It obviously doesn't tell me whether the top 15% as a whole is overabundant.
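To make that comparison concrete, here's a rough sketch of the arithmetic I have in mind. The function name and the counts are made up purely for illustration (they are not the poll's actual results), and I'm assuming the surveyed pool is the top 15% and the band of interest is the top 5%. If responders were spread the way the known percentile figures predict, the top 5% should make up about a third of a top-15% pool, so a ratio above 1 means they are overrepresented within that pool.

```python
def relative_representation(top_count, total_count, top_pct, pool_pct):
    """Observed share of the top band within the pool, divided by its expected share.

    top_count   -- respondents in the top band (e.g. top 5%)
    total_count -- all respondents in the surveyed pool (e.g. top 15%)
    top_pct     -- size of the top band as a percentage of all test takers
    pool_pct    -- size of the surveyed pool as a percentage of all test takers
    """
    if total_count <= 0 or not 0 < top_pct <= pool_pct:
        raise ValueError("need total_count > 0 and 0 < top_pct <= pool_pct")
    observed_share = top_count / total_count
    expected_share = top_pct / pool_pct  # from the known percentile figures
    return observed_share / expected_share


# Hypothetical counts, NOT the real poll numbers:
# 60 respondents in the top 15%, 30 of them in the top 5%.
ratio = relative_representation(top_count=30, total_count=60, top_pct=5, pool_pct=15)
print(f"{ratio:.2f}")  # 1.50, i.e. 1.5x the expected share within the pool
```

Note that this only compares groups inside the pool. It says nothing about whether the pool itself is overrepresented, since that would need a count of everyone below the cutoff too.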
I also wanted to see just how many of the top scorers are on SDN, but the numbers are still far too low to give an accurate picture of that.