Getting 1 more question right at the 260 level has quite a bit more impact on your score than getting 1 more question right at 240, so I think that is probably the basis of people thinking high end scores are more volatile.
My gut intuition is that super high scores (250s-270s) are more luck based than, say, a solid high range score (230s-250s). Sure, you prepare all you can, but when it comes down to it, if that handful of killer questions that only 20% of people get right are in subjects that you're strong in, then it defines your score. You might get an exam that only has GI pharm, which is your strong suit, with no neuro pharm that you weren't always sure about. You might get a test where all of the behavioral science questions are straight psych diagnoses and ego defense mechanisms instead of slightly more nebulous "what would you say next" quotes that for me at least are harder to be certain about.
I definitely agree with these statements, although I'd modify the former range you've given to say 265-275+ rather than 250-270. The most disconcerting aspect of the Step is exactly as you've mentioned: luck does play a huge role. It's very likely that each exam has a set number of easy, medium, hard and "killer" questions, but that the subject areas tested are chosen at random, which explains why some people can come out saying they had absolutely no micro or anatomy, for example, while others can come out literally saying it was their whole exam.
In terms of my experience with Rx so far, ~3 Qs per 48-question block fall into the very tricky category, with usually one or two of those falling into the abstruse/absurdity category (meaning: "there's no way I would have gotten that even if I had studied another year."). For example, I had one asking about which carbon # on a particular medication's molecular structure would be the best position to add a nucleophilic inhibitory group. Now I mean, come on. If I hadn't been an organic chemistry minor in college, I wouldn't have had a clue, but I knew I got lucky since only 6% got it right. I remember saying, "why couldn't this have come up on the real exam... because obviously I couldn't get that lucky again." That's why we all pray for our strong suits to confront us. If I were to have an anatomy-heavy exam, for instance, I know for a fact I'd walk out in tears (and probably very humbled to say the least).
I therefore believe the purpose of doing thousands and thousands of questions is to, at the minimum, effectively shut the doors on all of the questions that >20% get right. Chances are, if more than one in five people can answer a question, the information can't be that ludicrous, and it most likely just takes study and time to get it down.
However, the occasional Qs that <10-15% get right are the true pivotal ones because, unlike the ~20% Qs, where the # answering correctly = the probability of a mere guess (presuming a five-choice question), these suggest that a "swinger" is involved (that is, a stimulus or piece of information that intentionally pushes people away from the correct answer). I've had consecutive blocks as low as 80% and others somehow at 96%, which has made me realize that the real exam is not "labile like the air, stable like the ground." It's labile like the air, period.
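For anyone who wants to see the arithmetic behind that, here's a rough sketch. It assumes a five-choice question (so blind guessing lands at 20%) and treats a 48-question block as independent questions answered at a fixed "true" accuracy. That binomial model, the function names, and the 88% figure (just the midpoint of my 80% and 96% blocks) are my own assumptions for illustration, not anything Rx or the real exam publishes.

```python
import math

def chance_rate(num_choices=5):
    """Fraction expected to answer correctly by pure guessing."""
    return 1.0 / num_choices

def below_chance(pct_correct, num_choices=5):
    """True when fewer people got it right than blind guessing would predict,
    which hints at a distractor pulling people off the right answer."""
    return pct_correct < chance_rate(num_choices)

def block_score_sd(true_accuracy, num_questions=48):
    """Standard deviation of a block percentage under a simple binomial model
    (assumes independent questions, which is only an approximation)."""
    return math.sqrt(true_accuracy * (1 - true_accuracy) / num_questions)

print(chance_rate())                    # guessing baseline for five choices
print(below_chance(0.06))               # the 6% question sits below chance
print(round(block_score_sd(0.88), 3))   # spread of one block at 88% accuracy
```

Under those assumptions a single block swings by roughly 4.7 percentage points in either direction from chance alone, so an 80% block and a 96% block from the same person aren't all that surprising.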