tl;dr: avg score not rising; exam NOT getting "harder" per se; example of how questions have evolved over past 10-15 years.
-----------------------
The scores are NOT rising: the NBME keeps the 3-digit score constant and claims a 6-point error, so 224-229 is certainly within that range.
That said, they are eliminating older questions from the pool that are no longer performing well. Performance is not just measured by the % who answered correctly, but also by how well that % correlates with overall performance. Put simply: if 20% answer it right, you want those same people to be your top 20% of scorers. If only 20% get it right, but those people have an equal chance of being a 190, 230, or 260 scorer, then that question isn't telling you anything.
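To put rough numbers on that, here is a minimal Python sketch. The cohort and both items are made up, and the point-biserial correlation is just one textbook way to measure how well an item tracks overall score; I'm not claiming it is what the NBME actually computes. The function names `difficulty` and `discrimination` are my own.

```python
from statistics import mean, pstdev

def difficulty(responses):
    """Fraction of examinees who answered the item correctly (1 = right, 0 = wrong)."""
    return mean(responses)

def discrimination(responses, scores):
    """Point-biserial correlation between getting the item right and total score."""
    p = mean(responses)
    if p == 0 or p == 1:
        return 0.0  # everyone answered the same way, so the item separates no one
    right = [s for r, s in zip(responses, scores) if r == 1]
    wrong = [s for r, s in zip(responses, scores) if r == 0]
    return (mean(right) - mean(wrong)) / pstdev(scores) * (p * (1 - p)) ** 0.5

# Made-up cohort: 15 examinees scoring 190, 195, ..., 260.
scores = list(range(190, 261, 5))

# Item A: the 20% who get it right are exactly the top three scorers.
item_a = [1 if s >= 250 else 0 for s in scores]

# Item B: also 20% correct, but spread out: a 190, a 225 and a 260 scorer.
item_b = [1 if s in (190, 225, 260) else 0 for s in scores]

for name, item in (("A", item_a), ("B", item_b)):
    print(name, round(difficulty(item), 2), round(discrimination(item, scores), 2))
```

Both items have the same 20% correct rate, but item A correlates strongly with total score while item B's correlation is zero, which is the "isn't telling you anything" case.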
There are a number of questions that 80-90% of people get right; these are part of how you discern between pass and fail. Then there are many that fall between those and the ones that only the top 15% get correct. Every test gets a relatively similar distribution of questions (of those that actually count, anyhow) to maximize its ability to distinguish between scores, and a little recentering is done to keep the 3-digit score constant over time.
Now, imagine you have a high-performing question that 40-50% of people get right and those are also the top 40-50% of scorers. If this question makes it into FA in such a way that now 80% get it right, it is no longer performing as intended and needs to be revised or thrown out. Those 80% are not the top 80% of scorers, but merely the original 40-50% plus a smattering of whoever else remembered the factoid from FA. Many people in the bottom 20% may be answering it correctly.
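Here is the same idea as a toy before/after comparison in Python. Everything in it is invented for illustration: the 20-person cohort, which lower scorers memorized the fact, and the `upper_lower_gap` helper, which is only a crude stand-in for whatever statistics the exam writers really use.

```python
def fraction_correct(responses):
    """Share of examinees who answered correctly (1 = right, 0 = wrong)."""
    return sum(responses) / len(responses)

def upper_lower_gap(responses):
    """Fraction correct in the top half minus fraction correct in the bottom half.

    Assumes responses are ordered from lowest to highest total score.
    """
    half = len(responses) // 2
    return fraction_correct(responses[half:]) - fraction_correct(responses[:half])

# Made-up cohort of 20 examinees, ordered from lowest to highest total score.
# Before the fact lands in FA: only the top half answers correctly.
before = [0] * 10 + [1] * 10

# After: the same top half, plus six lower scorers who memorized the factoid.
memorized = {0, 1, 3, 5, 6, 8}  # positions in the bottom half, chosen arbitrarily
after = [1 if (i >= 10 or i in memorized) else 0 for i in range(20)]

# How many of the 4 lowest scorers (the bottom 20%) now get it right.
bottom_fifth_correct = sum(after[:4])

print(fraction_correct(before), upper_lower_gap(before))
print(fraction_correct(after), upper_lower_gap(after))
print(bottom_fifth_correct)
```

In this toy cohort the correct rate goes from 50% to 80%, the gap between the top and bottom halves shrinks from 1.0 to roughly 0.4, and three of the four lowest scorers now answer correctly. The question got "easier" without getting any better at sorting people.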
Every year, we can expect to see more and more familiar questions get replaced or revised to maintain the performance of the exam. It may appear the exam is getting more difficult, but in truth it's simply adjusting to the new reality of ubiquitous prep materials.
Here's an example:
Mid/Late 1990s question:
"A 43 year old African-American Female presents with weakness and persistent cough for the past 5 months. She denies hemoptysis and her PPD shows no induration at 48 hours. Her EKG shows normal sinus rhythm and shows no pathological Q waves. CXR shows bilateral lymphadenopathy and a biopsy reveals noncaseating granulomas. Which of the following values is most likely to be elevated in this patient?"
a) Bradykinin
b) Angiotensin I
c) Calcium
d) Potassium
e) Magnesium
So, in response to this, FA includes a section on Sarcoid reminding everyone of the "bilateral lymphadenopathy" and "noncaseating granulomas" buzzwords as well as increased calcium. Over 2-3 more years, this question stops performing, so now it gets revised:
2003 question:
"A 43 year old African-American Female presents with weakness and persistent cough for the past 5 months. She denies hemoptysis and her PPD shows no induration at 48 hours. Her EKG shows normal sinus rhythm and shows no pathological Q waves. CXR shows lymphadenopathy and a biopsy reveals granulomas. Which of the following is most likely to be elevated in this patient?"
a) Potassium
b) Angiotensin converting enzyme
c) Magnesium
d) Sodium
e) Bradykinin
Now the next FA amends its Sarcoid section to include that examiners love to ask about elevated ACE levels in Sarcoid, so once the question stops performing well they change it again:
2008 question:
"A 43 year old African-American Female presents with weakness and persistent cough for the past 5 months. She denies hemoptysis and her PPD shows no induration at 48 hours. Her EKG shows normal sinus rhythm and shows no pathological Q waves. CXR shows adenopathy and a biopsy with immunostaining reveals clumps of activated macrophages. Which of the following is most likely to be elevated in this patient?"
a) Angiotensin I
b) Serum calcium carbonate
c) Potassium
d) 1-alpha hydroxylase
e) IFN-alpha
Here again, FA adds that hypercalcemia in Sarcoid is due to TH1 activation of macrophages via IFN-gamma --> increased 1-alpha hydroxylase expression and ultimately more active vitamin D.
2013 question:
"A 43 year old African-American Female presents with weakness and persistent cough for the past 5 months. She denies hemoptysis and her PPD shows no induration at 48 hours. Her EKG shows normal sinus rhythm and shows no pathological Q waves. CXR shows possible adenopathy and a biopsy with immunostaining shows groups of cells positive for CD14. Which of the following is most likely to be DECREASED in this patient?"
a) Angiotensin II
b) TGF-beta
c) Total Serum Calcium
d) Free Serum Calcium
e) 25-OH Cholecalciferol
Now we're still talking about the same concepts, but you have fewer buzzwords in the stem and must understand the pathway of Vitamin D activation, namely that the increased 1-alpha hydroxylase will most likely convert more of your 25-OH Vit D into the active 1,25-(OH)2 form and thus deplete the former (choice E). The question isn't really "harder," but is rewritten to make sure you have to think and can't just respond with straight recall from FA.