This is actually consistent with my theory. Maybe I didn’t word it the best way, but what you are saying is basically the same thing I was saying. “Experimental questions” are scored for future examinees, not current ones.
Anyway, here’s a possible example of how it could result in score creep. According to your link, the discrimination index ranges from -1 to +1, and a value of 0 means a question cannot discriminate between high and low scorers. Let’s say that out of 80 unscored questions, 20 have discrimination indices below 0.2, which is considered “low.” These questions get tossed out, but they still have some discriminatory value, somewhere between 0 and 0.2 (negative values are rare due to the nature of the exam, which is heavily based on factual knowledge), so taking them away shifts the scores of future examinees. Take, for example, someone who scored a 230 and got 75% of the unscored questions correct, or 60/80. A future examinee of similar ability who takes an exam with the new questions, but with those 20 taken out for poor discrimination, might get 45-47 of the remaining 60 correct, or 75-78%. Scores tend toward slight increases because questions with positive discriminatory value are taken away.
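If it helps, here is a rough Python sketch of the arithmetic in that example. The function name `percent_correct` is just something I made up for illustration, and the last step assumes the examinee would have had the same 60 correct answers on the full 80 questions, which is my assumption and not something from your link.

```python
def percent_correct(correct, total):
    """Return percent correct, rounded to one decimal place."""
    return round(100 * correct / total, 1)

# Figures taken from the example above.
original = percent_correct(60, 80)  # all 80 unscored questions, 60 correct
low = percent_correct(45, 60)       # 20 questions removed, low end of the range
high = percent_correct(47, 60)      # 20 questions removed, high end of the range

# Correct answers implied on the 20 removed questions, ASSUMING the same
# 60 correct answers overall (an assumption made only for this sketch).
implied_removed = [60 - 45, 60 - 47]

print(original, low, high)   # 75.0 75.0 78.3
print(implied_removed)       # [15, 13]
```

In other words, the 45-47 range amounts to the examinee getting 13-15 of the 20 removed questions right, so the percentage either stays the same or rises once they are gone.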