I'm not sure whether by "correlate" you mean score or similarity. Assuming you mean similarity, I thought the released questions were definitely easier than those on the real exam. Also, most of the basic science disciplines are sufficiently represented on the released items test, but on the real Step 1, the distribution of questions across disciplines goes any way it wants. To put it another way, the 150-question test adhered to the discipline distribution put forth by the NBME (1-5% embryo, 1-5% anatomy, etc.), but the real thing can give you some wild distributions. Some people claim not to have had a single embryology question; some claim to have had very little physiology; some claim to have had much, much more than 1-5% anatomy; etc.
I guess you could also say that by this reasoning, correlating scores is not very useful (although I'm sure the point of your post was about similarity).
And yeah, they're all retired, so there's no chance of seeing them again verbatim, but you could always see similar questions. If you want verbatim repeats, pay attention to your shelf exams and the NBME Assessments.