From the NBPME 2016 Fall Report: "...the questions selected for the practice exams are those that have previously been used in actual examinations and had performed well statistically in distinguishing between better and less qualified candidates. These questions will now only be used for the practice exams."
As for standardization and passing the exam: like most licensing and certifying exams, the questions are standardized by a modified Angoff standard-setting method. That's written right in the test bulletins as well as in the Audit Panel Report done on NBPME. So if someone is trying to explain the scoring of the exam to you and can't explain the modified Angoff method, I wouldn't take their advice. I'll try to simplify this explanation as much as possible, though it's a bit more complex than how I'll present it. Also, what I'm about to explain only applies to the questions that count towards the pass/fail decision, which are the vast majority of them. A small percentage are unscored items being field-tested for future use. This throws some people off, making them think their exam didn't match the percentages posted in the bulletin; Prometric guarantees that the scored portion does match the percentage breakdown given in the bulletin.
A group of practicing podiatrists, under the guidance of Prometric, previews every single test question to determine the likelihood that a "minimally competent podiatrist" would answer it correctly. That's an important distinction: the judgment is based on whether a practicing podiatrist could answer correctly, not a podiatry student. It's also important to note that these podiatrists must reach a general agreement on how a "minimally competent podiatrist" would perform on a question; if they disagree, they discuss the question with each other and re-rate it until they converge, and then their ratings are averaged. Through this process the questions are standardized, which in turn standardizes the exams.
So let's say you get a 5-question exam. The podiatrists agreed that a "minimally competent podiatrist" would have a 40% chance of answering question 1 correctly, a 50% chance on question 2, a 60% chance on question 3, a 70% chance on question 4, and an 80% chance on question 5. Averaging these percentages gives the "cut score", which is the pass mark for this particular exam: here, a "minimally competent podiatrist" should score 60%, so 60% is the passing score. Whether you answer questions 1, 2, and 3 correctly, or 1, 3, and 5, or any other combination doesn't really matter, because each question is weighted the same in the actual scoring of the exam; as long as you can score a 60%, you've scored as well as a "minimally competent podiatrist" would have. The cut score could just as well end up being 70% or 80%. It's different for every version of the test, but that doesn't matter. What this standardization method is great for is scaling the cut score to the difficulty of the questions. If the questions are more difficult, the expected percentages will all be lower, and you can pass that difficult exam by answering fewer questions correctly. If the questions are much easier, you'll be expected to answer many more questions correctly to pass that exam.
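The 5-question example above boils down to one average. Here's a minimal sketch of that calculation, using the made-up ratings from the example (not real NBPME data):

```python
# Hypothetical modified Angoff cut-score calculation for the
# 5-question example above. Each value is the judges' agreed estimate
# of the probability that a "minimally competent podiatrist" answers
# that question correctly.
angoff_ratings = [0.40, 0.50, 0.60, 0.70, 0.80]

# The cut score is essentially the mean of the per-question estimates.
cut_score = sum(angoff_ratings) / len(angoff_ratings)

print(f"Cut score: {cut_score:.0%}")  # Cut score: 60%
```

Note how a harder exam (lower ratings across the board) automatically produces a lower cut score, which is the whole point of the method.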
The last part of all this is a standardization of the final score so that different versions of the exam can be directly compared. Raw percentages are converted to a scale that runs from 55 (lowest reported score) through 75 (passing score) and beyond. So if you take a difficult exam with a cut score of 60%, that 60% becomes a 75 on the final scale. If your friend takes a much easier exam with a cut score of 80%, that 80% also becomes a 75. When people say you need a "75" to pass the exam, this is where the 75 comes from; they're just incorrect in thinking that the 75 represents 75%. The 75 can simply be read as "pass", or "whatever raw percentage equals passing on this specific version of the exam".
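To make the scaling concrete, here's a sketch of one way such a conversion could work. The key property from the text is that the cut score always maps to 75; the piecewise-linear shape and the `scale_score` function itself are my illustrative assumptions, since the actual NBPME/Prometric conversion formula isn't published:

```python
def scale_score(raw_pct: float, cut_pct: float) -> float:
    """Map a raw percentage onto the reported 55-to-100 scale so that
    the cut score always lands on 75, regardless of exam difficulty.

    This piecewise-linear mapping is an illustrative assumption, not
    the actual NBPME/Prometric formula.
    """
    floor, pass_mark, ceiling = 55.0, 75.0, 100.0
    if raw_pct <= cut_pct:
        # Map raw scores in [0, cut] onto [55, 75].
        return floor + (raw_pct / cut_pct) * (pass_mark - floor)
    # Map raw scores in [cut, 100] onto [75, 100].
    return pass_mark + (raw_pct - cut_pct) / (100.0 - cut_pct) * (ceiling - pass_mark)

# A 60% raw score on a hard exam (cut = 60%) and an 80% raw score on
# an easier exam (cut = 80%) both convert to the same passing 75.
print(scale_score(60, 60))  # 75.0
print(scale_score(80, 80))  # 75.0
```

This is why two candidates with very different raw percentages can both report the same "75": the scale expresses performance relative to the cut score, not an absolute percentage.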
Finally, after the exam, items that performed statistically worse than expected are reviewed, and if an item is found to be flawed, everybody is given credit for that question.
If any part of that is still confusing let me know and I can try to explain it better.