I actually think that the article you quoted ("Scientists rise up against statistical significance") is making the opposite point. The biostatisticians who signed that letter are trying to *discourage* binary categorization into "p>0.05, that's trash" versus "p<0.05, it's good to go." Rosnow and Rosenthal put it as follows:
"... surely, God loves the .06 nearly as much as the .05. Can there be any doubt that God views the strength of evidence for or against the null as a fairly continuous function of the magnitude of p?"
Whether a Phase III trial would succeed or fail really depends on all the other information in the universe that is NOT in this trial itself. If, instead of SABR, it was a Phase II trial evaluating different flavors of jellybeans for curing cancer, and we knew that 100 trials had been kicked off simultaneously and 10 of them had reached P = 0.09, then we would be very reluctant to expect a Phase III trial to be successful. (And in general, we are right to be skeptical. Lots of people try plausible but useless "jellybean" interventions, some of them work out in Phase II and we read headlines, and many of these do fail at Phase III.)
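To put a rough number on the jellybean scenario, here is a minimal sketch. It assumes (my assumptions, not anything from the trial) that none of the 100 interventions work, that the trials are independent, and that "reached P = 0.09" means p ≤ 0.09. Under those assumptions each trial clears the bar with probability 0.09, so the count of trials that do so is binomial. The function name `prob_at_least` is just something I made up for illustration.

```python
from math import comb

def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed exactly."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

n_trials = 100    # jellybean flavors tested at once
threshold = 0.09  # assumed reading: a trial "reaches" this if p <= 0.09

# Under the null, p-values are uniform, so each trial clears the
# threshold with probability equal to the threshold itself.
expected_by_chance = n_trials * threshold
p_ten_or_more = prob_at_least(10, n_trials, threshold)

print(f"Expected trials with p <= {threshold} by chance: {expected_by_chance:.1f}")
print(f"P(10 or more such trials | nothing works): {p_ten_or_more:.2f}")
```

The expected count by chance alone is 9, so seeing 10 is about what we would get if every flavor were useless. That is why the same P = 0.09 tells us very little in that setting.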
If, on the other hand, secondary analyses from other trials are consistent with SABR-COMET, we might be more hopeful. Say we look at STAMPEDE: a couple of tumors in the body, we SABR a few of 'em and see benefits, so maybe that's the same story? To the extent that multiple trials point in the same direction, we might be more hopeful for a positive result.
In a world with no other prior information, if we see P = 0.09 and we repeat the same trial with the same number of patients, I agree that we're unlikely to get P < 0.05. But if we use many more patients, our confidence intervals close in.
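A back-of-the-envelope sketch of that point, under some loud assumptions of mine: the test statistic is approximately normal, the true effect is exactly the one the first trial estimated (optimistic, since a borderline result may overstate the effect), and the standard error shrinks with the square root of the number of patients. The function `replication_power` and its parameters are hypothetical names for illustration, and it ignores the tiny chance of a significant result in the wrong direction.

```python
from statistics import NormalDist

def replication_power(p_observed, sample_size_ratio, alpha=0.05):
    """Approximate chance a repeat trial reaches two-sided p < alpha.

    Assumes the true effect equals the first trial's estimate and that
    the z statistic scales with sqrt(sample_size_ratio).
    """
    std_normal = NormalDist()
    # z-score corresponding to the observed two-sided p-value
    z_observed = std_normal.inv_cdf(1 - p_observed / 2)
    # z-score the repeat trial must exceed to be "significant"
    z_critical = std_normal.inv_cdf(1 - alpha / 2)
    # more patients -> smaller standard error -> larger expected z
    expected_z = z_observed * sample_size_ratio ** 0.5
    return 1 - std_normal.cdf(z_critical - expected_z)

for ratio in (1, 2, 4):
    power = replication_power(0.09, ratio)
    print(f"{ratio}x the patients: P(p < 0.05) is about {power:.2f}")
```

With the same number of patients this comes out to about 0.40, and with four times as many to about 0.92. So a same-size repeat is more likely than not to miss 0.05, but a larger trial of a real effect of that size would usually clear it.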
In summary, I agree with everyone else in this thread and think a phase III trial is warranted.