Do I Need to Know Stats to Get Involved in Med School Research?


FootballFoot (Full Member; joined May 27, 2016)
I'm starting as an MS1 in the summer and I really want to get involved in research during my time in school.

The problem is, I have never taken a stats course and I have absolutely no knowledge of any stats. I hear people mentioning things like R, SPSS, or whatever. Would I even be able to learn the software stuff if I don't have a background in statistics?

Should I try to learn this stuff over the summer in order for a doctor/PI to accept me into a research project? I really don't want to take a formal class on it. If it is recommended, can someone recommend a good online crash course for the basics?
 
When you meet with potential research mentors, be very up front about your lack of experience. I don't know what school you will be at, but most docs are very understanding and will explain the type of analysis your project needs and how and why you're doing that analysis. It never hurts to research it and learn more about it, but if you're honest they will teach you. Or in my opinion, they should.
 
In general, I agree with the post by @mw18. My school/AMC employs a group of dedicated statisticians, so I wouldn't expect a med student to be a stats expert, for that reason among others. All of us work together, even though many academic MD/PhDs are well trained in stats, too.

If you have the time (and will enjoy the stats knowledge), you can learn stats in your spare time (e.g., look at R courses online).
 
You'll probably take a stats crash course some time in M1 too. And you'll learn to despise it with a burning passion.
And yet, there is nothing more important to keep you from being hornswoggled.
 
How would you define "well trained"? Many MD/PhDs I have come across have taken a couple of ancillary stats courses that more or less taught button clicking in SPSS, but they very rarely learn statistics well enough to teach most of the methods they use, let alone to withstand inspection by someone with a degree in statistics.

For the OP, I think you'll do yourself a favor by trying to learn some statistics, both theory and application. The reality is that most people in medicine don't have a formal background in statistics, including those with MD/PhD or MD/MPH type credentials, and merely publishing a lot of papers doesn't mean that someone knows the stats well. From years of work, they can usually direct you to some common tests, but it's not uncommon for them to suggest inappropriate tests or procedures or to carry out an appropriate test incorrectly. The best thing you can do for yourself is find reliable sources from someone with an adequate statistics background (an MS/PhD in statistics or biostatistics, with very few exceptions). Coursera has some decent material, and you can vet the professor's credentials yourself by checking out their CV. If you're going to buy or use a statistics textbook, you'll probably want to do the same and vet the author's degree. Paul Allison is an exception to the general rule of needing a statistics degree: he was named a fellow of the American Statistical Association, and he does a pretty good job of giving an overview of applied logistic regression and survival analysis.
 
I actually wanted to learn stats. So, between my M1 and M2 year I did a project where I was able to sort of learn it as I went, working with a mentor who really helped me learn it. Most of my friends, however, didn't do the stats for their projects as there was a statistician on the project team.
 
What you will be expected to contribute ultimately depends on whether you go into bench, translational/clinical, or epi/public health research. Many groups have statisticians they work with, in which case not every member needs to know how to do the stats.

In my experience, when a student gets involved on a shorter timeframe, the PI and student will often design a side project nested within the PI's main work, where the student gets a chance to do data collection and analysis and to write up the first draft of the manuscript. In this case the student will be expected to do the (typically basic) stats themselves or to learn quickly.
 
I agree that many quick projects students get involved with carry some expectation that the student will handle the analysis. However, I would caution against assuming that any analysis is simple, even if it looks simple. Most people I have heard from think that clicking buttons and deciding that a result is significant is sufficient knowledge, but that doesn't even scratch the surface of doing it the right way. Many students don't even realize that picking an alpha level is situational and that it shouldn't be the same for every project (for example, always selecting .05). I can show you what looks like a simple problem for which you decide on a t-test, and I can also tell you that at least 90% of students won't think to check the assumptions the test needs, won't know how to remedy any issues they find, and won't know how to decide whether a violated assumption actually matters. Of the students who might consider checking assumptions, most probably won't realize that "just using a test of normality" isn't the right way to decide whether a nonparametric (NP) test is appropriate, nor will they know which NP test is appropriate or what it means if one actually is indicated.

This is why I advocate arming yourself with as much knowledge as possible: many PIs aren't really any better a reference for this stuff unless they have a degree in biostatistics or statistics. Stats books and vetted online resources are your friends.
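
To make the assumption-checking point concrete, here is a minimal sketch (not the poster's own code) of one assumption behind the ordinary two-sample t-test: equal variances in the two groups. The data are made up, the function names are hypothetical, and only the Python standard library is used. It compares the pooled (Student) statistic with the Welch statistic, which drops the equal-variance assumption.

```python
from math import sqrt
from statistics import mean, variance


def student_t(a, b):
    """Pooled-variance (Student) t statistic; assumes equal population variances."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


def welch_t(a, b):
    """Welch t statistic and its Welch-Satterthwaite degrees of freedom."""
    na, nb = len(a), len(b)
    va, vb = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Made-up data: a small, very noisy group and a larger, tight group.
group_a = [12.0, 30.0, 5.0, 41.0, 22.0]
group_b = [10.0, 11.0, 9.0, 10.5, 9.5, 10.0, 11.5, 9.0, 10.0, 10.5, 10.0, 9.5]

variance_ratio = variance(group_a) / variance(group_b)
t_student = student_t(group_a, group_b)
t_welch, df_welch = welch_t(group_a, group_b)

print(f"sample variance ratio: {variance_ratio:.1f}")
print(f"Student t = {t_student:.2f} on {len(group_a) + len(group_b) - 2} df")
print(f"Welch   t = {t_welch:.2f} on {df_welch:.1f} df")
```

With these made-up numbers the pooled statistic comes out noticeably larger than the Welch statistic, and on far more degrees of freedom, so the two versions of "a t-test" can point toward different conclusions on the same data. Looking at the spreads and a plot before choosing is the kind of check the post is describing; this sketch covers only the variance assumption, not independence or distribution shape.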
 
Point well taken, but I said "basic," not "simple." As in univariate or multivariate regression, t-tests, hazard ratios, perhaps propensity score matching. Anything beyond that, such as Bayesian analysis, will likely only come from someone with considerable prior training or a student who finds an interesting article and tries to replicate their methods.
 
Can anyone recommend good resources (books, videos, websites) for me to get a grasp on clinical research / the statistics I will need to know to perform it?
 
I'm using the two words interchangeably, but either way, the point doesn't change. Most of those are bread-and-butter methods, but they require a lot of knowledge to use properly (anyone who took a stats class and has SPSS can click buttons, but that doesn't mean they're doing the right thing or actually know what it means). I'm also going to assume that by "multivariate regression" you meant "multivariable regression." The former has multiple dependent/outcome variables analyzed simultaneously, while the latter has multiple independent variables with one dependent/outcome variable. Mixing up the two is another common mistake I've seen students and PIs make, and it shows up in the research literature too, but there is clearly a difference between multivariable and multivariate. Propensity score matching isn't really considered a basic technique and can be pretty tricky. As with everything, it has its pitfalls, and it is far from a magic wand to fix non-randomized group allocation. Bayesian statistics isn't necessarily less "basic" than the methods you're suggesting, but the most controversial and often tricky part is specifying priors, which is fairly subjective.

Still, coming back to something like a regression: I know from personal experience, and from seeing how people without stats backgrounds publish papers, that they mainly think their work is done once they've coded the data and gotten some p-values and model-based statistics. It's not too tricky to read the statistical methods portion of a paper and discern who knows what they're doing, and more often than not, when the stats are done well, the person in charge of them had an education in biostats/stats. Outside of the papers with statisticians, few mention or are even aware of proper model building, assumption checking, model validation, or model diagnostics such as influential observations and outliers (which isn't as simple as locating the outlier and deleting it...). They fail to apply or misapply procedures for a given situation, treat each analysis as a cookbook recipe that varies little from the last study, and the list goes on. These are things that would be readily apparent to someone with a decent background in stats.
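
One way to keep the two terms straight is to write the models out. This is a sketch in generic textbook notation; the symbols are illustrative assumptions, not taken from either post.

```latex
% Multivariable regression: ONE outcome y, several predictors x_1, ..., x_p
y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_p x_{ip} + \varepsilon_i

% Multivariate regression: q outcomes modeled jointly from the same predictors
\mathbf{Y}_{n \times q} = \mathbf{X}_{n \times (p+1)} \, \mathbf{B}_{(p+1) \times q} + \mathbf{E}_{n \times q}
```

In the first model the outcome is a single column; in the second, the outcome is a matrix with one column per outcome variable, and the errors in each row of E are allowed to be correlated across outcomes.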

So I come back to my advice before: the best thing someone can do is to learn from vetted resources written or delivered by statisticians and biostatisticians. You can learn to generate and loosely interpret a t-test in under an hour, but to actually understand what, when, how, where, and why (all important, as well as alternatives or supplemental analysis), you need to spend a lot of time working with both neat and dirty data. Once you do this, you'll start to realize that misapplying a test or failing to verify assumptions can get you dramatically different and inappropriate conclusions. You'll notice that certain things will give you equivalent answers, some with more useful or more convenient information, such as the case of an independent t-test with two groups (special case of an ANOVA), an ANOVA with the same two groups, and a simple linear regression using a dummy variable for the same 2 groups (people are often surprised at that). Someone who does this will start to realize that always using Fisher's exact test can result in an unnecessary loss of power while always using Pearson's chi-square for independence can run into issues when expected cell counts start to dip below 5 (and will also realize when it's not material that some of the expected counts are less than 5 while in other cases a single expected count of less than 5 is a problem).
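
The equivalence mentioned above (two-group independent t-test, one-way ANOVA on the same two groups, and simple linear regression on a 0/1 dummy variable) can be checked by hand. The following is a minimal sketch with made-up data and hypothetical function names, using only the Python standard library rather than any stats package.

```python
from math import isclose, sqrt
from statistics import mean

# Made-up measurements for two independent groups.
group_a = [5.1, 4.8, 6.0, 5.5, 4.9, 5.7]
group_b = [6.2, 6.8, 5.9, 7.1, 6.5]


def pooled_t(a, b):
    """Independent two-sample t statistic with pooled variance (b minus a)."""
    na, nb = len(a), len(b)
    mean_a, mean_b = mean(a), mean(b)
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((x - mean_b) ** 2 for x in b)
    pooled_var = (ss_a + ss_b) / (na + nb - 2)
    return (mean_b - mean_a) / sqrt(pooled_var * (1 / na + 1 / nb))


def anova_f(groups):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def dummy_regression(a, b):
    """Least-squares fit of y on a 0/1 group dummy; returns (slope, t for slope)."""
    x = [0.0] * len(a) + [1.0] * len(b)
    y = list(a) + list(b)
    n = len(y)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    se_slope = sqrt(rss / (n - 2) / sxx)
    return slope, slope / se_slope


t_stat = pooled_t(group_a, group_b)
f_stat = anova_f([group_a, group_b])
slope, reg_t = dummy_regression(group_a, group_b)

print(f"t = {t_stat:.4f}, t squared = {t_stat ** 2:.4f}, ANOVA F = {f_stat:.4f}")
print(f"regression slope = {slope:.4f}, slope t = {reg_t:.4f}")
print("t squared equals F:", isclose(t_stat ** 2, f_stat, rel_tol=1e-9))
print("regression t equals t-test t:", isclose(reg_t, t_stat, rel_tol=1e-9))
```

The regression slope is just the difference in group means, its t statistic matches the pooled t-test, and squaring that t gives the ANOVA F. The regression form is often the more convenient one because it extends directly to adding covariates.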

It's pretty similar to medicine in that you don't know what you don't know, and what you don't know can really cause problems.
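
For the expected-count point about Pearson's chi-square raised in this post, here is a minimal sketch of where those expected counts come from. The 2x2 table is hypothetical, the function names are made up for illustration, and the threshold of 5 is the rule of thumb quoted in the post, not a hard cutoff.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def pearson_chi_square(table):
    """Pearson chi-square statistic, without continuity correction."""
    expected = expected_counts(table)
    return sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )


def small_expected_cells(table, threshold=5.0):
    """List (row, column, expected count) for cells below the threshold."""
    return [
        (i, j, exp)
        for i, row in enumerate(expected_counts(table))
        for j, exp in enumerate(row)
        if exp < threshold
    ]


# Hypothetical 2x2 table: rows = exposure (yes/no), columns = outcome (yes/no).
observed = [[2, 10],
            [8, 30]]

expected = expected_counts(observed)
flagged = small_expected_cells(observed)
chi_square = pearson_chi_square(observed)

for row in expected:
    print([round(e, 2) for e in row])
print("cells with expected count below 5:", flagged)
print(f"Pearson chi-square statistic: {chi_square:.4f}")
```

The rule of thumb applies to the expected counts, not the observed ones, which is a common point of confusion. The sketch only flags the cells; whether a flagged cell actually matters, and whether Fisher's exact test is the better choice, is the judgment call the post says takes practice.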
 
If you want, we can discuss some things over PM to avoid clogging up the thread. I can send you a couple PDFs of some articles that I think would be helpful in addition to other resources, but advice will largely depend on some factors surrounding you.
 
I would be interested in some useful resources as well, if you could direct me to them.
 
Sure. Send me a PM so we can hammer out some details specific to you and your goals, and I'd be happy to share what I've heard of or personally found useful.
 