Coding/Statistics Experience Necessary for PhD Programs?


Redpancreas (Full Member, 10+ Year Member, joined Dec 28, 2010)
Having gone down the physician path myself, I had no idea how competitive the Clinical Psychology PhD process is (i.e., triple-digit applications for double-digit interview invites for 1-4 spots), and I definitely realize I lack the intuition I once thought I had about it, so I need help.

My partner has worked multiple jobs in psychology with a diverse range of clinical experiences at psychology centers, an OK uGPA (though she's a bit removed from it), an above-average GRE, and she has recently started accumulating publications through her most recent research job. The issue she has raised is that she feels she lacks experience with coding (e.g., R) and a foundation in statistics.

This cycle she has a couple of interviews, but her #1 choice (a top-30ish program) recently rejected her, and she feels it's because she doesn't have coding experience, which I think is absurd. It's a PhD program; they're supposed to teach her the research methodologies. I feel her time would be best spent pumping out more publications to improve her CV (in the off chance she has to reapply). She agrees and wants to keep doing that, but she also wants to take classes in advanced statistics and programming, which she's currently doing, and it's stressing her out.

I obviously support her and love her drive to learn programming, but do you all really feel this is necessary for admission into a Clinical Psychology PhD program (I'm not talking a tip-top academic program, I'm talking any program)? She has an unhealthy tendency to underestimate herself sometimes, so I wanted to get some wiser opinions. Obviously I'm happy she's taking initiative, but I just want to make sure she's not putting the cart before the horse.

If she already has publications, she probably has sufficient stats experience and knowledge for grad school admissions and coding experience isn't necessary. While I agree that getting more research and products would be the best use of her time, I also wonder about her fit with the PIs, their current grad students, and the overall programs to which she has applied. It doesn't really matter how impressive her CV is and how great she is at coding if she doesn't have a close fit.
 
She mentioned fit being important and she planned for that (see below). She said she loved the PI at the place she got rejected from and felt she fielded questions from the grad students and post-docs well. I think she was proactive about the fit part and applied to programs based on her research interests, ending up applying to only about 9 places against my advice of applying to 100s. She followed the Mitch Guide or whatever, which I guess is supposed to be good. It seemed like the PI at the program was actively recruiting her and telling her how well she fit, and apparently she made some short list of applicants in the single digits, but ultimately I guess they could only take 1-2 and she wasn't one of them. :/ She reached out for feedback and was ghosted. The only thing she mentioned was that some people asked questions about coding, etc., and she felt that was one of the areas where her application was not as strong.

Do you think some programs see coding/stats experience as a prerequisite? She does have publications, but she's not first author, and while she's done a lot of the writing and data collection work, I think the team outsources the data analysis to statisticians, and the original research ideas are usually formulated by the PI, even though she mentioned she's highly involved with the IRB development.

I personally think this may be a program-specific issue; maybe they're competitive enough that they can ask more of their applicants. That's partly what I'm thinking, if her inclination that she was not selected for that reason is true. I also remind her that her goal is to get into a program, and she agrees, but maybe subconsciously she really wants to get into one of these more competitive programs because she senses they have more passion for the research she's interested in.
 
She might be an incredible, phenomenal, off-the-charts rockstar... and not get the spot. Most of the interviewed folks will not get the spot. It's unreasonable to apply to "100s" (I'll grant that's likely hyperbole?) because there will not be sufficient fit with that many mentors (and this is coming from someone who applied to ~15-20 programs... twice), but yeah, 9 isn't many. If she is getting interviews, she's in range at least. And Mitch's guide is legit. And some mentors are specifically looking for students with very specific skills. Hopefully she has some luck at another program.

PS: the program rankings are largely useless. All the fully funded programs are quite competitive. There is a ton of variability even within programs, since it's a mentorship model.
 
FWIW, when I was admitted, I could read data into statistical software, run descriptives, and run the basic tests. I also had experience running complex longitudinal models, because while the lab I was working in had a statistician, I sought out the opportunity to work with him when he was analyzing our study's data. I entered grad school ready to independently conduct data analysis, though I certainly learned a ton more in grad school. That being said, I know plenty of people who got in with just a couple of stats classes and the ability to point and click their way through SPSS.
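For a concrete sense of that "read data, run descriptives, basic tests" floor of competence (the thread is mostly about R, but the idea is language-agnostic), here is a rough sketch in Python. The group names and all numbers are invented purely for illustration:

```python
import statistics
from math import sqrt

# Hypothetical anxiety scores for two made-up groups (not real data).
control = [21, 19, 24, 23, 20, 22, 25, 18]
treatment = [17, 15, 19, 16, 18, 14, 20, 16]

# Descriptives: mean and SD per group.
for name, scores in [("control", control), ("treatment", treatment)]:
    print(name, round(statistics.mean(scores), 2),
          round(statistics.stdev(scores), 2))

# A basic two-sample t statistic (pooled variance), computed by hand.
n1, n2 = len(control), len(treatment)
m1, m2 = statistics.mean(control), statistics.mean(treatment)
v1, v2 = statistics.variance(control), statistics.variance(treatment)
sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)  # pooled variance
t = (m1 - m2) / sqrt(sp2 * (1 / n1 + 1 / n2))
print("t =", round(t, 2), "on", n1 + n2 - 2, "df")
```

In R this would be a few lines (`read.csv`, `summary`, `t.test`); the point is just what that baseline level of stats fluency looks like.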
 
Mitch's guide is definitely a great place to start. It has excellent information about how the application and interview process works and what a competitive application looks like. From my experience, statistics/coding is generally not a deciding factor unless it's central to the work the lab does. It can be seen as a positive, but would be weighed against other applicants' strengths and weaknesses more than a required box to check to be considered for admission. Unfortunately, the reality at many programs is that there are 50-100 applicants competing for a single spot for that PI, so many people who are otherwise qualified will not be offered admission. Each program and PI evaluates applications differently and it's hard to know exactly what they are looking for (and sometimes they don't completely know until they see their applicant pool). It really does all come down to fit, both in past experience and future research interests.

I also wouldn't put much stock in program rankings. Because PhD programs are based on the mentor model, who you work with matters far more than the program you went to if you want a research career in the future. You want to work with someone who will be able to support your growth in the content areas and methodologies you want to base your career on. They may work at a "big name" school or a state college in the middle of nowhere - it doesn't matter. Program quality is also often different from undergraduate rankings. I'd recommend looking at the "Student Admissions, Outcomes, and Other Data" on each school's website (they're required to post if they are APA accredited) for a general sense of competitiveness, since it'll tell you #offers/#total apps for the past 5 or so years. These also have licensure percentages and match rates for accredited internships, which are really what you want to look at before accepting an offer.

All that said, research productivity trumps all in the admissions process, so I also support putting time into that vs. stats/coding skill building that won't lead to pubs/posters or clinical work.

Edit: also, what @PsychPhDone said. Didn't see that post before I finished writing mine.
 

This was my experience. We were a coding/programming-heavy lab (EEG/fMRI), so everyone in that lab had coding experience. But there were very few labs like that in our program, and coding experience was not even considered in most other labs when reviewing applications. This was also more than a decade ago, so I'm not sure how much coding now pervades non-neuro areas, but at least locally I don't see a sea change in that respect.
 
Agree with all the posts above.

I also want to emphasize that in a true mentorship model program, that PI is essentially king/queen/dictator (which is a cynical way of saying fit is really important).

If they want coding experience, they will get it, regardless of how widespread or niche that is for the field as a whole.

If they don't like older/non-traditional students, those students will likely not end up in their lab. And so on and so forth.

And sometimes, when there are two relatively equally competitive candidates, the ultimate decision might come down to who they jibe with better interpersonally.

Unfortunately, this is an extremely subjective process in which candidates are largely in the dark as to what they need to be genuinely competitive, which is why the general advice is to gain solid experiences, apply broadly, and not give up if you're unsuccessful during a first application cycle.
 
In my experience, you can learn R in like a month. It's the same with statistics: it's like 1 or 2 courses and you're done. Most of the time you have to consult a statistician anyway. It's not that difficult. Just get more publications. Statistics and coding aren't a lot of work per se; they require some compatibility with the subject matter. Think of it as your epidemiology course XL.
 
Yes, at a basic level (e.g., regressions, moderation/mediation, ANOVAs), but if you want to run your own higher-level stats it's a substantially bigger time investment. I'm thinking of things like SEM, anything multilevel/longitudinal, and mixture modeling, which some grad students and professors do run themselves. There are so many intricacies involved in data preparation (e.g., dealing with missing data, creating latent variables) and interpreting the output (much more of an art than basic stats, where you can reject or fail to reject based on the p-value) that it takes pretty substantial experience and mentorship.
 
Lord, don't remind me of the missing data struggles: do you replace it, or is it 0, or whatever. Much of what you're mentioning comes with experience in the field, which I presume you'll get in the Ph.D. Do you guys expect a master's student to know how to deal with shrinkage and such? The basics are all you need to get into the Ph.D.; the rest you'll see when your initial dataset sucks.
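To illustrate the "do you replace it or is it 0" dilemma with a toy example (all numbers invented): different missing-data choices give different answers, and mean imputation quietly shrinks the spread.

```python
import statistics

# Hypothetical item scores with missing responses coded as None.
scores = [4, 5, None, 3, 4, None, 5, 4]

observed = [x for x in scores if x is not None]

# Option 1: complete-case analysis -- just drop the missing entries.
mean_dropped = statistics.mean(observed)

# Option 2: replace missing values with 0 (almost always wrong:
# it treats "no answer" as the lowest possible score).
mean_zero = statistics.mean([0 if x is None else x for x in scores])

# Option 3: mean imputation -- fill with the observed mean
# (preserves the mean but artificially shrinks the variance).
mean_imputed = [mean_dropped if x is None else x for x in scores]

print("dropped:", round(mean_dropped, 3), "zero-filled:", round(mean_zero, 3))
print("SD observed:", round(statistics.stdev(observed), 3),
      "SD after mean imputation:", round(statistics.stdev(mean_imputed), 3))
```

Neither shortcut is what a modern analysis would do (multiple imputation or maximum-likelihood approaches are the usual recommendations), which is exactly the poster's point about intricacy.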
 
Totally agree that you don't need to know any of that before starting the PhD. I more wanted to give context that there's a ton to statistics beyond the basics and that it can be quite difficult. Not everyone will focus on that during their graduate studies, but learning some of the more complex techniques can be helpful in designing studies and being asked to consult by others in the field (grad students and profs alike).
 
If she is curious what held her back, it isn't uncommon to email PIs and ask for advice on how to improve an application for future cycles. Programming experience is not normally expected/required, but for certain labs it can be. I know several for which fairly extensive Matlab/Python experience is a prerequisite, but these are unusual cases (computational modeling, machine learning, and neuroimaging labs). For more run-of-the-mill research topics it would be unusual - still probably a positive overall, but really not a big deal.

If she's stressed about learning programming, working on publications, and working a job, though... grad school is likely to be a tough ride.

Psych grad school is competitive in a different way than med school; it's a bit more nuanced because admissions are typically done by lab. You can be a fantastic candidate and by sheer luck (or lack thereof) be competing against someone who is a current project coordinator for your desired PI at one school, or a long-time research staff member with 20+ publications at another, while the remaining programs might decide you fit well with their past work but not quite well enough with the impossible-to-know direction they see their lab going over the next 5 years. That's part of what can make the process infuriating (but it's good practice for the unfairness of life moving forward!).

Good news is, if she's getting interviews, she's a very strong applicant and likely can get accepted somewhere. It might take some tweaks to the application and some effort, but she's in the running.
 
Hot take, but I'd be wary of a lab that required a prospective graduate student to have extensive coding experience. It sounds like a training model that could be using grad students essentially as analysts on other people's pubs instead of building their own CVs. Coding itself is not terribly hard to learn--there are oodles of free resources for learning R that I can happily point you to--but I agree with you, OP, that grad school should be a learning experience.
 
I think it depends on the lab. People generally came to our lab to do EEG/fMRI work, and if you have zero coding experience, it's a steep learning curve, especially with all of the other stuff you are doing in those first two years. Without coding, you would not be able to set up your own experiments or the analyses needed to work with that type of data. We were also fairly research heavy, so the older grad students didn't have a ton of time to teach you the coding either. We could do quick troubleshooting, but I didn't have time to teach someone from scratch.

That being said, we were an outlier lab. No other labs were anywhere close to the coding involvement we had. At least in clinical. The neuroscience grad labs were fairly coding heavy as well.
 

How extensive, would you say, did a prospective student's coding experience need to be to do this level of work?

In other words, how fast could a reasonably intelligent person figure it out?
 

In my time there, we had maybe one person who did not have a decent amount of coding experience. Most people had Unix/AFNI coding experience. I would say that a clinical psych grad student in our applicant group with little to no coding experience might be able to get up to snuff within 4-6 months, enough to start something on their own in that realm.
 

In my current role, I've done some work handling biometric data, for instance. We didn't cover it in my doctoral program at all, and there was a heavy expectation that I just figure it out and defend what I did. It's admittedly a different level of training, though.
 
I have a huge issue with the trend toward more complex stats. It seems like every lackluster lab is evergreening previous studies/datasets with gussied-up stats. I don't think it really adds much to the literature.
 
This is really going to vary by lab. I agree that could conceivably be an issue, but at least the labs I'm thinking of had a good track record for CV-building and people going on to academic positions. Some labs would ideally take folks from an engineering-type background, so not having written code before is just kind of a basic filter. Coding is pretty easy to learn, though being able to "think like a programmer" is not (though whether one can reasonably infer this from a grad school application is up for discussion - interviews might be a bit more effective for highlighting it).

I'd go a step further and say "Most of the literature doesn't add to the literature." There are some great reasons to use gussied-up stats, but certainly we see a fair amount of research that is just "the same thing we knew 20 years ago is still true using <insert new technique that could conceivably have a marginal impact on standard error>". Sadly, it's not going to change until we stop treating publications as a currency unit.
 

I appreciate the nuance here, though I wonder if we would agree that there are pros and cons to thinking like a programmer. I do some ad hoc reviewing for one fairly good journal (IF > 10), and I can't tell you how many papers we get where the methods are super solid but the theoretical validity of what they're doing makes little sense if we're all talking about humans. It's definitely my bias because I was trained this way, but I do think raw empiricism needs a strong theoretical check, and I wonder to what extent a recruitment strategy like this promotes that.
 

100% agreed. Though I'd add that the inability to "think like a programmer" very likely contributes to its own set of problems (e.g., blind trust that if the model ran and didn't throw errors, "the analysis worked"). We could endlessly debate which one is more problematic for the field, but both are prevalent and neither is great. That said, it's certainly possible to teach a programmer about theory or a theorist about programming.

I'm not saying one or the other is right or wrong. I'm actually pretty down on a lot of the mhealth/machine learning work in the field, which I think are some areas most guilty of what you describe.
 
I'm still very early in my career and biased by enjoying quantitative things agnostic of application (I'm self-studying linear algebra in my spare time to better understand how matrices underlie many of the models we run), but I like to try to frame things as "what does this add that would be impossible with prior methods." That's why I think, for example, longitudinal methods are interesting because they improve our understanding of individual trajectory based on personal factors vs normative experience at different ages, hopefully informing potential interventions and treatments. This has been a helpful discussion to consider, though, because the last thing I want in my career is to get caught up in doing things that seem "cool" in the moment but ultimately contribute nothing.
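As a toy illustration of the point about matrices underlying common models: ordinary least squares regression is just the matrix arithmetic beta = (X'X)^-1 X'y. Here is a sketch in plain Python with invented numbers and no libraries, so every matrix operation is visible:

```python
# Invented data: one predictor x, outcome y (roughly y = 2x).
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

# Design matrix X: a column of 1s (for the intercept) plus the predictor.
X = [[1.0, xi] for xi in x]

# X'X (2x2) and X'y (length 2), written out as explicit sums.
xtx = [[sum(row[i] * row[j] for row in X) for j in range(2)]
       for i in range(2)]
xty = [sum(X[k][i] * y[k] for k in range(len(y))) for i in range(2)]

# Invert the 2x2 matrix X'X by hand and multiply: beta = (X'X)^-1 X'y.
a, b = xtx[0]
c, d = xtx[1]
det = a * d - b * c
inv = [[d / det, -b / det], [-c / det, a / det]]
beta = [inv[i][0] * xty[0] + inv[i][1] * xty[1] for i in range(2)]

print("intercept:", round(beta[0], 3), "slope:", round(beta[1], 3))
```

The same formula generalizes to any number of predictors, which is why a bit of linear algebra (matrix multiplication, inverses, rank) goes such a long way toward understanding regression-family models.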
 
Just keep in mind it's always a balance. If you stick in the field, you will inevitably publish some incremental science just to get a pub, get tenure, get a grant, etc. You'll publish things and later find errors in them. Hopefully they'll be minor.

If these things don't happen, you are doing yourself (and your trainees, if you have them) a disservice. In my view, it is about striking a balance and keeping the big picture in mind - thinking about your career as a whole versus any individual publication. Admittedly, that is easier to do further along, and any individual publication "feels" a lot weightier early on.

Keep going on the math. I mean, longitudinal methods aren't anything new, but they obviously have an important place. I'd advise anyone just starting out to get at least some barebones basics in Python and machine learning methods. While a lot of the literature to date has been the awful, incremental sort I describe above, I do legitimately think they have enormous potential when properly used.
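For anyone wondering what "barebones basics" might mean in practice, here is a minimal, purely illustrative sketch (invented data, standard library only) of the core machine-learning habit: fit on a training split, evaluate on a held-out split. The nearest-class-mean classifier used here is about the simplest classifier possible, chosen only to keep the example self-contained:

```python
import random

random.seed(0)

# Invented 1-D "symptom severity" scores for two made-up groups:
# class 0 centered at 0, class 1 centered at 3 (purely illustrative).
data = [(random.gauss(0.0, 1.0), 0) for _ in range(50)]
data += [(random.gauss(3.0, 1.0), 1) for _ in range(50)]
random.shuffle(data)

# The core habit: fit on a training split, score on a held-out split.
train, test = data[:70], data[70:]

# "Training" a nearest-class-mean classifier = computing two means.
means = {}
for label in (0, 1):
    vals = [x for x, lab in train if lab == label]
    means[label] = sum(vals) / len(vals)

def predict(x):
    # Assign x to whichever class mean it is closer to.
    return min(means, key=lambda lab: abs(x - means[lab]))

accuracy = sum(predict(x) == lab for x, lab in test) / len(test)
print("held-out accuracy:", accuracy)
```

Real work would use established libraries and cross-validation, but the train/test discipline shown here is the piece that most distinguishes ML practice from classical in-sample model fitting.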
 
One of the most fun things about my dissertation was the night I spent manually doing some of the math in the esoteric model I used for analyzing variance (time, rater, method) and optimizing measurement procedures (one of Cronbach's last pushes was for this theory). At the time, there were two software programs you could get to do the math for you. One of them was basically an emulator for code originally written for mainframe computers, and the interface had you virtually coding and entering punch cards. The other one was much kinder, but I kept getting slightly different numbers. So I just had to do it by hand (although I did use SPSS to calculate the squares and other stuff) to see if I could make it work manually. It was like "whoa" when the hand calculations and the computer aligned.
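The flavor of that hand calculation can be sketched in miniature. This is a hypothetical, pared-down version with made-up ratings: a single persons-by-raters table, with variance components estimated from the expected mean squares of a two-way random-effects ANOVA (the poster's actual model also involved time and method facets, which this sketch omits):

```python
# Made-up ratings: 4 persons (rows) x 3 raters (columns).
ratings = [[7, 8, 6],
           [4, 5, 4],
           [9, 9, 8],
           [5, 6, 4]]
n_p, n_r = len(ratings), len(ratings[0])

grand = sum(sum(row) for row in ratings) / (n_p * n_r)
pmean = [sum(row) / n_r for row in ratings]
rmean = [sum(ratings[p][r] for p in range(n_p)) / n_p for r in range(n_r)]

# Mean squares for persons, raters, and residual (person x rater).
ms_p = n_r * sum((m - grand) ** 2 for m in pmean) / (n_p - 1)
ms_r = n_p * sum((m - grand) ** 2 for m in rmean) / (n_r - 1)
ms_pr = sum((ratings[p][r] - pmean[p] - rmean[r] + grand) ** 2
            for p in range(n_p)
            for r in range(n_r)) / ((n_p - 1) * (n_r - 1))

# Expected-mean-squares solutions for the variance components.
var_pr = ms_pr
var_p = max((ms_p - ms_pr) / n_r, 0.0)
var_r = max((ms_r - ms_pr) / n_p, 0.0)
print("persons:", round(var_p, 3),
      "raters:", round(var_r, 3),
      "residual:", round(var_pr, 3))
```

Here most of the variance is between persons, a little is between raters, and very little is residual, which is the kind of partition that measurement-optimization work builds on.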
 