Help - How to do moderation analysis in SPSS?

OlliePsych

Hey, just wondering if anyone can provide some assistance on this stats question. I am working on a moderation analysis and have been searching the internet and books for instructions on how to do this (if you have any recommendations for resources, that would also be wonderful...).

Here is what I have come up with so far from my readings, and I am wondering if I am doing it right:

Step 1 - Some sources say I would need to center the variable first, others say it is not necessary. Anyone have any advice on this?

Step 2 - Multiply the variables by the moderator.

Step 3 - Multiple regression with the predictor variable, the moderator, and their product entered. Is this all entered at once or is this a step-wise regression?

If the product of the variables is significant, then there is moderation? Does it matter if the initial variable and the moderator are individually significant or not?

Thanks for any help anyone out there can provide!!!

 
http://davidakenny.net/cm/moderation.htm

Frazier, P. A., Tix, A. P., & Barron, K. E. (2004). Testing moderator and mediator effects in counseling psychology research. Journal of Counseling Psychology, 51(1), 115-134.


Step 1 - Some sources say I would need to center the variable first, others say it is not necessary. Anyone have any advice on this?
You MUST either center the variables or regress the product terms on their parent scales. Not doing this results in severe multicollinearity. Think about what you're doing with this. You want the UNIQUE effect of the combination of the variables, but the raw product term contains a mix of the effects of each variable and the effect of the interaction... not useful.
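
A minimal sketch of the multicollinearity point, using made-up scores on two 1-5 scales (the data and the names `x`, `m`, `pearson`, and `center` are assumptions for illustration, not anyone's real data). It compares how strongly a predictor correlates with the raw product term versus the product of the centered variables.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / sqrt(var_a * var_b)

def center(values):
    """Subtract the mean from every score."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

# Hypothetical predictor (x) and moderator (m) scores
x = [1, 2, 2, 3, 3, 4, 4, 5, 5, 3]
m = [2, 1, 4, 3, 5, 2, 4, 3, 5, 1]

raw_product = [a * b for a, b in zip(x, m)]
xc, mc = center(x), center(m)
centered_product = [a * b for a, b in zip(xc, mc)]

r_raw = pearson(x, raw_product)             # predictor vs. raw product
r_centered = pearson(xc, centered_product)  # predictor vs. centered product
print(round(r_raw, 2), round(r_centered, 2))
```

With these particular numbers the raw product is strongly correlated with its parent variable, while the centered product is close to uncorrelated with it. Real data will not always show such a clean drop.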

Note: Centering is NOT the same as standardizing (i.e., converting to a z-score). Too many people seem to think that.

Step 2 - Multiply the variables by the moderator.
Yes

Step 3 - Multiple regression with the predictor variable, the moderator, and their product entered. Is this all entered at once or is this a step-wise regression?
No, hierarchical. Parent variables in block 1, interaction term in block 2, so that you can test if the interaction term improves the model.
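
A rough sketch of the two-block idea outside SPSS, assuming made-up data and a small hand-written least-squares helper (`ols_r2` is a hypothetical name, not an SPSS or library function). Block 1 holds the centered parent variables, block 2 adds their product, and the quantity of interest is the change in R-squared between the two.

```python
def ols_r2(columns, y):
    """R-squared from an ordinary least squares fit of y on the columns plus an intercept."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(rows[0])
    # Augmented normal equations [X'X | X'y]
    a = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for k in range(p):
        pivot = max(range(k, p), key=lambda idx: abs(a[idx][k]))
        a[k], a[pivot] = a[pivot], a[k]
        for r in range(p):
            if r != k:
                factor = a[r][k] / a[k][k]
                a[r] = [v - factor * w for v, w in zip(a[r], a[k])]
    coefs = [a[k][p] / a[k][k] for k in range(p)]
    fitted = [sum(c * v for c, v in zip(coefs, r)) for r in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def center(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]

# Hypothetical predictor, moderator, and outcome scores
x = [1, 2, 2, 3, 3, 4, 4, 5, 5, 3]
m = [2, 1, 4, 3, 5, 2, 4, 3, 5, 1]
y = [8.4, 8.9, 8.7, 9.9, 11.6, 9.0, 12.6, 11.7, 16.5, 8.3]

xc, mc = center(x), center(m)
xm = [a * b for a, b in zip(xc, mc)]  # interaction term from centered parents

r2_block1 = ols_r2([xc, mc], y)       # block 1: parent variables only
r2_block2 = ols_r2([xc, mc, xm], y)   # block 2: parents plus interaction
delta_r2 = r2_block2 - r2_block1

df_added = 1                 # one term added in block 2
df_resid = len(y) - 3 - 1    # n minus predictors in the full model minus 1
f_change = (delta_r2 / df_added) / ((1.0 - r2_block2) / df_resid)
print(round(delta_r2, 3), round(f_change, 2))
```

The p-value for the F change needs the F distribution, which this sketch leaves out; SPSS reports it when you request the R-squared change statistics.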

If the product of the variables is significant, then there is moderation? Does it matter if the initial variable and the moderator are individually significant or not?
Yes then no. The main effects do NOT have to be significant for the interaction to be significant.
 
Given what you say at the end, I assume you are referring to moderation specifically in regression (you can examine it using other techniques as well). In this case, JN gave you a good rundown.

And yes, you absolutely can have a significant moderator without significant main effects. In fact, that is one of the reasons moderators are important to examine...you can have a cross-over interaction that is wildly significant alongside main effects that are nowhere near significant.

Yes, step-wise. Parent variables in block 1, interaction term in block 2, so that you can test if the interaction term improves the model.

Careful - this is another area where terminology gets a little messy or fuzzy. I would call this a hierarchical regression, not stepwise (though you will see both used in the literature). Importantly, I don't think JN was referring to using a stepwise entry method, which is a different animal from entering things into different blocks. If memory serves (and it's been a while, so someone correct me if I'm wrong) a stepwise entry method means the predictors are entered into the equation one at a time, and then backed out if they are insignificant, so you lose the effect of essentially "controlling" for one predictor by entering it into the equation.
 
Right you are--that was a crossed neuron on my part. There's almost never a reason or justification to use step-wise regression; I did indeed mean hierarchical.
 
Great - that was all very helpful! Thank you all for your responses!
 
Right you are--that was a crossed neuron on my part. There's almost never a reason or justification to use step-wise regression; I did indeed mean hierarchical.

this is completely correct, and in SPSS the code would be
/METHOD=ENTER nameofvariable as opposed to /METHOD=STEPWISE or whatever it is if you tried stepwise, i don't know since i've never used stepwise since... well... why would you?

random other stats question - if you have two predictors and they are fairly highly correlated and you are worried about multicollinearity by including them both in a regression - what are good cutoffs for VIF/similar stats? i had heard a VIF under 4 is fine but i'm wondering if that is too loose of a standard...
 
VIF should be under 10, tolerance should be over .1 (tolerance is the reciprocal of VIF). You can cite this for that:
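
To put those cutoffs in terms of the original question (two correlated predictors), here is a small sketch. With exactly two predictors, the VIF for each is 1 / (1 - r^2), where r is their correlation; the function names are made up for illustration.

```python
from math import sqrt

def vif_two_predictors(r):
    """VIF for either predictor when the model has exactly two predictors correlated at r."""
    return 1.0 / (1.0 - r ** 2)

def tolerance(vif):
    """Tolerance is the reciprocal of VIF."""
    return 1.0 / vif

def r_at_vif(vif):
    """Size of the two-predictor correlation that produces a given VIF."""
    return sqrt(1.0 - 1.0 / vif)

print(round(r_at_vif(4), 3))    # correlation at which VIF reaches 4
print(round(r_at_vif(10), 3))   # correlation at which VIF reaches 10
print(tolerance(10))            # tolerance at VIF = 10
```

So in the two-predictor case a VIF of 4 corresponds to a correlation of about .87 and a VIF of 10 to about .95. With more predictors, VIF depends on the R-squared from regressing each predictor on all the others, so this shortcut no longer applies.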
http://www.amazon.com/Classical-Regression-Applications-Duxbury-Classic/dp/0534380166
 
Note: Centering is NOT the same as standardizing (i.e., converting to a z-score). Too many people seem to think that.

I've had arguments re: this with classmates who say standardizing is the same as centering, whereas I argue the opposite, but I have not been able to find exactly why that is. Could you elaborate?
 
They're nothing at all alike. Not related one bit, except that they both result in a mean = 0.
Centering is subtracting the mean of the variable from each score. So, if you had a mean score on some variable of 7.5, you subtract 7.5 from every data point of that variable. The purpose of this is to reduce the correlation between the parent variables and the interaction term.
Standardizing is converting to a z-score: you subtract the mean and then also divide by the standard deviation, so the scale of the variable changes.

Because I don't like to mess around with changing variables scales, I use the regress-out method.
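
A quick sketch of the difference, using made-up scores with a mean of 7.5 to match the example above (the function names are assumptions). Centering only shifts the scores; standardizing shifts them and then rescales them to a standard deviation of 1.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def sample_sd(values):
    mu = mean(values)
    return sqrt(sum((v - mu) ** 2 for v in values) / (len(values) - 1))

def center(values):
    """Subtract the mean; the units stay the same."""
    mu = mean(values)
    return [v - mu for v in values]

def standardize(values):
    """Subtract the mean and divide by the SD; the units become SDs."""
    mu, sd = mean(values), sample_sd(values)
    return [(v - mu) / sd for v in values]

scores = [5, 6, 7, 8, 9, 10]   # hypothetical scores, mean = 7.5
centered = center(scores)
z_scores = standardize(scores)

print(centered)                        # same spacing as the raw scores
print(round(sample_sd(centered), 3))   # SD unchanged by centering
print(round(sample_sd(z_scores), 3))   # SD is 1 after standardizing
```

Both versions have a mean of zero, but only the centered version keeps the original scale, which is why its regression weights stay in the original units.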
 
Also, if the interaction is significant you'll want to follow it up by calculating simple slopes. To do so, you have to create a "high" and "low" version of the moderator variable (by subtracting and adding 1 SD to the centered variable, respectively--and yes, I mean subtracting 1 SD to get the "high" level, and adding for the low), create new interaction terms and then re-run the regressions. I've created a handout on how to do this, feel free to PM me if interested.

Or, the easier way, is to use this lovely program:

http://www.victoria.ac.nz/psyc/paul-jose-files/modgraph/modgraph.php

You can get a graph of the interaction effect, and, if you ask SPSS for the covariance matrix, it will calculate simple slopes for you. I've been using it steadily since 2005, and was feeling incomplete a few weeks ago when the site was down. Thank goodness it's back now!
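
For anyone who wants to see what the simple-slopes calculation amounts to, here is a hedged sketch. It assumes a model of the form y = b0 + b_x*x + b_m*m + b_xm*x*m with centered x and m, and it uses made-up coefficients and coefficient (co)variances of the kind you would read off the SPSS coefficient covariance matrix. All names and numbers are hypothetical.

```python
from math import sqrt

def simple_slope(b_x, b_xm, var_x, var_xm, cov_x_xm, m_value):
    """Slope of y on x at a chosen value of the (centered) moderator, with its SE and t."""
    slope = b_x + b_xm * m_value
    se = sqrt(var_x + 2 * m_value * cov_x_xm + m_value ** 2 * var_xm)
    return slope, se, slope / se

# Hypothetical regression output
b_x, b_xm = 0.40, 0.25           # weights for x and for the x*m product
var_x, var_xm = 0.010, 0.004     # sampling variances of those two weights
cov_x_xm = 0.001                 # sampling covariance between them
sd_m = 1.2                       # SD of the moderator

high = simple_slope(b_x, b_xm, var_x, var_xm, cov_x_xm, +sd_m)
low = simple_slope(b_x, b_xm, var_x, var_xm, cov_x_xm, -sd_m)
print([round(v, 3) for v in high])
print([round(v, 3) for v in low])
```

This gives the same slopes as the re-centering approach. Subtracting 1 SD from the centered moderator moves its zero point up to +1 SD, so the weight for x in the re-run regression is the slope at the "high" level, which is why the subtraction goes with "high".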
 
Also, if the interaction is significant you'll want to follow it up by calculating simple slopes.

You don't have to follow-up the interaction with this approach. An alternative strategy is the Johnson-Neyman regions of significance (which I personally prefer).
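
A sketch of the idea behind the regions of significance, under the same kind of assumptions as a simple-slopes calculation: a model with centered x and m plus their product, hypothetical coefficients and coefficient (co)variances, and a critical t value that you supply yourself because it depends on your residual degrees of freedom. The boundaries are the moderator values where the simple slope's t statistic equals the critical value exactly, which comes down to solving a quadratic.

```python
from math import sqrt

def johnson_neyman(b_x, b_xm, var_x, var_xm, cov_x_xm, t_crit):
    """Moderator values where |simple slope / SE| equals t_crit (empty list if none)."""
    t2 = t_crit ** 2
    a = b_xm ** 2 - t2 * var_xm
    b = 2 * (b_x * b_xm - t2 * cov_x_xm)
    c = b_x ** 2 - t2 * var_x
    disc = b * b - 4 * a * c
    if a == 0 or disc < 0:
        return []
    root = sqrt(disc)
    return sorted([(-b - root) / (2 * a), (-b + root) / (2 * a)])

# Hypothetical regression output; t_crit is an assumed value for illustration
b_x, b_xm = 0.40, 0.25
var_x, var_xm, cov_x_xm = 0.010, 0.004, 0.001
t_crit = 1.98

bounds = johnson_neyman(b_x, b_xm, var_x, var_xm, cov_x_xm, t_crit)
print([round(v, 3) for v in bounds])
```

To tell which side of each boundary is the significant region, evaluate the simple slope and its standard error at a moderator value on either side. Boundaries that fall outside the observed range of the moderator are not meaningful.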
 
Hi,
I thought I'd post in this thread instead of starting a new one.
I am in the process of finalizing my thesis proposal which includes a moderation analysis in regression. My prof keeps telling me that I am not writing it up correctly, that my wording sounds like I am talking about mediation. I DO know what the differences are :) so maybe somebody can give me some feedback on what I am missing. Here is how I have written it right now.

For research question # 5, which asked whether the association between B and C is the same for people with different A's, a hierarchical regression analysis will be conducted. In the first step A and B will be entered. In the second step the interaction term of A and B will be entered, with C as the dependent variable. Substantial change in the coefficient for the interaction term will indicate that the interaction of A and B may account for the change in C. A post-hoc analysis will need to be conducted in order to understand the direction of the moderation.

Any input is greatly appreciated!
 
I think part of the issue is that you're talking about "change in the coefficient for the interaction term." It's not change in the coefficient. It's the change in R-squared when you add the interaction term. You have to make it clear that you're talking about how the addition of the interaction term accounts for significant unique variance in C.
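
For reference, the usual test of that change in R-squared can be written as follows. The symbols are my own labels: R_1^2 and R_2^2 are the R-squared values for the first and second steps, k_1 and k_2 are the numbers of predictors in each step, and n is the sample size.

```latex
F_{\text{change}} = \frac{\left(R_2^2 - R_1^2\right) / \left(k_2 - k_1\right)}{\left(1 - R_2^2\right) / \left(n - k_2 - 1\right)},
\qquad df = \left(k_2 - k_1,\; n - k_2 - 1\right)
```

In the proposal above, step 1 has two predictors (A and B) and step 2 has three (A, B, and their product), so k_2 - k_1 = 1.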
 
I agree, though will also add that "the interaction of A and B may account for the change in C" is probably contributing to the confusion. I understand exactly what you are doing and why you explained it this way, but it does sort of sound like you are describing a mediation model.

You are looking at whether the interaction term explains a significant amount of unique variance as ClinicalTrainee said, not whether it "accounts for the change". "Accounts for the change" implies to me that any main effects of A and B are "mediated" by the interaction effect...which doesn't make a whole lot of sense, and isn't what you are actually testing.
 
Quibble: If I were reading that as written, I would seriously wonder whether you incorrectly used the raw interaction term rather than the residual of the interaction or the centered interaction.
 
Hey everyone -

Thanks for all the great feedback you gave earlier on this question. So, I have gotten to a point where I have done the multiple regressions, have some significant interactions, and need to interpret them.

Can anyone provide information of graphing the interactions?

Thanks!
 
Actually, I think I got Modgraph to work now.
 
Glad you got it working.

RE: The earlier discussion, stepwise can actually be useful for more directly applied questions. I/O psych in particular still seems to use it. If say, you have a battery of tests you want to use to inform hiring decisions - in this case it actually makes sense to back things out of the model if they are not significant because you are interested in the combination of tests that accounts for the largest amount of variance on their own. This will then serve as the most cost-effective battery moving forward.

That was how I had it explained to me in a stats class anyways, and it seemed to make sense. That said, I think there are still issues with order of entry. For the typical research project, it is not of any real use since you typically want to make use of the data you have on hand.
 
I figured I'd join on this thread rather than start a new one.

I am currently trying to assess whether gender moderates the relation between a curvilinear/quadratic predictor variable and my outcome variable. After checking for skewness, I found that the predictor variable is positively skewed. I conducted a square root transformation to correct for skewness. However, I am interested in conducting curvilinear analyses with the transformed variable as the predictor. Therefore, I'm wondering the following:

1) After transforming the variable, will it be necessary to also center it because I am looking at moderation?


Can anyone help?
 
Not 100% sure I followed what you are trying to do, but transforming is irrelevant and shouldn't change anything. Whatever you do to get the data into its final form doesn't matter...you still need to center predictors.
 
Thanks. I think you answered my question.

But just to clarify: I am looking to see if gender moderates the curvilinear relation between a skewed predictor variable (sport intensity) and the outcome (loneliness). I transformed (square root transformation) the sport intensity variable before creating the quadratic term. I realized then that I may need to center the transformed variable before creating the quadratic term because I am conducting moderation analyses (gender X quadratic sport intensity --> Loneliness). Is this correct?
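
A sketch of the order of operations being described, with made-up data: transform first, then center the transformed variable, then build the quadratic and product terms from the centered version. The variable names, the scores, and the 0/1 coding of gender are all assumptions for illustration.

```python
from math import sqrt

# Hypothetical, positively skewed sport intensity scores and 0/1 coded gender
intensity = [0, 1, 1, 2, 4, 4, 9, 16]
gender = [0, 1, 0, 1, 0, 1, 0, 1]

# 1) Square root transformation to reduce skew
transformed = [sqrt(v) for v in intensity]

# 2) Center the transformed variable
mean_t = sum(transformed) / len(transformed)
linear = [t - mean_t for t in transformed]

# 3) Quadratic term built from the centered variable
quadratic = [c ** 2 for c in linear]

# 4) Product terms with the moderator
gender_x_linear = [g * c for g, c in zip(gender, linear)]
gender_x_quadratic = [g * q for g, q in zip(gender, quadratic)]

print([round(c, 2) for c in linear])
print([round(q, 2) for q in gender_x_quadratic])
```

Following the rule about lower-order terms mentioned elsewhere in this thread, a model testing gender X quadratic would presumably also carry gender, the linear and quadratic terms, and gender X linear.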
 
I am trying to look at the moderating effects of three continuous variables with a 4-level categorical predictor variable and a continuous dependent variable. I think the best way to examine this relationship is to run an ANCOVA in SPSS and model the IV, Moderator1, Moderator2, Moderator3, IV*Moderator1, IV*Moderator2, IV*Moderator3 on the DV. When I do this I get significant interaction effects. I look at the univariate findings to see which DVs the significant interaction effects hold true for. However, I am having difficulty isolating where specifically the moderation effect is located when I plot the regression lines. I can eyeball it in the graph but was wondering if there is any other recommendation out there.

Thanks in advance for your assistance :)
 
Hi,

I am posting in this thread instead of starting a new one. My advisor has asked me to explain the steps and criteria for testing a moderator effect in my thesis proposal. I found the article by Frazier, Tix, & Barron (2004) and the messages posted here very useful, so I have written the steps below according to my understanding. Could anyone please advise whether the following steps are correct? Are there any other criteria I should add? For your reference, all of my variables are continuous.

Step 1: Standardizing the predictor, moderator and outcome variables. (It was stated in the article by Frazier, Tix, & Barron (2004) that it is better to convert predictor and moderator variables into standard scores. But I am unclear whether I should also standardize the outcome variable.)

Step 2: Creating interaction term by multiplying the standardized predictors and moderator variables.

Step 3: Conducting Hierarchical Multiple Regression.

First: Enter the standardized predictor and moderators variables. (Does it matter that in the first block, the moderator variable has to be uncorrelated with the predictor and the outcome before I could proceed to the next step?)

Second: Enter the interaction term.

- Determine the change in R square after entering the interaction term. (where p < .05)

Step 4: If the interaction term is significant, plot a graph. (Referring to the message posted by ClinicalTrainee, I am still unclear about the part on “create new interaction terms and then re-run the regressions.” Could anyone please advise and elaborate more?)

[ClinicalTrainee;9650313]Also, if the interaction is significant you'll want to follow it up by calculating simple slopes. To do so, you have to create a "high" and "low" version of the moderator variable (by subtracting and adding 1 SD to the centered variable, respectively--and yes, I mean subtracting 1 SD to get the "high" level, and adding for the low), create new interaction terms and then re-run the regressions.

Last question: My predictor and moderator are measured with 5-point Likert scales. Frazier, Tix, & Barron (2004) suggest that the outcome measure needs to have at least 25 response options. I thought this might make it a little difficult for participants to pick an answer, so I am thinking of reducing it to a 10-point scale. Could anyone please advise?

Thanks so much
 
Per step 1: I've always centered the predictor and moderator, not used standardized scores (to center a variable, subtract the mean from each data point). Just make sure the mean of the centered variable is .0000 when you're done. You do not need to standardize or center your outcome.

The rest is correct. Your moderator can be correlated with the outcome (that would be a "main effect" of the moderator variable) and can be correlated with the predictor, though it's best if the correlation isn't too high.

As for the followup, you create "high" and "low" versions of your moderators....see the explanation in this article (plus cite it):

Holmbeck, G. N. (2002). Post-hoc probing of significant moderational and mediational effects in studies of pediatric populations. Journal of Pediatric Psychology, 27(1), 87-96.

Also, I highly recommend the website "modgraph" for getting graphs of significant interaction effects (you can do follow ups here too, provided you ask SPSS for the covariance matrix): http://www.victoria.ac.nz/psyc/paul-jose-files/modgraph/modgraph.php
 
...
Step 1: Standardizing the predictor, moderator and outcome variables. (It was stated in the article by Frazier, Tix, & Barron (2004) that it is better to convert predictor and moderator variables into standard scores. But I am unclear whether I should also standardize the outcome variable.)

You can mean-center (not convert to z-scores), or regress out the interaction term from the parent terms.

You do not need to, nor should you, center or use the regress out method on anything involving the outcome.

(Does it matter that in the first block, the moderator variable has to be uncorrelated with the predictor and the outcome before I could proceed to the next step?)

Yup, it would wreck the regression to run the analysis with the outcome included in the orthogonalizing.

Determine the change in R square after entering the interaction term. (where p < .05)

Well, as with anything, you need to look at R-sq, regression weights, etc.

Last question: My predictor and moderator are measured with 5-point Likert scales. Frazier, Tix, & Barron (2004) suggest that the outcome measure needs to have at least 25 response options. I thought this might make it a little difficult for participants to pick an answer, so I am thinking of reducing it to a 10-point scale. Could anyone please advise?

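If the variable you're entering into the regression is a mean (or sum) of several items for the outcome, then you have on the order of k*r possible scores (where k is the number of items and r is the number of response options per item). So if you have 5 items on a 7 pt scale you have roughly 5*7 values.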
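
A small sketch that counts the distinct scores directly, assuming items scored 1 through r and an outcome formed by summing or averaging k items (the function name is made up). Note that k*r is the largest possible total; the exact number of distinct totals works out to k*(r-1)+1, which is a little smaller but makes the same point.

```python
from itertools import product

def distinct_scale_scores(n_items, n_options):
    """Number of distinct totals (equivalently, distinct means) across n_items items scored 1..n_options."""
    totals = {sum(combo) for combo in product(range(1, n_options + 1), repeat=n_items)}
    return len(totals)

print(distinct_scale_scores(5, 7))   # 5 items on a 7-point scale
print(distinct_scale_scores(1, 10))  # a single 10-point item
```

So a multi-item outcome on a short response scale can clear the 25-value guideline, while a single 10-point item would not.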

I think Frazier et al just meant "no ordinal outcomes."
 
As for the followup, you create "high" and "low" versions of your moderators

Although people sometimes do this, I don't see why you would, especially if it involves dichotomizing a continuous variable. The stats obtained in a normal moderation analysis are perfectly sufficient to see the relationship.
 
Thank you, ClinicalTrainee and JockNerd for your kind help :)
 
Hi,

I have a question about analyzing data with Hierarchical Regression. (Sorry if I am off topic here...) I am just wondering if I can use hierarchical regression when I have a categorical independent variable?

(For independent variables: I have 1 categorical variable, and 3 continuous variables; For dependent variables: I have 1 continuous dependent variable)

Thanks!!:)
 
Yes. You'll want to dummy-code the categorical variable, first. This is quite easy if you only have two categories-- Make one variable, and just make one category value 0, and the other category value 1. If you have more than two categories, you'll need multiple variables, one fewer than the number of categories, with the left-out category serving as the reference group (e.g., C1, where Category 1's value is 1 and all others are 0; C2, where Category 2's value is 1 and all others are 0; C3, where Category 3's value is 1...).
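
A minimal sketch of dummy coding with a reference category, using made-up category labels (the function name and the `is_` prefix are assumptions). Each non-reference category gets its own 0/1 variable, and the reference category is the one coded 0 on all of them.

```python
def dummy_code(values, reference):
    """Return one 0/1 list per non-reference category."""
    levels = sorted(set(values) - {reference})
    return {f"is_{level}": [1 if v == level else 0 for v in values] for level in levels}

# Hypothetical 3-category variable, with "a" as the reference group
group = ["a", "b", "c", "a", "c"]
coded = dummy_code(group, reference="a")
print(coded)
```

With three categories this produces two dummy variables; each regression weight then compares that category to the reference group.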
 
:) Thanks for your reply thewesternsky! In this case, can I just interpret my analysis like a normal hierarchical regression, and not interpret it in the Mixed Model Regression way?
 
So, I'm going to revive this thread again.

For my current analyses I have one IV and two moderators.
I know that for step 1, I enter all of the predictor variables, i.e. IV, M1, and M2. For step 2 you would then enter an interaction term of the IV and moderator. In my case this is going to be IV * M1 * M2.

My question is: will I need to have an intermediate step where I enter all other possible interaction terms, i.e. IV * M1, IV * M2, and M1 * M2? I am only interested in the three-way interaction, but I'm wondering if I need to statistically control for the other possible two-way interactions in any way.

Thanks!
 
Yes, that's correct. Step 1 would be entering the individual predictors. At Step 2 you enter all possible 2-way interactions involved in the 3-way interaction. And at Step 3 you enter the 3-way interaction.

Of course, as stated above, all the variables involved in the interaction should be centered.

Check out: Aiken, L. S., & West, S. G. (1991). Multiple regression: Testing and interpreting interactions. Newbury Park, CA: Sage.
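
The three steps can be sketched as a list of blocks, one per interaction order. This is only a bookkeeping helper with a made-up name; the variable labels are the ones used in the question.

```python
from itertools import combinations

def hierarchical_blocks(variables):
    """One block per order: main effects, then 2-way products, then 3-way, and so on."""
    return [
        ["*".join(combo) for combo in combinations(variables, order)]
        for order in range(1, len(variables) + 1)
    ]

blocks = hierarchical_blocks(["IV", "M1", "M2"])
for step, terms in enumerate(blocks, start=1):
    print("Step", step, terms)
```

Each product term would still need to be computed from the centered variables before it goes into the regression.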
 
Hey,

I have a problem with the moderation analysis; I hope someone can help me.

For the current analysis I have 8 IVs and 7 moderators, and I am not sure how to enter all of them into SPSS. Do I need to do the moderation analysis for each IV separately?

I would enter then:

Step 1: IV1, M1,M2,M3,M4,M5,M6,M7

Step 2: IV1*M1 (...7)
M1*M2
M2*M3 etc.

Step 3: IV1*M1*M2*M3*M4*M5*M6*M7

Is it correct?


Thanks for any help!
 
In general, with that number of variables I would be inclined to use more of a "model-building" approach. In many ways this is similar to stepwise regression, which is usually a bad thing, but with that number of variables your options will be limited - especially if your sample size is anything but gigantic. I'd wager you would need an n in the tens of thousands in order to run a single model, and even then it would be virtually impossible to interpret.

This is assuming your Step 3 really is what you are interested in (an 8-way interaction across all the moderators?). If not, it is somewhat simpler but you would still oversaturate your model retaining all combinations of these variables.

The trick with bigger models is that you really need to "understand" your data, and not just plug & play with the analyses (not that I'd ever recommend doing that, but I worry about it less with simpler models). You need a thorough understanding of how your IVs and moderators co-vary with each other in order to make sense of this. I would actually test moderators individually in separate models first. Then look at combining the significant moderators into a single model. If you aren't interested in moderator/moderator interactions your step 2/3 would actually look very different. Correct me if I'm wrong, but I don't believe the M1*M2 term needs to be in the model unless you are looking at the M1*M2*IV interaction. If you just want M1*IV and M2*IV, I don't think M1*M2 NEEDS to be in the model (at least not on statistical grounds).

In sum, I would likely do the following:
1) Correlation matrix of IVs
2) Correlation matrix of moderators
3) Univariate model for each IV
4) Multiple regression model combining all IVs (don't necessarily need to drop out insig. terms yet).
5) Hierarchical regression with Step 1 = IV, M1, Step 2 = IV*M1 (repeated for all combinations)

The results of 5 would then partially dictate what you use to build larger models. Next you might want to stick to a single IV and see if moderators are still significant when combined. So for example, you could do Step 1 = IV, M1, M2, Step 2 = IV*M1, IV*M2, etc. You can use a similar plan to examine the impact of a single moderator with multiple IVs in the model. In general though, I prefer to keep the models smaller - for example, presenting the results of individual moderators, and using the correlation matrix to inform discussion of how they may overlap.

Be careful to use all the appropriate terms when testing an interaction. For any interaction term, all lower-order interactions need to be present. This is where it can quickly turn to uninterpretable gibberish. Your Step 3 above would require hundreds and hundreds of parameters. You'll find that even in the rare cases you do find significance with that number of terms, the results will often be complete nonsense, in opposing directions, etc.
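
To see how quickly the term count grows, here is a small counting sketch. It assumes the 8 variables from the Step 3 in the question (IV1 plus M1 through M7) and counts every main effect and every product of every order that a full model with the 8-way term would carry; the function name is made up.

```python
from math import comb

def terms_by_order(n_variables):
    """Number of terms at each order (1 = main effects, 2 = two-way products, ...)."""
    return {order: comb(n_variables, order) for order in range(1, n_variables + 1)}

counts = terms_by_order(8)
total_terms = sum(counts.values())
print(counts)
print(total_terms)
```

For comparison, the same count for a three-variable model (IV, M1, M2) is 3 + 3 + 1 = 7 terms, which is part of why smaller models are so much easier to interpret.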

 
Thank you so much for the explanation! I am going to try to work with small models first.
 
Hello friends,
I have 2 IVs, one DV, and 1 moderator, and I want to know the influence of the moderator on the relationship between the IVs and the DV. I used SPSS: first I entered the IV and the moderator, and in the second step I entered the interaction between them. I used the stepwise method. Is this process correct? Can we actually use the stepwise method?
 