Stats Question

Marissa4usa

Hi guys,
I have been thinking about this for hours and my head is hurting. I am in the process of writing up my proposed analysis but I cannot think of a way how to run the stats in SPSS.
The problem is as follows: I will have couples come in who will complete questionnaires assessing each partner's level of anxiety. Based on the scores that each partner gets on this questionnaire, the couple will fall in 1 of 3 "profiles":
  • anxious - anxious (i.e. both partners anxious)
  • anxious- non-anxious (one partner anxious, one non-anxious)
  • non-anxious - non- anxious (neither partner anxious)
My goal is then to assess how these couples based on the category/profile they fall into (i.e. NOT the individual partners) differ from each other on different variables --> which leads to the next problem: One of the dependent variables also comes from questionnaires that both partners in a couple have completed.

Essentially, I'm not sure how create the IV (i.e. the "profile) to be able to use it in SPSS for my purposes. Similarly, I don't know how do the same with the the DV.

The answer might be obvious but for some reason I don't see it.

Thanks in advance to all you great statisticians out there !
 
Run an F test with the groups aa, an, and nn, plug in the DVs you want to look at, then run post hocs to see what the differences mean.

I'm not great at stats (probably an understatement), but this is what comes to mind.
 
Hi loveoforganic,
thanks for your help, but what you just told me is essentially the statistical test I am supposed to run, which right now isn't my concern.
I guess I am overthinking things right now. I might just have to create a dummy variable to group the couples... oh man, I think it's time for me to go to bed, start fresh tomorrow, and hopefully read some advice! G'night!
 
Oh, I misunderstood, I see what you're asking now, my b. I'm not sure your question is answerable without more information about the dependent variable in question.

That said, I don't have tons of experience while plenty of people on here do, so sleeping on it and looking tomorrow probably isn't a bad idea!
 
We need more info about the IV. How did you assess anxiousness?
 
Maybe I'm misunderstanding what the problem is, but here is how I would do it.
First, move the data out of stacked form (i.e. Transform > Restructure>Cases into variables), so you have the data for a couple in a single row.

If "Person1AnxietyScore" < "CutoffScore" AND "Person2AnxietyScore" < "CutoffScore" Then "Profile" = 0
If ("Person1AnxietyScore" >= "CutoffScore" AND "Person2AnxietyScore" < "CutoffScore") OR ("Person1AnxietyScore" < "CutoffScore" AND "Person2AnxietyScore" >= "CutoffScore") Then "Profile" = 1
If "Person1AnxietyScore" >= "CutoffScore" AND "Person2AnxietyScore" >= "CutoffScore" Then "Profile" = 2

There are more efficient ways to do it, and you might have to play around with the ordering a bit, but you get the idea. You could treat anxiety as continuous if you wanted, but that would get a bit more complicated.
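For anyone sketching this recode outside SPSS, the same logic can be written in a few lines of Python. Counting how many partners meet the cutoff sidesteps the condition-ordering issue entirely; the variable names and the cutoff of 30 are purely illustrative, not values from the study:

```python
def couple_profile(p1_score, p2_score, cutoff):
    """Classify a couple by how many partners meet the anxiety cutoff.

    Returns 0 (neither anxious), 1 (exactly one anxious), or
    2 (both anxious). Scores at or above the cutoff count as anxious.
    """
    # Booleans add as 0/1, so the sum is the number of anxious partners
    return (p1_score >= cutoff) + (p2_score >= cutoff)

# Illustrative couples: (partner 1 score, partner 2 score)
couples = [(12, 18), (45, 22), (51, 38)]
profiles = [couple_profile(a, b, cutoff=30) for a, b in couples]
# profiles == [0, 1, 2]
```

Because each couple maps to exactly one count, there is no chain of IF statements whose order matters.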

Remember, if you are looking at variables across levels (i.e. interested in how "couple"-level variables predict scores for individuals on one item or another), you should really use HLM or something else that accounts for clustering. I'm not sure it's necessary if you are collapsing couples together and treating them as a single unit in all analyses. You might be able to get away with it in this instance, since your couple-level variables are really just combinations of your individual-level variables, so I'm not sure they would really count as level-2 in the traditional sense.
 
Maybe I'm misunderstanding what the problem is, but here is how I would do it.
First, move the data out of stacked form (i.e. Transform > Restructure>Cases into variables), so you have the data for a couple in a single row.

If "Person1AnxietyScore" < "CutoffScore" AND "Person2AnxietyScore" < "CutoffScore" Then "Profile" = 0
If ("Person1AnxietyScore" >= "CutoffScore" AND "Person2AnxietyScore" < "CutoffScore") OR ("Person1AnxietyScore" < "CutoffScore" AND "Person2AnxietyScore" >= "CutoffScore") Then "Profile" = 1
If "Person1AnxietyScore" >= "CutoffScore" AND "Person2AnxietyScore" >= "CutoffScore" Then "Profile" = 2

Arbitrary cut points! It burns!
 
Arbitrary cut points! It burns!

Haha, agreed. However, unless diagnostic interviews were done, arbitrary cut points are inevitable if the measures need to boil down to 3 categories.
 
Ollie, why would analyzing continuously increase the complexity? Do you mean needing to look at the normality of the data to see what's the appropriate test?
 
Thanks for your replies so far. Here is a little more information.
I will need those "profiles" because the couples' interactions will be coded using an observational coding system, where the couple is coded, not the individuals. Therefore, I need some way to "lump" them together, because I am predicting that couples will interact differently based on their profile (i.e. whether none, one, or both partners are high in anxiety).
I am assessing anxiety with a scale that allows me to separate individuals into anxious and non-anxious (I know that's not the most precise measure, but I got my Prof's okay, plus some previous studies have done it, so at least it's justified).

@JockNerd:
Transform > Restructure > Cases into variables
Did you mean:
1. Transform > Compute Variable > etc., OR
2. Transform > Recode > etc., OR
3. Data > Restructure > etc.?

I don't have this exact combination in my version of SPSS, and I have tried all of them, but for some reason I can't figure out how to do it.
 
Marissa - you need to restructure your data - wherever that function is, it's what you need to do (I might be wrong; I didn't have SPSS open when I posted). I don't have time now, but shoot me a PM if you want me to give you more explicit instructions on how to do it.

Ollie, why would analyzing continuously increase the complexity? Do you mean needing to look at the normality of the data to see what's the appropriate test?

Well yes, that is one part of it, albeit not a major hurdle.

However, I think the bigger barrier is the desire to look at it differently for people at high levels versus low levels. I haven't thought through the details of the analysis, but this isn't exactly straightforward the way the original analytic plan sounded. If you want to keep it continuous, you would essentially be looking at difference scores in anxiety level. Calculating the differences is more complex than it initially appears, because if you order the difference by an arbitrary variable (e.g. gender), as is typically done, your effects can wash out unless you take the absolute value. However, taking the absolute value in turn can make interpretation of moderators and other effects an absolute beast.
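A toy illustration of the wash-out problem (all numbers made up): if differences are signed by an arbitrary variable like gender, couples where the man is more anxious cancel couples where the woman is, while absolute differences keep the discrepancy visible.

```python
# Hypothetical (male_score, female_score) pairs; discrepancy is 20 in every couple
couples = [(40, 20), (20, 40), (35, 15), (15, 35)]

signed_diffs = [m - f for m, f in couples]   # direction set by gender, arbitrarily
abs_diffs = [abs(d) for d in signed_diffs]   # discrepancy regardless of direction

mean_signed = sum(signed_diffs) / len(signed_diffs)  # 0.0 -- the effect washes out
mean_abs = sum(abs_diffs) / len(abs_diffs)           # 20.0 -- discrepancy preserved
```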

Ideally, I think the best way would be a random effects model; any use of fixed effects seems very messy. If intercepts and slopes are both allowed to vary, you can then look at their covariation and see whether, for example, people with low difference scores but a high overall anxiety score have a different slope from people with a low difference score and a low overall anxiety score. I can't think of any way to do that with a fixed effects model without setting cut points, but maybe I'm wrong. I'd have to spend a lot more time thinking about it to figure out exactly how the random effects could even be done in this context (there are some other complicating issues I hinted at above), but hopefully you get the idea.
 
There are ways to do this that don't require you to set an arbitrary point for "high anxiety." I would never let that fly in a paper I'm reviewing.

If I understand what you said: Each person filled out an anxiety questionnaire. So, you have anxiety for both partners (IV). You have observations of interactions (DV).

I think a simple moderation might actually do it: anxiety of person 1 and anxiety of person 2 are both associated with worse interactions (DV), and the interaction term adds significantly to the model. Boom. No arbitrary cuts.
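A minimal sketch of that moderation model on simulated data (NumPy only; every variable name and effect size here is invented for illustration): regress the couple-level outcome on both partners' continuous anxiety scores plus their product, and examine the product term.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200
anx1 = rng.normal(size=n)   # partner 1 anxiety, kept continuous
anx2 = rng.normal(size=n)   # partner 2 anxiety
# Simulate an outcome where the partners' anxieties truly interact
dv = 0.4 * anx1 + 0.4 * anx2 + 0.5 * anx1 * anx2 + rng.normal(scale=0.5, size=n)

# Design matrix: intercept, two main effects, and the product (moderation) term
X = np.column_stack([np.ones(n), anx1, anx2, anx1 * anx2])
beta, *_ = np.linalg.lstsq(X, dv, rcond=None)
# beta[3] is the interaction estimate; it should land near the true 0.5
```

In SPSS this is just a linear regression with a computed product term (typically after centering the predictors), so no cut points are needed.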
 
Marissa - you need to restructure your data - wherever that function is, it's what you need to do (I might be wrong; I didn't have SPSS open when I posted). I don't have time now, but shoot me a PM if you want me to give you more explicit instructions on how to do it.



Well yes, that is one part of it, albeit not a major hurdle.

However, I think the bigger barrier is the desire to look at it differently for people at high levels versus low levels. I haven't thought through the details of the analysis, but this isn't exactly straightforward the way the original analytic plan sounded. If you want to keep it continuous, you would essentially be looking at difference scores in anxiety level. Calculating the differences is more complex than it initially appears, because if you order the difference by an arbitrary variable (e.g. gender), as is typically done, your effects can wash out unless you take the absolute value. However, taking the absolute value in turn can make interpretation of moderators and other effects an absolute beast.

Ideally, I think the best way would be a random effects model; any use of fixed effects seems very messy. If intercepts and slopes are both allowed to vary, you can then look at their covariation and see whether, for example, people with low difference scores but a high overall anxiety score have a different slope from people with a low difference score and a low overall anxiety score. I can't think of any way to do that with a fixed effects model without setting cut points, but maybe I'm wrong. I'd have to spend a lot more time thinking about it to figure out exactly how the random effects could even be done in this context (there are some other complicating issues I hinted at above), but hopefully you get the idea.

This is what I was thinking... with too few samples you would need to bootstrap, no? I am overly rusty on social science stats, but I agree with this post and the one below: setting arbitrary cut points is a bad idea, especially with all the statistically sound techniques out there for determining relationships in small samples. I am assuming here, but what, 15-20 couples? Do you have the data already?
 
Hi guys,
I have been thinking about this for hours and my head is hurting. I am in the process of writing up my proposed analysis, but I cannot think of a way to run the stats in SPSS.
The problem is as follows: couples will come in and complete questionnaires assessing each partner's level of anxiety. Based on the score each partner gets on this questionnaire, the couple will fall into one of three "profiles":
  • anxious - anxious (i.e. both partners anxious)
  • anxious - non-anxious (one partner anxious, one non-anxious)
  • non-anxious - non-anxious (neither partner anxious)
My goal is then to assess how these couples differ from each other on various variables based on the category/profile they fall into (i.e. the couples, NOT the individual partners). Which leads to the next problem: one of the dependent variables also comes from questionnaires that both partners in a couple have completed.

Essentially, I'm not sure how to create the IV (i.e. the "profile") so that I can use it in SPSS for my purposes. Similarly, I don't know how to do the same with the DV.

The answer might be obvious, but for some reason I don't see it.

Thanks in advance to all you great statisticians out there!

You know I just realized something... even if you ran a random effects model, that does not necessarily mean what you get will make sense, and I think you might have that problem here.

Basically, I think what will go wrong is this: let's say you have a couple in the anxious/non-anxious group; then the information from each partner's questionnaire will counteract the other...

Essentially my point is this: once you assign groups based on their questionnaires, I don't think you can reuse the same questionnaire for a between-group analysis; it just makes little sense. Can you use a different assessment, or is it too late?
 
You know I just realized something... even if you ran a random effects model, that does not necessarily mean what you get will make sense, and I think you might have that problem here.

Basically, I think what will go wrong is this: let's say you have a couple in the anxious/non-anxious group; then the information from each partner's questionnaire will counteract the other...

Essentially my point is this: once you assign groups based on their questionnaires, I don't think you can reuse the same questionnaire for a between-group analysis; it just makes little sense. Can you use a different assessment, or is it too late?

Right now I am in the process of writing up my proposed analysis, so anything is possible. This is for my thesis, and based on the advice of my professor I will officially propose a regression analysis to assess my results. We are, however, planning to analyze the data using SEM as well.
Therefore, I obviously don't have any data yet. I'd like to get around 50-60 couples. Whether that's going to happen, who knows.
Also, the cut-off point was used in a previous study, for which it seemed to have worked. Furthermore, the question I posed here was only for a sub-question of a hypothesis.

ANY advice is greatly appreciated. I hope to eventually publish the results (if I find anything) and the more sophisticated the analysis the better.
 
the more sophisticated the analysis the better.


Noooooooooooooooooooooooooooooooooooooooooooooooooooo

The simpler the analysis, the better, always.

n = 60 is not enough to run an SEM.

What's wrong with my moderation proposal?
 
the more sophisticated the analysis the better

Actually, FWIW, one of the stats people at my university would disagree with that and says that a study that uses relatively simple stats (as appropriate, of course) is often more impressive than one that tries to use really complex stats to search for significance.

YMMV (at this point, I don't know enough to say whether mine does or not, but I thought that was an interesting stance).

ETA: JN beat me to it! 🙂
 
Noooooooooooooooooooooooooooooooooooooooooooooooooooo

The simpler the analysis, the better, always.

n = 60 is not enough to run an SEM.

What's wrong with my moderation proposal?

Ha, okay. That's the exact opposite of what I've been told so far but I want to hear opposing views.

Actually, I will do a moderation analysis. I forgot to mention that earlier because, again, the question I had was about a specific sub-question of a hypothesis.

Hm, I don't know enough about SEM, it's just something my prof suggested and at that time we had already talked about the number of participants. I'm glad I posted here because it seems there are several things that I have been misinformed about.
Thanks!
 
Right now I am in the process of writing up my proposed analysis, so anything is possible. This is for my thesis, and based on the advice of my professor I will officially propose a regression analysis to assess my results. We are, however, planning to analyze the data using SEM as well.
Therefore, I obviously don't have any data yet. I'd like to get around 50-60 couples. Whether that's going to happen, who knows.
Also, the cut-off point was used in a previous study, for which it seemed to have worked. Furthermore, the question I posed here was only for a sub-question of a hypothesis.

ANY advice is greatly appreciated. I hope to eventually publish the results (if I find anything) and the more sophisticated the analysis the better.

JN is wrong; you can perform SEM, though again it involves a bootstrap. See below for a reference:
http://www.informaworld.com/index/785833305.pdf

and

http://orm.sagepub.com/cgi/content/abstract/11/2/296

Regression... again, it depends on how you set it up, but SEM is obviously the better route.

The cutoff will work because you need simplified conditions. It would be seriously complicated to do a multivariate analysis comparing varying levels of anxiety in each couple member against EACH individual variable, but by simplifying to 3 groups it becomes a lot easier: [x,y,z] [a,b,c...]. So I agree with cutoffs... I just disagree with other things about it, as noted before: using the same questionnaire for making the groups and for analyzing results within the groups. Or at least have two independent measures of the same thing so you could prove validity to me (or a reviewer).

Cheers and good luck... PS: I could also post an article where bootstrapping was used in a psych context.

J
 
Ha, okay. That's the exact opposite of what I've been told so far but I want to hear opposing views.

Actually, I will do a moderation analysis. I forgot to mention that earlier because, again, the question I had was about a specific sub-question of a hypothesis.

Hm, I don't know enough about SEM, it's just something my prof suggested and at that time we had already talked about the number of participants. I'm glad I posted here because it seems there are several things that I have been misinformed about.
Thanks!

You do not want to fish for fishing's sake, but SEM IS a complex analysis on purpose, because there is NO way to know all possible effects... this is the point of a Monte Carlo simulation... so again I disagree with JN. Yes, I agree you do not want to just do a bunch of meaningless complex BS, BUT if you have your variables properly set up and defined, SEM is perfectly suitable to your needs...
 
The cutoff will work because you need simplified conditions. It would be seriously complicated to do a multivariate analysis comparing varying levels of anxiety in each couple member against EACH individual variable, but by simplifying to 3 groups it becomes a lot easier: [x,y,z] [a,b,c...]. So I agree with cutoffs... I just disagree with other things about it, as noted before: using the same questionnaire for making the groups and for analyzing results within the groups. Or at least have two independent measures of the same thing so you could prove validity to me (or a reviewer).

What do you mean by that? I'm not sure I'm following what exactly your problem is. Would you mind explaining?


I could also post an article where bootstrapping was used in a psych context
Yes, I'd appreciate that!
 
What do you mean by that? I'm not sure I'm following what exactly your problem is. Would you mind explaining?



Yes, I'd appreciate that!

http://www.springerlink.com/content/v72311m2450w0153/
and
http://www.springerlink.com/content/n422087618358672/

Both are very good examples of using this technique... it was especially important because they were working with populations that pose the triple-threat problem (human, juvenile, offenders), so they had far fewer participants.

edit: I will think more on how to better word my criticisms when it isn't 2:45 am 🙂
 
JN is wrong; you can perform SEM, though again it involves a bootstrap. See below for a reference:
http://www.informaworld.com/index/785833305.pdf

and

http://orm.sagepub.com/cgi/content/abstract/11/2/296

Regression... again, it depends on how you set it up, but SEM is obviously the better route.

The OP *could* do it, but again, there's no *point* to using SEM when a simpler analysis will do.

I'm never wrong 😉

The cutoff will work because you need simplified conditions. It would be seriously complicated to do a multivariate analysis comparing varying levels of anxiety in each couple member against EACH individual variable, but by simplifying to 3 groups it becomes a lot easier: [x,y,z] [a,b,c...]. So I agree with cutoffs... I just disagree with other things about it, as noted before: using the same questionnaire for making the groups and for analyzing results within the groups. Or at least have two independent measures of the same thing so you could prove validity to me (or a reviewer).
"It was easier to do it that way" is not sound justification. Artificial groups are nonsensical. They're essentially saying "our range is 0-60, our cutoff is 30, people who score 0 and 29 are the same but people who score 29 and 31 are different." Nonsense.
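The cost of an arbitrary split is easy to demonstrate with a quick simulation (all numbers invented): dichotomizing a continuous predictor at the median reliably attenuates its observed correlation with an outcome, roughly by the factor of 0.8 that Cohen described for median splits.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)            # continuous anxiety score
y = 0.5 * x + rng.normal(size=n)  # outcome linearly related to anxiety

x_split = (x > np.median(x)).astype(float)  # arbitrary "anxious" / "non-anxious"

r_continuous = np.corrcoef(x, y)[0, 1]
r_dichotomized = np.corrcoef(x_split, y)[0, 1]
# r_dichotomized comes out noticeably smaller than r_continuous:
# the split throws away real information about individual differences
```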

Also, if you're suggesting combining the cutoffs with the SEM, that sounds to me like multiple group comparison. Even generously assuming equal cell sizes, that's only 20 couples per group--far too few, even with bootstrapping.

Can someone tell me what's wrong with my moderation hypothesis that everyone is ditching it in favor of flashy things? Unless I was misunderstanding the data structure, that's the best way to do it.
 
JN, just out of curiosity - for studies on diagnosable mental illness, would cutoff groups be valid (e.g. nondiagnosed, mild, moderate, severe; or more extremely, nondiagnosed, diagnosed)? Or would you still favor looking at it continuously based on number of criteria met?
 
JN, just out of curiosity - for studies on diagnosable mental illness, would cutoff groups be valid (e.g. nondiagnosed, mild, moderate, severe; or more extremely, nondiagnosed, diagnosed)? Or would you still favor looking at it continuously based on number of criteria met?

Only if there is a reasonable theoretical or statistical reason. If you were measuring the number of symptoms presented, for example, then 1 symptom = mild, 2 symptoms = moderate, 3 symptoms = severe as categories might be better than a continuous variable with a range of only three. Or, some scales have specific cutoffs that indicate something qualitatively different. But things like median or mean splits of continuous variables should almost never be used. I can't think of any circumstance where that would be preferable to continuous measurement.
 

Can someone tell me what's wrong with my moderation hypothesis that everyone is ditching it in favor of flashy things? Unless I was misunderstanding the data structure, that's the best way to do it.

I'll have to get back to you on this one. Though the broader point (don't do complex analyses just for the sake of it) I agree with.

Yours would certainly WORK, based on what I understand... I'm just not certain it allows all the relevant questions to be asked (not that the original plan does either). Arbitrary cut scores aren't good either, but I also see far too many papers that use multiple regression and then try to make inferences that extend well past it, into things they could have tested using different analyses but didn't - so it is far from perfect too. However, figuring out whether that is the case here requires more time than I have right now, so it will be delayed a bit until I can think it through a little more.

I will say I'm totally unclear why SEM would be necessary in this case. I thought the debate was between whether to keep the measure continuous or use cut scores, and then use that questionnaire in either form to try and predict another variable. Don't see any need for SEM in that situation. To be fair I'm horribly biased against SEM since I think it is by far the most abused statistical technique in existence right now. I'm tired of seeing people building ridiculously complex, supposedly "causative" models from correlational surveys of 100 undergrads where they clearly just threw a bunch of measures on a single topic in a relatively atheoretical way onto surveymonkey, slammed them into an SEM analysis, and just played with it until it fit. Blech. Some wonderful, fascinating models have been developed out of careful SEM analysis, and it has other uses as well that I think are if anything, underutilized. However...I think it is far more commonly used as turd polish.
 