Which formula for standard deviation? N or N-1?

FROGGBUSTER

[Image: standard_deviation_sample.png, the sample standard deviation formula]

Or would we use the n version?

Chad says n-1, Destroyer says n.

 
it depends on what you're looking at.

N is for a population.

N-1 is usually for samples, which are smaller than the population.
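
For reference, the two formulas being contrasted, written in common textbook notation. The symbols $\mu$, $\bar{x}$, $\sigma$, and $s$ are an assumption here, not something the thread itself uses:

```latex
\[
\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_i-\mu\right)^2}
\qquad \text{(population: divide by } N\text{)}
\]
\[
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i-\bar{x}\right)^2}
\qquad \text{(sample: divide by } n-1\text{)}
\]
```

Here $\mu$ is the population mean and $\bar{x}$ is the sample mean.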

the good thing is you won't be asked to calculate standard deviation. it's too time consuming.

I have not heard of anyone being asked to calculate standard deviation. I got a question on standard deviation, but it was in regards to variance. I was given 4 statements and had to pick the correct one.

But worst case, if you do get asked, you can determine the answer whether you use N or N-1, because the 3 other choices are so far off. It won't be like 1.2 vs. 1.3, from what I saw. You can pretty much estimate and still get it right; the choices were that obvious, even on the more number-intensive problems. At least that is what I saw both times.
 
This hella threw me off too. I don't know why it is n-1, because:

Since you are taking the average distance of the points from the mean, which is the definition of standard deviation, it would make sense to divide by the number of things you averaged in the first place. I don't understand why they have to subtract one from the number of data points. When you take an average of something, you divide by the number of things. Can someone explain to me in simple terms why they do this, instead of just telling me it's the degrees of freedom?

Note: I do know that there is n for populations and n-1 for samples. I just don't understand why they subtract one from it.
 
It has to do with the principles of variable uncertainty. This is why degrees of freedom (n-k) are typically used for a sample: each unknown variable adds uncertainty, which can increase the total standard deviation. When the whole population is used, there is no additional uncertainty, because the whole population is tested. That is why plain n is used (k is zero because there are no unknown variables in the whole population). If I remember my stats theory correctly, I think this is how it goes lol 😀
 
Nice bereno, that was good. So k would be the number of factors that could affect the data set. How do we know it's just 1 in a random data set, though? Couldn't there be a whole set of unaccounted-for variables offsetting the variance one way or the other?
 
Well, the easy way to look at it is that if you are taking one sample and looking for the mean, or something similar, there is only one variable, so k=1. This is why most people think of degrees of freedom as "n-1": most samples involve only 1 variable. However, when you start jumping into two-sample tests, tests of variance, ANOVA, chi-squared, etc., you run into many variables at once, so you need the appropriate value for k.

When in doubt, I would just assume k=1 for most simple setups (n-1). Hope this helps.
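
One way to see what subtracting 1 does, sketched in Python under stated assumptions: treat a small data set as the whole population, draw every possible sample of size 2 with replacement, and average the variance estimates. The helper `variance` and its `ddof` parameter are hypothetical names made up for this sketch, and the numbers are only illustrative.

```python
from itertools import product


def variance(data, ddof):
    """Sum of squared deviations from the mean, divided by len(data) - ddof."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - ddof)


# Assumption: these five numbers are the entire population.
population = [3, 4, 5, 6, 7]
true_variance = variance(population, ddof=0)  # divide by N

# Every ordered sample of size 2, drawn with replacement (25 in total).
samples = list(product(population, repeat=2))

# Average the estimate over all samples, once dividing by n, once by n-1.
avg_div_n = sum(variance(s, ddof=0) for s in samples) / len(samples)
avg_div_n_minus_1 = sum(variance(s, ddof=1) for s in samples) / len(samples)

print("population variance:      ", true_variance)
print("average estimate, / n:    ", avg_div_n)
print("average estimate, / (n-1):", avg_div_n_minus_1)
```

In this setup, dividing by n underestimates the population variance on average (1.0 against 2.0), while dividing by n-1 matches it. The sample mean sits closer to its own sample's points than the population mean does, and the extra 1 compensates for that.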
 
Yee
 
sample standard deviation is calculated around the sample mean

if you're given the sample mean and n-1 of the data points, you could simply calculate the nth data point: sample mean times n, minus the sum of the n-1 data points you've been given

basically you use n-1 because after you have a given, set sample mean (which you do the moment you start using it to calculate each data point's deviation from that value) you don't have n independent data points anymore, just n-1 independent data points
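
The argument above can be checked with a few lines of Python. This is only a sketch: the sample values are arbitrary, and the variable names are made up for illustration.

```python
# Illustrative sample; the numbers are arbitrary.
sample = [3, 4, 5, 6, 7]
n = len(sample)
sample_mean = sum(sample) / n

# Pretend the last point is hidden: only the mean and the other n-1 are known.
known_points = sample[:-1]
recovered = sample_mean * n - sum(known_points)

# Equivalently, deviations from the sample mean sum to zero (up to
# floating-point rounding), so the last deviation is fixed by the first n-1.
deviations = [x - sample_mean for x in sample]

print("hidden point:     ", sample[-1])
print("recovered point:  ", recovered)
print("sum of deviations:", sum(deviations))
```

Once the sample mean is fixed, the last point carries no new information, which is the sense in which only n-1 of the deviations are free to vary.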
 
This is the more technical, but very correct way to explain it 👍
 
I've never done n-1; I always use N and get all the problems right, so I dunno if this n-1 is a must.
 
Bump on this topic please. Has anyone that has taken the DAT encountered having to calculate/determine the variance or standard deviation on the actual exam?

I'm still a bit unclear on whether the ADA would base these questions off of using N to calculate the standard deviation of the sample set, like how Destroyer/Math Destroyer use N, OR if they would deem it correct to use (N-1) like how Chad teaches for the sample set.

For the purposes of the actual exam, when calculating the standard deviation or variance of a data set, such as {3,4,5,6,7}, should we use N, or N-1 in the denominator for the standard deviation equation?

I recall that Chad teaches N-1 in his video, but the questions/answers in DAT Destroyer and Math Destroyer seem to base the correct answers on N (as opposed to N-1).

I understand that technically, one is used for a sample set, and the other is used for a population, but which should we use for the DAT under which conditions?

Any definite insight on this?
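
For the data set in the question, both versions can be worked out directly. The mean is 5 and the squared deviations are 4+1+0+1+4 = 10, so N gives a variance of 10/5 = 2 and N-1 gives 10/4 = 2.5. A small Python check using the standard library's `statistics` module follows; it only shows the arithmetic and does not settle which convention the exam expects.

```python
import statistics

data = [3, 4, 5, 6, 7]

# Divide by N: treats the five values as the whole population.
pop_var = statistics.pvariance(data)
pop_sd = statistics.pstdev(data)

# Divide by N-1: treats the five values as a sample from a larger population.
samp_var = statistics.variance(data)
samp_sd = statistics.stdev(data)

print(f"N:   variance = {pop_var:.3f}, sd = {pop_sd:.3f}")
print(f"N-1: variance = {samp_var:.3f}, sd = {samp_sd:.3f}")
```

The two standard deviations come out near 1.41 and 1.58, so answer choices would have to be fairly close together for the choice of denominator to matter.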
 
Average and variance showed up on mine. I used N to determine the variance, same as Math Destroyer. I swear the QR part was just a clone of Math Destroyer, but easier.
 
Wow, that's really interesting. Do you recall whether both answers were among the choices, i.e. the one you get using N and the one you get using N-1?

I asked Chad this question, and he still suggests to use N-1. What to do! 😕
 
crazy, i just ran into this issue this morning lol. I think for the purposes of the DAT I'm going to go with n-1. Kinda arbitrary decision though, not basing it on facts. I think I just trust Chad more.
 