Inferential Statistics - Degrees Of Freedom


Degrees Of Freedom:

Degrees of freedom is the maximum number of logically independent values in a data sample, that is, the number of values that are free to vary.

Example:
Take a sample of 3 values {5, x, 15} whose mean is 10.
Here x must be 10, because the three values have to sum to 3 × 10 = 30.
But if 2 values of the sample are unknown, say {5, x, y} with the same mean of 10, we can no longer be sure of the exact values of x and y.
The pair (x, y) could be (10, 15), (15, 10), (5, 20), (20, 5) or even (1, 24); any pair with x + y = 25 works.
So the mean alone does not determine the exact values of x and y.
In a sample of 3 values with a fixed mean, any 2 of the values are free to vary.
But the third value is not free to change: once the other two are chosen, it has to be whatever keeps the mean at 10. So this value depends on all the other values.
So the degrees of freedom of this sample of size 3 is 2.
The same holds for a sample of any size: with the mean fixed, exactly one value is determined by all the others.
So when the mean is the only constraint, the degrees of freedom is the sample size minus 1.
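
The example above can be sketched in a few lines of Python. This is only an illustration; the function name last_value is made up for this post and is not from any library. It shows that once the mean and n - 1 values are known, the remaining value has no freedom left.

```python
def last_value(known_values, mean, n):
    """Return the one remaining value of an n-value sample with a fixed mean.

    Hypothetical helper for illustration only.
    """
    if len(known_values) != n - 1:
        raise ValueError("exactly n - 1 values must be known")
    # All n values must sum to n * mean, so the last value is whatever is left.
    return n * mean - sum(known_values)


# {5, x, 15} with mean 10: x is forced to be 10.
print(last_value([5, 15], mean=10, n=3))

# {5, x, y} with mean 10: x can be chosen freely, and y then follows from it.
for x in (10, 15, 5, 20, 1):
    print(x, last_value([5, x], mean=10, n=3))
```

Each choice of x gives a different y, but y itself is never chosen. That is the one value that does not have the freedom to vary.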

Formula:
V = n – 1
V = Degrees of freedom
n = Sample size

In the above example, only one constraint is placed on the set: the mean is 10.
With 3 values and one constraint, the degrees of freedom is two.

If we denote the number of constraints by k, then
V = n – k
As the restrictions increase, the freedom is reduced.
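
As a small sketch of the formula V = n – k, the snippet below prints the degrees of freedom of a sample of size 3 as constraints are added. The function name degrees_of_freedom is an assumption made for this illustration, as is the rule that k cannot exceed n.

```python
def degrees_of_freedom(n, k=1):
    """Degrees of freedom V = n - k for n values under k constraints.

    Hypothetical helper for illustration only.
    """
    if not 0 <= k <= n:
        raise ValueError("k must be between 0 and n")
    return n - k


# Sample of size 3: each extra constraint removes one degree of freedom.
for k in range(4):
    print("k =", k, "V =", degrees_of_freedom(3, k))
```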

Sex \ Handedness | Right handed | Left handed | Total
Male             | 43           | 9           | 52
Female           | 44           | 4           | 48
Total            | 87           | 13          | 100

The above table is a 2 × 2 matrix. Each variable (Sex and Handedness) has 2 categories and 1 constraint (its Total), so each variable has 2 – 1 = 1 degree of freedom. For a table with c columns and r rows, the degrees of freedom is as given below:
V (nu) = (c – 1) (r – 1)
       = (2 – 1) (2 – 1)
       = 1
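
One way to see why the 2 × 2 table has a single degree of freedom is to fix the totals and choose just one cell; every other cell then follows. The sketch below does this with the totals from the handedness table. The function names contingency_dof and fill_2x2 are made up for this illustration.

```python
def contingency_dof(rows, cols):
    """Degrees of freedom of an r x c contingency table: (c - 1)(r - 1)."""
    return (cols - 1) * (rows - 1)


def fill_2x2(top_left, row_totals, col_totals):
    """Rebuild a 2 x 2 table from its totals and one freely chosen cell.

    Hypothetical helper for illustration only.
    """
    a = top_left
    b = row_totals[0] - a   # rest of the first row
    c = col_totals[0] - a   # rest of the first column
    d = row_totals[1] - c   # the last cell is forced by the second row total
    return [[a, b], [c, d]]


print(contingency_dof(2, 2))
# Choosing only "right-handed males = 43" recovers the whole table.
print(fill_2x2(43, row_totals=(52, 48), col_totals=(87, 13)))
```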

P.S. Contingency Table: In statistics, a contingency table (also known as a cross tabulation or crosstab) is a type of table in a matrix format that displays the (multivariate) frequency distribution of the variables.

Conclusion: 
The degrees of freedom of a data set is n – k, where n is the size of the data set and k is the number of constraints placed on it.
As the constraints increase, the freedom is reduced.
If more than one variable is combined into a contingency table, the degrees of freedom of the table is the product of the degrees of freedom of each variable.


