Variance is the average of the squared deviations of a set of numbers from their mean value. It describes how far the data are dispersed around their mean. Variance is always a non-negative number, and the population variance is denoted by σ² (sigma squared).
[latexpage]
\[
\sigma^2 = \frac{1}{N}\sum_{i = 1}^{N} (x_i - \mu)^2
\]
Where
\(x_i\) = data set values
\(\mu\) = mean value
\(N\) = total number of observations
Our observations are 3, 4, 6, 7, 8, 10, 11.
Let’s calculate the mean of the data. The mean value in this case is (3 + 4 + 6 + 7 + 8 + 10 + 11) / 7 = 49 / 7 = 7
Next, calculate the squared deviation of each observation from the mean value µ, sum the squared deviations, and divide the sum by the total number of observations: [(3-7)^2 + (4-7)^2 + (6-7)^2 + (7-7)^2 + (8-7)^2 + (10-7)^2 + (11-7)^2] / 7 = [16 + 9 + 1 + 0 + 1 + 9 + 16] / 7 = 52 / 7 ≈ 7.429. Hence the population variance in this example is about 7.429.
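The hand calculation above can be reproduced step by step in R. This is only a minimal sketch: the variable names mu, sq_dev and pop_var are chosen here for illustration, and the divisor is N (the population formula), not the n - 1 that R's var() uses.

```r
# Population variance computed by hand (divides by N, not N - 1)
v <- c(3, 4, 6, 7, 8, 10, 11)

mu <- mean(v)                        # mean of the observations
sq_dev <- (v - mu)^2                 # squared deviation of each observation
pop_var <- sum(sq_dev) / length(v)   # average of the squared deviations

print(mu)
#[1] 7
print(sum(sq_dev))
#[1] 52
print(pop_var)
#[1] 7.428571
```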
We will compare the above value with the variance calculated by R, as follows:
#calculate variance
v <- c(3, 4, 6, 7, 8, 10, 11)
print(var(v))
#[1] 8.666667
- v is a variable holding a numeric vector created with the c() function
- The result of var(v) is displayed using print()
We see that the variance calculated by R is 8.666667, not 7.429. This is because var() uses n - 1 as the divisor, treating the data as a sample rather than the whole population: 52 / 6 ≈ 8.667.
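To make the effect of the n - 1 divisor concrete, the sketch below recomputes the sample variance by hand and then rescales the result of var() to the population variance. The names n, sample_var and pop_var are illustrative, and the rescaling by (n - 1) / n is just the ratio of the two divisors, not a separate R function.

```r
# Sample variance (divisor n - 1) versus population variance (divisor n)
v <- c(3, 4, 6, 7, 8, 10, 11)
n <- length(v)

# Same definition that var() applies: sum of squared deviations / (n - 1)
sample_var <- sum((v - mean(v))^2) / (n - 1)

# Rescale the sample variance so that the divisor becomes n
pop_var <- var(v) * (n - 1) / n

print(all.equal(sample_var, var(v)))
#[1] TRUE
print(pop_var)
#[1] 7.428571
```

For large n the two values are nearly identical. With only seven observations the difference is easy to see.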
Standard deviation = square root of the variance. Using the sample variance computed by R: \(\sqrt{8.666667}\) ≈ 2.944
R provides a built-in function, sd(), to compute the standard deviation, which measures how far the data are spread around the mean. Like var(), sd() uses n - 1 as the divisor.
#calculate standard deviation
v <- c(3, 4, 6, 7, 8, 10, 11)
print(sd(v))
#[1] 2.94392
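As a quick check, sd() should agree with the square root of var(). The sketch below also derives the population standard deviation, assuming the same (n - 1) / n rescaling of the variance. The names sample_sd and pop_sd are illustrative.

```r
# sd() is the square root of the sample variance
v <- c(3, 4, 6, 7, 8, 10, 11)
n <- length(v)

sample_sd <- sqrt(var(v))             # should match sd(v)
pop_sd <- sqrt(var(v) * (n - 1) / n)  # population version, divisor n

print(all.equal(sample_sd, sd(v)))
#[1] TRUE
print(pop_sd)
```

The population standard deviation is the square root of 52 / 7, so it comes out slightly smaller than the sample value of 2.94392.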


