# Variance and Standard deviation

## Variance

The variance of a data set is the arithmetic mean of the squared deviations from the mean. It is denoted by $\sigma^2$ and is calculated with the formula $$\sigma^2=\displaystyle \frac{\displaystyle\sum_{i=1}^N (x_i-\overline{x})^2}{N}=\frac{(x_1-\overline{x})^2+(x_2-\overline{x})^2+\ldots+(x_N-\overline{x})^2}{N}$$ which can be simplified to: $$\sigma^2=\displaystyle \frac{\displaystyle\sum_{i=1}^N x_i^2}{N}-\overline{x}^2=\frac{x_1^2+x_2^2+\ldots+x_N^2}{N}-\overline{x}^2$$
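The simplified form follows from expanding the square. A short derivation, assuming only the usual definition of the average, $\overline{x}=\frac{1}{N}\sum_{i=1}^N x_i$:

$$\begin{aligned}\frac{1}{N}\sum_{i=1}^N (x_i-\overline{x})^2 &= \frac{1}{N}\sum_{i=1}^N x_i^2-2\overline{x}\cdot\frac{1}{N}\sum_{i=1}^N x_i+\frac{1}{N}\sum_{i=1}^N \overline{x}^2\\ &= \frac{1}{N}\sum_{i=1}^N x_i^2-2\overline{x}^2+\overline{x}^2\\ &= \frac{1}{N}\sum_{i=1}^N x_i^2-\overline{x}^2\end{aligned}$$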

As with the average, it is not always possible to find the variance, and it is a parameter that is very sensitive to extreme values. Because the deviations are squared, the variance is not expressed in the same units as the data but in the square of those units.

When comparing data of the same type, a high variance means that the data is more dispersed, while a low variance indicates that the values are, in general, closer to the average.

A value of the variance equal to zero means that all the values are equal, and therefore they are also equal to the arithmetical average.

In a basketball match, the players of a team scored the following points: $0, 2, 4, 5, 8, 10, 10, 15, 38$. Calculate the variance of the players' scores.

First we obtain the average: $\overline{x}=\displaystyle \frac{0+2+4+5+8+10+10+15+38}{9}=\frac{92}{9}\approx 10.22$.

Next we apply the formula of the variance: $$\begin{aligned}\sigma^2&=\frac{(0-10.22)^2+(2-10.22)^2+(4-10.22)^2+(5-10.22)^2+(8-10.22)^2+(10-10.22)^2+(10-10.22)^2+(15-10.22)^2+(38-10.22)^2}{9}\\&=\frac{10.22^2+8.22^2+6.22^2+5.22^2+2.22^2+0.22^2+0.22^2+4.78^2+27.78^2}{9}\\&=\frac{104.4484+67.5684+38.6884+27.2484+4.9284+0.0484+0.0484+22.8484+771.7284}{9}\\&=\frac{1037.5556}{9}\approx 115.28\end{aligned}$$
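The calculation above can be checked numerically. The following is a minimal Python sketch, not part of the original exercise; the variable names are illustrative, and `statistics.pvariance` from the standard library is used only as a cross-check of the population variance (division by $N$).

```python
from statistics import pvariance

points = [0, 2, 4, 5, 8, 10, 10, 15, 38]
n = len(points)
mean = sum(points) / n

# Definition: mean of the squared deviations from the average.
var_definition = sum((x - mean) ** 2 for x in points) / n

# Simplified formula: mean of the squares minus the square of the average.
var_shortcut = sum(x ** 2 for x in points) / n - mean ** 2

print(round(mean, 2))               # 10.22
print(round(var_definition, 2))     # 115.28
print(round(var_shortcut, 2))       # 115.28
print(round(pvariance(points), 2))  # 115.28
```

Both formulas agree up to floating-point rounding, which is why the simplified one is usually preferred for hand calculation.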

### Calculation of the variance for grouped information

For $N$ samples grouped in $n$ classes, where $x_i$ is the class mark (midpoint) of class $i$ and $f_i$ is its frequency, the formula is: $$\sigma^2=\displaystyle \frac{\displaystyle\sum_{i=1}^n (x_i-\overline{x})^2 f_i}{N}=\frac{(x_1-\overline{x})^2f_1+(x_2-\overline{x})^2f_2+\ldots+(x_n-\overline{x})^2f_n}{N}$$ which simplifies to: $$\sigma^2=\displaystyle \frac{\displaystyle\sum_{i=1}^n x_i^2f_i}{N}-\overline{x}^2=\frac{x_1^2f_1+x_2^2f_2+\ldots+x_n^2f_n}{N}-\overline{x}^2$$ The result is interpreted in the same way as for ungrouped data.

The heights, in cm, of the players of a basketball team are given in the following table. Calculate the variance.

| Height (cm) | $x_i$ | $f_i$ |
|---|---|---|
| $[160,170)$ | $165$ | $1$ |
| $[170,180)$ | $175$ | $2$ |
| $[180,190)$ | $185$ | $4$ |
| $[190,200)$ | $195$ | $3$ |
| $[200,210)$ | $205$ | $2$ |

First of all, fill in the following table, whose last row contains the column totals:

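| Height (cm) | $x_i$ | $f_i$ | $x_if_i$ | $x_i^2f_i$ |
|---|---|---|---|---|
| $[160,170)$ | $165$ | $1$ | $165$ | $27225$ |
| $[170,180)$ | $175$ | $2$ | $350$ | $61250$ |
| $[180,190)$ | $185$ | $4$ | $740$ | $136900$ |
| $[190,200)$ | $195$ | $3$ | $585$ | $114075$ |
| $[200,210)$ | $205$ | $2$ | $410$ | $84050$ |
| Total | | $12$ | $2250$ | $423500$ |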

To apply the formula, we first need the average: $$\displaystyle \overline{x}=\frac{2250}{12}=187.5$$

The variance is then $$\displaystyle \sigma^2=\frac{423500}{12}-187.5^2\approx 135.42$$
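The grouped calculation can be reproduced with a few lines of Python. This is a sketch under the assumption that each class is represented by its midpoint $x_i$, as in the table; the list names `marks` and `freqs` are illustrative.

```python
# Class marks (midpoints) and frequencies from the height table.
marks = [165, 175, 185, 195, 205]
freqs = [1, 2, 4, 3, 2]

total = sum(freqs)                                      # N
mean = sum(x * f for x, f in zip(marks, freqs)) / total
sum_sq = sum(x ** 2 * f for x, f in zip(marks, freqs))  # sum of x_i^2 * f_i

# Simplified formula for grouped data.
variance = sum_sq / total - mean ** 2

print(total, mean, sum_sq)  # 12 187.5 423500
print(round(variance, 2))   # 135.42
```

Since the individual heights are replaced by class midpoints, the result is an approximation of the variance of the actual heights.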

### Properties of the variance

1. $\sigma^2 \geq 0$. The variance is never negative, since it is a mean of squares, and it equals zero only when all the samples are equal.
2. If we add a constant to all the data, the variance doesn't change.
3. If all the data is multiplied by a constant, the variance is multiplied by the square of the constant.
4. If we have several distributions of the same size with the same average and we calculate their variances, we can find the total variance by applying the formula $$\sigma^2=\displaystyle \frac{\sigma_1^2+\sigma_2^2+\ldots+\sigma_n^2}{n}$$ If the distributions have different sizes $k_1, k_2, \ldots, k_n$, the formula becomes a weighted mean: $$\sigma^2=\displaystyle \frac{\sigma_1^2k_1+\sigma_2^2k_2+\ldots+\sigma_n^2k_n}{k_1+k_2+\ldots+k_n}$$
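Properties 2 to 4 can be verified numerically. The sketch below reuses the basketball scores and two small made-up groups (`group_a` and `group_b`, both with average 5, chosen only for illustration); the constants 7 and 3 are arbitrary.

```python
from math import isclose
from statistics import pvariance

data = [0, 2, 4, 5, 8, 10, 10, 15, 38]
base = pvariance(data)

# Property 2: adding a constant leaves the variance unchanged.
shifted = pvariance([x + 7 for x in data])
print(isclose(shifted, base))        # True

# Property 3: multiplying by 3 multiplies the variance by 3**2 = 9.
scaled = pvariance([3 * x for x in data])
print(isclose(scaled, 9 * base))     # True

# Property 4: two groups with the same average but different sizes.
group_a = [4, 6]        # average 5, variance 1, size 2
group_b = [2, 5, 5, 8]  # average 5, variance 4.5, size 4
pooled = (
    pvariance(group_a) * len(group_a) + pvariance(group_b) * len(group_b)
) / (len(group_a) + len(group_b))
print(isclose(pooled, pvariance(group_a + group_b)))  # True
```

The weighted formula only reproduces the variance of the merged data because both groups share the same average; with different averages the spread between the groups would also have to be counted.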

In an exam, all the students got a ten. Find the variance of the marks.

Since all the values are the same, the average is equal to that common value, $\overline{x}=10$, and the variance is zero: $\sigma^2=0$.

## Standard deviation

The standard deviation is the square root of the variance and it is represented by the letter $\sigma$. To calculate it, the variance is calculated first and then its square root is taken. Unlike the variance, it is expressed in the same units as the data. The interpretations that are deduced from the standard deviation are, therefore, similar to those that were deduced from the variance.

When comparing data of the same type, a high standard deviation means that the data is dispersed, while a low value indicates that the values are close together and, therefore, close to the average.
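As an illustration, the standard deviations of the two earlier examples (the basketball scores and the grouped heights) can be obtained by taking the square root of their variances. This is a minimal sketch; `statistics.pstdev` is used only as a cross-check for the ungrouped data.

```python
from math import sqrt
from statistics import pstdev

# Ungrouped data: points scored by the players.
points = [0, 2, 4, 5, 8, 10, 10, 15, 38]
mean = sum(points) / len(points)
variance = sum((x - mean) ** 2 for x in points) / len(points)
std_points = sqrt(variance)

print(round(std_points, 2))      # 10.74 (points)
print(round(pstdev(points), 2))  # 10.74

# Grouped data: heights, using the totals from the table.
std_heights = sqrt(423500 / 12 - 187.5 ** 2)
print(round(std_heights, 2))     # 11.64 (cm)
```

The results are in points and in cm respectively, so they can be compared directly with the averages $10.22$ and $187.5$.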

### Properties of standard deviation

1. $\sigma \geq 0$. The standard deviation is never negative, and it equals zero only when all the samples are equal.
2. If we add a constant to all the data, the standard deviation doesn't change.
3. If all the data is multiplied by a constant, the standard deviation is multiplied by the absolute value of the constant.
4. If we have several distributions of the same size with the same average and we calculate their standard deviations, we can find the total standard deviation by applying the formula $$\sigma=\displaystyle \sqrt{\frac{\sigma_1^2+\sigma_2^2+\ldots+\sigma_n^2}{n}}$$ If the distributions have different sizes $k_1, k_2, \ldots, k_n$, the formula becomes $$\sigma=\displaystyle \sqrt{\frac{\sigma_1^2k_1+\sigma_2^2k_2+\ldots+\sigma_n^2k_n}{k_1+k_2+\ldots+k_n}}$$
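Two points in this list are easy to get wrong: a negative constant still gives a non-negative standard deviation, and in property 4 it is the variances that are averaged, not the standard deviations. The sketch below illustrates both with made-up groups (`group_a` and `group_b`, both with average 5); all names and constants are illustrative.

```python
from math import isclose, sqrt
from statistics import pstdev

data = [0, 2, 4, 5, 8, 10, 10, 15, 38]
base = pstdev(data)

# Property 3 with a negative constant: the factor is |-3| = 3.
scaled = pstdev([-3 * x for x in data])
print(isclose(scaled, 3 * base))  # True

# Property 4: combine through the variances, weighted by group size.
group_a = [4, 6]        # average 5, size 2
group_b = [2, 5, 5, 8]  # average 5, size 4
k_a, k_b = len(group_a), len(group_b)
combined = sqrt(
    (pstdev(group_a) ** 2 * k_a + pstdev(group_b) ** 2 * k_b) / (k_a + k_b)
)
print(isclose(combined, pstdev(group_a + group_b)))  # True

# Averaging the standard deviations directly gives a different value.
naive = (pstdev(group_a) * k_a + pstdev(group_b) * k_b) / (k_a + k_b)
print(isclose(naive, combined))   # False
```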