# Skewness

In probability theory and statistics, **skewness** is a measure of the asymmetry of the probability distribution of a real-valued random variable about its mean. Skewness can be positive, zero, or negative, and for some distributions it is undefined.

## Definition

The skewness of a random variable $X$ is the third standardized moment $\gamma_1$, defined as:
\begin{equation}
\tag{1}
\gamma_1 = \operatorname{E}\left[\left(\frac{X-\mu}{\sigma}\right)^3 \right]
= \frac{\mu_3}{\sigma^3}
= \frac{\operatorname{E}\left[(X-\mu)^3\right]}{\left( \operatorname{E}\left[ (X-\mu)^2 \right] \right)^{3/2}}
= \frac{\kappa_3}{\kappa_2^{3/2}}
\end{equation}
where $\mu$ is the mean, $\sigma$ is the standard deviation, $\operatorname{E}$ is the expectation operator, $\mu_3$ is the third central moment, and $\kappa_t$ is the $t^{\text{th}}$ cumulant.
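As an illustration of equation (1), the following minimal sketch evaluates the third standardized moment of a discrete random variable directly from its probability mass function. The function name `population_skewness` and the Bernoulli example are assumptions made here for illustration; they are not part of stats++.

```python
import math


def population_skewness(values, probs):
    """Third standardized moment of a discrete random variable.

    `values` are the possible outcomes and `probs` their probabilities.
    Hypothetical helper, written only to illustrate equation (1).
    """
    mu = sum(p * x for x, p in zip(values, probs))
    var = sum(p * (x - mu) ** 2 for x, p in zip(values, probs))
    if var == 0:
        # A degenerate distribution has sigma = 0, so the ratio is undefined.
        raise ValueError("skewness is undefined when the variance is zero")
    mu3 = sum(p * (x - mu) ** 3 for x, p in zip(values, probs))
    return mu3 / var ** 1.5


# Bernoulli(p): the known closed form is (1 - 2p) / sqrt(p (1 - p)).
p = 0.2
gamma_1 = population_skewness([0, 1], [1 - p, p])
closed_form = (1 - 2 * p) / math.sqrt(p * (1 - p))
print(gamma_1, closed_form)
```

For $p = 0.2$ the distribution has a long right tail, so both expressions give a positive value.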

## Sample Skewness

There are several ways to define **sample skewness**.

In many older texts, (sample) skewness is given by:
\begin{equation}
\tag{2}
g_1 = \frac{m_3}{m_2^{3/2}}
\end{equation}
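where $m_t$ is the $t^{\text{th}}$ sample central moment, taken about the sample mean $\bar{x}$ of the $n$ observations $x_1, \dots, x_n$:
\begin{equation}
\tag{3}
m_t = \frac{1}{n} \sum_{i=1}^{n} \left( x_i - \bar{x} \right)^t
\end{equation}
Note that $m_t$ is a biased estimator of the population central moment $\mu_t$.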
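Equations (2) and (3) can be sketched in a few lines of Python. This sketch assumes that deviations are taken about the sample mean; the names `sample_moment` and `skewness_g1` and the data are hypothetical and chosen only for this example.

```python
def sample_moment(xs, t):
    """t-th sample central moment m_t with the 1/n normalization of equation (3)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** t for x in xs) / n


def skewness_g1(xs):
    """Sample skewness g_1 = m_3 / m_2^(3/2), as in equation (2)."""
    m2 = sample_moment(xs, 2)
    if m2 == 0:
        # All observations are identical, so the ratio is undefined.
        raise ValueError("g_1 is undefined when all observations are equal")
    return sample_moment(xs, 3) / m2 ** 1.5


data = [1.0, 2.0, 3.0, 4.0, 10.0]  # one large value pulls the right tail out
print(skewness_g1(data))
```

For this data the mean is 4, $m_2 = 10$ and $m_3 = 36$, so $g_1 = 36 / 10^{3/2}$, which is positive as expected for a right-skewed sample.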

One way to remove the bias in $g_1$ is:
\begin{equation}
\tag{4}
G_1 = \frac{\sqrt{n(n-1)}}{n-2} g_1
\end{equation}

Note that this is not the only possibility. Another is given by:
\begin{equation}
\tag{5}
b_1 = \left( \frac{n-1}{n} \right)^{3/2} \frac{m_3}{m_2^{3/2}}
\end{equation}
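Both $G_1$ and $b_1$ are rescalings of $g_1$ by a factor that depends only on the sample size $n$. The sketch below makes this explicit; the function names are hypothetical and the code is not taken from stats++.

```python
import math


def skewness_g1(xs):
    """g_1 = m_3 / m_2^(3/2), with moments about the sample mean."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    if m2 == 0:
        raise ValueError("skewness is undefined when all observations are equal")
    return m3 / m2 ** 1.5


def skewness_G1(xs):
    """G_1 of equation (4); the factor requires n > 2."""
    n = len(xs)
    if n < 3:
        raise ValueError("G_1 needs at least three observations")
    return math.sqrt(n * (n - 1)) / (n - 2) * skewness_g1(xs)


def skewness_b1(xs):
    """b_1 of equation (5)."""
    n = len(xs)
    return ((n - 1) / n) ** 1.5 * skewness_g1(xs)


data = [1.0, 2.0, 3.0, 4.0, 10.0]
print(skewness_b1(data), skewness_g1(data), skewness_G1(data))
```

Because the factor in equation (4) is greater than 1 and the factor in equation (5) is less than 1, the three values always satisfy $|b_1| \le |g_1| \le |G_1|$ for the same sample.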

These measures are compared by Joanes and Gill [1]. For samples from a normal distribution, both measures are unbiased. For large sample sizes, the measures differ very little; the differences matter for small samples. For samples from a normal distribution, $b_1$ has the lowest mean squared error (MSE); for samples from an asymmetric distribution, $G_1$ has the lowest MSE. Whatever distribution is being sampled, $G_1$ has the greatest variance.
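The variance ordering follows from the fact that, for a fixed $n$, $G_1$ and $b_1$ are constant multiples of $g_1$. A small Monte Carlo run can illustrate this for normal samples. The following is only a sketch: the sample size, number of trials, and seed are arbitrary assumptions, and it does not reproduce the study in the cited paper.

```python
import math
import random
import statistics


def g1(xs):
    """g_1 = m_3 / m_2^(3/2), with moments about the sample mean."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


def simulate(n=10, trials=2000, seed=0):
    """Estimate variance and MSE of g_1, G_1 and b_1 for standard normal samples."""
    rng = random.Random(seed)
    to_G1 = math.sqrt(n * (n - 1)) / (n - 2)  # factor from equation (4)
    to_b1 = ((n - 1) / n) ** 1.5              # factor from equation (5)
    estimates = {"g1": [], "G1": [], "b1": []}
    for _ in range(trials):
        g = g1([rng.gauss(0.0, 1.0) for _ in range(n)])
        estimates["g1"].append(g)
        estimates["G1"].append(to_G1 * g)
        estimates["b1"].append(to_b1 * g)
    true_skewness = 0.0  # the normal distribution is symmetric
    return {
        name: {
            "variance": statistics.pvariance(vals),
            "mse": sum((v - true_skewness) ** 2 for v in vals) / trials,
        }
        for name, vals in estimates.items()
    }


results = simulate()
for name, stats in results.items():
    print(name, stats)
```

In such a run, $b_1$ shows the smallest and $G_1$ the largest variance and MSE, since the true skewness is zero and each estimate is a fixed multiple of $g_1$.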

## stats++

stats++ implements the sample skewness $G_1$ of equation (4).

## Notes and references

- [1] D. N. Joanes and C. A. Gill, "Comparing measures of sample skewness and kurtosis," *The Statistician* **47**, 183–189 (1998).