Are Stock Returns Normally Distributed?
December 20, 2020 by Chris
In a previous post we talked about the Higher Moments of a Distribution. We saw that skewness and kurtosis are two attributes that can help identify whether a distribution is normal or not: a normal distribution has skewness = 0 and kurtosis = 3.
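As a quick sanity check of those two reference values, here is a minimal from-scratch sketch that uses only the Python standard library. The helper names (`central_moment`, `sample_skewness`, `sample_kurtosis`) are illustrative and not part of any library, and they use the simple (biased) moment estimators, so on small samples they will differ slightly from bias-corrected library output.

```python
import random


def central_moment(xs, k):
    """k-th central moment: the mean of (x - mean)^k."""
    mu = sum(xs) / len(xs)
    return sum((x - mu) ** k for x in xs) / len(xs)


def sample_skewness(xs):
    """Third standardized moment (0 for a symmetric distribution)."""
    return central_moment(xs, 3) / central_moment(xs, 2) ** 1.5


def sample_kurtosis(xs):
    """Fourth standardized moment, Pearson definition (3 for a normal)."""
    return central_moment(xs, 4) / central_moment(xs, 2) ** 2


# Simulated draws from a standard normal (seeded for reproducibility)
rng = random.Random(42)
normal_sample = [rng.gauss(0.0, 1.0) for _ in range(100_000)]

print(sample_skewness(normal_sample))  # close to 0
print(sample_kurtosis(normal_sample))  # close to 3
```

Any real data set can be passed to the same two helpers; the further the results are from 0 and 3, the less normal the sample looks.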
Let's try this approach on the MSFT stock.
The first step is to fetch the data and plot the returns.
Package Installation
%pip install yahoofinancials

from yahoofinancials import YahooFinancials
import pandas as pd
import matplotlib
import matplotlib.pyplot as plt
import seaborn as sns
import dateutil.parser
import numpy as np

matplotlib.rcParams['figure.figsize'] = (10.0, 5.0)
matplotlib.style.use('ggplot')
def retrieve_stock_data(ticker, start, end):
    # Download daily prices and keep only the adjusted close
    json = YahooFinancials(ticker).get_historical_price_data(start, end, "daily")
    columns = ["adjclose"]  # ["open","close","adjclose"]
    df = pd.DataFrame(columns=columns)
    for row in json[ticker]["prices"]:
        d = dateutil.parser.isoparse(row["formatted_date"])
        df.loc[d] = [row["adjclose"]]  # [row["open"], row["close"], row["adjclose"]]
    df.index.name = "date"
    return df

def normal_rets(S):
    # Simple (arithmetic) returns
    return S.pct_change().dropna()

def log_rets(S):
    # Log returns; the first row is NaN and is dropped
    rets = np.log(S) - np.log(S.shift(1))
    return rets[1:]

stock_prices = retrieve_stock_data("MSFT", "2000-01-01", "2020-01-01")
rets = normal_rets(stock_prices).dropna()
rets.columns = ['returns']
rets.plot(figsize=(14,7))
plt.title("Daily returns", weight="bold");

Let's find skewness and kurtosis:
from scipy.stats import kurtosis, skew

skew(rets, bias=False)[0], kurtosis(rets, bias=False, fisher=False)[0]
(0.20887713542026032, 13.229622042763442)
It is clear that the MSFT stock returns (for that period) do not match the skewness and kurtosis of a normal distribution: the skewness is mildly positive and the kurtosis is far above 3. The same, of course, happens if we use the log returns instead.
log_msft_rets = log_rets(stock_prices).dropna()
skew(log_msft_rets, bias=False)[0], kurtosis(log_msft_rets, bias=False, fisher=False)[0]
(-0.12981025399984283, 12.81366131030108)
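The similar kurtosis for arithmetic and log returns is not surprising: a log return is just log(1 + arithmetic return), and for daily moves of a few percent the two are nearly identical. The short sketch below illustrates this with a handful of made-up prices (the `prices` list is hypothetical and has nothing to do with the MSFT data).

```python
import math

# Hypothetical closing prices, for illustration only
prices = [100.0, 101.0, 99.5, 102.0, 101.2]

# Arithmetic and log returns between consecutive prices
simple = [p1 / p0 - 1.0 for p0, p1 in zip(prices, prices[1:])]
log = [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

for r, l in zip(simple, log):
    # log return = log(1 + simple return); nearly equal for small moves
    print(f"simple={r:+.5f}  log={l:+.5f}  diff={r - l:.6f}")
```

The gap between the two grows roughly with the square of the move, so it only becomes visible on days with unusually large price changes.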
Normality Tests
There are several interrelated approaches to determining normality:
- Histogram with the normal curve superimposed. Unfortunately, there is no automated way to express the goodness of fit as a single value. This approach is mostly empirical and requires experience.
- Skewness & Kurtosis Tests.
- Normality plots. “Normal Q-Q Plot” provides a graphical way to determine the level of normality.
- Normality tests. The Kolmogorov-Smirnov test (K-S) and Shapiro-Wilk (S-W) test are designed to test normality by comparing your data to a normal distribution with the same mean and standard deviation as your sample. If the test is NOT significant (p-value above .05), we cannot reject normality, so the data are consistent with a normal distribution. If the test is significant (p-value below .05), then the data are non-normal.
Histogram & Normal PDF
from scipy.stats import norm

x = np.linspace(min(rets.returns.values), max(rets.returns.values))
ax = rets.plot(kind='hist', bins=500, density=True)
pdf_fitted = norm.pdf(x, *norm.fit(rets.returns.values))
pd.Series(pdf_fitted, x).plot(ax=ax)
plt.show()
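One rough way to put a number on what such a histogram shows is to count how many observations fall more than three standard deviations from the mean and compare that share with what a normal distribution allows (about 0.27%). The sketch below is only an illustration: `tail_fraction` is a hypothetical helper, and the `fat_tailed` sample is simulated as a mix of calm and volatile days, not real return data.

```python
import random
from statistics import NormalDist, mean, pstdev


def tail_fraction(xs, k=3.0):
    """Share of observations more than k standard deviations from the mean."""
    mu, sigma = mean(xs), pstdev(xs)
    return sum(abs(x - mu) > k * sigma for x in xs) / len(xs)


# Probability mass a normal distribution puts beyond 3 sigma (about 0.27%)
normal_expected = 2.0 * (1.0 - NormalDist().cdf(3.0))

# Stand-in for fat-tailed returns (assumption): 90% calm days, 10% volatile days
rng = random.Random(0)
fat_tailed = [rng.gauss(0.0, 0.03 if rng.random() < 0.1 else 0.01)
              for _ in range(5000)]

print(f"normal expectation: {normal_expected:.4%}")
print(f"fat-tailed sample:  {tail_fraction(fat_tailed):.4%}")
```

A share well above the normal expectation is the numeric counterpart of histogram bars sticking out beyond the fitted curve in the tails.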

Skewness & Kurtosis Tests
from scipy import stats

stats.kurtosistest(rets.returns)
KurtosistestResult(statistic=29.93227785492693, pvalue=7.484189304773088e-197)
stats.skewtest(rets.returns)
SkewtestResult(statistic=5.99114785753993, pvalue=2.083650895527666e-09)
QQ-Plot
from numpy.random import seed
from statsmodels.graphics.gofplots import qqplot
from matplotlib import pyplot

seed(1)
qqplot(rets.returns, line='s')
pyplot.show()

The Quantile-Quantile plot, as the name suggests, compares the quantiles of our data against those of the normal distribution. Here the tails of our data diverge considerably from the normal reference line. This is what we would expect from fat tails (a leptokurtic distribution)!
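To make the idea behind the plot concrete, the following sketch builds the Q-Q point pairs by hand with the standard library. It is an illustration under stated assumptions: `qq_points` is a hypothetical helper, the plotting position (i + 0.5) / n is one common convention among several, and the `fat_tailed` sample is simulated rather than taken from the MSFT returns.

```python
import random
from statistics import NormalDist, mean, pstdev


def qq_points(xs):
    """Pair each standardized, sorted observation with the normal quantile
    at plotting position (i + 0.5) / n."""
    mu, sigma, n = mean(xs), pstdev(xs), len(xs)
    std_normal = NormalDist()
    observed = sorted((x - mu) / sigma for x in xs)
    theoretical = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, observed))


# Simulated fat-tailed sample (assumption): occasional high-volatility draws
rng = random.Random(1)
fat_tailed = [rng.gauss(0.0, 3.0 if rng.random() < 0.1 else 1.0)
              for _ in range(2000)]

points = qq_points(fat_tailed)
# Show the three most extreme points in each tail
for t, o in points[:3] + points[-3:]:
    print(f"theoretical={t:+.2f}  observed={o:+.2f}")
```

For a normal sample the two columns would stay close to each other; with fat tails the observed extremes lie well beyond the theoretical ones, which is the divergence seen at both ends of the plot.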
Statistical Normality Tests
The tests assume that the sample was drawn from a Gaussian distribution. Technically this is called the null hypothesis, or H0. A threshold level called alpha is chosen, typically 5% (or 0.05), and is used to interpret the p-value.
In the SciPy implementation of these tests, you can interpret the p-value as follows.
- p <= alpha: reject H0, not normal.
- p > alpha: fail to reject H0, consistent with normal.
This means that, in general, a larger p-value is what we would expect to see if our sample was drawn from a Gaussian distribution.
A result above 5% does not mean that the null hypothesis is true. It only means that the available evidence is not strong enough to reject it. The p-value is not the probability that the data follow a Gaussian distribution; it is the probability of observing a test statistic at least as extreme as ours if the null hypothesis were true.
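The decision rule can be written as a tiny helper. This is only a sketch: `interpret` is a hypothetical function name, and the two p-values passed to it are the ones reported by the skewness and kurtosis tests earlier in the post.

```python
def interpret(p_value, alpha=0.05):
    """Translate a normality-test p-value into a decision about H0."""
    if p_value <= alpha:
        return "reject H0: evidence against normality"
    return "fail to reject H0: no evidence against normality"


# p-values reported by the skewness and kurtosis tests earlier in the post
print(interpret(2.083650895527666e-09))
print(interpret(7.484189304773088e-197))
```

Note that the second branch deliberately says "no evidence against", not "normal".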
Kolmogorov-Smirnov test (K-S)
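# Compare against a normal with the sample mean and standard deviation;
# 'norm' without args would mean the standard normal N(0, 1)
kstest = stats.kstest(rets.returns, 'norm', args=(rets.returns.mean(), rets.returns.std()))
kstest.pvalue > 0.05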
False
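With K-S, the reference distribution matters a great deal. In SciPy, `kstest(x, 'norm')` with no `args` compares the sample against the standard normal N(0, 1), so daily returns with a standard deviation of a few percent would be rejected on scale alone. The from-scratch sketch below illustrates this with the K-S statistic D, the largest gap between the empirical CDF and the reference CDF. The helper `ks_statistic` is hypothetical and the sample is simulated, truly normal data with an assumed daily scale.

```python
import random
from statistics import NormalDist, mean, stdev


def ks_statistic(xs, cdf):
    """Largest gap between the empirical CDF of xs and a reference CDF."""
    xs = sorted(xs)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


# Simulated, truly normal "returns" with an assumed daily scale
rng = random.Random(7)
sample = [rng.gauss(0.0005, 0.02) for _ in range(2000)]

d_standard = ks_statistic(sample, NormalDist(0.0, 1.0).cdf)
d_fitted = ks_statistic(sample, NormalDist(mean(sample), stdev(sample)).cdf)

print(f"D vs N(0, 1):        {d_standard:.3f}")
print(f"D vs fitted normal:  {d_fitted:.3f}")
```

Even though the sample really is normal, D against N(0, 1) is huge, while D against the fitted normal is small. One caveat: when the mean and standard deviation are estimated from the same sample, the standard K-S p-value is only approximate (the Lilliefors variant of the test addresses this).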
Shapiro-Wilk Test
shapiro_stat, shapiro_p = stats.shapiro(rets.returns)
shapiro_p > 0.05
False
D’Agostino’s K^2 Test
D’Agostino’s K^2 test (named for Ralph D’Agostino) calculates summary statistics from the data, namely kurtosis and skewness, to determine if the data distribution departs from the normal distribution.
- Skew is a quantification of how much a distribution is pushed left or right, a measure of asymmetry in the distribution.
- Kurtosis quantifies how much of the distribution is in the tail.
It is a simple and commonly used statistical test for normality.
dagostino_stat, dagostino_p = stats.normaltest(rets.returns)
dagostino_p > 0.05
False
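The K^2 statistic is simply the sum of the squared z-scores from the skewness test and the kurtosis test, compared against a chi-squared distribution with 2 degrees of freedom. As a small sketch, we can rebuild it from the two test statistics reported in the Skewness & Kurtosis Tests section, using only the standard library (for 2 degrees of freedom the chi-squared survival function has the closed form exp(-x / 2)).

```python
import math

# z-scores reported by stats.skewtest and stats.kurtosistest earlier in the post
z_skew = 5.99114785753993
z_kurt = 29.93227785492693

# K^2 is the sum of the two squared z-scores
k2 = z_skew ** 2 + z_kurt ** 2

# Under H0, K^2 follows a chi-squared distribution with 2 degrees of freedom,
# whose survival function has the closed form exp(-x / 2)
p_value = math.exp(-k2 / 2.0)

print(f"K^2 = {k2:.2f}")
print(p_value > 0.05)  # False
```

The kurtosis term dominates the sum, which matches what the Q-Q plot showed: the departure from normality is mostly in the tails.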
Jarque-Bera Test for Normality
jarque_bera_stat, jarque_bera_p = stats.jarque_bera(rets.returns)
jarque_bera_p > 0.05
False
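The Jarque-Bera statistic is built from the same two ingredients, skewness S and kurtosis K: JB = n/6 * (S^2 + (K - 3)^2 / 4), again compared against a chi-squared distribution with 2 degrees of freedom. Below is a minimal from-scratch sketch of that formula. The function is a local illustration rather than the SciPy implementation, the p-value is the asymptotic one, and the `fat_tailed` sample is simulated.

```python
import math
import random


def jarque_bera(xs):
    """Return (JB statistic, asymptotic p-value) from biased sample moments."""
    n = len(xs)
    mu = sum(xs) / n
    m2 = sum((x - mu) ** 2 for x in xs) / n
    m3 = sum((x - mu) ** 3 for x in xs) / n
    m4 = sum((x - mu) ** 4 for x in xs) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    # Chi-squared with 2 degrees of freedom: survival function is exp(-x / 2)
    return jb, math.exp(-jb / 2.0)


# Simulated fat-tailed sample (assumption): 90% calm days, 10% volatile days
rng = random.Random(3)
fat_tailed = [rng.gauss(0.0, 0.03 if rng.random() < 0.1 else 0.01)
              for _ in range(2000)]

jb, p = jarque_bera(fat_tailed)
print(f"JB = {jb:.1f}, p > 0.05: {p > 0.05}")
```

Because the excess kurtosis enters squared and is scaled by the sample size, even moderately fat tails produce a very large JB value on a few thousand observations.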
Conclusion
In this article we went through some techniques that allow us to identify whether stock returns are normally distributed. We saw, with examples, that returns (arithmetic or log) are not normally distributed but instead exhibit fat tails. We cannot generalize, of course, just by looking at one stock, but I will leave that as a small exercise for curious readers.
The question remains: since the returns do not follow a normal distribution, what type of distribution do they follow?
The answer to that is in Fit Multiple Distributions to Asset Returns!