Is a Stock Normally Distributed? QQ Plots and How to Compute Skewness and Kurtosis

Fig. 18 Is a stock normally distributed? QQ plots, skewness, and kurtosis

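Introduction

Most of us know a stock only through the basic information shown in trading software such as 东方财富 (East Money) or 通达信 (TongDaXin). Beyond showing the price trend, the daily chart is mostly used by retail investors for technical indicators such as MACD and KDJ. In fact, a stock's historical prices also let us compute its returns, volatility, skewness, and kurtosis, and test whether its returns are normally distributed.

This article shows how to test for normality, draw QQ plots, and compute skewness and kurtosis.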

Implementation

import tushare as ts
import pandas as pd
import numpy as np
import scipy.stats
import matplotlib.pyplot as plt
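plt.style.use('seaborn-white')  # named 'seaborn-v0_8-white' in Matplotlib 3.6 and later
plt.rcParams['font.sans-serif'] = ['SimHei']  # font with CJK glyphs, for Chinese plot labels
plt.rcParams['axes.unicode_minus'] = False  # render the minus sign correctly with that font

pro = ts.pro_api('xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx')  # replace with your registered Tushare token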

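We call Tushare to fetch the data for stock 601012 (隆基股份, LONGi). No date range is set, so Tushare returns the history it provides by default.

ticker_data = pro.daily(ts_code='601012.SH')
print('Number of rows:', len(ticker_data))
ticker_data.head(5)
Number of rows: 2430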
     ts_code trade_date   open   high    low  close  pre_close  change  pct_chg         vol       amount
0  601012.SH   20220609  60.21  61.59  59.90  60.98      60.21    0.77   1.2789   794660.60  4828508.525
1  601012.SH   20220608  61.08  61.30  59.01  60.21      60.63   -0.42  -0.6927   974318.52  5842896.395
2  601012.SH   20220607  62.70  63.56  60.60  60.63      60.62    0.01   0.0165  1467264.39  9075963.991
3  601012.SH   20220606  57.72  62.30  57.00  60.62      56.94    3.68   6.4629  1068502.03  6382247.895
4  601012.SH   20220602  79.00  80.17  78.00  79.98      78.70    1.28   1.6264   377225.29  3003191.420

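The output shows that the rows are indexed by integer position rather than by date. We therefore convert trade_date to datetime and set it as the index, which makes plotting easier.

ticker_data['trade_date'] = pd.to_datetime(ticker_data['trade_date'], format='%Y%m%d')
ticker_data.set_index('trade_date', inplace=True)
# Caution: the rows arrive newest first (see the table above), so pct_change()
# compares each close with the close of the *next* trading day. Call
# ticker_data.sort_index(inplace=True) beforehand for conventional day-over-day returns.
returns = ticker_data["close"].pct_change().dropna()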

plt.figure(figsize=(15, 5))
plt.title("Stock code: 601012 - LONGi", weight='bold')
ticker_data['close'].plot()
../_images/stock-skew-kurtosis-qq_6_2.png

Next, we plot the daily percentage change of the closing price:

plt.figure(figsize=(15, 5))
ticker_data["close"].pct_change().plot()
plt.title("Stock code: 601012 - LONGi", weight='bold');
../_images/stock-skew-kurtosis-qq_8_1.png
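
The table printed earlier lists the newest trading day first, and on 2022-06-06 the `pre_close` value (56.94) differs from the previous day's `close` (79.98), which suggests a price adjustment such as an ex-rights event. The sketch below rebuilds those five rows by hand (the small `sample` frame is an illustration, not a Tushare call) and compares returns computed from `close` in chronological order with returns taken from the `pct_chg` column. Whether `pct_chg` is the better input for a given analysis is an assumption to check against the data provider's documentation.

```python
import pandas as pd

# Five rows copied from the table above (newest first, as returned).
sample = pd.DataFrame({
    "trade_date": ["20220609", "20220608", "20220607", "20220606", "20220602"],
    "close":      [60.98, 60.21, 60.63, 60.62, 79.98],
    "pre_close":  [60.21, 60.63, 60.62, 56.94, 78.70],
    "pct_chg":    [1.2789, -0.6927, 0.0165, 6.4629, 1.6264],
})
sample["trade_date"] = pd.to_datetime(sample["trade_date"], format="%Y%m%d")

# Sort oldest-to-newest so each return compares a day with the previous one.
sample = sample.set_index("trade_date").sort_index()

# Return implied by the raw closing prices.
close_returns = sample["close"].pct_change()

# Return reported by the data source, converted from percent to a fraction.
reported_returns = sample["pct_chg"] / 100

comparison = pd.DataFrame({"from_close": close_returns,
                           "from_pct_chg": reported_returns})
print(comparison)
```

On 2022-06-06 the two columns disagree (roughly -24% from `close` versus +6.46% from `pct_chg`), while on the other days they agree to rounding. A single jump of that size can dominate skewness and kurtosis estimates.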

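Is It Normally Distributed?

Next, we use SciPy's jarque_bera function to test whether the stock's historical returns are normally distributed. Note: the test relies on an asymptotic distribution, so it is only reliable with a large sample (more than about 2,000 observations).

_, pvalue = scipy.stats.jarque_bera(returns)
print(pvalue)
if pvalue > 0.05:
    print('The data are consistent with a normal distribution')
else:
    print('The data are not normally distributed')
0.0
The data are not normally distributed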
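
To make the test less of a black box, here is a minimal sketch of what the Jarque-Bera statistic measures: JB = n/6 * (S^2 + K^2/4), where S is the sample skewness and K the sample excess kurtosis, compared against a chi-squared distribution with 2 degrees of freedom. The data are a synthetic normal sample, not the stock's returns, and the variable names are illustrative.

```python
import numpy as np
import scipy.stats

rng = np.random.default_rng(0)
x = rng.standard_normal(5000)   # synthetic sample, not stock data

n = x.size
S = scipy.stats.skew(x)         # sample skewness (biased estimator)
K = scipy.stats.kurtosis(x)     # sample excess kurtosis (0 for a normal)

# Jarque-Bera statistic and its asymptotic chi-squared(2) p-value.
jb_manual = n / 6 * (S**2 + K**2 / 4)
p_manual = scipy.stats.chi2.sf(jb_manual, df=2)

# Reference values from SciPy for comparison.
jb_scipy, p_scipy = scipy.stats.jarque_bera(x)
print("statistic:", jb_manual, jb_scipy)
print("p-value  :", p_manual, p_scipy)
```

Because the statistic grows with both squared skewness and squared excess kurtosis, a series with very heavy tails yields a huge JB value and a p-value that rounds to 0.0.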

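QQ Plot

A quantile-quantile (QQ) plot draws the quantiles of two distributions against each other. It can be used to assess whether a data set is normally distributed, or to check whether two data sets have similar distributions.

1. QQ plot of the stock's returns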

Q = returns.values
scipy.stats.probplot(Q, dist=scipy.stats.norm, plot=plt.figure(figsize=(15, 5)).add_subplot(111))
plt.title("LONGi: QQ plot of returns", weight="bold")
../_images/stock-skew-kurtosis-qq_15_2.png

2. QQ plot of a normal distribution

nsample = 1000

scipy.stats.probplot(scipy.stats.norm.rvs(loc=0, scale=1, size=nsample), dist=scipy.stats.norm, plot=plt.figure(figsize=(15, 5)).add_subplot(111))
plt.title("QQ plot of the standard normal distribution", weight="bold")
../_images/stock-skew-kurtosis-qq_17_2.png

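The two plots, one for the stock's returns and one for a normal sample, can be compared side by side: points drawn from a normal sample lie close to the reference line, while the stock's returns bend away from it.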
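
The visual comparison can also be expressed as a number. When `probplot` is called without `plot=`, it returns the theoretical quantiles, the ordered sample, and a least-squares fit whose correlation coefficient `r` indicates how closely the points follow a straight line. The sketch below uses two synthetic samples; the Student-t sample is only a stand-in for fat-tailed returns, not the stock data.

```python
import numpy as np
import scipy.stats

rng = np.random.default_rng(42)
normal_sample = rng.standard_normal(1000)
heavy_tailed = rng.standard_t(df=2, size=1000)   # stand-in for fat-tailed returns

# Without `plot=`, probplot only returns the numbers behind the figure.
(theory_q, ordered), (slope, intercept, r_normal) = scipy.stats.probplot(
    normal_sample, dist="norm")
_, (_, _, r_heavy) = scipy.stats.probplot(heavy_tailed, dist="norm")

print("r, normal sample      :", r_normal)
print("r, heavy-tailed sample:", r_heavy)
```

An `r` very close to 1 means the QQ points sit on the line; the heavy-tailed sample gives a lower value because its extreme quantiles curve away at both ends.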

Is It Log-Normally Distributed?

fig, ax = plt.subplots(figsize=(10, 6))

values = ticker_data["close"]

# Fit a three-parameter log-normal (shape, loc, scale) to the closing prices
shape, loc, scale = scipy.stats.lognorm.fit(values)
x = np.linspace(values.min(), values.max(), len(values))
pdf = scipy.stats.lognorm.pdf(x, shape, loc=loc, scale=scale)
# loc and scale are the fitted location and scale parameters, not the mean and standard deviation
label = 'loc=%.4f, scale=%.4f, shape=%.4f' % (loc, scale, shape)

ax.hist(values, bins=30, density=True)
ax.plot(x, pdf, 'r-', lw=2, label=label)
ax.legend(loc='best')
../_images/stock-skew-kurtosis-qq_20_1.png
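
The three numbers returned by `lognorm.fit` are easier to read once `loc` is pinned to 0: in SciPy's parameterization, `shape` is then the standard deviation of log(x) and `scale` is exp(mean of log(x)). The sketch below checks this on synthetic prices; `mu` and `sigma` are hypothetical values chosen for illustration, not estimates for the stock.

```python
import numpy as np
import scipy.stats

rng = np.random.default_rng(1)
mu, sigma = 3.0, 0.4     # hypothetical log-mean and log-std
prices = rng.lognormal(mean=mu, sigma=sigma, size=5000)

# Fix loc at 0 so the fit is a plain two-parameter log-normal.
shape, loc, scale = scipy.stats.lognorm.fit(prices, floc=0)

log_prices = np.log(prices)
print("shape      :", shape, " vs std(log x) :", log_prices.std())
print("log(scale) :", np.log(scale), " vs mean(log x):", log_prices.mean())
```

With a free `loc`, as in the cell above, the fitted `shape` and `scale` no longer map directly onto the log-mean and log-standard deviation, so they should be reported under their own names.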
values = returns
x = np.linspace(values.min(), values.max(), len(values))

# Fit a normal distribution to the returns; loc is the mean and scale the standard deviation
loc, scale = scipy.stats.norm.fit(values)
param_density = scipy.stats.norm.pdf(x, loc=loc, scale=scale)
label = 'mean=%.4f, std=%.4f' % (loc, scale)

fig, ax = plt.subplots(figsize=(10, 6))
ax.hist(values, bins=30, density=True)
ax.plot(x, param_density, 'r-', label=label)
ax.legend(loc='best')
../_images/stock-skew-kurtosis-qq_21_2.png

Skewness and Kurtosis

Next, we compute the skewness and kurtosis.

1. Skewness and kurtosis of the stock

print('Skew:', scipy.stats.skew(returns))
# scipy.stats.kurtosis returns excess kurtosis by default (0 for a normal distribution)
print('kurtosis:', scipy.stats.kurtosis(returns))
Skew: 20.54452375324835
kurtosis: 698.0614565305685
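
For reference, both quantities are ratios of central moments: skewness is m3 / m2^(3/2) and excess kurtosis is m4 / m2^2 - 3, where mk is the mean of (x - mean)^k. The sketch below computes them by hand on a synthetic right-skewed sample (exponential data chosen only for illustration) and compares the result with SciPy's default, biased estimators.

```python
import numpy as np
import scipy.stats

rng = np.random.default_rng(7)
x = rng.exponential(scale=1.0, size=10_000)   # synthetic right-skewed data

d = x - x.mean()
m2 = np.mean(d**2)   # second central moment (population variance)
m3 = np.mean(d**3)   # third central moment
m4 = np.mean(d**4)   # fourth central moment

skew_manual = m3 / m2**1.5
kurt_manual = m4 / m2**2 - 3    # subtract 3 so a normal distribution gives 0

print("skewness       :", skew_manual, scipy.stats.skew(x))
print("excess kurtosis:", kurt_manual, scipy.stats.kurtosis(x))
```

Positive skewness means the right tail is longer than the left; large positive excess kurtosis means far more extreme observations than a normal distribution would produce.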

Plotting a histogram of the historical returns gives a clearer picture of how they are distributed.

fig = plt.figure(figsize=(15, 5))
plt.hist(returns, 50)
plt.title("Histogram of returns", weight='bold', alpha=0.5)
plt.show()
../_images/stock-skew-kurtosis-qq_27_1.png

2. Skewness and kurtosis of a normal distribution

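plt.style.use('ggplot')

# Draw 10 million samples from a standard normal distribution
data = np.random.normal(0, 1, 10000000)

plt.hist(data, bins=50)

# For a normal distribution, skewness and excess kurtosis should both be close to 0
print("mean : ", np.mean(data))
print("var  : ", np.var(data))
print("skew : ", scipy.stats.skew(data))
print("kurt : ", scipy.stats.kurtosis(data))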
mean :  0.0005789707164294141
var  :  1.0003560755008827
skew :  0.0005938373756586869
kurt :  0.0014742612859635074
../_images/stock-skew-kurtosis-qq_29_2.png
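_, pvalue = scipy.stats.jarque_bera(data)
print(pvalue)
if pvalue > 0.05:
    print('The data are consistent with a normal distribution')
else:
    print('The data are not normally distributed')
0.4739419919504797
The data are consistent with a normal distribution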