Empirical asset pricing (EAP) has been published on GitHub and PyPI. The author will document the package's usage in a series of CSDN posts; the documentation can also be read directly on PyPI.
PyPI: pip install --upgrade EAP
GitHub: whyecofiliter/EAP (empirical asset pricing)
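This demo tests whether classical asset pricing models can explain the returns of several anomaly portfolios. The anomalies are the price-to-cash-flow ratio (PCF), the asset growth rate, and abnormal turnover. Each portfolio is built by univariate sorting: stocks are sorted into 10 groups on the characteristic (or its proxy variable), and the return spread between the first and last groups is taken as the anomaly portfolio return. The classical models include the Fama-French three-factor model, the Carhart four-factor model, and the Fama-French five-factor model, whose factor risk premia are constructed from factor-mimicking portfolios.
This demo uses the Fama-French three-factor and five-factor models, with data from the CSMAR database. Warning: do not use the dataset in this demo for any commercial purpose.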
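To make the univariate sort concrete, here is a minimal pandas sketch of the idea described above: sort stocks into 10 groups on a characteristic each month, then take the spread between the last and first groups' equal-weighted mean returns. It is not EAP's `Univariate` implementation; the function name `long_short_return`, the column names, and the toy data are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd

def long_short_return(df, char_col, ret_col, date_col, n_groups=10):
    """Equal-weighted return of the last group minus the first group, per date.

    Hypothetical helper for illustration; not part of the EAP package.
    """
    spreads = {}
    for date, g in df.groupby(date_col):
        # Rank first so that ties cannot produce duplicate bin edges in qcut.
        ranks = g[char_col].rank(method="first")
        group = pd.qcut(ranks, n_groups, labels=False)
        mean_ret = g[ret_col].groupby(group).mean()
        spreads[date] = mean_ret.iloc[-1] - mean_ret.iloc[0]
    return pd.Series(spreads, name="diff")

# Toy data: 20 stocks in each of 2 months, with returns increasing in the characteristic.
toy = pd.DataFrame({
    "date": np.repeat(["2000-01", "2000-02"], 20),
    "char": np.tile(np.arange(20.0), 2),
})
toy["ret"] = 0.01 * toy["char"]

spread = long_short_return(toy, "char", "ret", "date")
print(spread)
```

In the toy data each group holds two stocks, so the spread is the mean return of the two highest-characteristic stocks minus that of the two lowest.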
# %% set system path
import sys,os
sys.path.append(os.path.abspath(".."))
# %% import data
import pandas as pd
# use forward slashes: in a normal string '\b' and '\r' are escape sequences
# (backspace, carriage return), which would corrupt the last two paths
month_return = pd.read_hdf('./data/month_return.h5', key='month_return')
company_data = pd.read_hdf('./data/last_filter_pe.h5', key='data')
trade_data = pd.read_hdf('./data/mean_filter_trade.h5', key='data')
beta = pd.read_hdf('./data/beta.h5', key='data')
risk_premium = pd.read_hdf('./data/risk_premium.h5', key='data')
Data preprocessing
# %% data preprocessing
# forward the monthly return for each stock
# emrwd is the return including dividend
month_return['emrwd'] = month_return.groupby(['Stkcd'])['Mretwd'].shift(-1)
# emrnd is the return excluding dividends
month_return['emrnd'] = month_return.groupby(['Stkcd'])['Mretnd'].shift(-1)
# select the A share stock
month_return = month_return[month_return['Markettype'].isin([1, 4, 16])]
# flag stocks whose market cap is at or above the 30th percentile of each month
def percentile(stocks):
    return stocks >= stocks.quantile(q=.3)
# transform keeps the original row index, so the result aligns on assignment
month_return['cap'] = month_return.groupby(['Trdmnt'])['Msmvttl'].transform(percentile)
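The forward shift above is what turns each row into a "characteristic this month, return next month" pair. A small sketch on made-up numbers shows the effect; the column names follow the demo, but the values are invented, and the rows are assumed to be sorted by month within each stock.

```python
import pandas as pd

# Toy panel: two stocks, three months each, sorted by month within each stock.
toy = pd.DataFrame({
    "Stkcd": [1, 1, 1, 2, 2, 2],
    "Trdmnt": ["2000-01", "2000-02", "2000-03"] * 2,
    "Mretwd": [0.01, 0.02, 0.03, 0.10, 0.20, 0.30],
})

# shift(-1) within each stock pulls next month's return onto the current row,
# so the last month of every stock has no forward return (NaN).
toy["emrwd"] = toy.groupby("Stkcd")["Mretwd"].shift(-1)
print(toy)
```

Grouping by stock before shifting matters: a plain `shift(-1)` on the whole column would leak the first return of stock 2 into the last row of stock 1.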
Construct proxy variables
# %% Construct proxy variable
import numpy as np
# CMA
# % calculate the total asset
# asset = debt + equity
# debt = company_value - market_value
# equity = market_value / PB
company_data['debt'] = company_data['EV1'] - company_data['MarketValue']
company_data['equity'] = company_data['MarketValue']/company_data['PBV1A']
company_data['asset'] = company_data['debt'] + company_data['equity']
# asset growth rate
company_data['asset_growth_rate'] = company_data['asset'].groupby(['Symbol']).diff(12)/company_data['asset']
# Turnover
trade_data['rolling_Turnover'] = np.array(trade_data['Turnover'].groupby('Symbol').rolling(12).mean())
trade_data['specific_Turnover'] = trade_data['Turnover'] / trade_data['rolling_Turnover']
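The abnormal-turnover proxy divides each month's turnover by its own 12-month rolling mean. The sketch below shows an index-preserving way to compute the same ratio with `groupby(...).transform`, which does not rely on the row order of the grouped result. It assumes a Series indexed by a `(Symbol, TradingDate)` MultiIndex; the helper name `abnormal_turnover` and the toy values are hypothetical.

```python
import numpy as np
import pandas as pd

def abnormal_turnover(turnover, window=12):
    """Turnover divided by its rolling mean within each stock.

    Assumes `turnover` is indexed by (Symbol, TradingDate) and sorted by date
    within each Symbol. Hypothetical helper, not part of EAP.
    """
    rolling_mean = turnover.groupby(level="Symbol").transform(
        lambda s: s.rolling(window).mean()
    )
    return turnover / rolling_mean

dates = pd.to_datetime(["2020-01-31", "2020-02-29", "2020-03-31"])
idx = pd.MultiIndex.from_product([["A", "B"], dates], names=["Symbol", "TradingDate"])
turnover = pd.Series([1.0, 3.0, 3.0, 2.0, 2.0, 4.0], index=idx, name="Turnover")

# A 2-month window keeps the toy example short; the demo uses 12 months.
result = abnormal_turnover(turnover, window=2)
print(result)
```

The first `window - 1` observations of every stock come out as NaN, which is why the later `dropna()` calls shorten the usable sample.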
Further data preprocessing
# %% merge data
from pandas.tseries.offsets import MonthBegin, MonthEnd

month_return['Stkcd_merge'] = month_return['Stkcd'].astype(dtype='string')
month_return['Date_merge'] = pd.to_datetime(month_return['Trdmnt'])
#month_return['Date_merge'] += MonthEnd()

company_data['Stkcd_merge'] = company_data['Symbol'].dropna().astype(dtype='int').astype(dtype='string')
company_data['Date_merge'] = pd.to_datetime(company_data['TradingDate'])
company_data['Date_merge'] += MonthBegin()

trade_data['Stkcd_merge'] = trade_data['Symbol'].dropna().astype(dtype='int').astype(dtype='string')
trade_data['TradingDate'] = trade_data.index.map(lambda x : x[1])
trade_data['Date_merge'] = pd.to_datetime(trade_data['TradingDate'])
#company_data['Yearmonth'] = company_data['Date_merge'].map(lambda x : 1000*x.year + x.month)
trade_data['Date_merge'] += MonthBegin()

# dataset starts from '2000-01'
company_data = company_data[company_data['Date_merge'] >= '2000-01']
month_return = month_return[month_return['Date_merge'] >= '2000-01']
return_company = pd.merge(month_return, company_data, on=['Stkcd_merge', 'Date_merge'])
return_company = pd.merge(return_company, trade_data, on=['Stkcd_merge', 'Date_merge'])

# beta
return_company = return_company.set_index(['Stkcd', 'Trdmnt'])
return_company = pd.merge(return_company, beta, left_index=True, right_index=True)
Construct the anomaly portfolios and their returns
# %% construct anomaly portfolio and return
from portfolio_analysis import Univariate
# PCF : Price cash flow ratio
pcf = return_company[(return_company['Ndaytrd']>=10)]
pcf = pcf[['emrwd', 'PCF1A', 'Date_merge']].dropna()
pcf = pcf[(pcf['Date_merge'] >= '2000-01-01') & (pcf['Date_merge'] <= '2019-12-01')]
model_pcf = Univariate(np.array(pcf), number=9)
ret_pcf = model_pcf.print_summary_by_time(export=True)[['Time', 'diff']]
ret_pcf.index = pd.to_datetime(ret_pcf['Time'])
ret_pcf = ret_pcf['diff'].shift(1)
ret_pcf = ret_pcf.rename('PCF')
# Investment: Asset growth rate
inv = return_company[(return_company['Ndaytrd']>=10)]
inv = inv[['emrwd', 'asset_growth_rate', 'Date_merge']].dropna()
inv = inv[(inv['Date_merge'] >= '2000-01-01') & (inv['Date_merge'] <= '2019-12-01')]
model_inv = Univariate(np.array(inv), number=9)
ret_inv = model_inv.print_summary_by_time(export=True)[['Time', 'diff']]
ret_inv.index = pd.to_datetime(ret_inv['Time'])
ret_inv = ret_inv['diff'].shift(1)
ret_inv = ret_inv.rename('INV')
# abnormal turnover rate (one month): abtr1mon
abtr1mon = return_company[(return_company['Ndaytrd']>=10)]
abtr1mon = abtr1mon[['emrwd', 'specific_Turnover', 'Date_merge']].dropna()
abtr1mon = abtr1mon[(abtr1mon['Date_merge'] >= '2000-01-01') & (abtr1mon['Date_merge'] <= '2019-12-01')]
model_abtr1mon = Univariate(np.array(abtr1mon), number=9)
ret_abtr1mon = model_abtr1mon.print_summary_by_time(export=True)[['Time', 'diff']]
ret_abtr1mon.index = pd.to_datetime(ret_abtr1mon['Time'])
ret_abtr1mon = ret_abtr1mon['diff'].shift(1)
ret_abtr1mon = ret_abtr1mon.rename('ABT')
# %% merge data
data = pd.concat([ret_pcf, ret_inv, ret_abtr1mon, risk_premium], axis=1)
data = data['2004':'2019'].dropna()
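Fama-French three-factor model without Newey-West adjustment. In the time-series regressions, every alpha is significant at the 0.05 level, which means the anomaly portfolio returns cannot be fully explained by the Fama-French three-factor model. The GRS test gives the same result.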
# %% Fama-French 3 factors model
# Without Newey-West adjustment
# assumption: TS_regress is importable from EAP's time_series_regress module,
# in the same way Univariate is imported from portfolio_analysis above
from time_series_regress import TS_regress
# import data type: Dataframe
list_data = data.iloc[:, :3]
factor = data.iloc[:, 3:6]
model = TS_regress(list_y=list_data, factor=np.array(factor))
model.fit(newey_west=False)
model.summary()
=====================================================================
+----------+---------+---------+---------+---------+
| Variable |   alpha |    BETA |     SMB |     HML |
+----------+---------+---------+---------+---------+
| PCF      | -0.0096 |  0.0892 |  0.9127 | -0.7753 |
| t-value  |  -5.891 |   5.363 |  12.393 | -16.443 |
| p-value  |     0.0 |     0.0 |     0.0 |     0.0 |
| INV      |  0.0086 | -0.0293 | -0.9563 | -0.7849 |
| t-value  |   4.074 |  -1.352 |  -9.972 | -12.784 |
| p-value  |     0.0 |   0.178 |     0.0 |     0.0 |
| ABT      | -0.0065 |  0.1271 |  -0.067 |  0.3228 |
| t-value  |  -2.305 |   4.415 |  -0.526 |   3.955 |
| p-value  |   0.022 |     0.0 |   0.599 |     0.0 |
+----------+---------+---------+---------+---------+
----------------------------------- GRS Test --------------------------------
GRS Statistics: 17.2 GRS p_value: 0.0
-----------------------------------------------------------------------------
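For readers who want to see what a time-series alpha test does, the following is a minimal NumPy sketch: regress one portfolio's return on a constant and the factors, then compute classical (non-robust) t-statistics. It is an illustration on simulated data, not the code behind `TS_regress`; the function name `ts_alpha` and all numbers are assumptions.

```python
import numpy as np

def ts_alpha(y, factors):
    """OLS of y on a constant and the factors.

    Returns (coef, t_stat), where coef[0] is alpha and coef[1:] are factor loadings.
    Hypothetical helper using classical OLS standard errors.
    """
    y = np.asarray(y, dtype=float)
    F = np.asarray(factors, dtype=float)
    T = len(y)
    X = np.column_stack([np.ones(T), F])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    dof = T - X.shape[1]                      # observations minus estimated parameters
    sigma2 = resid @ resid / dof              # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)     # classical OLS covariance
    t_stat = coef / np.sqrt(np.diag(cov))
    return coef, t_stat

# Simulated example: a true monthly alpha of 1% and three factor loadings.
rng = np.random.default_rng(0)
F = rng.normal(0.0, 0.05, size=(200, 3))
y = 0.01 + F @ np.array([0.1, 0.9, -0.8]) + rng.normal(0.0, 0.001, size=200)

coef, t_stat = ts_alpha(y, F)
print("alpha:", coef[0], "t:", t_stat[0])
```

An alpha whose t-statistic is large in absolute value means the factors leave part of the average return unexplained, which is how the alpha column in the table above is read.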
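Fama-French three-factor model with Newey-West adjustment. In the time-series regressions, every alpha is again significant at the 0.05 level, which means the anomaly portfolio returns cannot be fully explained by the Fama-French three-factor model. The GRS test gives the same result.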
# %% Fama-French 3 factors model
# With Newey-West adjustment
# import data type: Dataframe
model = TS_regress(list_y=data.iloc[:, :3], factor=data.iloc[:, 3:6])
model.fit(newey_west=True)
model.summary()
=====================================================================
+----------+---------+---------+---------+---------+
| Variable |   alpha |    BETA |     SMB |     HML |
+----------+---------+---------+---------+---------+
| PCF      | -0.0096 |  0.0892 |  0.9127 | -0.7753 |
| t-value  |   -4.92 |   6.101 |   8.435 |  -11.03 |
| p-value  |     0.0 |     0.0 |     0.0 |     0.0 |
| INV      |  0.0086 | -0.0293 | -0.9563 | -0.7849 |
| t-value  |   5.113 |  -1.016 |  -8.562 |  -8.948 |
| p-value  |     0.0 |   0.155 |     0.0 |     0.0 |
| ABT      | -0.0065 |  0.1271 |  -0.067 |  0.3228 |
| t-value  |   -2.15 |   3.257 |  -0.294 |   1.916 |
| p-value  |   0.016 |   0.001 |   0.385 |   0.028 |
+----------+---------+---------+---------+---------+
----------------------------------- GRS Test --------------------------------
GRS Statistics: 17.2 GRS p_value: 0.0
-----------------------------------------------------------------------------
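The Newey-West adjustment leaves the coefficient estimates unchanged and only replaces the standard errors with ones that are robust to heteroskedasticity and autocorrelation, which is why the coefficients in the two three-factor tables are identical while the t-values differ. Below is a hedged NumPy sketch of the textbook Bartlett-kernel estimator. The lag count is a free choice here, and how EAP selects it (or whether it applies a small-sample correction) is not stated in this post, so `newey_west_cov` and `lags=4` are illustrative assumptions.

```python
import numpy as np

def newey_west_cov(X, resid, lags):
    """HAC covariance of OLS coefficients with Bartlett weights.

    Textbook form without small-sample correction; hypothetical helper.
    """
    Xe = X * resid[:, None]                   # row t is x_t * e_t
    S = Xe.T @ Xe                             # lag-0 term (White/HC0 "meat")
    for lag in range(1, lags + 1):
        weight = 1.0 - lag / (lags + 1.0)     # Bartlett kernel weight
        gamma = Xe[lag:].T @ Xe[:-lag]        # lag-`lag` autocovariance term
        S += weight * (gamma + gamma.T)
    XtX_inv = np.linalg.inv(X.T @ X)
    return XtX_inv @ S @ XtX_inv

# Simulated regression with a constant and two regressors.
rng = np.random.default_rng(1)
T = 120
X = np.column_stack([np.ones(T), rng.normal(size=(T, 2))])
y = X @ np.array([0.01, 0.5, -0.3]) + rng.normal(scale=0.1, size=T)

coef, *_ = np.linalg.lstsq(X, y, rcond=None)  # point estimates are plain OLS
resid = y - X @ coef
cov_nw = newey_west_cov(X, resid, lags=4)
t_nw = coef / np.sqrt(np.diag(cov_nw))
print(t_nw)
```

With `lags=0` the estimator reduces to the heteroskedasticity-robust (White) covariance, which is a convenient sanity check.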
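Fama-French five-factor model with Newey-West adjustment. In the time-series regressions, none of the alphas is significant at the 0.05 level, which means the anomaly portfolio returns can largely be explained by the Fama-French five-factor model. The GRS test gives the same result.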
# %% Fama-French 5 factors model
# With Newey-West adjustment
# import data type: Dataframe
model = TS_regress(list_y=data.iloc[:, :3], factor=data.iloc[:, 3:])
model.fit(newey_west=True)
model.summary()
=====================================================================
+----------+---------+---------+---------+---------+---------+--------+
| Variable |   alpha |    BETA |     SMB |     HML |     RMW |    CMA |
+----------+---------+---------+---------+---------+---------+--------+
| PCF      | -0.0031 |   0.028 |  0.5858 | -0.9971 | -0.7922 | 0.2925 |
| t-value  |  -1.486 |   1.767 |   5.303 | -12.079 |   -7.24 |  1.975 |
| p-value  |    0.07 |   0.039 |     0.0 |     0.0 |     0.0 |  0.025 |
| INV      |  0.0008 | -0.0569 |  -0.453 | -0.2155 | -0.1186 | 1.6106 |
| t-value  |   0.442 |  -2.118 |  -4.511 |   -2.78 |  -0.808 |   5.81 |
| p-value  |   0.329 |   0.018 |     0.0 |   0.003 |    0.21 |    0.0 |
| ABT      | -0.0037 |  0.0932 | -0.2032 |  0.2461 | -0.4225 | 0.2642 |
| t-value  |  -1.223 |   2.401 |  -1.031 |   1.377 |  -1.227 |  0.784 |
| p-value  |   0.112 |   0.009 |   0.152 |   0.085 |   0.111 |  0.217 |
+----------+---------+---------+---------+---------+---------+--------+
----------------------------------- GRS Test --------------------------------
GRS Statistics: 1.477 GRS p_value: 0.222
-----------------------------------------------------------------------------
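The GRS test reported under each table asks whether all alphas are jointly zero. As a rough sketch of the idea, the statistic can be computed from the estimated alphas, the residual covariance, and the factor means and covariance; under the null hypothesis with normally distributed errors it follows an F distribution with (N, T - N - K) degrees of freedom, for N test portfolios, K factors, and T periods. The code below uses the common textbook form with maximum-likelihood covariance estimates on simulated data; it may differ in detail from EAP's own implementation, and `grs_statistic` is a hypothetical name.

```python
import numpy as np

def grs_statistic(returns, factors):
    """GRS F-statistic for the joint hypothesis that all alphas are zero.

    returns: (T, N) array of test-portfolio returns; factors: (T, K) array.
    Textbook form with ML covariance estimates; hypothetical helper.
    """
    R = np.asarray(returns, dtype=float)
    F = np.asarray(factors, dtype=float)
    T, N = R.shape
    K = F.shape[1]
    X = np.column_stack([np.ones(T), F])
    B, *_ = np.linalg.lstsq(X, R, rcond=None)   # shape (K + 1, N)
    alpha = B[0]
    resid = R - X @ B
    sigma = resid.T @ resid / T                 # residual covariance (ML)
    mu = F.mean(axis=0)
    centered = F - mu
    omega = centered.T @ centered / T           # factor covariance (ML)
    quad_alpha = alpha @ np.linalg.solve(sigma, alpha)
    quad_mu = mu @ np.linalg.solve(omega, mu)
    return (T - N - K) / N * quad_alpha / (1.0 + quad_mu)

# Simulated data: the same portfolios with and without a 1% monthly alpha.
rng = np.random.default_rng(2)
T, N, K = 240, 3, 3
F = rng.normal(0.005, 0.04, size=(T, K))
loadings = rng.normal(0.0, 1.0, size=(K, N))
noise = rng.normal(0.0, 0.02, size=(T, N))
R_zero = F @ loadings + noise
R_alpha = R_zero + 0.01

stat_zero = grs_statistic(R_zero, F)
stat_alpha = grs_statistic(R_alpha, F)
print(stat_zero, stat_alpha)
```

A large statistic (small p-value), as in the three-factor tables, rejects the hypothesis that the factors price all three anomaly portfolios; a small one, as in the five-factor table, does not.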



