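Here's a NumPy version of the rolling maximum drawdown function. windowed_view is a wrapper around a one-line call to numpy.lib.stride_tricks.as_strided that makes a memory-efficient 2d windowed view of the 1d array (full code below). Once we have this windowed view, the calculation is basically the same as your max_dd, but written for a numpy array and applied along the second axis (i.e. axis=1).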
    def rolling_max_dd(x, window_size, min_periods=1):
        """Compute the rolling maximum drawdown of `x`.

        `x` must be a 1d numpy array.
        `min_periods` should satisfy `1 <= min_periods <= window_size`.

        Returns a 1d array with length `len(x) - min_periods + 1`.
        """
        if min_periods < window_size:
            pad = np.empty(window_size - min_periods)
            pad.fill(x[0])
            x = np.concatenate((pad, x))
        y = windowed_view(x, window_size)
        running_max_y = np.maximum.accumulate(y, axis=1)
        dd = y - running_max_y
        return dd.min(axis=1)
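To make the padding and the axis=1 step concrete, here is a small worked sketch on a five-element array. It is not part of the original answer: the sample data is made up, and it assumes NumPy 1.20 or later so that `sliding_window_view` can stand in for the `windowed_view` helper.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # assumes NumPy >= 1.20

# Made-up sample data, chosen so the result is easy to check by hand.
x = np.array([3.0, 1.0, 2.0, 5.0, 4.0])
window_size = 3

# Pad the front with copies of x[0], as rolling_max_dd does for min_periods=1.
padded = np.concatenate((np.full(window_size - 1, x[0]), x))

# Each row is one window; the rows overlap and share memory with `padded`.
windows = sliding_window_view(padded, window_size)

# Running maximum inside each window, then the worst drop below it.
running_max = np.maximum.accumulate(windows, axis=1)
drawdown = windows - running_max
result = drawdown.min(axis=1)

print(windows)
print(result)  # -> [ 0. -2. -2.  0. -1.]
```

The padded copies of x[0] contribute a drawdown of zero and never raise the running maximum, which is why the first rows behave like shorter windows.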
Here's a complete script that demonstrates the function:
    import numpy as np
    from numpy.lib.stride_tricks import as_strided
    import pandas as pd
    import matplotlib.pyplot as plt


    def windowed_view(x, window_size):
        """Create a 2d windowed view of a 1d array.

        `x` must be a 1d numpy array.

        `numpy.lib.stride_tricks.as_strided` is used to create the view.
        The data is not copied.

        Example:

        >>> x = np.array([1, 2, 3, 4, 5, 6])
        >>> windowed_view(x, 3)
        array([[1, 2, 3],
               [2, 3, 4],
               [3, 4, 5],
               [4, 5, 6]])
        """
        # Both strides equal the item stride, so row i starts at x[i].
        y = as_strided(x,
                       shape=(x.size - window_size + 1, window_size),
                       strides=(x.strides[0], x.strides[0]))
        return y


    def rolling_max_dd(x, window_size, min_periods=1):
        """Compute the rolling maximum drawdown of `x`.

        `x` must be a 1d numpy array.
        `min_periods` should satisfy `1 <= min_periods <= window_size`.

        Returns a 1d array with length `len(x) - min_periods + 1`.
        """
        if min_periods < window_size:
            # Pad the front with x[0] so the first windows act as partial windows.
            pad = np.empty(window_size - min_periods)
            pad.fill(x[0])
            x = np.concatenate((pad, x))
        y = windowed_view(x, window_size)
        running_max_y = np.maximum.accumulate(y, axis=1)
        dd = y - running_max_y
        return dd.min(axis=1)


    def max_dd(ser):
        # pd.expanding_max was removed in pandas 0.23;
        # on newer versions use ser.expanding().max().
        max2here = pd.expanding_max(ser)
        dd2here = ser - max2here
        return dd2here.min()


    if __name__ == "__main__":
        np.random.seed(0)
        n = 100
        s = pd.Series(np.random.randn(n).cumsum())

        window_length = 10

        # pd.rolling_apply was removed in pandas 0.23;
        # on newer versions use s.rolling(window_length, min_periods=1).apply(max_dd).
        rolling_dd = pd.rolling_apply(s, window_length, max_dd, min_periods=0)
        df = pd.concat([s, rolling_dd], axis=1)
        df.columns = ['s', 'rol_dd_%d' % window_length]
        df.plot(linewidth=3, alpha=0.4)

        my_rmdd = rolling_max_dd(s.values, window_length, min_periods=1)
        plt.plot(my_rmdd, 'g.')

        plt.show()
The plot shows the curves generated by the code. The green dots are computed by rolling_max_dd.
Timing comparison, with n = 10000 and window_length = 500:
    In [2]: %timeit rolling_dd = pd.rolling_apply(s, window_length, max_dd, min_periods=0)
    1 loops, best of 3: 247 ms per loop

    In [3]: %timeit my_rmdd = rolling_max_dd(s.values, window_length, min_periods=1)
    10 loops, best of 3: 38.2 ms per loop
rolling_max_dd is about 6.5 times faster. The speedup is better for smaller window lengths. For example, with window_length = 200, it is almost 13 times faster.
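On a recent pandas, where pd.rolling_apply and pd.expanding_max no longer exist, the same comparison can be sketched with the `Series.rolling(...).apply` interface. This is an illustration rather than the code that was timed above: `max_dd_raw` and `rolling_max_dd_np` are hypothetical names, the series is random data, `sliding_window_view` (NumPy 1.20+) replaces the as_strided helper, and min_periods=1 is used on the pandas side.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view  # assumes NumPy >= 1.20


def max_dd_raw(values):
    """Max drawdown of one window, passed as a plain ndarray (raw=True)."""
    return (values - np.maximum.accumulate(values)).min()


def rolling_max_dd_np(x, window_size):
    """Vectorized variant, front-padded with x[0] (the min_periods=1 case)."""
    padded = np.concatenate((np.full(window_size - 1, x[0]), x))
    y = sliding_window_view(padded, window_size)
    return (y - np.maximum.accumulate(y, axis=1)).min(axis=1)


rng = np.random.default_rng(0)
s = pd.Series(rng.standard_normal(1000).cumsum())
window_length = 50

# Reference result: pandas calls max_dd_raw once per window.
via_pandas = s.rolling(window_length, min_periods=1).apply(max_dd_raw, raw=True)

# Vectorized result: one pass over a 2d windowed view.
via_numpy = rolling_max_dd_np(s.to_numpy(), window_length)

agree = np.allclose(via_pandas.to_numpy(), via_numpy)
print(agree)
```

Wrapping either call in %timeit gives a rough idea of the speedup on a given machine; the figures quoted above came from the older pandas API and may not carry over.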
To handle NA's, you could preprocess the Series using the fillna method before passing the array to rolling_max_dd.
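A minimal sketch of that preprocessing step, under a few assumptions: forward-filling is only one possible choice of fill (it treats a missing value as "no change since the last observation"), the data is made up, and `rolling_max_dd_np` is a condensed stand-in for rolling_max_dd built on `sliding_window_view` (NumPy 1.20+).

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view  # assumes NumPy >= 1.20


def rolling_max_dd_np(x, window_size):
    """Condensed stand-in for rolling_max_dd with min_periods=1."""
    padded = np.concatenate((np.full(window_size - 1, x[0]), x))
    y = sliding_window_view(padded, window_size)
    return (y - np.maximum.accumulate(y, axis=1)).min(axis=1)


# Made-up series with two missing observations.
s = pd.Series([3.0, np.nan, 1.0, 2.0, np.nan, 5.0, 4.0])

# Without preprocessing, every window that touches a NaN yields NaN.
raw_result = rolling_max_dd_np(s.to_numpy(), 3)

# Forward-fill carries the last known value across each gap.
# A leading NaN would survive ffill and needs separate handling.
filled = s.ffill()
clean_result = rolling_max_dd_np(filled.to_numpy(), 3)

print(raw_result)
print(clean_result)  # -> [ 0.  0. -2. -2.  0.  0. -1.]
```

Whether forward-fill, interpolation, or dropping rows is appropriate depends on what the missing values mean in the data at hand.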



