
Compute the *rolling* maximum drawdown of a pandas Series


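Here is a NumPy version of the rolling maximum drawdown function. windowed_view is a wrapper around a one-line call to numpy.lib.stride_tricks.as_strided that makes a memory-efficient 2D windowed view of a 1D array (full code below). Once we have this windowed view, the calculation is basically the same as max_dd, but written for a NumPy array and applied along the second axis (i.e. axis=1).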
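As a small illustration of the idea, the sketch below builds the 2D window view for a six-element array and applies the running maximum along axis=1. It uses numpy.lib.stride_tricks.sliding_window_view (available in NumPy 1.20 and later) rather than a hand-written as_strided wrapper; that substitution is an assumption made here for brevity and is not what the answer itself uses. The sample values are made up.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

x = np.array([3.0, 1.0, 4.0, 2.0, 5.0, 0.0])  # made-up sample data
window_size = 3

# Each row is one window; the view shares memory with x (no copy).
y = sliding_window_view(x, window_size)

# Running maximum within each window, left to right.
running_max = np.maximum.accumulate(y, axis=1)

# Drawdown at each position, then the worst (most negative) per window.
dd = y - running_max
max_dd_per_window = dd.min(axis=1)

print(y)
print(max_dd_per_window)  # [-2. -2. -2. -5.]
```

The last window, [2, 5, 0], falls from a peak of 5 to 0, which gives the -5 entry.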

def rolling_max_dd(x, window_size, min_periods=1):
    """Compute the rolling maximum drawdown of `x`.

    `x` must be a 1d numpy array.
    `min_periods` should satisfy `1 <= min_periods <= window_size`.

    Returns a 1d array with length `len(x) - min_periods + 1`.
    """
    if min_periods < window_size:
        # Pad the front with copies of x[0] so the first windows are full.
        pad = np.empty(window_size - min_periods)
        pad.fill(x[0])
        x = np.concatenate((pad, x))
    y = windowed_view(x, window_size)
    running_max_y = np.maximum.accumulate(y, axis=1)
    dd = y - running_max_y
    return dd.min(axis=1)

Here is a complete script that demonstrates the function:

import numpy as np
from numpy.lib.stride_tricks import as_strided
import pandas as pd
import matplotlib.pyplot as plt


def windowed_view(x, window_size):
    """Create a 2d windowed view of a 1d array.

    `x` must be a 1d numpy array.

    `numpy.lib.stride_tricks.as_strided` is used to create the view.
    The data is not copied.

    Example:

    >>> x = np.array([1, 2, 3, 4, 5, 6])
    >>> windowed_view(x, 3)
    array([[1, 2, 3],
           [2, 3, 4],
           [3, 4, 5],
           [4, 5, 6]])
    """
    y = as_strided(x, shape=(x.size - window_size + 1, window_size),
                   strides=(x.strides[0], x.strides[0]))
    return y


def rolling_max_dd(x, window_size, min_periods=1):
    """Compute the rolling maximum drawdown of `x`.

    `x` must be a 1d numpy array.
    `min_periods` should satisfy `1 <= min_periods <= window_size`.

    Returns a 1d array with length `len(x) - min_periods + 1`.
    """
    if min_periods < window_size:
        # Pad the front with copies of x[0] so the first windows are full.
        pad = np.empty(window_size - min_periods)
        pad.fill(x[0])
        x = np.concatenate((pad, x))
    y = windowed_view(x, window_size)
    running_max_y = np.maximum.accumulate(y, axis=1)
    dd = y - running_max_y
    return dd.min(axis=1)


def max_dd(ser):
    # Originally pd.expanding_max(ser); that function was removed from pandas.
    max2here = ser.expanding().max()
    dd2here = ser - max2here
    return dd2here.min()


if __name__ == "__main__":
    np.random.seed(0)
    n = 100
    s = pd.Series(np.random.randn(n).cumsum())

    window_length = 10

    # Originally pd.rolling_apply(s, window_length, max_dd, min_periods=0);
    # raw=False passes each window to max_dd as a Series.
    rolling_dd = s.rolling(window_length, min_periods=0).apply(max_dd, raw=False)
    df = pd.concat([s, rolling_dd], axis=1)
    df.columns = ['s', 'rol_dd_%d' % window_length]
    df.plot(linewidth=3, alpha=0.4)

    my_rmdd = rolling_max_dd(s.values, window_length, min_periods=1)
    plt.plot(my_rmdd, 'g.')

    plt.show()
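One way to gain confidence in the vectorised function is to compare it with a plain Python loop over each trailing window. The sketch below is not part of the answer: naive_rolling_max_dd is a hypothetical reference helper, and rolling_max_dd is restated with sliding_window_view (NumPy 1.20 and later) in place of windowed_view so that the block runs on its own.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def rolling_max_dd(x, window_size, min_periods=1):
    # Same logic as the answer, with sliding_window_view swapped in
    # for the as_strided-based windowed_view (an assumption of this sketch).
    x = np.asarray(x, dtype=float)
    if min_periods < window_size:
        pad = np.full(window_size - min_periods, x[0])
        x = np.concatenate((pad, x))
    y = sliding_window_view(x, window_size)
    running_max_y = np.maximum.accumulate(y, axis=1)
    return (y - running_max_y).min(axis=1)


def naive_rolling_max_dd(x, window_size):
    """Hypothetical reference: loop over each trailing window (min_periods=1)."""
    out = []
    for end in range(1, len(x) + 1):
        window = x[max(0, end - window_size):end]
        peak = window[0]
        worst = 0.0
        for value in window:
            peak = max(peak, value)           # running maximum so far
            worst = min(worst, value - peak)  # deepest drop below that peak
        out.append(worst)
    return np.array(out)


rng = np.random.default_rng(0)
s = rng.standard_normal(200).cumsum()  # synthetic random walk

fast = rolling_max_dd(s, 10, min_periods=1)
slow = naive_rolling_max_dd(s, 10)
print(np.allclose(fast, slow))  # True
```

The two agree because padding the front with copies of x[0] adds only zero-drawdown entries to the early windows, so each padded window has the same maximum drawdown as the corresponding truncated one.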

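The plot shows the curves generated by the code. The green dots are computed by rolling_max_dd.

Timing comparison, with n = 10000 and window_length = 500 (the session below uses pd.rolling_apply, the legacy pandas call that s.rolling(...).apply(...) later replaced):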

In [2]: %timeit rolling_dd = pd.rolling_apply(s, window_length, max_dd, min_periods=0)
1 loops, best of 3: 247 ms per loop

In [3]: %timeit my_rmdd = rolling_max_dd(s.values, window_length, min_periods=1)
10 loops, best of 3: 38.2 ms per loop

rolling_max_dd is about 6.5 times faster. The speedup is better for smaller window lengths; for example, with window_length = 200, it is almost 13 times faster.

To handle NAs, you could preprocess the Series using the fillna method before passing the array to rolling_max_dd.
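A minimal sketch of that preprocessing follows. The answer does not say which fill strategy to use, so forward fill (with a backward fill for any leading gap) is an assumption here; ffill() and bfill() are the current pandas spellings of fillna with a fill method. rolling_max_dd is restated with sliding_window_view so the block is self-contained, and the sample values are made up.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view


def rolling_max_dd(x, window_size, min_periods=1):
    # Restated from the answer, using sliding_window_view (NumPy >= 1.20)
    # in place of windowed_view; an assumption of this sketch.
    x = np.asarray(x, dtype=float)
    if min_periods < window_size:
        pad = np.full(window_size - min_periods, x[0])
        x = np.concatenate((pad, x))
    y = sliding_window_view(x, window_size)
    running_max_y = np.maximum.accumulate(y, axis=1)
    return (y - running_max_y).min(axis=1)


s = pd.Series([1.0, np.nan, 3.0, 2.0, np.nan, 0.5])  # made-up data with gaps

# Assumed policy: carry the last known value forward, then backfill
# anything still missing at the very start.
clean = s.ffill().bfill()

result = rolling_max_dd(clean.to_numpy(), window_size=3)
print(result)  # [ 0.   0.   0.  -1.  -1.  -1.5]
```

Forward filling leaves a flat segment where data was missing, so it adds no drawdown of its own; a different policy (interpolation, dropping rows) would change the result and may suit other data better.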


