Python: finding patterns in a plot

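I think pandas.rolling_max() is the right approach here (in newer pandas versions the same operation is written as Series.rolling(window).max()). We load the data into a DataFrame and compute the rolling maximum over 8500 values. After that, the two curves look similar. You can experiment a little with the parameters to optimize the result.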

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

plt.ion()

names = ['actual.csv', 'estimated.csv']

#-------------------------------------------------------------------------------
def load_data(fname):
    # Read a two-column CSV file (x, y) into a NumPy array
    return np.genfromtxt(fname, delimiter=',')
#-------------------------------------------------------------------------------

data = [load_data(name) for name in names]
actual_data = data[0]
estimated_data = data[1]

df = pd.read_csv('estimated.csv', names=('x', 'y'))
# Originally pd.rolling_max(df['y'], 8500); that function is gone from
# current pandas, and rolling(...).max() is the equivalent call.
df['rolling_max'] = df['y'].rolling(8500).max()

plt.figure()
plt.plot(actual_data[:, 0], actual_data[:, 1], label='actual')
plt.plot(estimated_data[:, 0], estimated_data[:, 1], label='estimated')
plt.plot(df['x'], df['rolling_max'], label='rolling')
plt.legend()
plt.title('Actual vs. Interpolated')
plt.xlim(0, 10)
plt.ylim(0, 500)
plt.xlabel('Time [Seconds]')
plt.ylabel('Segments')
plt.grid()
plt.show(block=True)
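If you do not have the CSV files at hand, the idea can be tried on synthetic data. The following is only a minimal sketch, not the data from the question: the spiky signal, the column names and the window of 10 samples are all assumptions chosen for illustration. It suggests that once the window spans at least one spike period, the rolling maximum follows the peaks and ignores the gaps between them.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for 'estimated.csv': a slow trend that is only
# visible as a narrow spike every 10th sample, and zero in between.
x = np.linspace(0, 10, 1001)
trend = 100 + 20 * x
y = np.where(np.arange(x.size) % 10 == 0, trend, 0.0)

df = pd.DataFrame({'x': x, 'y': y})

# Assumed window: exactly one spike period, so every full window
# contains one spike.
window = 10
df['rolling_max'] = df['y'].rolling(window).max()

# The first window - 1 rows have no full window yet and are NaN.
print(df['rolling_max'].head(12))
```

With real data the window has to be tuned by hand, which is what the 8500 above stands for.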

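To answer the question from the comments:

Since pd.rolling() builds fixed-size windows over your data, the first values returned by pd.rolling().max are NaN: with a window of n values, the first n - 1 positions do not have a full window yet. To replace these NaNs, I suggest reversing the whole Series and computing the windows backwards. After that, we can replace all the NaNs with the values from the backward calculation. I adjusted the window length for the backward calculation (850 instead of 8500); otherwise we get erroneous data.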
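To see the mechanism on something small, here is a sketch with a made-up eight-value Series and a window of 3 (both are assumptions for illustration, and unlike the real code below I use the same window in both directions). Assigning or filling with the reversed result works because pandas aligns on the index, not on position.

```python
import pandas as pd

# Made-up toy data, not taken from the question
s = pd.Series([3, 1, 4, 1, 5, 9, 2, 6], dtype=float)
window = 3

# Forward pass: the first window - 1 entries are NaN
forward = s.rolling(window).max()

# Backward pass: reverse the Series, roll, and keep the original index.
# The entry at index i is now the max of s[i], s[i+1], s[i+2].
backward = s[::-1].rolling(window).max()

# fillna aligns on the index, so only the leading NaNs are replaced
filled = forward.fillna(backward)

print(filled.tolist())  # [4.0, 4.0, 4.0, 4.0, 5.0, 9.0, 9.0, 9.0]
```

Keep in mind that the filled values at the start look ahead in the data, so they are not a true trailing maximum.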

This code works for the data from the question:

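import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

plt.ion()

df = pd.read_csv('estimated.csv', names=('x', 'y'))

# Forward rolling maximum; the first 8499 values are NaN
df['rolling_max'] = df['y'].rolling(8500).max()
# Backward rolling maximum on the reversed Series, with a shorter window;
# the assignment aligns on the index, so rows keep their original order
df['rolling_max_backwards'] = df['y'][::-1].rolling(850).max()
# Fill the leading NaNs with the backward values (assigned back instead of
# a chained inplace=True call, which may not update df in newer pandas)
df['rolling_max'] = df['rolling_max'].fillna(df['rolling_max_backwards'])

plt.figure()
plt.plot(df['x'], df['rolling_max'], label='rolling')
plt.legend()
plt.title('Actual vs. Interpolated')
plt.xlim(0, 10)
plt.ylim(0, 700)
plt.xlabel('Time [Seconds]')
plt.ylabel('Segments')
plt.grid()
plt.show(block=True)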

We get the following result:

[Figure: rolling maximum of the estimated data, Segments vs. Time [Seconds]]
