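So I got it working, but without rolling windows, because they do not support string types. That feature has also been reported and requested on the pandas repo.
Here is a summary of my solution to the problem: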
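# Excerpt from the body of the loop over incoming records.
# `data` is the current parsed record; `df1` is not defined in this excerpt
# (presumably the current record as a one-row DataFrame).
if len(df.index) > 0:
    # Rows already kept that have the same merchant and amount
    res = df.loc[(df.merchant == data['transaction']['merchant']) & (df.amount == data['transaction']['amount'])]
    # True where a kept row lies within 120 seconds of the current record
    res['timediff'] = (data['transaction']['time'] - res['time']).dt.total_seconds().abs() <= 120
    if res.timediff.any():
        continue  # duplicate: skip this record
# Note: DataFrame.append was removed in pandas 2.0; there, use pd.concat([df, df1])
df = df.append(df1)

# After the loop:
print(df)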
Sample data:
{"transaction": {"merchant": "merchantA", "amount": 20, "time": "2019-02-13T10:00:00.000Z"}}
{"transaction": {"merchant": "merchantB", "amount": 90, "time": "2019-02-13T11:00:01.000Z"}}
{"transaction": {"merchant": "merchantC", "amount": 10, "time": "2019-02-13T11:00:10.000Z"}}
{"transaction": {"merchant": "merchantD", "amount": 10, "time": "2019-02-13T11:00:20.000Z"}}
{"transaction": {"merchant": "merchantE", "amount": 10, "time": "2019-02-13T11:01:30.000Z"}}
{"transaction": {"merchant": "merchantF", "amount": 10, "time": "2019-02-13T11:03:00.000Z"}}
{"transaction": {"merchant": "merchantE", "amount": 10, "time": "2019-02-13T11:02:00.000Z"}}
{"transaction": {"merchant": "merchantF", "amount": 10, "time": "2019-02-13T11:02:20.000Z"}}
{"transaction": {"merchant": "merchantE", "amount": 10, "time": "2019-02-13T11:02:30.000Z"}}
{"transaction": {"merchant": "merchantF", "amount": 10, "time": "2019-02-13T11:05:20.000Z"}}
{"transaction": {"merchant": "merchantE", "amount": 10, "time": "2019-02-13T11:00:30.000Z"}}
Output:
                      merchant  amount                time
2019-02-13 10:00:00  merchantA      20 2019-02-13 10:00:00
2019-02-13 11:00:01  merchantB      90 2019-02-13 11:00:01
2019-02-13 11:00:10  merchantC      10 2019-02-13 11:00:10
2019-02-13 11:00:20  merchantD      10 2019-02-13 11:00:20
2019-02-13 11:01:30  merchantE      10 2019-02-13 11:01:30
2019-02-13 11:03:00  merchantF      10 2019-02-13 11:03:00
2019-02-13 11:05:20  merchantF      10 2019-02-13 11:05:20
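
In case the rule itself is unclear from the excerpt, here is a minimal sketch of the same 120-second duplicate check in plain Python, without a DataFrame. It is not the code I actually ran. The names `WINDOW`, `parse_time` and `deduplicate` are made up for illustration, and the timestamp format string is an assumption based on the sample data above.

```python
import json
from datetime import datetime, timedelta

# Same 120-second threshold as in the pandas excerpt
WINDOW = timedelta(seconds=120)


def parse_time(value):
    # Assumes timestamps shaped like "2019-02-13T10:00:00.000Z"
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")


def deduplicate(lines):
    """Keep a transaction unless an already kept one has the same merchant
    and amount and lies within WINDOW of it (earlier or later)."""
    kept = []
    for line in lines:
        tx = json.loads(line)["transaction"]
        when = parse_time(tx["time"])
        is_duplicate = any(
            k["merchant"] == tx["merchant"]
            and k["amount"] == tx["amount"]
            and abs(when - k["time"]) <= WINDOW
            for k in kept
        )
        if is_duplicate:
            continue  # skip, as the `continue` in the excerpt does
        kept.append(
            {"merchant": tx["merchant"], "amount": tx["amount"], "time": when}
        )
    return kept


# A few of the sample records from above
sample = [
    '{"transaction": {"merchant": "merchantE", "amount": 10, "time": "2019-02-13T11:01:30.000Z"}}',
    '{"transaction": {"merchant": "merchantF", "amount": 10, "time": "2019-02-13T11:03:00.000Z"}}',
    '{"transaction": {"merchant": "merchantE", "amount": 10, "time": "2019-02-13T11:02:00.000Z"}}',
    '{"transaction": {"merchant": "merchantF", "amount": 10, "time": "2019-02-13T11:05:20.000Z"}}',
]

result = deduplicate(sample)
for row in result:
    print(row["time"], row["merchant"], row["amount"])
```

Each record is compared only against records that were already kept, so the result depends on arrival order, just like the loop over the DataFrame. The scan over `kept` is linear per record, which is fine for small inputs but would need an index per (merchant, amount) pair for large ones.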



