I would also be glad to see the solution you end up with, to learn how this was finally resolved.
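One way to find the closest date is to compute the time difference between each date in the first DataFrame and every date in the second DataFrame. You can then use np.argmin to retrieve the date with the smallest time delta.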
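As a minimal sketch of that idea for a single date (the names `candidates` and `target` and their values are made up for illustration and are not part of the data used in this answer):

```python
import numpy as np
import pandas as pd

# Hypothetical inputs, for illustration only.
candidates = pd.Series(pd.to_datetime(["2014-01-01", "2014-01-08", "2014-01-15"]))
target = pd.Timestamp("2014-01-05")

# Absolute time difference between the target and every candidate date.
deltas = (candidates - target).abs()

# Position of the smallest difference. Converting to a NumPy array first
# makes np.argmin return a plain positional index.
pos = int(np.argmin(deltas.to_numpy()))

closest_date = candidates.iloc[pos]
closest_delta = deltas.iloc[pos]
print(closest_date, closest_delta)
```

Taking the absolute value matters here: without it, dates far in the past would have the most negative delta and would win the `argmin`.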
For example:
Setup
    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    from io import StringIO

    import numpy as np
    import pandas as pd
Data
    # Daily measurements.
    a = """timepoint,measure
    2014-01-01 00:00:00,78
    2014-01-02 00:00:00,29
    2014-01-03 00:00:00,5
    2014-01-04 00:00:00,73
    2014-01-05 00:00:00,40
    2014-01-06 00:00:00,45
    2014-01-07 00:00:00,48
    2014-01-08 00:00:00,2
    2014-01-09 00:00:00,96
    2014-01-10 00:00:00,82
    2014-01-11 00:00:00,61
    2014-01-12 00:00:00,68
    2014-01-13 00:00:00,8
    2014-01-14 00:00:00,94
    2014-01-15 00:00:00,16
    2014-01-16 00:00:00,31
    2014-01-17 00:00:00,10
    2014-01-18 00:00:00,34
    2014-01-19 00:00:00,27
    2014-01-20 00:00:00,58
    2014-01-21 00:00:00,90
    2014-01-22 00:00:00,41
    2014-01-23 00:00:00,97
    2014-01-24 00:00:00,7
    2014-01-25 00:00:00,86
    2014-01-26 00:00:00,62
    2014-01-27 00:00:00,91
    2014-01-28 00:00:00,0
    2014-01-29 00:00:00,73
    2014-01-30 00:00:00,22
    2014-01-31 00:00:00,43
    2014-02-01 00:00:00,87
    2014-02-02 00:00:00,56
    2014-02-03 00:00:00,45
    2014-02-04 00:00:00,25
    2014-02-05 00:00:00,92
    2014-02-06 00:00:00,83
    2014-02-07 00:00:00,13
    2014-02-08 00:00:00,50
    2014-02-09 00:00:00,48
    2014-02-10 00:00:00,78"""

    # Weekly measurements.
    b = """timepoint,measure
    2014-01-01 00:00:00,78
    2014-01-08 00:00:00,29
    2014-01-15 00:00:00,5
    2014-01-22 00:00:00,73
    2014-01-29 00:00:00,40
    2014-02-05 00:00:00,45
    2014-02-12 00:00:00,48
    2014-02-19 00:00:00,2
    2014-02-26 00:00:00,96
    2014-03-05 00:00:00,82
    2014-03-12 00:00:00,61
    2014-03-19 00:00:00,68
    2014-03-26 00:00:00,8
    2014-04-02 00:00:00,94"""
Look at the data
    df1 = pd.read_csv(StringIO(a), parse_dates=['timepoint'])
    df1.head()

       timepoint  measure
    0 2014-01-01       78
    1 2014-01-02       29
    2 2014-01-03        5
    3 2014-01-04       73
    4 2014-01-05       40

    df2 = pd.read_csv(StringIO(b), parse_dates=['timepoint'])
    df2.head()

       timepoint  measure
    0 2014-01-01       78
    1 2014-01-08       29
    2 2014-01-15        5
    3 2014-01-22       73
    4 2014-01-29       40
Function to find the date closest to a given date
    def find_closest_date(timepoint, time_series, add_time_delta_column=True):
        # Takes a pd.Timestamp instance and a pd.Series of dates,
        # computes the delta between `timepoint` and each date in `time_series`,
        # and returns the closest date and, optionally, its time delta.
        deltas = (time_series - timepoint).abs()
        # Positional index of the smallest delta.
        idx_closest_date = np.argmin(deltas.to_numpy())
        res = {"closest_date": time_series.iloc[idx_closest_date]}
        idx = ['closest_date']
        if add_time_delta_column:
            res["closest_delta"] = deltas.iloc[idx_closest_date]
            idx.append('closest_delta')
        return pd.Series(res, index=idx)

    # Apply the function to every date in df1, searching the dates in df2.
    df1[['closest', 'days_bt_x_and_y']] = df1.timepoint.apply(
        find_closest_date, args=(df2.timepoint,))
    df1.head(10)

       timepoint  measure    closest days_bt_x_and_y
    0 2014-01-01       78 2014-01-01          0 days
    1 2014-01-02       29 2014-01-01          1 days
    2 2014-01-03        5 2014-01-01          2 days
    3 2014-01-04       73 2014-01-01          3 days
    4 2014-01-05       40 2014-01-08          3 days
    5 2014-01-06       45 2014-01-08          2 days
    6 2014-01-07       48 2014-01-08          1 days
    7 2014-01-08        2 2014-01-08          0 days
    8 2014-01-09       96 2014-01-08          1 days
    9 2014-01-10       82 2014-01-08          2 days
Merge the two DataFrames on the new closest date column
    # Join each df1 row to the df2 row whose timepoint equals its closest date.
    df3 = pd.merge(df1, df2, left_on=['closest'], right_on=['timepoint'])
    colorder = [
        'timepoint_x',
        'closest',
        'timepoint_y',
        'days_bt_x_and_y',
        'measure_x',
        'measure_y'
    ]
    df3 = df3.loc[:, colorder]
    df3

       timepoint_x    closest timepoint_y days_bt_x_and_y  measure_x  measure_y
    0   2014-01-01 2014-01-01  2014-01-01          0 days         78         78
    1   2014-01-02 2014-01-01  2014-01-01          1 days         29         78
    2   2014-01-03 2014-01-01  2014-01-01          2 days          5         78
    3   2014-01-04 2014-01-01  2014-01-01          3 days         73         78
    4   2014-01-05 2014-01-08  2014-01-08          3 days         40         29
    5   2014-01-06 2014-01-08  2014-01-08          2 days         45         29
    6   2014-01-07 2014-01-08  2014-01-08          1 days         48         29
    7   2014-01-08 2014-01-08  2014-01-08          0 days          2         29
    8   2014-01-09 2014-01-08  2014-01-08          1 days         96         29
    9   2014-01-10 2014-01-08  2014-01-08          2 days         82         29
    10  2014-01-11 2014-01-08  2014-01-08          3 days         61         29
    11  2014-01-12 2014-01-15  2014-01-15          3 days         68          5
    12  2014-01-13 2014-01-15  2014-01-15          2 days          8          5
    13  2014-01-14 2014-01-15  2014-01-15          1 days         94          5
    14  2014-01-15 2014-01-15  2014-01-15          0 days         16          5
    15  2014-01-16 2014-01-15  2014-01-15          1 days         31          5
    16  2014-01-17 2014-01-15  2014-01-15          2 days         10          5
    17  2014-01-18 2014-01-15  2014-01-15          3 days         34          5
    18  2014-01-19 2014-01-22  2014-01-22          3 days         27         73
    19  2014-01-20 2014-01-22  2014-01-22          2 days         58         73
    20  2014-01-21 2014-01-22  2014-01-22          1 days         90         73
    21  2014-01-22 2014-01-22  2014-01-22          0 days         41         73
    22  2014-01-23 2014-01-22  2014-01-22          1 days         97         73
    23  2014-01-24 2014-01-22  2014-01-22          2 days          7         73
    24  2014-01-25 2014-01-22  2014-01-22          3 days         86         73
    25  2014-01-26 2014-01-29  2014-01-29          3 days         62         40
    26  2014-01-27 2014-01-29  2014-01-29          2 days         91         40
    27  2014-01-28 2014-01-29  2014-01-29          1 days          0         40
    28  2014-01-29 2014-01-29  2014-01-29          0 days         73         40
    29  2014-01-30 2014-01-29  2014-01-29          1 days         22         40
    30  2014-01-31 2014-01-29  2014-01-29          2 days         43         40
    31  2014-02-01 2014-01-29  2014-01-29          3 days         87         40
    32  2014-02-02 2014-02-05  2014-02-05          3 days         56         45
    33  2014-02-03 2014-02-05  2014-02-05          2 days         45         45
    34  2014-02-04 2014-02-05  2014-02-05          1 days         25         45
    35  2014-02-05 2014-02-05  2014-02-05          0 days         92         45
    36  2014-02-06 2014-02-05  2014-02-05          1 days         83         45
    37  2014-02-07 2014-02-05  2014-02-05          2 days         13         45
    38  2014-02-08 2014-02-05  2014-02-05          3 days         50         45
    39  2014-02-09 2014-02-12  2014-02-12          3 days         48         48
    40  2014-02-10 2014-02-12  2014-02-12          2 days         78         48
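As a possible alternative to the row-by-row approach used above, pandas also offers `pd.merge_asof`, which can match each row to the nearest key in one call when given `direction="nearest"`. This is a different technique from the one in this answer, sketched under the assumption that both frames can be sorted by `timepoint`. The small `left` and `right` frames and the `matched` name are made up for illustration.

```python
import pandas as pd

# Hypothetical miniature versions of the two frames, for illustration only.
left = pd.DataFrame({
    "timepoint": pd.to_datetime(
        ["2014-01-02", "2014-01-05", "2014-01-11", "2014-01-12"]),
    "measure": [29, 40, 61, 68],
})
right = pd.DataFrame({
    "timepoint": pd.to_datetime(["2014-01-01", "2014-01-08", "2014-01-15"]),
    "measure": [78, 29, 5],
})

# merge_asof keeps only the left key column when joining with `on`,
# so copy the right-hand dates into a separate column first.
right = right.assign(closest=right["timepoint"])

# Both inputs must be sorted by the key column.
matched = pd.merge_asof(
    left.sort_values("timepoint"),
    right.sort_values("timepoint"),
    on="timepoint",
    direction="nearest",
    suffixes=("_x", "_y"),
)

# Time difference between each row and its matched date.
matched["days_bt_x_and_y"] = (matched["timepoint"] - matched["closest"]).abs()
print(matched)
```

This avoids calling a Python function once per row, which may matter for larger frames. How ties between two equally distant dates are broken is not covered by this sketch, so the sample dates were chosen to avoid them.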



