
Pandas DataFrame: join items within range based on their geographic coordinates (longitude and latitude)


You can use:

from math import radians, cos, sin, asin, sqrt

def haversine(lon1, lat1, lon2, lat2):
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
    # haversine formula
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * asin(sqrt(a))
    r = 6371 # Radius of earth in kilometers. Use 3956 for miles
    return c * r
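As a quick sanity check, the function can be called on a single pair of points. The sketch below repeats the function so that it runs on its own and uses the Berlin and Potsdam coordinates from the sample output in this answer; the variable names `berlin`, `potsdam` and `d` are illustrative only. Note the argument order: longitude first, then latitude.

```python
from math import radians, cos, sin, asin, sqrt

def haversine(lon1, lat1, lon2, lat2):
    # Convert decimal degrees to radians before applying the formula
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * asin(sqrt(a))
    r = 6371  # Radius of earth in kilometers
    return c * r

# Illustrative (lng, lat) tuples taken from the sample data
berlin = (13.41053, 52.52437)
potsdam = (13.06566, 52.39886)

d = haversine(*berlin, *potsdam)
print(round(d, 1))  # roughly 27.2 km, in line with the dist column in the sample output
```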

First, cross join the DataFrame with itself using merge, then use boolean indexing to remove the rows where city_x and city_y hold the same value:

df['tmp'] = 1
df = pd.merge(df,df,on='tmp')
df = df[df.city_x != df.city_y]
print (df)

    city_x     lat_x     lng_x  tmp   city_y     lat_y     lng_y
1   Berlin  52.52437  13.41053    1  Potsdam  52.39886  13.06566
2   Berlin  52.52437  13.41053    1  Hamburg  53.57532  10.01534
3  Potsdam  52.39886  13.06566    1   Berlin  52.52437  13.41053
5  Potsdam  52.39886  13.06566    1  Hamburg  53.57532  10.01534
6  Hamburg  53.57532  10.01534    1   Berlin  52.52437  13.41053
7  Hamburg  53.57532  10.01534    1  Potsdam  52.39886  13.06566
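The snippet above assumes an existing DataFrame `df` with `city`, `lat` and `lng` columns. The following is a minimal sketch that rebuilds such a frame from the values in the printed output and runs the same self cross join. It also shows `how='cross'`, which is available in pandas 1.2 and later and avoids the `tmp` helper column; that variant is an alternative, not the method used in this answer, and the names `pairs` and `pairs_cross` are illustrative.

```python
import pandas as pd

# Sample frame reconstructed from the printed output (an assumption about the input)
df = pd.DataFrame({
    'city': ['Berlin', 'Potsdam', 'Hamburg'],
    'lat': [52.52437, 52.39886, 53.57532],
    'lng': [13.41053, 13.06566, 10.01534],
})

# Helper-column approach: every row shares tmp == 1, so the merge pairs all rows
tmp = df.assign(tmp=1)
pairs = pd.merge(tmp, tmp, on='tmp')
pairs = pairs[pairs.city_x != pairs.city_y]

# Alternative for pandas >= 1.2: a built-in cross join, no helper column needed
pairs_cross = pd.merge(df, df, how='cross')
pairs_cross = pairs_cross[pairs_cross.city_x != pairs_cross.city_y]

print(len(pairs), len(pairs_cross))
```

Both variants produce the same six ordered city pairs; the only difference is the extra `tmp` column in the first result.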

Then apply the haversine function:

df['dist'] = df.apply(lambda row: haversine(row['lng_x'], row['lat_x'], row['lng_y'], row['lat_y']), axis=1)

Filter by distance:

df = df[df.dist < 500]
print (df)

    city_x     lat_x     lng_x  tmp   city_y     lat_y     lng_y        dist
1   Berlin  52.52437  13.41053    1  Potsdam  52.39886  13.06566   27.215704
2   Berlin  52.52437  13.41053    1  Hamburg  53.57532  10.01534  255.223782
3  Potsdam  52.39886  13.06566    1   Berlin  52.52437  13.41053   27.215704
5  Potsdam  52.39886  13.06566    1  Hamburg  53.57532  10.01534  242.464120
6  Hamburg  53.57532  10.01534    1   Berlin  52.52437  13.41053  255.223782
7  Hamburg  53.57532  10.01534    1  Potsdam  52.39886  13.06566  242.464120
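With a 500 km threshold every pair in the sample survives the filter, so the effect of the step is hard to see. The sketch below runs the same pipeline end to end with a hypothetical 100 km radius, which should leave only the Berlin and Potsdam pairs. The input frame is reconstructed from the printed output, and `radius_km`, `pairs` and `near` are illustrative names.

```python
from math import radians, cos, sin, asin, sqrt

import pandas as pd

def haversine(lon1, lat1, lon2, lat2):
    # Great-circle distance in kilometers between two points given in degrees
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    return 2 * asin(sqrt(a)) * 6371

# Sample frame reconstructed from the printed output (an assumption about the input)
df = pd.DataFrame({
    'city': ['Berlin', 'Potsdam', 'Hamburg'],
    'lat': [52.52437, 52.39886, 53.57532],
    'lng': [13.41053, 13.06566, 10.01534],
})

# Cross join, then drop the rows that pair a city with itself
df['tmp'] = 1
pairs = pd.merge(df, df, on='tmp')
pairs = pairs[pairs.city_x != pairs.city_y].copy()

# Row-wise distance; haversine expects longitude before latitude
pairs['dist'] = pairs.apply(
    lambda row: haversine(row['lng_x'], row['lat_x'], row['lng_y'], row['lat_y']),
    axis=1,
)

radius_km = 100  # hypothetical threshold, tighter than the 500 km used above
near = pairs[pairs.dist < radius_km]
print(near[['city_x', 'city_y', 'dist']])
```

The `.copy()` after the boolean filter is there so that adding the `dist` column writes to an independent frame rather than to a slice of the merged one.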

Finally, use groupby to create a list of the cities within range, or to get their count with size:

df1 = df.groupby('city_x')['city_y'].apply(list)
print (df1)

city_x
Berlin     [Potsdam, Hamburg]
Hamburg     [Berlin, Potsdam]
Potsdam     [Berlin, Hamburg]
Name: city_y, dtype: object

df2 = df.groupby('city_x')['city_y'].size()
print (df2)

city_x
Berlin     2
Hamburg    2
Potsdam    2
dtype: int64
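If both the list and the count are wanted in one result, named aggregation can produce them together. This is a sketch rather than part of the original answer: the `pairs` frame is typed in by hand from the filtered output above, and the column names `neighbors` and `count` are made up for illustration.

```python
import pandas as pd

# Filtered pairs, copied by hand from the printed output (illustrative input)
pairs = pd.DataFrame({
    'city_x': ['Berlin', 'Berlin', 'Potsdam', 'Potsdam', 'Hamburg', 'Hamburg'],
    'city_y': ['Potsdam', 'Hamburg', 'Berlin', 'Hamburg', 'Berlin', 'Potsdam'],
})

# One groupby pass: collect the neighbor names and count them side by side
summary = pairs.groupby('city_x').agg(
    neighbors=('city_y', list),
    count=('city_y', 'size'),
)
print(summary)
```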

It is also possible to use the vectorized numpy haversine solution:

import numpy as np

def haversine_np(lon1, lat1, lon2, lat2):
    """
    Calculate the great circle distance between two points
    on the earth (specified in decimal degrees)

    All args must be of equal length.
    """
    lon1, lat1, lon2, lat2 = map(np.radians, [lon1, lat1, lon2, lat2])

    dlon = lon2 - lon1
    dlat = lat2 - lat1

    a = np.sin(dlat/2.0)**2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon/2.0)**2

    c = 2 * np.arcsin(np.sqrt(a))
    # Earth radius of 6367 km here versus 6371 km in haversine() above,
    # which is why the distances below differ slightly from the earlier output
    km = 6367 * c
    return km

df['tmp'] = 1
df = pd.merge(df,df,on='tmp')
df = df[df.city_x != df.city_y]
#print (df)

# Whole columns are passed at once, so no row-wise apply is needed
df['dist'] = haversine_np(df['lng_x'],df['lat_x'],df['lng_y'],df['lat_y'])
print (df)

    city_x     lat_x     lng_x  tmp   city_y     lat_y     lng_y        dist
1   Berlin  52.52437  13.41053    1  Potsdam  52.39886  13.06566   27.198616
2   Berlin  52.52437  13.41053    1  Hamburg  53.57532  10.01534  255.063541
3  Potsdam  52.39886  13.06566    1   Berlin  52.52437  13.41053   27.198616
5  Potsdam  52.39886  13.06566    1  Hamburg  53.57532  10.01534  242.311890
6  Hamburg  53.57532  10.01534    1   Berlin  52.52437  13.41053  255.063541
7  Hamburg  53.57532  10.01534    1  Potsdam  52.39886  13.06566  242.311890
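Because the numpy version operates on whole arrays, the cross join itself can be replaced by broadcasting when the number of points is small enough for an N x N matrix to fit in memory. The sketch below is an alternative, not part of the original answer: it reuses the same formula (with the 6367 km radius) on column and row vectors to build the full distance matrix. The frame `cities` and the names `dist`, `within` and `neighbors` are illustrative.

```python
import numpy as np
import pandas as pd

def haversine_np(lon1, lat1, lon2, lat2):
    # Same formula as above; inputs may be any shapes that broadcast together
    lon1, lat1, lon2, lat2 = map(np.radians, [lon1, lat1, lon2, lat2])
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = np.sin(dlat/2.0)**2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon/2.0)**2
    return 6367 * 2 * np.arcsin(np.sqrt(a))

# Sample frame reconstructed from the printed output (an assumption about the input)
cities = pd.DataFrame({
    'city': ['Berlin', 'Potsdam', 'Hamburg'],
    'lat': [52.52437, 52.39886, 53.57532],
    'lng': [13.41053, 13.06566, 10.01534],
})

lng = cities['lng'].to_numpy()
lat = cities['lat'].to_numpy()
names = cities['city'].to_numpy()

# (N, 1) column vectors against (1, N) row vectors broadcast to an (N, N) matrix
dist = haversine_np(lng[:, None], lat[:, None], lng[None, :], lat[None, :])

# Keep pairs under 500 km and mask the diagonal (each city paired with itself)
within = (dist < 500) & ~np.eye(len(cities), dtype=bool)

# Map each city to the names of the cities within range
neighbors = {city: names[row].tolist() for city, row in zip(names, within)}
print(neighbors)
```

The trade-off is memory: the matrix grows quadratically with the number of points, just as the merged frame does, but it avoids building the intermediate joined DataFrame.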


When reposting, please credit the source: this article is reposted from www.mshxw.com
Article URL: https://www.mshxw.com/it/653298.html