
Unequal joins in pandas?

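The pandas merge() function supports outer, left, and right joins (not only inner joins) between two data frames, so you can return records that have no match. In addition, merge() can be generalized to return a cross join (every combination of rows from the two data frames), which can then be filtered down to the unmatched records. There is also the pandas isin() method.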

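Consider the following demonstration. Below are two data frames on a favorite subject, computer languages. As shown, the first data frame is a subset of the second. The outer join returns the records of both frames, with NaN in the columns that have no match, which can then be filtered on. The cross join returns every row paired with every row, which can also be filtered, and isin() searches for values within a column: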

import pandas as pd

df1 = pd.DataFrame({'Languages': ['C++', 'C', 'Java', 'C#', 'Python', 'PHP'],
                    'Uses': ['computing', 'computing', 'application', 'application', 'application', 'web'],
                    'Type': ['Proprietary', 'Proprietary', 'Proprietary', 'Proprietary', 'Open-Source', 'Open-Source']})

df2 = pd.DataFrame({'Languages': ['C++', 'C', 'Java', 'C#', 'Python', 'PHP',
                                  'Perl', 'R', 'Ruby', 'VB.NET', 'Javascript', 'Matlab'],
                    'Uses': ['computing', 'computing', 'application', 'application', 'application', 'web',
                             'application', 'computing', 'web', 'application', 'web', 'computing'],
                    'Type': ['Proprietary', 'Proprietary', 'Proprietary', 'Proprietary', 'Open-Source',
                             'Open-Source', 'Open-Source', 'Open-Source', 'Open-Source', 'Proprietary',
                             'Open-Source', 'Proprietary']})

# OUTER JOIN
mergedf = pd.merge(df1, df2, on=['Languages'], how='outer')

# KEEP ROWS WHERE THE SMALLER FRAME'S COLUMNS ARE NULL (LANGUAGES ONLY IN LARGER)
mergedf = mergedf[pd.isnull(mergedf['Type_x'])][['Languages', 'Uses_y', 'Type_y']]
# (row order and index labels may differ across pandas versions)
#      Languages       Uses_y       Type_y
# 6         Perl  application  Open-Source
# 7            R    computing  Open-Source
# 8         Ruby          web  Open-Source
# 9       VB.NET  application  Proprietary
# 10  Javascript          web  Open-Source
# 11      Matlab    computing  Proprietary

# ISIN COMPARISON, RETURNING RECORDS IN LARGER NOT IN SMALLER
unequaldf = df2[~df2.Languages.isin(df1['Languages'])]
#      Languages         Uses         Type
# 6         Perl  application  Open-Source
# 7            R    computing  Open-Source
# 8         Ruby          web  Open-Source
# 9       VB.NET  application  Proprietary
# 10  Javascript          web  Open-Source
# 11      Matlab    computing  Proprietary

# CROSS JOIN (REQUIRES A JOIN KEY OF SAME VALUE IN BOTH FRAMES)
df1['key'] = 1
df2['key'] = 1
crossjoindf = pd.merge(df1, df2, on=['key'])

# FILTER FOR LANGUAGES IN LARGER NOT IN SMALLER (ALSO USING ISIN)
crossjoindf = (crossjoindf[~crossjoindf['Languages_y'].isin(crossjoindf['Languages_x'])]
               [['Languages_y', 'Uses_y', 'Type_y']].drop_duplicates())
#    Languages_y       Uses_y       Type_y
# 6         Perl  application  Open-Source
# 7            R    computing  Open-Source
# 8         Ruby          web  Open-Source
# 9       VB.NET  application  Proprietary
# 10  Javascript          web  Open-Source
# 11      Matlab    computing  Proprietary
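
As a variation on the outer-join filter, merge() also accepts indicator=True, which adds a _merge column marking each row as left_only, right_only, or both. This avoids testing a data column for nulls. The sketch below is only an illustration: the frames small and large are hypothetical stand-ins, not the frames from the demonstration.

```python
import pandas as pd

# Hypothetical sample frames for illustration; `small` is a subset of `large`.
small = pd.DataFrame({'Languages': ['C', 'Python'],
                      'Uses': ['computing', 'application']})
large = pd.DataFrame({'Languages': ['C', 'Python', 'Perl', 'R'],
                      'Uses': ['computing', 'application', 'application', 'computing']})

# The outer join keeps every key; `_merge` records which side each row came from.
merged = pd.merge(small, large, on='Languages', how='outer', indicator=True)

# Keep only the rows that exist in the right-hand (larger) frame alone.
right_only = merged.loc[merged['_merge'] == 'right_only', ['Languages', 'Uses_y']]

# Sort the names so the result does not depend on the join's row order.
only_in_large = sorted(right_only['Languages'])
print(only_in_large)
```

Filtering on _merge also works when the data columns themselves legitimately contain NaN, a case where the null test on Type_x could give misleading results.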

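Admittedly, the cross join is probably redundant and verbose here, but it can be handy if your unequal-join needs require permutations across data frames.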
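
For a join on an actual inequality condition, one option is to build the cross join and then filter it. The sketch below assumes pandas 1.2 or later, where merge() accepts how='cross' and no dummy key column is needed. The events and windows frames and their columns are hypothetical, chosen only to show a range condition.

```python
import pandas as pd

# Hypothetical data: match each event to every window whose range contains it.
events = pd.DataFrame({'event': ['a', 'b', 'c'], 'time': [1, 5, 9]})
windows = pd.DataFrame({'window': ['w1', 'w2'], 'start': [0, 4], 'end': [3, 10]})

# Cross join: every event paired with every window (3 x 2 = 6 rows).
pairs = pd.merge(events, windows, how='cross')

# Inequality filter: keep the pairs where start <= time <= end.
matched = pairs[(pairs['time'] >= pairs['start']) & (pairs['time'] <= pairs['end'])]

result = list(zip(matched['event'], matched['window']))
print(result)
```

The intermediate frame has len(events) * len(windows) rows, so this approach is best suited to small inputs.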


