Subclassing a pandas DataFrame: any updates?

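This is my approach. I followed the advice found in:

  • Subclassing pandas data structures
  • Fix for the __finalize__ issue

The example below only shows how to construct a new subclass of pandas.DataFrame. If you follow the advice in the first link, you may also want to subclass pandas.Series, so that one-dimensional slices of your pandas.DataFrame subclass are handled as well.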
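As a rough illustration of the pandas.Series suggestion above, here is a minimal sketch that pairs a DataFrame subclass with a Series subclass. The class names SomeFrame and SomeSeries are hypothetical and not part of the original answer. The properties _constructor, _constructor_sliced and _constructor_expanddim are the hooks pandas documents for subclassing.

```python
import pandas as pd


class SomeSeries(pd.Series):
    """Hypothetical 1-D companion of SomeFrame (illustrative name)."""

    @property
    def _constructor(self):
        # Series operations that return a Series keep this type.
        return SomeSeries

    @property
    def _constructor_expanddim(self):
        # Going from 1-D to 2-D (e.g. to_frame) yields a SomeFrame.
        return SomeFrame


class SomeFrame(pd.DataFrame):
    """Hypothetical DataFrame subclass (illustrative name)."""

    @property
    def _constructor(self):
        # Frame operations that return a frame keep this type.
        return SomeFrame

    @property
    def _constructor_sliced(self):
        # Taking a single column or row yields a SomeSeries.
        return SomeSeries


frame = SomeFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
column = frame["A"]          # one-dimensional slice
subframe = frame[["A"]]      # still two-dimensional
roundtrip = column.to_frame()

print(type(column).__name__, type(subframe).__name__, type(roundtrip).__name__)
```

Without _constructor_sliced, a column taken from the subclass would come back as a plain pandas.Series and lose any methods defined on the subclass.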

Defining SomeData

import pandas as pd
import numpy as np


class SomeData(pd.DataFrame):
    # This class variable tells Pandas the name of the attributes
    # that are to be ported over to derivative DataFrames.  There
    # is a method named `__finalize__` that grabs these attributes
    # and assigns them to newly created `SomeData`
    _metadata = ['my_attr']

    @property
    def _constructor(self):
        """This is the key to letting Pandas know how to keep
        derivative `SomeData` the same type as yours.  It should
        be enough to return the name of the Class.  However, in
        some cases, `__finalize__` is not called and `my_attr` is
        not carried over.  We can fix that by constructing a callable
        that makes sure to call `__finalize__` every time."""
        def _c(*args, **kwargs):
            return SomeData(*args, **kwargs).__finalize__(self)
        return _c

    def __init__(self, *args, **kwargs):
        # grab the keyword argument that is supposed to be my_attr
        self.my_attr = kwargs.pop('my_attr', None)
        super().__init__(*args, **kwargs)

    def my_method(self, other):
        return self * np.sign(self - other)

Demonstration

mydata = SomeData(dict(A=[1, 2, 3], B=[4, 5, 6]), my_attr='an attr')
print(mydata, type(mydata), mydata.my_attr, sep='\n' * 2)

   A  B
0  1  4
1  2  5
2  3  6

<class '__main__.SomeData'>

an attr

newdata = mydata.mul(2)
print(newdata, type(newdata), newdata.my_attr, sep='\n' * 2)

   A   B
0  2   8
1  4  10
2  6  12

<class '__main__.SomeData'>

an attr

newerdata = mydata.my_method(newdata)
print(newerdata, type(newerdata), newerdata.my_attr, sep='\n' * 2)

   A  B
0 -1 -4
1 -2 -5
2 -3 -6

<class '__main__.SomeData'>

an attr

Gotchas

This approach breaks the pd.DataFrame.equals method:

newerdata.equals(newdata)  # Should be `False`

TypeError                                 Traceback (most recent call last)
in ()
----> 1 newerdata.equals(newdata)

~/anaconda3/envs/3.6.ml/lib/python3.6/site-packages/pandas/core/generic.py in equals(self, other)
   1034         the same location are considered equal.
   1035         """
-> 1036         if not isinstance(other, self._constructor):
   1037             return False
   1038         return self._data.equals(other._data)

TypeError: isinstance() arg 2 must be a type or tuple of types

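What happens is that this method expects to find an object of type type in the _constructor attribute. Instead, it finds the callable that I placed there to fix the __finalize__ issue I ran into.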
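The failure can be reproduced without pandas. The sketch below uses a hypothetical stand-in class Frame and a hypothetical factory function make_frame (neither is a pandas name) to show that isinstance accepts a class as its second argument but rejects an arbitrary callable, which is the situation equals ran into in the traceback above.

```python
class Frame:
    """Stand-in for a DataFrame subclass (hypothetical, not a pandas class)."""


def make_frame(*args, **kwargs):
    # Plays the role of the callable returned by `_constructor`:
    # it builds instances, but it is a function, not a type.
    return Frame()


obj = make_frame()

# Passing the class itself works as usual.
print(isinstance(obj, Frame))

# Passing the factory function raises TypeError, because the second
# argument of isinstance must be a type (or a tuple of types).
try:
    isinstance(obj, make_frame)
    raised = False
except TypeError as exc:
    raised = True
    print("TypeError:", exc)
```

The exact wording of the error message varies between Python versions, so it is printed here rather than quoted.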

Workaround

Override the equals method by adding the following to your class definition.

    def equals(self, other):
        try:
            pd.testing.assert_frame_equal(self, other)
            return True
        except AssertionError:
            return False

newerdata.equals(newdata)  # Should be `False`
False
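A possible alternative, not from the original answer, is to convert both operands to plain DataFrames and let the stock equals do the comparison. This avoids exceptions as control flow and keeps the exact-comparison behavior of DataFrame.equals, whereas pd.testing.assert_frame_equal by default compares floating-point values with a tolerance. The sketch below is a trimmed-down SomeData, with the my_attr handling omitted for brevity, under the assumption that converting with pd.DataFrame(...) is acceptable for your data.

```python
import pandas as pd


class SomeData(pd.DataFrame):
    @property
    def _constructor(self):
        # Same callable-based constructor as in the answer above.
        def _c(*args, **kwargs):
            return SomeData(*args, **kwargs).__finalize__(self)
        return _c

    def equals(self, other):
        # Alternative sketch: compare plain DataFrame versions of both
        # operands so that `_constructor` is never passed to isinstance.
        if not isinstance(other, pd.DataFrame):
            return False
        return pd.DataFrame(self).equals(pd.DataFrame(other))


mydata = SomeData(dict(A=[1, 2, 3], B=[4, 5, 6]))
same = SomeData(dict(A=[1, 2, 3], B=[4, 5, 6]))
doubled = mydata.mul(2)

print(mydata.equals(same), mydata.equals(doubled))
```

Note that this variant ignores metadata such as my_attr when comparing. Whether that is desirable depends on what "equal" should mean for the subclass.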

