Applying the same operation to every element of a pandas Series

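When processing data with pandas, suppose the data is a two-dimensional array in which each row is one sample and each column is one feature. You may need to apply an operation such as scaling, squaring, or adding a constant to a feature across all samples.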

The approach described here extracts each feature as a Series and then applies the operation to every value in that Series (feature) at once.

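import numpy as np

# Assume X is a DataFrame holding all the data and feature is one feature,
# i.e. one column of X, which pandas returns as a Series
feature = X[0]  # take column 0 as the feature

new_f1 = feature + 1        # add 1 to feature 0 of every sample
new_f2 = feature ** 2       # square
new_f3 = 1 / feature        # reciprocal
new_f4 = np.sqrt(feature)   # square root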
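To make the element-wise behaviour concrete, here is a small runnable sketch. The sample values are made up for illustration, and min-max scaling is only one possible reading of the "scaling" mentioned earlier; it is an assumption, not something the original snippet does. Each expression returns a new Series and leaves `feature` unchanged.

```python
import pandas as pd

# Hypothetical sample data: 4 rows (samples), 2 columns (features)
X = pd.DataFrame([[1, 2], [3, 4], [5, 6], [7, 8]])
feature = X[0]  # column 0 as a Series

plus_one = feature + 1   # every value increased by 1
squared = feature ** 2   # every value squared

# Min-max scaling to the range [0, 1] (illustrative choice of scaling)
scaled = (feature - feature.min()) / (feature.max() - feature.min())

print(plus_one.tolist())
print(squared.tolist())
print(scaled.tolist())
print(feature.tolist())  # the original Series is not modified
```

Because the arithmetic is vectorized, no explicit loop over the rows is needed.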

The following code applies a series of operations to each column of X and shows the result:

import numpy as np
import pandas as pd

if __name__ == '__main__':
    X = [[1, 2], [3, 4], [5, 6], [7, 8]]
    X = pd.DataFrame(X)
    col_nums = 2                   # number of original feature columns
    column_counter = col_nums      # label offset for the next block of new columns
    print("Original data:")
    print(X)
    X_copy = X.copy(deep=True)     # untouched copy of the original features

    # Columns 2 and 3: squares of the original features
    for column in X_copy:
        X[column + column_counter] = X_copy[column] ** 2
    column_counter += col_nums

    # Columns 4 and 5: reciprocals of the original features
    for column in X_copy:
        X[column + column_counter] = 1 / X_copy[column]
    column_counter += col_nums

    # Columns 6 and 7: square roots of the original features
    for column in X_copy:
        X[column + column_counter] = np.sqrt(X_copy[column])

    print("Data after the operations:")
    print(X)

Original data:
   0  1
0  1  2
1  3  4
2  5  6
3  7  8
Data after the operations:
   0  1   2   3         4         5         6         7
0  1  2   1   4  1.000000  0.500000  1.000000  1.414214
1  3  4   9  16  0.333333  0.250000  1.732051  2.000000
2  5  6  25  36  0.200000  0.166667  2.236068  2.449490
3  7  8  49  64  0.142857  0.125000  2.645751  2.828427
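The same arithmetic also works on a whole DataFrame at once, so the per-column loops can be avoided. The sketch below is an alternative, not the method used above: the helper name `expand_features` and the column labels (`x0`, `x0_sq`, and so on) are hypothetical, chosen only so the derived columns are easy to tell apart.

```python
import numpy as np
import pandas as pd

X = pd.DataFrame([[1, 2], [3, 4], [5, 6], [7, 8]])


def expand_features(df):
    """Return df plus squared, reciprocal, and square-root copies of every column."""
    base = df.add_prefix("x")  # integer labels 0, 1 become "x0", "x1"
    parts = [
        base,
        (base ** 2).add_suffix("_sq"),     # element-wise square
        (1 / base).add_suffix("_inv"),     # element-wise reciprocal
        np.sqrt(base).add_suffix("_sqrt"), # element-wise square root
    ]
    # Place the blocks side by side; all parts share the same row index
    return pd.concat(parts, axis=1)


expanded = expand_features(X)
print(expanded.columns.tolist())
print(expanded)
```

Building the parts first and concatenating once keeps the input DataFrame unmodified, at the cost of holding the intermediate blocks in memory.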
