python - Creating a pivot table

I think this is what you want:

    import numpy as np

    data = np.array([[ 4057,     8,  1374],
                     [ 4057,     9,   759],
                     [ 4057,    11,    96],
                     [89205,    16,   146],
                     [89205,    17,   154],
                     [89205,    18,   244]])

    # Unique row/column keys, plus the index of each entry's key
    rows, row_pos = np.unique(data[:, 0], return_inverse=True)
    cols, col_pos = np.unique(data[:, 1], return_inverse=True)

    # Scatter the values into a zero-filled table
    pivot_table = np.zeros((len(rows), len(cols)), dtype=data.dtype)
    pivot_table[row_pos, col_pos] = data[:, 2]

    >>> pivot_table
    array([[1374,  759,   96,    0,    0,    0],
           [   0,    0,    0,  146,  154,  244]])
    >>> rows
    array([ 4057, 89205])
    >>> cols
    array([ 8,  9, 11, 16, 17, 18])
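In case the role of return_inverse is not obvious, here is a minimal sketch of what it returns for the first column of the data above. The variable names (keys, uniq, inverse) are only illustrative and are not part of the answer's code.

```python
import numpy as np

# First column of the example data (the row keys)
keys = np.array([4057, 4057, 4057, 89205, 89205, 89205])

# uniq holds the sorted distinct keys; inverse holds, for every entry
# of keys, the position of its value inside uniq
uniq, inverse = np.unique(keys, return_inverse=True)

# Indexing uniq with inverse rebuilds the original array, which is why
# inverse can be used directly as a row (or column) index into the table
rebuilt = uniq[inverse]
```

So row_pos and col_pos are simply the table coordinates of each input record.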

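This approach has some limitations. The main one is that if the same row/column combination appears more than once, the values are not added together; only one of them (probably the last) is kept. If you want them summed instead, you can abuse scipy's sparse module, although it is a bit convoluted: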

    import numpy as np

    data = np.array([[ 4057,     8,  1374],
                     [ 4057,     9,   759],
                     [ 4057,    11,    96],
                     [89205,    16,   146],
                     [89205,    17,   154],
                     [89205,    18,   244],
                     [ 4057,    11,     4]])   # repeats the (4057, 11) pair

    rows, row_pos = np.unique(data[:, 0], return_inverse=True)
    cols, col_pos = np.unique(data[:, 1], return_inverse=True)

    pivot_table = np.zeros((len(rows), len(cols)), dtype=data.dtype)
    pivot_table[row_pos, col_pos] = data[:, 2]

    >>> pivot_table  # the element at [0, 2] should be 100!
    array([[1374,  759,    4,    0,    0,    0],
           [   0,    0,    0,  146,  154,  244]])

    import scipy.sparse as sps

    # Converting a COO matrix to dense sums duplicate entries.
    # .toarray() replaces the older .A shorthand, which recent SciPy
    # releases no longer provide.
    pivot_table = sps.coo_matrix((data[:, 2], (row_pos, col_pos)),
                                 shape=(len(rows), len(cols))).toarray()

    >>> pivot_table  # now repeated elements are added together
    array([[1374,  759,  100,    0,    0,    0],
           [   0,    0,    0,  146,  154,  244]])
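If pulling in scipy only for this feels heavy, a NumPy-only alternative is np.add.at, which performs unbuffered in-place addition so that repeated indices accumulate. This is a sketch of a different technique from the one in the answer, shown on the same data; it has not been benchmarked against the sparse-matrix approach.

```python
import numpy as np

data = np.array([[ 4057,     8,  1374],
                 [ 4057,     9,   759],
                 [ 4057,    11,    96],
                 [89205,    16,   146],
                 [89205,    17,   154],
                 [89205,    18,   244],
                 [ 4057,    11,     4]])   # repeats the (4057, 11) pair

rows, row_pos = np.unique(data[:, 0], return_inverse=True)
cols, col_pos = np.unique(data[:, 1], return_inverse=True)

pivot_table = np.zeros((len(rows), len(cols)), dtype=data.dtype)

# Unlike pivot_table[row_pos, col_pos] += data[:, 2], which applies only
# one update per repeated index, np.add.at adds every value in turn
np.add.at(pivot_table, (row_pos, col_pos), data[:, 2])

print(pivot_table)
```

The element at [0, 2] comes out as 96 + 4 = 100, matching the sparse-matrix result.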
