Vectorized approach to set the appropriate elements to NaN
@unutbu's solution should get rid of the value error you were getting. If you also want to vectorize for performance, you can use boolean indexing like so:
    import numpy as np

    # Create mask of positions in x (with float datatype) where NaNs are to be put
    mask = np.asarray(cutoff)[:,None] > np.arange(x.shape[1])

    # Put NaNs into masked region of x for the desired output
    x[mask] = np.nan
Sample run:
    In [92]: x = np.random.randint(0,9,(4,7)).astype(float)

    In [93]: x
    Out[93]:
    array([[ 2.,  1.,  5.,  2.,  5.,  2.,  1.],
           [ 2.,  5.,  7.,  1.,  5.,  4.,  8.],
           [ 1.,  1.,  7.,  4.,  8.,  3.,  1.],
           [ 5.,  8.,  7.,  5.,  0.,  2.,  1.]])

    In [94]: cutoff = [5,3,0,6]

    In [95]: x[np.asarray(cutoff)[:,None] > np.arange(x.shape[1])] = np.nan

    In [96]: x
    Out[96]:
    array([[ nan,  nan,  nan,  nan,  nan,   2.,   1.],
           [ nan,  nan,  nan,   1.,   5.,   4.,   8.],
           [  1.,   1.,   7.,   4.,   8.,   3.,   1.],
           [ nan,  nan,  nan,  nan,  nan,  nan,   1.]])
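The masking step above can be wrapped into a small self-contained function. This is only a sketch: the name `nan_before_cutoff` is made up for illustration, and it assumes you would rather get a float copy back than modify `x` in place (which also lets it accept an integer array, since NaN needs a float dtype).

```python
import numpy as np

def nan_before_cutoff(x, cutoff):
    """Return a float copy of x with the first cutoff[i] entries of row i set to NaN.

    Hypothetical helper, not part of NumPy.
    """
    out = np.array(x, dtype=float)      # copy; NaN cannot be stored in an int array
    cutoff_arr = np.asarray(cutoff)
    # Broadcasting an (nrows, 1) column against an (ncols,) row gives an
    # (nrows, ncols) boolean mask: True where the column index < row's cutoff.
    mask = cutoff_arr[:, None] > np.arange(out.shape[1])
    out[mask] = np.nan
    return out

x = np.arange(12).reshape(3, 4)         # integer input is left untouched
result = nan_before_cutoff(x, [2, 0, 4])
print(result)
```

A cutoff of 0 leaves the row unchanged, and a cutoff equal to the number of columns turns the whole row into NaNs.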
Vectorized approach to directly calculate the row-wise mean of the appropriate elements
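If what you are after is the masked mean values, you can modify the vectorized approach proposed earlier so that you avoid dealing with NaNs altogether and, more importantly, keep x as an integer array. Here's the modified approach: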
    # Get array version of cutoff
    cutoff_arr = np.asarray(cutoff)

    # Mask of positions in x which are to be considered for row-wise mean calculations
    mask1 = cutoff_arr[:,None] <= np.arange(x.shape[1])

    # Mask x, calculate the corresponding sum and thus mean values for each row
    masked_mean_vals = (mask1*x).sum(1)/(x.shape[1] - cutoff_arr)
Here's a sample run for this solution:
    In [61]: x = np.random.randint(0,9,(4,7))

    In [62]: x
    Out[62]:
    array([[5, 0, 1, 2, 4, 2, 0],
           [3, 2, 0, 7, 5, 0, 2],
           [7, 2, 2, 3, 3, 2, 3],
           [4, 1, 2, 1, 4, 6, 8]])

    In [63]: cutoff = [5,3,0,6]

    In [64]: cutoff_arr = np.asarray(cutoff)

    In [65]: mask1 = cutoff_arr[:,None] <= np.arange(x.shape[1])

    In [66]: mask1
    Out[66]:
    array([[False, False, False, False, False,  True,  True],
           [False, False, False,  True,  True,  True,  True],
           [ True,  True,  True,  True,  True,  True,  True],
           [False, False, False, False, False, False,  True]], dtype=bool)

    In [67]: masked_mean_vals = (mask1*x).sum(1)/(x.shape[1] - cutoff_arr)

    In [68]: masked_mean_vals
    Out[68]: array([ 1.        ,  3.5       ,  3.14285714,  8.        ])
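One case the expression above does not cover is a cutoff equal to the number of columns, where the denominator becomes zero. The sketch below is one possible way to guard against that by returning NaN for such rows. The function name `mean_from_cutoff` is hypothetical, and it assumes every cutoff lies between 0 and the number of columns. It reuses the `x` and `cutoff` from the sample run and cross-checks the result against `np.nanmean` on a NaN-masked float copy.

```python
import numpy as np

def mean_from_cutoff(x, cutoff):
    """Row-wise mean of x[i, cutoff[i]:], computed without introducing NaNs into x.

    Hypothetical helper; rows with nothing left to average yield NaN.
    """
    x = np.asarray(x)
    cutoff_arr = np.asarray(cutoff)
    # True at the positions that take part in each row's mean
    keep = cutoff_arr[:, None] <= np.arange(x.shape[1])
    counts = keep.sum(axis=1)            # number of kept elements per row
    sums = (keep * x).sum(axis=1)        # excluded positions contribute 0
    means = np.full(x.shape[0], np.nan)  # default for rows with no kept elements
    np.divide(sums, counts, out=means, where=counts > 0)
    return means

x = np.array([[5, 0, 1, 2, 4, 2, 0],
              [3, 2, 0, 7, 5, 0, 2],
              [7, 2, 2, 3, 3, 2, 3],
              [4, 1, 2, 1, 4, 6, 8]])
cutoff = [5, 3, 0, 6]
means = mean_from_cutoff(x, cutoff)

# Cross-check against the NaN-based route: mask a float copy, then nanmean
x_nan = x.astype(float)
x_nan[np.asarray(cutoff)[:, None] > np.arange(x.shape[1])] = np.nan
reference = np.nanmean(x_nan, axis=1)
print(means, reference)
```

Both routes should agree on this input. The mask-and-sum route has the advantage that `x` keeps its integer dtype throughout.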



