Reverse column A, take the cumsum, then reverse again:
df['C'] = df.loc[::-1, 'A'].cumsum()[::-1]
import pandas as pd

df = pd.DataFrame(
    {'A': [False, True, False, False, False, True, False, True],
     'B': [0.03771, 0.315414, 0.33248, 0.445505, 0.580156, 0.741551, 0.796944, 0.817563]},
    index=[6, 2, 4, 7, 3, 1, 5, 0])
df['C'] = df.loc[::-1, 'A'].cumsum()[::-1]
print(df)

yields

       A         B  C
6  False  0.037710  3
2   True  0.315414  3
4  False  0.332480  2
7  False  0.445505  2
3  False  0.580156  2
1   True  0.741551  2
5  False  0.796944  1
0   True  0.817563  1
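To see why the double reversal works, the one-liner can be split into its three steps. This is only a minimal sketch on a smaller, made-up Series: the names `s`, `reversed_s`, `running` and `result` and the values are illustrative and not part of the example above. It uses `.iloc[::-1]` to make the positional reversal explicit.

```python
import pandas as pd

# Hypothetical small example; the index labels are deliberately out of order.
s = pd.Series([False, True, False, True], index=[6, 2, 4, 0])

reversed_s = s.iloc[::-1]        # step 1: reverse the row order
running = reversed_s.cumsum()    # step 2: running count of Trues, starting from the last row
result = running.iloc[::-1]      # step 3: restore the original row order

print(result)
```

Each value in `result` is the number of `True` entries at or below that row, which is what column `C` holds in the example above.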
Alternatively, you could count the number of Trues in column A and subtract the (shifted) cumsum:
In [113]: df['A'].sum()-df['A'].shift(1).fillna(0).cumsum()
Out[113]:
6    3
2    3
4    2
7    2
3    2
1    2
5    1
0    1
Name: A, dtype: object
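As a sanity check, the two approaches can be compared directly. The sketch below is a variant under stated assumptions rather than the exact expression above: it casts the column to int first and uses `shift(1, fill_value=0)` in place of `shift(1).fillna(0)`, which keeps an integer dtype instead of the object dtype seen in the output. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {'A': [False, True, False, False, False, True, False, True]},
    index=[6, 2, 4, 7, 3, 1, 5, 0])

# Approach 1: reverse, take the cumsum, reverse back.
reverse_cumsum = df['A'].iloc[::-1].cumsum().iloc[::-1]

# Approach 2 (assumed variant): total number of Trues minus the shifted running count.
a = df['A'].astype(int)
total_minus_shifted = a.sum() - a.shift(1, fill_value=0).cumsum()

# Both Series share the same index order, so an elementwise comparison is safe.
print((reverse_cumsum == total_minus_shifted).all())
```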
But this is significantly slower. Benchmarking both approaches in IPython:
In [116]: df = pd.DataFrame({'A':np.random.randint(2, size=10**5).astype(bool)})

In [117]: %timeit df['A'].sum()-df['A'].shift(1).fillna(0).cumsum()
10 loops, best of 3: 19.8 ms per loop

In [118]: %timeit df.loc[::-1, 'A'].cumsum()[::-1]
1000 loops, best of 3: 701 µs per loop
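Outside IPython, a similar comparison can be run with the standard `timeit` module. This is a rough sketch, not a reproduction of the session above: the function names are hypothetical, the second function uses an int-cast `shift(1, fill_value=0)` variant rather than the exact `fillna` expression, and the timings will depend on the pandas version and the machine.

```python
import timeit

import numpy as np
import pandas as pd

# Random boolean column of the same size as in the benchmark above.
rng = np.random.default_rng(0)
df = pd.DataFrame({'A': rng.integers(2, size=10**5).astype(bool)})


def reverse_cumsum():
    # Reverse, take the cumsum, reverse back.
    return df['A'].iloc[::-1].cumsum().iloc[::-1]


def total_minus_shifted():
    # Assumed variant: total count of Trues minus the shifted running count.
    a = df['A'].astype(int)
    return a.sum() - a.shift(1, fill_value=0).cumsum()


for func in (reverse_cumsum, total_minus_shifted):
    # Best of 3 repeats, 100 calls each, reported per call.
    seconds = min(timeit.repeat(func, number=100, repeat=3)) / 100
    print(f"{func.__name__}: {seconds * 1e6:.0f} µs per call")
```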


