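Here is a complete example based on the pandas groupby and sum functions. The basic idea is to group the data by the 'Localization' column and apply a function to each group.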
import pandas as pd
from io import StringIO
# For Python 2, replace previous line with: from StringIO import StringIO

data = """Localization,RNA level,Size
cytoplasm ,1 Non-expressed, 7
cytoplasm ,2 Very low ,13
cytoplasm ,3 Low , 8
cytoplasm ,4 Medium , 6
cytoplasm ,5 Moderate , 8
cytoplasm ,6 High , 2
cytoplasm ,7 Very high , 6
cytoplasm & nucleus ,1 Non-expressed, 5
cytoplasm & nucleus ,2 Very low , 8
cytoplasm & nucleus ,3 Low , 2
cytoplasm & nucleus ,4 Medium ,10
cytoplasm & nucleus ,5 Moderate ,16
cytoplasm & nucleus ,6 High , 6
cytoplasm & nucleus ,7 Very high , 5
cytoplasm & nucleus & plasma membrane,1 Non-expressed, 6
cytoplasm & nucleus & plasma membrane,2 Very low , 3
cytoplasm & nucleus & plasma membrane,3 Low , 3
cytoplasm & nucleus & plasma membrane,4 Medium , 7
cytoplasm & nucleus & plasma membrane,5 Moderate , 8
cytoplasm & nucleus & plasma membrane,6 High , 4
cytoplasm & nucleus & plasma membrane,7 Very high , 1"""

# Create the dataframe
df = pd.read_csv(StringIO(data))

# Strip padding from the text columns; the result must be assigned back,
# because str.strip() returns a new Series instead of modifying in place
df['Localization'] = df['Localization'].str.strip()
df['RNA level'] = df['RNA level'].str.strip()
df['Size'] = df['Size'].astype(int)

# Share of each row's Size within its Localization group
df['Percent'] = df.groupby('Localization')['Size'].transform(lambda x: x / x.sum())
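As a possible variation on the same idea, the per-group total can be computed with the built-in "sum" aggregation name instead of a lambda, and the result can be checked by confirming that each group's shares add up to 1. This is only a minimal sketch: the four-row sample and the names group_total and check are made up for illustration and are not part of the data above.

```python
import pandas as pd

# Hypothetical four-row sample (not the data from the answer above)
df = pd.DataFrame({
    "Localization": ["cytoplasm", "cytoplasm", "nucleus", "nucleus"],
    "Size": [7, 13, 2, 8],
})

# transform("sum") returns one value per original row:
# the total Size of the group that row belongs to
group_total = df.groupby("Localization")["Size"].transform("sum")
df["Percent"] = df["Size"] / group_total

# Sanity check: the shares within each group should add up to 1
check = df.groupby("Localization")["Percent"].sum()

print(df)
print(check)
```

Passing the string "sum" lets pandas use its own grouped sum rather than calling a Python function once per group, which tends to matter only on larger frames.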


