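The answers to the post "How to define the structure of a sankey diagram using a dataframe?" will show you that forcing your Sankey data sources into a single dataframe quickly leads to confusion. You're better off separating the nodes from the links, since they are constructed differently.
So your nodes dataframe should look like this: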
ID  Label                Color
0   AKJ Education        #4994CE
1   Amazon               #8A5988
2   Flipkart             #449E9E
3   Books                #7FC241
4   Computers & tablets  #D3D3D3
5   Other                #4994CE
And your links dataframe should look like this:
Source  Target  Value    link Color
0       3       846888   rgba(127, 194, 65, 0.2)
0       4       1045     rgba(127, 194, 65, 0.2)
1       3       1294423  rgba(211, 211, 211, 0.5)
1       4       42165    rgba(211, 211, 211, 0.5)
1       5       415      rgba(211, 211, 211, 0.5)
2       5       1        rgba(253, 227, 212, 1)
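Some of the link colors above are simply node colors with an opacity added (for example, #7FC241 for Books becomes rgba(127, 194, 65, 0.2)). If you'd rather derive those strings than type them by hand, a small helper along these lines might do. This is only a sketch: hex_to_rgba is a hypothetical name of my own, not part of pandas or plotly, and it assumes 6-digit hex colors.

```python
def hex_to_rgba(hex_color: str, alpha: float) -> str:
    """Convert a '#RRGGBB' color into an 'rgba(r, g, b, a)' string.

    Hypothetical helper for illustration; assumes a 6-digit hex color.
    """
    value = hex_color.lstrip("#")
    if len(value) != 6:
        raise ValueError(f"expected a 6-digit hex color, got {hex_color!r}")
    # Read the red, green and blue channels two hex digits at a time.
    r, g, b = (int(value[i:i + 2], 16) for i in (0, 2, 4))
    return f"rgba({r}, {g}, {b}, {alpha})"


# Books node color with 20% opacity, as used for the AKJ links in the table.
print(hex_to_rgba("#7FC241", 0.2))
```

You could then fill the 'link Color' column by calling the helper on whichever node color you want each link to inherit.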
Now, if you use a setup similar to the Scottish referendum diagram on plot.ly, you can build the Sankey diagram from these two dataframes.
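Because of the huge differences between the values, that particular diagram looks a bit odd. For illustration purposes, I've replaced all your values with 1.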
Here's the whole thing for an easy copy & paste into a Jupyter notebook:
# imports
import pandas as pd
import numpy as np
import plotly.graph_objs as go
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
init_notebook_mode(connected=True)

# Nodes & links
nodes = [['ID', 'Label', 'Color'],
         [0, 'AKJ Education', '#4994CE'],
         [1, 'Amazon', '#8A5988'],
         [2, 'Flipkart', '#449E9E'],
         [3, 'Books', '#7FC241'],
         [4, 'Computers & tablets', '#D3D3D3'],
         [5, 'Other', '#4994CE'],
]

# links with every value set to 1, for illustrative purposes
links = [['Source', 'Target', 'Value', 'link Color'],

         # AKJ
         [0, 3, 1, 'rgba(127, 194, 65, 0.2)'],
         [0, 4, 1, 'rgba(127, 194, 65, 0.2)'],

         # Amazon
         [1, 3, 1, 'rgba(211, 211, 211, 0.5)'],
         [1, 4, 1, 'rgba(211, 211, 211, 0.5)'],
         [1, 5, 1, 'rgba(211, 211, 211, 0.5)'],

         # Flipkart
         [2, 5, 1, 'rgba(253, 227, 212, 1)'],
         [2, 3, 1, 'rgba(253, 227, 212, 1)'],
]

# links with your original data (uncomment to use) ##############
# links = [
#     ['Source', 'Target', 'Value', 'link Color'],
#
#     # AKJ
#     [0, 3, 846888, 'rgba(127, 194, 65, 0.2)'],
#     [0, 4, 1045, 'rgba(127, 194, 65, 0.2)'],
#
#     # Amazon
#     [1, 3, 1294423, 'rgba(211, 211, 211, 0.5)'],
#     [1, 4, 42165, 'rgba(211, 211, 211, 0.5)'],
#     [1, 5, 415, 'rgba(211, 211, 211, 0.5)'],
#
#     # Flipkart
#     [2, 5, 1, 'rgba(253, 227, 212, 1)'],
# ]
#################################################################

# Retrieve headers and build dataframes
# (the class is pd.DataFrame, with a capital F)
nodes_headers = nodes.pop(0)
links_headers = links.pop(0)
df_nodes = pd.DataFrame(nodes, columns=nodes_headers)
df_links = pd.DataFrame(links, columns=links_headers)

# Sankey plot setup
data_trace = dict(
    type='sankey',
    domain=dict(
        x=[0, 1],
        y=[0, 1]
    ),
    orientation="h",
    valueformat=".0f",
    node=dict(
        pad=10,
        # thickness = 30,
        line=dict(
            color="black",
            width=0
        ),
        label=df_nodes['Label'].dropna(axis=0, how='any'),
        color=df_nodes['Color']
    ),
    link=dict(
        source=df_links['Source'].dropna(axis=0, how='any'),
        target=df_links['Target'].dropna(axis=0, how='any'),
        value=df_links['Value'].dropna(axis=0, how='any'),
        color=df_links['link Color'].dropna(axis=0, how='any'),
    )
)

layout = dict(
    title="Draw Sankey Diagram from dataframes",
    height=772,
    font=dict(
        size=10),
)

fig = dict(data=[data_trace], layout=layout)
iplot(fig, validate=False)
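Since the figure is drawn with validate=False, a typo in the Source or Target column can be hard to spot. It may be worth sanity-checking the rows first. The sketch below is my own suggestion, not something plotly provides: check_links is a hypothetical helper, and it assumes the header rows have already been removed and that node IDs are meant to be 0, 1, 2, ... in row order, matching the positions of the labels.

```python
def check_links(nodes, links):
    """Return a list of human-readable problems found in the node/link rows.

    Hypothetical helper: expects rows WITHOUT the header row, i.e.
    nodes as [id, label, color] and links as [source, target, value, color].
    """
    problems = []
    node_ids = [row[0] for row in nodes]
    # Sources and targets index into the label list, so IDs should be 0..n-1.
    if node_ids != list(range(len(nodes))):
        problems.append("node IDs are not 0..n-1 in row order")
    valid = set(node_ids)
    for source, target, value, _color in links:
        if source not in valid:
            problems.append(f"link {source}->{target}: unknown source {source}")
        if target not in valid:
            problems.append(f"link {source}->{target}: unknown target {target}")
        if value < 0:
            problems.append(f"link {source}->{target}: negative value {value}")
    return problems


example_nodes = [[0, 'AKJ Education', '#4994CE'], [1, 'Books', '#7FC241']]
example_links = [[0, 1, 846888, 'rgba(127, 194, 65, 0.2)'],
                 [0, 7, 1045, 'rgba(127, 194, 65, 0.2)']]  # 7 is a deliberate typo
for problem in check_links(example_nodes, example_links):
    print(problem)
```

In the notebook above you would call it right after the two pop(0) lines, while nodes and links are still plain lists.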



