如何展平一个JSON
或dict
一个常见问题,对此有很多答案。
- 此答案集中于使用
flatten_json
递归展平嵌套dict
或JSON
假设:
- 该答案假定您已经将
JSON
或dict
加载到了某个变量(例如,文件,api等)中
如何data
加载到flatten_json
:
最常见的形式data
:
- 只是一句话:
{} - 字典列表:
[{}, {}, {}] - 带有顶级密钥的JSON,其中值重复:
{1: {}, 2: {}, 3: {}} - 其他
实际示例:
- 我通常会扁平
data
化成pandas.Dataframe
- 载
pandas
有import pandas as pd
资料1:
{ "id": 1, "class": "c1", "owner": "myself", "metadata": { "m1": { "value": "m1_1", "timestamp": "d1" }, "m2": { "value": "m1_2", "timestamp": "d2" }, "m3": { "value": "m1_3", "timestamp": "d3" }, "m4": { "value": "m1_4", "timestamp": "d4" } }, "a1": { "a11": [ ] }, "m1": {}, "comm1": "COMM1", "comm2": "COMM21529089656387", "share": "xxx", "share1": "yyy", "hub1": "h1", "hub2": "h2", "context": [ ]}
展平1:
df = pd.Dataframe([flatten_json(data)]) id class owner metadata_m1_value metadata_m1_timestamp metadata_m2_value metadata_m2_timestamp metadata_m3_value metadata_m3_timestamp metadata_m4_value metadata_m4_timestamp comm1 comm2 share share1 hub1 hub2 1 c1 myself m1_1 d1 m1_2 d2 m1_3 d3 m1_4 d4 COMM1 COMM21529089656387 xxx yyy h1 h2
资料2:
[{ 'accuracy': 17, 'activity': [{ 'activity': [{ 'confidence': 100, 'type': 'STILL' } ], 'timestampMs': '1542652' } ], 'altitude': -10, 'latitudeE7': 3777321, 'longitudeE7': -122423125, 'timestampMs': '1542654', 'verticalAccuracy': 2 }, { 'accuracy': 17, 'activity': [{ 'activity': [{ 'confidence': 100, 'type': 'STILL' } ], 'timestampMs': '1542652' } ], 'altitude': -10, 'latitudeE7': 3777321, 'longitudeE7': -122423125, 'timestampMs': '1542654', 'verticalAccuracy': 2 }, { 'accuracy': 17, 'activity': [{ 'activity': [{ 'confidence': 100, 'type': 'STILL' } ], 'timestampMs': '1542652' } ], 'altitude': -10, 'latitudeE7': 3777321, 'longitudeE7': -122423125, 'timestampMs': '1542654', 'verticalAccuracy': 2 }]
展平2:
df = pd.Dataframe([flatten_json(x) for x in data]) accuracy activity_0_activity_0_confidence activity_0_activity_0_type activity_0_timestampMs altitude latitudeE7 longitudeE7 timestampMs verticalAccuracy17 100STILL 1542652 -10 3777321 -122423125 1542654 217 100STILL 1542652 -10 3777321 -122423125 1542654 217 100STILL 1542652 -10 3777321 -122423125 1542654 2
资料3:
{ "1": { "VENUE": "JOEBURG", "COUNTRY": "HAE", "ITW": "XAD", "RACES": { "1": { "NO": 1, "TIME": "12:35" }, "2": { "NO": 2, "TIME": "13:10" }, "3": { "NO": 3, "TIME": "13:40" }, "4": { "NO": 4, "TIME": "14:10" }, "5": { "NO": 5, "TIME": "14:55" }, "6": { "NO": 6, "TIME": "15:30" }, "7": { "NO": 7, "TIME": "16:05" }, "8": { "NO": 8, "TIME": "16:40" } } }, "2": { "VENUE": "FOOBURG", "COUNTRY": "ABA", "ITW": "XAD", "RACES": { "1": { "NO": 1, "TIME": "12:35" }, "2": { "NO": 2, "TIME": "13:10" }, "3": { "NO": 3, "TIME": "13:40" }, "4": { "NO": 4, "TIME": "14:10" }, "5": { "NO": 5, "TIME": "14:55" }, "6": { "NO": 6, "TIME": "15:30" }, "7": { "NO": 7, "TIME": "16:05" }, "8": { "NO": 8, "TIME": "16:40" } } }}
展平3:
df = pd.Dataframe([flatten_json(data[key]) for key in data.keys()]) VENUE COUNTRY ITW RACES_1_NO RACES_1_TIME RACES_2_NO RACES_2_TIME RACES_3_NO RACES_3_TIME RACES_4_NO RACES_4_TIME RACES_5_NO RACES_5_TIME RACES_6_NO RACES_6_TIME RACES_7_NO RACES_7_TIME RACES_8_NO RACES_8_TIME JOEBURG HAE XAD1 12:352 13:103 13:404 14:105 14:556 15:307 16:058 16:40 FOOBURG ABA XAD1 12:352 13:103 13:404 14:105 14:556 15:307 16:058 16:40