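This crawl calls the Amap (高德) administrative district query API to fetch the latest data (provinces, cities, districts, townships and streets, with their longitude/latitude, administrative level, city code, administrative code and so on) and stores it in a MySQL database through pymysql.
Table structure: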
DROP TABLE IF EXISTS `districtsx`;
CREATE TABLE `districtsx` (
  `districtId` int(11) NOT NULL AUTO_INCREMENT,
  `districtPid` int(11) NULL DEFAULT NULL COMMENT '上级ID',
  `pname` varchar(50) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL COMMENT '上级名称_省',
  `cityname` varchar(50) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL COMMENT '上级名称_市',
  `districtname` varchar(60) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL COMMENT '上级名称_区县',
  `name` varchar(32) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL COMMENT '行政区名称',
  `citycode` varchar(6) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL COMMENT '城市编码',
  `adcode` varchar(6) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL COMMENT '城市区域编码',
  `lng` float(13, 10) NULL DEFAULT NULL COMMENT '经度',
  `lat` float(13, 10) NULL DEFAULT NULL COMMENT '纬度',
  `level` varchar(10) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL COMMENT '行政区划级别',
  `createTime` timestamp(0) NULL DEFAULT CURRENT_TIMESTAMP(0),
  `updateTime` timestamp(0) NULL DEFAULT NULL ON UPDATE CURRENT_TIMESTAMP(0),
  PRIMARY KEY (`districtId`) USING BTREE,
  INDEX `districtsx_idx1`(`name`) USING BTREE,
  INDEX `districtsx_idx2`(`districtId`) USING BTREE,
  INDEX `districtsx_idx3`(`cityname`) USING BTREE,
  INDEX `districtsx_idx4`(`districtname`) USING BTREE,
  INDEX `districtsx_idx5`(`districtPid`) USING BTREE
) ENGINE = InnoDB AUTO_INCREMENT = 44216 CHARACTER SET = utf8mb4 COLLATE = utf8mb4_general_ci ROW_FORMAT = Dynamic;
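To show how records could be written into this table, here is a minimal sketch of a parameterized insert. It is not taken from the original script: `COLUMNS`, `INSERT_SQL`, `to_params` and `save_rows` are hypothetical names, and `conn` is assumed to be an already open pymysql connection.

```python
from typing import Any, Iterable, Mapping

# Columns filled by the crawler. districtId, createTime and updateTime are
# left to MySQL (AUTO_INCREMENT and the timestamp defaults in the DDL).
COLUMNS = (
    "districtPid", "pname", "cityname", "districtname", "name",
    "citycode", "adcode", "lng", "lat", "level",
)

# pymysql uses %s placeholders; values are escaped by the driver.
INSERT_SQL = (
    "INSERT INTO `districtsx` ("
    + ", ".join(f"`{c}`" for c in COLUMNS)
    + ") VALUES ("
    + ", ".join(["%s"] * len(COLUMNS))
    + ")"
)


def to_params(row: Mapping[str, Any]) -> tuple:
    """Order one record's values to match COLUMNS; missing keys become NULL."""
    return tuple(row.get(c) for c in COLUMNS)


def save_rows(conn, rows: Iterable[Mapping[str, Any]]) -> int:
    """Insert all rows through an open connection and commit once."""
    params = [to_params(r) for r in rows]
    with conn.cursor() as cursor:
        cursor.executemany(INSERT_SQL, params)
    conn.commit()
    return len(params)
```

With pymysql installed, the connection would typically come from `pymysql.connect(host=..., user=..., password=..., database=..., charset="utf8mb4")`, using your own MySQL credentials.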
After the crawl, the data looks like this:
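First, register an account on the Amap Open Platform (高德开放平台). You can create a Web application and a key for free. Each query for one keyword counts as one call; a full run uses roughly 36 calls, and the daily quota is 30,000 calls, which is more than enough.
When adding the key, remember to set the service platform to "Web服务" (Web Service); otherwise the calls will fail.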
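A single call to the district query API might look like the sketch below. It uses only the standard library, and the original script may do this differently. `build_url` and `fetch_districts` are hypothetical helper names, and the default `subdistrict=3` (return three levels below the matched keyword) is an assumption about how deep each query should go.

```python
import json
import urllib.parse
import urllib.request

# Endpoint of Amap's administrative district query (Web Service API).
DISTRICT_URL = "https://restapi.amap.com/v3/config/district"


def build_url(key: str, keywords: str, subdistrict: int = 3) -> str:
    """Build the query URL for one keyword."""
    query = urllib.parse.urlencode({
        "key": key,                  # your own Web Service key
        "keywords": keywords,        # e.g. a province name
        "subdistrict": subdistrict,  # levels of children to return (0-3)
        "extensions": "base",        # "base" omits boundary polygons
    })
    return f"{DISTRICT_URL}?{query}"


def fetch_districts(key: str, keywords: str, subdistrict: int = 3) -> list:
    """Call the API once and return the top-level `districts` list."""
    url = build_url(key, keywords, subdistrict)
    with urllib.request.urlopen(url, timeout=10) as resp:
        data = json.load(resp)
    if data.get("status") != "1":
        # A key created for the wrong service platform would be rejected here.
        raise RuntimeError(
            f"Amap request failed: {data.get('info')} ({data.get('infocode')})"
        )
    return data.get("districts", [])
```

Each `fetch_districts` call counts as one request against the daily quota.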
Python code (replace the MySQL credentials and put in your own Amap key):
districts_python/districtsx.py · felixlyu/code - Gitee.com
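For readers who cannot open the linked script, the core step of turning the API's nested `districts` tree into flat records with `pname`, `cityname` and `districtname` can be sketched as follows. This is an illustration, not the author's code: `parse_center` and `flatten` are hypothetical names, and `districtPid` is left out because it depends on the auto-increment IDs assigned at insert time.

```python
from typing import Iterator, Optional


def parse_center(center) -> tuple:
    """Split Amap's "lng,lat" string into floats; (None, None) if absent."""
    if not isinstance(center, str) or "," not in center:
        return None, None
    lng, lat = center.split(",", 1)
    return float(lng), float(lat)


def flatten(districts: list, parents: Optional[dict] = None) -> Iterator[dict]:
    """Walk the nested tree depth-first, yielding one flat record per node."""
    parents = parents or {}
    for d in districts:
        lng, lat = parse_center(d.get("center"))
        citycode = d.get("citycode")
        yield {
            "pname": parents.get("province"),
            "cityname": parents.get("city"),
            "districtname": parents.get("district"),
            "name": d.get("name"),
            # The API may return an empty list rather than a string here.
            "citycode": citycode if isinstance(citycode, str) else None,
            "adcode": d.get("adcode"),
            "lng": lng,
            "lat": lat,
            "level": d.get("level"),
        }
        # Children see this node's name filed under its own level.
        child_parents = {**parents, d.get("level"): d.get("name")}
        yield from flatten(d.get("districts") or [], child_parents)
```

Each yielded dictionary uses the table's column names as keys, so it can be passed directly to a parameterized INSERT.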



