栏目分类:
子分类:
返回
名师互学网用户登录
快速导航关闭
当前搜索
当前分类
子分类
实用工具
热门搜索
名师互学网 > IT > 前沿技术 > 大数据 > 大数据系统

hive 分区值支持中文

hive 分区值支持中文

背景

hive 插入分区时(insert into 或者add partition等操作)分区值是中文的时候,就出现报错。
alter table test add partition(part=“中文”);
hivemetastore报错信息

2021-12-01T15:25:03,474 ERROR [pool-8-thread-134] bonecp.ConnectionHandle: Database access problem. Killing off this connection and all remaining connections in the connection pool. SQL State = HY000
2021-12-01T15:25:03,478 ERROR [pool-8-thread-134] DataNucleus.Transaction: Operation rollback failed on resource: org.datanucleus.store.rdbms.ConnectionFactoryImpl$EmulatedXAResource@6e8aa02d, error code UNKNOWN and transaction: [DataNucleus Transaction, ID=Xid=
2021-12-01T15:25:03,479 ERROR [pool-8-thread-134] metastore.RetryingHMSHandler: Retrying HMSHandler after 2000 ms (attempt 1 of 10) with error: javax.jdo.JDOException: Exception thrown when executing query : SELECt DISTINCT 'org.apache.hadoop.hive.metastore.model.MPartition' AS `NUCLEUS_TYPE`,`A0`.`CREATE_TIME`,`A0`.`LAST_ACCESS_TIME`,`A0`.`PART_NAME`,`A0`.`PART_ID` FROM `PARTITIONS` `A0` LEFT OUTER JOIN `TBLS` `B0` ON `A0`.`TBL_ID` = `B0`.`TBL_ID` LEFT OUTER JOIN `DBS` `C0` ON `B0`.`DB_ID` = `C0`.`DB_ID` WHERe `B0`.`TBL_NAME` = ? AND `C0`.`NAME` = ? AND `A0`.`PART_NAME` = ?

原因定位

怀疑是中英文的问题,但是还是不能确认。进入mysql控制台,执行上面的命令(需要通过开启mysql通用日志获取到上面错误日志中?的内容)
执行后出现这个错误

ERROR 1267 (HY000): Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '='

这个时候,可以确定是字符集的问题了。
确认一下,果然PARTITIONS表的字符集是latin1,不支持中文

MariaDB [hivemeta]> show create table PARTITIONS;
+------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table      | Create Table                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
+------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| PARTITIONS | CREATE TABLE `PARTITIONS` (
  `PART_ID` bigint(20) NOT NULL,
  `CREATE_TIME` int(11) NOT NULL,
  `LAST_ACCESS_TIME` int(11) NOT NULL,
  `PART_NAME` varchar(767) CHARACTER SET latin1 COLLATE latin1_bin DEFAULT NULL,
  `SD_ID` bigint(20) DEFAULT NULL,
  `TBL_ID` bigint(20) DEFAULT NULL,
  PRIMARY KEY (`PART_ID`),
  UNIQUE KEY `UNIQUEPARTITION` (`PART_NAME`,`TBL_ID`),
  KEY `PARTITIONS_N49` (`TBL_ID`),
  KEY `PARTITIONS_N50` (`SD_ID`),
  ConSTRAINT `PARTITIONS_FK1` FOREIGN KEY (`TBL_ID`) REFERENCES `TBLS` (`TBL_ID`),
  ConSTRAINT `PARTITIONS_FK2` FOREIGN KEY (`SD_ID`) REFERENCES `SDS` (`SD_ID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
解决方案

在hive连接的mysql中执行下面命令,更改字符集为utf8

alter table PARTITIONS default character set utf8;
alter table PARTITION_KEY_VALS default character set utf8;
alter table SDS default character set utf8;
field coding
alter table PARTITIONS modify column PART_name varchar(190) character set utf8;
alter table PARTITION_KEY_VALS modify column PART_KEY_VAL varchar(256) character set utf8;
alter table SDS modify column LOCATION varchar(4000) character set utf8;
细节说明
  1. MariaDB [hivemeta]> alter table PARTITIONS modify column PART_NAME varchar(767) CHARACTER SET utf8;
    ERROR 1071 (42000): Specified key was too long; max key length is 767 bytes
    原因:PART_NAME是索引建,按照规定最大的长度就是767字节,而设置为utf8字符集之后,就相当于一个utf8字符要占用3个字节,因此就不能直接设置767,设置200就不会报错了。
mysql的日志开启
  1. show variables like ‘general_log’; – 查看日志是否开启

  2. set global general_log=on; – 开启日志功能

  3. show variables like ‘general_log_file’; – 看看日志文件保存位置

转载请注明:文章转载自 www.mshxw.com
本文地址:https://www.mshxw.com/it/629932.html
我们一直用心在做
关于我们 文章归档 网站地图 联系我们

版权所有 (c)2021-2022 MSHXW.COM

ICP备案号:晋ICP备2021003244-6号