There are countless open-source big-data projects, and on top of that the work regularly touches Linux and Kubernetes at the system level. The commands involved are far too numerous for anyone to memorize, and looking each one up at the moment of need is tedious. So I am recording the commands for the components I actually use in one place, for future reference.
Hive
1. The number of dynamic partitions per node defaults to 100. When a job exceeds the limit, it can fail with errors such as:
FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask. Spark job failed due to: Job aborted due to stage failure:
Aborting TaskSet 1.0 because task 0 (partition 0)
cannot run anywhere due to node and executor blacklist
The parameters to set:
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.max.dynamic.partitions.pernode=10000;
set hive.exec.max.dynamic.partitions=10000;
set hive.exec.max.created.files=10000;
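With those settings in place, a dynamic-partition insert looks like the sketch below. The table and column names are hypothetical; the key rule is that the partition column must come last in the SELECT list.

```sql
-- hypothetical tables, for illustration only
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- dt is not given a fixed value, so Hive creates one partition per distinct dt
INSERT OVERWRITE TABLE ods.orders_part PARTITION (dt)
SELECT order_id, amount, dt   -- partition column dt must be the last column selected
FROM ods.orders_raw;
```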
ClickHouse
1. Running a parameterized query from the command line
clickhouse-client --param_parName="[1, 2]" -q "SELECT * FROM table WHERE a = {parName:Array(UInt16)}"
More on related parameter settings
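Several parameters can be passed in one call; each `--param_<name>` flag fills the matching `{name:Type}` placeholder. The table, column, and parameter names below are hypothetical:

```sh
# hypothetical db.events table; each --param_* flag binds one {name:Type} placeholder
clickhouse-client \
  --param_ids="[1,2,3]" \
  --param_min_date="2023-01-01" \
  -q "SELECT * FROM db.events WHERE id IN {ids:Array(UInt32)} AND dt >= {min_date:Date}"
```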
2. Importing CSV data into a ClickHouse table
cat /tmp/z/scene.csv | clickhouse-client -h ip --port 9000 -u username --password password --query="INSERT INTO common.scene_enum_dim format CSV"
More ways to import files into ClickHouse tables
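Two common variants of the import above, assuming the same (hypothetical) host and table: if the CSV's first line is a header, use the `CSVWithNames` format so the columns are matched by name; a non-comma delimiter can be set with `--format_csv_delimiter`.

```sh
# CSV with a header row: CSVWithNames matches columns by name
clickhouse-client -h ip --port 9000 -u username --password password \
  --query="INSERT INTO common.scene_enum_dim FORMAT CSVWithNames" < /tmp/z/scene.csv

# pipe-delimited file: override the CSV delimiter
clickhouse-client -h ip --port 9000 -u username --password password \
  --format_csv_delimiter="|" \
  --query="INSERT INTO common.scene_enum_dim FORMAT CSV" < /tmp/z/scene.csv
```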
Flink
1. Triggering a savepoint from the command line
./bin/flink savepoint <jobId> [savepointDirectory]
Resuming a job from a savepoint:
./bin/flink run -s <savepointPath> [OPTIONS]
More savepoint-related operations
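A typical savepoint round-trip, as a sketch; `<jobId>`, the HDFS directory, and the jar path are placeholders you would substitute for your own deployment:

```sh
# 1) trigger a savepoint for a running job; the CLI prints the resulting savepoint path
./bin/flink savepoint <jobId> hdfs:///flink/savepoints

# 2) or stop the job gracefully while taking a final savepoint
./bin/flink stop --savepointPath hdfs:///flink/savepoints <jobId>

# 3) resume a new run of the job from that savepoint (jar path is hypothetical)
./bin/flink run -s hdfs:///flink/savepoints/savepoint-xxxx ./my-job.jar
```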
-------- Open source never stops, and neither do the updates......
迪答 WeChat official account



