
A small data quality check implemented in shell

step_1: Create a temporary table that simulates the data to be monitored
create table if not exists dataintel_tmp.qzd_20211026_sjzl_v1 as
select 0 as a
,null as b
,null as c
,null as d
,null as e
,0 as f
,0 as g
,0 as h
,0 as i
,3 as j
,6 as k
,7 as l
,12 as m
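
As a rough cross-check of what the later ratio queries should report for this sample row, the counts can be worked out by hand. The sketch below does that in plain shell, with the word "null" standing in for SQL NULL; `count_matches` and `percent` are hypothetical helper names, and rounding half up to a whole percent is an assumption about how the cast to decimal(3,2) followed by *100 behaves.

```shell
#!/bin/sh
# Sample row from step_1, columns a..m, with "null" standing in for SQL NULL
row="0 null null null null 0 0 0 0 3 6 7 12"

# count_matches VALUE ITEM...  prints how many ITEMs equal VALUE
count_matches() {
    want=$1
    n=0
    shift
    for item in "$@"; do
        if [ "$item" = "$want" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# percent PART TOTAL  prints PART/TOTAL as a whole percentage, rounded half up
percent() {
    echo $(( ($1 * 200 + $2) / ($2 * 2) ))
}

# Split the row into positional parameters to count the columns
set -- $row
total=$#
nulls=$(count_matches null "$@")
zeros=$(count_matches 0 "$@")

echo "null: ${nulls}/${total} = $(percent "$nulls" "$total")%"
echo "zero: ${zeros}/${total} = $(percent "$zeros" "$total")%"
```

Under these assumptions the row has 4 nulls and 5 zeros out of 13 columns, so both ratios stay below the threshold of 40 used in step_4.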

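step_2 (done)
-- No way to count a table's columns from within Hive has been found yet.
-- For now, assume the columns of each table are enumerated by hand.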
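
Since the columns are enumerated by hand, one way to keep the long ratio expression used in step_4 from being typed out per table is to generate it from the column list. This is only a sketch: `build_null_ratio_expr` is a hypothetical helper, the column names are passed in by the caller, and nothing here talks to Hive.

```shell
#!/bin/sh
# build_null_ratio_expr COL...
# Prints a Hive expression for the percentage of the listed columns that are NULL.
build_null_ratio_expr() {
    # Refuse an empty column list, which would otherwise divide by zero
    [ $# -gt 0 ] || return 1
    expr_text=""
    for col in "$@"; do
        # Join the per-column terms with " + "
        expr_text="${expr_text:+${expr_text} + }case when ${col} is null then 1 else 0 end"
    done
    # $# is the number of columns, so the divisor is never hard-coded
    echo "cast((${expr_text}) / $# as decimal(3,2)) * 100"
}

build_null_ratio_expr a b c
```

The same loop with `${col}=0` in place of `${col} is null` would give the zero-value variant.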


Handling jobs that run empty on the jump host, per table (done)

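#!/bin/bash

# Default to yesterday's date; an explicit date may be passed as the first argument
day=$(date -d "-1 days" +"%Y%m%d")
if [ $# -eq 1 ]; then
    day=$1
fi
count=0
# Repeat for as long as the count is still 0
while [ "$count" -eq 0 ]
do
    # The command on this line was lost in extraction; only these fragments survive:
    #   sh xxxx.sh … --outputformat=csv2 … ${day}"`
    # Judging by the fragments, it ran xxxx.sh and then reassigned count
    # from a beeline query for ${day}.
    echo $count
done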
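
The loop above keeps going for as long as the count stays at 0. A possible variant, sketched here under assumptions, gives up after a fixed number of attempts so that a permanently empty table cannot hang the job. `fetch_count` is a hypothetical stand-in for "rerun the job, then query the row count"; it is simulated so that the example runs without Hive.

```shell
#!/bin/sh
# fetch_count ATTEMPT
# Simulated query: reports 0 rows on the first two attempts, then 5.
fetch_count() {
    if [ "$1" -lt 3 ]; then
        echo 0
    else
        echo 5
    fi
}

# wait_for_rows MAX_ATTEMPTS
# Prints the first non-zero count, or returns 1 once MAX_ATTEMPTS is used up.
wait_for_rows() {
    max_attempts=$1
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        rows=$(fetch_count "$attempt")
        if [ "$rows" -ne 0 ]; then
            echo "$rows"
            return 0
        fi
        attempt=$((attempt + 1))
        # A real job would probably sleep here before trying again
    done
    return 1
}

wait_for_rows 5 || echo "still empty after 5 attempts" >&2
```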


step_3: Return a specific exit code when the anomaly ratio in the result is too high (done)

#!/bin/bash

COMMON_PATH=/data/hechenxi/test_hive

source ${COMMON_PATH}/common.sh

v_job_stat=0

# "select 18" stands in for the real anomaly-ratio query
count=$(/usr/local/share/hive/bin/beeline --silent=true --outputformat=csv2 --showHeader=false --showWarnings=false -u 'jdbc:hive2://bj1240:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=com-hive' -n dataintel -p c2248Br9ERs51RQY -e "select 18")

echo $count

# A value of at most 15 is normal (0); anything above is flagged with 399
if [ "$count" -le 15 ]
then count=0
else count=399
fi

echo $count

v_job_stat=$((v_job_stat + count))

#########################################################################
# Return the job execution status code
# (exit codes are reported modulo 256, so the caller sees 399 as 143)
#########################################################################
echo "v_job_stat = ${v_job_stat}"
exit ${v_job_stat}
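
The value 399 set above is larger than 255, and a process exit status only carries the low 8 bits, so the caller sees a different number. The sketch below shows the mapping for the codes used in step_3 and step_4; `seen_status` is a hypothetical helper name.

```shell
#!/bin/sh
# seen_status CODE
# Prints the status a parent actually observes when a child runs "exit CODE".
seen_status() {
    ( exit "$1" )
    echo $?
}

for code in 0 399 455 456 911; do
    echo "exit ${code} -> $(seen_status "$code")"
done
```

Note that 399 and 911 (455 + 456) both arrive as 143, so a scheduler that only looks at the exit status cannot tell those two cases apart.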


step_4: Run the demo end to end (doing)

#!/bin/bash

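#########################################################################
# Program:
# Script name:
# Exit code for a null-value anomaly: 199 (455 modulo 256)
# Exit code for a zero-value anomaly: 200 (456 modulo 256)
# Exit code when both occur: 143 (911 modulo 256)
# Script description:
# Source tables:
# Target table:
# Function:
# History: 2021/10/28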

#########################################################################
### Global variable definitions and imports
#########################################################################

source /data1/ide_resources/project/prod/Quality_Data/common/common.sh
#########################################################################
#########################################################################
v_job_stat=0

# Note: step_1 creates qzd_20211026_sjzl_v1, while the queries below read qzd_20211027_sjzl_v1

# Percentage of the 13 columns that are NULL
count=$(/usr/local/share/hive/bin/beeline --silent=true --outputformat=csv2 \
    --showHeader=false --showWarnings=false \
    -u 'jdbc:hive2://bj1240:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=com-hive' \
    -n dataintel -p c2248Br9ERs51RQY \
    -e "select
cast(( case when a is null then 1 else 0 end + case when b is null then 1 else 0 end + case when c is null then 1 else 0 end + case when d is null then 1 else 0 end
+ case when e is null then 1 else 0 end + case when f is null then 1 else 0 end + case when g is null then 1 else 0 end + case when h is null then 1 else 0 end
+ case when i is null then 1 else 0 end + case when j is null then 1 else 0 end + case when k is null then 1 else 0 end + case when l is null then 1 else 0 end
+ case when m is null then 1 else 0 end ) /13 as decimal(3,2) ) *100
from dataintel_tmp.qzd_20211027_sjzl_v1 where dayno=${YYYYMMDD}")

# Percentage of the 13 columns that equal 0
count1=$(/usr/local/share/hive/bin/beeline --silent=true --outputformat=csv2 \
    --showHeader=false --showWarnings=false \
    -u 'jdbc:hive2://bj1240:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=com-hive' \
    -n dataintel -p c2248Br9ERs51RQY \
    -e "select
cast(( case when a=0 then 1 else 0 end + case when b=0 then 1 else 0 end + case when c=0 then 1 else 0 end + case when d=0 then 1 else 0 end
+ case when e=0 then 1 else 0 end + case when f=0 then 1 else 0 end + case when g=0 then 1 else 0 end + case when h=0 then 1 else 0 end
+ case when i=0 then 1 else 0 end + case when j=0 then 1 else 0 end + case when k=0 then 1 else 0 end + case when l=0 then 1 else 0 end
+ case when m=0 then 1 else 0 end ) /13 as decimal(3,2) ) *100 as 0_percent
from dataintel_tmp.qzd_20211027_sjzl_v1 where dayno=${YYYYMMDD}")

# A ratio of at most 40 is normal (0); above that, nulls are flagged with 455 and zeros with 456
if [ "$count" -le 40 ]
then count=0
else count=455
fi
if [ "$count1" -le 40 ]
then count1=0
else count1=456
fi

echo $count $count1

v_job_stat=$((v_job_stat + count + count1))

#########################################################################
# Return the job execution status code
#########################################################################
echo "v_job_stat = ${v_job_stat}"
exit ${v_job_stat}

#########################################################################
#########################################################################

# Remainder of the job template; it is never reached because the script exits above
echo $(date +%Y-%m-%d:%T) "$hql"
ExecuteHQLRecogDev "$hql"
v_job_stat=$((v_job_stat + $?))

#########################################################################
# Return the job execution status code
#########################################################################
echo "v_job_stat = ${v_job_stat}"
exit ${v_job_stat}
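
The tests of the form `[ $count -le 40 ]` only accept integers, while the queries multiply a decimal(3,2) by 100, so depending on how beeline prints the result it may arrive as something like 31.00, and empty or NULL output is also possible. A minimal sketch of a guard, assuming truncation of the fraction is acceptable; `to_int` is a hypothetical helper, not part of common.sh.

```shell
#!/bin/sh
# to_int VALUE
# Prints VALUE with any fractional part dropped (truncated, not rounded).
# Returns 1 if what remains is empty or not purely digits, e.g. "NULL".
to_int() {
    value=${1%%.*}    # 31.00 -> 31
    case "$value" in
        ''|*[!0-9]*) return 1 ;;
    esac
    echo "$value"
}

pct=$(to_int "38.00") || { echo "unexpected query output" >&2; exit 1; }
if [ "$pct" -le 40 ]; then
    echo "ok"
else
    echo "anomaly"
fi
```

Failing loudly on non-numeric output keeps an empty query result from being silently treated as "no anomaly".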
