栏目分类:
子分类:
返回
名师互学网用户登录
快速导航关闭
当前搜索
当前分类
子分类
实用工具
热门搜索
名师互学网 > IT > 前沿技术 > 大数据 > 大数据系统

Flink1.10实战:自定义聚合函数AggregateFunction

Flink1.10实战:自定义聚合函数AggregateFunction

    Flink 的AggregateFunction是一个基于中间计算结果状态进行增量计算的函数。由于是迭代计算方式,所以,在窗口处理过程中,不用缓存整个窗口的数据,所以效率执行比较高。

@PublicEvolving

public interface AggregateFunction extends Function, Serializable {

...............................

}

自定义聚合函数需要实现AggregateFunction接口类,它有四个接口实现方法:

a.创建一个新的累加器,启动一个新的聚合,负责迭代状态的初始化

ACC createAccumulator();
b.对于数据的每条数据,和迭代数据的聚合的具体实现

ACC add(IN value, ACC accumulator);
c.合并两个累加器,返回一个具有合并状态的累加器

ACC merge(ACC a, ACC b);
d.从累加器获取聚合的结果

OUT getResult(ACC accumulator);

3.自定义聚合函数MyCountAggregate

  1. package com.hadoop.ljs.flink110.aggreagate;
    import org.apache.flink.api.common.functions.AggregateFunction;
    
    public class MyCountAggregate implements AggregateFunction {
        @Override
        public Long createAccumulator() {
            
            return 0L;
        }
        @Override
        public Long add(ProductViewData value, Long accumulator) {
            
            return accumulator+1;
        }
        @Override
        public Long getResult(Long accumulator) {
            return accumulator;
        }
        
        @Override
        public Long merge(Long a, Long b) {
            return a+b;
        }
    }
    

    4.自定义窗口函数

  2. package com.hadoop.ljs.flink110.aggreagate;
    import org.apache.flink.streaming.api.functions.windowing.WindowFunction;
    import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
    import org.apache.flink.util.Collector;
    
    public class MyCountWindowFunction2 implements WindowFunction {
    @Override
    public void apply(String productId, TimeWindow window, Iterable input, Collector out) throws Exception {
         
        
        out.collect("----------------窗口时间:"+window.getEnd());
        out.collect("商品ID: "+productId+"  浏览量: "+input.iterator().next());
        }
    
    

    5.主函数,代码如下:

  3. import org.apache.flink.api.common.functions.FilterFunction;
    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.api.java.functions.KeySelector;
    import org.apache.flink.streaming.api.TimeCharacteristic;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.timestamps.AscendingTimestampExtractor;
    import org.apache.flink.streaming.api.windowing.time.Time;
     
    
    public class AggregateFunctionMain2 {
     
        public  static int windowSize=6000;
        public  static int windowSlider=3000;
        
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment senv = StreamExecutionEnvironment.getExecutionEnvironment();
            senv.setParallelism(1);
            senv.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
             
            //从文件读取数据,也可以从socket读取数据       
            DataStream sourceData = senv.readTextFile("D:\projectData\ProductViewData2.txt");
            DataStream productViewData = sourceData.map(new MapFunction() {
                @Override
                public ProductViewData map(String value) throws Exception {
                    String[] record = value.split(",");
                    return new ProductViewData(record[0], record[1], Long.valueOf(record[2]), Long.valueOf(record[3]));
                }
            }).assignTimestampsAndWatermarks(new AscendingTimestampExtractor(){
                @Override
                public long extractAscendingTimestamp(ProductViewData element) {
                    return element.timestamp;
                }
            });
            
            DataStream productViewCount = productViewData.filter(new FilterFunction() {
                @Override
                public boolean filter(ProductViewData value) throws Exception {
                    if(value.operationType==1){
                        return true;
                    }
                    return false;
                }
            }).keyBy(new KeySelector() {
                @Override
                public String getKey(ProductViewData value) throws Exception {
                    return value.productId;
                }
                //时间窗口 6秒  滑动间隔3秒
            }).timeWindow(Time.milliseconds(windowSize), Time.milliseconds(windowSlider))
            
            .aggregate(new MyCountAggregate(), new MyCountWindowFunction2());
            //聚合结果输出
            productViewCount.print();
     
            senv.execute("AggregateFunctionMain2");
        }
    }
    

       这里自定义聚合函数演示完毕,感谢关注!!!

转载请注明:文章转载自 www.mshxw.com
本文地址:https://www.mshxw.com/it/457669.html
我们一直用心在做
关于我们 文章归档 网站地图 联系我们

版权所有 (c)2021-2022 MSHXW.COM

ICP备案号:晋ICP备2021003244-6号