
Developing with Java: Spark Streaming


package com.hj.spark;
import java.util.Arrays;
import java.util.Iterator;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.function.FlatMapFunction;
import org.apache.spark.api.java.function.Function2;
import org.apache.spark.api.java.function.PairFunction;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaPairDStream;
import org.apache.spark.streaming.api.java.JavaReceiverInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

import scala.Tuple2;
public class SparkStreaming {

	public static void main(String[] args) throws InterruptedException {

		// local[2]: one thread for the socket receiver, one for processing
		SparkConf conf = new SparkConf().setAppName("NetworkWordCount").setMaster("local[2]");
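		// Entry point for streaming functionality, with a 1-second batch interval
		JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(1));
		// Create a DStream that receives a text stream over TCP (hostname, port)
		JavaReceiverInputDStream<String> lines = jssc.socketTextStream("hadoop", 9999);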
		
		// Split each line into words on single spaces
		JavaDStream<String> words = lines.flatMap(new FlatMapFunction<String, String>() {

			@Override
			public Iterator<String> call(String s) throws Exception {
				return Arrays.asList(s.split(" ")).iterator();
			}
		});
		
		// Map each word to a (word, 1) pair
		JavaPairDStream<String, Integer> pairs = words.mapToPair(new PairFunction<String, String, Integer>() {

			@Override
			public Tuple2<String, Integer> call(String s) throws Exception {
				return new Tuple2<>(s, 1);
			}
		});
		
		// reduceByKey: sum the counts for each word within a batch
		JavaPairDStream<String, Integer> wordCount = pairs.reduceByKey(new Function2<Integer, Integer, Integer>() {

			@Override
			public Integer call(Integer arg0, Integer arg1) throws Exception {
				return arg0 + arg1;
			}
		});
		
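		// Print the first ten elements of each batch's result
		wordCount.print();
		// Start the computation and block until it is stopped
		jssc.start();
		jssc.awaitTermination();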
	}

}
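
The listing above counts words separately for each 1-second batch; it does not keep a running total across batches. As a rough illustration of what the flatMap, mapToPair, and reduceByKey steps compute for one batch, here is a minimal sketch in plain Java with no Spark dependency. The class name `BatchWordCountSketch`, the helper `countBatch`, and the sample lines are hypothetical and exist only for this illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for one micro-batch of the streaming job; not Spark code.
class BatchWordCountSketch {

    // Counts words in a single batch of lines.
    static Map<String, Integer> countBatch(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            // flatMap step: split each line into words on single spaces
            for (String word : line.split(" ")) {
                // mapToPair + reduceByKey steps: treat each word as (word, 1)
                // and sum the ones per word
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Two simulated batches (assumed sample data); each is counted on its own
        List<List<String>> batches = Arrays.asList(
                Arrays.asList("hello spark", "hello streaming"),
                Arrays.asList("hello again"));
        for (List<String> batch : batches) {
            System.out.println(countBatch(batch));
        }
        // prints:
        // {hello=2, spark=1, streaming=1}
        // {hello=1, again=1}
    }
}
```

Note that `hello` is counted as 2 in the first batch and 1 in the second, not 3 overall. To try the real streaming job, one common approach is to run a netcat server such as `nc -lk 9999` on the host the job connects to (`hadoop` in the listing) and type lines into it.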

When reposting, please credit the source: this article is reposted from www.mshxw.com
Article URL: https://www.mshxw.com/it/423068.html