Ubuntu 20.04
笔者使用 Multipass 安装 Ubuntu 虚拟机
参考:
《如何在 MacOS 安装轻量级虚拟机工具 Multipass》
《如何在 Windows 安装轻量级虚拟机工具 Multipass》
$ sudo apt update $ sudo apt install python3 python3-pip $ python3 --version Python 3.8.10 ## 创建软连接 $ sudo ln -s /usr/bin/python3 /usr/bin/python ubuntu@ubuntu:~$ python --version Python 3.8.10 $ sudo python -m pip install --upgrade pip $ sudo python -m pip install --upgrade setuptools $ pip --version pip 20.0.2 from /usr/lib/python3/dist-packages/pip (python 3.8)安装 PyFlink
sudo python -m pip install apache-flink==1.13.6WordCount 案例
wordcount.py 代码如下:
from pyflink.table import DataTypes, TableEnvironment, EnvironmentSettings
from pyflink.table.descriptors import Schema, OldCsv, FileSystem
from pyflink.table.expressions import lit
settings = EnvironmentSettings.new_instance().in_batch_mode().use_blink_planner().build()
t_env = TableEnvironment.create(settings)
# write all the data to one file
t_env.get_config().get_configuration().set_string("parallelism.default", "1")
t_env.connect(FileSystem().path('/tmp/input'))
.with_format(OldCsv().field('word', DataTypes.STRING()))
.with_schema(Schema().field('word', DataTypes.STRING()))
.create_temporary_table('mySource')
t_env.connect(FileSystem().path('/tmp/output'))
.with_format(OldCsv().field_delimiter('t')
.field('word', DataTypes.STRING())
.field('count', DataTypes.BIGINT()))
.with_schema(Schema()
.field('word', DataTypes.STRING())
.field('count', DataTypes.BIGINT()))
.create_temporary_table('mySink')
tab = t_env.from_path('mySource')
tab.group_by(tab.word).select(tab.word, lit(1).count).execute_insert('mySink').wait()
在shell 命令行执行:
$ echo -e "flinknpyflinknflink" > /tmp/input $ sudo python wordcount.py $ cat /tmp/output flink 2 pyflink 1



