栏目分类:
子分类:
返回
名师互学网用户登录
快速导航关闭
当前搜索
当前分类
子分类
实用工具
热门搜索
名师互学网 > IT > 系统运维 > 运维 > Linux

StarRocks docker 环境编译,测试最新留存retention和漏斗window

Linux 更新时间: 发布时间: IT归档 最新发布 模块sitemap 名妆网 法律咨询 聚返吧 英语巴士网 伯小乐 网商动力

StarRocks docker 环境编译,测试最新留存retention和漏斗window

近期在做留存,漏斗等用户行为分析调研时,留存retention函数随着官网2.2 rc版本发布已经可以下载,然后漏斗函数window_funnel已经开发完成,合并到main版本,预计2.3版本会发布,现阶段通过编译,提前测试调研漏斗函数,以下为操作过程:

1.部署docker环境

mac环境下载docker.dmg,安装好,然后调整好docker资源:

可以编译好后,把docker资源再调整回来,之间就因为docker资源不够,编译不过,各种被kill,报错:

be部分编译到一半,然后停止了,再拉起编译,报错:

此时需要 sh build.sh --be --clean

note:可以通过sh build.sh --help查看compile支持的选项,fe和be模块可以分开构建;

2.docker pull starrocks main主干编译环境 

可以参考docs/Build_in_docker.md at main · StarRocks/docs · GitHub

docker pull starrocks/dev-env:main

docker run -it --name starrocks2.3 -d starrocks/dev-env:main

进去编译环境:

docker exec -it starrocks-compile-env-container-id /bin/bash

3.编译

参考docs/Build_in_docker.md at main · StarRocks/docs · GitHub

git clone https://github.com/StarRocks/starrocks.git
cd starrocks
sh build.sh

整个编译过程大概40-60mins

[100%] Built target starrocks_be

real	39m46.800s
user	168m35.045s
sys	10m22.659s

编译好后,生成的starrocks版本会在output目录,配置fe,be配置,启动服务,添加backend节点; 

查看builtin functions:

MySQL [(none)]> select current_version();
+-------------------+
| current_version() |
+-------------------+
| UNKNOWN ec27d13   |
+-------------------+
1 row in set (2.05 sec)

MySQL [(none)]> 
MySQL [(none)]> 
MySQL [(none)]> create database gong;      
Query OK, 0 rows affected (0.00 sec)

MySQL [(none)]> use gong;
Database changed
MySQL [gong]> show builtin functions from gong like '%retention%';
+---------------+
| Function Name |
+---------------+
| retention     |
+---------------+
1 row in set (0.02 sec)

MySQL [gong]> show builtin functions from gong like '%window%';
+---------------+
| Function Name |
+---------------+
| window_funnel |
+---------------+
1 row in set (0.00 sec)

4.测试

4.1 留存函数retention

mysql> select current_version();
+--------------------+
| current_version()  |
+--------------------+
| 2.2.0-RC01 f233d21 |
+--------------------+
1 row in set (0.00 sec)

mysql> show builtin functions from gong like '%retention%';
+---------------+
| Function Name |
+---------------+
| retention     |
+---------------+
1 row in set (0.01 sec)

准备数据:

mysql> select * from retention_test;
+------+---------------------+
| uid  | date                |
+------+---------------------+
|    4 | 2020-01-01 00:00:00 |
|    0 | 2020-01-01 00:00:00 |
|    1 | 2020-01-01 00:00:00 |
|    2 | 2020-01-01 00:00:00 |
|    3 | 2020-01-01 00:00:00 |
|    0 | 2020-01-02 00:00:00 |
|    1 | 2020-01-02 00:00:00 |
|    2 | 2020-01-02 00:00:00 |
|    3 | 2020-01-02 00:00:00 |
|    4 | 2020-01-02 00:00:00 |
|    5 | 2020-01-02 00:00:00 |
|    6 | 2020-01-02 00:00:00 |
|    7 | 2020-01-02 00:00:00 |
|    8 | 2020-01-02 00:00:00 |
|    9 | 2020-01-02 00:00:00 |
|    0 | 2020-01-03 00:00:00 |
|    1 | 2020-01-03 00:00:00 |
|    2 | 2020-01-03 00:00:00 |
|    3 | 2020-01-03 00:00:00 |
|    4 | 2020-01-03 00:00:00 |
|    5 | 2020-01-03 00:00:00 |
|    6 | 2020-01-03 00:00:00 |
|    7 | 2020-01-03 00:00:00 |
|    8 | 2020-01-03 00:00:00 |
|    9 | 2020-01-03 00:00:00 |
|   10 | 2020-01-03 00:00:00 |
|   11 | 2020-01-03 00:00:00 |
|   12 | 2020-01-03 00:00:00 |
|   13 | 2020-01-03 00:00:00 |
|   14 | 2020-01-03 00:00:00 |
+------+---------------------+
30 rows in set (0.01 sec)

让我们考虑使用 retention 功能的一个例子 ,以确定网站流量。

按唯一ID uid 对用户进行分组,使用 retention 功能。

mysql> select uid,
retention([date='2020-01-01',date='2020-01-02',date='2020-01-03']) 
from retention_test where date in('2020-01-01','2020-01-02','2020-01-03') 
group by uid order by uid;
+------+----------------------------------------------------------------------------+
| uid  | retention([date = '2020-01-01', date = '2020-01-02', date = '2020-01-03']) |
+------+----------------------------------------------------------------------------+
|    0 | [1,1,1]                                                                    |
|    1 | [1,1,1]                                                                    |
|    2 | [1,1,1]                                                                    |
|    3 | [1,1,1]                                                                    |
|    4 | [1,1,1]                                                                    |
|    5 | [0,0,0]                                                                    |
|    6 | [0,0,0]                                                                    |
|    7 | [0,0,0]                                                                    |
|    8 | [0,0,0]                                                                    |
|    9 | [0,0,0]                                                                    |
|   10 | [0,0,0]                                                                    |
|   11 | [0,0,0]                                                                    |
|   12 | [0,0,0]                                                                    |
|   13 | [0,0,0]                                                                    |
|   14 | [0,0,0]                                                                    |
+------+----------------------------------------------------------------------------+
15 rows in set (0.03 sec)

计算每天的现场访问总数:

mysql> SELECt sum(r[1]) AS r1,sum(r[2]) AS r2,sum(r[3]) AS r3 
       FROM (
       SELECt uid,
       retention([date = '2020-01-01', date = '2020-01-02', date = '2020-01-03']) As r
       FROM retention_test WHERe date IN ('2020-01-01', '2020-01-02', '2020-01-03')
       GROUP BY uid
       ) as rr;
+------+------+------+
| r1   | r2   | r3   |
+------+------+------+
|    5 |    5 |    5 |
+------+------+------+
1 row in set (0.39 sec)

条件:

  • r1-2020-01-01期间访问该网站的独立访问者数量( cond1 条件)。
  • r2-在2020-01-01和2020-01-02之间的特定时间段内访问该网站的唯一访问者的数量 (cond1 和 cond2条件)。
  • r3-在2020-01-01和2020-01-03之间的特定时间段内访问该网站的唯一访问者的数量 (cond1 和 cond3条件)。

4.2 window funnel函数测试:

window_funnel并merge到主分支,还没正式release,参考:

https://github.com/StarRocks/starrocks/pull/5542

4.2.1 Use case(datetime)

mysql> select * from action;
+------+------------+---------------------+
| uid  | event_type | time                |
+------+------------+---------------------+
| 1    | 浏览       | 2020-01-02 11:00:00 |
| 1    | 点击       | 2020-01-02 11:10:00 |
| 1    | 下单       | 2020-01-02 11:20:00 |
| 1    | 支付       | 2020-01-02 11:30:00 |
| 1    | 浏览       | 2020-01-02 11:00:00 |
| 2    | 下单       | 2020-01-02 11:00:00 |
| 2    | 支付       | 2020-01-02 11:10:00 |
| 3    | 浏览       | 2020-01-02 11:20:00 |
| 3    | 点击       | 2020-01-02 12:00:00 |
| 4    | 浏览       | 2020-01-02 11:50:00 |
| 4    | 点击       | 2020-01-02 12:00:00 |
| 5    | 浏览       | 2020-01-02 11:50:00 |
| 5    | 点击       | 2020-01-02 12:00:00 |
| 5    | 下单       | 2020-01-02 11:10:00 |
| 6    | 浏览       | 2020-01-02 11:50:00 |
| 6    | 点击       | 2020-01-02 12:00:00 |
| 6    | 下单       | 2020-01-02 12:10:00 |
+------+------------+---------------------+
17 rows in set (0.01 sec)

mysql> select uid, window_funnel(600, time, 0, [event_type="浏览", event_type="点击", event_type="下单", event_type="支付"]) as level from action group by uid order by uid;
+------+-------+
| uid  | level |
+------+-------+
| 1    |     2 |
| 2    |     0 |
| 3    |     1 |
| 4    |     2 |
| 5    |     2 |
| 6    |     2 |
+------+-------+
6 rows in set (0.04 sec)

4.2.2 Use case(date)

mysql> select * from action9;
+------+------------+------------+
| uid  | event_type | time       |
+------+------------+------------+
| 1    | 浏览       | 2020-01-02 |
| 1    | 点击       | 2020-01-03 |
| 1    | 下单       | 2020-01-04 |
| 1    | 支付       | 2020-01-05 |
| 2    | 浏览       | 2020-01-02 |
| 2    | 点击       | 2020-01-06 |
+------+------------+------------+
6 rows in set (0.01 sec)

mysql> select uid, window_funnel(3, time, 0, [event_type="浏览", event_type="点击", event_type="下单", event_type="支付"]) as level from action9 group by uid order by uid;
+------+-------+
| uid  | level |
+------+-------+
| 1    |     4 |
| 2    |     1 |
+------+-------+
2 rows in set (0.02 sec)

转载请注明:文章转载自 www.mshxw.com
本文地址:https://www.mshxw.com/it/851518.html
我们一直用心在做
关于我们 文章归档 网站地图 联系我们

版权所有 (c)2021-2022 MSHXW.COM

ICP备案号:晋ICP备2021003244-6号