栏目分类:
子分类:
返回
名师互学网用户登录
快速导航关闭
当前搜索
当前分类
子分类
实用工具
热门搜索
名师互学网 > IT > 前沿技术 > 大数据 > 大数据系统

【ClickHouse实战】1. Clickhouse部署(单节点及集群)及实战案例验证

【ClickHouse实战】1. Clickhouse部署(单节点及集群)及实战案例验证

ClickHouse实战系列目录
    Clickhouse部署(单节点及集群)及实战案例验证


文章目录

ClickHouse实战系列目录前言一、安装指引

1. 验证硬件配置2. 安装部署(单节点)3. 关于配置文件4. 修改数据存储路径 二、使用验证

1. 启动服务和客户端登入

1.1 启动clickhouse-server1.2 通过clickhouse-client快速访问 2. 创建数据库和相关测试表

2.1 创建数据库2.2 数据准备2.3 新建表格2.4 通过clickhouse-local导入数据2.5 通过Tick聚合生成分钟Bar 监控工具 三、构造集群

1. 添加节点 总结


前言

由于工作需要,需要搭建公司级别的OLAP服务。经过前期的选型,初步确定使用Clickhouse。本文会根据官方文档和网上的资料,学习如何搭建Clickhouse单节点和集群。并且通过实例验证Clickhouse相关的功能

配置清单

名称版本
Ubuntu20.04
clickhouse22.3.x
docker20.10.7
硬盘SSD
CPUIntel 11th Gen Core i9-11900KF
RAM64G

一、安装指引 1. 验证硬件配置

通过官网的 操作指引 进行操作。因为需要下载数据的原因,需要较长时间。
备注1:hardware.sh中间有一步涉及某个需要【科学上网】才能下载的文件,请提前做好准备
备注2:自动测试总是有不可控步骤(这里是坑,你懂的...),建议使用人工运行

备注3:人工运行的操作步骤中,由于wget经常会出现“段终断”错误且断点续传无效,建议使用迅雷下载 hits_100m_obfuscated_v1.tar.xz 再拷贝到测试文件夹中

关于数据表结构:参考如下schema,schema的设置会跟查询的延时相关,可作为参考

CREATE TABLE datasets.hits_v1
(
  WatchID UInt64,
  JavaEnable UInt8,
  Title String,
  GoodEvent Int16,
  EventTime DateTime,
  EventDate Date,
  CounterID UInt32,
  ClientIP UInt32,
  ClientIP6 FixedString(16),
  RegionID UInt32,
  UserID UInt64,
  CounterClass Int8,
  OS UInt8,
  UserAgent UInt8,
  URL String,
  Referer String,
  URLDomain String,
  RefererDomain String,
  Refresh UInt8,
  IsRobot UInt8,
  RefererCategories Array(UInt16),
  URLCategories Array(UInt16),
  URLRegions Array(UInt32),
  RefererRegions Array(UInt32),
  ResolutionWidth UInt16,
  ResolutionHeight UInt16,
  ResolutionDepth UInt8,
  FlashMajor UInt8,
  FlashMinor UInt8,
  FlashMinor2 String,
  NetMajor UInt8,
  NetMinor UInt8,
  UserAgentMajor UInt16,
  UserAgentMinor FixedString(2),
  cookieEnable UInt8,
  JavascriptEnable UInt8,
  IsMobile UInt8,
  MobilePhone UInt8,
  MobilePhoneModel String,
  Params String,
  IPNetworkID UInt32,
  TraficSourceID Int8,
  SearchEngineID UInt16,
  SearchPhrase String,
  AdvEngineID UInt8,
  IsArtifical UInt8,
  WindowClientWidth UInt16,
  WindowClientHeight UInt16,
  ClientTimeZone Int16,
  ClientEventTime DateTime,
  SilverlightVersion1 UInt8,
  SilverlightVersion2 UInt8,
  SilverlightVersion3 UInt32,
  SilverlightVersion4 UInt16,
  PageCharset String,
  CodeVersion UInt32,
  Islink UInt8,
  IsDownload UInt8,
  IsNotBounce UInt8,
  FUniqID UInt64,
  HID UInt32,
  IsOldCounter UInt8,
  IsEvent UInt8,
  IsParameter UInt8,
  DontCountHits UInt8,
  WithHash UInt8,
  HitColor FixedString(1),
  UTCEventTime DateTime,
  Age UInt8,
  Sex UInt8,
  Income UInt8,
  Interests UInt16,
  Robotness UInt8,
  GeneralInterests Array(UInt16),
  RemoteIP UInt32,
  RemoteIP6 FixedString(16),
  WindowName Int32,
  OpenerName Int32,
  HistoryLength Int16,
  BrowserLanguage FixedString(2),
  BrowserCountry FixedString(2),
  SocialNetwork String,
  SocialAction String,
  HTTPError UInt16,
  SendTiming Int32,
  DNSTiming Int32,
  ConnectTiming Int32,
  ResponseStartTiming Int32,
  ResponseEndTiming Int32,
  FetchTiming Int32,
  RedirectTiming Int32,
  DOMInteractiveTiming Int32,
  DOMContentLoadedTiming Int32,
  DOMCompleteTiming Int32,
  LoadEventStartTiming Int32,
  LoadEventEndTiming Int32,
  NSToDOMContentLoadedTiming Int32,
  FirstPaintTiming Int32,
  RedirectCount Int8,
  SocialSourceNetworkID UInt8,
  SocialSourcePage String,
  ParamPrice Int64,
  ParamOrderID String,
  ParamCurrency FixedString(3),
  ParamCurrencyID UInt16,
  GoalsReached Array(UInt32),
  OpenstatServiceName String,
  OpenstatCampaignID String,
  OpenstatAdID String,
  OpenstatSourceID String,
  UTMSource String,
  UTMMedium String,
  UTMCampaign String,
  UTMContent String,
  UTMTerm String,
  FromTag String,
  HasGCLID UInt8,
  RefererHash UInt64,
  URLHash UInt64,
  CLID UInt32,
  YCLID UInt64,
  ShareService String,
  ShareURL String,
  ShareTitle String,
  ParsedParams Nested(
    Key1 String,
    Key2 String,
    Key3 String,
    Key4 String,
    Key5 String,
    ValueDouble Float64
  ), 
  IslandID FixedString(16),  
  RequestNum UInt32,  
  RequestTry UInt8
) ENGINE = MergeTree() 
PARTITION BY toYYYYMM(EventDate) 
ORDER BY(CounterID, EventDate, intHash32(UserID)) 
SAMPLE BY intHash32(UserID) 
SETTINGS index_granularity = 8192

关于验证结果:

在我的服务器配置下,得到以下测试数据详细测试报告请点击 测试结果文件

查询说明平均耗时首次查询第二次第三次
1count0.0010.0010.0010.001
2count+where0.0080.0100.0070.006
3sum+avg0.0240.0300.0220.020
4sum0.0560.1000.0330.035
5uniq(int)0.0890.1140.0790.074
6uniq(string)0.2250.2410.2090.225
7min+max0.0010.0020.0010.001
8where+groupby+orderby0.0070.0080.0070.007
9uniq(int)+where+groupby+orderby+limit0.5320.5650.5190.511
10sum+count+avg+uniq(int)+where+groupby+orderby+limit0.6060.6290.5900.599
11uniq(int)+where+groupby+orderby+limit0.1410.1590.1300.134
12uniq(int)+where+双groupby+orderby+limit0.1630.1900.1490.150
13count+where+groupby+orderby+limit0.9390.9760.9270.915
14uniq(int)+where+groupby+orderby+limit1.2271.2731.2081.199
15ucount+where+双groupby+orderby+limit1.0571.0861.0441.041
16count+where+groupby+orderby(count)+limit1.2081.2291.1961.200
17count+where+双groupby+orderby(count)+limit3.3033.2063.4913.213
18count+where+双groupby+limit1.8081.8411.7741.809
19count+toMinute+where+双groupby+orderby+limit5.8725.9195.8975.799
20where精确匹配0.0600.1040.0390.037
21count+where模糊匹配0.8361.1760.6390.694
22any+count+where模糊匹配+groupby+orderby+limit1.0151.4070.8140.825
23any+count+uniq+where模糊匹配+groupby+orderby+limit2.4992.9842.3912.121
24select*(大宽表)+where模糊匹配+orderby+limit1.1952.1000.7700.716
25orderby(time)+limit0.2580.3420.2200.211
26orderby(string)+limit0.2070.2220.2110.189
27orderby(time+string)+limit0.2640.3460.2220.224
28count+avg+where+groupby+having+orderby+limit0.9751.2720.8320.822
29domainWithoutWWW+count+any+where+groupby+having+orderby+limit1.3651.5071.3061.282
30sum(89个field)3.5963.6193.5733.597
31select(uint16字段)+count+sum+avg+where+双groupby+orderby+limit0.7200.7610.6950.703
32select(uint64字段)+count+sum+avg+where+双groupby+orderby+limit1.2741.3751.2171.229
33select(uint64字段)+count+sum+avg+双groupby+orderby+limit9.1118.5769.6869.070
34select(string字段)+count+groupby(string字段)+orderby+limit5.7115.6345.6995.801
35select(string字段)+count+双groupby(其中之一为string字段)+orderby+limit5.8926.0905.7895.797
36select(UInt3)+算术运算+count+groupby(算术运算)+orderby+limit2.0001.9962.0571.946
37select(URL字段更长)+count+where(复杂查询)+groupby(URL)+orderby+limit0.1120.1190.1050.112
38select(Title字段稍短)+count+where(复杂查询)+groupby(URL)+orderby+limit0.0430.0490.0400.040
39select(URL字段更长)+count+where(复杂查询与37号查询条件不一样)+groupby(URL)+orderby+limit0.0410.0480.0380.038
40select(包含Nested字段)+count+where(复杂查询)+groupby(多个条件)+orderby+limit0.2430.2610.2370.231
41select+count+where(复杂查询+halfMD5函数)+groupby(多个条件)+orderby+limit0.0190.0290.0130.014
42select+count+where(复杂查询+halfMD5函数)+groupby(多个条件)+orderby+limit0.0140.0170.0130.011
43select(包含toStartOfMinute函数)+count+where(复杂查询)+groupby(多个条件)+orderby+limit0.0030.0030.0020.003
2. 安装部署(单节点)

参考官网 安装部署 已经足够详细。

3. 关于配置文件

参考官网 配置文件。
由于权限问题,在后续配置操作中建议转换成root用户进行操作

默认clickhouse的配置目录在/etc/clickhouse-server

config.xml – 负责管理clickhouse服务的相关配置users.xml – 负责管理用户权限的配置users.d – 文件夹,可以将每一个用户单独配置xml文件,然后存放在users.d中,系统会扫描该文件夹加载用户配置

4. 修改数据存储路径

原目录:/var/lib/clickhouse
该路径为clickhouse默认存放数据的路径
新目录:/opt/clickhouse
有时候由于原目录会挂载在系统盘,系统盘通常的空间较小,需要扩展空间(例如挂载一个外部的磁盘),这时候需要创建一个新的路径。该路径不要建在/home/xxx下,某些版本的clickhouse会报错

方法一:修改confgi.xml
请首先切换成root权限

vim /etc/clickhouse-server/config.xml

找到等包含/var/lib/clickhouse的路径,替换成目标路径,如:/opt/clickhouse

方法二:迁移目录+软链接
个人建议使用这个方法,对配置的入侵性较低

    移动默认数据路径到目标路径
mv /var/lib/clickhouse/* /opt/clickhouse/
mv /var/log/clickhouse-server /opt/clickhouse/log/
    建立软链接
ln -s /opt/clickhouse /var/lib
ln -s /opt/clickhouse/log/clickhouse-server /var/log
    对新的目录赋权
chown -Rc clickhouse:clickhouse /opt/clickhouse
chown -Rc clickhouse:clickhouse /opt/clickhouse/log/clickhouse-server

二、使用验证 1. 启动服务和客户端登入 1.1 启动clickhouse-server
sudo service clickhouse-server start
1.2 通过clickhouse-client快速访问

第一次登录,请使用默认用户,-m选项是为了让clickhouse-client支持多行输入

clickhouse-client -m --user=default --password=[默认用户密码]

当看见以下输入框时,代表成功连接

2. 创建数据库和相关测试表 2.1 创建数据库
create database test;

show databases;

2.2 数据准备

任务目标:

    通过期货Tick数据的csv文件,创建future_tick_test的表结构将csv文件导入到future_tick_test的表中通过clickhouse的相关函数操作,生成1分钟的Bar

样例文件

rb1610.SHFE.csv该csv是2016-03-25的数据,期货合约为rb1610.SHFE,csv没有header所以后续建表的时候需要对列头进行处理

2.3 新建表格

根据csv中间的数据结构,创建一张列数与csv相同的表。
备注:表的列数可以与csv不相同只需要导入的时候做好处理就可以
可以参考官方说明 csv导入

# 加入IF NOT ESIXTS字段,防止重新创建新表
# 亚秒精度为6位
# 价格的精度为小数点后6位
CREATE TABLE IF NOT EXISTS test.future_tick_test(
  ExchangeTime DateTime64(6),
  LastPrice Decimal64(6),
  Volume Int32,
  TotalVoume Int64,
  TotalAmount Decimal64(6),
  Bid1 Decimal64(6),
  Bid2 Decimal64(6),
  Bid3 Decimal64(6),
  Bid4 Decimal64(6),
  Bid5 Decimal64(6),
  BidVolume1 Int32,
  BidVolume2 Int32,
  BidVolume3 Int32,
  BidVolume4 Int32,
  BidVolume5 Int32,
  Ask1 Decimal64(6),
  Ask2 Decimal64(6),
  Ask3 Decimal64(6),
  Ask4 Decimal64(6),
  Ask5 Decimal64(6),
  AskVolume1 Int32,
  AskVolume2 Int32,
  AskVolume3 Int32,
  AskVolume4 Int32,
  AskVolume5 Int32,
  UpperLimit Decimal64(6),
  LowerLimit Decimal64(6),
  LocalTime DateTime(6),
  OpenInterest Int64,
  Open Decimal64(6),
  High Decimal64(6),
  Low Decimal64(6),
  Close Decimal64(6),
  SettlementPrice Decimal64(6) 
) 
ENGINE = MergeTree
ORDER BY ExchangeTime;
2.4 通过clickhouse-local导入数据

执行脚本如下:

# 如果执行不成功,请在最后加上 --user=default --password=[对应的密码]
cat rb1610.SHFE.csv | clickhouse-client --query="INSERT INTO test.future_tick_test SELECT  parseDateTime64BestEffort(concat('2016-03-25 ', col1)), col2, col3, col4, col5, col6, col7, col8, col9, col10, col11, col12, col13, col14, col15, col16, col17, col18, col19, col20, col21, col22, col23, col24, col25, col26, col27, parseDateTime64BestEffort(concat('2016-03-25 ', col28)), col29, col30, col31, col32, col33, col34 FROM input('col1 String, col2 Decimal64(6), col3 Int32, col4 Int64, col5 Decimal64(6), col6 Decimal64(6), col7 Decimal64(6), col8 Decimal64(6), col9 Decimal64(6), col10 Decimal64(6), col11 Int32, col12 Int32, col13 Int32, col14 Int32, col15 Int32, col16 Decimal64(6), col17 Decimal64(6), col18 Decimal64(6), col19 Decimal64(6), col20 Decimal64(6), col21 Int32, col22 Int32, col23 Int32, col24 Int32, col25 Int32, col26 Decimal64(6), col27 Decimal64(6), col28 String, col29 Int64, col30 Decimal64(6), col31 Decimal64(6), col32 Decimal64(6), col33 Decimal64(6), col34 Decimal64(6)') FORMAT CSV"

为了更方便看到Insert语句的内容,格式化后的Insert语句如下

# parseDateWithBestEffort是clickhouse中的函数,方便将String转成DateTime
# input('') 需要定义csv中对应列的数据类型,colN为第几列,从1开始计数
INSERT INTO 
	test.future_tick_test 
	SELECt  
		parseDateTimeWithBestEffort(concat('2016-03-25 ', col1)), 
		col2, col3, col4, col5, col6, col7, col8, col9, col10,
		col11, col12, col13, col14, col15, col16, col17, col18, col19, col20,
		col21, col22, col23, col24, col25, col26, col27,
		parseDateTimeWithBestEffort(concat('2016-03-25 ', col28)), 
		col29, col30, col31, col32, col33, col34 
	FROM 
		input('col1 string, col2 Decimal64(6), col3 Int32, col4 Int64, col5 Decimal64(6), col6 Decimal64(6), col7 Decimal64(6), col8 Decimal64(6), col9 Decimal64(6), col10 Decimal64(6), col11 Int32, col12 Int32, col13 Int32, col14 Int32, col15 Int32, col16 Decimal64(6), col17 Decimal64(6), col18 Decimal64(6), col19 Decimal64(6), col20 Decimal64(6), col21 Int32, col22 Int32, col23 Int32, col24 Int32, col25 Int32, col26 Decimal64(6), col27 Decimal64(6), col28 string, col29 Int64, col30 Decimal64(6), col31 Decimal64(6), col32 Decimal64(6), col33 Decimal64(6), col34 Decimal64(6)') 
FORMAT CSV

执行成功后通过clickhouse-client查询对应表,发现已经有数据

TX-Power-G :) select * from future_tick_test limit 10;

SELECt *
FROM future_tick_test
LIMIT 10

Query id: 6e37077b-9f53-433d-aaee-702b14274438

┌───────────────ExchangeTime─┬─LastPrice─┬─Volume─┬─TotalVoume─┬─TotalAmount─┬─Bid1─┬─Bid2─┬─Bid3─┬─Bid4─┬─Bid5─┬─BidVolume1─┬─BidVolume2─┬─BidVolume3─┬─BidVolume4─┬─BidVolume5─┬─Ask1─┬─Ask2─┬─Ask3─┬─Ask4─┬─Ask5─┬─AskVolume1─┬─AskVolume2─┬─AskVolume3─┬─AskVolume4─┬─AskVolume5─┬─UpperLimit─┬─LowerLimit─┬──────────────────LocalTime─┬─OpenInterest─┬─Open─┬─High─┬──Low─┬─Close─┬─SettlementPrice─┐
│ 2016-03-25 20:59:00.500000 │      2190 │   6028 │       6028 │   132013000 │ 2189 │    0 │    0 │    0 │    0 │         81 │          0 │          0 │          0 │          0 │ 2190 │    0 │    0 │    0 │    0 │        446 │          0 │          0 │          0 │          0 │          0 │          0 │ 2016-03-25 20:58:37.832000 │      2599820 │ 2190 │ 2190 │ 2190 │     0 │               0 │
│ 2016-03-25 21:00:00.500000 │      2190 │   5392 │      11420 │   250071000 │ 2190 │    0 │    0 │    0 │    0 │        388 │          0 │          0 │          0 │          0 │ 2191 │    0 │    0 │    0 │    0 │         24 │          0 │          0 │          0 │          0 │          0 │          0 │ 2016-03-25 20:59:37.464000 │      2600460 │ 2190 │ 2191 │ 2187 │     0 │               0 │
│ 2016-03-25 21:00:01.000000 │      2192 │   3022 │      14442 │   316259000 │ 2191 │    0 │    0 │    0 │    0 │         28 │          0 │          0 │          0 │          0 │ 2192 │    0 │    0 │    0 │    0 │        425 │          0 │          0 │          0 │          0 │          0 │          0 │ 2016-03-25 20:59:37.950000 │      2600130 │ 2190 │ 2192 │ 2187 │     0 │               0 │
│ 2016-03-25 21:00:01.500000 │      2193 │   3290 │      17732 │   388360000 │ 2192 │    0 │    0 │    0 │    0 │         50 │          0 │          0 │          0 │          0 │ 2193 │    0 │    0 │    0 │    0 │        438 │          0 │          0 │          0 │          0 │          0 │          0 │ 2016-03-25 20:59:38.458000 │      2601360 │ 2190 │ 2193 │ 2187 │     0 │               0 │
│ 2016-03-25 21:00:02.000000 │      2193 │   1714 │      19446 │   425939000 │ 2192 │    0 │    0 │    0 │    0 │        380 │          0 │          0 │          0 │          0 │ 2193 │    0 │    0 │    0 │    0 │        290 │          0 │          0 │          0 │          0 │          0 │          0 │ 2016-03-25 20:59:38.958000 │      2602080 │ 2190 │ 2193 │ 2187 │     0 │               0 │
│ 2016-03-25 21:00:02.500000 │      2192 │   1488 │      20934 │   458566000 │ 2193 │    0 │    0 │    0 │    0 │         10 │          0 │          0 │          0 │          0 │ 2194 │    0 │    0 │    0 │    0 │        992 │          0 │          0 │          0 │          0 │          0 │          0 │ 2016-03-25 20:59:39.464000 │      2602560 │ 2190 │ 2194 │ 2187 │     0 │               0 │
│ 2016-03-25 21:00:03.000000 │      2193 │   1684 │      22618 │   495495000 │ 2193 │    0 │    0 │    0 │    0 │        155 │          0 │          0 │          0 │          0 │ 2194 │    0 │    0 │    0 │    0 │       1054 │          0 │          0 │          0 │          0 │          0 │          0 │ 2016-03-25 20:59:39.947000 │      2603760 │ 2190 │ 2194 │ 2187 │     0 │               0 │
│ 2016-03-25 21:00:03.500000 │      2193 │   1886 │      24504 │   536854000 │ 2192 │    0 │    0 │    0 │    0 │        784 │          0 │          0 │          0 │          0 │ 2193 │    0 │    0 │    0 │    0 │        110 │          0 │          0 │          0 │          0 │          0 │          0 │ 2016-03-25 20:59:40.455000 │      2604160 │ 2190 │ 2194 │ 2187 │     0 │               0 │
│ 2016-03-25 21:00:04.000000 │      2192 │    574 │      25078 │   549439000 │ 2192 │    0 │    0 │    0 │    0 │        653 │          0 │          0 │          0 │          0 │ 2193 │    0 │    0 │    0 │    0 │       1711 │          0 │          0 │          0 │          0 │          0 │          0 │ 2016-03-25 20:59:40.963000 │      2604440 │ 2190 │ 2194 │ 2187 │     0 │               0 │
│ 2016-03-25 21:00:04.500000 │      2192 │   2172 │      27250 │   597060000 │ 2192 │    0 │    0 │    0 │    0 │         42 │          0 │          0 │          0 │          0 │ 2193 │    0 │    0 │    0 │    0 │       1637 │          0 │          0 │          0 │          0 │          0 │          0 │ 2016-03-25 20:59:41.448000 │      2605580 │ 2190 │ 2194 │ 2187 │     0 │               0 │
└────────────────────────────┴───────────┴────────┴────────────┴─────────────┴──────┴──────┴──────┴──────┴──────┴────────────┴────────────┴────────────┴────────────┴────────────┴──────┴──────┴──────┴──────┴──────┴────────────┴────────────┴────────────┴────────────┴────────────┴────────────┴────────────┴────────────────────────────┴──────────────┴──────┴──────┴──────┴───────┴─────────────────┘

10 rows in set. Elapsed: 0.019 sec. Processed 8.19 thousand rows, 1.87 MB (438.61 thousand rows/s., 100.00 MB/s.)

TX-Power-G :) 
2.5 通过Tick聚合生成分钟Bar
    clickhouse自带非常好用的时间聚合函数,如:toStartOfXXX(),可以按多种时间周期聚合当聚合之后,需要计算OHLCV的这些指标的时候,因为需要在分片中做简单的统计,所以需要partition相关的函数。幸运的是在clickhouse 21.x版本之后,已经支持window function,可以方便我们做窗口统计。关于window function的说明,请看window function然而window function的说明文档写得过于精简,如果想要更多的例子,请查看GitHub上面window_function例子

下面会循序渐进的演示如何从Tick数据聚合成分钟Bar

    先查询出相关的数据,并且Group by在一起排序
# + INTERVAL 1 MINUTE 的意思是将09:00:00-09:01:00之间的Tick聚合之后,算作09:01的Bar
SELECt 
  ExchangeTime,
  toStartOfMinute(ExchangeTime) + INTERVAL 1 MINUTE AS Time,
  LastPrice,
  Volume AS TickVol
FROM future_tick_test
GROUP BY 
  ExchangeTime,
  Time,
  LastPrice,
  TickVol
ORDER BY 
  ExchangeTime,
  Time
    这时候首先想到需要基于Group by的数据进行分片,在21.x版本之前,由于没有window function,需要通过复杂的groupArray和arrayEnumerate等函数来解决。在21.x版本之后,我们可以直接通过partition关键字来处理
# 此处可以看到,通过window关键字定义了两个partition,分别是基于ExchangeTime做升序和降序的排序
# 这样处理的目的是通过any()函数可以方便访问到partition中第一条数据和最后一条数据,方便生成Open和Close
SELECt *,
  any(LastPrice) over w1 as Open,
  max(LastPrice) over w1 as High,
  min(LastPrice) over w1 as Low,
  any(LastPrice) over w2 as Close,
  sum(TickVol) over w1 as MinVol,
  count() over w2 as num
FROM (
  SELECt 
    ExchangeTime,
    toStartOfMinute(ExchangeTime) + INTERVAL 1 MINUTE AS Time,
    LastPrice,
    Volume AS TickVol
  FROM future_tick_test
  GROUP BY 
    ExchangeTime,
    Time,
    LastPrice,
    TickVol
  ORDER BY 
    ExchangeTime,
    Time
)
WINDOW 
  w1 AS (PARTITION BY Time ORDER BY ExchangeTime),
  w2 as (PARTITION BY Time ORDER BY ExchangeTime DESC)
    基于这个数据集,通过where将num=1的时候,可以取出所有分钟Bar
SELECt 
  Time, Open, High, Low, Close, MinVol as Volume
FROM (
  SELECt *,
    any(LastPrice) over w1 as Open,
    max(LastPrice) over w1 as High,
    min(LastPrice) over w1 as Low,
    any(LastPrice) over w2 as Close,
    sum(TickVol) over w1 as MinVol,
    count() over w2 as num
  FROM (
    SELECt 
      ExchangeTime,
      toStartOfMinute(ExchangeTime) + INTERVAL 1 MINUTE AS Time,
      LastPrice,
      Volume AS TickVol
    FROM future_tick_test
    GROUP BY 
      ExchangeTime,
      Time,
      LastPrice,
      TickVol
    ORDER BY 
      ExchangeTime,
      Time
  )
  WINDOW 
    w1 AS (PARTITION BY Time ORDER BY ExchangeTime),
    w2 as (PARTITION BY Time ORDER BY ExchangeTime DESC)
)
WHERe num = 1
    最后结果取前十条如下:
Query id: 466a5a99-3464-4119-aec2-b15fc8260654

┌────────────────Time─┬─Open─┬─High─┬──Low─┬─Close─┬─Volume─┐
│ 2016-03-25 21:00:00 │ 2190 │ 2190 │ 2190 │  2190 │   6028 │
│ 2016-03-25 21:01:00 │ 2190 │ 2193 │ 2186 │  2193 │  94536 │
│ 2016-03-25 21:02:00 │ 2192 │ 2192 │ 2186 │  2188 │  66494 │
│ 2016-03-25 21:03:00 │ 2187 │ 2189 │ 2185 │  2186 │  46082 │
│ 2016-03-25 21:04:00 │ 2187 │ 2188 │ 2186 │  2186 │  40514 │
│ 2016-03-25 21:05:00 │ 2186 │ 2186 │ 2182 │  2183 │  67240 │
│ 2016-03-25 21:06:00 │ 2184 │ 2191 │ 2183 │  2190 │  68632 │
│ 2016-03-25 21:07:00 │ 2189 │ 2195 │ 2189 │  2193 │  69874 │
│ 2016-03-25 21:08:00 │ 2193 │ 2197 │ 2192 │  2197 │  86218 │
│ 2016-03-25 21:09:00 │ 2197 │ 2199 │ 2195 │  2195 │  77088 │
└─────────────────────┴──────┴──────┴──────┴───────┴────────┘

10 rows in set. Elapsed: 0.018 sec. Processed 21.59 thousand rows, 431.90 KB (1.20 million rows/s., 23.95 MB/s.)

TX-Power-G :) 
    整个Tick数据集总数为21595,聚合分钟Bar处理的时间如下:
181 rows in set. Elapsed: 0.020 sec. 
Processed 21.59 thousand rows, 431.90 KB (1.07 million rows/s., 21.45 MB/s.)
监控工具

待完成


三、构造集群 1. 添加节点

待完成


总结

本篇文章基于clickhouse的官方文档,做了以下操作:

    尝试搭建clickhouse的服务,并且通过一个1亿条数据的数据集,测试各种查询语句的性能通过一份csv数据导入到clickhouse的表结构里面通过clickhouse的函数,将Tick数据聚合成分钟Bar重要提示 官方文档的 示例数据集 中有很多很好的实践例子,可以反复查阅学习
转载请注明:文章转载自 www.mshxw.com
本文地址:https://www.mshxw.com/it/779333.html
我们一直用心在做
关于我们 文章归档 网站地图 联系我们

版权所有 (c)2021-2022 MSHXW.COM

ICP备案号:晋ICP备2021003244-6号