本文使用的版本是21.12.3.32, 准备6台虚拟机作为集群
单机版-传送门
https://blog.csdn.net/xushijie89/article/details/122987121
(1)修改每台配置文件, 文件不存在可以新建一下
[root@172.16.131.56 /etc/clickhouse-server]$ vim /etc/clickhouse-server/config.d/metrika.xmltrue hadoop1 9001 hadoop2 9001 true hadoop3 9001 hadoop4 9001 true hadoop5 9001 hadoop6 9001 hadoop1 2181 hadoop2 2181 hadoop3 2181 01 database_name hadoop1 10000000000 0.01 lz4
(2)检查一下ClickHouse配置指向路径是否正确
[root@172.16.131.56 /etc/clickhouse-server]$ vim /etc/clickhouse-server/config.xml 检查这行配置/etc/clickhouse-server/config.d/metrika.xml
(3)启动服务
/etc/init.d/clickhouse-server start
(4)连接控制台, 创建数据库
[root@172.16.131.56 /etc/clickhouse-server/config.d]$ clickhouse-client -m CREATE DATAbase db_name on cluster cluster_3shards_2replicas COMMENT '测试库'
(5)新建复制表(物理表), 真实保存数据的
create table table_name on cluster cluster_3shards_2replicas (
id UInt32,
sku_id String,
total_amount Decimal(16,2),
create_time Datetime
) engine
= ReplicatedMergeTree('/clickhouse/tables/{shard}/{database}/table_name','{replica}')
partition by toYYYYMMDD(create_time)
primary key (id)
order by (id,sku_id);
on cluster cluster_3shards_2replicas:
建表时加上这个参数, 可以在cluster_3shards_2replicas指定的分片中, 都执行一次命令
ReplicatedMergeTree()的参数:
zoo_path: ZooKeeper中表的路径
replica_name: ZooKeeper中副本的名字
other_parameters(可以不填): Parameters of an engine which is used for creating the replicated version, for example, version in ReplacingMergeTree.
(6)新建分布式表(逻辑表), 不保存数据
CREATE TABLE db_name.table_name_all on cluster cluster_3shards_2replicas(
id UInt32,
sku_id String,
total_amount Decimal(16,2),
create_time Datetime
) ENGINE
= Distributed(cluster_3shards_2replicas, db_name, table_name, hiveHash(id));
Distributed()的参数:
cluster: 配置里的分片名
database: 数据库名
table: 表名
sharding_key(可选): 分片键
policy_name(可选): 策略名, 用于存储临时文件进行异步发送
(6)先在分布式表上插入数据
insert into table_name_all values (201, ‘sku_001’, 1000.00, ‘2020-06-01 12:00:00’);
逐台机器查看数据落在哪个分片的物理表上
select * from table_name;
最后看下分片的副本是否也有数据存在



