I'm a beginner, and this post records what I've learned. If you spot any mistakes, please point them out and I'll correct them promptly!
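Core concepts
- Index
Roughly equivalent to a database in MySQL (with mapping types deprecated since 7.x, an index is often compared to a table instead)
- Document
Roughly equivalent to a row of data in MySQL
Note: in ElasticSearch, all data operations are expressed as JSON
The architecture section is still being studied...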
IK analyzer
- Coarsest segmentation (fewest tokens), ik_smart, example
GET _analyze // analyze API
{
"analyzer": "ik_smart", // analyzer to apply
"text": "罗老师喜欢讲张三" // text to analyze
}
// Response
{
"tokens" : [
{
"token" : "罗",
"start_offset" : 0,
"end_offset" : 1,
"type" : "CN_CHAR",
"position" : 0
},
{
"token" : "老师",
"start_offset" : 1,
"end_offset" : 3,
"type" : "CN_WORD",
"position" : 1
},
{
"token" : "喜欢",
"start_offset" : 3,
"end_offset" : 5,
"type" : "CN_WORD",
"position" : 2
},
{
"token" : "讲",
"start_offset" : 5,
"end_offset" : 6,
"type" : "CN_CHAR",
"position" : 3
},
{
"token" : "张三",
"start_offset" : 6,
"end_offset" : 8,
"type" : "CN_WORD",
"position" : 4
}
]
}
- Finest-grained segmentation, ik_max_word, example
GET _analyze
{
"analyzer": "ik_max_word", // also emits every word found in the IK dictionary, so tokens may overlap
"text": "罗老师喜欢讲张三"
}
// Response
{
"tokens" : [
{
"token" : "罗",
"start_offset" : 0,
"end_offset" : 1,
"type" : "CN_CHAR",
"position" : 0
},
{
"token" : "老师",
"start_offset" : 1,
"end_offset" : 3,
"type" : "CN_WORD",
"position" : 1
},
{
"token" : "喜欢",
"start_offset" : 3,
"end_offset" : 5,
"type" : "CN_WORD",
"position" : 2
},
{
"token" : "讲",
"start_offset" : 5,
"end_offset" : 6,
"type" : "CN_CHAR",
"position" : 3
},
{
"token" : "张三",
"start_offset" : 6,
"end_offset" : 8,
"type" : "CN_WORD",
"position" : 4
},
{
"token" : "三",
"start_offset" : 7,
"end_offset" : 8,
"type" : "TYPE_CNUM",
"position" : 5
}
]
}
Basic commands
Index commands
PUT /test1 // create the index test1
DELETE /test1 // delete the index
GET /_cat/indices?v // view index status
Document commands
PUT /test1/doc/1 // insert a single document; index: test1, type: doc, id: 1
{
"name": "张三"
}
GET /test1/doc/1 // fetch the document
POST /test1/doc/1/_update // update the document
{
"doc": {"name": "张三2号"}
}
DELETE /test1/doc/1 // delete the document
POST /test1/doc/_bulk // bulk insert
{"index":{"_id": "1"}}
{"name": "张三1号","age":1,"tel": 111,"father":"1号父亲","nickName":"小三三1号"}
{"index":{"_id": "2"}}
{"name": "张三2号","age":2,"tel": 222,"father":"2号父亲","nickName":"小三三2号"}
{"index":{"_id": "3"}}
{"name": "张三3号","age":3,"tel": 333,"father":"3号父亲","nickName":"小三三3号"}
{"index":{"_id": "4"}}
{"name": "张三4号","age":4,"tel": 444,"father":"4号父亲","nickName":"小三三4号"}
{"index":{"_id": "5"}}
{"name": "张三5号","age":5,"tel": 555,"father":"5号父亲","nickName":"小三三5号"}
{"index":{"_id": "6"}}
{"name": "张三6号","age":6,"tel": 666,"father":"6号父亲","nickName":"小三三6号"}
{"index":{"_id": "7"}}
{"name": "张三7号","age":7,"tel": 777,"father":"7号父亲","nickName":"小三三7号"}
{"index":{"_id": "8"}}
{"name": "张三8号","age":8,"tel": 888,"father":"8号父亲","nickName":"小三三8号"}
{"index":{"_id": "9"}}
{"name": "张三9号","age":9,"tel": 999,"father":"9号父亲","nickName":"小三三9号"}
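The bulk body above is newline-delimited JSON: each document takes an action line followed by a source line, and the body has to end with a newline. As a rough sketch of how such a body could be assembled by hand in Java (the class and method names are hypothetical, and the ids are assumed to need no JSON escaping):

```java
import java.util.List;

// Hypothetical helper, not part of any Elasticsearch client library.
class BulkBodySketch {

    // Builds an NDJSON bulk body: one action line plus one source line per document.
    // Assumes ids need no JSON escaping and each source is a single-line JSON object.
    static String buildBulkBody(List<String> ids, List<String> jsonSources) {
        if (ids.size() != jsonSources.size()) {
            throw new IllegalArgumentException("ids and sources must have the same length");
        }
        StringBuilder body = new StringBuilder();
        for (int i = 0; i < ids.size(); i++) {
            body.append("{\"index\":{\"_id\":\"").append(ids.get(i)).append("\"}}\n");
            body.append(jsonSources.get(i)).append('\n');
        }
        return body.toString();
    }

    public static void main(String[] args) {
        String body = buildBulkBody(
                List.of("1", "2"),
                List.of("{\"age\":1,\"tel\":111}", "{\"age\":2,\"tel\":222}"));
        System.out.print(body);
    }
}
```

In practice a JSON library would produce the source lines; the point here is only the two-lines-per-document layout.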
Search commands
Basic search commands
- Search all
GET /test1/_search
{
"query": {"match_all": {}}
}
- Search all with pagination
GET /test1/_search
{
"query": {"match_all": {}},
"from": 1,
"size": 5
}
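Note that `from` is a zero-based offset into the hit list rather than a page number, so the request above skips the first hit. A minimal sketch of turning a 1-based page number into `from`, assuming a hypothetical helper named `PageOffset`:

```java
// Hypothetical helper for illustration only.
class PageOffset {

    // Converts a 1-based page number and a page size into the zero-based "from" offset.
    static int from(int page, int size) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must be >= 1");
        }
        return (page - 1) * size;
    }

    public static void main(String[] args) {
        System.out.println(from(1, 5)); // first page starts at offset 0
        System.out.println(from(3, 5)); // third page starts at offset 10
    }
}
```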
- Sort by a given field in descending order
GET /test1/_search
{
"query": {"match_all": {}},
"sort": {"tel": "desc"}
}
- Search and return only the specified fields
GET /test1/_search
{
"query": {"match_all": {}},
"_source": ["name" , "tel"]
}
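- Match search
GET /test1/_search
{
"query": {
"match": {
"father": "三三3号" // text fields: the query string is analyzed, so any matching token hits
}
}
}
GET /test1/_search
{
"query": {
"match": {
"tel": 222 // numeric fields: exact match
}
}
}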
- Phrase match search
GET /test1/_search
{
"query": {
"match_phrase": {
"father": "号 父亲"
}
}
}
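Advanced search commands
- Compound (bool) search
// must: every clause must match
// should: at least one clause should match (when must or filter is also present, should clauses only affect scoring)
// must_not: no clause may match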
GET /test1/_search
{
"query": {
"bool": {
"must": [
{"match":{"father": "号"}},
{"match":{"father": "亲"}}
],
"must_not": [
{"match":{"name": "2"}}
]
}
}
}
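To make the clause semantics concrete, the sketch below models `must` and `must_not` with plain Java predicates. It is only an analogy under loud assumptions: documents are simple maps, `match` is approximated by a substring test (real matching depends on the analyzer), scoring is ignored, and the class and method names are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical in-memory analogy; this is not how Elasticsearch executes queries.
class BoolQuerySketch {

    // Stand-in for "match": true when the field value contains the term.
    static Predicate<Map<String, String>> match(String field, String term) {
        return doc -> doc.getOrDefault(field, "").contains(term);
    }

    // A document passes when every must clause matches and no must_not clause matches.
    static boolean matches(Map<String, String> doc,
                           List<Predicate<Map<String, String>>> must,
                           List<Predicate<Map<String, String>>> mustNot) {
        return must.stream().allMatch(p -> p.test(doc))
                && mustNot.stream().noneMatch(p -> p.test(doc));
    }

    public static void main(String[] args) {
        // ASCII stand-ins for two of the sample documents.
        Map<String, String> doc2 = Map.of("name", "zhangsan-2", "father", "father-no-2");
        Map<String, String> doc3 = Map.of("name", "zhangsan-3", "father", "father-no-3");
        List<Predicate<Map<String, String>>> must =
                List.of(match("father", "father"), match("father", "no"));
        List<Predicate<Map<String, String>>> mustNot = List.of(match("name", "2"));
        System.out.println(matches(doc2, must, mustNot)); // excluded by must_not
        System.out.println(matches(doc3, must, mustNot)); // passes both must clauses
    }
}
```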
- Filtered search
GET /test1/_search // keep only documents whose tel is between 300 and 500 (inclusive)
{
"query": {
"bool": {
"must":{"match_all": {}},
"filter":{
"range": {
"tel": {
"gte": 300,
"lte": 500
}
}
}
}
}
}
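As a rough in-memory analogy of the range filter above (not how Elasticsearch evaluates filters), the same selection over the sample `tel` values can be sketched with a Java stream. The class and method names are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration of inclusive range bounds.
class RangeFilterSketch {

    // Keeps values v with gte <= v <= lte, mirroring the inclusive "gte"/"lte" bounds.
    static List<Integer> inRange(List<Integer> values, int gte, int lte) {
        return values.stream()
                .filter(v -> v >= gte && v <= lte)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> tels = List.of(111, 222, 333, 444, 555, 666, 777, 888, 999);
        System.out.println(inRange(tels, 300, 500)); // prints [333, 444]
    }
}
```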
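Integrating ElasticSearch with SpringBoot
- Add the dependency
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>
- Add an ElasticSearch configuration class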
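import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ElasticSearchClientConfig {
    // Client pointing at the default local ElasticSearch address
    @Bean
    public RestHighLevelClient restHighLevelClient() {
        return new RestHighLevelClient(
                RestClient.builder(
                        new HttpHost("127.0.0.1", 9200, "http")));
    }
}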
- Crawl the data (or load it from a database)
- Put the data into the ES index
public Boolean parseContent(String keywords) throws Exception {
    List<?> contentList = new HtmlParseUtil().parseJD(keywords);
    // Put the retrieved data into ES
    // Create a bulk request
    BulkRequest bulkRequest = new BulkRequest();
    // Request timeout
    bulkRequest.timeout("2m");
    // Add one index request per item
    for (Object content : contentList) {
        bulkRequest.add(new IndexRequest("jd_goods") // target index
                .source(JSON.toJSONString(content), XContentType.JSON));
    }
    // Send the request through the client
    BulkResponse bulkResponse = restHighLevelClient.bulk(bulkRequest, RequestOptions.DEFAULT);
    // True only when every item was indexed without failure
    return !bulkResponse.hasFailures();
}
- Fetch the data and implement search and highlighting
public List
Source code
Link



