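Original article: ElasticSearch--Bool Query--Usage/Tutorial/Examples, from the IT利刃出鞘 blog on CSDN
Introduction
This article uses examples to show how to use the Elasticsearch bool query. The examples cover simple queries and a more complex case: nested queries (a bool query inside a bool query).
The clauses of a bool query are combined with logical AND: Elasticsearch treats a document as matching only when it satisfies every clause of the bool query.
A single clause evaluates to true as soon as the document satisfies its condition. A should clause usually holds several sub-queries; the minimum_should_match parameter sets how many of them a document must satisfy, and the should clause evaluates to true only when that number is reached.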
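Bool query clauses
A bool query supports four clause types: must, should, must_not, and filter:
- must: the document must match the must conditions (logical AND).
- should: the document should match the should conditions (logical OR).
  - If the query also contains must or filter, should only affects scoring; a document is returned even if it matches none of the should conditions.
  - If the query contains neither must nor filter, the document must match at least one should condition.
- must_not: the opposite of must; documents that match these conditions are not returned (logical NOT).
- filter: like must, only documents that match the filter conditions are returned; unlike must, filter is not scored (it does not affect _score) and only filters (logical AND).
Each of the four clauses accepts an array of queries, and any element of that array can itself be a bool query, so nested logical operations are supported.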
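The four clause types above can be modeled in a few lines of Python. This is a toy sketch and not how Elasticsearch works internally: `evaluate_bool`, `has`, and the fixed per-clause scores are hypothetical names and values chosen for illustration, and the substring test stands in for real text analysis.

```python
def evaluate_bool(doc, must=(), should=(), must_not=(), filter_=(),
                  minimum_should_match=None):
    """Return a toy score if `doc` matches the bool query, else None.

    must / should take (predicate, score) pairs; filter_ / must_not take
    plain predicates because they never contribute to the score.
    """
    if minimum_should_match is None:
        # Default described above: 1 only when should stands alone.
        minimum_should_match = 1 if (should and not must and not filter_) else 0
    should_scores = [score for pred, score in should if pred(doc)]
    matched = (all(pred(doc) for pred, _ in must)
               and all(pred(doc) for pred in filter_)
               and not any(pred(doc) for pred in must_not)
               and len(should_scores) >= minimum_should_match)
    if not matched:
        return None
    # Only must and should add to the score.
    return sum(score for _, score in must) + sum(should_scores)


def has(field, word):
    """Hypothetical predicate: case-insensitive substring test on one field."""
    return lambda doc: word.lower() in doc[field].lower()


doc6 = {"title": "Java学习教程6", "author": "Spider Man"}
doc7 = {"title": "Java学习教程7", "author": "Iron Man"}
must = [(has("title", "java"), 1.0)]
should = [(has("author", "iron"), 0.5), (has("author", "captain"), 0.5)]

print(evaluate_bool(doc7, must=must, should=should))   # 1.5
print(evaluate_bool(doc6, must=must, should=should))   # 1.0
print(evaluate_bool(doc6, must=must, should=should, minimum_should_match=1))  # None
print(evaluate_bool(doc7, must=must, must_not=[has("author", "iron")]))       # None
```

In this model a document that matches no should condition is still returned when must is present, only with a lower score, which mirrors the rule stated in the list.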
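Official documentation
Bool Query | Elasticsearch Reference [7.15] | Elastic
Detailed explanation
A bool query combines multiple query conditions.
Within a bool query, clauses run in either query context or filter context: must and should run in query context, while filter and must_not run in filter context. The key difference between the two is that filter context does not contribute to the score (_score).
Scoring makes a search more complex and costs more CPU, so filter and must_not clauses, which skip scoring, lighten the search workload. Elasticsearch also caches frequently used filters automatically, which improves query performance.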
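As a small illustration of the split between the two contexts, the sketch below builds a request body in Python that keeps the full-text condition in must (scored) and moves exact-value conditions into filter (not scored). The helper name `search_body` is hypothetical; the fields `title`, `category`, and `status` are taken from the sample index used later in this article.

```python
import json


def search_body(scored, unscored):
    """Build a bool query: `scored` goes to must, `unscored` goes to filter."""
    return {"query": {"bool": {"must": scored, "filter": unscored}}}


body = search_body(
    scored=[{"match": {"title": "java"}}],           # contributes to _score
    unscored=[{"term": {"category": "ElasticSearch"}},
              {"term": {"status": 1}}],              # yes/no only, no scoring
)
print(json.dumps(body, ensure_ascii=False, indent=2))
```

Conditions that only need a yes/no answer, such as a category or a status flag, are the usual candidates for filter.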
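should
A should clause is usually an array containing several sub-queries. How many of them a document must match is controlled by minimum_should_match, and the default depends on the other clauses in the query.
If the query contains must or filter, a document is returned even if it matches none of the should items; in that case should only affects scoring (minimum_should_match defaults to 0).
If the query contains neither must nor filter, the document must match at least one should item (minimum_should_match defaults to 1).
For example, with the following query a document must match at least two of the three match queries in the should clause: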
{
  "query": {
    "bool": {
      "should": [
        { "match": { "title": "ElasticSearch 学习" } },
        { "match": { "title": "Java学习笔记" } },
        { "match": { "title": "搜索引擎学习笔记" } }
      ],
      "minimum_should_match": 2
    }
  }
}
Index structure and data
Create the index and insert the data first; the examples below depend on them.
Index structure
http://localhost:9200/
PUT blog
{
  "mappings": {
    "properties": {
      "id": {
        "type": "long"
      },
      "title": {
        "type": "text"
      },
      "content": {
        "type": "text"
      },
      "author": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword"
          }
        }
      },
      "category": {
        "type": "keyword"
      },
      "createTime": {
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss.SSS||yyyy-MM-dd'T'HH:mm:ss.SSS||yyyy-MM-dd HH:mm:ss||epoch_millis"
      },
      "updateTime": {
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss.SSS||yyyy-MM-dd'T'HH:mm:ss.SSS||yyyy-MM-dd HH:mm:ss||epoch_millis"
      },
      "status": {
        "type": "integer"
      },
      "serialNum": {
        "type": "keyword"
      }
    }
  }
}
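The requests in this article are written as a base URL, an HTTP method, and a path. For readers who prefer a script to a GUI client, the following sketch shows one way to build such a request with Python's standard library. The helper `es_request` is a hypothetical name, the mapping is shortened to one field, and nothing is sent unless the last lines are uncommented against a running node.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200/"  # base URL used throughout this article


def es_request(method, path, body=None, content_type="application/json"):
    """Build (but do not send) an HTTP request for the given method and path."""
    data = None
    if body is not None:
        # Accept either a ready-made string or a JSON-serializable object.
        text = body if isinstance(body, str) else json.dumps(body, ensure_ascii=False)
        data = text.encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": content_type},
    )


# Shortened mapping, for illustration only.
req = es_request("PUT", "blog",
                 {"mappings": {"properties": {"title": {"type": "text"}}}})
print(req.get_method(), req.full_url)  # PUT http://localhost:9200/blog

# To actually send it (requires a running Elasticsearch node):
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```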
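Data
- Each action line and each document must occupy exactly one line; a JSON object cannot be broken across lines.
- Run this request from Postman; it fails when run from the head plugin.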
http://localhost:9200/
POST _bulk
{"index":{"_index":"blog","_id":1}}
{"blogId":1,"title":"Spring Data ElasticSearch学习教程1","content":"这是批量添加的文档1","author":"Iron Man","category":"ElasticSearch","status":1,"serialNum":"1","createTime":"2021-10-10 11:52:01.249","updateTime":null}
{"index":{"_index":"blog","_id":2}}
{"blogId":2,"title":"Spring Data ElasticSearch学习教程2","content":"这是批量添加的文档2","author":"Iron Man","category":"ElasticSearch","status":1,"serialNum":"2","createTime":"2021-10-10 11:52:02.249","updateTime":null}
{"index":{"_index":"blog","_id":3}}
{"blogId":3,"title":"Spring Data ElasticSearch学习教程3","content":"这是批量添加的文档3","author":"Captain America","category":"ElasticSearch","status":1,"serialNum":"3","createTime":"2021-10-10 11:52:03.249","updateTime":null}
{"index":{"_index":"blog","_id":4}}
{"blogId":4,"title":"Spring Data ElasticSearch学习教程4","content":"这是批量添加的文档4","author":"Captain America","category":"ElasticSearch","status":1,"serialNum":"4","createTime":"2021-10-10 11:52:04.249","updateTime":null}
{"index":{"_index":"blog","_id":5}}
{"blogId":5,"title":"Spring Data ElasticSearch学习教程5","content":"这是批量添加的文档5","author":"Spider Man","category":"ElasticSearch","status":1,"serialNum":"5","createTime":"2021-10-10 11:52:05.249","updateTime":null}
{"index":{"_index":"blog","_id":6}}
{"blogId":6,"title":"Java学习教程6","content":"这是批量添加的文档6","author":"Spider Man","category":"ElasticSearch","status":1,"serialNum":"6","createTime":"2021-10-10 11:52:06.249","updateTime":null}
{"index":{"_index":"blog","_id":7}}
{"blogId":7,"title":"Java学习教程7","content":"这是批量添加的文档7","author":"Iron Man","category":"ElasticSearch","status":1,"serialNum":"7","createTime":"2021-10-10 11:52:07.249","updateTime":null}
{"index":{"_index":"blog","_id":8}}
{"blogId":8,"title":"Java学习教程8","content":"这是批量添加的文档8","author":"Iron Man","category":"ElasticSearch","status":1,"serialNum":"8","createTime":"2021-10-10 11:52:08.249","updateTime":null}
{"index":{"_index":"blog","_id":9}}
{"blogId":9,"title":"Java学习教程9","content":"这是批量添加的文档9","author":"Captain America","category":"ElasticSearch","status":1,"serialNum":"9","createTime":"2021-10-10 11:52:09.249","updateTime":null}
{"index":{"_index":"blog","_id":10}}
{"blogId":10,"title":"Java学习教程10","content":"这是批量添加的文档10","author":"God Of Thunder","category":"ElasticSearch","status":1,"serialNum":"10","createTime":"2021-10-10 11:52:10.249","updateTime":null}
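The one-object-per-line requirement noted above is easy to meet when the body is generated instead of typed by hand. The sketch below is one possible way to do that in Python; `bulk_body` and its `id_field` parameter are hypothetical, and only two shortened documents are included.

```python
import json


def bulk_body(index, docs, id_field="blogId"):
    """Serialize docs into the newline-delimited body expected by POST _bulk."""
    lines = []
    for doc in docs:
        # Action line: names the index and _id for the source line that follows.
        lines.append(json.dumps({"index": {"_index": index, "_id": doc[id_field]}}))
        # Source line: json.dumps without indent never emits a raw newline.
        lines.append(json.dumps(doc, ensure_ascii=False))
    # The bulk body has to end with a newline.
    return "\n".join(lines) + "\n"


docs = [
    {"blogId": 1, "title": "Spring Data ElasticSearch学习教程1", "author": "Iron Man"},
    {"blogId": 6, "title": "Java学习教程6", "author": "Spider Man"},
]
body = bulk_body("blog", docs)
print(body, end="")
```

When sending such a body, the Content-Type header should be application/x-ndjson.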
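Result after execution
Simple queries
Correct usage
Example 1: minimum_should_match not specified
Requirement: find blogs whose title contains "java"; those whose author name contains "Iron" or "Captain" should rank higher.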
http://localhost:9200/
POST blog/_search
{
  "query": {
    "bool": {
      "must": {
        "match": {
          "title": "java"
        }
      },
      "should": [
        {
          "match": {
            "author": "Iron"
          }
        },
        {
          "match": {
            "author": "Captain"
          }
        }
      ]
    }
  }
}
Result
The full response:
{
"took": 10,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 5,
"relation": "eq"
},
"max_score": 1.5296494,
"hits": [{
"_index": "blog",
"_type": "_doc",
"_id": "9",
"_score": 1.5296494,
"_source": {
"blogId": 9,
"title": "Java学习教程9",
"content": "这是批量添加的文档9",
"author": "Captain America",
"category": "ElasticSearch",
"status": 1,
"serialNum": "9",
"createTime": "2021-10-10 11:52:09.249",
"updateTime": null
}
}, {
"_index": "blog",
"_type": "_doc",
"_id": "6",
"_score": 0.7361701,
"_source": {
"blogId": 6,
"title": "Java学习教程6",
"content": "这是批量添加的文档6",
"author": "Spider Man",
"category": "ElasticSearch",
"status": 1,
"serialNum": "6",
"createTime": "2021-10-10 11:52:06.249",
"updateTime": null
}
}, {
"_index": "blog",
"_type": "_doc",
"_id": "7",
"_score": 0.63270766,
"_source": {
"blogId": 7,
"title": "Java学习教程7",
"content": "这是批量添加的文档7",
"author": "Iron Man",
"category": "ElasticSearch",
"status": 1,
"serialNum": "7",
"createTime": "2021-10-10 11:52:07.249",
"updateTime": null
}
}, {
"_index": "blog",
"_type": "_doc",
"_id": "8",
"_score": 0.63270766,
"_source": {
"blogId": 8,
"title": "Java学习教程8",
"content": "这是批量添加的文档8",
"author": "Iron Man",
"category": "ElasticSearch",
"status": 1,
"serialNum": "8",
"createTime": "2021-10-10 11:52:08.249",
"updateTime": null
}
}, {
"_index": "blog",
"_type": "_doc",
"_id": "10",
"_score": 0.13353139,
"_source": {
"blogId": 10,
"title": "Java学习教程10",
"content": "这是批量添加的文档10",
"author": "God Of Thunder",
"category": "ElasticSearch",
"status": 1,
"serialNum": "10",
"createTime": "2021-10-10 11:52:10.249",
"updateTime": null
}
}
]
}
}
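Here minimum_should_match is effectively 0: documents 6 and 10 match neither should condition but are still returned. The should clause does not appear to rank its matches first, though (document 6 outranks documents 7 and 8). A likely factor is that relevance statistics are computed per shard by default, and this index spreads only 10 documents over 5 shards, so scores coming from different shards are not directly comparable. This point will be examined separately later.
Example 2: minimum_should_match specified
Requirement: find blogs whose title contains "java" and whose author name contains "Iron" or "Captain".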
http://localhost:9200/
POST blog/_search
{
  "query": {
    "bool": {
      "must": {
        "match": {
          "title": "java"
        }
      },
      "should": [
        {
          "match": {
            "author": "Iron"
          }
        },
        {
          "match": {
            "author": "Captain"
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}
Result
The full response:
{
"took": 5,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 3,
"relation": "eq"
},
"max_score": 1.5296494,
"hits": [{
"_index": "blog",
"_type": "_doc",
"_id": "9",
"_score": 1.5296494,
"_source": {
"blogId": 9,
"title": "Java学习教程9",
"content": "这是批量添加的文档9",
"author": "Captain America",
"category": "ElasticSearch",
"status": 1,
"serialNum": "9",
"createTime": "2021-10-10 11:52:09.249",
"updateTime": null
}
}, {
"_index": "blog",
"_type": "_doc",
"_id": "7",
"_score": 0.63270766,
"_source": {
"blogId": 7,
"title": "Java学习教程7",
"content": "这是批量添加的文档7",
"author": "Iron Man",
"category": "ElasticSearch",
"status": 1,
"serialNum": "7",
"createTime": "2021-10-10 11:52:07.249",
"updateTime": null
}
}, {
"_index": "blog",
"_type": "_doc",
"_id": "8",
"_score": 0.63270766,
"_source": {
"blogId": 8,
"title": "Java学习教程8",
"content": "这是批量添加的文档8",
"author": "Iron Man",
"category": "ElasticSearch",
"status": 1,
"serialNum": "8",
"createTime": "2021-10-10 11:52:08.249",
"updateTime": null
}
}
]
}
}
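Pitfalls
Watch the JSON structure: if it is wrong, the Elasticsearch server may not report an error and simply returns incorrect results, as the example below shows.
Requirement: find blogs whose title contains "java" and "学习", and whose author name contains "Iron" or "Captain".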
http://localhost:9200/
POST blog/_search
{
  "query": {
    "bool": {
      "must": {
        "match": {
          "title": "java"
        }
      },
      "must": {
        "match": {
          "title": "学习"
        }
      },
      "should": [
        {
          "match": {
            "author": "Iron"
          }
        },
        {
          "match": {
            "author": "Captain"
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}
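Does this query look right? If not, what will it return?
Result: it also returns documents whose titles do not contain "java".
Cause: the second must overrides the first. Two identical keys in one object are not proper JSON (RFC 8259 says the names within an object should be unique and calls the behavior for duplicates unpredictable), yet Elasticsearch reported no error here.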
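The same kind of silent override is easy to reproduce outside Elasticsearch. The sketch below uses Python's json module, which keeps the last value for a repeated key, and then shows one way to catch the problem before a query is sent. It demonstrates Python's parser only, not Elasticsearch's own; `reject_duplicates` is a hypothetical helper.

```python
import json

# A shortened version of the faulty query above: "must" appears twice.
raw = ('{"bool": {"must": {"match": {"title": "java"}},'
       ' "must": {"match": {"title": "学习"}}}}')

# The default parser silently keeps only the last "must".
parsed = json.loads(raw)
print(parsed["bool"]["must"])  # {'match': {'title': '学习'}}


def reject_duplicates(pairs):
    """object_pairs_hook that raises if an object repeats a key."""
    keys = [key for key, _ in pairs]
    dupes = sorted({key for key in keys if keys.count(key) > 1})
    if dupes:
        raise ValueError(f"duplicate keys: {dupes}")
    return dict(pairs)


try:
    json.loads(raw, object_pairs_hook=reject_duplicates)
except ValueError as err:
    print(err)  # duplicate keys: ['must']
```

Running hand-written query bodies through a check like this is a cheap way to notice a repeated clause name.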
Correct form:
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "title": "java"
          }
        },
        {
          "match": {
            "title": "学习"
          }
        }
      ],
      "should": [
        {
          "match": {
            "author": "Iron"
          }
        },
        {
          "match": {
            "author": "Captain"
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}
Result
The full response:
{
"took": 4,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 3,
"relation": "eq"
},
"max_score": 1.8181726,
"hits": [{
"_index": "blog",
"_type": "_doc",
"_id": "9",
"_score": 1.8181726,
"_source": {
"blogId": 9,
"title": "Java学习教程9",
"content": "这是批量添加的文档9",
"author": "Captain America",
"category": "ElasticSearch",
"status": 1,
"serialNum": "9",
"createTime": "2021-10-10 11:52:09.249",
"updateTime": null
}
}, {
"_index": "blog",
"_type": "_doc",
"_id": "7",
"_score": 0.89977044,
"_source": {
"blogId": 7,
"title": "Java学习教程7",
"content": "这是批量添加的文档7",
"author": "Iron Man",
"category": "ElasticSearch",
"status": 1,
"serialNum": "7",
"createTime": "2021-10-10 11:52:07.249",
"updateTime": null
}
}, {
"_index": "blog",
"_type": "_doc",
"_id": "8",
"_score": 0.89977044,
"_source": {
"blogId": 8,
"title": "Java学习教程8",
"content": "这是批量添加的文档8",
"author": "Iron Man",
"category": "ElasticSearch",
"status": 1,
"serialNum": "8",
"createTime": "2021-10-10 11:52:08.249",
"updateTime": null
}
}
]
}
}
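Nested bool query
Requirement: find blogs whose title contains "java" and "学习", whose author name contains "Iron" or "Captain", and whose serial number is one of 1, 2, 3, 4 or one of 5, 6, 7, 8.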
http://localhost:9200/
POST blog/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "title": "java"
          }
        },
        {
          "match": {
            "title": "学习"
          }
        },
        {
          "bool": {
            "should": [
              {
                "match": {
                  "author": "Iron"
                }
              },
              {
                "match": {
                  "author": "Captain"
                }
              }
            ]
          }
        },
        {
          "bool": {
            "should": [
              {
                "terms": {
                  "serialNum": [1, 2, 3, 4]
                }
              },
              {
                "terms": {
                  "serialNum": [5, 6, 7, 8]
                }
              }
            ]
          }
        }
      ]
    }
  }
}
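To reason about which sample documents a nested query like the one above selects, the logic can be replayed in memory. The sketch below is a simplified model and not Elasticsearch itself: `tokens` is a rough stand-in for an analyzer (an assumption), `matches` handles only match, terms, and bool, scoring is ignored, and the documents are reduced to the fields the query touches.

```python
import re


def tokens(text):
    # Assumed analyzer stand-in: lowercase ASCII runs become tokens and every
    # non-ASCII character (for example each Chinese character) is its own token.
    return set(re.findall(r"[a-z0-9]+|[^\x00-\x7f]", str(text).lower()))


def as_list(clause):
    return clause if isinstance(clause, list) else [clause]


def matches(query, doc):
    """Return True if doc satisfies the query (match, terms, or bool only)."""
    (kind, body), = query.items()
    if kind == "match":   # any shared token counts as a match
        (field, text), = body.items()
        return bool(tokens(text) & tokens(doc.get(field, "")))
    if kind == "terms":   # exact value must be one of the listed values
        (field, values), = body.items()
        return str(doc.get(field)) in {str(v) for v in values}
    if kind == "bool":    # recurse into nested clauses
        must = as_list(body.get("must", []))
        flt = as_list(body.get("filter", []))
        should = as_list(body.get("should", []))
        must_not = as_list(body.get("must_not", []))
        default_msm = 1 if should and not (must or flt) else 0
        msm = body.get("minimum_should_match", default_msm)
        hits = sum(matches(q, doc) for q in should)
        return (all(matches(q, doc) for q in must + flt)
                and not any(matches(q, doc) for q in must_not)
                and hits >= msm)
    raise ValueError(f"unsupported query type: {kind}")


authors = ["Iron Man", "Iron Man", "Captain America", "Captain America",
           "Spider Man", "Spider Man", "Iron Man", "Iron Man",
           "Captain America", "God Of Thunder"]
docs = [{"blogId": i,
         "title": ("Spring Data ElasticSearch学习教程%d" if i <= 5
                   else "Java学习教程%d") % i,
         "author": authors[i - 1],
         "serialNum": str(i)} for i in range(1, 11)]

query = {"bool": {"must": [
    {"match": {"title": "java"}},
    {"match": {"title": "学习"}},
    {"bool": {"should": [{"match": {"author": "Iron"}},
                         {"match": {"author": "Captain"}}]}},
    {"bool": {"should": [{"terms": {"serialNum": [1, 2, 3, 4]}},
                         {"terms": {"serialNum": [5, 6, 7, 8]}}]}},
]}}

result = [d["blogId"] for d in docs if matches(query, d)]
print(result)  # [7, 8]
```

Under this model only documents 7 and 8 pass all four must conditions: documents 6 and 10 fail the author condition, and document 9 fails the serial number condition. Each inner bool contains only should, so its minimum_should_match defaults to 1, which is what turns it into an OR group inside the outer AND.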
Result



