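- References
- Version notes
- Source code
- Part 1: Integrating Spring Boot with ES for basic operations
- Creating the Spring Boot project
- Importing dependencies
- Writing a configuration class for the high-level client
- Creating an entity class for testing
- Writing the configuration file
- Test class
- Part 2: A JD-style global search example
- Creating the project
- Adding the jar dependencies
- Wrapping a utility class that scrapes JD product data
- Wrapping a utility class for ES operations
- Writing the controller
- Writing the startup class
- Part 3: Testing
- Testing data insertion
- Querying data (without highlighting)
- Querying data (with highlighting)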
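References
This article is based on the videos and articles by Kuangshen (狂神说); please consider supporting him (Kuangshen's ElasticSearch article, Kuangshen's ElasticSearch video).
Version notes
This article uses the software and versions listed below (all download links except the one for ik point to the Huawei mirror in China).
| Software | Version |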
|---|---|
| elasticSearch | 7.6.1 |
| elasticSearch-head-master | 7.6.1 |
| kibana | 7.6.1 |
| elasticsearch-analysis-ik1 | 7.6.1 |
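| jdk | 8 (minimum) |
Source code
The source code for this article is hosted in a gitee repository (address).
Part 1: Integrating Spring Boot with ES for basic operations
Creating the Spring Boot project
The project creation process is omitted.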
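Importing dependencies
Note: the Elasticsearch client version has to match the server version, so override the managed dependency version. Use the jar version that matches your ES server and declare it in the properties as shown below.
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.5.5</version>
    </parent>
    <groupId>cn.es</groupId>
    <artifactId>elasticSearch_api</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>elasticSearch_api</name>
    <description>elasticSearch_api</description>
    <properties>
        <java.version>1.8</java.version>
        <!-- Override the ES version managed by Spring Boot so it matches the server -->
        <elasticsearch.version>7.6.1</elasticsearch.version>
    </properties>
    <dependencies>
        <dependency>
            <groupId>com.alibaba</groupId>
            <artifactId>fastjson</artifactId>
            <version>1.2.76</version>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-devtools</artifactId>
            <scope>runtime</scope>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-configuration-processor</artifactId>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
                <configuration>
                    <excludes>
                        <exclude>
                            <groupId>org.projectlombok</groupId>
                            <artifactId>lombok</artifactId>
                        </exclude>
                    </excludes>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>
Writing a configuration class for the high-level client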
package cn.es.config;
import org.apache.http.HttpHost;
import org.elasticsearch.client.Node;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestClientBuilder;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
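// Plays the same role as a Spring XML configuration file
@Configuration
public class ESConfig {
    // Register a high-level REST client as a bean
    @Bean
    public RestHighLevelClient restHighLevelClient() {
        RestHighLevelClient restHighLevelClient = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost", 9200, "http")));
        return restHighLevelClient;
    }
}
The high-level client is now registered in the bean container and can be injected wherever ES operations are needed.
Creating an entity class for testing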
package cn.es.pojo;
import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;
@Data
@AllArgsConstructor
@NoArgsConstructor
public class User {
    private int id;
    private String name;
    private int age;
}
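Writing the configuration file
My ES instance runs on the local machine without any password, so no configuration is needed here. If your setup needs it, configure the connection in application.yml (or application.properties).
Test class
At this point the directory structure is as follows:
Because the tests depend on the bean container, put them in Spring Boot's default test package so that bean injection works.
The test class is as follows: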
package cn.es;
import cn.es.pojo.User;
import com.alibaba.fastjson.JSON;
import org.elasticsearch.action.admin.indices.delete.DeleteIndexRequest;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.action.delete.DeleteRequest;
import org.elasticsearch.action.delete.DeleteResponse;
import org.elasticsearch.action.get.GetRequest;
import org.elasticsearch.action.get.GetResponse;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.action.support.master.AcknowledgedResponse;
import org.elasticsearch.action.update.UpdateRequest;
import org.elasticsearch.action.update.UpdateResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.client.indices.CreateIndexRequest;
import org.elasticsearch.client.indices.CreateIndexResponse;
import org.elasticsearch.client.indices.GetIndexRequest;
import org.elasticsearch.common.unit.TimeValue;
import org.elasticsearch.common.xcontent.XContentType;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.index.query.TermQueryBuilder;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.fetch.subphase.FetchSourceContext;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilder;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import java.io.IOException;
import java.util.ArrayList;
import java.util.concurrent.TimeUnit;
@SpringBootTest
class ElasticSearchApiApplicationTests {
    @Autowired
    RestHighLevelClient client;

    @Test
    void contextLoads() {
    }

    // 1. Create an index
    @Test
    void createIndex() throws IOException {
        // Build the request that creates the index (roughly, a database)
        CreateIndexRequest createIndexRequest = new CreateIndexRequest("blog_demo");
        // Execute the request and obtain the response
        CreateIndexResponse createIndexResponse = client.indices().create(createIndexRequest, RequestOptions.DEFAULT);
        System.out.println(createIndexResponse);
    }

    // 2. Check whether an index exists
    @Test
    void getIndex() throws IOException {
        // Build the request that looks up the index
        GetIndexRequest request = new GetIndexRequest("blog_demo");
        // Check whether the index exists
        boolean flag_exist = client.indices().exists(request, RequestOptions.DEFAULT);
        System.out.println(flag_exist);
    }

    // 3. Delete an index
    @Test
    void deleteIndex() throws IOException {
        // Build the request that deletes the index
        DeleteIndexRequest request = new DeleteIndexRequest("blog_demo");
        // Execute the request; isAcknowledged() reports whether the deletion was accepted
        AcknowledgedResponse response = client.indices().delete(request, RequestOptions.DEFAULT);
        System.out.println(response.isAcknowledged());
    }

    // 4. Insert a document (a row, in table terms): the first call inserts, a second call with the same id overwrites
    @Test
    void addDocument() throws IOException {
        User user = new User(1, "小明", 18);
        // Build the request
        IndexRequest indexRequest = new IndexRequest("blog_demo");
        // The id is optional because ES can generate one. If you set it yourself,
        // watch out for duplicates: a repeated id overwrites the existing document.
        indexRequest.id(String.valueOf(user.getId()));
        // Two equivalent ways of setting the timeout
        indexRequest.timeout(TimeValue.timeValueSeconds(1));
        indexRequest.timeout("1s");
        // Put the data into the request as JSON
        indexRequest.source(JSON.toJSONString(user), XContentType.JSON);
        // Send the request
        IndexResponse response = client.index(indexRequest, RequestOptions.DEFAULT);
        System.out.println(response.toString());
        System.out.println(response.status());
    }

    // Check whether a document exists
    @Test
    void existDocument() throws IOException {
        GetRequest getRequest = new GetRequest("blog_demo", "1");
        // Do not fetch the _source content
        getRequest.fetchSourceContext(new FetchSourceContext(false));
        getRequest.storedFields("_none_");
        boolean flag = client.existsSource(getRequest, RequestOptions.DEFAULT);
        System.out.println(flag);
    }

    // Fetch the content of the document with the given id
    @Test
    void getDocument() throws IOException {
        GetRequest getRequest = new GetRequest("blog_demo", "1");
        GetResponse getResponse = client.get(getRequest, RequestOptions.DEFAULT);
        System.out.println(getResponse.toString());
        System.out.println(getResponse.getSourceAsString()); // print the document content
    }

    // Update a document: null properties keep their stored value, non-null properties are updated
    @Test
    void updateDocument() throws IOException {
        User user = new User(1, null, 81);
        UpdateRequest request = new UpdateRequest("blog_demo", "1");
        request.timeout("1s");
        // Put the data into the request as JSON
        request.doc(JSON.toJSONString(user), XContentType.JSON);
        UpdateResponse updateResponse = client.update(request, RequestOptions.DEFAULT);
        System.out.println(updateResponse.toString());
    }

    // Delete the document with the given id
    @Test
    void deleteDocument() throws IOException {
        DeleteRequest request = new DeleteRequest("blog_demo", "1");
        request.timeout("1s");
        DeleteResponse deleteResponse = client.delete(request, RequestOptions.DEFAULT);
        System.out.println(deleteResponse.toString());
    }

    // Bulk insert, as used in real projects for large batches of data
    @Test
    void insertBulkRequest() throws IOException {
        BulkRequest request = new BulkRequest();
        request.timeout("10s");
        ArrayList<User> arrayList = new ArrayList<>();
        arrayList.add(new User(10, "zs", 18));
        arrayList.add(new User(11, "lisi", 21));
        arrayList.add(new User(12, "ww", 16));
        arrayList.add(new User(13, "zl", 15));
        arrayList.add(new User(14, "hh", 16));
        arrayList.add(new User(15, "ll", 14));
        for (User user : arrayList) {
            request.add(
                    new IndexRequest("blog_demo")
                            .id(String.valueOf(user.getId()))
                            .source(JSON.toJSONString(user), XContentType.JSON));
        }
        BulkResponse response = client.bulk(request, RequestOptions.DEFAULT);
        // false means every item in the batch succeeded
        System.out.println(response.hasFailures());
    }

    // Search
    @Test
    void searchRequest() throws IOException {
        SearchRequest searchRequest = new SearchRequest("blog_demo");
        // Build the search conditions
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
        // Query condition: exact term match
        TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("name", "zs");
        searchSourceBuilder.query(termQueryBuilder);
        searchSourceBuilder.timeout(new TimeValue(60, TimeUnit.SECONDS));
        searchRequest.source(searchSourceBuilder);
        // Execute the query and read the results
        SearchResponse response = client.search(searchRequest, RequestOptions.DEFAULT);
        System.out.println(JSON.toJSONString(response.getHits()));
        System.out.println("-----------------------");
        for (SearchHit hit : response.getHits().getHits()) {
            System.out.println(hit.getSourceAsMap());
        }
    }
}
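The update test above relies on partial-document merge behaviour: fields present in the partial document overwrite the stored values, and fields that are absent keep theirs. fastjson leaves null properties out of the JSON by default, which is why `name` survives, while the primitive `int` fields are always sent. The following is only a sketch of that behaviour using plain maps. It does not call Elasticsearch, and `PartialUpdateSketch`, `dropNulls` and `merge` are hypothetical names introduced for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: mimics how a partial update merges into a stored document.
// No Elasticsearch call is made; all names here are hypothetical.
class PartialUpdateSketch {

    // Drop null-valued entries, as fastjson does by default when serializing.
    static Map<String, Object> dropNulls(Map<String, Object> fields) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : fields.entrySet()) {
            if (entry.getValue() != null) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }

    // Fields present in the partial document overwrite stored ones; the rest are kept.
    static Map<String, Object> merge(Map<String, Object> stored, Map<String, Object> partial) {
        Map<String, Object> merged = new LinkedHashMap<>(stored);
        merged.putAll(partial);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> stored = new LinkedHashMap<>();
        stored.put("id", 1);
        stored.put("name", "小明");
        stored.put("age", 18);

        Map<String, Object> update = new LinkedHashMap<>();
        update.put("id", 1);
        update.put("name", null); // left out of the JSON, so the stored name is kept
        update.put("age", 81);

        System.out.println(merge(stored, dropNulls(update)));
    }
}
```

One consequence worth keeping in mind: because `id` and `age` are primitives, they can never be "left unchanged" by passing a partially filled `User`; wrapper types such as `Integer` would be needed for that.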
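Part 2: A JD-style global search example
I originally wanted to build a JD-style global search page, but time was short and I could not find a suitable template, so the endpoints are tested directly with Postman. Please excuse the missing front-end page.
Creating the project
The project creation process is omitted.
Adding the jar dependencies
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.5.5</version>
    </parent>
    <groupId>cn.es</groupId>
    <artifactId>elasticSearch_jd</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>elasticSearch_jd</name>
    <description>elasticSearch_jd</description>
    <properties>
        <java.version>1.8</java.version>
        <!-- Override the ES version managed by Spring Boot so it matches the server -->
        <elasticsearch.version>7.6.1</elasticsearch.version>
    </properties>
    <dependencies>
        <dependency>
            <groupId>org.jsoup</groupId>
            <artifactId>jsoup</artifactId>
            <version>1.10.2</version>
        </dependency>
        <dependency>
            <groupId>com.alibaba</groupId>
            <artifactId>fastjson</artifactId>
            <version>1.2.76</version>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-devtools</artifactId>
            <scope>runtime</scope>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-configuration-processor</artifactId>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-thymeleaf</artifactId>
        </dependency>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.13.2</version>
            <scope>compile</scope>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
                <configuration>
                    <excludes>
                        <exclude>
                            <groupId>org.projectlombok</groupId>
                            <artifactId>lombok</artifactId>
                        </exclude>
                    </excludes>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>
jsoup is used to scrape product data from JD, so the jsoup jar is required.
Wrapping a utility class that scrapes JD product data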
package cn.es.util;
import cn.es.pojo.Content;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
public class HtmlParseUtil {
    public static void main(String[] args) throws IOException {
        System.out.println(HtmlParseUtil.getList("华为"));
        // String keyword="";
    }

    public static List<Content> getList(String keyword) throws IOException {
        // Build the request URL
        String url = "https://search.jd.com/Search?keyword=" + keyword;
        // Fetch and parse the page (30-second timeout)
        Document document = Jsoup.parse(new URL(url), 30000);
        Element element = document.getElementById("J_goodsList");
        // System.out.println(element.html());
        Elements elements = document.getElementsByTag("li");
        List<Content> contents = new ArrayList<>();
        int i = 0;
        for (Element el : elements) {
            // Only <li class="gl-item"> elements are product entries
            if (el.attr("class").equalsIgnoreCase("gl-item")) {
                String img = el.getElementsByTag("img").eq(0).attr("data-lazy-img");
                String price = el.getElementsByClass("p-price").eq(0).text();
                String title = el.getElementsByClass("p-name").eq(0).text();
                // System.out.println("============================");
                // System.out.println(img);
                // System.out.println(price);
                // System.out.println(title);
                contents.add(new Content(++i, title, img, price));
            }
        }
        // System.out.println(contents);
        return contents;
    }
}
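The utility class builds `Content` objects, but `cn.es.pojo.Content` itself is not listed in this article. The constructor call `new Content(++i, title, img, price)` fixes the argument order; the field names below are an assumption taken from the local variable names. This is a plain-Java stand-in sketched without Lombok so that it compiles on its own. In the project it would presumably live in `cn.es.pojo` and could use the same `@Data`, `@AllArgsConstructor` and `@NoArgsConstructor` annotations as `User`.

```java
// Hypothetical stand-in for cn.es.pojo.Content; field names are assumed.
class Content {
    private int id;
    private String title;
    private String img;
    private String price;

    public Content() {
    }

    public Content(int id, String title, String img, String price) {
        this.id = id;
        this.title = title;
        this.img = img;
        this.price = price;
    }

    // Getters are what fastjson reads when serializing the object to JSON.
    public int getId() { return id; }
    public String getTitle() { return title; }
    public String getImg() { return img; }
    public String getPrice() { return price; }

    public void setId(int id) { this.id = id; }
    public void setTitle(String title) { this.title = title; }
    public void setImg(String img) { this.img = img; }
    public void setPrice(String price) { this.price = price; }

    @Override
    public String toString() {
        return "Content(id=" + id + ", title=" + title + ", img=" + img + ", price=" + price + ")";
    }
}
```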
To scrape a different page, change the page URL and the tag and class selectors to match that page's HTML structure.
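One detail to watch when adapting the URL: `getList` concatenates the keyword into the query string as-is, and keywords containing Chinese characters or spaces may not be handled consistently unless they are percent-encoded. A minimal sketch of encoding the keyword first follows; `SearchUrlSketch` and `buildSearchUrl` are hypothetical names, and only JDK classes are used so the snippet stays compatible with Java 8.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: builds the JD search URL with a percent-encoded keyword.
class SearchUrlSketch {
    private static final String BASE = "https://search.jd.com/Search?keyword=";

    static String buildSearchUrl(String keyword) {
        try {
            // URLEncoder performs form encoding: non-ASCII bytes become %XX, spaces become '+'
            return BASE + URLEncoder.encode(keyword, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every JVM is required to support UTF-8, so this should not happen
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildSearchUrl("华为"));
    }
}
```

The result of `buildSearchUrl(keyword)` could then be passed to `new URL(...)` in place of the concatenated string.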
Wrapping a utility class for ES operations
package cn.es.util;
import cn.es.pojo.Content;
import com.alibaba.fastjson.JSON;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.client.indices.CreateIndexRequest;
import org.elasticsearch.client.indices.CreateIndexResponse;
import org.elasticsearch.client.indices.GetIndexRequest;
import org.elasticsearch.common.text.Text;
import org.elasticsearch.common.unit.TimeValue;
import org.elasticsearch.common.xcontent.XContentType;
import org.elasticsearch.index.query.MatchQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.index.query.TermQueryBuilder;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightField;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.RequestParam;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.TimeUnit;
public class EsUtil {
    public static boolean createIndex(RestHighLevelClient client, String index) throws IOException {
        // 1. Build the request that looks up the index (roughly, a database)
        GetIndexRequest request = new GetIndexRequest(index);
        // 2. Check whether the index exists
        boolean flag_exist = client.indices().exists(request, RequestOptions.DEFAULT);
        if (flag_exist) {
            // The index already exists, so return true
            return true;
        }
        // The index does not exist yet, so create it
        // 3. Build the request that creates the index
        CreateIndexRequest createIndexRequest = new CreateIndexRequest(index);
        // 4. Execute the request and obtain the response
        CreateIndexResponse createIndexResponse = client.indices().create(createIndexRequest, RequestOptions.DEFAULT);
        return createIndexResponse.isAcknowledged();
    }

    public static boolean insertListEs(RestHighLevelClient client, String key, String index) throws IOException {
        // 1. Build the bulk insert request
        BulkRequest request = new BulkRequest();
        // 2. Set the timeout
        request.timeout("10s");
        // 3. Scrape the data from the JD page and wrap it in a list of entities
        List<Content> contents = HtmlParseUtil.getList(key);
        // 4. Make sure the index exists: createIndex creates it if missing and reuses it if present
        if (createIndex(client, index)) {
            // 5. The index exists at this point, so insert the data into it
            for (Content content : contents) {
                request.add(
                        new IndexRequest(index)
                                // .id(String.valueOf(content.getId()))
                                // A repeated id would overwrite earlier data. The data does not come from
                                // a database, so there is no unique primary key and no id is set for now.
                                .source(JSON.toJSONString(content), XContentType.JSON));
            }
            BulkResponse response = client.bulk(request, RequestOptions.DEFAULT);
            return !response.hasFailures();
        }
        return false;
    }
public static List
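The listing breaks off at this point, so the `searchEs` and `searchEsHighlight` methods that the controller calls are not shown. The imports (`HighlightBuilder`, `HighlightField`, `Text`) suggest that the highlighted variant reads the highlight fragments of each hit and writes them back over the original field. As a rough illustration of that last step only, here is a sketch in plain Java with no ES types. `HighlightMergeSketch` and `applyHighlight` are hypothetical names, and in the real client the fragments would come from `SearchHit.getHighlightFields()` rather than from a list of strings.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: replaces a field of a hit's source map with its highlighted text.
class HighlightMergeSketch {

    // Returns a copy of the source map in which `field` holds the joined fragments.
    // When there are no fragments for the field, the copy is left unchanged.
    static Map<String, Object> applyHighlight(Map<String, Object> source, String field, List<String> fragments) {
        Map<String, Object> result = new LinkedHashMap<>(source);
        if (fragments != null && !fragments.isEmpty()) {
            result.put(field, String.join("", fragments));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> source = new LinkedHashMap<>();
        source.put("title", "huawei phone");
        source.put("price", "4999.00");

        // The tag style is an example; the pre and post tags are whatever the query configured
        List<String> fragments = Arrays.asList("<span style='color:red'>huawei</span>", " phone");
        System.out.println(applyHighlight(source, "title", fragments));
    }
}
```

Copying the map instead of mutating it keeps the original source available, which can be handy when both the plain and highlighted titles are needed.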
Writing the controller
package cn.es.controller;
import cn.es.util.EsUtil;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.*;
import java.io.IOException;
import java.util.List;
import java.util.Map;
@RestController
public class EsController {
    @Autowired
    RestHighLevelClient client;
    String index = "jd_index";

    // Scrape JD for the keyword and bulk-insert the results into the index
    @GetMapping("/insertEs/{key}")
    public Boolean insertEs(@PathVariable String key) {
        Boolean flag = false;
        try {
            flag = EsUtil.insertListEs(client, key, index);
        } catch (IOException e) {
            e.printStackTrace();
        }
        return flag;
    }

    // Paged search without highlighting
    @GetMapping("/searchEs/{key}/{pageNum}/{pageSize}")
    public List<Map<String, Object>> searchEs(@PathVariable String key, @PathVariable int pageNum, @PathVariable int pageSize) throws IOException {
        List<Map<String, Object>> list = EsUtil.searchEs(client, index, key, pageNum, pageSize);
        // Return the results
        return list;
    }

    // Paged search with highlighting
    @GetMapping("/searchEsHighlight/{key}/{pageNum}/{pageSize}")
    public List<Map<String, Object>> searchEsHighlight(@PathVariable String key, @PathVariable int pageNum, @PathVariable int pageSize) throws IOException {
        List<Map<String, Object>> list = EsUtil.searchEsHighlight(client, index, key, pageNum, pageSize);
        // Return the results
        return list;
    }
}
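Both search endpoints take `pageNum` and `pageSize`, whereas `SearchSourceBuilder` pages with `from` (an offset) and `size`. How the original `EsUtil` converts between the two is not visible in this article, so the following is only a sketch, assuming 1-based page numbers. `PageOffsetSketch` and `offset` are hypothetical names.

```java
// Illustrative only: converts a 1-based page number into a search offset.
class PageOffsetSketch {

    // Offset of the first hit on the given page; page 1 starts at offset 0.
    static int offset(int pageNum, int pageSize) {
        if (pageNum < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNum and pageSize must be at least 1");
        }
        return (pageNum - 1) * pageSize;
    }

    public static void main(String[] args) {
        // With 10 hits per page, page 3 starts at offset 20
        System.out.println(offset(3, 10));
    }
}
```

Under that assumption, the search method would pass `offset(pageNum, pageSize)` to `from(...)` and `pageSize` to `size(...)`. If the API is meant to be 0-based instead, the subtraction is simply dropped.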
Writing the startup class
Note the annotation that scans for servlet components.
package cn.es;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.boot.web.servlet.ServletComponentScan;
@SpringBootApplication
@ServletComponentScan
public class ElasticSearchJdApplication {
    public static void main(String[] args) {
        SpringApplication.run(ElasticSearchJdApplication.class, args);
    }
}
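Part 3: Testing
Start the project.
Testing data insertion
Scrape the data for a given keyword from the JD page into the index, as follows:
Send the request:
Inspect the data:
Everything works.
Querying data (without highlighting)
Send the request as follows:
Querying data (with highlighting)
Send the request:
There is much more to explore in the query logic, the highlighting rules, and the highlight styling; these can be extended separately.
End of article.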



