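Installing Elasticsearch 8 with Docker
I am using Elasticsearch 8.3.2.
The host is Windows, running Docker Desktop on WSL2, version 24.0.6.
With HTTPS enabled
docker-compose.yml
The configuration below does not specify an ES config file; ES generates a default one on startup.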
version: "3"
services:
  es:
    image: elasticsearch:8.3.2
    container_name: es
    networks:
      - elastic
    ports:
      - "9200:9200"
      - "9300:9300"
    volumes:
      - ./volumes/data:/usr/share/elasticsearch/data
      - ./volumes/plugins:/usr/share/elasticsearch/plugins
    environment:
      discovery.type: single-node
  kibana:
    image: kibana:8.3.2
    container_name: kibana
    networks:
      - elastic
    ports:
      - "5601:5601"
    depends_on:
      - es
networks:
  elastic:
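At this point, opening http://localhost:9200 directly in a browser does not show the Elasticsearch status information, because ES now only accepts authenticated HTTPS connections.
When ES 8 is installed with Docker in single-node mode, security is configured and enabled by default:
- Certificates and keys are generated
- The TLS settings are written to elasticsearch.yml
- A password is generated for the elastic user
- An enrollment token is generated for Kibana
Accessing ES therefore requires the CA certificate plus a username and password.
Copy the CA certificate out of the container: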
docker cp es:/usr/share/elasticsearch/config/certs/http_ca.crt .
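According to the official documentation, the user password and the enrollment token are printed to the console only the first time ES starts, but in my case they were not. The startup log explains why: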
{"@timestamp":"2023-10-18T07:21:27.922Z", "log.level": "INFO", "message":"Auto-configuration will not generate a password for the elastic built-in superuser, as we cannot determine if there is a terminal attached to the elasticsearch process. You can use the `bin/elasticsearch-reset-password` tool to set the password for the elastic user.", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.xpack.security.InitialNodeSecurityAutoConfiguration","elasticsearch.node.name":"e537cedd5778","elasticsearch.cluster.name":"docker-cluster"}
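The password and the token can be generated with the scripts that ship with ES.
# The -i flag lets you enter a password of your choice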
docker exec -it es /usr/share/elasticsearch/bin/elasticsearch-reset-password -u elastic
docker exec -it es /usr/share/elasticsearch/bin/elasticsearch-create-enrollment-token --scope kibana
curl --cacert http_ca.crt -u elastic:uLBV*RmHZRrt7_*zw7Py https://localhost:9200
{
  "name" : "e537cedd5778",
  "cluster_name" : "docker-cluster",
  "cluster_uuid" : "VS3WhhdUQtGOnH4mwS8U4w",
  "version" : {
    "number" : "8.3.2",
    "build_type" : "docker",
    "build_hash" : "8b0b1f23fbebecc3c88e4464319dea8989f374fd",
    "build_date" : "2022-07-06T15:15:15.901688194Z",
    "build_snapshot" : false,
    "lucene_version" : "9.2.0",
    "minimum_wire_compatibility_version" : "7.17.0",
    "minimum_index_compatibility_version" : "7.0.0"
  },
  "tagline" : "You Know, for Search"
}
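The same request can be made from Python's standard library. The following is a minimal sketch rather than part of the original setup: the function names `basic_auth_header` and `fetch_cluster_info` are hypothetical, and it assumes `http_ca.crt` was copied out of the container as shown earlier and that the password is the one printed by `elasticsearch-reset-password`.

```python
import base64
import json
import ssl
import urllib.request


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of the HTTP Authorization header for Basic auth."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def fetch_cluster_info(url: str, ca_path: str, username: str, password: str) -> dict:
    """GET the ES root endpoint, verifying the server against the given CA file."""
    # Counterpart of curl's --cacert: trust the CA copied from the container.
    context = ssl.create_default_context(cafile=ca_path)
    request = urllib.request.Request(
        url, headers={"Authorization": basic_auth_header(username, password)}
    )
    with urllib.request.urlopen(request, context=context, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (needs the running container; the password is a placeholder):
# info = fetch_cluster_info("https://localhost:9200", "http_ca.crt",
#                           "elastic", "<your-password>")
# print(info["version"]["number"])
```

Passing the CA file explicitly keeps certificate verification on, which is preferable to disabling verification just to get past the self-signed certificate.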
View the auto-generated ES config file:
docker cp es:/usr/share/elasticsearch/config/elasticsearch.yml .
cluster.name: "docker-cluster"
network.host: 0.0.0.0
#----------------------- BEGIN SECURITY AUTO CONFIGURATION -----------------------
#
# The following settings, TLS certificates, and keys have been automatically
# generated to configure Elasticsearch security features on 18-10-2023 07:21:12
#
# --------------------------------------------------------------------------------
# Enable security features
xpack.security.enabled: true
xpack.security.enrollment.enabled: true
# Enable encryption for HTTP API client connections, such as Kibana, Logstash, and Agents
xpack.security.http.ssl:
  enabled: true
  keystore.path: certs/http.p12
# Enable encryption and mutual authentication between cluster nodes
xpack.security.transport.ssl:
  enabled: true
  verification_mode: certificate
  keystore.path: certs/transport.p12
  truststore.path: certs/transport.p12
#----------------------- END SECURITY AUTO CONFIGURATION -------------------------
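Accessing ES over HTTPS from Postman
- Add the CA certificate just copied from ES to Postman: under Settings -> Certificates, configure the host, port, and CA certificate.
- Add Basic Auth to the HTTP requests: on each request's Authorization tab, choose Basic Auth and enter the username and password.
Without HTTPS
Only the config file needs to change: mount the following elasticsearch.yml in docker-compose.yml.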
#cluster.name: "docker-cluster"
#network.host: 0.0.0.0
cluster.name: "docker-cluster"
network.host: 0.0.0.0
#----------------------- BEGIN SECURITY AUTO CONFIGURATION -----------------------
#
# The following settings, TLS certificates, and keys have been automatically
# generated to configure Elasticsearch security features on 18-10-2023 07:21:12
#
# --------------------------------------------------------------------------------
# Enable security features
xpack.security.enabled: true
xpack.security.enrollment.enabled: true
# CORS settings
http.cors.enabled: true
http.cors.allow-origin: "*"
http.cors.allow-headers: X-Requested-With,Content-Type,Content-Length,Authorization
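Key concepts
- index: a collection of documents, comparable to a table in a relational database. It has two parts: the schema (mapping) and the index configuration (settings).
- mapping: the schema, that is, the data type and related options of each field.
- doc: a document. Each document corresponds to a row in a relational database, and each field of a document corresponds to a column.
- inverted index: documents are first tokenized and each term records the documents it appears in; a query then looks up terms to locate documents.
- analyzer: splits text into terms. English can be split directly on spaces, while Chinese is split into single characters by default, so proper Chinese word segmentation requires a plugin. An analyzer in ES is composed of three parts:
  - character filters: process the text before tokenization, for example stripping HTML tags or replacing characters.
  - tokenizer: splits the text into terms.
  - token filters: further process the tokens, for example lowercasing, removing stop words, or replacing synonyms.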
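To make the analyzer stages and the inverted index concrete, here is a toy sketch in plain Python. It is an illustration under simplifying assumptions, not how ES or Lucene implement analysis: the stop-word list, the regex-based tokenizer, and every function name are made up for this example.

```python
import re
from collections import defaultdict

STOP_WORDS = {"a", "an", "the", "is", "of"}  # tiny illustrative stop list


def char_filter(text: str) -> str:
    """Character filter: runs before tokenization, here replacing HTML-like tags."""
    return re.sub(r"<[^>]+>", " ", text)


def tokenizer(text: str) -> list[str]:
    """Tokenizer: split the filtered text into raw tokens on non-word characters."""
    return [token for token in re.split(r"\W+", text) if token]


def token_filters(tokens: list[str]) -> list[str]:
    """Token filters: lowercase every token, then drop stop words."""
    lowered = [token.lower() for token in tokens]
    return [token for token in lowered if token not in STOP_WORDS]


def analyze(text: str) -> list[str]:
    """Run the three stages in the order an analyzer applies them."""
    return token_filters(tokenizer(char_filter(text)))


def build_inverted_index(docs: dict[int, str]) -> dict[str, set[int]]:
    """Map each term to the set of document ids that contain it."""
    index: dict[str, set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index


docs = {
    1: "<p>The Quick brown fox</p>",
    2: "A quick search engine",
}
index = build_inverted_index(docs)
print(analyze(docs[1]))
print(sorted(index["quick"]))
```

A query for "quick" only needs a single dictionary lookup to find both documents, which is the point of inverting the document-to-term relationship.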
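Search features
- collapse (field collapsing): groups results by a given field and returns hits for every group. For example, when searching for phones you may want to see every brand; collapsing on the brand field returns sortable, filterable results per brand.
- filter: filtering, which serves a different purpose from query: a filter only decides whether a document matches and does not contribute to the relevance score.
- highlight: wraps the matched search keywords in the returned fields with special tags.
- async search: runs searches over large volumes of data asynchronously and lets you check the status of the running search.
- near real-time search: adding or updating a document does not modify existing index files; new files are written to a cache first and flushed to disk later.
- pagination: regular paging, plus scroll and search_after for deep paging.
- inner hits: when searching nested objects, returns the specific nested documents that matched the query.
- selected fields: use _source and fields to return only the document fields you need.
- search across clusters: distributed search, supported by several of the search APIs.
- multiple indices: a single request can search several indices at once.
- shard routing: adaptive shard routing reduces search response time, and you can customize which nodes a search runs on.
- search templates: reusable search templates that generate different Query DSL from different variables.
- search with synonyms: define synonym sets, filters, and analyzers to improve search accuracy.
- sort results: supports sorting on multiple fields, array fields, and nested fields.
- kNN search: finds the nearest neighbor vectors; commonly used for relevance ranking, search suggestions, and image or video retrieval.
- semantic search: searches by meaning and intent rather than by lexical match. It builds on NLP and vector search, supports uploading models that encode text automatically at index and search time, and supports hybrid search.
See the official documentation for the full list of search features.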
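Several of these features are just top-level keys of one search request body. The sketch below builds such a body as a plain dictionary. It assumes a `hotel` index with a text field `name`, a keyword field `brand`, and numeric fields `price` and `score`; only `price` and `score` appear elsewhere in this article, the other field names and the helper `build_search_body` are hypothetical.

```python
from typing import Any, Optional


def build_search_body(
    keyword: str,
    min_price: int,
    collapse_field: Optional[str] = "brand",
    page_size: int = 10,
) -> dict[str, Any]:
    """Combine query, filter, highlight, sort, _source and collapse in one body."""
    body: dict[str, Any] = {
        "query": {
            "bool": {
                # Scored full-text part of the search.
                "must": [{"match": {"name": keyword}}],
                # Unscored yes/no condition.
                "filter": [{"range": {"price": {"gte": min_price}}}],
            }
        },
        # Wrap matches in the name field with highlight tags.
        "highlight": {"fields": {"name": {}}},
        "sort": [{"score": "desc"}],
        # Return only the fields that are needed.
        "_source": ["name", "brand", "price", "score"],
        "size": page_size,
    }
    if collapse_field is not None:
        # Keep only the top hit per distinct value of the collapse field.
        body["collapse"] = {"field": collapse_field}
    return body


body = build_search_body("phone", 300)
print(sorted(body))
```

The resulting dictionary can be sent the same way the Python Client section of this article sends its query, as the `body` of a `perform_request` call against `/hotel/_search`.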
Python Client
from pprint import pprint

from elasticsearch import Elasticsearch

es_password = 'h-3yzvInloC6Dl+==7UX'
client = Elasticsearch(hosts='http://localhost:9200',
                       # ca_certs=os.path.join(os.path.dirname(__file__), 'http_ca.crt'),
                       basic_auth=('elastic', es_password))
pprint(client.info().body)
response = client.perform_request('POST', '/kyjy-test/_search',
                                  headers={
                                      'Content-Type': "application/vnd.elasticsearch+json;compatible-with=8",
                                      "Accept": "application/vnd.elasticsearch+json;compatible-with=8"},
                                  body={'query': {'match': {'content': '人员管理'}},
                                        '_source': {'excludes': 'vector'}})
pprint(response.body)
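The snippet above targets the setup without HTTPS; the commented-out `ca_certs` line is what the HTTPS setup needs. One way to keep both variants in a single place is sketched below. The helpers `client_kwargs` and `make_client` are hypothetical names, and the sketch assumes the `elastic` user and port 9200 used throughout this article.

```python
from typing import Any, Optional


def client_kwargs(password: str, ca_path: Optional[str] = None) -> dict[str, Any]:
    """Return Elasticsearch(...) keyword arguments for the HTTP or HTTPS setup."""
    scheme = "https" if ca_path else "http"
    kwargs: dict[str, Any] = {
        "hosts": f"{scheme}://localhost:9200",
        "basic_auth": ("elastic", password),
    }
    if ca_path:
        # Only the HTTPS setup needs the CA certificate copied from the container.
        kwargs["ca_certs"] = ca_path
    return kwargs


def make_client(password: str, ca_path: Optional[str] = None):
    """Create the client; the import is local so client_kwargs has no dependency."""
    from elasticsearch import Elasticsearch

    return Elasticsearch(**client_kwargs(password, ca_path))


print(client_kwargs("<your-password>", "http_ca.crt")["hosts"])
```

Calling `make_client(password)` gives the plain HTTP client from the snippet above, while `make_client(password, "http_ca.crt")` switches to HTTPS with certificate verification.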
Java Client
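I access ES from a Spring Boot 2.7.16 project. The matching Spring Data Elasticsearch release targets ES 7.17 and uses the High Level REST Client API, which is no longer maintained in 8.x. The official recommendation is the Java API Client, so add the version of that package that matches your ES version: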
<dependency>
<groupId>co.elastic.clients</groupId>
<artifactId>elasticsearch-java</artifactId>
<version>8.3.2</version>
</dependency>
<!-- https://mvnrepository.com/artifact/jakarta.json/jakarta.json-api -->
<dependency>
<groupId>jakarta.json</groupId>
<artifactId>jakarta.json-api</artifactId>
<version>2.0.1</version>
</dependency>
Elasticsearch configuration class
package com.windcf.eslearn.config;

import co.elastic.clients.elasticsearch.ElasticsearchClient;
import co.elastic.clients.json.jackson.JacksonJsonpMapper;
import co.elastic.clients.transport.ElasticsearchTransport;
import co.elastic.clients.transport.rest_client.RestClientTransport;
import org.elasticsearch.client.RestClient;
import org.springframework.boot.autoconfigure.elasticsearch.ElasticsearchProperties;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.elasticsearch.client.ClientConfiguration;
import org.springframework.data.elasticsearch.client.elc.AutoCloseableElasticsearchClient;
import org.springframework.data.elasticsearch.client.elc.ElasticsearchConfiguration;
import org.springframework.lang.NonNull;

import java.net.URI;

/**
 * @author chunf
 */
@Configuration
public class MyElasticsearchConfiguration extends ElasticsearchConfiguration {
    private final ElasticsearchProperties elasticsearchProperties;

    MyElasticsearchConfiguration(ElasticsearchProperties elasticsearchProperties) {
        this.elasticsearchProperties = elasticsearchProperties;
    }

    @Override
    @NonNull
    @Bean
    public ClientConfiguration clientConfiguration() {
        // Convert the configured URIs into the host:port form expected by connectedTo
        String[] hostAndPort = elasticsearchProperties.getUris()
                .stream()
                .map(s -> {
                    URI uri = URI.create(s);
                    return uri.getHost() + ":" + uri.getPort();
                }).toArray(String[]::new);
        return ClientConfiguration
                .builder()
                .connectedTo(hostAndPort)
                .withBasicAuth(elasticsearchProperties.getUsername(), elasticsearchProperties.getPassword())
                .withPathPrefix(elasticsearchProperties.getPathPrefix())
                .withConnectTimeout(elasticsearchProperties.getConnectionTimeout())
                .withSocketTimeout(elasticsearchProperties.getSocketTimeout())
                .build();
    }

    @Override
    @Bean
    @NonNull
    public ElasticsearchClient elasticsearchClient(@NonNull RestClient restClient) {
        ElasticsearchTransport transport = new RestClientTransport(restClient, new JacksonJsonpMapper());
        return new AutoCloseableElasticsearchClient(transport);
    }

    /**
     * not recommended
     */
    @Override
    protected boolean writeTypeHints() {
        return false;
    }
}
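Pitfalls
As of November 4, 2023, the integration of Spring Data Elasticsearch with the Java API Client under Spring Boot 2.7 still has some incompatibilities, and I ran into a few bugs.
- Sorting is not supported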
@SpringBootTest
@ActiveProfiles("dev")
class EsLearnApplicationTests {
    @Autowired
    private ElasticsearchClient elasticsearchClient;
    @Autowired
    private ElasticsearchOperations elasticsearchOperations;

    @Test
    void sortQueryByEsOperations() {
        Sort sort = Sort.by(Sort.Direction.DESC, "score");
        NativeQuery nativeQuery = new NativeQueryBuilder().withSort(sort)
                .withQuery(QueryBuilders.range(b -> b.field("price").gte(JsonData.of(300))))
                .build();
        SearchHits<HotelDoc> searchHits = elasticsearchOperations.search(nativeQuery, HotelDoc.class);
        for (SearchHit<HotelDoc> searchHit : searchHits) {
            System.out.println(searchHit.getContent());
        }
    }

    @Test
    void sortQueryByEsClient() throws IOException {
        Query rangeQuery = QueryBuilders.range().field("price").gte(JsonData.of(300)).build()._toQuery();
        SortOptions sortOptions = SortOptionsBuilders.field(builder -> builder.field("score").order(SortOrder.Desc).mode(SortMode.Min));
        SearchRequest searchRequest = SearchRequest.of(builder -> builder.index("hotel").query(rangeQuery).sort(sortOptions));
        System.out.println(searchRequest.toString());
        SearchResponse<HotelDoc> searchResponse = elasticsearchClient.search(searchRequest, HotelDoc.class);
        for (Hit<HotelDoc> hit : searchResponse.hits().hits()) {
            System.out.println(hit.source());
        }
    }
}
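With Java API Client 8.3.2, the first test (sortQueryByEsOperations) fails while the second (sortQueryByEsClient) passes, so the gap is on the Spring Data Elasticsearch side, which is understandable given that the ES version it officially targets is still 7.17. Upgrading the Java API Client package to 8.5 or later resolves the problem.
- Suggestions are not supported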
When Spring Data Elasticsearch parses the response returned by the ES Java API Client, it simply skips the suggestion section.
I recommend using ElasticsearchClient directly instead of ElasticsearchOperations, since Spring Data targets ES 7.17.
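To show what is lost when the suggestion section is dropped, here is a sketch of a completion-suggester request body and of reading the `suggest` section of a raw search response. It is written in Python, like the Python Client section of this article, rather than Java. The suggester name `name_suggest`, the field `suggestion`, and both helper functions are hypothetical, and the sample response is hand-written for illustration, not captured from a real cluster.

```python
from typing import Any


def build_suggest_body(prefix: str, field: str = "suggestion",
                       name: str = "name_suggest", size: int = 5) -> dict[str, Any]:
    """Build a search body that only asks for completion suggestions."""
    return {
        "suggest": {
            name: {
                "prefix": prefix,
                "completion": {"field": field, "size": size, "skip_duplicates": True},
            }
        }
    }


def extract_suggestions(response_body: dict[str, Any],
                        name: str = "name_suggest") -> list[str]:
    """Collect the suggested texts from the suggest section of a response."""
    entries = response_body.get("suggest", {}).get(name, [])
    return [option["text"] for entry in entries for option in entry.get("options", [])]


# Hand-written response fragment with the shape a suggest request returns.
sample_response = {
    "hits": {"hits": []},
    "suggest": {
        "name_suggest": [
            {
                "text": "hi",
                "offset": 0,
                "length": 2,
                "options": [
                    {"text": "Hilton", "_index": "hotel", "_id": "1", "_score": 1.0},
                    {"text": "Himalaya Inn", "_index": "hotel", "_id": "2", "_score": 1.0},
                ],
            }
        ]
    },
}
print(extract_suggestions(sample_response))
```

The suggestions live next to `hits` at the top level of the response, so a client layer that only maps `hits` silently loses them.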
Using newer versions tends to come with pitfalls like these.