Elasticsearch CRUD: REST and Java API
CRUD (Create, Retrieve, Update, Delete) names the four basic operations of a database system. Elasticsearch, viewed as a NoSQL database, supports all four. (ES was built as a search engine, but I prefer to think of it as a NoSQL store with powerful full-text search.)
The examples below are based on Elasticsearch 2.4.
Create
By default, the ES REST interface listens on port 9200 and the Java client connects on port 9300.
A create operation indexes a document into an index; if the index does not exist, ES creates it automatically:
$ curl -XPUT 'http://localhost:9200/twitter/tweet/1' -d '{<json data>}'
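The Java API ("org.elasticsearch" % "elasticsearch" % "2.4.1") connects to the ES cluster through a TransportClient, and the CRUD operations are built on top of that client.
import java.net.InetAddress;
import java.util.concurrent.TimeUnit;
import org.elasticsearch.client.Client;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;

final Settings settings = Settings.settingsBuilder()
        // let the client discover the other nodes of the cluster
        .put("client.transport.sniff", true)
        // how long to wait for a ping response from a node
        .put("client.transport.ping_timeout", 20, TimeUnit.SECONDS)
        .put("client", true)
        .put("data", false)
        .put("cluster.name", "<cluster name>")
        .build();
// InetAddress.getByName throws UnknownHostException, which the caller must handle
Client client = TransportClient.builder()
        .settings(settings).build()
        .addTransportAddresses(
                new InetSocketTransportAddress(InetAddress.getByName("host1"), 9300),
                new InetSocketTransportAddress(InetAddress.getByName("host2"), 9300));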
The Index Java API creates an index or indexes a document:
import org.elasticsearch.action.index.IndexResponse;
IndexResponse response = client.prepareIndex("twitter", "tweet")
        .setSource(documentJson)
        .get();
Retrieve
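The ES query DSL falls roughly into two parts:
- Query DSL, mainly used with bool, match and so on; it corresponds to the where clause in SQL;
- Aggregations, which correspond to the group by part of SQL and come in three kinds:
  - Bucketing: the only aggregate function is count(*), the number of matching docs; sub-aggs can be nested;
  - Metric: much more flexible than Bucketing, it works with aggregate functions such as avg, max and sum, but cannot nest sub-aggs;
  - Pipeline: takes the results of other aggs as input instead of operating directly on the document set.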
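As a rough analogy only, the three kinds of aggregation can be mimicked with plain Java streams. This sketch does not use the ES API; the `Doc` class, its `user` and `likes` fields, and the sample data are all assumptions made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class AggregationAnalogy {
    /** A toy document with one keyword-like field and one numeric field (both hypothetical). */
    static final class Doc {
        final String user;
        final int likes;
        Doc(String user, int likes) { this.user = user; this.likes = likes; }
    }

    static final List<Doc> DOCS = Arrays.asList(
            new Doc("alice", 10), new Doc("alice", 30), new Doc("bob", 5));

    /** Bucketing: group documents by a field value and count the hits per bucket. */
    static Map<String, Long> docCountPerUser(List<Doc> docs) {
        return docs.stream().collect(
                Collectors.groupingBy(d -> d.user, TreeMap::new, Collectors.counting()));
    }

    /** Metric nested under a bucket: average of a numeric field within each bucket. */
    static Map<String, Double> avgLikesPerUser(List<Doc> docs) {
        return docs.stream().collect(
                Collectors.groupingBy(d -> d.user, TreeMap::new,
                        Collectors.averagingInt(d -> d.likes)));
    }

    /** Pipeline: computed from the output of another aggregation, not from the documents. */
    static double maxOfBucketAverages(Map<String, Double> avgPerBucket) {
        return avgPerBucket.values().stream()
                .mapToDouble(Double::doubleValue).max().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        System.out.println(docCountPerUser(DOCS));   // {alice=2, bob=1}
        System.out.println(avgLikesPerUser(DOCS));   // {alice=20.0, bob=5.0}
        System.out.println(maxOfBucketAverages(avgLikesPerUser(DOCS)));  // 20.0
    }
}
```

The bucket step decides which documents belong together, the metric step reduces each bucket to a number, and the pipeline step only ever sees those per-bucket numbers.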
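The ES Query DSL is far too powerful to cover fully in a post this short, so I only give two simple examples. In earlier projects I used ES 1.7; I later found that from 2.0.0-beta1 onward the DSL syntax changed considerably. For example, filtered, and, and or were deprecated in favor of bool. The corresponding Java API supports method chaining and is very pleasant to write together with Java 8.
Over REST, DSL queries go through the _search endpoint: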
$ curl -XGET 'localhost:9200/<index>/_search?pretty' -d'{<dsl>}'
A real-world case: a List<List<String>> idsList serves as the filter condition, where the inner lists are combined with AND and the ids inside each inner list are combined with OR; the request then aggregates on several fields (the keys of bucketSizeMap). A Java 8 implementation:
// assumes: import static org.elasticsearch.index.query.QueryBuilders.*;
BoolQueryBuilder mustQueryBuilder = boolQuery();
// skip filtering when the only group is empty
if (!(idsList.size() == 1 && idsList.get(0).isEmpty())) {
    mustQueryBuilder = idsList.stream().reduce(
            boolQuery(),
            (mustQB, ids) -> {
                // ids within one group are alternatives: OR them with should clauses
                BoolQueryBuilder shouldQB = ids.stream().reduce(boolQuery(),
                        (qb, id) -> qb.should(termQuery(SearchSystem.getEsType(id, idMap), id)),
                        BoolQueryBuilder::should);
                // every group has to match: AND the groups with must clauses
                return mustQB.must(shouldQB);
            },
            BoolQueryBuilder::must);
}
SearchRequestBuilder searchRequestBuilder = client.prepareSearch(indexName)
        .setSearchType(SearchType.DFS_QUERY_THEN_FETCH)
        .setQuery(mustQueryBuilder);
// one terms aggregation per field, named after the field, with its own bucket size
for (Map.Entry<String, Integer> entry : bucketSizeMap.entrySet()) {
    AggregationBuilder aggregationBuilder = AggregationBuilders
            .terms(entry.getKey())
            .field(entry.getKey()).size(entry.getValue());
    searchRequestBuilder.addAggregation(aggregationBuilder);
}
SearchResponse response = searchRequestBuilder.execute().actionGet();
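To make the AND-of-ORs shape of that query easier to see, here is a minimal sketch that renders the same nesting as a JSON bool query using only the JDK. It is not the builder's actual output: the field name `id` is a placeholder (the original resolves the field per id through SearchSystem.getEsType), and the ids are assumed to need no JSON escaping.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class NestedBoolQuery {
    // Hypothetical field name; the original code looks it up per id.
    static final String FIELD = "id";

    /** One inner list becomes a bool/should clause: any of its ids may match (OR). */
    static String anyOf(List<String> ids) {
        // Assumes ids contain no characters that require JSON escaping.
        String terms = ids.stream()
                .map(id -> "{\"term\":{\"" + FIELD + "\":\"" + id + "\"}}")
                .collect(Collectors.joining(","));
        return "{\"bool\":{\"should\":[" + terms + "]}}";
    }

    /** The outer list becomes a bool/must clause: every inner group must match (AND). */
    static String allOf(List<List<String>> idsList) {
        String groups = idsList.stream()
                .map(NestedBoolQuery::anyOf)
                .collect(Collectors.joining(","));
        return "{\"bool\":{\"must\":[" + groups + "]}}";
    }

    public static void main(String[] args) {
        // (a OR b) AND (c)
        System.out.println(allOf(Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("c"))));
    }
}
```

A bool query that contains only should clauses requires at least one of them to match, which is what gives each inner group its OR semantics.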
Bucket aggregations support filter aggs, which aggregate only over the documents that pass a filter:
aggs:
  <aggs_name>:
    filter:
    aggs:
This is functionally equivalent to a filter query plus aggs:
query:
  bool:
    filter:
aggs:
In my tests, however, filter query + aggs turned out to be faster than filter aggs.
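The two skeletons can be filled in as follows. This is only a sketch that assembles the request bodies as strings; the `user` and `tag` fields, the value `kimchy`, and the aggregation names `filtered` and `by_tag` are assumptions chosen for illustration.

```java
class FilterAggBodies {
    // Hypothetical filter and aggregation, shared by both request bodies.
    static final String FILTER = "{\"term\":{\"user\":\"kimchy\"}}";
    static final String TERMS_AGG = "{\"by_tag\":{\"terms\":{\"field\":\"tag\"}}}";

    /** Filter aggregation: the filter is applied inside the aggs tree. */
    static String filterAggregation() {
        return "{\"size\":0,\"aggs\":{\"filtered\":{\"filter\":" + FILTER
                + ",\"aggs\":" + TERMS_AGG + "}}}";
    }

    /** Filter query + aggs: the filter narrows the document set before aggregating. */
    static String filterQueryThenAggs() {
        return "{\"size\":0,\"query\":{\"bool\":{\"filter\":" + FILTER
                + "}},\"aggs\":" + TERMS_AGG + "}";
    }

    public static void main(String[] args) {
        System.out.println(filterAggregation());
        System.out.println(filterQueryThenAggs());
    }
}
```

Either body can be sent to the _search endpoint; "size": 0 suppresses the hits so that only the aggregation results come back.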
Update
An update is a document-level operation: it can only update one specific document. Over REST it goes through the _update endpoint:
$ curl -XPOST 'localhost:9200/<_index>/<_type>/<_id>/_update' -d '{<data>}'
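For a partial update, the request body wraps the changed fields in a "doc" object, and ES merges them into the stored document. The sketch below only builds the path and body as strings; the helper names and the sample values are hypothetical, and the values are assumed to need no JSON escaping.

```java
class UpdateRequestSketch {
    /** Path of the document-level _update endpoint, as in the curl command above. */
    static String updatePath(String index, String type, String id) {
        return "/" + index + "/" + type + "/" + id + "/_update";
    }

    /** Partial update body: the fields under "doc" are merged into the stored document. */
    static String partialDoc(String field, String value) {
        // Assumes field and value contain no characters that require JSON escaping.
        return "{\"doc\":{\"" + field + "\":\"" + value + "\"}}";
    }

    public static void main(String[] args) {
        System.out.println(updatePath("twitter", "tweet", "1"));
        System.out.println(partialDoc("gender", "male"));
    }
}
```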
The Java API offers two ways to do this: UpdateRequest + update, or prepareUpdate.
// both cases assume: import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder;
// case 1
UpdateRequest updateRequest = new UpdateRequest();
updateRequest.index("index");
updateRequest.type("type");
updateRequest.id("1");
updateRequest.doc(jsonBuilder()
        .startObject()
            .field("gender", "male")
        .endObject());
client.update(updateRequest).get();
// case 2
client.prepareUpdate("ttl", "doc", "1")
        .setDoc(jsonBuilder()
            .startObject()
                .field("gender", "male")
            .endObject())
        .get();
Delete
A delete is usually preceded by a check that the index exists. The REST call and the Java API for that check are:
$ curl -XHEAD -i 'http://localhost:9200/twitter'
client.admin().indices()
        .prepareExists(indexName)
        .execute().actionGet().isExists();
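ES offers delete operations at three levels of granularity:
- delete an entire index;
- delete one type within an index;
- delete a specific document.
REST interface (note that deleting a type this way only works on 1.x; the delete-mapping API was removed in 2.0):
-- delete complete index
$ curl -XDELETE 'http://localhost:9200/<indexname>'
-- delete a type in index
$ curl -XDELETE 'http://localhost:9200/<indexname>/<typename>'
-- delete a particular document
$ curl -XDELETE 'http://localhost:9200/<indexname>/<typename>/<documentId>'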
Java API:
// delete complete index
client.admin().indices().delete(new DeleteIndexRequest("<indexname>")).actionGet();
// delete a type in index
// NOTE: the delete API addresses one document by its exact id and does not expand wildcards,
// so this call only targets a document whose id is literally "*", not the whole type
client.prepareDelete().setIndex("<indexname>").setType("<typename>").setId("*").execute().actionGet();
// delete a particular document
client.prepareDelete().setIndex("<indexname>").setType("<typename>").setId("<documentId>").execute().actionGet();
// or
DeleteResponse response = client.prepareDelete("twitter", "tweet", "1")
        .execute()
        .actionGet();
