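Elasticsearch operations
Adding documents
The type is employee, which lives in the index megacorp. Each employee is indexed as one document containing all of that employee's information (document-oriented). This employee's id is 1.
An index, a type, and an id are required.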
curl -X PUT -H 'Content-Type: application/json' -i http://focuson1:9200/megacorp/employee/1 --data '{
"first_name": "John",
"last_name": "Smith",
"age": 25,
"about": "I love to go rock climbing",
"interests": [
"sports",
"music"
]
}'
Add another employee:
curl -X PUT -H 'Content-Type: application/json' -i http://focuson1:9200/megacorp/employee/2 --data '{
"first_name" : "Jane",
"last_name" : "Smith",
"age" : 32,
"about" : "I like to collect rock albums",
"interests": [ "music" ]
}'
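When an index is created it gets 5 primary shards by default (in the Elasticsearch version used for these notes). The number of primary shards is fixed when the index is created, while the number of replica shards can be changed at any time. The example below creates an index with 3 primary shards and 1 replica. When a document is put into ES, its id is hashed and the document is stored on the shard that the hash maps to.
$ curl -X PUT -H 'Content-Type: application/json' -i http://focuson1:9200/megacorp2 --data '{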
> "settings" : {
> "number_of_shards" : 3,
> "number_of_replicas" : 1
> }
> }'
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 68
{"acknowledged":true,"shards_acknowledged":true,"index":"megacorp2"}
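The routing rule described above can be sketched in a few lines of Python. This is only an illustration: `pick_shard` is a hypothetical helper, and `zlib.crc32` is a stand-in for the hash function Elasticsearch uses internally. It shows why the primary shard count has to stay fixed: the same id can map to a different shard once the divisor changes.

```python
import zlib


def pick_shard(routing: str, number_of_primary_shards: int) -> int:
    """Map a routing value (the document id by default) to a primary shard."""
    # Assumption: crc32 replaces Elasticsearch's own hash function here,
    # chosen only because it gives the same result on every run.
    return zlib.crc32(routing.encode("utf-8")) % number_of_primary_shards


ids = ["1", "2", "100", "123"]
# Shard assignment with 3 primary shards versus 5 primary shards.
with_3 = {doc_id: pick_shard(doc_id, 3) for doc_id in ids}
with_5 = {doc_id: pick_shard(doc_id, 5) for doc_id in ids}
print(with_3)
print(with_5)
```

If the shard count could change after documents were indexed, a lookup by id would hash to a shard that may not hold the document.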
A document can also be added to ES without specifying an id; an id is then generated automatically. This requires a POST request:
$ curl -X POST -H 'Content-Type: application/json' -i http://focuson1:9200/megacorp/employee/ --data '{
> "first_name": "John2",
> "last_name": "Smith2",
> "age": 256,
> "about": "I love to go rock climbing",
> "interests": [
> "sports",
> "music"
> ]
> }'
HTTP/1.1 201 Created
Location: /megacorp/employee/TfQ8ymMBtknNDl0i3mwi
content-type: application/json; charset=UTF-8
content-length: 179
{"_index":"megacorp","_type":"employee","_id":"TfQ8ymMBtknNDl0i3mwi","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":3,"_primary_term":1}
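Updating a document
Updating works the same way as adding. The response carries a _version that differs from the previous one. On update, Elasticsearch marks the old document as deleted and adds a completely new document. The old document is cleaned up in the background, but not immediately.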
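Creating a document
To create a document only if it does not exist yet, add op_type=create. A 409 response means the document already exists and was not created. Without op_type=create the request would update the existing document. Appending /_create to the URL has the same effect.
$ curl -X PUT -H 'Content-Type: application/json' -i http://focuson1:9200/megacorp/employee/1?op_type=create --data '{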
> "first_name": "John",
> "last_name": "Smith",
> "age": 25,
> "about": "I love to go rock climbing",
> "interests": [
> "sports",
> "music"
> ]
> }'
HTTP/1.1 409 Conflict
content-type: application/json; charset=UTF-8
content-length: 445
{"error":{"root_cause":[{"type":"version_conflict_engine_exception","reason":"[employee][1]: version conflict, document already exists (current version [4])","index_uuid":"hKhKh3YRT6yRiWQiBPSYuw","shard":"3","index":"megacorp"}],"type":"version_conflict_engine_exception","reason":"[employee][1]: version conflict, document already exists (current version [4])","index_uuid":"hKhKh3YRT6yRiWQiBPSYuw","shard":"3","index":"megacorp"},"status":409}
Retrieving documents:
- Return a single document, given its index, type, and id
$ curl -X GET -i http://focuson1:9200/megacorp/employee/1
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 249
{"_index":"megacorp","_type":"employee","_id":"1","_version":1,"found":true,"_source":{
"first_name": "John",
"last_name": "Smith",
"age": 25,
"about": "I love to go rock climbing",
"interests": [
"sports",
"music"
]
}}
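- pretty
Adding pretty to the request parameters makes the response more readable. The _source field is supposed to come back in the format in which it was indexed rather than being pretty-printed, although in the transcript below its spacing has been normalized as well.
$ curl -X GET -i http://focuson1:9200/megacorp/employee/1?pretty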
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 294
{
"_index" : "megacorp",
"_type" : "employee",
"_id" : "1",
"_version" : 3,
"found" : true,
"_source" : {
"first_name" : "John",
"last_name" : "Smith",
"age" : 25,
"about" : "I love to go rock climbing",
"interests" : [
"sports",
"music"
]
}
}
- Return selected fields
Use the _source parameter to return only some of the fields:
$ curl -X GET -i http://focuson1:9200/megacorp/employee/1?_source=first_name,last_name
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 128
{"_index":"megacorp","_type":"employee","_id":"1","_version":3,"found":true,"_source":{"last_name":"Smith","first_name":"John"}}
- Return only the _source
Append /_source to the URL to get just the contents of _source:
$ curl -X GET -i http://focuson1:9200/megacorp/employee/1/_source
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 162
{
"first_name": "John",
"last_name": "Smith",
"age": 25,
"about": "I love to go rock climbing",
"interests": [
"sports",
"music"
]
}
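- Return all documents (at most ten by default):
/_search
Search all types in all indices
/gb/_search
Search all types in the gb index
/gb,us/_search
Search all documents in the gb and us indices
/g*,u*/_search
Search all types in any index whose name starts with g or u
/gb/user/_search
Search the user type in the gb index
/gb,us/user,tweet/_search
Search the user and tweet types in the gb and us indices
/_all/user,tweet/_search
Search the user and tweet types in all indices
// This example returns all documents of type employee in the index megacorp
$ curl -X GET -i http://focuson1:9200/megacorp/employee/_search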
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 611
{"took":2,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":2,"max_score":1.0,"hits":[{"_index":"megacorp","_type":"employee","_id":"2","_score":1.0,"_source":{
"first_name" : "Jane",
"last_name" : "Smith",
"age" : 32,
"about" : "I like to collect rock albums",
"interests": [ "music" ]
}},{"_index":"megacorp","_type":"employee","_id":"1","_score":1.0,"_source":{
"first_name": "John",
"last_name": "Smith",
"age": 25,
"about": "I love to go rock climbing",
"interests": [
"sports",
"music"
]
}}]}}
Pagination
GET /_search?size=5
GET /_search?size=5&from=5
GET /_search?size=5&from=10
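To get a feel for what from and size cost on a sharded index, here is a small Python simulation. It is a sketch under simplifying assumptions: shards are plain lists of scores, and `search_page` is a hypothetical helper, not an Elasticsearch API. Each shard hands back its own top from + size hits, and the coordinator merges them before cutting out the requested page.

```python
import heapq


def search_page(shards, from_, size):
    """Return one page of scores and how many hits the coordinator had to merge."""
    # Every shard must return its top (from_ + size) hits, not just `size`.
    per_shard = [heapq.nlargest(from_ + size, shard) for shard in shards]
    merged = sorted((score for hits in per_shard for score in hits), reverse=True)
    return merged[from_:from_ + size], sum(len(hits) for hits in per_shard)


# Five simulated shards with 50 made-up scores each.
shards = [[(i * 7 + k * 13) % 101 for i in range(50)] for k in range(5)]
page, merged_count = search_page(shards, from_=10, size=5)
print(page, merged_count)
```

With 5 shards, `from_=10` and `size=5`, the coordinator merges 75 hits to return 5 of them; the merged count grows with both the page depth and the number of shards.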
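Each shard sorts its own results and returns them to the coordinating node, which merges them. Every shard has to produce its top from + size hits, so paging too deep makes the cost rise sharply.
Full-text search
Returns the documents related to the search term, together with a relevance score.
Form 1: the query in the URL. Written this way, spaces and other special characters cannot be used unescaped: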
curl -X GET -i 'http://focuson1:9200/megacorp/employee/_search?q=about:like'
A + prefix on a condition means it must match; a - prefix means it must not match; conditions without + or - are optional. For example: http://focuson1:9200/megacorp/employee/_search?q=-about:to%20go
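Because spaces and other special characters have to be percent-encoded in the URL form, building the URL in code is less error-prone than typing it by hand. The sketch below uses Python's standard `urllib.parse.quote`; `search_url` is a hypothetical helper written for these notes.

```python
from urllib.parse import quote


def search_url(base: str, query: str) -> str:
    """Build a query-string search URL with the query percent-encoded."""
    # ':' is kept readable; spaces become %20 and '+' becomes %2B.
    return f"{base}/_search?q={quote(query, safe=':')}"


url = search_url("http://focuson1:9200/megacorp/employee", "-about:to go")
print(url)
```

Note that a literal + prefix has to be sent as %2B, since a bare + in a query string is decoded as a space.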
Form 2: a request body with match:
$ curl -X GET -i http://focuson1:9200/megacorp/employee/_search -H 'Content-Type: application/json' --data '{
> "query" : {
> "match" : {
> "about" : "rock climbing"
> }
> }
> }'
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 629
{"took":6,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":2,"max_score":0.5753642,"hits":[{"_index":"megacorp","_type":"employee","_id":"1","_score":0.5753642,"_source":{
"first_name": "John",
"last_name": "Smith",
"age": 25,
"about": "I love to go rock climbing",
"interests": [
"sports",
"music"
]
}},{"_index":"megacorp","_type":"employee","_id":"2","_score":0.2876821,"_source":{
"first_name" : "Jane",
"last_name" : "Smith",
"age" : 32,
"about" : "I like to collect rock albums",
"interests": [ "music" ]
}}]}}
Phrase search
match_phrase returns only documents that contain the exact phrase.
$ curl -X GET "localhost:9200/megacorp/employee/_search" -H 'Content-Type: application/json' -d'
> {
> "query" : {
> "match_phrase" : {
> "about" : "rock climbing"
> }
> }
> }'
{"took":4,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":1,"max_score":0.5753642,"hits":[{"_index":"megacorp","_type":"employee","_id":"1","_score":0.5753642,"_source":{
"first_name": "John",
"last_name": "Smith",
"age": 25,
"about": "I love to go rock climbing",
"interests": [
"sports",
"music"
]
}}]}}
Highlighted search
Highlighting shows the user why a document matched. Both the JSON request and the response contain a highlight section.
$ curl -X GET "localhost:9200/megacorp/employee/_search" -H 'Content-Type: application/json' -d'
> {
> "query" : {
> "match_phrase" : {
> "about" : "rock climbing"
> }
> },
> "highlight": {
> "fields" : {
> "about" : {}
> }
> }
> }
> '
{"took":6,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":1,"max_score":0.5753642,"hits":[{"_index":"megacorp","_type":"employee","_id":"1","_score":0.5753642,"_source":{
"first_name": "John",
"last_name": "Smith",
"age": 25,
"about": "I love to go rock climbing",
"interests": [
"sports",
"music"
]
},"highlight":{"about":["I love to go <em>rock</em> <em>climbing</em>"]}}]}}
Aggregation and analysis.
Find employees whose last_name is Smith and whose age is greater than 30 (gt stands for "greater than"):
$ curl -X GET -i http://focuson1:9200/megacorp/employee/_search -H 'Content-Type: application/json' --data '{
> "query" : {
> "bool": {
> "must": {
> "match" : {
> "last_name" : "smith"
> }
> },
> "filter": {
> "range" : {
> "age" : { "gt" : 30 }
> }
> }
> }
> }
> }'
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 388
{"took":153,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":1,"max_score":0.2876821,"hits":[{"_index":"megacorp","_type":"employee","_id":"2","_score":0.2876821,"_source":{
"first_name" : "Jane",
"last_name" : "Smith",
"age" : 32,
"about" : "I like to collect rock albums",
"interests": [ "music" ]
}}]}}
Deleting documents and indices. A deleted document is not removed immediately; it is only marked as deleted.
Delete a document
curl -X DELETE -i http://focuson1:9200/megacorp/employee/1
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 160
{"_index":"megacorp","_type":"employee","_id":"1","_version":5,"result":"deleted","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":5,"_primary_term":2}
Delete an index
$ curl -X DELETE -i http://focuson1:9200/megacorp
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 21
{"acknowledged":true}
Check cluster health
$ curl http://focuson1:9200/_cluster/health
{"cluster_name":"elasticsearch","status":"yellow","timed_out":false,"number_of_nodes":1,"number_of_data_nodes":1,"active_primary_shards":5,"active_shards":5,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":5,"delayed_unassigned_shards":0,"number_of_pending_tasks":0,"number_of_in_flight_fetch":0,"task_max_waiting_in_queue_millis":0,"active_shards_percent_as_number":50.0}
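The lost update problem
At the database level there are pessimistic locks and optimistic locks. Pessimistic locking assumes that every update may cause a lost update, so it locks the data as soon as it is read; nobody else can touch it until the lock is released. Optimistic locking assumes that conflicts are rare and does not lock on read. Instead each record carries a version number: the version is read together with the data, and the update is applied only if the version is still the same. If someone else has changed the record in the meantime, the update is rejected. This also prevents lost updates, and because nothing is locked, optimistic locking is more efficient.
Elasticsearch can clearly use optimistic locking, because every document has a version number. For example, when a web page loads records from ES, each record comes with its version. An update or delete then passes the version that was loaded as a condition; if the current version no longer matches, the update fails.
An example:
$ curl -X GET http://focuson1:9200/megacorp/employee/2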
{"_index":"megacorp","_type":"employee","_id":"2","_version":2,"found":true,"_source":{
"first_name" : "Jane",
"last_name" : "Smith",
"age" : 32,
"about" : "I like to collect rock albums",
"interests": [ "music" ]
}}
The document is at version 2, so the update is made conditional on version=2:
$ curl -X PUT -H 'Content-Type: application/json' http://focuson1:9200/megacorp/employee/2?version=2 --data '{
> "first_name" : "Jane3",
> "last_name" : "Smith3",
> "age" : 32,
> "about" : "I like to collect rock albums",
> "interests": [ "music" ]
> }'
{"_index":"megacorp","_type":"employee","_id":"2","_version":3,"result":"updated","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":2,"_primary_term":1}
The version is now 3. Updating again with version=2 fails with status 409:
$ curl -X PUT -H 'Content-Type: application/json' http://focuson1:9200/megacorp/employee/2?version=2 --data '{
"first_name" : "Jane3",
"last_name" : "Smith3",
"age" : 32,
"about" : "I like to collect rock albums",
"interests": [ "music" ]
}'
{"error":{"root_cause":[{"type":"version_conflict_engine_exception","reason":"[employee][2]: version conflict, current version [3] is different than the one provided [2]","index_uuid":"FeUwsg9lTPuFTABIuT77BQ","shard":"2","index":"megacorp"}],"type":"version_conflict_engine_exception","reason":"[employee][2]: version conflict, current version [3] is different than the one provided [2]","index_uuid":"FeUwsg9lTPuFTABIuT77BQ","shard":"2","index":"megacorp"},"status":409}
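The version check in the two requests above can be modelled with a tiny in-memory store. This is a simplified sketch, not how Elasticsearch is implemented: `Store` and `VersionConflict` are hypothetical names, and the store keeps only the latest version of each document.

```python
class VersionConflict(Exception):
    """Raised when the caller's version is stale (the 409 case)."""


class Store:
    def __init__(self):
        self._docs = {}  # id -> (version, source)

    def get(self, doc_id):
        return self._docs[doc_id]

    def put(self, doc_id, source, version=None):
        current = self._docs.get(doc_id, (0, None))[0]
        # With a version given, write only if nobody has changed the document.
        if version is not None and version != current:
            raise VersionConflict(
                f"current version [{current}] is different than the one provided [{version}]"
            )
        self._docs[doc_id] = (current + 1, source)
        return current + 1


store = Store()
store.put("2", {"first_name": "Jane"})   # version 1
store.put("2", {"first_name": "Jane"})   # version 2
print(store.put("2", {"first_name": "Jane3"}, version=2))  # accepted, now version 3
try:
    store.put("2", {"first_name": "Jane3"}, version=2)     # stale version
except VersionConflict as exc:
    print("409:", exc)
```

The second conditional write fails because the first one already moved the document to version 3, which is the same sequence as in the transcript.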
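Using an external version number
When indexing a document:
If the document's current version is lower than 123, the write succeeds and the version becomes 123. If the current version is greater than or equal to 123, the request returns 409.
$ curl -X PUT -H 'Content-Type: application/json' 'http://focuson1:9200/megacorp/employee/2?version=123&version_type=external' --data '{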
> "first_name" : "Jane",
> "last_name" : "Smith",
> "age" : 32,
> "about" : "I like to collect rock albums",
> "interests": [ "music" ]
> }'
{"_index":"megacorp","_type":"employee","_id":"2","_version":123,"result":"updated","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":4,"_primary_term":1}
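The external-version rule differs from the internal one: the supplied version does not have to equal the current one, it has to be strictly greater. A minimal sketch, with `index_external` as a hypothetical function name:

```python
def index_external(current_version, new_version):
    """Return the stored version after a write with version_type=external."""
    # Assumption: None means the document does not exist yet.
    if current_version is not None and new_version <= current_version:
        raise ValueError(
            f"409: current version [{current_version}] is higher or equal "
            f"to the one provided [{new_version}]"
        )
    return new_version


print(index_external(3, 123))        # lower current version: accepted
try:
    index_external(123, 123)         # equal: rejected
except ValueError as exc:
    print(exc)
```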
Partial document update
Fields inside doc are merged into the document: existing fields are updated, missing ones are added.
$ curl -X POST -H 'Content-Type: application/json' 'http://focuson1:9200/megacorp/employee/2/_update' --data '{
> "doc" : {
> "tags" : [ "testing" ],
> "views": 0
> }
> }'
{"_index":"megacorp","_type":"employee","_id":"2","_version":125,"result":"updated","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":6,"_primary_term":1}
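The merge behaviour of doc can be pictured as a recursive dictionary merge: nested objects are merged key by key, and everything else (scalars, arrays) is replaced. The sketch below is an illustration with a hypothetical helper `merge_doc`, not Elasticsearch code.

```python
def merge_doc(source, partial):
    """Merge a partial document into an existing _source and return the result."""
    merged = dict(source)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested objects are merged key by key.
            merged[key] = merge_doc(merged[key], value)
        else:
            # Scalars and arrays replace the old value; new keys are added.
            merged[key] = value
    return merged


existing = {"first_name": "Jane", "views": 3}
print(merge_doc(existing, {"tags": ["testing"], "views": 0}))
```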
Partial update with a script: increment the nimei field by 1
$ curl -X POST "http://focuson1:9200/megacorp/employee/2/_update" -H 'Content-Type: application/json' -d'
> {
> "script" : "ctx._source.nimei+=1"
> }
> '
{"_index":"megacorp","_type":"employee","_id":"2","_version":127,"result":"updated","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":8,"_primary_term":1}
$ curl http://focuson1:9200/megacorp/employee/2?pretty
{
"_index" : "megacorp",
"_type" : "employee",
"_id" : "2",
"_version" : 127,
"found" : true,
"_source" : {
"doc" : {
"tags" : [
"testing"
],
"views" : 0
},
"views" : 0,
"tags" : [
"testing"
],
"nimei" : 1234567891
}
}
upsert
If the document to be updated does not exist, create it first.
$ curl -X POST -H 'Content-Type: application/json' 'http://focuson1:9200/megacorp/employee/100/_update' --data '{
> "doc" : {
> "tags" : [ "testing" ],
> "views": 0,
> "nimei":1234567890
> },
> "upsert": {}
> }'
{"_index":"megacorp","_type":"employee","_id":"100","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":0,"_primary_term":1}
// The result below shows that a missing document is created from the upsert body, but the fields in doc are not applied to it
$ curl http://focuson1:9200/megacorp/employee/100
{"_index":"megacorp","_type":"employee","_id":"100","_version":1,"found":true,"_source":{}}
$ curl -X POST -H 'Content-Type: application/json' 'http://focuson1:9200/megacorp/employee/100/_update' --data '{
> "doc" : {
> "tags" : [ "testing" ],
> "views": 0,
> "nimei":1234567890
> },
> "upsert": {}
> }'
{"_index":"megacorp","_type":"employee","_id":"100","_version":2,"result":"updated","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":1,"_primary_term":1}
$ curl http://focuson1:9200/megacorp/employee/100
{"_index":"megacorp","_type":"employee","_id":"100","_version":2,"found":true,"_source":{"nimei":1234567890,"views":0,"tags":["testing"]}}
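The two runs above can be summarised as: when the document is missing, the upsert body is stored as-is and doc is ignored; when it exists, doc is merged in. A small sketch of that rule, using a plain dict as a stand-in for the index (`update` is a hypothetical helper, and the merge is shallow for brevity):

```python
def update(store, doc_id, doc, upsert):
    """Apply an update request with both doc and upsert, as in the transcript."""
    if doc_id not in store:
        # Missing document: only the upsert body is written.
        store[doc_id] = dict(upsert)
        return "created"
    # Existing document: fields from doc are merged in.
    store[doc_id] = {**store[doc_id], **doc}
    return "updated"


index = {}
doc = {"tags": ["testing"], "views": 0, "nimei": 1234567890}
print(update(index, "100", doc, upsert={}), index["100"])
print(update(index, "100", doc, upsert={}), index["100"])
```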
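Retrying updates
In application code we can use optimistic locking and pass the version with every update, so lost updates cannot happen. When no version is passed, an update first retrieves the document (and its version) and then reindexes it. Another process may change the document between those two steps, which causes a conflict. The retry_on_conflict parameter sets how many times the update is retried in that case; the default is 0.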
curl -X POST "localhost:9200/website/pageviews/1/_update?retry_on_conflict=5" -H 'Content-Type: application/json' -d'
{
"script" : "ctx._source.views+=1",
"upsert": {
"views": 0
}
}
'
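What retry_on_conflict does can be sketched as a read-modify-write loop that starts over when the version check fails. Everything here is hypothetical scaffolding for illustration (`Doc`, `increment_views`, and the `interfere` hook that plays the role of a concurrent writer); it is not the Elasticsearch implementation.

```python
class VersionConflict(Exception):
    pass


class Doc:
    def __init__(self):
        self.version = 1
        self.views = 0

    def write(self, views, expected_version):
        if expected_version != self.version:
            raise VersionConflict()
        self.views = views
        self.version += 1


def increment_views(doc, retry_on_conflict=0, interfere=None):
    """Increment doc.views, retrying on conflict; return the number of retries used."""
    retries = 0
    while True:
        version, views = doc.version, doc.views      # step 1: read
        if interfere:
            interfere(doc)                           # simulated concurrent writer
        try:
            doc.write(views + 1, expected_version=version)  # step 2: reindex
            return retries
        except VersionConflict:
            if retries >= retry_on_conflict:
                raise
            retries += 1


def make_rival(times):
    """Build a writer that sneaks in `times` updates between read and write."""
    remaining = [times]

    def rival(doc):
        if remaining[0] > 0:
            remaining[0] -= 1
            doc.write(doc.views + 1, expected_version=doc.version)

    return rival


doc = Doc()
print(increment_views(doc, retry_on_conflict=5, interfere=make_rival(2)), doc.views)
```

Retrying is safe for an increment because the order of increments does not matter; for updates where order matters, a retry may not be appropriate.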
Retrieving multiple documents
$ curl -X GET -H 'Content-Type: application/json' 'http://focuson1:9200/_mget' --data '{
> "docs" : [
> {
> "_index" : "megacorp",
> "_type" : "employee",
> "_id" : 1
> },
> {
> "_index" : "megacorp",
> "_type" : "employee",
> "_id" : 2,
> "_source": "first_name"
> }
> ]
> }'
{"docs":[{"_index":"megacorp","_type":"employee","_id":"1","_version":1,"found":true,"_source":{
"first_name": "John",
"last_name": "Smith",
"age": 25,
"about": "I love to go rock climbing",
"interests": [
"sports",
"music"
]
}},{"_index":"megacorp","_type":"employee","_id":"2","_version":127,"found":true,"_source":{}}]}
If all the documents are in the same index or the same type, the index or type can be put in the URL instead:
$ curl -X GET -H 'Content-Type: application/json' 'http://focuson1:9200/megacorp/employee/_mget' --data '{
> "docs" : [
> {
> "_id" : 1
> },
> {
> "_id" : 2,
> "_source": "first_name"
> }
> ]
> }'
{"docs":[{"_index":"megacorp","_type":"employee","_id":"1","_version":1,"found":true,"_source":{
"first_name": "John",
"last_name": "Smith",
"age": 25,
"about": "I love to go rock climbing",
"interests": [
"sports",
"music"
]
}},{"_index":"megacorp","_type":"employee","_id":"2","_version":127,"found":true,"_source":{}}]}
Bulk operations (bulk) support these actions: create (create a document), index (create a document or replace an existing one), update (update a document), delete (delete a document)
An example:
$ curl -X POST "http://focuson1:9200/_bulk" -H 'Content-Type: application/json' -d'
> { "delete": { "_index": "megacorp", "_type": "employee", "_id": "123" }}
> { "create": { "_index": "megacorp", "_type": "employee", "_id": "123" }}
> { "title": "My first blog post" }
> { "index": { "_index": "megacorp", "_type": "employee" }}
> { "title": "My second blog post" }
> { "update": { "_index": "megacorp", "_type": "employee", "_id": "123", "_retry_on_conflict" : 3} }
> { "doc" : {"title" : "My updated blog post"} }
> '
{"took":87,"errors":false,"items":[{"delete":{"_index":"megacorp","_type":"employee","_id":"123","_version":1,"result":"not_found","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":0,"_primary_term":1,"status":404}},{"create":{"_index":"megacorp","_type":"employee","_id":"123","_version":2,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":1,"_primary_term":1,"status":201}},{"index":{"_index":"megacorp","_type":"employee","_id":"8iZmy2MBAdBddqEKxy1b","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":9,"_primary_term":1,"status":201}},{"update":{"_index":"megacorp","_type":"employee","_id":"123","_version":3,"result":"updated","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":2,"_primary_term":1,"status":200}}]}
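One drawback: every operation has to specify the index and type, which is tedious. They can be given in the URL instead. Each operation then defaults to the index and type from the URL, and an operation that specifies its own uses its own.
$ curl -X POST "http://focuson1:9200/megacorp/employee/_bulk" -H 'Content-Type: application/json' -d'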
> { "delete": { "_id": "123" }}
> { "create": { "_id": "123" }}
> { "title": "My first blog post" }
> { "index": {}}
> { "title": "My second blog post" }
> { "update": {"_id": "123", "_retry_on_conflict" : 3} }
> { "doc" : {"title" : "My updated blog post"} }
> '
{"took":31,"errors":false,"items":[{"delete":{"_index":"megacorp","_type":"employee","_id":"123","_version":4,"result":"deleted","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":3,"_primary_term":1,"status":200}},{"create":{"_index":"megacorp","_type":"employee","_id":"123","_version":5,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":4,"_primary_term":1,"status":201}},{"index":{"_index":"megacorp","_type":"employee","_id":"8yZpy2MBAdBddqEKSi1M","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":10,"_primary_term":1,"status":201}},{"update":{"_index":"megacorp","_type":"employee","_id":"123","_version":6,"result":"updated","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":5,"_primary_term":1,"status":200}}]}
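The bulk body is newline-delimited JSON: one action line, followed by a source line for every action except delete, and a newline after the last line. Assembling it in code avoids formatting slips. The sketch below uses only Python's `json` module; `bulk_body` is a hypothetical helper, and the operations mirror the example above with the index and type left to the URL.

```python
import json


def bulk_body(operations):
    """Build a bulk request body from (action, metadata, source) tuples."""
    lines = []
    for action, meta, source in operations:
        lines.append(json.dumps({action: meta}))
        if source is not None:  # delete has no source line
            lines.append(json.dumps(source))
    # The body must end with a newline.
    return "\n".join(lines) + "\n"


ops = [
    ("delete", {"_id": "123"}, None),
    ("create", {"_id": "123"}, {"title": "My first blog post"}),
    ("index", {}, {"title": "My second blog post"}),
    ("update", {"_id": "123"}, {"doc": {"title": "My updated blog post"}}),
]
body = bulk_body(ops)
print(body, end="")
```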