
Elasticsearch 5.0 Delete By Query API
Category: elasticsearch
Published: 2019-01-23, last modified: 2019-01-23
Translated and compiled from the original documentation:
https://www.elastic.co/guide/en/elasticsearch/reference/5.0/docs-delete-by-query.html

Deleting documents that match a query

Get the index settings and mappings, and search the existing documents:
http://127.0.0.1:9200/twitter

http://127.0.0.1:9200/twitter/tweet/_search

http://127.0.0.1:9200/customer
http://127.0.0.1:9200/customer/type1/_search


Index two sample documents (the URL is quoted so the shell does not interpret the "?"):

curl -X PUT 'http://127.0.0.1:9200/customer/type1/2?pretty' -d '{"name":"dog"}'
curl -X PUT 'http://127.0.0.1:9200/customer/type1/3?pretty' -d '{"name":"dog"}'

Delete the documents where name=dog:

curl -X POST 'http://127.0.0.1:9200/customer/type1/_delete_by_query?pretty' -d '{"query":{"match":{"name":"dog"}}}'


{
  "took" : 105,
  "timed_out" : false,
  "total" : 2,
  "deleted" : 2,
  "batches" : 1,
  "version_conflicts" : 0,
  "noops" : 0,
  "retries" : {
    "bulk" : 0,
    "search" : 0
  },
  "throttled_millis" : 0,
  "requests_per_second" : -1.0,
  "throttled_until_millis" : 0,
  "failures" : [ ]
}
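
The same request can be assembled from a script. The following is a minimal sketch using only the Python standard library; the helper names build_delete_by_query and fully_deleted are made up for this note and are not part of any Elasticsearch client. It builds the request without sending it and checks a response of the shape shown above.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_delete_by_query(base_url, index, doc_type, query, **params):
    """Build (but do not send) a POST <index>/<type>/_delete_by_query request."""
    url = f"{base_url}/{index}/{doc_type}/_delete_by_query"
    if params:
        # URL parameters such as pretty, conflicts, routing or scroll_size
        url += "?" + urlencode(params)
    body = json.dumps({"query": query}).encode("utf-8")
    return Request(url, data=body, method="POST",
                   headers={"Content-Type": "application/json"})


def fully_deleted(response):
    """True when every matched document was deleted and nothing failed."""
    return (not response["timed_out"]
            and not response["failures"]
            and response["deleted"] == response["total"])


req = build_delete_by_query("http://127.0.0.1:9200", "customer", "type1",
                            {"match": {"name": "dog"}}, pretty="true")
print(req.get_method(), req.full_url)
```

To actually run the delete, pass req to urllib.request.urlopen and feed the decoded JSON body to fully_deleted.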


_delete_by_query gets a snapshot of the index when it starts 
and deletes what it finds using internal versioning. 
That means that you’ll get a version conflict if the document changes 
between the time when the snapshot was taken and when the delete request is processed. 
When the versions match the document is deleted.

Key points: snapshot, internal versioning.

Since internal versioning does not support the value 0 as a valid version number, 
documents with version equal to zero cannot be deleted using _delete_by_query and will fail the request.

In short: a document whose version is 0 cannot be deleted with _delete_by_query.
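
The snapshot-and-version check can be pictured with a toy model. This is only an illustration in plain Python dictionaries, not how Elasticsearch stores documents; the function name and the data layout are assumptions. It counts conflicts instead of aborting, which corresponds to running with conflicts=proceed.

```python
def delete_with_snapshot(index, snapshot):
    """Toy model: delete a doc only if its version still matches the snapshot.

    index:    {doc_id: current_version}, modified in place
    snapshot: {doc_id: version seen when the snapshot was taken}
    """
    result = {"deleted": 0, "version_conflicts": 0}
    for doc_id, seen_version in snapshot.items():
        if index.get(doc_id) == seen_version:
            del index[doc_id]                 # versions match: delete
            result["deleted"] += 1
        else:
            result["version_conflicts"] += 1  # changed since the snapshot
    return result


index = {"2": 1, "3": 1}
snapshot = dict(index)   # taken when the request starts
index["3"] = 2           # document 3 is updated before its delete is processed
print(delete_with_snapshot(index, snapshot))
```

Document 2 is removed, while document 3 survives and is reported as a version conflict.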

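While _delete_by_query is running, multiple search requests are executed sequentially to find all the matching documents to delete.
Each time a batch of documents is found, a corresponding bulk request is executed to delete them. If a search or bulk request is rejected,
_delete_by_query relies on a default policy to retry the rejected request (up to 10 times, with exponential backoff).
Reaching the maximum retry limit aborts the _delete_by_query; the deletions already performed remain in effect, so the process is aborted, not rolled back.
Although the first failure causes the abort, all failures returned by the failing bulk request are reported in the failures element of the response.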
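
The retry policy described above can be sketched as follows. This is a generic exponential backoff loop, not the Elasticsearch implementation: the Rejected exception, the with_retries helper and the 0.5 second initial delay are all assumptions chosen for the example; only the limit of 10 retries comes from the text.

```python
import time


class Rejected(Exception):
    """Hypothetical marker for a rejected search or bulk request."""


def with_retries(send, max_retries=10, initial_delay=0.5, sleep=time.sleep):
    """Call send(); on rejection wait and retry, doubling the delay each time."""
    delay = initial_delay
    for attempt in range(max_retries + 1):
        try:
            return send()
        except Rejected:
            if attempt == max_retries:
                raise          # retry limit reached: abort, nothing is rolled back
            sleep(delay)
            delay *= 2


calls = {"n": 0}


def flaky():
    """Rejected three times, then succeeds."""
    calls["n"] += 1
    if calls["n"] <= 3:
        raise Rejected()
    return "ok"


waits = []                     # record the delays instead of really sleeping
print(with_retries(flaky, sleep=waits.append), waits)
```

Passing sleep as a parameter keeps the example fast and makes the growing delays visible.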

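To count version conflicts instead of aborting on them, set conflicts=proceed on the URL, or "conflicts": "proceed" in the request body.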

You can delete documents from multiple indexes and multiple types at once:

POST twitter,blog/tweet,post/_delete_by_query
{
  "query": {
    "match_all": {}
  }
}

To limit the operation to the shards that match a routing value, add routing to the URL, for example routing=1.

By default _delete_by_query uses scroll batches of 1000. 
You can change the batch size with the scroll_size URL parameter:

scroll_size=5000
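
As a rough guide to what scroll_size changes, the number of batches can be estimated from the number of matching documents. This is a back-of-the-envelope sketch that assumes every batch except the last is full; expected_batches is a name invented here.

```python
import math


def expected_batches(total, scroll_size=1000):
    """Estimated number of scroll batches needed to cover `total` documents."""
    return math.ceil(total / scroll_size)


print(expected_batches(2))             # the two "dog" documents fit in one batch
print(expected_batches(12000, 5000))   # a larger delete with scroll_size=5000
```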

In addition to the standard parameters like pretty, 
the Delete By Query API also supports refresh, wait_for_completion, wait_for_active_shards, and timeout.


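wait_for_completion=false

If the request contains wait_for_completion=false, Elasticsearch will perform some preflight checks, launch the request,
and then return a task which can be used with Tasks APIs to cancel or get the status of the task.
In other words, the delete runs asynchronously.

Elasticsearch will also create a record of this task as a document at .tasks/task/${taskId}.

Tasks can be listed at: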

http://127.0.0.1:9200/_cat/tasks?v
http://127.0.0.1:9200/_tasks

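requests_per_second
Throttles the number of requests per second that the delete by query issues. Setting it to -1 disables throttling, which is why the first response above reports -1.0.

Throttling is done by waiting between bulk batches, so a large batch size produces a burst of requests followed by a pause: the load is "bursty" rather than "smooth".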

{
  "took" : 639,
  "deleted": 0,
  "batches": 1,
  "version_conflicts": 2,
  "retries": 0,
  "throttled_millis": 0,
  "failures" : [ ]
}
throttled_millis
Number of milliseconds the request slept to conform to requests_per_second.
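
One way to picture where throttled_millis comes from is a simple pacing model: after each batch, sleep long enough that the average rate does not exceed requests_per_second. The function below is a simplified sketch of that idea, not code taken from Elasticsearch, and its name and parameters are assumptions.

```python
def throttle_wait_seconds(batch_size, requests_per_second, batch_seconds):
    """Seconds to sleep after a batch so the average rate stays at the target."""
    if requests_per_second <= 0:          # -1 means throttling is disabled
        return 0.0
    target_seconds = batch_size / requests_per_second
    # Never sleep a negative amount if the batch was already slow enough.
    return max(0.0, target_seconds - batch_seconds)


# A batch of 1000 at 500 per second should take 2 s; it took 0.5 s.
print(throttle_wait_seconds(1000, 500, 0.5))
```

Under this model a larger batch means a longer pause after each burst, which matches the "bursty" behaviour noted for requests_per_second.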

You can fetch the status of any running delete-by-query requests with the Task API:
GET _tasks?detailed=true&actions=*/delete/byquery

You can estimate the progress by adding the updated, created, and deleted fields.
The request will finish when their sum is equal to the total field.
This makes it possible to follow a task's status while it is running.
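
The progress estimate is a one-line calculation. The sketch below assumes status is a dictionary holding the total, updated, created and deleted counters of a task; the sample numbers are invented, and treating an empty task (total of 0) as complete is a choice made for this example.

```python
def progress(status):
    """Fraction of the work done, from a task's status counters."""
    done = status["updated"] + status["created"] + status["deleted"]
    total = status["total"]
    return done / total if total else 1.0


sample = {"total": 200, "updated": 0, "created": 0, "deleted": 50}
print(f"{progress(sample):.0%}")
```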

With the task id you can look up the task directly:

GET /_tasks/taskId:1

Cancelling a task

Any Delete By Query can be canceled using the Task Cancel API:
POST _tasks/task_id:1/_cancel


The requests_per_second value can be changed on a running delete by query with the _rethrottle API:
POST _delete_by_query/task_id:1/_rethrottle?requests_per_second=-1
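
The three task-related calls (status, cancel, rethrottle) differ only in method and path, so they can be derived from a task id. A small sketch, where task_requests is a hypothetical helper and the base URL is the local node used throughout these notes:

```python
def task_requests(task_id, requests_per_second=-1,
                  base_url="http://127.0.0.1:9200"):
    """Return (HTTP method, URL) pairs for managing one delete-by-query task."""
    return {
        "status": ("GET", f"{base_url}/_tasks/{task_id}"),
        "cancel": ("POST", f"{base_url}/_tasks/{task_id}/_cancel"),
        "rethrottle": (
            "POST",
            f"{base_url}/_delete_by_query/{task_id}/_rethrottle"
            f"?requests_per_second={requests_per_second}",
        ),
    }


for name, (method, url) in task_requests("task_id:1").items():
    print(name, method, url)
```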

Delete-by-query supports Sliced Scroll, allowing you to manually parallelize the process relatively easily:

POST twitter/_delete_by_query
{
  "slice": {
    "id": 0,
    "max": 2
  },
  "query": {
    "range": {
      "likes": {
        "lt": 10
      }
    }
  }
}
POST twitter/_delete_by_query
{
  "slice": {
    "id": 1,
    "max": 2
  },
  "query": {
    "range": {
      "likes": {
        "lt": 10
      }
    }
  }
}
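
Writing one request body per slice by hand gets tedious beyond two slices. The bodies above can be generated instead; sliced_bodies is a name made up for this sketch, and sending each body as its own POST twitter/_delete_by_query request (for example from separate threads) is left to the caller.

```python
import json


def sliced_bodies(query, slices):
    """One request body per slice, each carrying the same query."""
    return [{"slice": {"id": i, "max": slices}, "query": query}
            for i in range(slices)]


bodies = sliced_bodies({"range": {"likes": {"lt": 10}}}, 2)
for body in bodies:
    print(json.dumps(body))
```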

http://127.0.0.1:9200/_refresh
