Elasticsearch 5.0 Document Get API

Category: elasticsearch
Translated and adapted from the original documentation:
https://www.elastic.co/guide/en/elasticsearch/reference/5.0/docs-get.html

The get API retrieves a typed JSON document from an index by its id.
The following request fetches the document with id 1 and type tweet from the twitter index:

curl -XGET 'http://localhost:9200/twitter/tweet/1'

The response is:
{
    "_index" : "twitter",
    "_type" : "tweet",
    "_id" : "1",
    "_version" : 1,
    "found": true,
    "_source" : {
        "user" : "kimchy",
        "postDate" : "2009-11-15T14:12:12",
        "message" : "trying out Elasticsearch"
    }
}
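
As a minimal sketch of how a client might handle this response, the following Python snippet parses the sample JSON shown above and returns the _source only when found is true. The helper name extract_source is hypothetical and is not part of any Elasticsearch client.

```python
import json

# Sample response copied from the get request above.
RESPONSE = """
{
    "_index" : "twitter",
    "_type" : "tweet",
    "_id" : "1",
    "_version" : 1,
    "found": true,
    "_source" : {
        "user" : "kimchy",
        "postDate" : "2009-11-15T14:12:12",
        "message" : "trying out Elasticsearch"
    }
}
"""


def extract_source(raw):
    """Return _source as a dict, or None if the document was not found
    or the response carries no _source."""
    body = json.loads(raw)
    if not body.get("found", False):
        return None
    return body.get("_source")


source = extract_source(RESPONSE)
print(source["user"], "-", source["message"])
```

Checking found before reading _source keeps the caller from treating a missing document as an empty one.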

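The response includes the _index, _type, _id, and _version of the requested document, plus its _source if the document was found (as indicated by the found field).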

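To check whether a document exists without fetching it, send a HEAD request:

curl -I 'http://localhost:9200/twitter/tweet/1'

The request returns 200 if the document exists and 404 if it does not.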
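
The same existence check can be sketched in Python with the standard library. The function names build_exists_request and document_exists are hypothetical, and the host and port are taken from the examples in this article. The snippet only builds the request at import time; call document_exists yourself against a running node.

```python
import urllib.error
import urllib.request


def build_exists_request(base_url, index, doc_type, doc_id):
    """Build a HEAD request for /{index}/{type}/{id}."""
    url = "{}/{}/{}/{}".format(base_url.rstrip("/"), index, doc_type, doc_id)
    return urllib.request.Request(url, method="HEAD")


def document_exists(request, timeout=5.0):
    """Send the request: True on 200, False on 404.
    Any other HTTP error status is re-raised."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise


request = build_exists_request("http://localhost:9200", "twitter", "tweet", "1")
print(request.get_method(), request.full_url)
```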

Realtime
By default, the get API is realtime and is not affected by the refresh rate of the index (the rate at which data becomes visible to search).
If a document has been updated but not yet refreshed, the get API issues a refresh call in place to make the document visible.
This also makes other documents changed since the last refresh visible.
To disable realtime GET, set the realtime parameter to false.
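
A small sketch of building such a request URL in Python. The helper get_url is hypothetical; it only shows that the parameter is passed in the query string as the lowercase literal false.

```python
from urllib.parse import urlencode


def get_url(index, doc_type, doc_id, base_url="http://localhost:9200", **params):
    """Build a get API URL; Python booleans become lowercase true/false."""
    url = "{}/{}/{}/{}".format(base_url, index, doc_type, doc_id)
    query = {
        key: str(value).lower() if isinstance(value, bool) else value
        for key, value in params.items()
    }
    return url + ("?" + urlencode(query) if query else "")


print(get_url("twitter", "tweet", "1", realtime=False))
# http://localhost:9200/twitter/tweet/1?realtime=false
```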

Optional Type
The get API allows _type to be optional.
Set it to _all to fetch the first document matching the id across all types.

Source filtering
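By default, the get operation returns the contents of the _source field. Setting _source=false omits it:
http://localhost:9200/twitter/tweet/1?_source=false

To return only part of the source, use the _source_include and _source_exclude parameters. Both accept a comma-separated list of fields or wildcard expressions:
http://localhost:9200/twitter/tweet/1?_source_include=*.id&_source_exclude=entities

When only includes are needed, the shorter _source notation works as well:
http://localhost:9200/twitter/tweet/1?_source=*.id,retweeted
http://localhost:9200/twitter/tweet/1?_source=user,message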
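
For illustration, the include and exclude lists can be turned into a query string as follows. The function source_filter_url is a hypothetical helper; only the parameter names _source_include and _source_exclude come from the text above.

```python
from urllib.parse import urlencode

BASE = "http://localhost:9200/twitter/tweet/1"


def source_filter_url(include=None, exclude=None, base=BASE):
    """Build a get URL with _source_include / _source_exclude filters."""
    params = {}
    if include:
        params["_source_include"] = ",".join(include)
    if exclude:
        params["_source_exclude"] = ",".join(exclude)
    # Keep "*" and "," literal so the URL reads like the examples above.
    query = urlencode(params, safe="*,")
    return base + ("?" + query if query else "")


print(source_filter_url(include=["*.id"], exclude=["entities"]))
# http://localhost:9200/twitter/tweet/1?_source_include=*.id&_source_exclude=entities
```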

Stored Fields
The get operation can return a set of stored fields, specified with the stored_fields parameter.
Requested fields that are not stored are ignored.

http://localhost:9200/twitter/tweet/1?stored_fields=tags,counter

Only leaf fields can be returned via the stored_fields option.

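Generated fields
If no refresh has occurred since the document was indexed, GET reads the document from the transaction log.
However, some fields are generated only at index time. Requesting such a field raises an exception by default.
To ignore generated fields when the transaction log is accessed, set ignore_errors_on_generated_fields=true.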


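Getting the _source directly
Use the /{index}/{type}/{id}/_source endpoint to get only the _source of the document, without the surrounding metadata: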
http://localhost:9200/twitter/tweet/1/_source

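Routing
If a document was indexed with a custom routing value, the same value must be provided to get it:
curl -XGET 'http://localhost:9200/twitter/tweet/1?routing=kimchy'
With an incorrect routing value, the document will not be found.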
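
Why a wrong routing value misses the document can be sketched as follows. The snippet assumes the shard is chosen as hash(routing) % number_of_primary_shards, with the document id used as the routing value when none is given. Elasticsearch uses its own Murmur3-based hash; zlib.crc32 is swapped in here purely as a stand-in, and the shard count of 5 and the name pick_shard are illustrative assumptions.

```python
import zlib

NUM_PRIMARY_SHARDS = 5  # assumed for illustration


def pick_shard(doc_id, routing=None, num_shards=NUM_PRIMARY_SHARDS):
    """Map a routing value (or the id when routing is absent) to a shard number."""
    key = routing if routing is not None else doc_id
    # crc32 is a stand-in hash, not the one Elasticsearch actually uses.
    return zlib.crc32(key.encode("utf-8")) % num_shards


indexed_on = pick_shard("1", routing="kimchy")  # shard chosen at index time
looked_up_on = pick_shard("1")                  # routing omitted on the get
print("indexed on shard", indexed_on, "/ looked up on shard", looked_up_on)
```

If the two shard numbers differ, the get request looks on a shard that does not hold the document and reports it as not found, even though it exists elsewhere in the index.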

Preference
The preference parameter controls which shard replicas the get request is executed on.
By default, the operation is randomized between the shard replicas.

The preference can be set to:

_primary
The operation is executed only on the primary shards.

_local
The operation prefers to be executed on a locally allocated shard if possible.

Custom (string) value
A custom value guarantees that the same shards are used for the same custom value.
A sample value can be something like the web session id or the user name.
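
A rough sketch of how these preference values could influence which copy serves a get request. This is a simplified model, not Elasticsearch's implementation: the copy and node names, the select_copy function, and the use of zlib.crc32 to pin a custom value to a copy are all assumptions made for illustration.

```python
import random
import zlib


def select_copy(copies, preference=None, local_node=None, rng=random):
    """Pick one shard copy from a list of dicts with 'node' and 'primary' keys."""
    if preference == "_primary":
        return next(c for c in copies if c["primary"])
    if preference == "_local":
        local = [c for c in copies if c["node"] == local_node]
        # Fall back to a random copy when nothing is allocated locally.
        return local[0] if local else rng.choice(copies)
    if preference:
        # Custom string: the same value always maps to the same copy.
        return copies[zlib.crc32(preference.encode("utf-8")) % len(copies)]
    # Default: randomize between the copies.
    return rng.choice(copies)


copies = [
    {"node": "node-1", "primary": True},
    {"node": "node-2", "primary": False},
    {"node": "node-3", "primary": False},
]
print(select_copy(copies, "_primary")["node"])  # node-1
```

Passing a session id or user name as the preference keeps one user's reads on the same copy, which is the behavior the custom value is described as guaranteeing.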

Refresh
The refresh parameter can be set to true in order to refresh the relevant shard before the get operation and make it searchable.
Set it to true only after careful thought and verification that this does not place a heavy load on the system (and slow down indexing).

Distributed
The get operation is hashed into a specific shard id.
It is then redirected to one of the replicas within that shard id and returns the result.
The replicas are the primary shard and its replicas within that shard id group, and one of the active copies is picked at random for the read.
This means that the more replicas there are, the better GET scales.

Points to weigh here: CAP (data consistency, availability) and performance.