
Elasticsearch 5.0 Document Index API     Category: elasticsearch

Index API

The following example inserts a JSON document into the "twitter" index, under a type called "tweet", with an id of 1:

PUT twitter/tweet/1
{
    "user" : "kimchy",
    "post_date" : "2009-11-15T14:12:12",
    "message" : "trying out Elasticsearch"
}

The result of the above index operation is:
{
    "_shards" : {
        "total" : 2,
        "failed" : 0,
        "successful" : 2
    },
    "_index" : "twitter",
    "_type" : "tweet",
    "_id" : "1",
    "_version" : 1,
    "created" : true,
    "result" : "created"
}

Replica shards may not all be started when an indexing operation successfully returns
(by default, only the primary is required, but this behavior can be changed).
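
As a rough illustration of reading the response above, the following Python sketch summarizes the _shards header. The function name summarize_shards is hypothetical, the input is just the example response as a plain dict (no live cluster is contacted), and treating "total minus successful minus failed" as copies that have not started yet is an assumed interpretation.

```python
def summarize_shards(response):
    """Summarize the _shards header of an index response given as a dict."""
    shards = response["_shards"]
    # Assumed interpretation: copies that neither succeeded nor failed,
    # for example replicas that were not started when the call returned.
    pending = shards["total"] - shards["successful"] - shards["failed"]
    return {
        "ok": shards["failed"] == 0 and shards["successful"] >= 1,
        "pending": pending,
    }


# The response from the example above, written as a Python dict.
example = {
    "_shards": {"total": 2, "failed": 0, "successful": 2},
    "_index": "twitter",
    "_type": "tweet",
    "_id": "1",
    "_version": 1,
    "created": True,
    "result": "created",
}
summary = summarize_shards(example)
```

Here summary reports no failed copies and nothing pending, because both shard copies acknowledged the write.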

Automatic Index Creation

The index operation automatically creates an index if it has not been created before,
and also automatically creates a dynamic type mapping for the specific type if one has not yet been created.
The mapping itself is very flexible and is schema-free.
New fields and objects will automatically be added to the mapping definition of the type specified.



Automatic index creation can be disabled by setting action.auto_create_index to false in the node configuration file.
Automatic mapping creation can be disabled by setting index.mapper.dynamic to false per-index as an index setting.

Automatic index creation can include a pattern based white/black list, 
for example, set action.auto_create_index to +aaa*,-bbb*,+ccc*,-* 
(+ meaning allowed, and - meaning disallowed).
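
To make the pattern list concrete, here is a small Python sketch of how such a setting could be evaluated. It is not the Elasticsearch implementation: the function name may_auto_create is hypothetical, and the sketch assumes that patterns are checked in order, that the first match decides, and that a name matching no pattern is not auto-created. It also uses fnmatch for the wildcard, which supports more syntax than a plain *.

```python
from fnmatch import fnmatchcase


def may_auto_create(index_name, setting):
    """Evaluate a +/- pattern list such as "+aaa*,-bbb*,+ccc*,-*".

    Assumptions: patterns are checked left to right, the first matching
    pattern decides, and a name that matches nothing is rejected.
    """
    for raw in setting.split(","):
        pattern = raw.strip()
        if not pattern:
            continue
        allowed = not pattern.startswith("-")
        if pattern[0] in "+-":
            pattern = pattern[1:]  # drop the allow/deny marker
        if fnmatchcase(index_name, pattern):
            return allowed
    return False


setting = "+aaa*,-bbb*,+ccc*,-*"
results = {name: may_auto_create(name, setting)
           for name in ("aaa1", "bbb-logs", "ccc", "zzz")}
```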


Each indexed document is given a version number.
The index API optionally allows for optimistic concurrency control when the version parameter is specified.

A good example of a use case for versioning is performing a transactional read-then-update.
Specifying a version from the document initially read ensures no changes have happened in the meantime
(when reading in order to update, it is recommended to set preference to _primary).
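
The read-then-update pattern can be sketched with a tiny in-memory stand-in for an index. Everything here (TinyIndex, VersionConflict, the method names) is hypothetical and only mimics the version check described above; it is not how Elasticsearch stores documents.

```python
class VersionConflict(Exception):
    """Raised when the provided version does not match the stored one."""


class TinyIndex:
    """In-memory stand-in for an index, used only to show the version check."""

    def __init__(self):
        self._docs = {}  # id -> (version, source)

    def get(self, doc_id):
        return self._docs[doc_id]

    def index(self, doc_id, source, version=None):
        current = self._docs.get(doc_id, (0, None))[0]
        # With an explicit version, the write succeeds only if it matches.
        if version is not None and version != current:
            raise VersionConflict(f"current={current}, provided={version}")
        self._docs[doc_id] = (current + 1, source)
        return current + 1


idx = TinyIndex()
idx.index("1", {"message": "trying out Elasticsearch"})  # stored as version 1
read_version, _ = idx.get("1")                            # reader sees version 1
idx.index("1", {"message": "concurrent write"})           # another writer: version 2
try:
    # The update is based on the stale version 1, so it must be rejected.
    idx.index("1", {"message": "stale update"}, version=read_version)
    conflict = False
except VersionConflict:
    conflict = True
```

A caller that hits the conflict would typically re-read the document and retry with the fresh version.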

PUT twitter/tweet/1?version=2
{
    "message" : "elasticsearch now has versioning support, double cool!"
}

Optionally, the version number can be supplemented with an external value
(for example, if maintained in a database).
To enable this functionality, version_type should be set to external.

The value provided must be a numeric, long value greater than or equal to 0.

If the value provided is less than or equal to the stored document’s version number, 
a version conflict will occur and the index operation will fail.

External version numbers may start at 0. As a side effect,
documents with a version number equal to zero can neither be updated using the Update-By-Query API
nor be deleted using the Delete By Query API as long as their version number is equal to zero.


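Version types
Here is an overview of the different version types and their semantics.

internal
only index the document if the given version is identical to the version of the stored document.

external or external_gt
only index the document if the given version is strictly higher than the version of the stored document,
or if there is no existing document. The given version is used as the new version and stored with the new document.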
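
The version checks described in this section can be written as a single predicate. This is a sketch under stated assumptions, not the server code: the function name write_allowed is hypothetical, None stands for "no stored document", and an internal check against a missing document is assumed to fail.

```python
def write_allowed(version_type, stored_version, given_version):
    """Return True if a write with given_version passes the version check.

    stored_version is None when no document exists yet.
    """
    if version_type == "internal":
        # The given version must be identical to the stored version.
        return stored_version is not None and given_version == stored_version
    if version_type in ("external", "external_gt"):
        if given_version < 0:
            raise ValueError("external versions must be >= 0")
        # Less than or equal to the stored version is a conflict.
        return stored_version is None or given_version > stored_version
    raise ValueError(f"unsupported version_type: {version_type}")


checks = [
    write_allowed("internal", 2, 2),      # same version: allowed
    write_allowed("internal", 2, 1),      # stale version: conflict
    write_allowed("external", 5, 5),      # equal external version: conflict
    write_allowed("external", 5, 6),      # strictly higher: allowed
    write_allowed("external", None, 0),   # no document yet, 0 is valid
]
```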


Operation Type

The index operation also accepts an op_type that can be used to force a create operation,
allowing for "put-if-absent" behavior.
When create is used, the index operation will fail if a document by that id already exists in the index.

PUT twitter/tweet/1?op_type=create
{
    "user" : "kimchy",
    "post_date" : "2009-11-15T14:12:12",
    "message" : "trying out Elasticsearch"
}

Another option to specify create is to use the following URI:

PUT twitter/tweet/1/_create
{
    "user" : "kimchy",
    "post_date" : "2009-11-15T14:12:12",
    "message" : "trying out Elasticsearch"
}

The index operation can also be executed without specifying an id.
In that case an id is generated automatically and the operation runs in create mode.
Note that POST is used instead of PUT.
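
The PUT versus POST rule and the query parameters used in this article can be captured in a small request builder. This is only a sketch: build_index_request is a hypothetical helper that returns the pieces of the HTTP request without sending anything.

```python
import json
from urllib.parse import quote, urlencode


def build_index_request(index, doc_type, doc, doc_id=None, **params):
    """Return (method, path, body) for an index request.

    Without an id the request is a POST and the id is generated by the
    server; with an id it is a PUT to index/type/id.
    """
    path = f"{quote(index, safe='')}/{quote(doc_type, safe='')}"
    method = "POST"
    if doc_id is not None:
        method = "PUT"
        path += f"/{quote(str(doc_id), safe='')}"
    if params:
        # Sort parameters so the resulting path is deterministic.
        path += "?" + urlencode(sorted(params.items()))
    return method, path, json.dumps(doc)


doc = {"user": "kimchy", "message": "trying out Elasticsearch"}
with_id = build_index_request("twitter", "tweet", doc, 1, op_type="create")
auto_id = build_index_request("twitter", "tweet", doc)
```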

By default, shard placement, or routing, is controlled by using a hash of the document’s id value.

For more explicit control, the value fed into the hash function used by the router
can be directly specified on a per-operation basis using the routing parameter.

POST twitter/tweet?routing=kimchy
{
    "user" : "kimchy",
    "post_date" : "2009-11-15T14:12:12",
    "message" : "trying out Elasticsearch"
}

In the example above, the "tweet" document is routed to a shard based on the routing parameter provided: "kimchy".
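
The idea of hashing a routing value to pick a shard can be sketched as follows. The helper pick_shard is hypothetical, and zlib.crc32 is only a stand-in hash chosen because it is deterministic; the hash function Elasticsearch actually uses is different, so the shard numbers produced here will not match a real cluster.

```python
import zlib


def pick_shard(routing_value, num_primary_shards):
    """Map a routing value to a shard number in [0, num_primary_shards)."""
    # Stand-in hash: crc32 is NOT what Elasticsearch uses internally.
    hashed = zlib.crc32(routing_value.encode("utf-8"))
    return hashed % num_primary_shards


# The same routing value always lands on the same shard, so all documents
# indexed with routing=kimchy end up together.
first = pick_shard("kimchy", 5)
second = pick_shard("kimchy", 5)
```

When no routing parameter is given, the document id plays the role of routing_value.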


A child document can be indexed by specifying its parent when indexing.

When indexing a child document, the routing value is automatically set to be the same as its parent,
unless the routing value is explicitly specified using the routing parameter.

The index operation is directed to the primary shard based on its route 
and performed on the actual node containing this shard. 
After the primary shard completes the operation, if needed, the update is distributed to applicable replicas.


To improve the resiliency of writes to the system, 
indexing operations can be configured to wait for a certain number of active shard copies 
before proceeding with the operation. 


By default, write operations only wait for the primary shards to be active before proceeding.

This default can be overridden in the index settings dynamically by setting index.write.wait_for_active_shards. 

Valid values are all or any positive integer up to the total number of configured copies per shard in the index 
(which is number_of_replicas+1). 
Specifying a negative value or a number greater than the number of shard copies will throw an error.

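If number_of_replicas is set to 0, a shard has no replicas and the primary is its only copy.

For example, in a cluster of 3 nodes with number_of_replicas set to 3, each shard has 4 copies (1 primary + 3 replicas).
By default, an indexing operation only requires the primary copy to be active.
With wait_for_active_shards set to 3, the operation requires 3 active shard copies before proceeding.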
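
The validation rules and the 3-node example can be checked with a short Python sketch. The function names required_active_copies and may_proceed are hypothetical; the logic only restates the rules from this section (valid values are all or a positive integer up to number_of_replicas+1).

```python
def required_active_copies(wait_for_active_shards, number_of_replicas):
    """Translate a wait_for_active_shards value into a number of copies."""
    total_copies = number_of_replicas + 1  # primary plus replicas
    if wait_for_active_shards == "all":
        return total_copies
    value = int(wait_for_active_shards)
    if value < 1 or value > total_copies:
        raise ValueError(
            f"wait_for_active_shards must be 'all' or 1..{total_copies}"
        )
    return value


def may_proceed(active_copies, wait_for_active_shards, number_of_replicas):
    """True if enough shard copies are active for the write to start."""
    needed = required_active_copies(wait_for_active_shards, number_of_replicas)
    return active_copies >= needed


# 3 nodes, number_of_replicas=3: 4 copies are configured, but only 3 of
# them can be active because there is one more copy than there are nodes.
default_ok = may_proceed(3, 1, 3)      # only the primary is required
three_ok = may_proceed(3, 3, 3)        # 3 active copies are enough
all_ok = may_proceed(3, "all", 3)      # the 4th copy can never be active
```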



When updating a document using the index api, a new version
of the document is always created even if the document hasn’t changed.
If this isn’t acceptable, use the _update api with detect_noop set to true.
This option isn’t available on the index api because the index api doesn’t fetch the old source
and isn’t able to compare it against the new source.
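
The difference between the two APIs can be illustrated with a dict-backed sketch. The helpers index_doc and update_doc are hypothetical and only mimic the behavior described above: indexing always bumps the version, while an update with detect_noop compares against the old source first.

```python
def index_doc(store, doc_id, source):
    """Index API behavior: never reads the old source, always writes."""
    version = store.get(doc_id, (0, None))[0] + 1
    store[doc_id] = (version, dict(source))
    return "created" if version == 1 else "updated"


def update_doc(store, doc_id, partial, detect_noop=True):
    """Update API behavior: merges into the old source and may skip the write."""
    version, old = store[doc_id]
    merged = {**old, **partial}
    if detect_noop and merged == old:
        return "noop"  # nothing changed, version stays the same
    store[doc_id] = (version + 1, merged)
    return "updated"


store = {}
doc = {"user": "kimchy"}
r1 = index_doc(store, "1", doc)    # first write
r2 = index_doc(store, "1", doc)    # identical source, still a new version
r3 = update_doc(store, "1", doc)   # identical source, detected as noop
version_after = store["1"][0]
```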



The primary shard assigned to perform the index operation 
might not be available when the index operation is executed. 
Some reasons for this might be that the primary shard 
is currently recovering from a gateway or undergoing relocation. 
By default, the index operation will wait on the primary shard to become available 
for up to 1 minute before failing and responding with an error. 
The timeout parameter can be used to explicitly specify how long it waits. 
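
Timeout values are written as a number followed by a unit, such as 1m or 5m. The sketch below converts such a string to milliseconds. The function parse_timeout_ms is hypothetical and handles only the units ms, s, m, h and d, which is an assumption about the accepted syntax rather than the full parser.

```python
import re

# Milliseconds per supported unit (assumed subset of the time units).
_UNIT_MS = {"ms": 1, "s": 1000, "m": 60_000, "h": 3_600_000, "d": 86_400_000}


def parse_timeout_ms(value):
    """Convert a time value such as "5m" into milliseconds."""
    match = re.fullmatch(r"(\d+)(ms|s|m|h|d)", value)
    if match is None:
        raise ValueError(f"unsupported time value: {value!r}")
    amount, unit = match.groups()
    return int(amount) * _UNIT_MS[unit]


default_wait = parse_timeout_ms("1m")   # the default wait of 1 minute
custom_wait = parse_timeout_ms("5m")    # a 5 minute timeout
```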

Here is an example of setting it to 5 minutes:

PUT twitter/tweet/1?timeout=5m
{
    "user" : "kimchy",
    "post_date" : "2009-11-15T14:12:12",
    "message" : "trying out Elasticsearch"
}
