Elasticsearch 5.0 Index State Management

Clear Cache

POST /twitter/_cache/clear

By default, this will clear all caches.
Specific caches can be cleared explicitly by setting the query, fielddata, or request parameters.

POST /kimchy,elasticsearch/_cache/clear

POST /_cache/clear
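The cache-type parameters above are passed as query-string flags. A minimal offline sketch that only assembles the request URLs (the helper name is ours; the endpoint and the query/fielddata/request flags come from the API described above):

```python
from urllib.parse import urlencode

def clear_cache_url(base, indices=None, **caches):
    """Build a _cache/clear URL; caches are flags like fielddata=True."""
    path = f"/{','.join(indices)}/_cache/clear" if indices else "/_cache/clear"
    params = {k: str(v).lower() for k, v in sorted(caches.items())}
    return base + path + (("?" + urlencode(params)) if params else "")

# Clear only the fielddata cache of the twitter index.
print(clear_cache_url("http://localhost:9200", ["twitter"], fielddata=True))
# http://localhost:9200/twitter/_cache/clear?fielddata=true
```

Omitting all flags reproduces the default behavior of clearing every cache.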


Flush

The flush API flushes one or more indices.

The flush process of an index basically frees memory from the index 
by flushing data to the index storage and clearing the internal transaction log. 
By default, Elasticsearch uses memory heuristics in order to 
automatically trigger flush operations as required in order to clear memory.

Flushing writes data to the index storage and clears the transaction log.

POST twitter/_flush


wait_if_ongoing — if set to true, the flush operation will block until it can be executed
when another flush operation is already running.
The default is false, which causes an exception to be thrown at the shard level
if another flush operation is already running.

Defaults to false; if another flush operation is already in progress, an exception is thrown.


force — whether a flush should be forced even if it is not necessarily needed,
i.e. if no changes will be committed to the index.
This is useful if transaction log IDs should be incremented even when
no uncommitted changes are present.
(This setting can be considered internal.)


POST kimchy,elasticsearch/_flush

POST _flush
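The two parameters above are query-string options on the flush call. A sketch that builds (but does not send) such a request, so it runs without a cluster; the helper name is ours, while the endpoint and the wait_if_ongoing/force flags are the ones documented above:

```python
from urllib.request import Request

def flush_request(base, index=None, wait_if_ongoing=False, force=False):
    """Build (but do not send) a POST _flush request with the two flags."""
    path = f"/{index}/_flush" if index else "/_flush"
    params = []
    if wait_if_ongoing:
        params.append("wait_if_ongoing=true")
    if force:
        params.append("force=true")
    url = base + path + ("?" + "&".join(params) if params else "")
    return Request(url, method="POST")

req = flush_request("http://localhost:9200", "twitter", wait_if_ongoing=True)
print(req.get_method(), req.full_url)
# POST http://localhost:9200/twitter/_flush?wait_if_ongoing=true
```

Sending it would be a single `urllib.request.urlopen(req)` call against a live node.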

Synced Flush

Elasticsearch tracks the indexing activity of each shard. 
Shards that have not received any indexing operations for 5 minutes are automatically marked as inactive. 
This presents an opportunity for Elasticsearch to reduce shard resources 
and also perform a special kind of flush, called synced flush. 
A synced flush performs a normal flush, 
then adds a generated unique marker (sync_id) to all shards.


Since the sync id marker was added when there were no ongoing indexing operations,
it can be used as a quick way to check whether two shard copies' Lucene indices are identical.
This quick sync id comparison (if present) is used during recovery 
or restarts to skip the first and most costly phase of the process. 
In that case, no segment files need to be copied 
and the transaction log replay phase of the recovery can start immediately. 
Note that since the sync id marker was applied together with a flush, 
it is very likely that the transaction log will be empty, 
speeding up recoveries even more.

Since the sync_id marker is added while no indexing operations are in progress, it can be used as a quick check that shard copies are identical.

This is particularly useful for use cases having lots of indices 
which are never or very rarely updated, such as time based data. 
This use case typically generates lots of indices whose recovery 
without the synced flush marker would take a long time.


To check whether a shard has a marker,
look at the commit section of the shard stats returned by the indices stats API.

The marker appears in the commit section, for example:
sync_id AWhZn76Lt8zQvWGuVKCx
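Pulling the marker out of a stats response and comparing copies can be sketched as follows. The response layout here is an assumption modeled on shard-level stats (`_stats?level=shards`), trimmed to the commit section, and the sample sync_id is the one shown above:

```python
# Assumed (trimmed) shape of a shard-level indices-stats response.
stats = {
    "indices": {
        "twitter": {
            "shards": {
                "0": [
                    {"commit": {"user_data": {"sync_id": "AWhZn76Lt8zQvWGuVKCx"}}},
                    {"commit": {"user_data": {"sync_id": "AWhZn76Lt8zQvWGuVKCx"}}},
                ]
            }
        }
    }
}

def shard_sync_ids(stats, index, shard):
    """Return the sync_id of each copy of a shard (None if unmarked)."""
    copies = stats["indices"][index]["shards"][shard]
    return [c.get("commit", {}).get("user_data", {}).get("sync_id") for c in copies]

ids = shard_sync_ids(stats, "twitter", "0")
# Identical, non-None markers mean the copies' Lucene indices match,
# so recovery can skip the segment-copy phase.
print(len(set(ids)) == 1 and ids[0] is not None)
# True
```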

POST twitter/_flush/synced
POST _flush/synced
POST kimchy,elasticsearch/_flush/synced

Any ongoing indexing operations will cause the synced flush to fail on that shard. 

The sync_id marker is removed as soon as the shard is flushed again. 

It is harmless to request a synced flush while there is ongoing indexing. 
Shards that are idle will succeed and shards that are not will fail. 
Any shards that succeeded will have faster recovery times.


A synced flush fails on shards with concurrent indexing operations;
the HTTP status code in that case is 409 CONFLICT.
Sometimes the failures are specific to individual shard copies.
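Checking the outcome of a synced flush can be sketched against the `_shards` summary that the response carries; the counts below are illustrative, and the summary shape is an assumption based on the common Elasticsearch `_shards` pattern:

```python
# Assumed _shards summary of a POST _flush/synced response (counts invented).
resp = {"_shards": {"total": 10, "successful": 8, "failed": 2}}

def synced_flush_ok(resp):
    """True when every shard copy accepted the sync marker."""
    return resp["_shards"]["failed"] == 0

# Failed copies were busy indexing; they still recover normally, just
# without the fast sync_id path. A non-zero failed count corresponds
# to the 409 CONFLICT status mentioned above.
print(synced_flush_ok(resp))
# False
```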

Synced flush vs. flush

When restarting a node, comparing the shards' synced flush IDs shows whether two shard copies are identical,
avoiding unnecessary segment file copying and greatly speeding up recovery of cold indices.
Synced flush only helps cold indices; it has no effect on hot indices (indices updated within the last 5 minutes).

Before restarting a node, to speed up recovery, follow these steps:
1. Disable cluster shard allocation.
2. Manually run POST /_flush/synced.
3. Re-enable cluster shard allocation.
4. Wait for recovery to complete and the cluster health status to turn green.
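The steps above can be written out as a request plan. This is a sketch: the sequence and endpoints follow the procedure above, and `cluster.routing.allocation.enable` is the standard transient setting used to toggle shard allocation:

```python
import json

# The restart procedure as an ordered list of (method, path, body) requests.
steps = [
    ("PUT", "/_cluster/settings",
     {"transient": {"cluster.routing.allocation.enable": "none"}}),
    ("POST", "/_flush/synced", None),
    # ... restart the node here, then re-enable allocation ...
    ("PUT", "/_cluster/settings",
     {"transient": {"cluster.routing.allocation.enable": "all"}}),
    ("GET", "/_cluster/health?wait_for_status=green", None),
]

for method, path, body in steps:
    print(method, path, "" if body is None else json.dumps(body))
```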

For cold indices, since the data is no longer updated, the synced flush feature allows fast recovery directly from local data.
For hot indices, especially those with large shards, synced flush does not help:
besides the large amount of segment files copied across nodes, translog replay makes recovery slow.

During recovery, a snapshot of the primary shard's segment files is taken and copied to the node where the replica is allocated. Data copying does not block indexing requests; new indexing operations are recorded in the translog.


Refresh

The refresh API allows you to explicitly refresh one or more indices,
making all operations performed since the last refresh visible to search.
The (near) real-time capabilities depend on the index engine used.

After a refresh, index updates become visible to search.

POST /twitter/_refresh

POST /kimchy,elasticsearch/_refresh

POST /_refresh

Force Merge

The merge relates to the number of segments a Lucene index holds within each shard.
The force merge operation reduces the number of segments by merging them.

Merging segment files reduces the number of segments.

POST /twitter/_forcemerge
POST /kimchy,elasticsearch/_forcemerge
POST /_forcemerge

This call blocks until the merge is complete.
If the HTTP connection is lost, the request continues in the background,
and any new requests will block until the previous force merge is complete.

max_num_segments — the number of segments to merge down to. To fully merge the index, set it to 1.
Defaults to simply checking whether a merge needs to execute and, if so, executing it.

only_expunge_deletes — should the merge process only expunge segments with deletes in them.
In Lucene, a document is not deleted from a segment, just marked as deleted.
During a merge of segments,
a new segment is created that does not contain those deletes.
This flag allows merging only segments that contain deleted documents.
Defaults to false. Note that this won't override the
index.merge.policy.expunge_deletes_allowed threshold.

When set to true, only segments containing deleted documents are merged. Defaults to false.

flush — should a flush be performed after the forced merge. Defaults to true.

A flush is performed after the merge by default.
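The three options above are again query-string parameters. A small offline sketch assembling the call (the helper name is ours; the endpoint and parameters are the ones documented above):

```python
from urllib.parse import urlencode

def forcemerge_url(base, index=None, **params):
    """Build a _forcemerge URL from the documented options."""
    path = f"/{index}/_forcemerge" if index else "/_forcemerge"
    qs = urlencode({k: str(v).lower() for k, v in sorted(params.items())})
    return base + path + ("?" + qs if qs else "")

# Fully merge a cold (no-longer-written) index down to one segment per shard.
print(forcemerge_url("http://localhost:9200", "twitter", max_num_segments=1))
# http://localhost:9200/twitter/_forcemerge?max_num_segments=1
```

Merging down to one segment is only advisable for indices that will no longer be written to.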
