Elasticsearch 5.0: Tuning for Disk Usage

Translated and organized from the original documentation:
https://www.elastic.co/guide/en/elasticsearch/reference/5.0/tune-for-disk-usage.html



Disable the features you do not need

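Suppose you have a numeric field named foo that you need to run distribution statistics (such as histograms) on, but never need to filter on. You can disable indexing for that field in the mapping: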

PUT index
{
  "mappings": {
    "type": {
      "properties": {
        "foo": {
          "type": "integer",
          "index": false
        }
      }
    }
  }
}
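
Because doc values remain enabled by default, aggregations on foo should still work after indexing is disabled. As a rough sketch (the aggregation name foo_distribution and the interval of 10 are illustrative assumptions, not part of the original), a histogram request might look like this:

# foo_distribution and the interval value are hypothetical, chosen only for illustration
GET index/_search
{
  "size": 0,
  "aggs": {
    "foo_distribution": {
      "histogram": {
        "field": "foo",
        "interval": 10
      }
    }
  }
}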


text fields store normalization factors (norms) in the index so that documents can be scored.
If you only need matching on a text field and do not care about the resulting scores,
you can configure Elasticsearch not to write norms to the index:

PUT index
{
  "mappings": {
    "type": {
      "properties": {
        "foo": {
          "type": "text",
          "norms": false
        }
      }
    }
  }
}

text fields also store frequencies and positions in the index by default.
Frequencies are used to compute scores, and positions are used to run phrase queries.
If you do not need to run phrase queries, you can tell Elasticsearch not to index positions:

PUT index
{
  "mappings": {
    "type": {
      "properties": {
        "foo": {
          "type": "text",
          "index_options": "freqs"
        }
      }
    }
  }
}

Furthermore, if you do not care about scoring either,
you can configure Elasticsearch to index only the matching documents for every term.
You will still be able to search on this field,
but phrase queries will raise errors, and scoring will assume that terms appear only once in every document.
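
A minimal sketch of that setup, assuming the same placeholder field foo as in the earlier examples: setting index_options to docs records only which documents contain each term, and norms are disabled as well because scores are not needed.

# foo is the placeholder field from the earlier examples; adjust to your own mapping
PUT index
{
  "mappings": {
    "type": {
      "properties": {
        "foo": {
          "type": "text",
          "norms": false,
          "index_options": "docs"
        }
      }
    }
  }
}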

Don’t use default dynamic string mappings

The default dynamic string mappings will index string fields both as text and keyword. 
This is wasteful if you only need one of them. 
Typically an id field will only need to be indexed as a keyword 
while a body field will only need to be indexed as a text field.

This can be disabled by either configuring explicit mappings on string fields 
or setting up dynamic templates that will map string fields as either text or keyword.

For instance, here is a template that maps string fields only as keyword:

PUT index
{
  "mappings": {
    "type": {
      "dynamic_templates": [
        {
          "strings": {
            "match_mapping_type": "string",
            "mapping": {
              "type": "keyword"
            }
          }
        }
      ]
    }
  }
}
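
The other option, explicit mappings, might look like the following sketch. It assumes the id and body fields used as examples in the text above; the rest of the mapping is illustrative only.

# id is matched exactly, so keyword is enough; body is searched as full text, so text is enough
PUT index
{
  "mappings": {
    "type": {
      "properties": {
        "id": {
          "type": "keyword"
        },
        "body": {
          "type": "text"
        }
      }
    }
  }
}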

Disable _all

The _all field indexes the values of all fields of a document and can use significant space.
If you never need to search against all fields at the same time, it can be disabled.
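
As a sketch, using the same placeholder index and type names as the other examples, _all can be turned off in the type mapping when the index is created:

# "index" and "type" are the placeholder names used throughout this post
PUT index
{
  "mappings": {
    "type": {
      "_all": {
        "enabled": false
      }
    }
  }
}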

Use best_compression

The _source and stored fields can easily take a non-negligible amount of disk space.
They can be compressed more aggressively by using the best_compression codec.
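
A minimal sketch of enabling it at index creation, again with the placeholder index name. Note that index.codec is a static setting, so it has to be set when the index is created or while the index is closed.

# index.codec is static: set it at creation time, or on a closed index
PUT index
{
  "settings": {
    "index": {
      "codec": "best_compression"
    }
  }
}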

Use the smallest numeric type that is sufficient

The type that you pick for numeric data can have a significant impact on disk usage.
In particular, integers should be stored using an integer type (byte, short, integer or long),
and floating-point numbers should be stored either in a scaled_float, if appropriate,
or in the smallest type that fits the use case: using float over double,
or half_float over float, will help save storage.
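
To make this concrete, here is a sketch of a mapping that picks compact types. The field names age, price and temperature are hypothetical, and the scaling_factor of 100 assumes prices only need two decimal places.

# age, price and temperature are hypothetical fields used only for illustration
PUT index
{
  "mappings": {
    "type": {
      "properties": {
        "age": {
          "type": "byte"
        },
        "price": {
          "type": "scaled_float",
          "scaling_factor": 100
        },
        "temperature": {
          "type": "half_float"
        }
      }
    }
  }
}

Here byte is enough for small integers between -128 and 127, scaled_float stores the value as a long after multiplying by the scaling factor, and half_float trades precision for space.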
