05 Detailed Explanation of Index Management #
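An index is a data structure that a database uses to speed up data retrieval. Like the table of contents of a book, it helps us locate specific data quickly.
An index reduces the amount of data a query has to scan, which improves query performance. When a query runs, the database engine consults the index first and then follows it to the matching data instead of scanning every row.
Indexes are usually created on one or more columns of a table, and they keep the values of those columns sorted and grouped. For example, in a table of customer information we can create an index ordered by customer last name. When a query runs, the engine can use this index to quickly locate the customers with a particular last name.
Creating an index takes storage space, so we should not index every column. In general, we choose the columns to index based on how often they are queried and how their data is distributed.
Indexes also cost something to maintain. Every insert, update, or delete must also update the affected indexes, so on a table that is modified frequently, indexes can slow down write operations.
In short, the index is an important database concept, and using indexes well can greatly improve database performance. In practice, indexes should be designed and managed according to the specific business requirements and the characteristics of the data.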
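The last-name example can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the customers rows and the build_index, lookup, and add_row helpers are hypothetical names invented for illustration, and a real database engine uses far more elaborate structures than a sorted list.

```python
import bisect

# Hypothetical customer rows: (id, last_name, first_name).
customers = [
    (1, "Smith", "Anna"),
    (2, "Doe", "John"),
    (3, "Nguyen", "Linh"),
    (4, "Doe", "Jane"),
]


def build_index(rows, column):
    """Return (sorted keys, row positions) for one column of the table."""
    pairs = sorted((row[column], pos) for pos, row in enumerate(rows))
    keys = [key for key, _ in pairs]
    positions = [pos for _, pos in pairs]
    return keys, positions


def lookup(index, rows, value):
    """Binary-search the sorted keys, then fetch only the matching rows."""
    keys, positions = index
    lo = bisect.bisect_left(keys, value)
    hi = bisect.bisect_right(keys, value)
    return [rows[pos] for pos in positions[lo:hi]]


def add_row(index, rows, row, column):
    """Insert a row; the index has to be updated too (the maintenance cost)."""
    keys, positions = index
    rows.append(row)
    at = bisect.bisect_right(keys, row[column])
    keys.insert(at, row[column])
    positions.insert(at, len(rows) - 1)


last_name_index = build_index(customers, column=1)
doe_rows = lookup(last_name_index, customers, "Doe")
```

The lookup touches only the matching entries instead of scanning the whole table, while add_row shows why every write becomes a little more expensive once an index exists.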
Introduction to Index Management #
In the previous section, we created the customer index dynamically by indexing a document with the following request:
PUT /customer/_doc/1
{
  "name": "John Doe"
}
Elasticsearch automatically created a mapping for the fields of this index, such as "name". Let's take a look at the mapping that was created:
{
  "mappings": {
    "_doc": {
      "properties": {
        "name": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        }
      }
    }
  }
}
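To make the idea of dynamic mapping more concrete, here is a simplified Python sketch of how a mapping could be inferred from a JSON document. It is an illustration only, not Elasticsearch's implementation: the infer_field_mapping and infer_mapping functions are hypothetical, and the sketch deliberately ignores date detection, arrays, and null values.

```python
def infer_field_mapping(value):
    """Guess a field mapping from a single JSON value (simplified)."""
    # bool must be tested before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "long"}
    if isinstance(value, float):
        return {"type": "float"}
    if isinstance(value, str):
        # Strings become full-text fields with an exact-match sub-field.
        return {
            "type": "text",
            "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
        }
    if isinstance(value, dict):
        # Nested objects get their own "properties" block.
        return {
            "properties": {k: infer_field_mapping(v) for k, v in value.items()}
        }
    raise TypeError(f"value not handled by this sketch: {value!r}")


def infer_mapping(document):
    """Infer the top-level mapping for a whole document."""
    if not isinstance(document, dict):
        raise TypeError("a document must be a JSON object")
    return infer_field_mapping(document)


customer_mapping = infer_mapping({"name": "John Doe"})
```

For the customer document, the sketch yields the same text field with a keyword sub-field that appears in the mapping above.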
If we need more control over index creation, for example to make sure the index has an appropriate number of primary shards and that analyzers and mappings are in place before any data is indexed, two steps are needed: disabling automatic index creation and creating the index manually.
- Disabling Automatic Index Creation
We can disable automatic index creation by adding the following setting to the config/elasticsearch.yml file on each node:
action.auto_create_index: false
Manually creating the index will be covered in the following sections.
Index formatting #
To specify settings and mappings in the request body, use the following format:
PUT /my_index
{
  "settings": { ... any settings ... },
  "mappings": {
    "properties": { ... any properties ... }
  }
}
- settings: Configures index-level settings such as the number of shards and replicas.
- mappings: Defines the field mappings and field types.
- properties: Because mapping types are deprecated in later versions, the properties no longer need to be nested under a type name.
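The request format described above can also be assembled programmatically. The following Python sketch is one possible way to do it; create_index_body and its parameters are hypothetical names, and the optional type_name argument only illustrates the older layout in which properties were nested under a type such as _doc.

```python
import json


def create_index_body(properties, shards=1, replicas=1, type_name=None):
    """Build the JSON body for a create-index request."""
    if shards < 1:
        raise ValueError("an index needs at least one primary shard")
    if replicas < 0:
        raise ValueError("the number of replicas cannot be negative")
    mappings = {"properties": properties}
    if type_name is not None:
        # Older, typed layout: properties nested under the type name.
        mappings = {type_name: mappings}
    return {
        "settings": {
            "number_of_shards": shards,
            "number_of_replicas": replicas,
        },
        "mappings": mappings,
    }


body = create_index_body({"name": {"type": "text"}})
print(json.dumps(body, indent=2))
```

Leaving type_name unset produces the typeless format shown in this section, which is the form used in the rest of the examples.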
Index Management Operations #
We use the Dev Tools console in Kibana to try out the index management operations.
Create Index #
We create an index named test-index-users for users. It has three properties (name, age, and remarks) and is configured with one primary shard and one replica.
PUT /test-index-users
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      },
      "age": {
        "type": "long"
      },
      "remarks": {
        "type": "text"
      }
    }
  }
}
Execution result
- Insert test data
- View data
- Test with a mismatched data type for age:
POST /test-index-users/_doc
{
  "name": "test user",
  "age": "error_age",
  "remarks": "hello eeee"
}
Elasticsearch rejects the document with an error, because the value "error_age" cannot be parsed as a long, the type mapped for age.
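The behaviour can be approximated with a small Python sketch. It is a rough model, not Elasticsearch's parser: parse_long is a hypothetical helper, its error messages are invented, and it assumes the default behaviour in which a numeric string such as "30" is coerced to a number. It does not model how floating-point values are handled.

```python
import re

# Range of a signed 64-bit integer, the range of the long field type.
LONG_MIN, LONG_MAX = -2**63, 2**63 - 1


def parse_long(field, value):
    """Accept an integer or a numeric string; reject everything else."""
    if isinstance(value, str):
        if not re.fullmatch(r"-?[0-9]+", value):
            raise ValueError(f"cannot parse field [{field}] as long: {value!r}")
        value = int(value)
    # This sketch only handles int and str inputs (bool is rejected too).
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError(f"cannot parse field [{field}] as long: {value!r}")
    if not LONG_MIN <= value <= LONG_MAX:
        raise ValueError(f"value out of range for field [{field}]: {value}")
    return value


ok_age = parse_long("age", "30")
```

A numeric string passes, while a value like "error_age" raises an error, which mirrors why the test document is refused.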
Modify Index #
Check the created index:
curl 'localhost:9200/_cat/indices?v' | grep users
yellow open test-index-users LSaIB57XSC6uVtGQHoPYxQ 1 1 1 0 4.4kb 4.4kb
The status of the newly created index is yellow. The test environment has only a single node, and a replica shard is never allocated to the same node as its primary, so the replica cannot be assigned. The index was nevertheless created with number_of_replicas set to 1, so we need to modify the index configuration.
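To read the output line more easily, the following Python sketch labels each value with its column name and checks whether the replicas could be assigned. The column list is an assumption based on the default header that _cat/indices?v prints in the version used here (other versions may add columns), and parse_cat_indices_line and replicas_can_be_assigned are hypothetical helpers.

```python
# Assumed default columns of `_cat/indices?v` for this version.
CAT_COLUMNS = [
    "health", "status", "index", "uuid", "pri", "rep",
    "docs.count", "docs.deleted", "store.size", "pri.store.size",
]


def parse_cat_indices_line(line):
    """Turn one whitespace-separated output line into a dict."""
    values = line.split()
    if len(values) != len(CAT_COLUMNS):
        raise ValueError(f"expected {len(CAT_COLUMNS)} columns, got {len(values)}")
    return dict(zip(CAT_COLUMNS, values))


def replicas_can_be_assigned(replicas, data_nodes):
    """Each copy of a shard (one primary plus its replicas) needs its own node."""
    return data_nodes >= replicas + 1


row = parse_cat_indices_line(
    "yellow open test-index-users LSaIB57XSC6uVtGQHoPYxQ 1 1 1 0 4.4kb 4.4kb"
)
single_node_ok = replicas_can_be_assigned(int(row["rep"]), data_nodes=1)
```

With one replica and one node the check returns False, which matches the yellow status; with zero replicas it returns True.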
Change the number of replicas to 0
PUT /test-index-users/_settings
{
  "settings": {
    "number_of_replicas": 0
  }
}
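Not every setting can be changed this way. The number of replicas can be updated on a live index, whereas the number of primary shards is fixed when the index is created. The Python sketch below builds the body of a settings update and guards against that mistake; settings_update_body and STATIC_SETTINGS are hypothetical names, and the set lists only the one static setting relevant to this example.

```python
# Settings that this sketch refuses to change on an existing index.
# Only the setting discussed in this section is listed.
STATIC_SETTINGS = {"number_of_shards"}


def settings_update_body(**changes):
    """Build the request body for PUT /<index>/_settings."""
    blocked = sorted(STATIC_SETTINGS.intersection(changes))
    if blocked:
        raise ValueError(f"cannot change on an existing index: {blocked}")
    return {"settings": dict(changes)}


update_body = settings_update_body(number_of_replicas=0)
```

The resulting dictionary has the same shape as the request body used above.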
Check the status again:
green open test-index-users LSaIB57XSC6uVtGQHoPYxQ 1 0 1 0 4.4kb 4.4kb
Open/Close Index #
- Close Index
Once an index is closed, only its metadata can still be displayed; no read or write operations can be performed on it.
After the index is closed, an attempt to insert data fails.
- Open Index
After the index is reopened, data can be written again.
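Closing and opening are done through the _close and _open endpoints of the index. As a minimal sketch, the Python code below prepares both requests with the standard library only. It assumes the node is reachable at localhost:9200, as in the curl command earlier; index_request is a hypothetical helper, and nothing is sent until the request is passed to urllib.request.urlopen.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumed address of the test node


def index_request(method, index, endpoint="", body=None):
    """Prepare (but do not send) an HTTP request against one index."""
    url = f"{BASE_URL}/{index}"
    if endpoint:
        url += f"/{endpoint}"
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data else {}
    return urllib.request.Request(url, data=data, headers=headers, method=method)


close_req = index_request("POST", "test-index-users", "_close")
open_req = index_request("POST", "test-index-users", "_open")

# To actually send a request against a running node:
# with urllib.request.urlopen(close_req) as response:
#     print(response.read().decode("utf-8"))
```

In the Kibana console, the same two operations are POST /test-index-users/_close and POST /test-index-users/_open.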
Delete Index #
Finally, we delete the created test-index-users.
DELETE /test-index-users
View Index #
Since test-index-users has been deleted, let’s check the information of the previously created bank index.
- Mapping
GET /bank/_mapping
- Settings
GET /bank/_settings
Managing Indexes in Kibana #
In Kibana, we can view and manage indexes at the following path: