Elasticsearch Reindex in Place

Indexing is a core Elasticsearch feature that lets the engine return results quickly and accurately.

However, once a field has been mapped in an index, its type cannot be changed in place. To change it, you need to reindex the data into a new index that has the mapping you require. This process can cause downtime, which is undesirable, especially for a service that is already in production.

To circumvent this, we can use index aliases, which allow us to switch between indices seamlessly.

How to Create an Index?

The first step is to make sure you have an existing index whose data you wish to update.

For this tutorial, we will use two indices, old_index and new_index, whose names describe their roles.

PUT /old_index/
{
  "settings": {
    "number_of_shards": 1
  },
  "aliases": {
    "use_me": {}
  },
  "mappings": {
    "properties": {
      "name":{
        "type": "text"
      },
      "id":{
        "type": "integer"
      },
      "paid": {
        "type": "boolean"
      }
    }
  }
}

If you prefer cURL, use the following command:

curl -XPUT "http://localhost:9200/old_index/" -H 'Content-Type: application/json' -d'{  "settings": {    "number_of_shards": 1  },  "aliases": {    "use_me": {}  },   "mappings": {    "properties": {      "name":{        "type": "text"      },      "id":{        "type": "integer"      },      "paid": {        "type": "boolean"      }    }  }}'

Next, create the new index. Copy the settings and mappings from the old index and apply the changes you need; in this example, the type of the paid field changes. Do not add the use_me alias yet: it is attached to the new index after the data has been copied. Keep in mind that values already stored in old_index must be valid for the new field type, otherwise the reindex fails for those documents.

PUT /new_index
{
  "settings": {
    "number_of_shards": 1
  },
  "mappings": {
    "properties": {
      "name":{
        "type": "text"
      },
      "id":{
        "type": "integer"
      },
      "paid": {
        "type": "object"
      }
    }
  }
}

Here’s the cURL command:

curl -XPUT "http://localhost:9200/new_index" -H 'Content-Type: application/json' -d'{  "settings": {    "number_of_shards": 1  },  "mappings": {    "properties": {      "name":{        "type": "text"      },      "id":{        "type": "integer"      },      "paid": {        "type": "object"      }    }  }}'
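When the new index differs from the old one in only a few fields, the request body can be derived in code instead of being copied by hand. The following is a minimal Python sketch of that idea. The helper name derive_index_body is hypothetical (it is not part of Elasticsearch), and dropping the aliases section assumes the alias is attached later through the _aliases API.

```python
import copy
import json

# Request body used for old_index in this tutorial.
OLD_INDEX_BODY = {
    "settings": {"number_of_shards": 1},
    "aliases": {"use_me": {}},
    "mappings": {
        "properties": {
            "name": {"type": "text"},
            "id": {"type": "integer"},
            "paid": {"type": "boolean"},
        }
    },
}


def derive_index_body(old_body, field_types, drop_aliases=True):
    """Return a copy of old_body with the given field types replaced.

    Hypothetical helper for illustration; not an Elasticsearch API.
    """
    new_body = copy.deepcopy(old_body)  # leave the original body untouched
    properties = new_body["mappings"]["properties"]
    for field, new_type in field_types.items():
        properties[field] = {"type": new_type}
    if drop_aliases:
        # Assumption: the alias is attached later through the _aliases API.
        new_body.pop("aliases", None)
    return new_body


new_index_body = derive_index_body(OLD_INDEX_BODY, {"paid": "object"})
# The printed JSON can be sent as the body of PUT /new_index.
print(json.dumps(new_index_body, indent=2))
```

Deep-copying the old body keeps the two definitions independent, so a later change to one does not silently alter the other.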

With the settings and mappings in place on the new index, use the Reindex API to copy the data from the old index to the new one:

POST _reindex
{
  "source": {
    "index": "old_index"
  },
  "dest": {
    "index": "new_index"
  }
}

Here’s the cURL command:

curl -XPOST "http://localhost:9200/_reindex" -H 'Content-Type: application/json' -d'{  "source": {    "index": "old_index"  },  "dest": {    "index": "new_index"  }}'
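Before switching applications to the new index, it is worth confirming that the reindex copied everything. The Python sketch below sends the same request using only the standard library and inspects the summary fields of the response (total, created, updated, version_conflicts, failures, timed_out). The function names and the sample response are illustrative assumptions, and the completeness check assumes no script is used in the reindex request.

```python
import json
import urllib.request


def run_reindex(base_url="http://localhost:9200"):
    """POST the tutorial's reindex request and return the parsed JSON response."""
    body = {"source": {"index": "old_index"}, "dest": {"index": "new_index"}}
    request = urllib.request.Request(
        f"{base_url}/_reindex",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def reindex_completed(response):
    """True when every source document was written and nothing failed."""
    written = response.get("created", 0) + response.get("updated", 0)
    return (
        not response.get("timed_out", False)
        and not response.get("failures")
        and response.get("version_conflicts", 0) == 0
        and written == response.get("total", 0)
    )


# Hand-written sample with made-up numbers, for illustration only;
# it is not output captured from a real cluster.
sample = {
    "timed_out": False,
    "total": 3,
    "created": 3,
    "updated": 0,
    "version_conflicts": 0,
    "failures": [],
}
print(reindex_completed(sample))
```

Against a running cluster, reindex_completed(run_reindex()) would perform the copy and the check in one step.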

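Now, move the alias from the old index to the new one using the _aliases API. Both actions in the request are applied atomically, so the alias always points to an index:

POST /_aliases
{
    "actions" : [
        { "remove" : { "index" : "old_index", "alias" : "use_me" } },
        { "add" : { "index" : "new_index", "alias" : "use_me" } }
    ]
}

Here’s the cURL command:

curl -XPOST "http://localhost:9200/_aliases" -H 'Content-Type: application/json' -d'{    "actions" : [        { "remove" : { "index" : "old_index", "alias" : "use_me" } },        { "add" : { "index" : "new_index", "alias" : "use_me" } }    ]}'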
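One way to make the switch atomic is to remove the alias from the old index and add it to the new one in a single _aliases request. The Python sketch below only builds that request body; the helper name alias_swap_body is hypothetical, and the index and alias names are the ones used in this tutorial.

```python
import json


def alias_swap_body(alias, old_index, new_index):
    """Build an _aliases request body that moves alias from old_index to new_index."""
    return {
        "actions": [
            # Both actions travel in one request, so they are applied together.
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


swap_body = alias_swap_body("use_me", "old_index", "new_index")
# The printed JSON can be sent as the body of POST /_aliases.
print(json.dumps(swap_body, indent=2))
```

Keeping the body in a function makes the same swap reusable for the next migration, for example from new_index to a later index.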

Once this completes, you can delete the old index. Applications that access the data through the use_me alias now use the new index, with no downtime.

Conclusion

With the steps covered in this tutorial, you can reindex data from an old index into a new one and switch applications over through an alias without downtime.

About the author

John Otieno

My name is John, and I am a fellow geek like you. I am passionate about all things computers, from hardware and operating systems to programming. My dream is to share my knowledge with the world and help out fellow geeks. Follow my content by subscribing to the LinuxHint mailing list.