Upgrading elasticsearch server data

Published Nov 07, 2017

This post walks through migrating server data from Elasticsearch 1.x to 5.x using the reindex API from Node.js. The migration is done with zero downtime and handles data that keeps changing on the live server.

If you are running a 1.x cluster and would like to migrate directly to 5.x without first migrating to 2.x, you can do so using reindex-from-remote.

Create new index:

Create the new index in the 5.x cluster with updated mappings. Read through the API changes in the new Elasticsearch version and adjust anything in your mappings that needs to change.

Set the refresh_interval to -1 and set number_of_replicas to 0 for faster reindexing.

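// Assumes the "elasticsearch" npm package; the host below is a placeholder
// for your 5.x cluster address.
var elasticsearch = require("elasticsearch");
var client = new elasticsearch.Client({ host: "localhost:9200" });

client.indices.create({
  index: "myindex_v2",
  body: {
    settings: {
      // Speeds up bulk indexing; restore both values after reindexing.
      refresh_interval: "-1",
      number_of_replicas: 0
    },
    mappings: {
      // Updated 5.x mappings go here.
    }
  }
}, function (err, res) {
  if (err) {
    console.log(err);
  } else {
    console.log("create", res);
  }
});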
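
The settings mentioned above are meant for the bulk load only, so they are normally put back once reindexing has finished. A minimal sketch of that, assuming you want the Elasticsearch defaults of a 1s refresh interval and one replica; the helper names bulkLoadSettings and restoredSettings are hypothetical and not part of the original code:

```javascript
// Hypothetical helpers; only the setting keys come from Elasticsearch itself.

// Settings to use while the new index is being filled.
function bulkLoadSettings() {
  return { index: { refresh_interval: "-1", number_of_replicas: 0 } };
}

// Settings to apply after reindexing. The fallbacks ("1s", 1) are an
// assumption: they are the Elasticsearch defaults, not values from this post.
function restoredSettings(refreshInterval, replicas) {
  return {
    index: {
      refresh_interval: refreshInterval || "1s",
      number_of_replicas: replicas === undefined ? 1 : replicas
    }
  };
}

// Usage with the client from this post (not executed here):
// client.indices.putSettings(
//   { index: "myindex_v2", body: restoredSettings() },
//   function (err, res) { console.log(err || res); }
// );
```

Keeping the two bodies in small functions makes it harder to forget the restore step after the migration.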

Reindexing:

Use reindex-from-remote to pull documents from the 1.x index into the new 5.x index.

You can reindex without a range query. That pulls every document present in the snapshot taken when the reindex starts. Documents created or updated while reindexing is running are not included.

To whitelist the remote host for reindexing, add this to elasticsearch.yml on the 5.x cluster:

reindex.remote.whitelist: hostname:9200

Take the current timestamp T1, query for all documents in a given namespace that were modified at or before this timestamp, and send them for reindexing. The timestamp can be any field, as long as it records the time at which the document was last updated.

client.reindex({
  body: {
    source: {
      remote: {
        host: "http://hostname:9200"
      },
      index: "myindex_v1",
      query: {
        range: {
          timestamp: {
            lte: T1
          }
        }
      }
    },
    dest: {
      index: "myindex_v2"
    }
  }
}, function (err, res) {
  if (err) {
    console.log(err);
  } else {
    console.log("reindex", res);
  }
});

Once they are all reindexed, take a new timestamp T2 and reindex every document that was modified between T1 and T2. Repeat this until there are no more changes (or until you reach a maximum number of iterations).

client.reindex({
  body: {
    source: {
      remote: {
        host: "http://hostname:9200"
      },
      index: "myindex_v1",
      query: {
        range: {
          timestamp: {
            gte: T1,
            lte: T2
          }
        }
      }
    },
    dest: {
      index: "myindex_v2"
    }
  }
}, function (err, res) {
  if (err) {
    console.log(err);
  } else {
    console.log("reindex", res);
  }
});
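
The repeat-until-no-changes loop described above can be sketched roughly as follows. This is only an illustration under several assumptions: the timestamp field holds epoch milliseconds, client.reindex returns a promise when no callback is passed, and the response carries a total count of processed documents. The function names and the options object (remoteHost, sourceIndex, destIndex, maxPasses, now) are hypothetical.

```javascript
// Build a reindex-from-remote body for documents modified between
// `from` and `to`. When `from` is null, only the upper bound is applied.
function buildReindexBody(remoteHost, sourceIndex, destIndex, from, to) {
  var range = { lte: to };
  if (from !== null && from !== undefined) {
    range.gte = from;
  }
  return {
    source: {
      remote: { host: remoteHost },
      index: sourceIndex,
      query: { range: { timestamp: range } }
    },
    dest: { index: destIndex }
  };
}

// Reindex in successive time windows until a pass copies nothing,
// or until maxPasses is reached. Returns the number of passes made.
async function reindexUntilCaughtUp(client, options) {
  var now = options.now || Date.now; // injectable clock (assumption: epoch ms)
  var maxPasses = options.maxPasses || 5;
  var from = null;
  var passes = 0;
  while (passes < maxPasses) {
    var to = now();
    var res = await client.reindex({
      body: buildReindexBody(
        options.remoteHost, options.sourceIndex, options.destIndex, from, to
      )
    });
    passes += 1;
    if (res.total === 0) {
      break; // nothing was modified in this window
    }
    from = to; // the next window starts where this one ended
  }
  return passes;
}

// Usage (not executed here):
// reindexUntilCaughtUp(client, {
//   remoteHost: "http://hostname:9200",
//   sourceIndex: "myindex_v1",
//   destIndex: "myindex_v2"
// }).then(function (passes) { console.log("passes", passes); });
```

Because reindexing writes documents under their existing IDs, a document that falls on the boundary of two windows is simply overwritten with the same content.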

Update the Alias:

Change the myindex alias to point to the new index, in a single atomic step.

client.indices.updateAliases({
  body: {
    actions: [
      { add: { index: "myindex_v2", alias: "myindex" } }
    ]
  }
}, function (err, res) {
  if (err) {
    console.log(err);
  } else {
    console.log("alias updated", res);
  }
});

Remove the alias from the old index, which lives on the remote server.

client.indices.updateAliases({
  body: {
    actions: [
      // The alias is removed from the old index, myindex_v1.
      { remove: { index: "myindex_v1", alias: "myindex" } }
    ]
  }
}, function (err, res) {
  if (err) {
    console.log(err);
  } else {
    console.log("alias removed", res);
  }
});

For this call, client should be connected to the old server at hostname:9200.
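
In this migration the old and new indices sit on different clusters, so the alias is added on one and removed on the other in separate calls. For comparison, when both indices live on the same cluster (for example a later move from myindex_v2 to a new version), the remove and add actions can go into one updateAliases request, which Elasticsearch applies atomically. A small sketch, with buildAliasSwap as a hypothetical helper name:

```javascript
// Build a single aliases request body that moves `alias` from
// `oldIndex` to `newIndex`. Both indices must be on the same cluster.
function buildAliasSwap(alias, oldIndex, newIndex) {
  return {
    actions: [
      { remove: { index: oldIndex, alias: alias } },
      { add: { index: newIndex, alias: alias } }
    ]
  };
}

// Usage with the client from this post (not executed here; "myindex_v3"
// is a made-up index name for illustration):
// client.indices.updateAliases(
//   { body: buildAliasSwap("myindex", "myindex_v2", "myindex_v3") },
//   function (err, res) { console.log(err || res); }
// );
```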

You can view the code on GitHub at: https://github.com/monikamaheshwari/upgradeElasticsearch

Discover and read more posts from Monika Maheshwari