Updating Elasticsearch

This document describes the process of updating the Elasticsearch version running on translatewiki.net.

We then begin making changes on the server.

  • Set $wgDisableSearchUpdate = true and $wgTranslateTranslationDefaultService = false in LocalSettings.php in canary.
  • Check that canary shows "This wiki does not have a translation search service." when accessing Special:SearchTranslations, and then deploy the change.
  • Clone the Gerrit patch that removes the old version of Elasticsearch and run it. Once the old version of Elasticsearch has been removed, the patch can be abandoned.
    • If removing a plugin fails, see "Handling Elasticsearch plugin uninstall failure" below to debug the issue.
    • Verify that the Elasticsearch service is no longer running.
  • Clone the Gerrit patch that updates Elasticsearch, and run the make noop command to verify that everything works properly.
  • Then run make apply to actually perform the update.
    • Verify that the Elasticsearch service is now back up and running.
  • The Elasticsearch update patch on Gerrit can be merged.
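
The two "verify that the Elasticsearch service is ..." checks above can be scripted. The following is a minimal sketch and not part of the translatewiki.net tooling. It assumes Elasticsearch listens on its default HTTP address, http://localhost:9200, without authentication, and it relies on the root URL returning a JSON document that contains version.number. The function names are hypothetical.

```python
import json
import urllib.request


def parse_version(root_response):
    """Extract the version number from the JSON served at the Elasticsearch root URL."""
    info = json.loads(root_response)
    return info["version"]["number"]


def running_version(base_url="http://localhost:9200", timeout=5.0):
    """Return the running Elasticsearch version, or None if nothing answers.

    base_url is an assumption (the Elasticsearch default); adjust it to the
    address the service actually listens on.
    """
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return parse_version(response.read().decode("utf-8"))
    except OSError:
        # Connection refused, timeouts and HTTP errors are all OSError subclasses.
        return None


if __name__ == "__main__":
    version = running_version()
    if version is None:
        print("Elasticsearch is not answering")
    else:
        print(f"Elasticsearch {version} is running")
```

After removing the old version the script should report that Elasticsearch is not answering. After make apply it should report the new version number.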

Next we will rebuild the search and translation memory indexes. Note that for minor version updates of Elasticsearch, it may not be necessary to rebuild the entire search index.

  • Rebuild the search index with the search-reindex script: start a screen and run nice bash search-reindex. This is a long-running script.
    • See "Reviewing search-reindex output" below for how to review the output of this script and determine whether a failure occurred that needs to be fixed.
  • Remove $wgDisableSearchUpdate from LocalSettings.php and deploy.
  • In a separate screen, run nice php extensions/Translate/scripts/ttmserver-export.php > ttmserver-export.log. After the process ends, review the log file to see if there were any errors.
  • Remove $wgTranslateTranslationDefaultService from LocalSettings.php and deploy.
  • Verify that Special:SearchTranslations no longer shows the message that the wiki has no translation search service.
  • After both commands have finished running, remove the site notice.
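
Reviewing ttmserver-export.log by eye can be slow when the log is long. As a rough aid, the sketch below prints the lines that contain common failure keywords. The keyword list is an assumption, not something taken from ttmserver-export.php, so an empty result does not prove that the export succeeded.

```python
import re
from pathlib import Path

# Hypothetical markers: adjust them to whatever the script prints on failure.
SUSPICIOUS = re.compile(r"error|exception|fatal|warning", re.IGNORECASE)


def suspicious_lines(lines):
    """Yield (line_number, text) for every line that contains one of the markers."""
    for number, line in enumerate(lines, start=1):
        if SUSPICIOUS.search(line):
            yield number, line.rstrip("\n")


if __name__ == "__main__":
    log = Path("ttmserver-export.log")
    if log.exists():
        with log.open(errors="replace") as handle:
            for number, text in suspicious_lines(handle):
                print(f"{log}:{number}: {text}")
    else:
        print(f"{log} not found")
```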

Handling Elasticsearch plugin uninstall failure

  • Comment out the failing plugin in mwelasticsearch.pp.
  • Remove the plugin manually. Example: rm -rf /usr/share/elasticsearch/plugins/experimental-highlighter.
  • Run Puppet again.

Reviewing search-reindex output

The search-reindex script runs two commands in sequence. The first writes reindex.log and the second writes reindex-2.log. If reindex-2.log is not present, the first command failed and the second command was never run.

To determine whether there was an error, look at the reindex log files. Each row describes one command run, and a non-zero Exitval denotes a failed run.

Seq	Host	Starttime	JobRuntime	Send	Receive	Exitval	Signal	Command
1	:	1668081198.336	     0.881	0	110	0	0	php /srv/mediawiki/workdir/extensions/CirrusSearch/maintenance/ForceSearchIndex.php --skipLinks 1 --indexOnSkip 1 --batch-size 50 --fromId 711 --toId 1211
8	:	1668081198.499	     1.020	0	108	0	0	php /srv/mediawiki/workdir/extensions/CirrusSearch/maintenance/ForceSearchIndex.php --skipLinks 1 --indexOnSkip 1 --batch-size 50 --fromId 4211 --toId 4711
7	:	1668081198.469	     1.157	0	110	0	0	php /srv/mediawiki/workdir/extensions/CirrusSearch/maintenance/ForceSearchIndex.php --skipLinks 1 --indexOnSkip 1 --batch-size 50 --fromId 3711 --toId 4211
6	:	1668081198.436	     1.213	0	108	0	0	php /srv/mediawiki/workdir/extensions/CirrusSearch/maintenance/ForceSearchIndex.php --skipLinks 1 --indexOnSkip 1 --batch-size 50 --fromId 3211 --toId 3711
3	:	1668081198.346	     1.358	0	110	0	0	php /srv/mediawiki/workdir/extensions/CirrusSearch/maintenance/ForceSearchIndex.php --skipLinks 1 --indexOnSkip 1 --batch-size 50 --fromId 1711 --toId 2211

If a row has a non-zero Exitval, you can run the command from its Command column again manually to determine what went wrong.
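
Scanning the Exitval column by eye is error-prone when the log has thousands of rows. The sketch below lists the failed rows. It assumes the log files are tab-separated with the header shown above, which matches the job log format written by GNU parallel's --joblog option, and that they sit in the current directory. The function name is hypothetical.

```python
import csv
from pathlib import Path


def failed_jobs(lines):
    """Return the job log rows whose Exitval column is not 0.

    A row that is too short to have an Exitval is also reported,
    because it cannot be confirmed as successful.
    """
    # QUOTE_NONE keeps quote characters inside the Command column literal.
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [row for row in reader if (row.get("Exitval") or "").strip() != "0"]


if __name__ == "__main__":
    for name in ("reindex.log", "reindex-2.log"):
        path = Path(name)
        if not path.exists():
            # A missing reindex-2.log means the first command failed.
            print(f"{name}: not found")
            continue
        with path.open(newline="") as handle:
            for row in failed_jobs(handle):
                print(f"{name}: job {row['Seq']} exited with {row['Exitval']}: {row['Command']}")
```

Each printed line ends with the exact command to rerun by hand.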