What’s an online DDL operation? In a nut shell, a DDL(Data Definition Language) operation is something like alter table ...
which will change the data structure of a table. Some changes only touch metadata, eg. index, etc. so those can be updated in-place. But if a field definition needs to change, eg. UTF8mb3 –> UTF8mb4, MySQL can’t do it in-place – the table will be locked during the operation until all rows have been migrated. This will be an issue if the table is huge and is in-use by a 24×7 application.
With gh-ost, a cool utility Github developed to do their own database migrations, The fore-mentioned blocking operation can be turned into a non-blocking online migration. I ran some tests with gh-ost and I really like it. Here’s how I tested it.
# I did my test in a ubuntu container k run --rm -ti raytest --image ubuntu:latest -- /bin/bash # inside the container # install tools apt update -y apt install -y curl netcat-openbsd # install gh-ost cd ~ curl -L https://github.com/github/gh-ost/releases/download/v1.1.7/gh-ost-binary-linux-amd64-20241219160321.tar.gz| tar zxvf - # pre-flight ./gh-ost --version 1.1.7 (git commit: d5ab048c1f046821f3c7384a386fc1c3ae399c92) # run an online migration ./gh-ost \ --max-load=Threads_running=100 \ --critical-load=Threads_running=1000 \ --critical-load-hibernate-seconds=15 \ --chunk-size=500 \ --max-lag-millis=1500 \ --user="root" \ --password=$MYSQL_ROOT_PASSWORD \ --host="192.168.1.41" \ # MySQL master instance --throttle-control-replicas="192.168.1.42" \ # optional MySQL replica instance --port=3306 \ --switch-to-rbr \ --allow-on-master \ --database=my_database \ --table=my_table \ --verbose \ --alter='DEFAULT CHARACTER SET = utf8mb4 COLLATE = utf8mb4_general_ci, MODIFY COLUMN `my_column` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL' \ --cut-over=default \ --exact-rowcount \ --concurrent-rowcount \ --default-retries=120 \ --panic-flag-file=/tmp/ghost.panic.flag \ # as long as this file exists, gh-ost will not perform the final cut-over --postpone-cut-over-flag-file=/tmp/ghost.postpone.flag --execute # required otherwise just dry-run
What this does is:
- copy table my_table to _my_table_gho
- do the migration on _my_table_gho
- tail binlogs and update _my_table_gho to catch up with my_table
- check if /tmp/ghost.postpone.flag exists then go back to step 3
- lock tables for a very short period, delete my_table and rename _my_table_gho to my_table
- unlock and done
And if one asks why netcat
is needed here? Check this out 🙂