ClusterControl™ for MySQL Galera enables customers to Deploy, Manage, Monitor and Scale a clustered MySQL database platform based on the Galera Replication protocol.
Galera Replication is a synchronous multi-master replication plug-in for InnoDB. It is very different from the regular MySQL Replication, and addresses a number of issues including write conflicts when writing on multiple masters, replication lag and slaves being out of sync with the master. Users do not have to know which server they can write to (the master) and which servers they can read from (the slaves).
An application can write to any node in a Galera Replication cluster, and transaction commits (RBR events) are then applied on all servers, via a certification-based replication.
Certification-based replication is an alternative approach to synchronous database replication using Group Communication and transaction ordering techniques.
A minimal Galera cluster consists of 3 nodes. The reason is that, should there be a problem applying a transaction on one node (e.g., network problem or the machine becomes unresponsive), the two other nodes will have a quorum (i.e. a majority) and will be able to proceed with the transaction commit.
MySQL Replication is part of the standard MySQL database, and is mainly asynchronous in nature. Updates are always done on one master, and these are propagated to slaves. It is possible to create a ring topology with multiple masters, however this is not recommended as it is very easy for the servers to get out of sync in case of a master failing. There is no automatic failover or resynchronization in these cases.
Galera Replication is a plug-in to MySQL, and enables a true master-master setup for InnoDB. In a Galera Replication cluster, all nodes are masters and applications can read and write from any node. Transactions are synchronously committed on all nodes. In case of a node failing, the other nodes will continue to operate and kept up to date. When the failed node comes up again, it automatically synchronizes with the other nodes before it is allowed back into the cluster. No data is lost when a node fails.
Galera Replication has a number of benefits:
Like any solution, there are some limitations:
Although Galera Replication is synchronous, it is possible to deploy a Galera Replication cluster across data centers. Synchronous replication is traditionally implemented via 2-phase commit, where messages are sent to all nodes in a cluster in a 'prepare' phase, and another set of messages are sent in a 'commit' phase. This approach is usually not suitable for geographically disparate nodes, because of the latencies in sending messages between nodes.
Galera Replication makes use of certification based replication, that is a form of synchronous replication with reduced overhead.
Certification based replication uses group communication and transaction ordering techniques to achieve synchronous replication. Transactions execute optimistically in a single node (or replica) and, at commit time, run a coordinated certification process to enforce global consistency. Global coordination is achieved with the help of a broadcast service, that establishes a global total order among concurrent transactions.
Pre-requisites for certification based replication:
The main idea is that a transaction is executed conventionally until the commit point, under the assumption that there will be no conflict. This is called optimistic execution. When the client issues a COMMIT command (but before the actual commit has happened), all changes made to the database by the transaction and the primary keys of changed rows are collected into a writeset. This writeset is then replicated to the rest of the nodes. After that, the writeset undergoes a deterministic certification test (using the collected primary keys) on each node (including the writeset originator node) which determines if the writeset can be applied or not.
If the certification test fails, the writeset is dropped and the original transaction is rolled back. If the test succeeds, the transaction is committed and the writeset is applied on the rest of the nodes.
The certification test implemented in Galera depends on the global ordering of transactions. Each transaction is assigned a global ordinal sequence number during replication. Thus, when a transaction reaches the commit point, it is known what was the sequence number of the last transaction it did not conflict with. The interval between those two numbers is an uncertainty land: transactions in this interval have not seen the effects of each other. Therefore, all transactions in this interval are checked for primary key conflicts with the transaction in question. The certification test fails if a conflict is detected.
Since the procedure is deterministic and all replicas receive transactions in the same order, all nodes reach the same decision about the outcome of the transaction. The node that started the transaction can then notify the client application if the transaction has been committed or not.
Certification based replication (or more precisely, certification-based conflict resolution) is based on academic research, in particular on Fernando Pedone's Ph.D. thesis
A Galera Replication cluster can be configured using the Severalnines Configurator:
The wizard collects some high level data (e.g. IP addresses of machines, size of database, amount of RAM in machines, type of database workload, etc.). The application then generates the configuration of the Galera Replication cluster. At the end, a deployment package is automatically generated and emailed to the user.
The deployment package automates a number of tasks:
At the end, it has a verification phase that verifies whether the cluster has been properly set up. The Galera Replication cluster is now deployed.
In order to keep the database cluster stable and running, it is important for the system to be resilient to failures. Failures are caused by either software bugs or hardware problems, and can happen at any time. In case a server goes down, failure handling, failover and reconfiguration of the Galera cluster needs to be automatic, so as to minimize downtime.
In case of a node failing, applications connected to that node can connect to another node and continue to do database requests. Keepalive messages are sent between nodes in order to detect failures, in which case the failed node is excluded from the cluster.
ClusterControl™ will restart the failed database process, and point it to one of the existing nodes (a 'donor') to resynchronize. The resynchronization process is handled by Galera.
While the failed node is resynchronizing, any new transactions (writesets) coming from the existing nodes will be cached in a slave queue. Once the node has caught up, it will be considered as SYNCED and is ready to accept client connections.
Adding a new node is automatic with Galera. It follows the same process as recovering a failed node. When the node is introduced, ClusterControl™ will select a 'donor' and do a full state snapshot transfer from that node.
During the synchronization process, any incoming transactions are cached in a slave queue. Once the node is SYNCED, it is ready to accept client connections.