ClusterControl™ for MySQL Galera enables customers to Deploy, Manage, Monitor and Scale a clustered MySQL database platform based on the Galera Replication protocol.
Galera Replication is a synchronous multi-master replication plug-in for InnoDB. It is very different from the regular MySQL Replication, and addresses a number of issues including write conflicts when writing on multiple masters, replication lag and slaves being out of sync with the master. Users do not have to know which server they can write to (the master) and which servers they can read from (the slaves).
An application can write to any node in a Galera Replication cluster, and transaction commits (RBR events) are then applied on all servers, via a certification-based replication.
Certification-based replication is an alternative approach to synchronous database replication using Group Communication and transaction ordering techniques.
A minimal Galera cluster consists of 3 nodes. The reason is that, should there be a problem applying a transaction on one node (e.g., network problem or the machine becomes unresponsive), the two other nodes will have a quorum (i.e. a majority) and will be able to proceed with the transaction commit.
MySQL Replication is part of the standard MySQL database, and is mainly asynchronous in nature. Updates are always done on one master, and these are propagated to slaves. It is possible to create a ring topology with multiple masters, however this is not recommended as it is very easy for the servers to get out of sync in case of a master failing. There is no automatic failover or resynchronization in these cases.
Galera Replication is a plug-in to MySQL, and enables a true master-master setup for InnoDB. In a Galera Replication cluster, all nodes are masters and applications can read and write from any node. Transactions are synchronously committed on all nodes. In case of a node failing, the other nodes will continue to operate and kept up to date. When the failed node comes up again, it automatically synchronizes with the other nodes before it is allowed back into the cluster. No data is lost when a node fails.
Galera Replication has a number of benefits:
Like any solution, there are some limitations:
Although Galera Replication is synchronous, it is possible to deploy a Galera Replication cluster across data centers. Synchronous replication is traditionally implemented via 2-phase commit, where messages are sent to all nodes in a cluster in a 'prepare' phase, and another set of messages are sent in a 'commit' phase. This approach is usually not suitable for geographically disparate nodes, because of the latencies in sending messages between nodes.
Galera Replication makes use of certification based replication, that is a form of synchronous replication with reduced overhead.
Certification based replication uses group communication and transaction ordering techniques to achieve synchronous replication. Transactions execute optimistically in a single node (or replica) and, at commit time, run a coordinated certification process to enforce global consistency. Global coordination is achieved with the help of a broadcast service, that establishes a global total order among concurrent transactions.
Pre-requisites for certification based replication:
The main idea is that a transaction is executed conventionally until the commit point, under the assumption that there will be no conflict. This is called optimistic execution. When the client issues a COMMIT command (but before the actual commit has happened), all changes made to the database by the transaction and the primary keys of changed rows are collected into a writeset. This writeset is then replicated to the rest of the nodes. After that, the writeset undergoes a deterministic certification test (using the collected primary keys) on each node (including the writeset originator node) which determines if the writeset can be applied or not.
If the certification test fails, the writeset is dropped and the original transaction is rolled back. If the test succeeds, the transaction is committed and the writeset is applied on the rest of the nodes.
The certification test implemented in Galera depends on the global ordering of transactions. Each transaction is assigned a global ordinal sequence number during replication. Thus, when a transaction reaches the commit point, it is known what was the sequence number of the last transaction it did not conflict with. The interval between those two numbers is an uncertainty land: transactions in this interval have not seen the effects of each other. Therefore, all transactions in this interval are checked for primary key conflicts with the transaction in question. The certification test fails if a conflict is detected.
Since the procedure is deterministic and all replicas receive transactions in the same order, all nodes reach the same decision about the outcome of the transaction. The node that started the transaction can then notify the client application if the transaction has been committed or not.
Certification based replication (or more precisely, certification-based conflict resolution) is based on academic research, in particular on Fernando Pedone's Ph.D. thesis
A Galera Replication cluster can be configured using the Severalnines Configurator:
The wizard collects some high level data (e.g. IP addresses of machines, size of database, amount of RAM in machines, type of database workload, etc.). The application then generates the configuration of the Galera Replication cluster. At the end, a deployment package is automatically generated and emailed to the user.
The deployment package automates a number of tasks:
At the end, it has a verification phase that verifies whether the cluster has been properly set up. The Galera Replication cluster is now deployed.
In order to keep the database cluster stable and running, it is important for the system to be resilient to failures. Failures are caused by either software bugs or hardware problems, and can happen at any time. In case a server goes down, failure handling, failover and reconfiguration of the Galera cluster needs to be automatic, so as to minimize downtime.
In case of a node failing, applications connected to that node can connect to another node and continue to do database requests. Keepalive messages are sent between nodes in order to detect failures, in which case the failed node is excluded from the cluster.
ClusterControl™ will restart the failed database process, and point it to one of the existing nodes (a 'donor') to resynchronize. The resynchronization process is handled by Galera.
While the failed node is resynchronizing, any new transactions (writesets) coming from the existing nodes will be cached in a slave queue. Once the node has caught up, it will be considered as SYNCED and is ready to accept client connections.
Adding a new node is automatic with Galera. It follows the same process as recovering a failed node. When the node is introduced, ClusterControl™ will select a 'donor' and do a full state snapshot transfer from that node.
During the synchronization process, any incoming transactions are cached in a slave queue. Once the node is SYNCED, it is ready to accept client connections.
Stay clear of Galera's known limitations to avoid problems.
There are several reasons which can cause Galera to crash:
Galera node provisioning is smart enough to kick out any problematic node from the cluster if it detects inconsistency among members. When the mysqld runs out of disk space (in the data directory), the node is not able to apply writesets. Galera detects this as failed transactions. Since this compromises node consistency, Galera will then signal it to close the group communication and force mysqld to terminate.
Restarting the node will give disk-full errors (such as "no space left on device"), and quota-exceeded errors (such as "write failed" or "user block limit reached"). You might want to add another data file on another disk or clear up some space before the node can rejoin the cluster.
If the OS starts swapping and/or if iowait is very high it can "freeze" the server for duration of time. During this time the Galera node may stop responding to the other nodes and is deemed dead. In virtualized environments it can also be the Host OS that is swapping.
First of all, make sure you are running the latest Galera stable release so you do not run into older bugs that have already been fixed. Start with inspecting the MySQL error log on the Galera nodes as Galera will be logging to this file. Try to shed some light to any relevant line which indicates error or failing. If the Galera nodes happen to be responsive, you may also try to collect following output:
Next, inspect the system resources by checking network, firewall, disk usage and memory utilization as well as inspecting the general system activity log (syslog, message, dmesg). If still no indication of the problem found, you may hit into a bug which you can report it directly at Galera bugs on Launchpad page or request for technical support assistance directly from the vendor (Codership, Percona or MariaDB). You may also join the Galera Google Group mailing list to seek for open assistance.
Each transaction is assigned a global ordinal sequence number during replication. Thus, when a transaction reaches the commit point, it is known what was the sequence number of the last transaction it did not conflict with. The interval between those two numbers is an uncertainty land: transactions in this interval have not seen the effects of each other. Therefore, all transactions in this interval are checked for primary key conflicts with the transaction in question. The certification test fails if a conflict is detected.
DELETE FROM statement requires PK or node(s) will die. Always define an explicit PRIMARY KEY in all tables. A simple AUTO_INCREMENT primary key will be just enough.
Yes it is. Galera relies on group communication between nodes. For any matters, it will wait for all nodes to return the status of certification test before proceed with committing or rollbacking. At this phase, an overloaded node will surely facing a hard time to reply within the time manner, delaying the rest of the cluster to wait for it. Therefore it is recommended to have uniform servers across the cluster. Also, transaction latency is no shorter than the RTT (round trip time) to the slowest node.
MySQL Galera treats MyISAM tables in quite different way:
So, MyISAM tables will appear in all nodes (since DDL is replicated). If you access the MyISAM tables outside of InnoDB transactions, then all data changes in MyISAM tables will remain locally in each node. If you access MyISAM tables inside InnoDB transactions, then MyISAM changes are replicated along InnoDB changes. However, if there happen cluster wide conflicts, MyISAM changes cannot be rolled back and your MyISAM tables will remain inconsistent.
The Galera technology is developed by Codership Oy and is available as a patch for standard MySQL and InnoDB. Percona and MariaDB leverage the Galera library in Percona XtraDB Cluster (PXC) and MariaDB Galera Cluster respectively.
Since they all leverage the same Galera library, replication performance should be fairly similar. The Codership build usually has the latest version of Galera, although that could change in the future.