Galera Cluster is a synchronous multi-master database cluster developed by Codership of Finland. This open-source software operates as a patch for MySQL and MariaDB and offers its own command-line interface. Its features include certification-based replication, in which each node certifies a replicated write set against other write sets before committing it.
With Galera Cluster, database reads and writes can be directed to any individual node, and a node can be lost without disrupting operations. Data integrity is handled automatically by the cluster. Unlike MySQL Cluster, which focuses on partitioning and availability, Galera Cluster aims for consistency and availability.
So how does Galera Cluster work?
For Galera Cluster to work effectively, at least three nodes need to be functioning properly. You can run with two nodes, but an arbiter is then required, so you effectively need three cluster members anyway. The multi-master replication has been designed with InnoDB/XtraDB in mind, but other storage engines are also partially suited to the task.
If you intend to use a storage engine other than InnoDB/XtraDB, you will be constrained in a few ways. Replication to other nodes, for example, might not be supported. Applications that use the cluster will only be able to write to a single node at any given time. Finally, there is no conflict management support if you use a different storage engine.
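The three-node minimum mentioned above follows from majority voting: a group of nodes can only keep accepting writes while it can reach more than half of the cluster. The sketch below illustrates that arithmetic in Python. It assumes every node carries equal weight, and the function name `has_quorum` is hypothetical, not part of any Galera API.

```python
def has_quorum(reachable_nodes: int, cluster_size: int) -> bool:
    """Return True if the reachable nodes form a strict majority.

    Simplified model: every node counts as one vote (an assumption).
    """
    if cluster_size <= 0 or not 0 <= reachable_nodes <= cluster_size:
        raise ValueError("reachable_nodes must be between 0 and cluster_size")
    # Strict majority: more than half of the votes must be reachable.
    return 2 * reachable_nodes > cluster_size


if __name__ == "__main__":
    for size in (2, 3):
        survivors = size - 1  # one node is lost
        print(f"{size} nodes, one lost: quorum = {has_quorum(survivors, size)}")
```

With two nodes, the survivor holds exactly half of the votes and cannot tell a crashed peer from a network split, so it loses quorum. With three members (or two nodes plus an arbiter), the remaining two still form a majority.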
In order to fully integrate replication, Galera Cluster uses a series of advanced mechanisms including:
– Write sets that reduce the number of operations between nodes.
– Database state machines, where read-only transactions are run through a local node and write transactions are executed locally then broadcast as a read set.
– Transaction reordering, where transactions are reordered before they are committed and broadcast to the other nodes in the network.
– Group communication between nodes, which ensures consistency across the network. Galera Cluster uses GTID to achieve this consistency between nodes.
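To make the certification idea more concrete, here is a minimal Python sketch of a deterministic conflict check. It is a simplified model for illustration only, not Galera's actual implementation: the `WriteSet` and `Certifier` names, the use of plain row keys, and the sequence-number bookkeeping are all assumptions. The point it shows is that when every node receives the same write sets in the same order and applies the same test, every node reaches the same commit-or-abort verdict without further coordination.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WriteSet:
    txn_id: str
    last_seen_seqno: int  # last committed position the transaction saw
    keys: frozenset       # row keys the transaction modified


class Certifier:
    """Hypothetical per-node certifier; each node runs its own copy."""

    def __init__(self):
        self.seqno = 0         # position in the global order
        self.last_writer = {}  # row key -> seqno of last certified write

    def certify(self, ws: WriteSet) -> bool:
        # Every delivered write set takes the next slot in the global order.
        self.seqno += 1
        # Conflict: a key was changed by a write set this transaction never saw.
        for key in ws.keys:
            if self.last_writer.get(key, 0) > ws.last_seen_seqno:
                return False  # certification fails, transaction is rolled back
        for key in ws.keys:
            self.last_writer[key] = self.seqno
        return True


if __name__ == "__main__":
    # Two transactions touch row1 concurrently; a third touches row2.
    stream = [
        WriteSet("a", 0, frozenset({"row1"})),
        WriteSet("b", 0, frozenset({"row1"})),
        WriteSet("c", 0, frozenset({"row2"})),
    ]
    node1, node2 = Certifier(), Certifier()
    print([node1.certify(ws) for ws in stream])  # prints [True, False, True]
    print([node2.certify(ws) for ws in stream])  # prints [True, False, True]
```

Transaction "b" fails because "a" modified the same row after the state "b" had seen, and both simulated nodes reach that conclusion independently.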
The advantages of Galera Cluster
Working with Galera Cluster offers a range of benefits that are well worth considering. Some of these inherent benefits include:
– The ability to read and write to any node at any time.
– No data is lost if there is a node crash and there is absolutely no slave lag either.
– All of the nodes share the same data so they have the same state.
– Improved performance for many workloads thanks to the multithreaded slave functionality.
– There are no master/slave failover operations, so there is no need for an HA cluster manager such as Pacemaker.
– Drivers and applications can stay the same, as Galera Cluster is transparent to them.
– You don’t have to split the read and write requests.
– There is total WAN replication functionality.
The disadvantages of Galera Cluster
Although there are lots of inherent benefits to using Galera Cluster, there are drawbacks to consider too. One major issue is that you shouldn’t go live in production until your applications have been checked for compliance with the cluster’s limitations. Other disadvantages of Galera Cluster include:
– Storage engines other than InnoDB, such as MyISAM, are only partially supported. For full support, you must use InnoDB tables for now.
– The DELETE operation is not supported on tables without a primary key, so you need to define primary keys on your tables.
– Table lock and unlock functions are not supported.
– Global (XA) transactions are not yet supported.
– Query cache is disabled in Galera Cluster.
– Query logs can only be sent to a file, not to a table.
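As a rough illustration of the compliance check mentioned above, the following Python sketch flags tables that would run into the storage engine and primary key limitations. It is a hypothetical helper, not a Galera tool: the `TableInfo` structure and `find_issues` function are invented for this example, and in practice you would fill the table list from your own schema metadata.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Engines the article describes as fully supported.
FULLY_SUPPORTED_ENGINES = {"innodb", "xtradb"}


@dataclass
class TableInfo:
    name: str
    engine: str
    has_primary_key: bool


def find_issues(tables: Iterable[TableInfo]) -> List[str]:
    """Return human-readable warnings for tables that break the limits above."""
    issues = []
    for table in tables:
        if table.engine.lower() not in FULLY_SUPPORTED_ENGINES:
            issues.append(f"{table.name}: engine {table.engine} is only partially supported")
        if not table.has_primary_key:
            issues.append(f"{table.name}: no primary key, DELETE will not work")
    return issues


if __name__ == "__main__":
    schema = [
        TableInfo("orders", "InnoDB", True),
        TableInfo("sessions", "MyISAM", True),
        TableInfo("audit_log", "InnoDB", False),
    ]
    for warning in find_issues(schema):
        print(warning)
```

A check like this only covers schema-level limits. Application code that relies on table locks, global transactions, or the query cache still needs to be reviewed separately.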
Galera Cluster has the potential to add tremendous value to your work with a range of intuitive features that streamline clustering in your database. It’s not without its limitations, though. Hopefully, this guide has offered some key insights to help you decide if Galera Cluster is suitable for you.
If you would like to know more about clustering or need help with online hosting or server management, get in touch with our experts at catalyst2 today.