How many of you have heard about distributed coordination? I guess not many. But you may have heard about distributed computing. Distributed coordination is quite similar, except that it is used solely for coordination purposes.
Suppose you have a cluster of nodes and want a convenient means of communication among them. Telling each node which nodes are in the cluster, and asking every node to talk to every other node to gather information about the cluster, will be a real pain and will also flood your internal network (if the cluster is large). In such situations, what we usually propose is a mediator which acts as the hub of communication. But then we have a single point of failure. To avoid that, you can make the mediator node highly available through replication. But how much effort will you need to configure all of that by yourself?
That is the core idea of distributed coordination. To simplify these coordination tasks, there are systems that provide exactly the functionality described above. That is, they provide the means to coordinate (which is more than just communication) among the nodes of a cluster.
A highly available system is a set of computers that work, act and appear as one computer to the outside world. Yes, on its own that doesn't say much. Let's discuss it with an example.
Suppose a big company like Google or Facebook has to serve tens of thousands of requests per second. More importantly, they have to make sure that their service does not go down, even for a second. This matters even more for commercial systems, where a downtime of seconds can cause millions in losses for the company. In simple terms, availability (building fault-tolerant systems) is the major concern in these use cases.
So, distributed systems are the solution for high availability. Instead of keeping just one server to serve client requests, we use a cluster (collection) of interconnected servers (called nodes) along with load balancers (to uniformly direct requests to the nodes in the cluster). Now the failure of one node in the cluster will not bring down the entire service provided to the outside world, because more than one server/node is serving the requests. This approach makes the service highly available, without downtime visible to the outside world.
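To make the idea concrete, here is a minimal sketch of the load-balancing behaviour described above. This is a toy, not any real load balancer's implementation; the class name `RoundRobinBalancer` and the node names are made up for illustration.

```python
from itertools import cycle

class RoundRobinBalancer:
    """Toy load balancer: directs requests uniformly across cluster nodes.

    If a node fails, it is removed from rotation and the remaining
    nodes absorb its share of the traffic -- so the service as a whole
    stays up even though one server went down.
    """

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self._rotation = cycle(self.nodes)

    def next_node(self):
        # Pick the next node in round-robin order for the incoming request.
        return next(self._rotation)

    def mark_failed(self, node):
        # Drop the failed node and rebuild the rotation over the survivors.
        self.nodes.remove(node)
        self._rotation = cycle(self.nodes)

lb = RoundRobinBalancer(["node-1", "node-2", "node-3"])
print([lb.next_node() for _ in range(4)])  # requests spread: node-1, node-2, node-3, node-1
lb.mark_failed("node-2")
print([lb.next_node() for _ in range(2)])  # service continues on node-1 and node-3
```

Real load balancers also do health checks, weighting and connection draining, but the core availability argument is exactly this: no single node's failure stops the rotation.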
Now we have made our service available, but what about the complexities added by having different nodes provide the same service? We cannot use a central data store in this case, because it would be a bottleneck as well as a single point of failure. To address that, we may need a distributed data store.

So, a distributed system consists of a set of nodes providing the same service or different services. Whatever the case, each node in the cluster should be aware of the services provided by the other nodes and whether those nodes are available at a given instant. Beyond that, the nodes have to share common configurations and common runtime variables and values, and may have to store data in a distributed manner. We have heard of distributed data stores, which store data in a distributed manner (making the data store itself highly available). But what about the other scenarios, where we need to share variables, locks and configuration among nodes at runtime? This requirement is called distributed coordination. To handle it, we use implementations like Apache ZooKeeper (often via Apache Curator), etcd, Consul and Hazelcast, which provide distributed coordination and consensus.
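As a rough sketch of what "sharing locks at runtime" means, here is a toy in-process stand-in for a coordination service. In a real system (ZooKeeper, etcd) this state is replicated across servers by a consensus protocol; here a single dict and a mutex play that role. The class and key names (`ToyCoordinationStore`, `/locks/job`) are invented for illustration.

```python
import threading

class ToyCoordinationStore:
    """Toy stand-in for a coordination service such as ZooKeeper or etcd.

    Real systems replicate this key-value state across servers with
    consensus; here one dict guarded by a mutex simulates that single,
    agreed-upon view of the data.
    """

    def __init__(self):
        self._data = {}
        self._mutex = threading.Lock()

    def compare_and_set(self, key, expected, new):
        # Atomic check-then-write. This is the primitive that distributed
        # locks and leader election are typically built on.
        with self._mutex:
            if self._data.get(key) == expected:
                self._data[key] = new
                return True
            return False

    def get(self, key):
        with self._mutex:
            return self._data.get(key)

store = ToyCoordinationStore()

def try_acquire_lock(node_id):
    # The lock is just a key: whoever flips None -> their own id holds it.
    return store.compare_and_set("/locks/job", None, node_id)

print(try_acquire_lock("node-1"))  # True: node-1 acquired the lock
print(try_acquire_lock("node-2"))  # False: the lock is already held
```

Real coordination services add the hard parts this toy omits: replication, failure detection (e.g. releasing the lock when its holder crashes) and ordering guarantees.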
What is consensus?
The following description, taken from raft.io (the home of Raft, a consensus algorithm designed to be understandable), explains what consensus is in simple terms.
Consensus is a fundamental problem in fault-tolerant distributed systems. Consensus involves multiple servers agreeing on values. Once they reach a decision on a value, that decision is final. Typical consensus algorithms make progress when any majority of their servers are available; for example, a cluster of 5 servers can continue to operate even if 2 servers fail. If more servers fail, they stop making progress (but will never return an incorrect result).
In simple terms, consensus is how the nodes of a distributed system agree upon a value. Let's go directly into the implementations and each implementation's advantages and disadvantages.
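The majority rule in the quote above is simple arithmetic, which we can sketch directly (the function names are mine, not from any consensus library):

```python
def majority(cluster_size):
    """Smallest number of servers that forms a strict majority."""
    return cluster_size // 2 + 1

def tolerated_failures(cluster_size):
    """How many servers can fail while a majority can still be reached."""
    return cluster_size - majority(cluster_size)

for n in (3, 5, 7):
    print(f"{n} servers: majority = {majority(n)}, "
          f"tolerates {tolerated_failures(n)} failures")
# A 5-server cluster needs 3 to agree, so it keeps operating
# with up to 2 servers down -- exactly the example from the quote.
```

This is also why coordination clusters are typically deployed with an odd number of servers: going from 4 to 5 servers raises fault tolerance from 1 to 2 failures, while going from 3 to 4 raises it not at all.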
Distributed Coordination Systems
There are many popular distributed coordination systems available, like Apache ZooKeeper and etcd. All of these provide the above-mentioned functionality of a distributed coordination system without requiring users to worry about the consistency of the values they store in those distributed storages.
If we didn't have these kinds of systems, as mentioned earlier, we would have to spend far more time addressing the consistency issues of the values we store in a distributed manner for high availability. However, we will not discuss the implementations of distributed coordination systems in this article. For those who are interested, I have written another article describing Apache ZooKeeper and etcd. Thank you for reading, and I hope you got some idea of distributed coordination.
Further reading and references
- Apache Zookeeper Wiki
- Zookeeper Internals
- Serializability and Distributed Software Transactional Memory with etcd3
- Introduction to etcd 3 by CoreOS CTO Brandon Philips
- etcd3 API
- Apache Curator in 5 Minutes
- Network partitioning in zookeeper
- Zookeeper Dev Mailing list
- ZooKeeper Primer