Hi guys, I hope you are doing good and enjoying the articles so far.Today we are going to learn more about CAP theorem.This is a very important concept to understand when working with high scale distributed system with huge data.To start with lets discuss some of important terms that are used in the world of distributed computing.
1) Distributed computing : In this model of computing there are set of distributed nodes (servers, components) which are connected to each other through a network and talk to each other by passing messages. These messages can be simple Http requests, RPCs or may be messages flowing through a queue like SQS, Apache MQ, RabbitMQ etc.
2) Consistency – This refers to a feature of a distributed system which always show correct/latest data.For example your banking application, it will always show you correct balance in your account.It doesn’t matter how many request you send , it will always provide latest data.Stale data doesn’t exist in such systems.Even if it does, it won’t be used to serve end customer’s requests.
3) Availability – This refers to the ability of a system to be always available.For example social networking apps like facebook, twitter, instagram. These are always available.By available we mean that every customer request will be served always.
4) Partitioning – This usually refers to a geographically separated set of nodes which cannot talk to each other directly. For example a server which is located in Mumbai not directly networked with another server in Delhi, but they are part of one system when we see the bigger picture.
Now since we understand the basic terminology, let’s see what CAP theorem (also known as Brewer’s theorem) has to say :
“It is practically impossible for a distributed computing system to simultaneously provide all three of consistency, availability and partition tolerance.”
In a distributed environment a network failure is highly possible, which can create partitions in your system, which means that data packets would be lost, some of the nodes would be unreachable.Even in such cases, some distributed system need to be tolerant.On the other hand there are systems which can choose to be not partition tolerant and rather be consistent and highly available all the time. Also there are systems who can compromise on their consistency but need to be available and be partition tolerant such as Facebook.
Let’s now discuss, what all options exist :
1) CA but not P (Consistent, highly available but not partition tolerance) : When it is OK to loose some functionalities of your application due to network partition but NOT OK to be inconsistent and unavailable then you can choose a CA system.For example RDBMS (Oracle,MySQL) . A good example of such system would be a banking application. So when they face a network partition, they may not operate completely (you may not be able to use some features) but they will still chose to be consistent and available.
2) AP but not C (Available and partition tolerant but not consistent) : When it is OK to lose consistency in your application, but NOT OK to be unavailable or partition intolerant you can chose an AP system.A good example would be Facebook likes/comments or flight prices. At times it may happen that the number of likes that you see on you picture on Facebook is 10, but when you actually open the picture you see 20 likes.This inconsistent behavior is OK for such systems. It may not create a news if Facebook is showing incorrect number of comments, but it will definitely create a breaking news if it is down for 10 minute. Examples : CouchDB,Cassandra, Dynamo DB.
3) CP but not A (Consistent and partition tolerant but not highly available) : By now you would have understood, what could be a CP system.Just so that we are on the same page, a CP system would chose to be unavailable but not be inconsistent and partition intolerant.A very good example would be IRCTC website. They may go down for number of times a day, but they don’t want to be inconsistent. If at all there is a partition, they will go unavailable but chose to be consistent all the time. Example : MongoDB, Redis, HBase.
If a system wants to provide any two it has to compromise on the third one. In order to maintain consistency, systems may chose to be unavailable or be partition intolerant, but cannot achieve all three simultaneously (or any of the combinations as discussed above).
I hope this was helpful in understanding the basics of CAP theorem, please feel free to provide comments/feedbacks.
Cheers,
Kaivalya Apte.
Fantastic site. Lots of useful information here. I’m sending it to several pals ans also sharing in delicious. And of course, thank you to your sweat!