Let's say you decide to use Redis in your software system, but you can't decide where to deploy it in your architecture. Yes, it is a simple 2 MB application, but you are not just starting it up and doing some stuff with it. You also want to plug it into your system cleanly, without breaking other things you care about. This is where you grab some colored pens and some paper and draw your architecture and Redis's place in it.
Here I want to show you some designs and talk about their pros and cons.
Simple Master-Slave design
Most of the time, if you have one back-end server (the control panel) and several front-end servers (the user-facing interface), you will end up with one master and multiple slaves:
It is very simple and efficient for simple systems. You don't have to manage data consistency yourself, because Redis handles it internally with its own replication mechanism. You write data only to the master instance and only read from the slave instances. Both slaves will hold the same data, because only the master instance pushes data to them. When we add client requests to the diagram, it looks like this:
As the diagram above shows, writes and reads happen on separate machines, and the slaves are consistent by design. Red arrows show write operations and their direction; blue arrows show read operations and their direction.
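To make the write/read split concrete, here is a minimal Python sketch. It is only an illustration under assumptions: FakeRedis is an in-memory stand-in so the example runs without a server, ReadWriteRouter is a hypothetical helper (not part of Redis or any client library), and replicate() imitates by hand what Redis replication does for you.

```python
import itertools


class FakeRedis:
    """In-memory stand-in for a Redis client, used only to keep this sketch runnable."""

    def __init__(self):
        self.data = {}

    def set(self, key, value):
        self.data[key] = value
        return True

    def get(self, key):
        return self.data.get(key)


class ReadWriteRouter:
    """Hypothetical helper: writes go to the master, reads rotate over the slaves."""

    def __init__(self, master, slaves):
        if not slaves:
            raise ValueError("at least one slave is required")
        self.master = master
        self._slaves = itertools.cycle(slaves)

    def set(self, key, value):
        # Red arrows in the diagram: every write goes to the master.
        return self.master.set(key, value)

    def get(self, key):
        # Blue arrows in the diagram: reads are spread over the slaves.
        return next(self._slaves).get(key)


def replicate(master, slaves):
    """Imitate replication by copying the master's data to every slave."""
    for slave in slaves:
        slave.data = dict(master.data)


master = FakeRedis()
slaves = [FakeRedis(), FakeRedis()]
router = ReadWriteRouter(master, slaves)

router.set("page:home", "<h1>Hello</h1>")
replicate(master, slaves)  # real Redis does this on its own, asynchronously
first_read = router.get("page:home")   # served by the first slave
second_read = router.get("page:home")  # served by the second slave
```

In a real deployment the master and slave objects would be actual client connections, and the application would never call anything like replicate() itself.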
Quick Note: Redis lookup time complexity is O(1) for a single key in GET, SET and DEL operations.
Quick Note: Redis's power comes from its own built-in hash table optimization and its active rehashing feature.
This design brings us some advantages, but we may also face some issues with it.
+ Read and write operations run on different machines.
+ Slaves hold consistent data.
+ You can deploy your Redis instances anywhere in your architecture; they don't have to be on the application servers.
+ Slaves can be managed from one place (for example, pushing data to servers or changing configurations).
+ When the number of front-end servers grows, you can easily add one more Redis slave and replicate to it.
– Data replication costs more system resources: you need more RAM (or storage) on your servers.
– Too many writes will become a bottleneck for a single master, for sure.
– A big burst of writes can bottleneck the whole related network. If data persistence is involved, it will also spike RAM, CPU and disk read/write, which may block other operations.
– Replication is asynchronous, so there is no strict consistency guarantee between master and slaves. (Don't worry, in practice they stay consistent.)
– If data loss happens on the master, you lose that data on the slaves as well, as a natural result of replication.
Slave of Slave design
Quick Note: Redis is a single-threaded server. This means that a long-running command blocks other commands from being processed.
Suppose you need a heavy key search on your Redis instance, or you want to profile your Redis system. Running long blocking commands on the machines that serve your traffic would stall your applications. In that case, you can replicate any slave to a newly deployed slave and send the computationally heavy operations to the new instance. This way you don't block any operation of the running applications, and the commands are processed on another instance.
Quick Note: If you set the slave-read-only config to no, you need to delete expired keys yourself; otherwise memory consumption will grow over time until the eviction policy steps in. A Redis slave only deletes keys when the master instance sends it a delete command.
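A tiny Python sketch shows why one slow command hurts everything queued behind it on a single-threaded server. The durations are made-up numbers for illustration, and completion_times is a hypothetical helper, not a Redis API.

```python
def completion_times(commands):
    """Return {name: finish_time_ms} for commands run one at a time, in order.

    `commands` is a list of (name, duration_ms) pairs. A single-threaded
    server cannot start a command before the previous one has finished.
    """
    clock = 0
    finished = {}
    for name, duration_ms in commands:
        clock += duration_ms
        finished[name] = clock
    return finished


# The heavy key search runs on the same instance as the application's reads.
shared = completion_times([("KEYS *", 500), ("GET a", 1), ("GET b", 1)])

# The heavy key search was sent to a separate slave instead.
offloaded = completion_times([("GET a", 1), ("GET b", 1)])
```

With these numbers, "GET a" finishes at 501 ms on the shared instance but at 1 ms once the heavy command has been moved away.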
+ Computationally heavy operations don't impact your main structure.
+ Domain-specific operations can be split across different instances (e.g. machine learning, analytics, intensive sorting, a high number of requests).
+ It can be built as a separate structure in a separate branch.
– It still depends on its master.
– If its master loses data, it loses the data that came from that master too.
It is essential for a system to have a disaster recovery plan. In the Redis world, our lifesaver is the Sentinel mechanism. Basically, sentinels monitor the Redis masters, and when communication with a master is lost, one of its slaves is selected and promoted to master.
When we think about the number of sentinels, we see:
One sentinel is definitely not enough; there is no double-check.
Two sentinels are still not enough, because if they disagree there is no majority to settle the decision.
Three sentinels are basically OK, as long as the physical servers that run them are in different locations. It doesn't matter whether they are one meter apart or in different regions of the world.
For more serious disaster recovery plans, you have to put some of those instances and sentinels in highly available data centers such as the cloud. (Sounds funny.)
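The counting argument above can be checked with a few lines of Python. This sketch assumes a failover needs agreement from a majority of all sentinels; the function names are mine, not Sentinel's.

```python
def majority(total_sentinels):
    """Smallest number of sentinels that is more than half of the total."""
    return total_sentinels // 2 + 1


def tolerated_failures(total_sentinels):
    """How many sentinels can be unreachable while a majority still exists."""
    return total_sentinels - majority(total_sentinels)


# Maps sentinel count -> (majority needed, failures tolerated).
summary = {n: (majority(n), tolerated_failures(n)) for n in range(1, 6)}
```

One or two sentinels tolerate zero failures, three tolerate one, and a fourth sentinel adds nothing over three.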
Um, OK, it looks scary. There is too much noise in this diagram, and a lot of infrastructure clutter. But stay cool: all these pings are handled by the sentinels, so naturally you won't feel or see any of this sentinel communication (whispers, in Redis terms). Also, you only register the master instances with the sentinels, and the sentinels find the slaves automatically. But for sure, running more sentinels than necessary puts more data packets on your network. I mean, don't let your system choke itself. The aim of this diagram is simply to show you what is going on in the background.
What is split-brain? It is a computing term borrowed from the medical split-brain syndrome. It simply means one system splitting into multiple subsystems that don't know about each other. In our simple case, let's say our Redis system is divided into two subsystems. Neither subsystem knows whether the other one is still working, so each one decides to start its own recovery plan. Later, when the network becomes stable again, the subsystems cannot agree on which one should be the master. Choosing the true master is very important here, because whatever the other master has written since the separation will be lost.
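The data loss can be illustrated with a small Python simulation. Plain dicts stand in for each instance's dataset, and heal_partition is a hypothetical helper that imitates what happens to the losing master. It is a sketch of the idea, not of Redis internals.

```python
def heal_partition(old_master, new_master):
    """Demote the old master and report which of its writes are discarded.

    Both arguments are plain dicts standing in for each instance's dataset.
    Returns (resynced_old_master, lost_writes).
    """
    lost = {k: v for k, v in old_master.items() if new_master.get(k) != v}
    # Once demoted, the old master's dataset is replaced by a copy of the new master's.
    return dict(new_master), lost


# At the moment the network splits, both sides hold the same data.
before_split = {"user:1": "alice"}
old_master = dict(before_split)
new_master = dict(before_split)  # a slave promoted on the other side

# During the split, each side accepts writes from the clients that can reach it.
old_master["cart:1"] = "book"
new_master["cart:2"] = "pen"

old_master, lost_writes = heal_partition(old_master, new_master)
```

After the heal, the write to cart:1 exists nowhere, while cart:2 survives on both instances.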
After the network becomes stable again, the sentinels reach the old master and add it as a slave of the newly selected master. Unfortunately, the old master's data will be lost here. There is one more thing in this diagram: how the red arrows changed. Since that is part of your application configuration, you have to correct it in your application yourself. Otherwise you will keep getting connection errors in your application.
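One way to keep the application's write target up to date is to listen for Sentinel's +switch-master event, whose payload has the form "<master name> <oldip> <oldport> <newip> <newport>". The Python sketch below only covers parsing that payload and swapping the address. WriteTarget, the master name "mymaster" and the addresses are hypothetical, and actually subscribing to a sentinel is left out.

```python
from typing import NamedTuple


class MasterSwitch(NamedTuple):
    name: str
    old_address: tuple
    new_address: tuple


def parse_switch_master(payload):
    """Parse '<master name> <oldip> <oldport> <newip> <newport>'."""
    name, old_ip, old_port, new_ip, new_port = payload.split()
    return MasterSwitch(name, (old_ip, int(old_port)), (new_ip, int(new_port)))


class WriteTarget:
    """Hypothetical holder for the address the application writes to."""

    def __init__(self, name, address):
        self.name = name
        self.address = address

    def on_switch_master(self, payload):
        switch = parse_switch_master(payload)
        if switch.name == self.name:  # ignore events for other monitored masters
            self.address = switch.new_address


target = WriteTarget("mymaster", ("10.0.0.1", 6379))
target.on_switch_master("mymaster 10.0.0.1 6379 10.0.0.2 6379")
```

After the event is handled, the application would open its write connections against target.address instead of a hard-coded master.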
For different domains you need to configure Redis differently. You should check the configuration options carefully and find the best fit for your system. For example, caching and data persistence are different domains. You may run it on Windows or Linux, and there are some platform-specific configs available. Replication strategy is another thing to optimize. Which side of the CAP theorem do you prefer? Do you need more reads or more writes? Simply put, as in most cases, the no-free-lunch theorem shows up here.
Here I can only offer some small notes:
+ Run it as a service (on whatever platform). It is a must.
+ Always set logfile, dir and heapdir (on Windows). It is a must.
+ Definitely set maxmemory. It is a must.
+ If you have many instances, use a template config file and include it in each instance's config file.
+ If you want to create your own Redis farm, use host names, not IP addresses; use the bind config with them and get rid of IP addresses in your applications.
+ When you are scratching your head trying to find a bug, remember you can start a new slave with the loglevel debug config. In production, keep it at notice.
+ Always enable syslog. Sometimes Redis cannot catch low-level errors.
+ If you don't need snapshots, disable them; otherwise don't overuse them. Remember you can run snapshot operations on another slave so they don't lag your actual system.
+ When you need to store data on the filesystem, use the beautiful RDB format.
+ Don't trust your firewall; set a long password on your Redis instances. (Remember system admin attacks.)
+ Put the commands you don't use into your template with rename-command and disable them.
+ If your number of clients keeps increasing, use maxclients (and try to find where the flaw is).
+ Use slow logs on a slave instance and do your profiling there.
+ Set each slave's priority according to its physical machine.
+ Use an odd number of sentinels, and more than two.
+ For good failover handling, write your own application or shell script so that your system is notified instantly.
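Several of these notes can be collected into one template. The Python sketch below renders such a template as text. render_config is a hypothetical helper, and every path, limit and password is a placeholder rather than a recommendation.

```python
def render_config(settings, disabled_commands=()):
    """Render redis.conf lines from a dict of directive -> value."""
    lines = [f"{directive} {value}" for directive, value in settings.items()]
    # Renaming a command to the empty string disables it.
    lines += [f'rename-command {command} ""' for command in disabled_commands]
    return "\n".join(lines) + "\n"


# All values below are placeholders for illustration.
template = render_config(
    {
        "logfile": "/var/log/redis/redis.log",
        "dir": "/var/lib/redis",
        "maxmemory": "256mb",
        "maxclients": 1000,
        "loglevel": "notice",
        "syslog-enabled": "yes",
        "requirepass": "replace-with-a-long-random-password",
    },
    disabled_commands=["FLUSHALL", "KEYS"],
)
```

The resulting text could be saved as the shared template file and pulled into each instance's own config with the include directive.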
Redis is huge, even though its file size is not even 2 MB. The things you can do with Redis are priceless. I didn't even mention the messaging side of it, not a word. So what I am going to say is this: whether your application is big or small is not important. If Redis fits your domain somehow, use it. Don't try to make things complicated (that is also in the Redis manifesto). As long as you configure it right for your domain, you will get good results for sure. I only tried to scratch the surface here. Hope you liked it. I have some other things to say, but that's enough for today.