Performance is always a headache in web development. Sooner or later in the life cycle of your application, you will need to tune up the app to make it run faster, to be able to serve more users simultaneously or just to make sure it does not eat up all server resources.
One of the first things to look at when talking about performance is the database. Most performance issues are in some way related to the database or its SQL queries, so optimizing the database can bring a huge improvement.
However, after several rounds of optimization, you might find it almost impossible to squeeze anything more out of the existing MySQL or PostgreSQL database. Adding a new index? Done. Adding a counter cache? Done. You have done everything you can to optimize the database, but performance is still poor. In that case, you might need to think about restructuring the database a little bit. You can start by asking yourself questions like: "Do I really need to run this SQL query? Can I cache this SQL query somewhere in memory? Do I really need this MySQL table at all for this functionality?" In today's post, I would like to introduce you to Redis, a NoSQL database that can be very helpful in many cases. It is extremely powerful if you know how to make the most of it.
An overview of Redis
Redis is a key-value database: every piece of data is stored under a key, and the value can be a string or one of several built-in data structures such as lists, hashes, sets and sorted sets. If you know how powerful the Hash in Ruby is, you already have a feel for the key-value model. Because of this structure, Redis is extremely fast: it looks data up directly by key instead of scanning for it. Here are some interesting benchmarks: https://redis-docs.readthedocs.org/en/latest/Benchmarks.html. At a quick glance, Redis can process a hundred thousand requests in less than a second. That's amazing.
In Rails, we can use the redis gem (http://www.rubydoc.info/gems/redis/3.2.1) to connect our application to a Redis server. It is easy to use and well documented, so you will get familiar with it very quickly. One way to set up a Redis object for use in a Rails application is to create an initializer, for example config/initializers/redis.rb, with the following code:

    $redis = Redis.new(:host => "localhost", :port => 6379)
We often make $redis a global object because we want to be able to use it everywhere in the app.
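With a client in place, one simple use is the in-memory caching of query results mentioned at the start of this post. The following is only a minimal sketch under a few assumptions: cached_fetch is a hypothetical helper name (it is not part of the redis gem), values are assumed to be JSON-serializable, and redis is any client that responds to get and setex, such as the $redis global.

```ruby
require "json"

# Hypothetical helper, not part of the redis gem.
# Returns the value cached under `key`. On a miss it runs the block,
# stores the JSON-encoded result for `ttl` seconds, and returns it.
def cached_fetch(redis, key, ttl)
  cached = redis.get(key)          # nil when the key is missing or expired
  return JSON.parse(cached) unless cached.nil?

  json = JSON.generate(yield)      # run the expensive work only on a miss
  redis.setex(key, ttl, json)      # SETEX stores the value with an expiry
  JSON.parse(json)                 # same shape as a later cache hit
end
```

For example, calling cached_fetch($redis, "top_user_ids", 60) with a block that runs an expensive query would hit the database roughly once a minute and serve every other call from Redis. Note that values come back as parsed JSON, so hash keys are strings.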
One more thing: many people say that Redis is just an in-memory database, but you should be aware that it also supports saving the data to disk for backup or other purposes (http://redis.io/topics/persistence).
When should we use Redis
Based on my own experience, here are several features where you can consider using Redis instead of relying on a traditional relational database.
- Tracking online/offline users
Some might say this is unnecessary because a library like devise (https://github.com/plataformatec/devise) already supports it through the last_sign_in_at column: if last_sign_in_at is within the last 10 minutes, for example, we can say that the user is online. However, in this context I want a real-time online/offline tracking model rather than that approximation. More specifically, once a user closes all tabs of the app in their browser, that user should be considered offline.
There are several ways to track whether a user is online, but the most conventional one is to send AJAX requests to the server periodically to let it know that the user is still using the app. This method has one drawback: it consumes a lot of server resources if we don't handle the request efficiently. The request should be lightweight and extremely fast. Imagine using MySQL for this: we might need to issue one or several SQL queries in the action. With thousands of users online at a time, that means thousands of SQL queries issued periodically on the server, which is really bad.
Using Redis takes this burden off the MySQL database. Redis provides many data types; let's use a Sorted Set for this purpose:
- key: online_user_ids
- members: user ids, each with a timestamp as its score
We accept that any user who has not made a request to the server within the last 5 seconds will be considered offline. Assume that the browser sends a request to the server every 5 seconds.
To keep track of online users, I use the following code in the action handling that request:

    # add user_id to list, Time.now.to_i is the score
    $redis.zadd("online_user_ids", Time.now.to_i, user_id)
    # remove offline users (inactive for more than 5s)
    $redis.zremrangebyscore("online_user_ids", 0, 5.seconds.ago.to_i)

Explanation: we add the current user to a sorted set (http://redis.io/commands#sorted_set) with the timestamp as the score, and then remove the user ids whose score (timestamp) is more than 5 seconds old. Both commands are processed very quickly by the Redis server, so the action stays cheap even under a large number of requests.
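The two commands above can be wrapped in a small class that also answers "who is online right now?". This is only a sketch: OnlineTracker and its method names are hypothetical, the now: parameter exists purely to make the logic easy to test, and redis is assumed to be a client responding to zadd, zremrangebyscore, zrangebyscore and zscore (for example $redis).

```ruby
# Hypothetical wrapper around the online_user_ids sorted set.
class OnlineTracker
  KEY = "online_user_ids"

  def initialize(redis, timeout: 5)
    @redis = redis
    @timeout = timeout # seconds of silence before a user counts as offline
  end

  # Call this from the action that handles the periodic request.
  def touch(user_id, now: Time.now.to_i)
    @redis.zadd(KEY, now, user_id)                  # refresh the timestamp
    @redis.zremrangebyscore(KEY, 0, now - @timeout) # drop stale users
  end

  # Ids (returned as strings) of users seen within the timeout window.
  # "(" makes the lower bound exclusive, matching the cleanup in #touch.
  def online_ids(now: Time.now.to_i)
    @redis.zrangebyscore(KEY, "(#{now - @timeout}", "+inf")
  end

  def online?(user_id, now: Time.now.to_i)
    score = @redis.zscore(KEY, user_id) # nil when the user is not in the set
    !score.nil? && score > now - @timeout
  end
end
```

Filtering by score in online_ids and online? means the answer stays correct even when no request has triggered a cleanup for a while. In practice you may also want the timeout to be somewhat longer than the ping interval, so that a slightly delayed request does not briefly mark an active user as offline.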
- Ranking statistics
In this scenario, we want to rank the data in some way. Let's think about an example. Each user in the application is assigned a score saved in the database, and we want to know the top 10 users ranked by score as well as the ranking position of an arbitrary user in the system. How can we do this?
The first task is easy with SQL: ORDER BY and LIMIT clauses are enough. The second task is trickier. It is not easy to get the position of a row among all the records stored in the DB, and even where there is a way, it is often complex and can hurt app performance.
With Redis, we can just store the score of each user in a Sorted Set. Redis supports retrieving items in order and even looking up the ranking position of an item:

    # add the user with id 10 to the `user_score` set
    # with score 11499
    $redis.zadd("user_score", 11499, 10)
    # get the position of user 10, counting from the highest score (0-based)
    rank = $redis.zrevrank("user_score", 10)
    # get the top 10 users with the highest scores
    user_ids = $redis.zrevrange("user_score", 0, 9)

You can also re-sort, remove, or update the scores of users. Redis is very flexible and there are many commands available. Please refer to the Redis documentation for more details.
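As a rough illustration of how these sorted set commands fit together, here is a small leaderboard sketch. The Leaderboard class and its method names are hypothetical; redis is assumed to be a client responding to zadd, zincrby, zrevrange and zrevrank, such as $redis. Sorted sets order members from lowest to highest score, so the "rev" variants are used to rank from the highest score down.

```ruby
# Hypothetical leaderboard built on a single sorted set.
class Leaderboard
  def initialize(redis, key = "user_score")
    @redis = redis
    @key = key
  end

  # Set (or overwrite) the score of a user.
  def set_score(user_id, score)
    @redis.zadd(@key, score, user_id)
  end

  # Add points to the current score; a missing user starts from 0.
  def add_points(user_id, points)
    @redis.zincrby(@key, points, user_id)
  end

  # Top `count` entries as [member, score] pairs, highest score first.
  def top(count = 10)
    @redis.zrevrange(@key, 0, count - 1, with_scores: true)
  end

  # 1-based ranking position, or nil if the user has no score.
  def position(user_id)
    rank = @redis.zrevrank(@key, user_id) # 0-based, nil when missing
    rank && rank + 1
  end
end
```

Members come back from Redis as strings and scores as floats, so convert them before comparing against integer ids from the relational database.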
- Queues
Redis supports a List structure (http://redis.io/commands#list) with push and pop commands, which makes it very suitable for a queue. If any feature of your app needs a queue, think about using Redis. There is a nice article describing how to build a message queue system here: http://big-elephants.com/2013-09/building-a-message-queue-using-redis-in-go/
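To give a feel for the idea, a first-in, first-out queue on top of a Redis list could look roughly like the sketch below. SimpleQueue, its method names and the "jobs" key are hypothetical, payloads are assumed to be JSON-serializable, and redis is any client responding to rpush, lpop and llen.

```ruby
require "json"

# Hypothetical FIFO queue: push on the right, pop from the left.
class SimpleQueue
  def initialize(redis, key = "jobs")
    @redis = redis
    @key = key
  end

  # Append a JSON-serializable payload to the end of the list.
  def push(payload)
    @redis.rpush(@key, JSON.generate(payload))
  end

  # Remove and return the oldest payload, or nil when the queue is empty.
  def pop
    raw = @redis.lpop(@key)
    raw && JSON.parse(raw)
  end

  # Number of payloads currently waiting.
  def size
    @redis.llen(@key)
  end
end
```

A worker built on this sketch would have to poll pop in a loop. Redis also offers the blocking BLPOP command, which waits until an item arrives, and that is usually a better fit for a long-running consumer.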
As you might already know, many libraries for handling background jobs use Redis to store their tasks. Resque (https://github.com/resque/resque) and Sidekiq (http://sidekiq.org/) are two famous ones. I won't walk through them here; search for "Redis queue" on Google and you will find many interesting articles to read.
There are obviously more places where you can apply Redis in your application; it largely depends on your experience and the nature of the app. In my opinion, if you are looking for a highly scalable data store that can be spread across multiple servers, or a great caching layer for your app, Redis is perhaps the best choice.