Monday, January 2, 2012

CouchDB and Mobile Social Network Apps

The "What happened since last time I checked?" problem

About a year ago I started looking at NoSQL technologies and how they could be used to provide a scalable backend for Mobile apps. I played a bit with MongoDB and found it interesting: schema-less, flexible, a nice query language, no need to wrestle with inflexible relational database schemas, etc. I learned you could easily use MongoDB as a replacement for the database layer in a SOA architecture and be more scalable than with SQL architectures if you structure things properly (e.g. creating shards with replica sets). From the developer's perspective, you still need to build a service layer with endpoints to be invoked from Mobile SDKs, to run the business logic for your app and store the data in the DB layer. It works; it's a well-known architecture used in web service development. However, in addition to other solvable problems we will talk about later, there's one particular problem that is sometimes quite tricky (although never impossible) to solve using this traditional approach. It shows up over and over again in Social Network and Gaming environments and is summed up in one simple question: "What happened since the last time I checked?" (e.g. Who commented on my picture? Do I have new emails in my inbox? Was there any breaking news in the last 30 minutes?)

The complexity of the solution depends directly on the complexity of the events and data model tracked by the application and, more importantly, on the volume of users trying to access that information. In general, you'll have to track a history of changes across the data model and write logic that provides a feed of the changes that happened since a given checkpoint (e.g. a time-stamp, a sequence id, etc). If the data model is fairly complex you'll have to carefully design and implement the logic for the "delta extraction of changes" out of that data model. Scalability and performance are among the main concerns, since each user may potentially have a different checkpoint, so the provider of this kind of solution needs to make sure the different delta queries the layer can receive are indexed properly. Building this on top of traditional architectures is perfectly possible, but as the infrastructure needs to scale, challenges have to be creatively faced and solved (Facebook's infrastructure is the best example of how brilliantly traditional solutions were transformed into a massively scalable infrastructure, but at the cost of having to invent new technology to overcome those limitations).
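To make the "delta extraction" problem concrete, here is a minimal sketch of the traditional approach: every change is mirrored into an event log and clients poll with their last-seen checkpoint. All names here (Event, events_since) are illustrative, not from any real framework.

```python
from dataclasses import dataclass

@dataclass
class Event:
    seq: int          # monotonically increasing sequence id (the checkpoint)
    user_id: str      # whose feed this event belongs to
    kind: str         # "comment", "like", "news", ...
    payload: dict

def events_since(log, user_id, checkpoint):
    """Return the events for user_id newer than the client's checkpoint."""
    # In a real system this scan must be backed by an index on
    # (user_id, seq); per-user checkpoints are what make that index hot.
    return [e for e in log if e.user_id == user_id and e.seq > checkpoint]
```

Every new event type or relationship in the data model has to be fed into this log by hand, which is exactly the bookkeeping CouchDB replication makes unnecessary.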

CouchDB provides something fundamentally different

After playing with MongoDB for a while, I started looking at CouchDB, thinking at first that it was going to be another schema-less, JSON-document-based NoSQL implementation, similar to MongoDB. But there was a feature CouchDB provided that other NoSQL engines didn't: replication. A very simple concept: you are given an API that synchronizes two databases, using an underlying service that automatically computes the "delta" between them. That way you can store your documents, update them and delete them without having to worry about tracking changes for replication purposes. And it works both ways (you can have master/slave or master/master architectures). Oh, and by the way, CouchDB is written in Erlang, a language designed to run on the small devices used in the telecommunications industry, so it should be possible to port it to smartphones or other devices with CPU and memory constraints. Huh, so you can run a CouchDB instance on a mobile phone? And you can replicate and synchronize that instance with a server instance using the replication mechanism? Hmm, to me, that looked like a promising way of communicating data between a mobile client and a service in the cloud. Soon I found that Couchbase Mobile already existed: an implementation of CouchDB that runs on the iOS and Android platforms. When I found out about it, back in April 2011, Couchbase Mobile was still in beta, but it was mature enough to run and start doing some real testing on iOS devices. This is the summary of my experience playing with this architecture, some months later.
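For reference, triggering a replication in CouchDB is a single POST to the /_replicate endpoint. A minimal sketch of the request body, with made-up host and database names:

```python
import json

def replication_request(source, target, continuous=False):
    """Build the JSON body for a POST to CouchDB's /_replicate endpoint."""
    body = {"source": source, "target": target}
    if continuous:
        # Keep the replication running, pushing changes as they happen.
        body["continuous"] = True
    return json.dumps(body)

# Device pushes its local database to a (hypothetical) server replica:
push = replication_request("local-userdb",
                           "https://server.example.com/userdb-alice")
```

Swapping source and target gives you the pull direction; running both gives master/master.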

Event Driven Social Networks


The first advantage of using CouchDB is the removal of the service layer on the server side. That does not mean there is no business logic involved in this architecture, since CouchDB is still just a database layer; but the business logic that was hidden in the service layer of traditional architectures now gets spread out across different components. The first component is the Mobile SDK. If you're writing an iOS app, you'll need to write an iOS data access layer that offers the rest of the app access to the local CouchDB database. Think of it as the code that would write data to a local SQLite database, either through Core Data or the SQLite API directly. This layer encapsulates the business logic involved in validating the data and storing it accordingly in the local CouchDB documents. Once the App is ready to publish data entered by the user to the outside world, the code just needs to invoke a single method, the Replication API. That makes sure that data entered locally on the device ends up in a database replica on the server. You don't need to keep track of unsynchronized changes; CouchDB manages that for you and efficiently transmits only the data that hasn't made it to the server yet. The replication works both ways, so if data gets entered on the server side, the device can pull it using the same Replication API. Ok, so we have some logic to store data in a local database, and then that data gets stored on the server via replication. Nice, but now what? How can we run server-side logic to consume that data and react to it? For instance, in a Social Network application, the user may have entered some comments, or reached a new level in a game, or may have earned some points. When that data gets replicated to the server, we may want to react to those events so we can communicate them to the user's friends, or take actions that need additional information not available locally on the device. For instance, we may have server-side rules that can be centrally managed, so we can alter the behavior of the app dynamically without having to upgrade it on the device. E.g. if the user reaches 1000 points in a game, we grant a free level this week as a promotion, but we turn that promotion off next week. Here's where an Event Driven architecture is needed, and this is how I found out that CouchDB is a great tool to build one.

Filtered Replication and why it does not scale

The architecture described above seems simple enough and easy to put into practice, but soon we face some challenges. It's easy to see that a 1-1 mapping between the server-side and mobile CouchDB instances does not make sense in a Social Network infrastructure. E.g. why would I store, in a user's mobile database, all the relationships and documents of people the user doesn't even know? The device needs to store data for the user or users of that device, and just that data. When I started to look at this problem, I focused on Filtered Replication: the Replication API accepts a parameter in which you can specify a dynamic rule so that only the documents meeting that rule are replicated. That could be used to store a subset of the server-side database on the mobile side of the replication. However, for each user, the server would have to maintain a different set of indexes (Views, in CouchDB terminology) to perform the filtered replication efficiently, which would lead to ever-growing storage and computational costs. In addition, unexpected concurrent updates of the indexes for a large number of users could easily bring the infrastructure down. During initial tests I already started to see these performance issues, and started to think that the whole CouchDB Mobile idea was cool but not very well suited to Social Network implementations. So, just when I was about to give up, CouchConf came and opened my eyes...
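As a sketch of what filtered replication looks like (and where the cost sits): the filter is a JavaScript function stored in a design document, and each device names it in its replication request. Design doc name, database names and the owner_id field are all illustrative.

```python
# A design document with a replication filter. The JavaScript function is
# embedded as a string inside the JSON document, as CouchDB expects.
design_doc = {
    "_id": "_design/mobile",
    "filters": {
        # This runs on the SERVER for every changed document, once per
        # replicating user -- the per-user work that does not scale.
        "by_user": """function(doc, req) {
            return doc.owner_id === req.query.user_id;
        }"""
    }
}

# The body a device would POST to /_replicate for a filtered pull:
replicate_body = {
    "source": "https://server.example.com/social",   # hypothetical host
    "target": "local-userdb",
    "filter": "mobile/by_user",
    "query_params": {"user_id": "alice"},
}
```

Every device replicating with a different user_id forces the server to re-evaluate the filter against the full change history, which is the load problem described above.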

CouchConf to the rescue!

Lucky me, I attended the San Francisco CouchConf last July, where Chris Anderson ran a session presenting ideas for building Social Network apps with CouchDB, and talked about the particular problems that Filtered Replication brings to those architectures. That session blew my mind in terms of how to leverage replication further to solve the problem better. The idea was simple: if you want to filter the data so that only information concerning the user of the device gets replicated, don't run those filtering queries from the device (using Filtered Replication), since you have no control over the server-side workload. Instead, you can pre-calculate the user-specific views on the server and store them in smaller databases, offering full 1-1 replication to the clients. That led to the concept of a User-Specific CouchDB instance on the server side. That instance is a mirror of the database running on the device, so the device can do full, non-filtered replications, which are much more efficient than filtered ones. So now there was "only" one problem left to solve: how to create, maintain and update those server-side User-Specific databases. Here's where CouchDB replication and its underlying technology came in handy once again.

The Changes API

The Changes API is the underlying technology used by the Replication API in CouchDB. It will provide a continuous (push) or polling-based (pull) feed of the changes made to a given database. The server side logic for this architecture will use the Changes Feed as the source of events to trigger business logic. The component can be implemented as a background process/agent that will periodically poll the Changes API or will listen to the Continuous Feed. The Changes Feed Agent will have the following responsibilities:
  • Event detection. An agent will usually take care of a family of events it specializes in. The Agent will look into the document that received the change and decide whether an event can be inferred from it.
  • Gathering additional context. The agent will gather the additional information needed to handle the event (from the document itself, or from other documents in the same or other databases).
  • Event notification. The agent will encapsulate the information in an event object and publish it to an Enterprise Bus, so different Observer Agents can react to it.
There can be multiple agents listening to the changes feed, each of them specialized in a particular family of events, and there can be multiple observing agents that perform different activities in reaction to the events. This architecture can also be simplified by removing the Event Bus and Observer Agents and making the Changes Feed Agent do both event detection and handling (however, that approach serializes the handling of events, which can lead to a non-scalable architecture if the network has lots of activity).
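The three responsibilities above can be sketched as a small polling agent. Here fetch_changes stands in for an HTTP GET to /dbname/_changes?since=N&include_docs=true; the "comment" document type and the event shape are made-up examples.

```python
def run_agent(fetch_changes, handle_event, since=0):
    """Poll the changes feed once, dispatch events, return the new checkpoint."""
    feed = fetch_changes(since)       # shaped like {"last_seq": N, "results": [...]}
    for change in feed["results"]:
        doc = change.get("doc", {})   # present when include_docs=true is used
        # 1. Event detection: this agent specializes in "comment" documents.
        if doc.get("type") == "comment":
            # 2./3. Gather context and publish an event object to the bus
            #       (here, handle_event stands in for the bus publish call).
            handle_event({"kind": "comment", "doc_id": change["id"], "doc": doc})
    # Persist last_seq so the next poll only sees newer changes.
    return feed["last_seq"]
```

A continuous-feed version would keep an HTTP connection open instead of polling, but the detection/dispatch logic stays the same.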

Ok, enough abstract talking, let's walk through a possible example. Usually, in a Social Network app, I want to make sure my friends can see the comments I make on their posts or on my wall. So, let's see how the flow could work:

  1. I open the App and write a comment on one of my friends' posts.
  2. The comment gets stored in a document in the local CouchDB instance on my device. The document contains the contents of the comment, the id of the person making the comment (me), the id of the person receiving it, the id of the commented post, etc…
  3. The App will, at some point or periodically, do a push replication to the Remote User-Specific Database, which is a mirror of the CouchDB instance running on the user's device. It's important to note that the server database needs to be named after the id of the user using the App, so there's a 1-1 replication.
  4. The App saves a flag in a global database indicating that my user-specific database has been modified. This lets the Changes Feed Agent listen to just one database; otherwise we would potentially need thousands or millions of agents listening to all the server-side user-specific databases, which would create a huge constant load on the server side.
  5. The Changes Feed Agent detects that my user-specific database has changed, so it triggers a full replication between my user-specific database and the Global Database. The Global Database stores a replica of all documents from all user-specific databases. For security reasons, we need to make sure the Global Database sits inside the firewall so it can only be accessed by the server-side agents. A device can only access the user-specific database of the user logged in to the App.
  6. After the replication to the Global Database has finished, the Changes Feed Agent detects that a new "comment" document has been posted, so it creates an event object encapsulating all the document information (user ids involved, document ids involved, type of event, etc) and publishes it to the System Event Bus.
  7. A Comment Sharer Agent listening to the Bus consumes the "Comment" event and replicates the comment document (the Replication API supports replicating a single document) from the Global Database to the remote user-specific database belonging to my friend's id (the one receiving the comment).
  8. My friend, at some point, starts the App on her device; the app does a pull replication from its remote user-specific database, downloads the new document with my comment, and renders it on her post.
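As a minimal sketch of the flow above: the comment document the device writes in step 2, and the single-document replication the Comment Sharer Agent issues in step 7 via the doc_ids option of /_replicate. All ids, hosts and field names are made up.

```python
# Step 2: the document stored in the device's local CouchDB instance.
comment_doc = {
    "_id": "comment:20120102:001",
    "type": "comment",
    "author_id": "alice",       # the person making the comment (me)
    "recipient_id": "bob",      # the friend receiving the comment
    "post_id": "post:42",       # the commented post
    "text": "Nice picture!",
}

# Step 7: replicate just this one document from the Global Database into
# the recipient's user-specific database (internal hosts, behind firewall).
share_request = {
    "source": "https://internal.example.com/global",
    "target": "https://internal.example.com/userdb-bob",
    "doc_ids": [comment_doc["_id"]],
}
```
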
Why I am loving it ...

First of all, there are no queries from the client against the server side, except for pull replications. Once replication is done, all queries run locally against the CouchDB instance on the device, providing high availability even in offline environments (e.g. you have a Wi-Fi-only device and are on the road) without having to build an offline cache layer on the device, which would add complexity to the architecture.

Apps can always decide when to trigger replications against the server: when a Wi-Fi connection is detected, at application start, when resuming from the background, etc. Or they can not decide at all and leave the decision to the user, providing UI controls to send data or refresh from the cloud. In addition, the data is always on the device, so there's no network latency, no possible cloud downtime and none of the other network glitches you need to deal with in more traditional architectures.

In addition, there's a clear separation of concerns in this architecture that helps the developer focus on one side of the problem at a time. When coding the App, the developer focuses on the user context and the small subset of data the app will have available, writing queries against the local CouchDB to deal with that subset. When wearing the server-side hat, the developer just needs to focus on how the data is replicated among the different user-specific databases, without worrying about supporting different APIs to transfer that data between clients. Backward compatibility issues will still exist when evolving the data model, but they only need to be tackled at the JSON document definition, without having to propagate the changes through the different layers of a traditional architecture. E.g. when adding a new field to a table in a traditional model, it's not enough to just issue an ALTER TABLE statement: we often have to modify the service layer to accept, validate and store the new parameter, and then modify the clients to use it. With the replication approach, most of the time we just need to make sure the app UI puts the new field in the JSON document, and replication makes sure the field shows up on the server side, without modifying any service layer (except for those Agents that want to leverage the new field during event handling).
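A tiny sketch of that evolution story, with made-up field names: adding a field is just adding a key to the document, and replication carries it along unchanged.

```python
# Version 1 of a (hypothetical) profile document.
profile_v1 = {"_id": "user:alice", "type": "profile", "name": "Alice"}

# Version 2 adds a "nickname" field: same document, one new key.
# No ALTER TABLE, no new service-layer parameter, no API version bump.
profile_v2 = dict(profile_v1, nickname="ali")

# Server-side agents that don't know about "nickname" keep working
# untouched; only the agents that want to use it need changing.
```
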

Conflicts and how to be friends with them

Replication flexibility comes with an inherent management overhead: fixing the conflicts that happen when the same document is modified in both the remote and the local database before a replication is triggered. In the architecture depicted above, a document may be updated on the server side as the result of handling an event at the same time it is updated in a user-specific database on a device. Another conflicting scenario happens when the same user logs in from two devices at the same time and edits the same document from both. There is a lot of information in the CouchDB wiki on how to deal with conflicts when they are inevitable because of the way the app works. But there are also design choices we can make to minimize the possibility of conflicts. For instance, we may want to make sure we don't overload a document with different concerns that get updated from different places in parallel. E.g. a document can hold information about the user profile, including a field indicating the number of friends the user has. The number of friends is updated by a server-side Agent when it detects that a friend request was accepted. At the same time, while waiting for her friend to accept the request, the user modifies her profile picture from the App, so the URL of the new picture is updated in the same user profile document. When the document is pushed from the device to the server-side replica, it ends up in conflict, since the Server Agent may have already updated the document with the new number of friends. A possible solution is to create different documents for the different concerns and make sure each document is only updated from one database before a full replication happens (e.g. a document for statistics updated in the cloud, and another for user info generated on the device). The client SDK then has to query all the sub-documents and aggregate them into a single view. Of course, that does not completely eliminate the need for a conflict resolution strategy (we will talk about conflict resolution in an upcoming post), and, if not done carefully, the technique can hurt performance because of the multiple queries involved; but it can minimize conflict occurrences, so it's worth exploring.
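A minimal sketch of that split, with made-up ids and fields: one document per concern, each written from exactly one side, aggregated on the client.

```python
# Written ONLY by server-side agents (e.g. on friend-request acceptance).
stats_doc = {
    "_id": "stats:alice",
    "friend_count": 12,
}

# Written ONLY from the device (e.g. when the user changes her picture).
info_doc = {
    "_id": "info:alice",
    "picture_url": "https://cdn.example.com/alice.png",
}

def profile_view(docs):
    """Client-side aggregation of the sub-documents into one profile view."""
    merged = {}
    for d in docs:
        # Skip CouchDB metadata fields (_id, _rev, ...) when merging.
        merged.update({k: v for k, v in d.items() if not k.startswith("_")})
    return merged
```

Since the two documents are never written from the same side between replications, their revisions can't collide, at the cost of one extra local query.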

Other challenges down the road

For a Social Network architecture, we may need a clustered infrastructure that can potentially support millions of user-centric databases (we need to be prepared for it, since our products will rock, won't they?). Internally, a CouchDB database is a set of files in a folder that grows sequentially (all changes are appended at the end) and needs to be compacted from time to time to recover space. At some point the server may need to concurrently open lots of file descriptors for all the users replicating from their user-specific databases at the same time. We need to look for an infrastructure provider that can scale horizontally and distribute the load among different servers by creating shards of user-specific databases. If we're rolling our own server and just using a master/stand-by-slave approach, we need to make sure the operating system's file descriptor limits are tweaked accordingly so it won't start rejecting connections, and use an async-IO-based HTTP server like nginx as the CouchDB front end to support higher concurrency on a single box.

Plus, in architectures that use one database per user, we need to make sure the databases are not all created in a single directory in CouchDB; otherwise we can hit directory limits pretty easily and run into performance issues. For that, the SDK needs to make sure the user-specific database name contains '/' characters in the middle of the name, which translate into the different directories where the user database will finally be created (see the "Naming and Addressing" section here). The algorithm that creates the directory structure takes the user id and the number of levels as input, and needs to distribute the different ids evenly among the levels. E.g. with 6 levels of directories, each with 16 possible subdirectories (e.g. a hex character taken from the n-th position of the MD5 hash of the user id), we can hold around 1.6 billion user-specific databases if each leaf directory holds around 100 databases, which is a fairly small number of files for one directory.
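The naming scheme above can be sketched in a few lines; the "userdb" prefix is a made-up choice, but the MD5-based fan-out follows the math in the paragraph (16^6 ≈ 16.7M leaf directories × ~100 databases ≈ 1.6 billion).

```python
import hashlib

def user_db_name(user_id, levels=6):
    """Build a directory-sharded CouchDB database name for a user id."""
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    # The first `levels` hex characters of the hash become nested
    # directories: 16 choices per level, evenly distributed since MD5
    # output is effectively uniform across user ids.
    prefix = "/".join(digest[:levels])
    # e.g. "userdb/a/b/c/d/e/f/<full-hash>"
    return "userdb/{}/{}".format(prefix, digest)
```
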

Another challenge is planning a disaster recovery procedure in case we're rolling our own server and it goes down for any reason. Replication seems the straightforward answer for backing up the server-side data. However, we need to take into account that we're creating and maintaining thousands or millions of small databases. A process that continuously replicates all the user-specific databases to a stand-by slave for failure recovery will not scale, since that would be equivalent to having our entire user base connected and replicating at the same time. One possible solution is a single continuous replication from the Master Global Database to a Slave Global Database on a different server, plus additional instances of the same Event Agents that listen to the Master Global Database, now listening to the Slave Global Database and replicating data into the slave user-specific databases. In this case we need to be careful to separate the Agents that only deal with data replication from the Agents that perform additional actions, such as communicating with 3rd-party systems or sending Push Notifications to devices, to avoid duplicates. In general, we can consider the replica of the Master Global Database the only mandatory backup, since we can always replay the events from the master database to re-create the user-specific databases.

As of the date this post was written, there were two CouchDB infrastructure service providers: IrisCouch and Cloudant. IrisCouch is a free hosting service intended to let you start developing your solution without having to worry about infrastructure provisioning at first. Cloudant is a professional CouchDB hosting solution. It uses BigCouch (an open-source fork of CouchDB that is 100% API-compatible with CouchDB and adds sharding and clustering capabilities on top of it).

Looking forward to keeping learning

2012 looks like the year in which we are going to see major changes and improvements in CouchDB products. Couchbase seems to be investing heavily in mobile technology, and it looks like they may be merging Couchbase Mobile into a single product that integrates better with their clustered server-side solution. I am looking forward to keeping learning about CouchDB and sharing my experiences, so stay tuned for more posts!