Real-Time Applications With RethinkDB

By John Hoestje. Categories: Databases, JavaScript, Microservices, Node.js, Technology Snapshot

In the last several years, new requirements and expectations placed on enterprise applications have dramatically increased application code complexity. Users want dynamic websites that provide instant data feedback and let multiple users work on the same document concurrently. New frameworks and ideas have evolved alongside these requirements to help cope with the new application features.

Talk of new distributed system designs like Microservices and new client frameworks like AngularJS has dominated developer discussions, and justifiably so. They provide great solutions to real problems.

One area that has been undervalued in the discussion is the persistence layer. RethinkDB will provide the spark for that discussion.

New application features have strained traditional data storage technologies. Using a single database management system, such as an RDBMS, for all data storage is not a viable solution for the majority of enterprise applications.

Polyglot Persistence

In recognition of this change, Martin Fowler wrote about polyglot persistence in 2011. He states that “any decent sized enterprise will have a variety of different data storage technologies for different kinds of data” (MF 2011).

While we have traditionally started with the database technology and tried to coerce it into manipulating the data the way we want, Martin Fowler states that the new technologies allow us to first determine how we want to manipulate the data and then determine which technology matches up best.

No single database technology provides an adequate solution for all data manipulation needs. An RDBMS is great at storing data, but it can’t provide the same scalable search capabilities as Elasticsearch. As another Microservices benefit, each microservice serves a specific data need. The smaller service boundaries make it easier to find a storage technology for each service’s data need.

An expectation that has been steadily increasing is the requirement for real-time data feeds. The Internet of Things (IoT), concurrent content editing, and rapidly changing shared data sets are driving a new need in data storage technology. Clients polling servers to check whether information has changed is not a scalable solution, because polling can overwhelm the servers. A more scalable solution is to “push” data to the client when the data changes. Adding this capability to an application service adds a lot of code and complexity.

RethinkDB

Providing real-time data feeds is where RethinkDB shines. RethinkDB is a JSON database that pushes query results to applications. Having the “push” capability in the database drastically simplifies the application services. Using JSON documents makes working with the data very easy in any language. RethinkDB has a JavaScript client that integrates well with Node.js.

In this example application, I’ll be using:

  • RethinkDB as the database
  • Node.js as the backend server
  • Socket.IO for the two-way communication channel between the server and the client
  • jQuery on the client, just to keep it simple

Getting Started

To get started, follow the RethinkDB installation instructions on their website. It would behoove you to also read the thirty-second quickstart and the ten-minute guide. Windows users will need to build from the RethinkDB sources, as there is currently no installer available. A Windows installer is being worked on.

RethinkDB comes with a web interface (defaults to http://localhost:8080) that allows for database administration and running data queries.

The source for the example application is available on GitHub. npm and Node.js are required for the application. Follow the instructions in README.md.

The ./db/config.js file sets up the database and tables if they do not already exist and contains the database API. Other than that, there really isn’t anything more to do just to get RethinkDB running locally with the default configuration. Going forward, I’m just going to focus on setting up the data feedback wiring. The RethinkDB documentation explains everything very nicely. 🙂
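The "create if missing" setup step can be pictured roughly as follows. This is a minimal sketch, not the contents of the example's ./db/config.js: the helper name `ensureTable` is hypothetical, and `r` and `connection` are assumed to be the `rethinkdb` driver module and an open connection obtained from `r.connect()`.

```javascript
// Hypothetical helper: create the database and table only if they are missing.
// `r` is assumed to be require('rethinkdb'); `connection` an open connection.
async function ensureTable(r, connection, dbName, tableName) {
  // dbList() resolves to an array of existing database names.
  const dbs = await r.dbList().run(connection);
  if (!dbs.includes(dbName)) {
    await r.dbCreate(dbName).run(connection);
  }

  // tableList() resolves to an array of table names in that database.
  const tables = await r.db(dbName).tableList().run(connection);
  if (!tables.includes(tableName)) {
    await r.db(dbName).tableCreate(tableName).run(connection);
  }
}

// Usage (assumes a local RethinkDB on the default driver port):
// const r = require('rethinkdb');
// const connection = await r.connect({ host: 'localhost', port: 28015 });
// await ensureTable(r, connection, 'realtime', 'users');
```

Passing `r` and `connection` in as arguments is only a choice made for this sketch; it keeps the helper easy to exercise without a running database.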

This is all the code needed to receive data pushed from RethinkDB:

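	// Subscribe to every change on the 'users' table
	r.db('realtime').table('users').changes().run(connection, function (err, cursor) {
	    if (err) throw err;  // the changefeed could not be opened
	    cursor.each(function (err, row) {
	        if (err) throw err;  // the feed reported an error
	        callback(row);  // callback function passed in to do something with the data
	    });
	});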

Our Example

In this simple example, I am telling RethinkDB that I want to receive all changes to the ‘users’ table. In a real-world application, you would filter the data to the last 10 users or some other small subset.
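One way to narrow the feed to the most recent users is sketched below. It rests on assumptions that are not part of the example application: a hypothetical `createdAt` field on each user document, a secondary index of the same name on the ‘users’ table, and the helper name `recentUsersFeed`. `r` is assumed to be the `rethinkdb` driver module.

```javascript
// Hypothetical helper: build a changefeed query limited to the newest users.
// Assumes a secondary index named 'createdAt' exists on the 'users' table.
function recentUsersFeed(r, limit) {
  return r.db('realtime')
    .table('users')
    .orderBy({ index: r.desc('createdAt') }) // newest first, using the index
    .limit(limit)
    .changes();
}

// Usage (assumes an open `connection` and a `callback` for each change):
// recentUsersFeed(r, 10).run(connection, function (err, cursor) {
//   if (err) throw err;
//   cursor.each(function (err, row) {
//     if (err) throw err;
//     callback(row);
//   });
// });
```

With `orderBy` and `limit` in front of `changes()`, the feed should report documents as they enter or leave that top-N window rather than every change to the table.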

Once you pull down the code and get the application running, bring up two browsers (http://localhost:3000) to mimic two different users. Each user will be able to register a name and send messages to all the other users. Each message pushed from the server will be displayed to the user.

[Image 1: Simulate two users]

The first user registers the user name ‘John’ by clicking the ‘Register Name’ button (1). This causes jQuery to submit the data on the Socket.IO channel, which sends the data to the server.

Once Socket.IO receives it on the server side, a connection to the RethinkDB instance is created and the user name is saved. RethinkDB detects a change on the ‘users’ table and then asynchronously pushes the new data to the feed listener.

[Image 2: ‘John’ user name registers]

For the sake of brevity I’ll only show the logs of interest.

When the user name is persisted, RethinkDB returns a JSON document that describes the result of the operation along with the newly created key.

{
  "deleted": 0,
  "errors": 0,
  "generated_keys": [
	"877503b1-e7d4-4cb0-a88b-da41340c5305"
  ],
  "inserted": 1,
  "replaced": 0,
  "skipped": 0,
  "unchanged": 0
}
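A save step that produces a result document like the one above might look like this. It is a minimal sketch rather than the example application's code: the helper name `saveUser` is hypothetical, and `r` and `connection` are assumed to be the `rethinkdb` driver module and an open connection.

```javascript
// Hypothetical helper: insert a user and resolve with its generated key.
// `r` is assumed to be require('rethinkdb'); `connection` an open connection.
async function saveUser(r, connection, username) {
  const result = await r.db('realtime')
    .table('users')
    .insert({ username: username })
    .run(connection);

  if (result.errors > 0) {
    // first_error carries the text of the first failure reported by RethinkDB.
    throw new Error(result.first_error);
  }

  // No id was supplied, so RethinkDB generated one for the new document.
  return result.generated_keys[0];
}
```

Because the document is inserted without an `id`, the key comes back in `generated_keys`, which is where the UUID in the log above originates.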

As expected, the newly persisted user causes a change to the ‘users’ table. When the change is detected, RethinkDB pushes a JSON document that contains the new and old values. Since it is an insert, we won’t get any data in the ‘old_val’ attribute.

DB---->registerRealtimeUserFeed pushing....
{
  "new_val": {
	"id": "877503b1-e7d4-4cb0-a88b-da41340c5305",
	"username": "John"
  },
  "old_val": null
}
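The shape of that document is enough to tell what kind of change occurred. The helper below is a small sketch with a hypothetical name, `describeChange`; it assumes a plain `.changes()` feed like the one in this article, where both ‘new_val’ and ‘old_val’ are always present.

```javascript
// Hypothetical helper: label a changefeed document by which side is null.
function describeChange(change) {
  if (change.old_val === null && change.new_val !== null) {
    return 'insert'; // nothing existed before
  }
  if (change.old_val !== null && change.new_val === null) {
    return 'delete'; // nothing exists afterwards
  }
  return 'update'; // both the old and the new document are present
}

// Usage with the document logged above:
// describeChange({
//   new_val: { id: '877503b1-e7d4-4cb0-a88b-da41340c5305', username: 'John' },
//   old_val: null
// });
```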

In the application, only the ‘username’ information is sent to the client via Socket.IO and is instantly broadcast to all connected users (2).
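The server-side wiring between Socket.IO and the database could be sketched as follows. Everything named here is an assumption made for illustration: the function `wireRealtimeUsers`, the event names `register` and `userRegistered`, and a `db` object exposing `saveUser(username)` and `onUserChange(listener)`. `io` is assumed to be a Socket.IO server instance.

```javascript
// Hypothetical wiring between Socket.IO and the database layer.
// `db.saveUser` and `db.onUserChange` are names invented for this sketch.
function wireRealtimeUsers(io, db) {
  // Client -> server: persist the submitted user name.
  io.on('connection', function (socket) {
    socket.on('register', function (username) {
      // saveUser may return a promise; log failures instead of crashing.
      Promise.resolve(db.saveUser(username)).catch(function (err) {
        console.error(err);
      });
    });
  });

  // Database -> server -> all clients: forward only the username.
  db.onUserChange(function (change) {
    if (change.new_val) {
      io.emit('userRegistered', change.new_val.username);
    }
  });
}
```

Note that the socket handler only saves the name. The broadcast is driven entirely by the change notification, so a write from any source that reaches the table would be pushed to the connected clients.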

[Image 3: High-level application data flow]

[Image 4: ‘Sam’ user name registers]

Similar to ‘John’, we get the notification when ‘Sam’ registers.

{
  "new_val": {
	"id": "0b9e6987-b389-4256-87ca-3f780004029a",
	"username": "Sam"
  },
  "old_val": null
}
DB---->registerRealtimeUserFeed emit....:  John
{
  "deleted": 0,
  "errors": 0,
  "generated_keys": [
	"0b9e6987-b389-4256-87ca-3f780004029a"
  ],
  "inserted": 1,
  "replaced": 0,
  "skipped": 0,
  "unchanged": 0
}

[Image 5: John sends a message]

[Image 6: Sam sends a message]

Final Thoughts

I was a little pessimistic about how easy it would be to get a sample application working when I first started. With some other products, the documentation promises the sky with just a few lines of code, but once you get into it, you find out how much work it actually is. It was refreshing how easy it was to get a simple application going with RethinkDB, Socket.IO, and Node.js.

With the persistence layer taking responsibility for pushing the data, a lot of code and complexity is removed from the services. RethinkDB provides a solution for a specific data need. Just as polyglot programming expresses the idea that you should use the programming language that best fits the problem, polyglot persistence says to use the database that best fits the data problem.

The source for the example application is available on GitHub.

— John Hoestje, asktheteam@keyholesoftware.com


About the Author
John Hoestje

John Hoestje is an experienced Software Architect and Developer with 10+ years in IT. His area of expertise is the architecture and development of applications and systems utilizing Java, .NET and JavaScript technologies.


Comments 1

  1. A nice blog post about a cool general-purpose JSON DBMS. It uses web sockets in a push paradigm, compared to MongoDB’s pull model. I see joins, map/reduce, cursors, etc. The RethinkDB admin interface imparts a feeling that this package is well done. Their documentation is easy to follow.

    I had to invoke “npm install rethinkdb --save” as an added installation step for the example in this post. I suggest adding an installation instruction bullet with that earlier link for installing RethinkDB. This is for us eager beavers who skipped to the installation bullets without reading the earlier prose. Duh.
