Node.js – Dealing with submitted HTTP request data when you have to make a database call first

Node’s asynchronous events are fantastic, but they can have a sting in the tail. Here’s a solution to something that you’ll probably run into at some point.

If you have a HTTP endpoint that accepts JSON, XML, or even a streaming upload, you normally read the data in using the data and end events on the request object:

var bodyarr = []
request.on('data', function(chunk){
  bodyarr.push(chunk);
})
request.on('end', function(){
  console.log( bodyarr.join('') )
})

This works in most situations. But when you start building out your app, adding in production features like user authentication, then you run in trouble.

Let’s say you’re using connect, and you write a little middleware function to do user authentication. Don’t worry if you are not familiar with connect – it’s not essential to this example. Your authentication middleware function gets called before your data handler, to make sure that the user is allowed to make the request and send you data. If the user is logged in, all is well, and your data handler gets called. If the user is not logged in, you send back a 401 Unauthorized.

Here’s the catch: your authentication function needs to talk to the database to get the user’s details. Or load them from memcache. Or from some other external system. (Don’t tell me you’re still using sessions in this day and age!)

So here’s what happens. Node will happily start accepting inbound data on the HTTP request, but before you’ve had a chance to bind your handler functions to the data and end events. Your even set up code only gets called after the authentication middleware is finished its thing. This is just the way that Node’s asynchronous event loop works. In this scenario, by the time Node gets to your data handler, the data is long gone, and you’ll stall waiting for events that never come. If your response handler depends on that end event, it will never get called, and Node will never send a HTTP response. Bad.

Here’s the rule of thumb: you need to attach your handlers to the HTTP request events before you make any asynchronous calls. Then you cache the data until you’re ready to deal with it.

Luckily for you, I’ve written a little StreamBuffer object to do the dirty work. Here’s how you use it. In that authentication function, or maybe before it, attach the request events:

new StreamBuffer(request)

This adds a special streambuffer property to the request object. Once you reach your handler set up code, just attach your handlers like this:

request.streambuffer.ondata(function(chunk) {
  // your funky stuff with data
})
req.streambuffer.onend(function() {
  // all done!
})

In the meantime, you can make as many asynchronous calls as you like, and your data will be waiting for you when you get to it.

Here’s the code for the StreamBuffer itself. (Also as a Node.js StreamBuffer github gist).

function StreamBuffer(req) {
  var self = this

  var buffer = []
  var ended  = false
  var ondata = null
  var onend  = null

  self.ondata = function(f) {
    for(var i = 0; i < buffer.length; i++ ) {
      f(buffer[i])
    }
    ondata = f
  }

  self.onend = function(f) {
    onend = f
    if( ended ) {
      onend()
    }
  }

  req.on('data', function(chunk) {
    if( ondata ) {
      ondata(chunk)
    }
    else {
      buffer.push(chunk)
    }
  })

  req.on('end', function() {
    ended = true
    if( onend ) {
      onend()
    }
  })        
 
  req.streambuffer = self
}

This originally came up when I was trying to solve the problem discussed in this question in the Node mailing list.

This entry was posted in Node.js. Bookmark the permalink.

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>