Seneca, A Microservices framework for Node.js

The release of Seneca 1.0 represents 5 years of open source evolution, and not a little blood, sweat and tears. What I am most happy about is that I did not do the release – Wyatt had that honor, with Dean and Matteo keeping him honest! Seneca is now a community, not just one developer’s itch. And if you ask me about the future, the first priority is the care and feeding of the community: building the rules of conduct, guiding principles, decision-making processes, great documentation, and all the other stuff that isn’t code. We want to be good open source citizens.

Microservices for Node.js

So you want to write microservices in Node.js? Seneca’s job is to make your life easier. The funny thing is, Seneca did not start out as a microservices framework at all. It was a framework for building Minimum Viable Products. To build an MVP, you need to be able to plug together pieces of code quickly. You should be able to list a set of basic functionalities, such as user accounts, database connectors, content delivery, administration backends, and so on, and get a web application that “just works”. You then extend and enhance to add your own secret sauce.

I really liked the way that Rails (for Ruby) and Django (for Python) had ecosystems of “business logic” components that you could (almost!) just plug in. But having built systems with both platforms, the reality on the ground was a little different. The promise was that, unlike, say, Java (where I spent far too many years building “enterprise” systems), Rails or Django would let you develop an MVP very quickly. This was only half true. Certainly, if you stuck to the rules, and mostly followed the Model-View-Controller style, you could get pretty far. But the component systems of these platforms always ended up creating technical debt. There were just too many integration hooks, too much opportunity for complexity to creep in. The underlying problem was that neither system had any fundamental structural model to unify the component architecture. It was all special cases, neat tricks, and monkey patching.

Software components

What are software components anyway? They are pieces of functionality that you can glue together. And, by glue, we mean compose. And yes, composability is why people get all hot and bothered about category theory and monads and all that jazz. Back in the real world, the ability to compose software components together is the essence of their value. The thing people love about UNIX command line tools is that you can pipe them together using simple streams of data. It’s a simple component model that works really well. Other component models have the same goal, but don’t quite get there. Take object-oriented programming. Objects are meant to be components. And yet getting objects to work together is … rather awkward. I never fail to be struck by the irony that, despite the supposed power of inheritance, interfaces, polymorphism, and such, you still need a book of spells (design patterns) to “code proper”. Seems like all that power just lets you make a bigger mess.

Functional programming is quite a bit better, mostly because functions are more uniform, and thus easier to compose. And many functional programming languages have pattern matching. Now pattern matching is terribly simple, like UNIX pipes, but also terribly powerful, again like UNIX pipes. The pattern of your input determines which function to call. And handling special cases is easy – just add more functions for more specific patterns. The nice thing about this approach is that the general cases can remain general, with simple data structures and logic. Pattern matching is a powerful way to fight technical debt. And it’s useful by itself, so you don’t have to go functional if you don’t want to.
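
To make this concrete, here is a toy pattern-matching dispatcher, hand-rolled in JavaScript purely for illustration (this is not Seneca code): the function attached to the most specific matching pattern wins.

// A toy pattern-matching dispatcher (illustration only, not Seneca).
var handlers = []

function add(pattern, fn) {
  handlers.push({ pattern: pattern, fn: fn })
}

function act(msg) {
  // A pattern matches if all of its properties appear in the message.
  var matches = handlers.filter(function (h) {
    return Object.keys(h.pattern).every(function (k) { return msg[k] === h.pattern[k] })
  })
  // The most specific pattern (the one with most properties) wins.
  matches.sort(function (a, b) {
    return Object.keys(b.pattern).length - Object.keys(a.pattern).length
  })
  return matches.length ? matches[0].fn(msg) : undefined
}

add({ cmd: 'greet' }, function (msg) { return 'Hello, ' + msg.name + '!' })
add({ cmd: 'greet', lang: 'fr' }, function (msg) { return 'Bonjour, ' + msg.name + '!' })

console.log(act({ cmd: 'greet', name: 'Alice' }))             // Hello, Alice!
console.log(act({ cmd: 'greet', lang: 'fr', name: 'Alice' })) // Bonjour, Alice!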

The Genesis of Seneca

By 2010 I had finally become quite allergic to Java. I had tried Ruby and Python, and their primary frameworks, and found them wanting. And then I came across this little science experiment called Node.js. My first reaction, like that of many, was … JavaScript? srsly? But then I remembered that the json.org coder, Douglas Crockford, had written a book about JavaScript. I read that book, JavaScript: The Good Parts, and felt better. And a toy language became how I fed myself and my family.

At the time I was heavily involved in the mobile web and HTML5 worlds, and quite convinced that native mobile apps were on the way out (oh yeah … real soon now). I had even helped build a startup, feedhenry.com (since acquired by Red Hat), based on the idea. But then life took a different course. A new baby (our third), an offer of a book deal, a desire to return to freelancing, and the promise of much higher productivity with Node.js all combined to push me into independence once again. I was back writing code, and it was fun!

Just a small hitch. The Node.js module system, npmjs.org, is fantastic, and there are many, many great modules. Most of them are infrastructural in nature – utilities. Yes, they are software components, and yes, the Node.js module system is also a pretty good software component model. But it still suffers from the complexity inherent in the underlying JavaScript language. Node modules are sort of composable, and there are good examples, like hapi, or streams, but there was still no easy way to componentize “business logic”. As a freelancer I lived or died by my ability to deliver features. Rewriting the business logic for user account management, or for shopping carts, or payment integrations, or content management, was killing my margins. I decided to build a component model based on pattern matching.

The model is really simple. Components are nothing more than a set of inbound and outbound messages. They are entirely defined by the messages they accept, and the messages they emit. We say nothing about internal data structures, or even causality between messages. A component is fully specified by these two lists of messages.

Let’s say we are writing a little blogging engine. You can post entries. Entries have a title and body text. So you have a post-entry message, and it contains the title and body data. Now you have to answer the questions:

  • Who sends this post-entry message?
  • Who receives it? Does more than one component receive it?
  • What is an “entry” anyway?
  • And what is a message type? What type is post-entry?

And this is just inside the same process. I was not even thinking about distributed systems of microservices at this stage. All I knew was, encoding the messages as method calls on an object was not the way to go – that leads to the same old madness.

The key question is, what is a message type? It’s a hard one to answer. You end up going down the road of schema validation, contracts, and other such nastiness. One way to answer hard questions is not to answer them at all – a common trick among mathematicians. Do parallel lines ever meet? Decide that the question is unanswerable and you get whole new geometries! Pattern matching lets you side-step the question of message types. Here’s how it works:

Let’s say the post-entry message looks something like:

{
  "title": "Down with this sort of thing!",
  "body": "Careful now!"
}

What if we just send it to all components? Let each component decide if the message is important. That avoids the question of message routing. Still, it’s hard to recognize the messages you care about, so let’s make it a little easier. Let’s tag the message with a fixed property:

{
  "post": "entry",
  "title": "Down with this sort of thing!",
  "body": "Careful now!"
}

Now any components that care about the posting of entries can pattern match messages by looking for a top-level post:entry property-value pair. Let’s say we have at least a PostEntry component that handles posted entries – perhaps it saves them to a database.
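
Here’s a minimal sketch of how this looks in Seneca. The database save is hypothetical, reduced to a console.log:

var seneca = require('seneca')()

// PostEntry: matches any message with a top-level post:entry pair.
seneca.add({ post: 'entry' }, function (msg, respond) {
  // A real component would save msg.title and msg.body to a database.
  console.log('saving entry: ' + msg.title)
  respond(null, { saved: true })
})

// Emit a post-entry message. The sender knows nothing about the receivers.
seneca.act({
  post: 'entry',
  title: 'Down with this sort of thing!',
  body: 'Careful now!'
}, function (err, result) {
  if (err) return console.error(err)
  console.log(result) // { saved: true }
})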

There are some nice consequences to pattern matching. Components that emit this message do not need to know about components that consume it. Nor do components that consume the message need to know about the emitting components. That’s decoupling, right there! Messages do not need to be addressed to anybody. The need to have a target for messages is the downfall of many a component architecture. To call a method, you need an object instance, and we avoid that need with patterns. Another consequence: any number of components or component instances can react to the message. That’s not an architectural decision you have to make in advance.

Isn’t this just an event-driven architecture? No. Events are sent and received from topics, and topics are pretty much equivalent to addresses. You have to know the topic name on the sending side.

Isn’t the tag just a backdoor type? On a theoretical level, probably! On a practical level, not really. It doesn’t impose the same constraints that types do, nor provide any ability to validate correctness. Nor does it impose a schema. This approach to component communication is very much in the school of Postel’s Law: “be strict in what you emit, liberal in what you accept”. And the label that we are using for this message, post-entry, is not a type, just an informal name.

Practical Pattern Matching

Messages do have to make it from one component to another eventually. But the mapping from patterns to components does not reside in the components themselves. You can put that information in a separate place, and implement the mapping separately, and in many different ways. I wrote a little pattern-matching engine to do this work: patrun. The way that you “wire” up components is thus independent, declarative, simple to understand, and yet dynamically configurable.
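
patrun has a deliberately tiny API. A sketch of the wiring, with component names standing in as plain strings:

var patrun = require('patrun')
var router = patrun()

// The wiring lives here, outside the components themselves.
router.add({ post: 'entry' }, 'PostEntry')

// Lookups ignore properties that no pattern mentions.
console.log(router.find({ post: 'entry', title: 'Careful now!' })) // 'PostEntry'
console.log(router.find({ post: 'nonsense' }))                     // null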

Now let’s kick it up a gear. Let’s add a feature to our system. Blog posts can contain an image! Woohoo! In a traditional software architecture, you’d have to modify your system to support this new feature. You’d have to extend your data models, create sub-classes, update data schemas, change method signatures, update unit tests, and so on, and so forth. No wonder software projects are always late and over-budget.

Stepping back for a minute, the post-entry messages now look like this:

{
  "post": "entry",
  "title": "Down with this sort of thing!",
  "body": "Careful now!",
  "image": "http://www.richardrodger.com/wp-content/uploads/2016/01/careful-now-down-with-this-sort-of-thing.jpg" // OPTIONAL!
}

Sometimes the message has an image property, sometimes it doesn’t. Does this break anything? The original PostEntry component that handled post-entry messages still works! It just ignores the extra image property – it means nothing.

Now, add a new component to the system that can handle entries with images: let’s call it PostImageEntry. Any time PostImageEntry sees a message that contains both post:entry, and an image property, then it has a match, and it acts on the message.

There’s an obvious problem. The original PostEntry component is also going to act on the same message, which is not what you want. There’s an easy solution. Add a rule that more specific matches win. PostImageEntry matches more properties than PostEntry, so it wins. The nice thing is, you never had to change the code of the original PostEntry to make this work. All you did was add new code. Not needing to modify old code removes entire classes of potential bugs.

The “more specific matches win” rule gives you extensible components. Every time you have a new feature or a special case, match against the property of the message that makes it special. You end up with a set of components where the ones written early in the project are more general, and the ones written later are more specific, and at no point did you ever have to refactor.
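
Here is the rule at work in patrun. Matching “an image property with any value” relies on patrun’s glob option (gex); treat that wildcard detail as an assumption of this sketch:

var patrun = require('patrun')
var router = patrun({ gex: true }) // enable glob matching on pattern values

router.add({ post: 'entry' }, 'PostEntry')
router.add({ post: 'entry', image: '*' }, 'PostImageEntry')

// More specific matches win: the image pattern matches more properties.
console.log(router.find({ post: 'entry', title: 'hi' }))                  // 'PostEntry'
console.log(router.find({ post: 'entry', image: 'http://x.test/a.jpg' })) // 'PostImageEntry'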

It gets better. Older components that are “a bit wrong” and no longer relevant – they’re disposable! Throw them away and rewrite better components. The consequences are local, not global, so rewriting is cheap and safe.

What about composability? Well, let’s say one of your clients is a strict libertarian, and believes all forms of censorship are evil, but another client is deeply traditional and simply won’t tolerate any foul language on their blogging site. Where do you add logic to deal with this?

Try this: write a NicePostEntry component. It checks for foul language, and replaces any objectionable words in the body property with the string “BEEP!”. The NicePostEntry component matches the pattern post:entry, and so captures all post-entry messages. Again we have the problem that this conflicts with our existing PostEntry. The solution is to allow pattern overrides.

We allow NicePostEntry to override the post:entry pattern. But we also give NicePostEntry a reference to the prior component that was attached to that pattern. NicePostEntry can then modify the message as it sees fit, and pass it on to the prior component. This is composition! As an abuse of syntax, we can say, with respect to the pattern post:entry, messages are processed as

PostEntry( NicePostEntry( message ) )
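
Continuing the earlier sketch, this composition is exactly what pattern overrides give you in Seneca, via this.prior. The word filter is, of course, hypothetical:

// NicePostEntry: overrides post:entry, assuming PostEntry was added earlier.
seneca.add({ post: 'entry' }, function (msg, respond) {
  // Hypothetical filter: replace objectionable words with "BEEP!".
  msg.body = msg.body.replace(/feck/gi, 'BEEP!')

  // Pass the cleaned message on to the prior action for this pattern.
  this.prior(msg, respond)
})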

What about post-entry messages with images? Since we have a separate pattern matching engine, we set up our rules to handle both cases:

post:entry, image: undefined -> PostEntry( NicePostEntry( message ) )
post:entry, image: defined -> PostImageEntry( NicePostEntry( message ) )

Business Logic Components

This simple little model gives you pretty much all you need for handling the ever-changing requirements of “business logic”. It works because you don’t need to design a data model in advance, you don’t need to design an object model in advance, and you don’t even need to design message schemas in advance. You start with your best guess of the simpler messages in the system, and you know you have a get-out-of-jail-free card: new features can be handled with new properties, and they won’t break old features.

If you think about it, there is quite a direct path from informal business requirements, to “things that happen” in the system, to messages between components. It’s quite easy to specify the system in terms of messages. In fact, you don’t really need to worry about deciding which components to build up front. You can group messages into natural components as you go, or split them out into separate components if the components get too complex.

And this gets you to the point where you can write very general components that handle all sorts of common application features, in a very generic way, and then enhance and compose as needed for the needs of an individual project. If you look at the plugin page for Seneca, you’ll see there are plugins (software components) for all sorts of things. They all communicate using pattern matching on messages, and so are resilient to versioning issues, allow for alternative implementations, and most importantly, allow the community to build new plugins, for new features, without any “command and control” nonsense. Anyone can write any old Seneca plugin, any old way they like. Of course, there are some conventions, and we do maintain a curated list of well-behaved plugins on the Seneca site. Still, in your own projects, you’re pretty much free to do whatever you like – it’s all just messages at the end of the day.

By the time Seneca had become a useful component system in late 2011, I had co-founded nearForm with Cian O’Maidín. We saw the potential in Node.js and decided we wanted to be part of something big. Seneca became a vital part of our ability to deliver quickly and effectively for clients. We’re based in Ireland, so not only are most of our clients remote, most of our developers are also remote. The ability to separate development work into well-defined components, with interfaces specified by message patterns, along with a body of plug-and-play business components, allowed us to excel at delivery, and is one of the cornerstones of our success in professional software services. We did hit one major snag though, and it illustrates an important trade-off and limitation of this approach (and you thought this was all rainbows and unicorns – oh no…).

Data Modeling

The problem was data, specifically, data models. How do you map the classical idea of a data schema onto a system with no types, and arbitrary messages? Our first instinct was to hide all data manipulation inside each component, and treat messages as extracts of relevant data only. This worked, but was not entirely comfortable. You’ll notice a similar problem in microservice architectures. If one microservice “owns” all the data for a given entity, say users, then how do other pieces of business logic in other microservices access and manipulate that data?

At the time we were blissfully unaware of Domain Driven Design, and still rather enamored with the ActiveRecord design pattern. We did have a problem to solve. Components needed a common data model to facilitate interactions, and we also wanted to be database independent (in consulting, especially for large clients, you don’t always get to choose the database).

We decided to model data using a set of standard message patterns corresponding (almost) to the basic Create-Read-Update-Delete data operations. Seneca thus offers a conventional set of message patterns of the form role:entity, cmd:save|load|remove|list that operate on “data entities”. Pattern matching makes it easy to support optional namespaces, so that you can have “system” entities for well-established plugins (say, for user accounts), and even support things like multi-tenancy. Because all data entity operations reduce to messages, it’s easy to get fancy, and use different databases for different kinds of data, whilst retaining the same API. That’s cool. It lets you do things like switch database mid-project without much pain. Start with MongoDB, because in the early days your schema is unstable, and end with Postgres, because the client insists on a relational database for production.
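
A sketch of the entity API. Recent Seneca versions load entity support via the seneca-entity plugin, and the built-in memory store is the default, so this runs without a database:

var seneca = require('seneca')()
seneca.use('entity') // resolves to the seneca-entity plugin

var entry = seneca.make('entry')
entry.title = 'Down with this sort of thing!'
entry.body = 'Careful now!'

// save$ becomes a role:entity,cmd:save message under the hood.
entry.save$(function (err, saved) {
  if (err) return console.error(err)
  console.log(saved.id) // generated by the store

  // load$, list$ and remove$ complete the message-based CRUD set.
  seneca.make('entry').list$({}, function (err, list) {
    console.log(list.length) // 1
  })
})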

You can use pattern composition to add things like data validation and manipulation, access controls, caching, and custom business rules (this is equivalent to adding custom methods to an ActiveRecord). This is all very nice, and works really well in real-world projects. We’re still in business after all! For more details, the Seneca data entity tutorial has you covered.
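
For example, a validation rule can be layered over entity saves with the same prior mechanism described above (the required-title rule is hypothetical):

// Validate entry entities before they are saved.
seneca.add({ role: 'entity', cmd: 'save', name: 'entry' }, function (msg, respond) {
  // Hypothetical business rule: every entry must have a title.
  if (!msg.ent.title) return respond(new Error('title is required'))
  this.prior(msg, respond)
})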

So what’s the catch? The trade-off is that you have a lowest-common-denominator data model. You get what is essentially a key-value store, but with reasonable, if limited, query capabilities. You certainly don’t get to write SQL, or have any concept of relations. You don’t get table joins. You have to accept denormalization.

Now, on the other hand, one can argue that a simplified data model gives you better scalability and performance, and also forces you to face up to data consistency choices that you should be making. The days of hiding behind “transactions” are gone, especially with the number of users we have to deal with.

The way that Seneca handles data will be expanding. We will certainly retain our simplified model, and use that as the basis for core components. It works, and it works pretty well, but we won’t hide the choices that it entails either. Luckily, the message model allows us, and you, to enhance what’s already there, and push forward. One of our core values is respect for developers that have chosen to use the framework, and that means you’ll never suffer from global thermonuclear version breakage. We’ll keep your old code running. Backwards compatibility is in our blood. You might have to switch on a flag or add a supporting plugin, but we’ll never ask you to refactor.

Microservices

Oh yeah … those. So we invited Fred George to speak at one of our Node.js meetups in Dublin in 2013, about “Programmer Anarchy”, and he pretty much melted our brains. We had discovered microservices, and we loved the idea. Yes, lots of practical problems, like deployment and configuration, and network complexity – not a free lunch by any means. But very tasty, and worth paying for!

We did have a little secret weapon – Seneca. Microservices are really just independently deployable and scalable software components. But how do they communicate? Well, we had already solved that problem! Pattern matching. All we needed to do was figure out the networking piece.

To preserve the simple view of the world that we had created, it was obvious that microservices should not know about each other, in any way. Microservices based on web services offering REST interfaces suffer from the problem of addressing – where does the microservice live? You need to know the network address of the other side.

Now, you can solve this problem in many different ways – service registries, proxies, virtual network overlays, message buses, and combinations thereof. The problem with most approaches is that your microservice code is still closely bound to the transport mechanism. For example, say you decide to use Redis, because you like the publish-subscribe pattern. Well, if you use a Redis library directly, then it’s going to be hard to move to Kafka when you need to scale. Sure, you can write an abstraction layer, but that’s more work again. Alternatively, you could use a system designed exactly for the microservice architecture – Akka, say. That does tend to tie you down to a particular language platform. (Yes, Seneca is Node.js, but the messages are JSON, so polyglot services are much easier than with a custom protocol.)

We decided to adopt the strategy of transport independence. Microservices should neither know nor care how messages arrive or are sent. That is configuration information, and should not require changes to the business logic code. The pattern matching message architecture made it very easy to make this work. Provide a transport plugin that matches the outbound message patterns. The transport plugin sends these messages out onto the network. The transport plugin can also accept messages from the network and submit them to local plugins. From the perspective of all other Seneca plugins, nothing has changed. The transport plugin is just another plugin.

We converted Seneca into a microservices framework, with no code changes. We just wrote some new plugins. Of course, later we added API conveniences, and things like message correlation identifiers, but even now Seneca is a microservices platform built entirely from plugins. That means you’re not stuck with our opinion on microservices. You can easily write your own transports.

DANGER: You can’t allow yourself to think that all messages are local and that the network is “hidden” – that’s one of the classic fallacies of distributed computing. Instead, adopt the mindset that all messages are distributed, and thus subject to failure on the network. In the era of cloud computing, that’s probably going to end up being true for your system anyway.

The plugin approach to message transport gives you a very flexible structure for your microservices. You write your own code in a normal Node.js module, which you can unit test in the normal way. You put your module into a Seneca plugin, and expose its functionality via a set of message patterns (or, for simple cases, just write a plugin directly). Then you write a separate execution script to run your microservice. The execution script “wires” up the microservice to the rest of the network. It handles the configuration of the microservice, including details like network configuration. Just as with Seneca data entities, you can change your microservice communication strategy from HTTP REST to a RabbitMQ message queue, without any changes to your business logic code. Just write a new execution script – it’s only a couple of lines of configuration code.
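
A sketch of the smallest possible execution scripts, using the default point-to-point HTTP transport (filenames and port are illustrative):

// entry-service.js – run the business logic as a microservice.
require('seneca')()
  .add({ post: 'entry' }, function (msg, respond) {
    respond(null, { saved: true, title: msg.title })
  })
  .listen({ port: 9001 }) // the transport choice lives here, not in the logic

// client.js – route matching messages over the network.
require('seneca')()
  .client({ port: 9001 })
  .act({ post: 'entry', title: 'Careful now!' }, console.log)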

To see actual code, and try this out for yourself, try the NodeZoo workshop.

Service Discovery

For the transport plugins, we started as simply as we could. In fact, the basic transport is pretty much just point-to-point HTTP REST, and you do need to provide an address – the IP and port of the remote service. But this is OK – your business logic never needs to know, and can be written under the useful fiction that it can send and receive any message, and it will still “just work”.

This approach has another useful feature – testing and mocking is easy. Simply provide stub implementations of the message patterns that your microservice expects. No need for the laborious re-construction of the object hierarchies of third party libraries. Testing reduces to verifying that inbound and outbound messages have the expected behavior and content. Much simpler than testing the nooks and crannies of all the weird and wonderful APIs you can construct just with normal language features in JavaScript.
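
For example, a unit test can stand in for a remote dependency with a local stub (the search pattern here is purely illustrative):

// In a test: stub out the pattern that the code under test depends on.
var seneca = require('seneca')()

seneca.add({ role: 'search', cmd: 'query' }, function (msg, respond) {
  respond(null, { hits: ['a-known-result'] }) // canned reply
})

// The code under test now sees the stub – no network, no mocking library.
seneca.act({ role: 'search', cmd: 'query', text: 'careful' }, function (err, out) {
  if (err) return console.error(err)
  console.log(out.hits) // [ 'a-known-result' ]
})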

Despite these advantages, service discovery had remained an awkward practicality – until recently, that is. The problem is that you still have to get the network location of the other services, or at least the port numbers if using a proxy, or the location of the message bus, or use a central service registry, or set up fancy virtual DNS, or find some other way to get location information to the right place. We used all of these strategies, and more, to mitigate the problems that this issue causes in production, and also for local development.

But the pressure was mounting from our user community. Everybody wants a free lunch in the end. So we started to experiment with mesh networking. Microservices should configure each other, and share information directly with each other, in a decentralized way. The problem with most of the current service discovery mechanisms is that they use a central point of control to manage the microservice system. Consider the drawbacks. The central registry can easily get out of date. Services come and go as they fail and restart, or as the system scales up or down. The registry may not know about all of the healthy services, and may direct clients to use unhealthy services. Detecting unhealthy services has to be done by heart-beating, but that is vulnerable to slow failures, where a service, under load, may just be taking longer to respond. All in all, centralized microservice configuration is tricky, and is the basis of a valid criticism of the entire approach in production.

Nonetheless, we were determined to find a solution. The advantages of microservices far outweigh even this problem. They really do make continuous deployment very simple, and provide you with meaningful ways to measure the health of your system. There had to be a way to let microservices discover each other dynamically, and without a central point of failure.

It was with some interest that we noticed what Uber was doing with the SWIM algorithm. It’s powerful stuff. Essentially it lets a microservice join a network of other microservices, and then they all share information. Not by broadcasting, which has scaling issues, but by infection. Information (such as the start-up of a new microservice) moves through the network like an infection, with neighbors infecting each other. Get the mechanics right, throw in a little randomness, and you get fantastic performance and scalability. You also know very quickly if a microservice is unhealthy. It’s pretty sweet!

We wanted to use it, but there was another microservice function we had to build first – client-side load-balancing. You put a little load-balancer inside your client microservice, rather than using an external one (such as nginx or HAProxy). Netflix’s Ribbon is a great example. The philosophy of Seneca is that all configurations have their place, and we wanted to offer client-side load-balancing as a possibility.

The trick is to make the balancer dynamically reconfigurable. The balancer is a transport, so it routes any messages that match remote patterns to remote services. Now we also use Seneca’s strengths to make this independent of the transport. You compose the balancer together with the underlying transport, and you can balance over any remote mechanism – HTTP end points, message buses, TCP streams, etc. Any combination of message pattern and transport is possible (you can see why fans of functional programming get excited by composition).

The next step is to provide a mesh networking plugin. All that plugin does is join the mesh of microservices using the SWIM algorithm (Thanks Rui Hu!). It then announces to the world the message patterns that the current microservice wants to listen for. The pattern information is disseminated throughout the network, and the client-side load-balancers dynamically add the new microservice to their balance tables. The balancer is able to support both actor (where listening services round-robin messages) and publish (where all listening services get each message) modes of operation. This provides you with a complete microservice network configuration. Except there is no configuration!

At the moment, our implementation still depends on “well-known” entry points. You have to run a few base nodes at predetermined locations, so that microservices know where to look to join the network – Peter is fixing that one for us, and soon the network will be completely self-managing.
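
A sketch of what this looks like with the seneca-mesh plugin. The isbase and pin options come from the plugin’s documentation, but treat the details here as illustrative:

// base.js – a well-known entry point that new members use to join the mesh.
require('seneca')()
  .use('mesh', { isbase: true })

// entry-service.js – join the mesh and announce a pattern.
require('seneca')()
  .add({ post: 'entry' }, function (msg, respond) {
    respond(null, { saved: true })
  })
  .use('mesh', { pin: 'post:entry' })

// client.js – no addresses anywhere; the mesh routes by pattern.
require('seneca')()
  .use('mesh')
  .act({ post: 'entry', title: 'Careful now!' }, console.log)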

With mesh networking, Seneca has now made microservice service discovery pretty much a solved problem. Even if we do say so ourselves!

Welcome to the Community!

It has been an honor, and a privilege, to start and then participate in an active and growing open source project. It is really very special when people put so much trust in your code that they use it in production. It’s easy to forget how significant that is. And I am incredibly grateful to everybody who has contributed to Seneca over the years – thank you!

We want to be a great project to contribute to, a safe project for any developer, and a friendly community. We’re lucky that our plugin architecture gives us a simple mechanism for contributions, and also allows contributors to do things their own way. We will curate the main Seneca organization to keep a consistent and well-tested set of plugins, and of course some rules will be needed to do that. That said, we want to live by principles, not regulations.

The microservices architecture is very young, and is fertile territory for research and experimentation. This is our contribution.



