Preface (you know it’s good if there’s a preface)
In Architectural Paradigms of Robotic Control, a number of architectures were reviewed including deliberative, reactive, and hybrid architectures. Each of these exhibits a clean separation of concerns with layering and encapsulation of defined behaviors. When implemented, the various capabilities, such as planners and mobility controllers, are encapsulated into discrete components for better reusability and maintainability. A pivotal aspect not discussed in the previous article is how the various system layers and components communicate with each other, such as reporting sensor feedback and sending commands to actuator controllers. Effectively resolving this communication challenge is important not only to robotic systems but also to many other industries and domains for the successful integration of disparate applications.
To give credit where credit is due, this article pulls quite heavily from the patterns, taxonomy, and best practices presented in Enterprise Integration Patterns (Hohpe, 2003). This well-organized book is chock full of hard-learned lessons and solid guidelines for developing maintainable message-based systems. This article should not be seen as an adequate replacement for that book (it’s more like CliffsNotes with a smattering of robotics bias); indeed, Enterprise Integration Patterns should have a prominent place on your bookshelf if you’re developing message-based systems – so read this post and then browse http://www.eaipatterns.com/ to delve deeper while waiting for your copy to arrive.
A Need for Message-Based Systems
A few industries in particular, such as finance, healthcare and robotics, are demanding integration of an intimidating number of separate technologies that may be spread across computers and networks and built upon a variety of technological platforms. Not only is this integration tricky, it can come with a significant cost to performance and maintainability if not implemented correctly. Accordingly, a solution is needed which facilitates loosely coupled integration while accommodating the performance demands of the task at hand. Taking a message-oriented approach to inter-application communications is one way to meet these demands in a maintainable manner without sacrificing performance. This article gives an introduction to developing message-based systems using messaging middleware, describes a taxonomy for discussing messaging topics and patterns, and includes a number of best practices.
Before delving further, it’s important to clarify a few terms that will be used frequently:
- Messaging Middleware (aka, a message bus): a 3rd party application which provides messaging infrastructure and capabilities (e.g., MSMQ, MS Concurrency and Coordination Runtime (CCR), Robot Operating System (ROS)),
- Component: a stand-alone application or piece of executable code which communicates with the messaging middleware,
- Message-based system: the entirety of the system including all integrated components and the messaging middleware.
As stated, messaging provides one means of facilitating inter-component communications. But as with any design approach, the project requirements must be carefully considered to determine if messaging is the appropriate mechanism for integration. While messaging is robust and facilitates integration, it also adds complexity and indirection. So before deciding to use messaging as the means of integration, consider all component integration options including (Hohpe, 2003):
- File Transfer: wherein a component produces files of shared data which other components consume,
- Shared Database: each component stores and retrieves data from a common database,
- Remote Procedure Invocation: each component exposes specific procedures to be invoked remotely for exposing behavior and exchanging data,
- Messaging: each component connects to a common messaging system, using messages to invoke behavior and exchange data.
Determining which integration approach is most suitable to your project’s needs is beyond the scope of this article, which focuses specifically on messaging. In turn, we’ll review important elements of developing a message-based system including: message channels, messages, message routers, and message endpoints.
When a component sends information to another component in a message-based system, it adds the information to a message channel. The receiving component then retrieves the information from the message channel. A different channel is created for each kind of data to be carried; having a separate channel for each datatype lets receiving components know what kind of data will be retrieved from a given channel. Each channel has an address that components use to send messages to it and retrieve messages from it. How a channel is addressed varies depending on the messaging middleware being leveraged, but it’s usually a port number or a unique string identifier. As a good practice for keeping channels organized, if string identifiers are available, a hierarchical naming convention may be employed to label channels by type and name; e.g., a channel carrying laser scans might be called “Perception/LaserScans.”
There are two basic kinds of message channels:
- Point-to-Point Channels (aka – client/server style): routes messages for components to talk directly with other components; e.g., a remote procedure call to another component. A message over a point-to-point channel only has a single receiver; so while the sender may not necessarily know who the receiver is, the sender can rest assured that the message will only be received by one receiver – it’s a FIFO queue.
- Publish-Subscribe Channels (aka – broadcast): routes messages for components to publish data and an arbitrary number of components to subscribe to that data. A copy of the message is generated for each subscriber on the channel.
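The difference between the two channel kinds can be sketched in a few lines of Python. This is an in-process toy rather than the API of any real middleware; the class names `PointToPointChannel` and `PublishSubscribeChannel` and the channel name "Mobility/DriveCommands" are hypothetical and chosen only for illustration.

```python
import copy
import queue


class PointToPointChannel:
    """Toy FIFO channel: each message is delivered to exactly one receiver."""

    def __init__(self, name):
        self.name = name  # hierarchical string identifier (assumed addressing scheme)
        self._queue = queue.Queue()

    def send(self, message):
        self._queue.put(message)

    def receive(self):
        # Removing the message guarantees no other receiver can get it.
        return self._queue.get_nowait()


class PublishSubscribeChannel:
    """Toy broadcast channel: every subscriber gets its own copy."""

    def __init__(self, name):
        self.name = name
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, message):
        for callback in self._subscribers:
            callback(copy.deepcopy(message))  # one copy per subscriber


commands = PointToPointChannel("Mobility/DriveCommands")
commands.send({"speed": 0.5})
commands.send({"speed": 0.0})
first = commands.receive()  # oldest message comes out first

scans = PublishSubscribeChannel("Perception/LaserScans")
mapper_inbox, planner_inbox = [], []
scans.subscribe(mapper_inbox.append)
scans.subscribe(planner_inbox.append)
scans.publish({"ranges": [1.2, 1.3]})
```

Real middleware adds what this sketch leaves out: delivery across process and machine boundaries, persistence, and serialization.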
In addition to channels intended to carry information among components, it is a good practice to set up an invalid message channel to which malformed or unreadable messages may be forwarded for logging and to assist with debugging.
With a function call, a simple parameter or object reference may be passed and retrieved by the invoked method, sharing the same memory space. But when passing data between two processes with separate memory spaces, the data must be packaged into a “message” adhering to an agreed upon format which the receiver will be able to disassemble and understand. The sender of the message passes the message via a message channel. The receiver retrieves the message from the message channel and transforms the message into internal data structures appropriate for the task at hand.
A message is made up of two parts:
- Header: describes the data being transmitted and details concerning the message itself; e.g., origin, timestamp information, message expiration (if content is time-sensitive), message identifier, correlation identifier, return address, etc., and
- Body: the data content that the receiver is looking to use.
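As a rough illustration of the header/body split, the structure could be modeled as below. The field names mirror the examples listed above, but the classes themselves (`Header`, `Message`) and the `is_expired` helper are assumptions made for this sketch, not the format of any particular middleware.

```python
import time
import uuid
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Header:
    origin: str
    message_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    timestamp: float = field(default_factory=time.time)
    expires_at: Optional[float] = None       # None means the message never expires
    correlation_id: Optional[str] = None     # set on replies
    return_address: Optional[str] = None     # channel a reply should be sent to


@dataclass
class Message:
    header: Header
    body: Any  # the data content the receiver wants to use

    def is_expired(self, now=None):
        now = time.time() if now is None else now
        return self.header.expires_at is not None and now > self.header.expires_at


# A time-sensitive laser scan that is only useful for half a second.
scan = Message(
    Header(origin="laser_driver", expires_at=time.time() + 0.5),
    body={"ranges_m": [1.2, 1.3]},
)
```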
When sending a message, the sender intends for the message to be used, or responded to, in a particular way. The intention of the message may be described as being one of the following:
- Command Message: invokes a procedure in another application,
- Document Message: passes a set of data to another application,
- Event Message: notifies another application of a change in state, and
- Request-Reply: requests a reply from another application.
Event messages deserve a bit more discussion. In its simplest form, an event message would simply be informational, letting subscribers know that an event has occurred; e.g., a new laser scan is available. If subscribers would like details concerning the event, they would send a request-reply to the sender of the event asking for further details; e.g., the laser scan details. Alternatively, the event could be sent as a document message, informing subscribers that an event has occurred and including the details of that event; e.g., a new laser scan is available with laser scan details included. The size of the event details and the frequency with which the event occurs should be considered when deciding between publishing simple event messages and document messages as events.
Request-reply messages could also use a bit more describing. A request-reply is usually implemented as two point-to-point channels. The first channel delivers the request as a command message, while the second carries the reply back to the requestor as a document message. To keep the replying component more loosely coupled and reusable, the requestor should include a return address indicating the channel that the replier should use to publish the reply. After receiving the reply, a challenge for the requestor is to then correlate the reply to the original request. If the requestor is sending a number of requests in succession, it will likely be difficult to keep clear – if it matters – which request a reply is associated with. To resolve this, every request may include a unique message identifier that the replier would then include as a correlation identifier. (A message could have both a message Id and a correlation Id.) The requestor uses the correlation identifier to “jog its memory” concerning which request the response is for. But frequently, a request-reply is in context of a particular domain object, such as a terrain map or a bank transaction; but the correlation Id doesn’t include such information. To assist, the requestor can maintain a mapping (e.g., hashtable) between message Ids and relevant domain object Ids which are related to the original request. When the reply is received, the mapping may be used to load the appropriate domain objects and take further action, accordingly.
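The return address, correlation identifier, and Id-to-domain-object mapping described above can be sketched as follows. Channels are plain in-memory deques here, and every name (`Mapping/Requests`, `Planner/Replies`, `request_terrain_map`, the message layout) is hypothetical; treat it as a toy under those assumptions.

```python
import itertools
from collections import deque

# Two point-to-point channels: one for requests, one for replies.
channels = {"Mapping/Requests": deque(), "Planner/Replies": deque()}
_ids = itertools.count(1)

# Requestor-side mapping: message Id -> domain object Id.
pending = {}


def request_terrain_map(map_id):
    message_id = f"msg-{next(_ids)}"
    pending[message_id] = map_id
    channels["Mapping/Requests"].append({
        "header": {"message_id": message_id, "return_address": "Planner/Replies"},
        "body": {"command": "get_terrain_map"},
    })
    return message_id


def replier_step():
    request = channels["Mapping/Requests"].popleft()
    reply = {
        "header": {
            "message_id": f"msg-{next(_ids)}",
            # Echo the request's message Id back as the correlation Id.
            "correlation_id": request["header"]["message_id"],
        },
        "body": {"cells": [0, 1, 0]},
    }
    # The replier does not hard-code the reply channel; it uses the return address.
    channels[request["header"]["return_address"]].append(reply)


def requestor_step():
    reply = channels["Planner/Replies"].popleft()
    map_id = pending.pop(reply["header"]["correlation_id"])
    return map_id, reply["body"]


request_terrain_map("map-north")
request_terrain_map("map-south")
replier_step()
replier_step()
results = [requestor_step(), requestor_step()]
```

Note that the reply body says nothing about which map it belongs to; only the `pending` mapping lets the requestor recover that.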
Obviously, it is important that the senders and receivers of a message system agree upon the format that messages will take for clear interoperability, better reusability of components, and extensibility of the system. Consequently, a canonical data model should be well defined that all applications will adhere to. The canonical data model does not dictate how each application’s domain model must be structured, only how each application must format data within messages. Message translators are developed to convert the sending application’s domain model into the canonical data model before sending a message; receivers of messages then use their own message translators to translate the message into their own domain. This mechanism allows applications built on completely different technologies (e.g., C#, Lisp, and C++) to communicate with each other and exchange data. Many off-the-shelf messaging systems define their own canonical data model, which must be adhered to. For example, the Robot Operating System (ROS), which we’ll be looking at in more detail in subsequent posts, defines its canonical model at http://www.ros.org/wiki/msg. But the canonical data model need not be limited to defining the types of primitives available and how to include them in messages.
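A message translator pair might look like the sketch below. It assumes, purely for illustration, a canonical model in which a laser scan is a JSON object with ranges in meters, while one component’s domain object (`LaserScan`) stores millimetres. Neither format comes from ROS or any other real middleware.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class LaserScan:
    """Hypothetical domain object of one component; ranges are in millimetres."""
    ranges_mm: List[int]


def to_canonical(scan: LaserScan) -> str:
    """Sender-side translator: domain object -> canonical message body."""
    return json.dumps({
        "type": "LaserScan",
        "ranges_m": [r / 1000 for r in scan.ranges_mm],
    })


def from_canonical(payload: str) -> LaserScan:
    """Receiver-side translator: canonical message body -> domain object."""
    data = json.loads(payload)
    if data.get("type") != "LaserScan":
        raise ValueError("unexpected message type")
    return LaserScan([round(r * 1000) for r in data["ranges_m"]])


wire_format = to_canonical(LaserScan([1200, 1350]))
received = from_canonical(wire_format)
```

A receiver written in another language would implement only its own `from_canonical` equivalent against the same agreed format.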
Domain-specific canonical data models may augment message formatting rules, adding semantic meaning to the data within a message. For example, the Joint Architecture For Unmanned Systems (JAUS) is a set of message guidelines for the domain of unmanned systems, such as autonomous vehicles. The JAUS guidelines provide domain specific rules for communicating data, such as propulsion and braking commands, sensor events, pose and location information, etc. To demonstrate, JAUS message types include (Siciliano, 2008):
- Command: initiate mode changes or actions,
- Query: used to solicit information from a component,
- Inform: response to a query,
- Event set up: passes parameters to set up an event, and
- Event notification: sent when the event happens.
A challenge in dealing with canonical data models is how to handle changes to the model. In order to support backwards compatibility of existing components when the canonical data model changes, new message channels could be created to carry the messages adhering to the new model; e.g., “Perception/LaserScans_V1” and “Perception/LaserScans_V2.” Alternatively, the existing channels could continue to be leveraged to carry messages adhering to different versions of the canonical data model. To do so, the message, within its header, would include a format indicator, such as a version number or format document (e.g., DTD) reference. But if a sender knows that the receivers of a particular message expect different formats, it would need to send two messages, one for each version of the canonical data model. Certainly, this is an important consideration when deciding which components should (or even can) be upgraded to newer formats, and in what order.
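The format-indicator approach can be sketched as a receiver that picks a parser based on a version number in the header. The `format_version` field name and the two body layouts are invented for this example.

```python
def parse_v1(body):
    # Hypothetical V1 body: a bare list of ranges in meters.
    return {"ranges_m": list(body), "angle_step_deg": None}


def parse_v2(body):
    # Hypothetical V2 body: a dict that also carries the angular resolution.
    return {"ranges_m": list(body["ranges_m"]), "angle_step_deg": body["angle_step_deg"]}


PARSERS = {1: parse_v1, 2: parse_v2}


def read_scan(message):
    # Messages without an indicator are assumed to predate versioning.
    version = message["header"].get("format_version", 1)
    if version not in PARSERS:
        raise ValueError(f"unsupported format version: {version}")
    return PARSERS[version](message["body"])


old = read_scan({"header": {}, "body": [1.2, 1.3]})
new = read_scan({
    "header": {"format_version": 2},
    "body": {"ranges_m": [1.2, 1.3], "angle_step_deg": 0.5},
})
```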
While the heavy lifting of the message routing is handled by the messaging middleware itself, there are times when it is useful to augment the middleware with custom message routers to support unique scenarios.
Suppose the destination of a message may change based on the number of messages that have already passed over a channel. In this scenario, the sender of a message may not know how many messages have been passed over a channel since other senders may have been publishing messages on the same channel. Consequently, a message router may subscribe to the channel to determine where each message should be forwarded to, based on the described business rules. Once the destination is determined, the router would then place the message on a subsequent channel to be delivered to the appropriate destination. This intermediary routing is described as predictive routing as the message router is aware of every possible destination and the rules for routing, accordingly. If the routing is based on content within the message itself, such as threshold values, then the custom router is known as a content-based router.
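A content-based router can be sketched as an ordered list of rules, each pairing a predicate on the message with a destination channel. The class name, the threshold values, and the use of plain lists as channels are all assumptions for illustration.

```python
class ContentBasedRouter:
    """Forwards each message to the first channel whose rule matches its content."""

    def __init__(self, rules, default_channel):
        self._rules = rules  # list of (predicate, channel) pairs, checked in order
        self._default_channel = default_channel

    def route(self, message):
        for matches, channel in self._rules:
            if matches(message):
                channel.append(message)
                return channel
        self._default_channel.append(message)
        return self._default_channel


# Plain lists stand in for the subsequent channels.
emergency_stop = []
path_planner = []
archive = []

router = ContentBasedRouter(
    rules=[
        # Threshold values are made up for the example.
        (lambda m: min(m["body"]["ranges_m"]) < 0.3, emergency_stop),
        (lambda m: min(m["body"]["ranges_m"]) < 2.0, path_planner),
    ],
    default_channel=archive,
)

for ranges in ([0.2, 1.5], [1.0, 4.0], [5.0, 6.0]):
    router.route({"header": {}, "body": {"ranges_m": ranges}})
```

The router knows every possible destination, which is what makes this routing predictive.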
A drawback to using message routers is that if the routing rules change frequently, the message router will need to be modified just as often. To help remedy this, if the rules are expected to change frequently, configurable routing rules (e.g., via XML) could be employed to enable easier management of routing rules.
Let’s now consider another scenario wherein it’s left up to the subscribers to determine which messages they’re interested in; i.e., subscribers will be responsible for filtering out the messages they’re uninterested in. In this reactive routing scenario, each subscriber would provide a respective message filter which is similar to a message router, but simply forwards, or does not forward, a message onto a subsequent channel that the destination subscriber is listening to. Frequently, message filters decide to forward, or not forward, based on content in the message itself; e.g., only forwarding orders that have a coupon included. While being similar to a message router in basic functionality, a message filter only has one possible channel to forward the message onto.
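A message filter is simpler still: one predicate and exactly one outbound channel. The sketch below reuses the coupon example from the text; `MessageFilter` and the order layout are hypothetical.

```python
class MessageFilter:
    """Forwards a message to its single outbound channel, or drops it."""

    def __init__(self, predicate, outbound):
        self._predicate = predicate
        self._outbound = outbound  # a list stands in for the subsequent channel

    def on_message(self, message):
        if self._predicate(message):
            self._outbound.append(message)
            return True
        return False  # silently dropped; the subscriber never sees it


coupon_orders = []
coupon_filter = MessageFilter(lambda order: bool(order.get("coupon")), coupon_orders)

for order in ({"id": 1, "coupon": "SPRING"}, {"id": 2, "coupon": None}, {"id": 3}):
    coupon_filter.on_message(order)
```

Each subscriber supplies its own filter, so adding or removing a subscriber touches no central routing rules.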
Deciding between predictive and reactive routing must take into account a number of considerations. Is the message content sensitive? Do you need to minimize network traffic? Do you need to be able to add and remove subscribers easily? Is the predictive router becoming a bottleneck of message dissemination? For further guidance on selecting among routing options, see (Hohpe, 2003), pp. 241-242.
Each messaging middleware option (e.g., CCR, ROS) has unique requirements for communicating with it. Each has its own API, its own means of addressing channels, and its own rules for packaging messages. Ideally, the components of the system should not be aware of the specifics of communicating with the messaging middleware. Furthermore, while unlikely to occur, the middleware should be able to be replaced with another, requiring little, if any, changes to the components. Accordingly, the components must be loosely coupled to the messaging middleware. Message endpoints provide the bridge between the domain of each component and the API of the messaging middleware.
Message endpoints are similar in nature to repositories when communicating with a database. Repositories encapsulate the code required to communicate with a database to store and retrieve data while being able to convert information from the database into domain objects. If the database changes, or if the mechanism for database communication changes (e.g., ADO.NET to NHibernate), then, ideally, only the repositories are affected. The rest of the application knows little about database communications outside of the repository interfaces. Likewise, components of a message-based system should not be aware of messaging details outside of the message endpoint interfaces, which provide the means to send and receive data. The message endpoint accepts a command or data, converts it into a message, and publishes it onto the correct channel. Additionally, the message endpoint receives messages from a channel, converts the content into the domain of the component, and passes the domain objects to the component for further action. Internally, the message endpoint implements a message mapper to convert between the component domain objects and the canonical data model.
While posting to a channel is rather straightforward, a message endpoint may receive a message by acting as a:
- Polling Consumer: wherein the receiver looks at a channel on a regular basis for new messages and/or as soon as it completes the processing of a previous message. This frees the receiver from having to deal with messages as soon as they arrive on the channel in favor of dealing with messages when it’s ready and willing. A consideration to keep in mind is that messages may queue up on a channel while waiting to be retrieved by the polling consumer. Additionally, a polling consumer may take up threads and resources while polling a channel, even if the channel is empty.
- Event-Driven Consumer: wherein the message is given to the receiver as soon as it arrives on the channel. The benefits to this include avoiding messages queuing up while being able to process messages asynchronously. But the receiver is no longer in control of the timing in which it processes messages and must handle messages as soon as they arrive on a channel.
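The contrast between the two consumer styles can be sketched as follows. Both classes are toy, in-process stand-ins with hypothetical names; real middleware would typically invoke an event-driven handler from its own thread or event loop.

```python
import queue


class PollingConsumer:
    """The receiver decides when to look at the channel."""

    def __init__(self, channel, handler):
        self._channel = channel
        self._handler = handler

    def poll_once(self):
        try:
            message = self._channel.get_nowait()
        except queue.Empty:
            return False  # nothing waiting; the poll still cost a call
        self._handler(message)
        return True


class EventDrivenChannel:
    """The channel hands each message to the receiver as soon as it arrives."""

    def __init__(self, handler):
        self._handler = handler

    def send(self, message):
        self._handler(message)


# Polling: messages queue up until the consumer is ready for them.
polled = []
backlog = queue.Queue()
consumer = PollingConsumer(backlog, polled.append)
backlog.put("scan-1")
backlog.put("scan-2")
waiting = backlog.qsize()
while consumer.poll_once():
    pass
found_more = consumer.poll_once()

# Event-driven: the handler runs at send time, on the sender's schedule.
pushed = []
live = EventDrivenChannel(pushed.append)
live.send("scan-3")
```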
Many messaging middleware solutions include support for transactions that the message endpoints may leverage, making the endpoints transactional clients. To illustrate the need for this, suppose the receiver of a request-reply command crashes just moments after the command message is consumed and removed from the channel. When it recovers, the command message is lost and the sender will never receive a reply. Using a transaction, the command message is not removed from the channel until the response is completed and sent. Committing the transaction removes the command message from the channel and adds the reply document message to the reply channel.
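The commit-or-nothing behavior can be imitated with a small context manager. This is only a single-process sketch of the idea; the `transaction` helper and the message contents are invented, and real middleware transactions involve durable storage and failure recovery that are not modeled here.

```python
from collections import deque
from contextlib import contextmanager

requests = deque([{"command": "plan_path"}])
replies = deque()


@contextmanager
def transaction(inbound, outbound):
    message = inbound[0]   # read the message without removing it
    staged = []            # replies are held back until commit
    yield message, staged.append
    # Reached only if the with-block raised no exception: commit.
    inbound.popleft()
    outbound.extend(staged)


# First attempt: the receiver "crashes" before finishing.
try:
    with transaction(requests, replies) as (command, send_reply):
        send_reply({"status": "partial"})
        raise RuntimeError("simulated crash")
except RuntimeError:
    pass
after_crash = (len(requests), len(replies))

# Second attempt: the command is still on the channel and is handled fully.
with transaction(requests, replies) as (command, send_reply):
    send_reply({"status": "done", "for": command["command"]})
after_commit = (len(requests), len(replies))
```

After the simulated crash the command is still on the request channel and no partial reply has leaked out; only the successful attempt removes the command and publishes the reply.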
There are a few recommendations which should be considered when developing message endpoints. In accordance with the Single Responsibility Principle (SRP), a message endpoint should either receive messages or send messages, but not both. Furthermore, a message endpoint should only communicate with one message channel. If a component needs to send messages on separate channels, it would leverage multiple message endpoints to do so. If your components are developed in line with domain-driven design (DDD), it would be each component’s application services layer which would communicate with the message endpoints, preferably via their interfaces instead of concrete instances. This facilitates swapping out the message endpoints with mock objects for unit testing. Leveraging separated interfaces and dependency injection helps enable this approach. While these guidelines introduce more objects and indirection, they are proven practices for increasing maintainability of the component and reusability of the message endpoints.
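These recommendations might translate into code along the following lines: a send-only endpoint interface bound to one channel, an application service that receives the endpoint through its constructor, and a fake endpoint for unit tests. All class names, the channel name, and the `publish` callable are hypothetical; the callable stands in for whatever the chosen middleware’s API provides.

```python
from abc import ABC, abstractmethod


class DriveCommandSender(ABC):
    """Separated interface: send-only, and tied to a single channel."""

    @abstractmethod
    def send_drive_command(self, speed_mps, heading_deg):
        ...


class MiddlewareDriveCommandSender(DriveCommandSender):
    """Concrete endpoint; the only class that knows about the middleware."""

    CHANNEL = "Mobility/DriveCommands"

    def __init__(self, publish):
        self._publish = publish  # assumed callable: publish(channel_name, message)

    def send_drive_command(self, speed_mps, heading_deg):
        # Message mapper: domain values -> canonical message body.
        self._publish(self.CHANNEL, {"speed_mps": speed_mps, "heading_deg": heading_deg})


class NavigationService:
    """Application service; depends on the interface, not on the middleware."""

    def __init__(self, sender: DriveCommandSender):
        self._sender = sender  # injected dependency

    def stop(self):
        self._sender.send_drive_command(0.0, 0.0)


class FakeDriveCommandSender(DriveCommandSender):
    """Test double that records calls instead of publishing."""

    def __init__(self):
        self.sent = []

    def send_drive_command(self, speed_mps, heading_deg):
        self.sent.append((speed_mps, heading_deg))


fake = FakeDriveCommandSender()
NavigationService(fake).stop()

published = []
real = MiddlewareDriveCommandSender(lambda channel, message: published.append((channel, message)))
NavigationService(real).stop()
```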
Performance of Message Based Systems
One would likely be quick to assume that developing a message-based system has a huge cost to performance. Domain objects are converted to messages, messages are passed over channels, routers and filters intercept and forward messages, and messages are converted back into domain objects…this sounds like a heck of a lot going on. But because each component executes in its own thread or process, they need not wait for other components to complete their job before being able to move on to handling another message or task.
In the figure above (adapted from Hohpe, 2003, pg. 73), note that a sequential process requires that each message life cycle is completed in full before moving on to the next message. But in an asynchronous, message-based system, each component can move onto a subsequent message just as soon as it has completed its part in the last one. This greatly compensates for the extra overhead imparted by the messaging infrastructure.
With that said, there are some component responsibilities which take a long time to complete and may impede the speed at which messages are processed. For example, imagine a component which takes images from a web cam and extracts information such as human figures or road signs. This is likely a time-consuming process and would impact the turnaround time in which the component could process subsequent messages. To alleviate this bottleneck, multiple instances of the same component may listen on the same point-to-point channel. Recall that a point-to-point channel ensures that each message only has a single receiver. The instances of the component then become competing consumers of each message. So if one instance of the component is still processing an earlier message, another instance can grab the next message that arrives for concurrent processing. Power in numbers!
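Competing consumers can be sketched with Python’s standard `queue.Queue` acting as the point-to-point channel and threads acting as component instances. The worker names, the frame labels, and the sleep that stands in for image processing are all made up for the example.

```python
import queue
import threading
import time

channel = queue.Queue()          # point-to-point: each item goes to one worker
results = []
results_lock = threading.Lock()


def detector(name):
    while True:
        frame = channel.get()
        if frame is None:        # sentinel: no more work for this instance
            channel.task_done()
            return
        time.sleep(0.01)         # stand-in for slow image processing
        with results_lock:
            results.append((name, frame))
        channel.task_done()


# Three instances of the same component compete for messages.
workers = [threading.Thread(target=detector, args=(f"detector-{i}",)) for i in range(3)]
for worker in workers:
    worker.start()

for i in range(9):
    channel.put(f"frame-{i}")
for _ in workers:
    channel.put(None)            # one sentinel per instance

channel.join()
for worker in workers:
    worker.join()
```

Every frame is processed exactly once, but which instance handles which frame, and in what order results appear, is not deterministic.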
Monitoring and Debugging
Due to the widely asynchronous nature of message-based systems, attention must be given to monitoring and debugging techniques to observe system behavior and iron out problems.
Logging message content is an invaluable measure towards getting a clear look at what communications are taking place. To facilitate this, logging could be added directly into sending and receiving components, but logging message content should not be a concern of components; accordingly, using a monitoring utility to log such information is a cleaner separation of concerns. Certainly a benefit of publish-subscribe channels is that a monitoring component may subscribe to all messages and log the information to a file or console window. While it’s just as valuable to monitor messages on point-to-point channels, if a monitor were to consume a message over a point-to-point channel, the message would be noted as consumed and would no longer be available to the intended receiver. Because such monitoring capability is so helpful in developing and debugging, many messaging middleware options include a “peek” option which allows a monitoring utility to review message contents on a point-to-point channel without actually consuming the message. This capability should be taken into consideration when comparing messaging middleware alternatives.
In addition to monitoring message content sent over channels, it’s also helpful to monitor active subscriptions to the various channels to determine accurately which components are receiving which messages. This capability is typically built into messaging middleware solutions and is very useful during development.
It’s possible that even if a message is routed appropriately, the receiving component may not know what to do with the message due to an incorrectly formatted header or body content. Invalid messages such as this should be forwarded to an invalid message channel which an error logging utility would monitor and log, accordingly. An invalid message channel is set up just like any other channel but with the intention of exposing such messages for debugging purposes.
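On the receiving side, forwarding to an invalid message channel could look like the sketch below. It assumes, only for illustration, that messages arrive as JSON text with `header` and `body` keys; the function and channel names are hypothetical.

```python
import json

invalid_message_channel = []  # a list stands in for the real channel


def receive(raw, handler):
    """Parse a raw message; forward it to the invalid message channel on failure."""
    try:
        message = json.loads(raw)
        header, body = message["header"], message["body"]
    except (ValueError, KeyError, TypeError) as error:
        # Keep the original payload and the reason so a logging utility can report both.
        invalid_message_channel.append({"raw": raw, "reason": repr(error)})
        return False
    handler(header, body)
    return True


handled = []
outcomes = [
    receive('{"header": {"origin": "laser"}, "body": [1.2, 1.3]}', lambda h, b: handled.append(b)),
    receive('{"header": {"origin": "laser"}}', lambda h, b: handled.append(b)),  # no body
    receive("not json at all", lambda h, b: handled.append(b)),
]
```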
Developing asynchronous, message-based systems requires a paradigm shift from the more traditional, synchronously executed applications that most people are familiar with. It is not appropriate for every application domain and should be seen as one additional architectural option to consider when developing applications. But in some domains, such as robotics or in the integration of disparate applications, this approach to development is absolutely pivotal in providing responsive behavior without sacrificing maintainability of the overall system. Indeed, by splitting responsibilities into discrete components, loosely coupled to each other via messaging middleware, immensely complex problems can be broken down into understandable chunks while being flexible enough to accommodate changes to the underlying middleware or introduction of new components.
In the next couple of posts, we’ll look at a checklist for developing message-based systems followed with examples in CCR and ROS.
Hohpe, G., Woolf, B. 2003. Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions.
Siciliano, B., Khatib, O. 2008. Springer Handbook of Robotics.