Daniel Lyons authored
Messaging Architecture
Messaging is a pervasive concept in the archive and workspaces systems. Messaging in the archive is typed thanks to Java and a library we developed for the archive called “channels.” With “channels” it takes a small amount of up-front work to define the types that will be exchanged over AMQP and how they will be encoded; both sides can then use the message definition to instantiate senders and receivers.
Early attempts to port this functionality to Python revealed that the approach isn’t workable there, because Python has no static compilation step to leverage in this way. Using types to encode message patterns quickly became bulky and un-Pythonic. If we want messaging to form a cornerstone of workspaces, it has to be effective and easy to read and write.
The result of our analysis is this library, messaging.
Messages are Python dictionaries
Python dictionaries with string keys and JSON-compatible values map directly onto JSON objects, and the standard library's json module converts between the two. This contrasts with Java, where there is no such correspondence and the third-party libraries all rely on extensible object encoding functionality.
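As a small illustration of that correspondence, the sketch below round-trips a message through the standard json module. The message contents are just sample values, and the second half shows one caveat worth knowing: the mapping is only lossless when the dictionary keys are strings.

```python
import json

# A message as it might be built in application code (sample values).
message = {'subject': 'capability request', 'id': 23, 'state': 'Complete'}

# Encode to a JSON string for the wire, then decode it again.
encoded = json.dumps(message)
decoded = json.loads(encoded)

# Caveat: JSON object keys are always strings, so a non-string key
# does not survive the round trip unchanged.
lossy = json.loads(json.dumps({1: 'one'}))
```

Here decoded compares equal to message, while lossy comes back keyed by the string '1' rather than the integer 1.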
JSON is always the AMQP message encoding format for SSA, because it is structured, easy to parse and human-readable, and thus easier to debug.
The cost of this decision (which was made long ago in the archive project) is essentially that:
- You cannot send binary data without encoding it somehow into a string (such as via base64)
- There is some efficiency loss compared to binary message encodings
In practice, the overhead is not likely to be a problem until we reach thousands of messages a second, which is unlikely to occur with our messaging regime.
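The first cost above can be sketched as follows. This is only an illustration of the base64 workaround, not something the library does for you; the 'thumbnail' subject and the 'data' key are hypothetical names chosen for the example.

```python
import base64
import json

# Some binary payload that JSON cannot carry directly.
payload = bytes([0, 255, 16, 32])

# Sender side: encode the bytes as an ASCII string before building the message.
message = {
    'subject': 'thumbnail',  # hypothetical subject
    'data': base64.b64encode(payload).decode('ascii'),  # hypothetical key
}
wire = json.dumps(message)

# Receiver side: decode the JSON, then undo the base64 step.
received = json.loads(wire)
restored = base64.b64decode(received['data'])
```

The base64 text is larger than the original bytes (8 characters for these 4 bytes), which is part of the efficiency loss mentioned in the second bullet.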
Message sends are function calls
So the message format is essentially Python dictionaries. But Python dictionaries also have a 1:1 correspondence with function calls via function **kwargs. So we define a message Router with a single method of interest, send_message:
class Router:
    def send_message(self, **kwargs): pass
We can construct a message on the fly as we send it; for instance, this sends a message saying that a certain capability request is complete:
router.send_message(subject='capability request', id=23, state='Complete')
If we happen to have the message in a dictionary, we can send it as well using the ** destructuring syntax:
msg = {'subject': 'capability request', 'id': 23, 'state': 'Complete'}
router.send_message(**msg)
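To see that the two call styles are equivalent, here is a minimal sketch using a stand-in class. RecordingRouter is hypothetical and is not part of the messaging library; it only records what send_message receives instead of publishing to AMQP.

```python
from typing import Any, Dict, List


class RecordingRouter:
    """Hypothetical stand-in for Router: records messages, publishes nothing."""

    def __init__(self) -> None:
        self.sent: List[Dict[str, Any]] = []

    def send_message(self, **kwargs: Any) -> None:
        # Inside the method, the keyword arguments arrive as a plain dict.
        self.sent.append(kwargs)


router = RecordingRouter()

# Style 1: build the message on the fly with keyword arguments.
router.send_message(subject='capability request', id=23, state='Complete')

# Style 2: unpack an existing dictionary with **.
msg = {'subject': 'capability request', 'id': 23, 'state': 'Complete'}
router.send_message(**msg)
```

Both calls leave the same dictionary in router.sent. Note that ** unpacking hands the method a fresh dictionary, so a receiver that mutates it does not affect msg.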
Message receipt is also a function call
Message receipt is also a function that takes keyword arguments. As far as the user is concerned,
router.send_message(foo='bar', ...)
leads directly to
def receiver(foo, ...): ...
without the user having to spend any thought on AMQP at all.
In practice, most message recipients are not interested in the entire content of the message. Instead, they are interested in one aspect or another, and we do not want to have to update the code at the site of each message receipt because the message format changed in some way that doesn’t matter to the recipient. So the majority of receivers have the signature:
def receiver(**message: Dict): ...
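The benefit of the catch-all signature can be sketched like this. The receiver below is a made-up example that only reads the keys it cares about, and the requested_by field is a hypothetical later addition to the message format.

```python
from typing import Any


def receiver(**message: Any) -> str:
    # Only the keys this recipient cares about are read; anything else
    # in the message is accepted and ignored.
    return f"request {message['id']} is {message.get('state', 'unknown')}"


# The message format as originally defined.
old_format = {'subject': 'capability request', 'id': 23, 'state': 'Complete'}

# The same message after the format has grown an extra field.
new_format = {**old_format, 'requested_by': 'someone'}  # hypothetical field

old_result = receiver(**old_format)
new_result = receiver(**new_format)
```

Both calls produce the same result, so the receiver needs no change when the format grows. A receiver declared with explicit parameters only, such as def receiver(subject, id, state), would instead raise a TypeError on the unexpected keyword.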
Message recipients use patterns to select messages of interest
Recipients annotate their callback methods with an @on_message pattern to indicate that they are interested in receiving a certain message:
@on_message(service="workflow", type="delivery")
def on_delivery(self, **message: Dict): ...
The meaning of this annotation is that the method on_delivery will be called whenever a message passes through that has a service key with value workflow and a type key with value delivery, or in other words, the message is a superset of {'service': 'workflow', 'type': 'delivery'}. This annotation is dead until a Router is apprised of the existence of these methods using router.register, so classes that send and receive messages typically contain something like this in their __init__ method:
self.message_router = Router("capability")
self.message_router.register(self)
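The superset rule described above can be written as a small predicate. This is a sketch of the rule only: the name matches is hypothetical, and how the library actually performs the matching is not shown in this document.

```python
from typing import Any, Dict


def matches(pattern: Dict[str, Any], message: Dict[str, Any]) -> bool:
    """Return True when every key/value pair of the pattern is in the message."""
    return all(
        key in message and message[key] == value
        for key, value in pattern.items()
    )


pattern = {'service': 'workflow', 'type': 'delivery'}

# Extra keys in the message do not prevent a match.
hit = matches(pattern, {'service': 'workflow', 'type': 'delivery', 'id': 7})

# A differing value or a missing key does.
wrong_value = matches(pattern, {'service': 'workflow', 'type': 'ingest'})
missing_key = matches(pattern, {'service': 'workflow'})
```

Only the first message matches; the other two are rejected because of the differing type value and the missing type key respectively.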
The Router itself only exposes its constructor and the two methods register and send_message. The parameter to the Router is used to create a topology in the AMQP server; we use this topology to keep the Capability and Workflow services separated.
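To make the interplay of @on_message, register and send_message concrete, here is an in-process sketch. It is an assumption-laden toy, not the library's implementation: LocalRouter, the _message_pattern attribute and DeliveryListener are hypothetical names, and no AMQP server or topology is involved, so the constructor argument is merely stored.

```python
from typing import Any, Callable, Dict, List, Tuple


def on_message(**pattern: Any) -> Callable:
    """Attach a pattern to a method; inert until a router registers it."""
    def decorate(func: Callable) -> Callable:
        func._message_pattern = pattern  # hypothetical marker attribute
        return func
    return decorate


class LocalRouter:
    """Hypothetical in-process stand-in for Router; no AMQP involved."""

    def __init__(self, name: str) -> None:
        self.name = name  # the real Router uses this to build AMQP topology
        self._handlers: List[Tuple[Dict[str, Any], Callable]] = []

    def register(self, obj: object) -> None:
        # Collect every bound method that carries an @on_message pattern.
        for attr in dir(obj):
            member = getattr(obj, attr)
            pattern = getattr(member, '_message_pattern', None)
            if callable(member) and pattern is not None:
                self._handlers.append((pattern, member))

    def send_message(self, **message: Any) -> None:
        # Deliver to each handler whose pattern is a subset of the message.
        for pattern, handler in self._handlers:
            if all(k in message and message[k] == v for k, v in pattern.items()):
                handler(**message)


class DeliveryListener:
    def __init__(self) -> None:
        self.seen: List[Dict[str, Any]] = []
        self.message_router = LocalRouter("capability")
        self.message_router.register(self)

    @on_message(service="workflow", type="delivery")
    def on_delivery(self, **message: Any) -> None:
        self.seen.append(message)


listener = DeliveryListener()
listener.message_router.send_message(service="workflow", type="delivery", id=7)
listener.message_router.send_message(service="workflow", type="ingest", id=8)
```

After these two sends, listener.seen holds only the delivery message. If the register call is removed from __init__, the decorated method is never invoked, which is the sense in which the annotation is dead on its own.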