Daniel Lyons authored
Messaging Architecture
Messaging is a pervasive concept in the archive and workspaces systems. Messaging in the archive is typed, thanks to Java and a library we developed for the archive called "channels." With "channels," a small amount of up-front work defines the types that will be exchanged over AMQP and how they will be encoded; application code can then use that message definition to instantiate both senders and receivers.
Early attempts to port this functionality to Python revealed that it simply isn't workable, because Python has no static compilation step to leverage this way. Using types to try to encode message patterns quickly became bulky and un-Pythonic. If we want messaging to form a cornerstone of workspaces, it has to be effective and easy to read and write.
The result of our analysis is this library, messaging.
Messages are Python dictionaries
Python dictionaries have a 1:1 correspondence with JSON objects, provided the keys are strings and the values are themselves JSON-compatible. The standard library's json module converts between Python dictionaries and JSON. This contrasts with Java, where there is no such correspondence and the third-party libraries all provide extensible object-encoding functionality.
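As a minimal sketch of that round trip, using only the standard json module (the field names are illustrative, borrowed from the examples in this document):

```python
import json

# A message as a plain dictionary.
message = {"subject": "capability request", "id": 23, "state": "Complete"}

# Encode to a JSON string, suitable for use as an AMQP message body.
body = json.dumps(message)

# Decode back to a dictionary on the receiving side.
decoded = json.loads(body)

# The round trip preserves the message exactly.
assert decoded == message
```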
JSON is always the AMQP message encoding format for SSA, because it is structured, easy to parse and human-readable, and thus easier to debug.
The cost of this decision (which was made long ago in the archive project) is essentially that:
- You cannot send binary data without encoding it somehow into a string (such as via base64)
- There is some efficiency loss compared to binary message encodings
In practice, the overhead is not likely to be a problem until we reach thousands of messages a second, which is unlikely to occur with our messaging regime.
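To illustrate the first cost, here is a sketch of carrying binary data inside a JSON message by base64-encoding it. The subject and data keys are assumptions made for this example, not part of any defined message format. Base64 text is roughly a third larger than the bytes it encodes, which is part of the efficiency loss noted above.

```python
import base64
import json

payload = bytes([0, 255, 16, 32])  # arbitrary binary data

# Encode the bytes as ASCII text so they can live inside a JSON string.
message = {
    "subject": "binary example",  # hypothetical key and value
    "data": base64.b64encode(payload).decode("ascii"),
}
body = json.dumps(message)

# The receiver reverses both steps to recover the original bytes.
received = json.loads(body)
restored = base64.b64decode(received["data"])

assert restored == payload
```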
Message sends are function calls
So the message format is essentially Python dictionaries. But Python dictionaries also have a 1:1 correspondence with function calls via function **kwargs. So we define a message Router with a single method of interest, send_message:
class Router:
    def send_message(self, **kwargs): pass
We can construct a message on the fly as we send it. For instance, this sends a message saying that a certain capability request is complete:
router.send_message(subject='capability request', id=23, state='Complete')
If we happen to have the message in a dictionary already, we can send it as well using the ** unpacking syntax:
msg = {'subject': 'capability request', 'id': 23, 'state': 'Complete'}
router.send_message(**msg)
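The equivalence of the two call styles can be checked with a small stand-in. SketchRouter and its publish hook are hypothetical names invented for this sketch; the real Router presumably hands the encoded body to AMQP, which is not modelled here.

```python
import json
from typing import Callable, List


class SketchRouter:
    """Hypothetical stand-in for Router; not the library's implementation."""

    def __init__(self, publish: Callable[[str], None]):
        # `publish` is an assumed hook that would pass the body on to AMQP.
        self._publish = publish

    def send_message(self, **kwargs):
        # The keyword arguments arrive as a dictionary, which is the message.
        self._publish(json.dumps(kwargs))


sent: List[str] = []
router = SketchRouter(sent.append)

# Build the message on the fly...
router.send_message(subject="capability request", id=23, state="Complete")

# ...or unpack an existing dictionary.
msg = {"subject": "capability request", "id": 23, "state": "Complete"}
router.send_message(**msg)

# Both call styles produce the same message body.
assert sent[0] == sent[1]
```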
Message receipt is also a function call
Message receipt is also a function that takes keyword arguments. As far as the user is concerned,
router.send_message(foo='bar', ...)
leads directly to
def receiver(foo=foo, ...): ...
without the user having to spend any thought on AMQP at all.
In practice, most message recipients are not interested in the entire content of the message. Instead they will be interested in one aspect or another, and we do not want to have to update the code at every receipt site because the message format changed in some way that doesn't matter to the recipient. So the majority of receivers have the signature:
def receiver(**message: Dict): ...
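A short sketch of why the catch-all signature is robust to format changes. The requested_by field and the strict_receiver function are hypothetical, added only to show the contrast with an explicit parameter list.

```python
def receiver(**message):
    # Only the 'state' key matters here; any other keys are accepted and ignored.
    return message.get("state")


def strict_receiver(subject, id, state):
    # Explicit parameters break as soon as the message gains a new key.
    return state


old_format = {"subject": "capability request", "id": 23, "state": "Complete"}
new_format = {**old_format, "requested_by": "someone"}  # hypothetical new field

# The catch-all receiver handles both formats unchanged.
assert receiver(**old_format) == "Complete"
assert receiver(**new_format) == "Complete"

# The strict receiver rejects the extended message with a TypeError.
try:
    strict_receiver(**new_format)
    strict_failed = False
except TypeError:
    strict_failed = True

assert strict_failed
```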
Message recipients use patterns to select messages of interest
Recipients annotate their callback methods with an @on_message pattern to indicate that they are interested in receiving a certain message:
@on_message(service="workflow", type="delivery")
def on_delivery(self, **message: Dict): ...
The meaning of this annotation is that the method on_delivery will be called whenever a message passes through that has a service key with value workflow and a type key with value delivery, or in other words, whenever the message is a superset of {'service': 'workflow', 'type': 'delivery'}. The annotation has no effect until a Router is told about these methods using router.register, so classes that send and receive messages typically contain something like this in their __init__ method:
self.message_router = Router("capability")
self.message_router.register(self)
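The superset rule and the registration step can be sketched with an in-memory stand-in. Everything below is an assumption made for illustration: LocalRouter delivers messages directly instead of going through AMQP, takes no constructor parameter, and the _pattern attribute and matches helper are invented names. It is not the library's implementation.

```python
from typing import Any, Callable, Dict, List, Tuple


def on_message(**pattern):
    """Hypothetical decorator: tag a method with the pattern it wants."""
    def decorate(func):
        func._pattern = pattern  # assumed attribute name
        return func
    return decorate


def matches(pattern: Dict[str, Any], message: Dict[str, Any]) -> bool:
    # True when every key/value pair of the pattern appears in the message,
    # i.e. the message is a superset of the pattern.
    return all(k in message and message[k] == v for k, v in pattern.items())


class LocalRouter:
    """In-memory stand-in for Router; delivers directly, with no AMQP."""

    def __init__(self):
        self._handlers: List[Tuple[Dict[str, Any], Callable[..., Any]]] = []

    def register(self, obj):
        # Collect the bound methods that @on_message tagged with a pattern.
        for name in dir(obj):
            method = getattr(obj, name)
            pattern = getattr(method, "_pattern", None)
            if callable(method) and pattern is not None:
                self._handlers.append((pattern, method))

    def send_message(self, **message):
        # Call every registered handler whose pattern the message satisfies.
        for pattern, handler in self._handlers:
            if matches(pattern, message):
                handler(**message)


class DeliveryListener:
    def __init__(self, router):
        self.seen = []
        router.register(self)

    @on_message(service="workflow", type="delivery")
    def on_delivery(self, **message):
        self.seen.append(message)


router = LocalRouter()
listener = DeliveryListener(router)
router.send_message(service="workflow", type="delivery", id=7)  # superset: delivered
router.send_message(service="workflow", type="ingest", id=8)    # type differs: skipped

assert listener.seen == [{"service": "workflow", "type": "delivery", "id": 7}]
```

Until register is called, the decorator has only attached a pattern to the function; nothing is invoked, which mirrors the behaviour described above.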
The Router itself only exposes its constructor and the two methods register and send_message. The parameter to the Router is used to create a topology in the AMQP server; we use this topology to keep the Capability and Workflow services separated.