Next: Messaging & Multiplexing
The idea underlying the "bus" messaging pattern is to provide semantics similar to those of a hardware bus (everyone connected to the bus gets any data sent to the bus), only at a higher layer. So, to use messaging terminology, everyone connected to the message bus gets any message sent to the bus.
This pattern doesn't scale well: at some point the number of applications connected to the bus will grow to the point where they are overloaded by the sheer amount of messages they produce. However, the pattern is widely used in some industries (stock trading) in IPC (message passing within a single machine) and local (broadcasting data on a single LAN) environments. The assumption is that the bus topology is fully controlled by the admins and is never left to scale beyond the point where it breaks.
Given the wide usage of the pattern, a new experimental bus protocol was added to the nanomsg library. The protocol, due to its symmetrical nature, provides just one socket type, named NN_BUS. If the topology is set up right (there is a path between every two nodes in the topology), any message sent to an NN_BUS socket should be received by all the other NN_BUS sockets in the topology, but not by the sender itself.
And a simple example of the usage of the bus socket:
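#include <nanomsg/nn.h>
#include <nanomsg/bus.h>

int main ()
{
    /* Create a bus socket; nn_socket returns -1 on failure. */
    int s = nn_socket (AF_SP, NN_BUS);
    if (s < 0)
        return 1;
    /* Connect to a peer on the bus; nn_connect returns -1 on failure. */
    if (nn_connect (s, "tcp://192.168.0.111:5555") < 0)
        return 1;
    /* Send a 3-byte message to the bus. */
    nn_send (s, "ABC", 3, 0);
    nn_close (s);
    return 0;
}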
Note that, as with any other broadcasting pattern, the message transfer is not reliable. If it were, a single slow, dead or malevolent application would be able to block the whole topology.
While the semantics of the bus protocol are fairly simple, special care is needed to make it work for all possible configurations of the topology: multicast vs. unicast transports, broker-based vs. broker-less topologies, joining multiple partial buses using intermediate devices, etc.
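To make the trade-off concrete, here is a minimal sketch of "drop rather than block". It is not how nanomsg is implemented; the per-peer queue, its capacity and the names peer_queue, peer_offer and bus_broadcast are all assumptions made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define PEER_QUEUE_CAP 4   /* hypothetical per-peer limit */

/* Hypothetical outbound queue kept for each peer on the bus. */
struct peer_queue {
    const char *msgs [PEER_QUEUE_CAP];
    size_t count;
};

/* Try to queue msg for one peer. Returns 1 if queued, 0 if the message
   was dropped because the peer's queue is full. Never blocks. */
int peer_offer (struct peer_queue *q, const char *msg)
{
    assert (q != NULL);
    if (q->count == PEER_QUEUE_CAP)
        return 0;
    q->msgs [q->count++] = msg;
    return 1;
}

/* Offer msg to every peer; a full (slow or dead) peer simply misses it.
   Returns the number of peers that accepted the message. */
size_t bus_broadcast (struct peer_queue *peers, size_t npeers,
                      const char *msg)
{
    size_t i, delivered = 0;
    for (i = 0; i < npeers; ++i)
        delivered += (size_t) peer_offer (&peers [i], msg);
    return delivered;
}
```

With one healthy peer and one peer whose queue is already full, bus_broadcast returns 1: the healthy peer gets the message and the sender is never held up by the slow one.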
First, let's have a look at two trivial examples with no intermediate devices.
The multicast scenario is pretty straightforward:
The only special case to take into account is whether the underlying multicast transport delivers messages back to the sender. If so, the assumption that an NN_BUS socket doesn't receive messages that it has sent itself won't hold. So, it's important that the implementation of such a multicast transport filters out messages sent by itself. It should be relatively easy to do, for example, by filtering based on the source IP address.
Another trivial setup is to connect every node with every other node in the topology by TCP connections. That of course means a lot of connections, but this kind of setup is sometimes used in small deployments (3-4 boxes):
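A minimal sketch of such a filter might look like the following. The mcast_packet structure and the mcast_should_deliver function are hypothetical names invented for this illustration; they are not part of nanomsg.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of a received datagram: the source IPv4 address as
   reported by the transport, plus the payload. */
struct mcast_packet {
    uint32_t src_addr;
    const void *payload;
    size_t len;
};

/* Returns 1 if the packet should be passed up to the NN_BUS socket,
   0 if it is our own message looped back by the multicast transport. */
int mcast_should_deliver (const struct mcast_packet *pkt, uint32_t own_addr)
{
    assert (pkt != NULL);
    return pkt->src_addr != own_addr;
}
```

Note that comparing IP addresses alone cannot tell apart two bus nodes running on the same host, so a real transport would probably have to compare something more specific, such as the source port or a per-node ID.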
This setup should just work with no special cases to take care of.
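To see why this only suits small deployments, it helps to count the connections: linking every pair of n nodes takes n * (n - 1) / 2 of them. The helper below (mesh_connections is just an illustrative name) computes that number.

```c
#include <assert.h>

/* Number of TCP connections needed to link every pair of n nodes. */
unsigned long mesh_connections (unsigned long n)
{
    assert (n < 65536UL);   /* keeps n * (n - 1) from overflowing */
    return n < 2 ? 0 : n * (n - 1) / 2;
}
```

For 4 boxes that is 6 connections, for 10 boxes it is already 45.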
Now, let's have a look at some non-trivial topologies. The most common one is to have a broker (device) in the middle. Every application will connect to the device, which in turn will broadcast every message to every application except for the one that sent it.
To build a device, as is the case with any other scalability protocol, you need a special low-level or "raw" version of the NN_BUS socket:
s = nn_socket (AF_SP_RAW, NN_BUS);
The semantics of such a socket are as follows:
- When a message is received, it is tagged with the ID of the connection it was received on.
- When a message is sent, it is sent to all the connections except for the one identified in the message tag.
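The two rules can be modelled in a few lines of plain C. This is only a simulation of the semantics, not nanomsg code: connections are represented by integer IDs, and raw_bus_fanout is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Given the IDs of all connections on a raw bus socket and the tag
   carried by a message (the ID of the connection it arrived on), write
   the IDs the message would be sent to into out. out must have room for
   nconns entries. Returns the number of IDs written. */
size_t raw_bus_fanout (const int *conns, size_t nconns, int tag, int *out)
{
    size_t i, n = 0;
    assert (conns != NULL && out != NULL);
    for (i = 0; i < nconns; ++i)
        if (conns [i] != tag)
            out [n++] = conns [i];
    return n;
}
```

With connections 1, 2 and 3, a message tagged with 2 goes out on 1 and 3 only, so the application that sent it never gets its own message back.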
I believe it's quite obvious how that works. One special aspect, though, is that due to the fully symmetrical nature of the bus protocol, intermediate devices, unlike devices for other scalability protocols, need just one socket, not two of them. Messages are simply received from the socket and sent back to it:
Finally, let's have a look at how two partial buses can be joined into a single bus using an intermediate device. Note how the raw bus socket semantics allow messages from one sub-bus to pass through the device to the other sub-bus, but prevent them from being republished to the original sub-bus:
The above is the first attempt to formalise the bus pattern as a scalability protocol. The solution may still have its shortcomings. Any help with identifying the deficiencies of the protocol would be appreciated!
Martin Sústrik, February 19th, 2013
If all nodes on a bus are homogeneous, who binds the socket?
The user does. They decide which nodes bind, which connect and which do both. However, once the topology is set up, all the nodes act in exactly the same way.
I think the need for a forwarding raw bus connecting each totally connected sub-bus makes this no use for a friend to friend topology (sparsely connected peers, no hub), and that's a pity.
What I wonder is, could this be extended to the more general case of a sparsely connected group, so every member still receives every message it didn't send, only once? Perhaps, each broadcast contains the ID of the sender, but also the IDs of the receivers, and a message UUID, so each recipient can forward it to its connections that aren't named (adding them to the list), and discard messages whose UUID it has seen (they looped around).
Sorry, I don't follow. How is that different from what's already implemented? Are you asking for cycle prevention to be added to the existing algo?
To give an example, the following topology: A connected to B and C, C connected to D, D connected to E, E connected to F, B connected to E. There is no direct routing between any pair that aren't connected - perhaps they aren't even on the same IP namespace (and might collide IP usage, say A and E both think they are 10.0.0.1), being two-peer LANs created with OpenVPN.
I suppose this could be created by bridging every pair with raw bus. But it would be nice if the sockets could be made to just link up like that.
Or am I misreading it, and it would already be OK to link them up like that?
I set up three processes using bus and tcp and the communication pattern is not as expected.
p1 listens. p2 dials p1. p3 dials p1.
p2 send("hello"). p1 hears "hello". p3 hears nothing.
I was expecting p1 and p3 to hear anything sent from p2.
From the bus code examples it looks like all nodes are required to be a tcp listener and tcp connect to every other node. If that's the case, then how is this different from pubsub? I was expecting to use the tcp channels in both directions. Having to build a mesh of one-way links seems unnecessary when p1 already has a two-way connection to p2 and p3.
Thanks!
In your pattern, p2 is connected directly only to p1, so when p2 sends "hello", only p1 hears it.
You can try this: p1, p2 and p3 all listen, p1 dials p2, p2 dials p3 and p3 dials p1. Then whenever anyone sends, the other two receive.
If you were using pubsub with, say, p1 as the publisher and p2 and p3 as subscribers, only p1 could send, and only p2 and p3 could receive.
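The difference between the two setups can be checked with a small simulation. It assumes, as the answer above implies, that an ordinary (non-raw) bus socket delivers only to its direct peers and never forwards. The adjacency matrix and the bus_receivers function are made up for this illustration and use 0-based indices (p1 is node 0).

```c
#include <assert.h>

#define MAX_NODES 8

/* adj[i][j] != 0 means node i dialed node j; the resulting connection
   carries messages in both directions. Fills heard[] with 1 for every
   node that receives a message sent by sender, assuming no forwarding.
   Returns the number of receivers. */
int bus_receivers (int adj [MAX_NODES][MAX_NODES], int n, int sender,
                   int heard [MAX_NODES])
{
    int i, count = 0;
    assert (n > 0 && n <= MAX_NODES && sender >= 0 && sender < n);
    for (i = 0; i < n; ++i) {
        heard [i] = (i != sender) && (adj [sender][i] || adj [i][sender]);
        count += heard [i];
    }
    return count;
}
```

In the star setup (p2 and p3 dial p1) a message from p2 reaches only p1. In the ring setup (p1 dials p2, p2 dials p3, p3 dials p1) every pair is directly connected, so a message from p2 reaches both p1 and p3.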