State machines are widely used to implement network protocols or, more generally, objects that have to react to external events.
Consider the TCP state machine:
During its lifetime a TCP socket moves through the different states in the diagram. When you start connecting it's in the SYN SENT state; when the initial handshake is over it's in the ESTABLISHED state, and so on.
And here's an interesting observation: The API of the socket changes as you move from one state to another. For example, it doesn't make sense to receive data while you are still connecting. But once you are connected, receiving data is all right.
To give a more illustrative example, have a look at the SOCKS5 protocol. It's basically a TCP or UDP proxy protocol. It's used, for example, by Tor. It starts with an authentication phase, supporting different kinds of authentication. Then it moves to a connection establishment phase. Once again, there are different ways to connect: you can connect to an IPv4 address, to an IPv6 address or to a hostname. Finally, the state machine moves to one of the working states. This can be a TCP connection or a UDP connection.
Note how the API changes between the states. In the CLOSED state you can call functions such as connect_unauthenticated or connect_password. In the AUTHENTICATED state you can call connect_tcp, bind_tcp or open_udp. In the TCP ESTABLISHED state you can do normal stream socket operations, while in the UDP ESTABLISHED state you can do datagram operations.
This requirement of a mutating API is at odds with how state machines are normally implemented: there's a single object representing the connection during its entire lifetime. Therefore, a single object must support the different APIs.
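For concreteness, the API of such a single connection object might be declared roughly like this. The signatures are invented for illustration (Addr stands in for whatever address type the snippets below use), but note that each group of calls only makes sense in one particular state:

class Socks5 {
public:
    // Only meaningful in the CLOSED state.
    void connect_unauthenticated(Addr proxy_addr);
    void connect_password(Addr proxy_addr, const char *user, const char *password);
    // Only meaningful in the AUTHENTICATED state.
    void connect_tcp(Addr addr);
    void bind_tcp(Addr addr);
    void open_udp();
    // Only meaningful in the TCP ESTABLISHED state.
    size_t send(const void *buf, size_t len);
    size_t recv(void *buf, size_t len);
    // Only meaningful in the UDP ESTABLISHED state.
    void send_datagram(const void *buf, size_t len);
    size_t recv_datagram(void *buf, size_t len);
private:
    enum { CLOSED, AUTHENTICATED, TCP_ESTABLISHED, UDP_ESTABLISHED } state;
};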
What it leads to is code like this:
void Socks5::connect_tcp(Addr addr) {
    if(state != AUTHENTICATED) throw "Cannot connect in this state.";
    ...
}
Which, let's be frank, is just an implementation of a dynamically typed language on top of a statically typed one.
In other words, by implementing state machines this way we are giving up proper type checking. While the compiler would be perfectly able to warn us if connect_tcp was called in the CLOSED state, we give up on that possibility and check the constraint at runtime instead.
This sounds like bad coding style, but it turns out that the programming languages we use fail to provide tools to handle this kind of scenario. It's not network programmers who are at fault, but rather programming language designers.
The closest you can get is having a different interface for each state and, whenever a state transition happens, closing the old interface and opening a new one:
auto i1 = socks5_socket();
auto i2 = i1->connect_unauthenticated(proxy_addr);
// i1 is an invalid pointer at this point
auto i3 = i2->connect_tcp(addr);
// i2 is an invalid pointer at this point
But note how ugly the code is. You have invalidated variables (i1, i2) hanging around. If you accidentally try to use them, you'll get a runtime error. And imagine what the code closing the socket would have to look like!
So you try to "undeclare" those variables, but the only way to "undeclare" a variable is to let it fall out of scope:
socks5_tcp_established *i3;
{
    socks5_authenticated *i2;
    {
        auto i1 = socks5_socket();
        i2 = i1->connect_unauthenticated(proxy_addr);
    }
    i3 = i2->connect_tcp(addr);
}
You've got what you wanted — only i3 is declared when you get to the end of the block — but you aren't better off. Now you have uninitialized variables at the beginning. And I am not even speaking of how ugly the code looks.
Anyway, this rant is addressed to programming language designers: What options do we have to support such a mutating API at the moment? And can we do better?
December 24th, 2018
session types.
list of languages: http://groups.inf.ed.ac.uk/abcd/session-implementations.html
overview: http://www.di.unito.it/~dezani/papers/sto.pdf
That looks interesting, but it's hard to find an example to get a feeling for how the code would look. There's even a C implementation — based on 0mq :) — but I haven't found any examples. Can you give an example of how, say, the socks5 usage would look?
You can do things with Akka Typed that will go a long way toward Session Types:
https://github.com/rkuhn/akka-typed-session
Also Rust:
https://github.com/Munksgaard/session-types#example
Seconded. A more general approach is called linear typing. The point of a linear type is that you can only use a variable of such a type once, and you must use it once. Linear types are usually used to reason about resource usage, but they can also serve as a solution for "evolving" types like the ones you're looking for. Rust uses a weaker notion of such types (affine types), while GHC Haskell has introduced proper linear types. In both languages they are used for reasoning about resource usage.
As for an example, imagine your second example, only you cannot use i1 again; that would give a compile-time error.
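To get a rough feel for this in today's C++ (only an approximation, since standard C++ has no linear types): if every transition is an rvalue-reference-qualified member function, a state object has to be explicitly given up with std::move, and calling a transition on a plain named object doesn't compile. Reusing a moved-from object is still only caught by static analyzers (e.g. clang-tidy's bugprone-use-after-move check), not by the compiler; that last step is what a real linear type system adds. All names below are invented:

#include <utility>

struct Addr {};  // stand-in for the post's address type

class socks5_tcp_established {};

class socks5_authenticated {
public:
    // The '&&' qualifier means the call consumes the object: it can only be
    // invoked on an rvalue, i.e. after std::move.
    socks5_tcp_established connect_tcp(Addr addr) && { (void)addr; return {}; }
};

class socks5_closed {
public:
    socks5_authenticated connect_unauthenticated(Addr proxy_addr) && { (void)proxy_addr; return {}; }
};

int main() {
    Addr proxy_addr, addr;
    auto i1 = socks5_closed();
    auto i2 = std::move(i1).connect_unauthenticated(proxy_addr);  // consumes i1
    auto i3 = std::move(i2).connect_tcp(addr);                    // consumes i2
    // i1.connect_unauthenticated(proxy_addr);  // does not compile: i1 is an lvalue
    (void)i3;
}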
Internally it's actually quite simple to implement linear types, it's simply a matter of how you propagate/merge variable contexts.
Also see https://gankro.github.io/blah/linear-rust/
Now having said this, I believe this kind of typing (or in fact any "static" attempt at reducing boilerplate) is *not* useful for state machines, at least not for things like network protocols where the event ordering is outside of the program's control. In fact I would say that the "evolving interface problem" you pose isn't applicable to such state machines.
The core of the issue you're describing is that there's a disconnect between a state machine-like program's dynamic state space and the programmer's conceptual state space. The former seems "too large", larger than what seems necessary.
This is because when we reason about a stateful program, the efficient reasoning strategy is, as you describe, to think of it as transitions from one interface to another, where each interface/state has its own set of events triggering further transitions. Then we can neatly scope our reasoning to particular states; we don't need to think of seemingly impossible scenarios.
However note that when it comes to external events at runtime, *any* event can happen at *any* time, in *any* order, and it doesn't matter what we as programmers think of as a valid event sequence. Any static knowledge we tried to enforce in the state machine through clever typing would mean we're making implicit assumptions about runtime event ordering. This is all the more true for network protocols where we don't just get reordering, we also get duplication and even complete loss.
Another way of putting this would be: the runtime of a program *doesn't care* that a certain state+event combination doesn't fit the programmer's conceptualization of the state space. In theory the combination could happen, so the program had better be prepared for it, even if it's an error condition. From experience I can tell that a lot of these "impossible" states do in fact happen (although usually in extreme conditions), and the verbosity of handling all of them does pay off, as it's very easy to reason about such a condition when it happens if the program is structured as a state machine.
I would also argue that the sheer act of listing all state+event combinations helps to clarify and simplify these protocols in the programmer's head.
So when are session/linear types useful? When the event producer is composed with the event consumer *in the same program*. In these cases the program *can* enforce event ordering and have it align with the desired protocol, ruling out impossible scenarios statically. But network protocols, or in fact any kind of software that handles external events, don't fit this category.
i don't think there's any conflict between the two. in fact, if you encode the valid state transitions into the type system, the typechecker will *force* you to validate the incoming messages and only "let them in" if they follow your encoded "conceptualization" of the problem. if your code might result in an invalid message getting in, you (hopefully :) get a type error.
That's true, but I think it is possible to treat most of the weird cases as protocol violations and simply close the socket. Think of it as pruning the state machine. You cut off any weird states and you are left with a state machine that matches the programmer's conceptual space and can be modeled by linear types.
Of course, that violates Postel's law, but I am not sure whether, in the current security landscape, anyone still believes that treating a misbehaving peer liberally is a good idea.
How would such a program look like?
The academic literature uses process algebras like the pi calculus as the base programming languages. These usually have a parallel composition operation that composes two processes, and the typing rule for such a composition is the static check ensuring protocol conformance.
This looks neat on paper; however, in practice programs don't look like this. In practice, the composition of things like network protocols happens on the process boundary, through sockets/file handles. So I cannot *statically* check that two separate peers conform to one another; their interfaces are bytes! This means that there must be a runtime translation layer somewhere that "upgrades" the untyped socket bytestream into a (linearly) typed message sequence, and this layer will need to handle non-conforming input all the same. It's only after this "unmarshalling" that the program can use typed sequences.
So then the question is: is it worth splitting the state machine's runtime error handling into a 'protocol check' phase and an 'everything else' phase? Having thought about it, perhaps the answer is yes. Perhaps we can think of this protocol conformance check as simply an extension of the usual deserialization. Instead of parsing a bytestring into a typed data structure, we "parse" a bytestream into a typed protocol. And perhaps, just like with parser generators/metaprogramming, we could automate this boilerplate too!
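As a sketch of what that "upgrade" layer might look like in C++ (every name here is invented; the 0x05/0xFF check corresponds to SOCKS5's version byte and its "no acceptable methods" reply): the conformance check happens exactly once, at the boundary, and on non-conforming input no typed object is ever produced:

#include <cstdint>
#include <optional>
#include <vector>

struct raw_socket { /* untyped byte stream underneath */ };
struct socks5_authenticated { raw_socket sock; };

// "Parse" one protocol step of the bytestream into a typed state. On a
// protocol violation the caller simply closes the socket; the rest of the
// program only ever sees well-typed state objects.
std::optional<socks5_authenticated>
upgrade_to_authenticated(raw_socket s, const std::vector<std::uint8_t> &method_reply) {
    if (method_reply.size() != 2 || method_reply[0] != 0x05 || method_reply[1] == 0xFF)
        return std::nullopt;
    return socks5_authenticated{s};
}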
Ok, having contemplated this further this could be viable. We just need a capable language hehe
With C++11 you could use lambdas. It adds some boilerplate syntax, but you get what you want: the kind of mistake your examples warn about (using i1 instead of i2 for connect_tcp) won't compile, because i1 is out of scope. Also, all variables are always initialized.
For this particular example you can just chain methods, but the builder pattern should allow the same solution for more complicated cases as well:
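The snippet that originally accompanied this comment isn't preserved, so here is only a reconstruction of the idea using the post's own names: chaining keeps the intermediate states as unnamed temporaries, and a C++11 lambda keeps them confined to its own scope:

// Chained calls: the intermediate states never get names, so they cannot be
// misused later.
auto tcp = socks5_socket()
               ->connect_unauthenticated(proxy_addr)
               ->connect_tcp(addr);

// With a lambda, the intermediate variables exist only inside its body.
auto tcp2 = [&] {
    auto i1 = socks5_socket();
    auto i2 = i1->connect_unauthenticated(proxy_addr);
    return i2->connect_tcp(addr);
}();
// Referring to i1 or i2 here fails to compile: they are out of scope.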
Linear and affine types. As seen in: Idris, Rust, and Linear Haskell.
The blog post "A hammer you can only hold by the handle" is relevant to this topic.
Also in Rust, http://insanitybit.github.io/2016/05/30/beyond-memory-safety-with-types
You make the correct observation that certain events are not relevant at different moments in the life cycle. It would be more robust to let the specific current state get the first chance to handle an event and, if it doesn't handle it, pass it to a subsuming state. This is the idea of hierarchical state machines ("statecharts"), invented by David Harel in the late 80's. Statecharts are part of the UML and there are several good implementations.
A statechart approach would automatically capture and allow proper error handling for events that are not appropriate for the current state, and can even gracefully handle events that occur in the momentary transition between states (which takes some finite time, albeit very small).
Miro Samek (http://state-machine.com) has a lot of good materials on this.
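A minimal sketch of that first-chance dispatch (types and names invented for illustration): the current state sees the event first, and anything it doesn't handle bubbles up to its subsuming (parent) state, which gives one well-defined place to deal with "impossible" combinations:

struct Event {};

struct State {
    State *parent = nullptr;  // the subsuming state, if any
    virtual ~State() = default;
    // Returns true if the event was handled at this level.
    virtual bool handle(const Event &e) { (void)e; return false; }
};

// Walk up the hierarchy until some state handles the event.
void dispatch(State *current, const Event &e) {
    for (State *s = current; s != nullptr; s = s->parent)
        if (s->handle(e))
            return;
    // Fell through the whole hierarchy: log it or treat it as a protocol
    // error instead of silently dropping the event.
}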
Maybe I'm missing something obvious here, but wouldn't a Visitor pattern enable relatively clean compile-time type-checking?
Too many people confuse terseness with cleanliness. Clean code is expressive and easy to reason about. Sometimes verbosity is cleaner than terse syntax. I argue that nested if statements with well-named predicate function calls are immensely cleaner than trying to build a type system that handles every possible case as a class or interface.
Reginald Braithwaite deals with this by mutating the API of a javascript state engine implementation based on the current state in this series of articles: http://raganwald.com/2018/03/03/reflections.html
I'd probably do it like this
Respecting an automaton in an API is a typical use case for contracts (preconditions and postconditions). See, for example, the first paragraph in this section of the SPARK formal programming language User's Guide: http://docs.adacore.com/spark2014-docs/html/ug/en/source/how_to_write_subprogram_contracts.html#writing-contracts-for-functional-correctness
Of course, that means you need a tool that can formally analyze your code, and a programming language that supports such a tool. It's possible in C with Frama-C, but it's more difficult than with a language like SPARK that is meant for such formal analysis.
Have not had a chance to read all the comments, but has anyone seen Sing Sharp?
https://en.wikipedia.org/wiki/Spec_Sharp#Sing_Sharp
I've read the documents on the Singularity operating system. It looked awesome. Inter-process communication was type-safe, memory-safe, statically checked at compile time and used a finite state machine to control the flow of messages. Sing-Sharp was extended from Spec-Sharp, which was extended from C#.
I would really enjoy hearing from anyone that has read the Singularity research.