I've written about state machines before.
The message I was trying to convey was that using state machines is superior to relying on callbacks.
Which it is. Most definitely.
However, state machines have a bunch of problems of their own.
In this article I want to discuss these problems in detail and propose what has to be done to replace state machines with standard linear code.
What's wrong?
Let's go straight to the problem: State machines are large and brittle.
Typically, there's no built-in support for state machines in your language, and so you end up managing a large amount of boilerplate code. (In theory, you could generate the boilerplate, but code generation is super contributor-unfriendly and is thus better avoided.)
What's worse, the code is complex and hard to reason about. It's extremely easy to get wrong.
In practice, the common traversal paths through the state machines are written and debugged early in the development cycle, and what you are left with are bugs in rarely used paths dealing with stuff like timeouts, cancelations or errors. Interactions between different state machines multiply the complexity, and the result is rarely seen, non-deterministic heisenbugs that take ages to reproduce and fix.
What is a state machine?
To fix the problems with state machines, we first have to have a clear understanding of what a state machine is and what it isn't.
To put it simply, a state machine is a computation, an algorithm.
It is very similar to a 'function' (also known as a procedure, method or subroutine), but unlike a function it splits the computation into multiple smaller steps. The added value of the splitting is that the state machine can be interrupted each time it finishes one step and before it starts processing the next one.
But what's the point? Why would anyone want to execute a function, which is a perfectly sane and nicely encapsulated unit of computation, in several steps?
And here we are getting to the core of the problem: You want to split the function because it may take too long to finish — it can do a lot of work or maybe just idly wait for I/O — and in the meantime it blocks other stuff from being executed.
The typical use of state machines is to run a step of one state machine, then a step of another one, and so on. In short, a state machine is a very lightweight thread that the user can schedule by hand. It is a manual alternative to an OS process or an OS thread.
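To make this concrete, here's a minimal hand-rolled example (a sketch; the counting task is arbitrary):

    // A computation split into steps: it can be interrupted between
    // any two steps, because all its state lives in the struct rather
    // than on a call stack.
    struct Countdown {
        name: &'static str,
        i: u32,
        n: u32,
    }

    impl Countdown {
        // Run one step; return false once the computation is finished.
        fn step(&mut self) -> bool {
            if self.i == self.n {
                return false;
            }
            println!("{}: {}", self.name, self.i);
            self.i += 1;
            true
        }
    }

    fn main() {
        let mut a = Countdown { name: "a", i: 0, n: 3 };
        let mut b = Countdown { name: "b", i: 0, n: 3 };
        // Manual scheduling: interleave the two machines step by step.
        // (`|` rather than `||`, so both machines run on every pass.)
        while a.step() | b.step() {}
    }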
Why state machines?
The obvious question thus is: Why state machines? Why not processes or threads?
And the obvious answer is: Performance.
When UNIX was still young, scheduling was supposed to be done by the OS on a per-process basis. When implementing a network server, for example, you were supposed to fork a new instance of the process for each TCP connection and rely on the OS scheduler to switch between the processes.
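Translated into today's terms, the pattern looks something like this (a sketch assuming the libc crate; a real server would also reap its child processes):

    use std::net::{TcpListener, TcpStream};

    fn handle(_conn: TcpStream) {
        // serve the connection, then exit
    }

    fn main() -> std::io::Result<()> {
        let listener = TcpListener::bind("0.0.0.0:7000")?;
        for conn in listener.incoming() {
            let conn = conn?;
            // One process per connection; the OS scheduler interleaves them.
            match unsafe { libc::fork() } {
                0 => {
                    handle(conn);
                    std::process::exit(0); // child is done
                }
                -1 => return Err(std::io::Error::last_os_error()),
                _pid => {
                    // Parent: drop its copy of the socket and keep accepting.
                }
            }
        }
        Ok(())
    }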
I guess it made sense from a performance point of view back then. All the stuff that makes process switching slow today didn't exist yet. Machines had a single processor; there was no virtual addressing and there were no memory caches. In such an environment, process switching was closer to what we call green threads today.
Later on came threads. Threads are like processes but operate in the same address space, so there's no need to switch the TLBs. Thus, the cost of a context switch is lower.
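The thread-based version of the same server needs no forking (a sketch using only the Rust standard library):

    use std::io::{Read, Write};
    use std::net::{TcpListener, TcpStream};
    use std::thread;

    fn echo(mut conn: TcpStream) {
        let mut buf = [0u8; 1024];
        while let Ok(n) = conn.read(&mut buf) {
            if n == 0 {
                break;
            }
            if conn.write_all(&buf[..n]).is_err() {
                break;
            }
        }
    }

    fn main() -> std::io::Result<()> {
        let listener = TcpListener::bind("0.0.0.0:7000")?;
        for conn in listener.incoming() {
            let conn = conn?;
            // One thread per connection: same address space, cheaper switches.
            thread::spawn(move || echo(conn));
        }
        Ok(())
    }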
Finally, we've got green threads. These are typically implemented at the language level and avoid even more cost. The latest and most optimised incarnation is Golang's goroutines. They are cooperatively scheduled, so there is no cost associated with preemption. They have tiny stacks, maybe a couple of kilobytes. They can even be scheduled on multiple CPU cores like real OS threads.
Performance-wise, goroutines are comparable to hand-written state machines. It seems that there's no more room for radical performance improvement.
So, we can forget about state machines and use green threads instead, right?
Unfortunately, no. Green threads, even if they bring us a long way towards the goal, aren't sufficient to eliminate all state machines. In the next blog post I'll dig into the remaining problems in more detail.
Martin Sústrik, January 5th, 2016
Working on exactly this matter at the moment. I am really curious about the next post…
State machines are also used to implement business processes. And for that use case I don't really see a good alternative. But I'm curious for your next post!
Good point. Business processes require the state of the computation to be persistent, i.e. to survive a process restart. And there's (currently) no way to persist the state of a green thread. It may be an interesting area for research.
Well, there's green threads, there's green threads, and there's green threads.
To be more precise, there's stackful with explicit cooperative rescheduling (possibly also hidden in library calls), stackful with automatically inserted preemption points, and stackless.
The first case is how (say) Duktape coroutines work. An explicit yield construct, etc. With this, one can pickle the state pretty feasibly, as the preemption point will always be at a semantically sensible point in the execution.
The second case is more how goroutines and many other language-level green threads work. These tend to be harder to pickle, less because of the actual green threads and more because this technique sees more use in systems that push their state down into external systems, like the kernel's asynchronous I/O system. That's not something you can serialize or persist.
The third case is where it gets really interesting - that's stuff like the C# async/await construct, which works by inducing the compiler to _transform_ the coroutine into a state machine.
It also has the least overhead of any of the options, since it needs no stack at all - and if performance is your goal, that takes green threads (on Linux, or other OSes with efficient schedulers like the L4 family) from "mostly irrelevant" to "compelling." At least on Linux, any performance advantage of stackful green threads is essentially in the noise, and that's without considering that 'large stacks' aren't a feature of OS threads, but rather of glibc. Musl allows considerably smaller per-thread stacks.
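Concretely, the third case turns an awaiting function into a flat state machine. A hand-written sketch of the idea in Rust (the Poll type and all names here are made up; real async machinery passes wakers rather than Option values):

    // A coroutine like:
    //     let a = read_a().await;
    //     let b = read_b().await;
    //     a + b
    // becomes, schematically, a resumable state machine. Locals that
    // live across an await point move into the enum instead of a stack.
    enum ReadBoth {
        Start,
        AwaitingA,            // suspended at the first await
        AwaitingB { a: i32 }, // suspended at the second await
        Done,
    }

    enum Poll {
        Ready(i32),
        Pending,
    }

    impl ReadBoth {
        // Each call runs until the next suspension point and returns.
        fn resume(&mut self, input: Option<i32>) -> Poll {
            match *self {
                ReadBoth::Start => {
                    // would start the first read here
                    *self = ReadBoth::AwaitingA;
                    Poll::Pending
                }
                ReadBoth::AwaitingA => match input {
                    Some(a) => {
                        // would start the second read here
                        *self = ReadBoth::AwaitingB { a };
                        Poll::Pending
                    }
                    None => Poll::Pending,
                },
                ReadBoth::AwaitingB { a } => match input {
                    Some(b) => {
                        *self = ReadBoth::Done;
                        Poll::Ready(a + b)
                    }
                    None => Poll::Pending,
                },
                ReadBoth::Done => Poll::Pending,
            }
        }
    }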
But honestly, I think that saying that the need for persistence is why business processes use state machines fundamentally misses the real reason:
By and large, business processes are _specified_ as state machines - the flowcharts and process diagrams behind it all. And in the end, one of the best things one can do for maintainability is to keep a strong correspondence between the spec and the code.
I don't really care whether my language translates my function into something with its own stack or whether it's clever enough to generate a state machine from it.
All I want is to write simple imperative code that's an order of magnitude easier to understand and manage than an equivalent state machine.
And, arguably, green threads of one sort or another bring us closer to this goal.
As for business processes, it's kind of a chicken-and-egg problem. There's no way to build them using simple linear code (because of persistence), so everybody goes with state machines both for implementation and specification. If it were the other way round, everyone would just go with imperative programming, I guess.
I am currently studying the Idris (http://idris-lang.org) language, which is an eagerly evaluated, dependently typed functional programming language.
In it, one can specify each state as a different type and a "function" (of types) that transforms one type/state into another type/state.
The result is that any deviation from the state machine specification is a compile time error.
Edwin Brady explains this in his talk: https://youtu.be/kpPyolN65s0?t=14m17s
One can do the same thing with Rust, tidily enough:
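(A minimal sketch of the idea; the Timeout type and the method bodies are placeholders.)

    // Each protocol state is a distinct type. Operations consume the
    // current state and return the next one, so out-of-order calls
    // simply don't typecheck.
    #[derive(Debug)]
    struct Timeout;

    struct AwaitingRequest;
    struct HandlingRequest;

    impl AwaitingRequest {
        fn listen(_port: u16) -> AwaitingRequest {
            AwaitingRequest
        }
        fn recv(self) -> Result<HandlingRequest, Timeout> {
            Ok(HandlingRequest)
        }
        fn close(self) {}
    }

    impl HandlingRequest {
        fn reply(self, _msg: &str) -> Result<AwaitingRequest, Timeout> {
            Ok(AwaitingRequest)
        }
    }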
With that, a sequence like this compiles:
    AwaitingRequest::listen(80).recv().expect("Timed out receiving").reply("Read you loud and clear").expect("Timed out sending").close();
But this would fail:
    AwaitingRequest::listen(80).reply("Me first!");
As would this:
    AwaitingRequest::listen(80).request();
In addition, each of those types takes up no more space than their members, and the methods are statically dispatched with no overhead.
Oh, also, those states would be allocated on the stack. (The String type in Rust is heap-allocated, but state machines that never touch the heap are really trivial to implement.)
Yes, Rust is orders of magnitude better than C.
I would prefer, though, if there were a separation between the state machine specification and the implementation.
I think that, with the use of traits, that can be achieved.
Rust is cool.
Yup - if I wanted to be more abstract, I could have done something like this:
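(A sketch, reusing the types from the earlier snippet; the Step trait is made up.)

    // The specification lives in the trait; each state supplies its own
    // implementation and names its successor state.
    trait Step {
        type Next;
        fn step(self) -> Result<Self::Next, Timeout>;
    }

    impl Step for AwaitingRequest {
        type Next = HandlingRequest;
        fn step(self) -> Result<HandlingRequest, Timeout> {
            self.recv()
        }
    }

    impl Step for HandlingRequest {
        type Next = AwaitingRequest;
        fn step(self) -> Result<AwaitingRequest, Timeout> {
            self.reply("ack")
        }
    }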
This would allow:
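    // (sketch, continuing the hypothetical Step trait above)
    let handling = AwaitingRequest::listen(80).step().unwrap();
    let awaiting = handling.step().unwrap();
    awaiting.close();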
(Note that the "unwrap"s are not idiomatic; in real code where the transition may vary at runtime, you'd use match { } to extract the values)
What would you think of a program communicating with other devices using Bluetooth, for example? I can see a clear-cut state machine for that: Connecting -> [Connection Succeeded] -> Reading -> [Value received] -> Disconnecting
What would an argument against it be?
Great post by the way, I loved reading it as I've run into similar problems! Found it as one of the first hits when Googling :)
Try adding timeout to the state machine (e.g. the value wasn't received in 1 min). Then try adding cancelation (e.g. user canceled the process while waiting for connection to succeed, user canceled the process while waiting for the value, user canceled the process while disconnection was happening). Very soon your nice state machine will turn into an ugly mess.
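Sketched in code, the growth is easy to see (hypothetical names; the happy path is the first three match arms, everything after them is the timeout/cancelation tax):

    enum State {
        Connecting,
        Reading,
        Disconnecting,
        Done,
        Failed(&'static str),
    }

    enum Event {
        Connected,
        ValueReceived(u8),
        Disconnected,
        TimedOut,
        Cancelled,
    }

    fn next(state: State, event: Event) -> State {
        use Event::*;
        use State::*;
        match (state, event) {
            // The happy path from the comment above:
            (Connecting, Connected) => Reading,
            (Reading, ValueReceived(_)) => Disconnecting,
            (Disconnecting, Disconnected) => Done,
            // Timeouts: every waiting state needs an extra exit...
            (Connecting, TimedOut) | (Reading, TimedOut) => Failed("timed out"),
            (Disconnecting, TimedOut) => Failed("timed out while disconnecting"),
            // ...and cancelation: canceling mid-session must still
            // disconnect cleanly, so it isn't a simple exit either.
            (Connecting, Cancelled) | (Reading, Cancelled) => Disconnecting,
            (Disconnecting, Cancelled) => Disconnecting,
            (s, _) => s, // ignore events that don't apply in this state
        }
    }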