Abstract
This article is a follow-up to Edsger Dijkstra's classic paper "Go To Statement Considered Harmful". It takes Dijkstra's premise and applies it to concurrent programming.
Program and Process
Dijkstra says:
My second remark is that our intellectual powers are rather geared to master static relations and that our powers to visualize processes evolving in time are relatively poorly developed. For that reason we should do (as wise programmers aware of our limitations) our utmost to shorten the conceptual gap between the static program and the dynamic process, to make the correspondence between the program (spread out in text space) and the process (spread out in time) as trivial as possible.
My observation is that in code using select, whether the POSIX select() function, the Go select{} statement or an equivalent in another language, the correspondence between the code layout and its execution is severely harmed.
Consider the following example (in Go):
for {
    select {
    case <- channelA:
        doA()
    case <- channelB:
        doB()
    case <- channelC:
        doC()
    }
}
The order of execution of individual clauses has no relationship whatsoever to their ordering in the code.
On the other hand, in the following snippet the ordering corresponds 1:1 to the execution order:
<- channelA
doA()
<- channelB
doB()
<- channelC
doC()
It gets worse when the execution has to preserve some state between the steps:
var varA int = 0
var varB int = 0
var varC int = 0
for {
    select {
    case varA = <- channelA:
        doA()
    case varB = <- channelB:
        doB(varA)
    case varC = <- channelC:
        doC(varB)
    }
}
What happens if case B is executed before case A? And what happens if C is executed after A, but B is missing? We are facing an exponential explosion of complexity.
The latter pattern, on the other hand, keeps complexity at bay:
varA := <- channelA
doA()
varB := <- channelB
doB(varA)
varC := <- channelC
doC(varB)
Ideally, if we want to preserve our ability to maintain the program and reason about it, we should strive to use the linear pattern and avoid the select-based pattern altogether.
The question is whether every select-based program can be converted into a linear program, or whether getting rid of select necessarily limits the expressiveness of the language.
Let's consider different use cases and see what can be done.
State Machines and Event Loops
First of all, select-based programming is used in C to write network servers. The idea is to have many TCP connections to clients opened in parallel, to wait for activity on any of them using select(), then to service whichever happens to ask us to do some work. Readers can readily come up with their own examples in other languages and other problem domains.
This pattern is called event-driven programming or the state machine approach. An interesting observation is that state machines are one of the few cases where using the goto statement still yields nicer code than structured programming. This isn't at all accidental. What we are facing here is Dijkstra's half-century-old problem resurfacing in the concurrent environment.
If we dig deeper into the network server problem, it can be seen that we can get rid of the state machine approach by forking a separate process for each TCP connection. This is the canonical way to solve the problem in the UNIX tradition, which itself is not that much younger than the original 'considered harmful' paper.
The reason why people use select() nonetheless is simple: forking a process per TCP connection is simply too expensive. A program that did so wouldn't be able to cope in a modern high-performance environment.
So we are not facing a conceptual limitation, but rather a relatively simple performance issue. And most modern languages do offer a way to fork lightweight processes (called green threads, coroutines, goroutines and so on) that makes launching one per TCP connection a viable proposition once again.
Go and Timeouts
Interestingly, Go, the language that is closest to the original UNIX philosophy, encourages handling network connections this way, but still features a select{} statement.
We'll explore the remaining use cases in the rest of the article but let's first do away with a trivial case of using select{} for timeouts:
select {
case <- time.After(100 * time.Millisecond):
    handleTimeout()
case <- channelA:
    doA()
}
It seems that timeouts are handled this way in Go because select{} statement was already available and thus the designers of the language took advantage of the mechanism to solve this relatively simple and benign problem. It definitely doesn't look like select{} was introduced to the language because of dire need to solve the menacing timeout problem.
In fact, timeouts can be solved by simply adding an additional parameter to blocking operations. Here's an example in idiomatic C:
int rc = foo(1); /* timeout of 1 second */
if(rc == -1 && errno == ETIMEDOUT)
    handle_timeout();
Two Big Use Cases
That being said, we are left with the cases where select{} is used for parallel communication with different goroutines. These can be split into two big classes. First, there are use cases where we want to communicate with many instances of the same thing. For example, in a network server there may be a goroutine per TCP connection, but all of those want to communicate with the main goroutine. Second, there are use cases where we want to communicate with different things. For example, we have a random number generating goroutine, a TCP connection goroutine, a statistics-gathering goroutine and so on. The main goroutine wants to speak with all of those.
Unlimited Number of Identical Peers
Let's first tackle the problem of "many of the same type". Naively, one could create a channel for each of those peers and then use select{} (or rather reflect.Select(), which allows for an unlimited number of peers) to wait for activity from any of them.
Luckily, the Go language already contains a mechanism to deal with this scenario in a clean way. Specifically, a single Go channel can be used by many senders and many receivers at the same time. Thus, it works as a (de)multiplexing device and removes the need for (de)multiplexing via the select{} statement:
ch := make(chan int)
go worker(ch)
go worker(ch)
go worker(ch)
result := <- ch // waits for a message from any of the 3 workers
One may object that this way we lose the information about which worker the message is coming from. But that's exactly the point. If we are dealing with an arbitrary number of instances of the same type, we positively don't want to handle each of them in a different way. If we do, they are probably not the same thing after all. For example, we may have a bunch of TCP connections, but some of them are HTTP connections while others are SMTP connections. We want to handle them differently. I'll discuss that in the 'different peers' section below.
Before moving there though, let's have a look at a slightly confusing scenario: different peers that we want to handle the same way. For example, our application may receive inputs via console, via email or via SMS. These look like different things, but given that all of them provide the same kind of input, they can be treated as 'multiple instances of the same thing' and connected to the main goroutine via a single channel:
ch := make(chan int)
go consoleHandler(ch)
go emailHandler(ch)
go smsHandler(ch)
result := <- ch // waits for a message from any of the 3 handlers
The lesson from this example is that the word 'same' in the phrase 'many of the same kind' refers to identical semantics, not necessarily to identical implementation.
Limited Number of Different Peers
Having dealt with the 'same thing' case, let's have a look at how to deal with communicating with different kinds of things.
for {
    select {
    case <- channelA:
        doA()
    case <- channelB:
        doB()
    }
}
Let's suppose that A and B are completely independent. There's no ordering relationship between doA's and doB's; they can be executed in any order. They don't pass any information to each other.
In such a case, each of the handlers can be placed in its own goroutine with no need for a select{} statement:
func handlerA(channelA chan int) {
    for {
        <- channelA
        doA()
    }
}
func handlerB(channelB chan int) {
    for {
        <- channelB
        doB()
    }
}
Let's, on the other hand, assume that there is a relationship between A and B. For example, doA returns a value that is then passed to doB. If so, the program can be rewritten in a linear manner:
for {
    <- channelA
    varA := doA()
    <- channelB
    doB(varA)
}
At this point I would like to claim that any program dealing with a limited number of different peers can be transformed into an equivalent program with no select{} statements. It can be done by repeatedly applying the two transformations shown above:
1. Splitting independent stuff into separate processes.
2. Linearisation of the dependent stuff.
TODO: Formal proof. (The raw idea is that if there are two operations, either one is dependent on the other in which case they can be linearised; or they are independent in which case they can be parallelised. Use induction to prove it for any number of operations.)
Closing Remarks
As a final remark, let me note that 'limited number of identical peers' is a trivial case and 'unlimited number of different peers' is not realistic, as it would require an infinite amount of code.
It was also brought to my attention that doing RPC via channels may be a case where select{} statement is truly needed.
The argument goes like this: Imagine we have one server goroutine and many client goroutines. Client goroutines are sending requests to the server goroutine via a single channel as described above. Each message contains a reference to the reply channel. Server is supposed to process the request and send the result to the reply channel.
And the question is: What happens if the reply cannot be sent to the reply channel straight away? The send operation can't block, as that would DoS the entire server. Surely we want to store the reply and select on the reply channel (along with the request channel) until it becomes writable?
Well, no. The channels can already be buffered (in Go, that is). A sufficiently large buffer should smooth out any peakiness in client performance. If the reply cannot be sent, it means the buffer is full and thus that there's something wrong with the client. The correct behaviour in such a case is to shut the client down and possibly log the event.
An RPC server should thus be implemented like this (unfortunately, the example is written in Go and has to use the select{} statement to implement the non-blocking send):
for {
    request := <- requestChannel
    response := processRequest(request)
    select {
    case request.replyChannel <- response:
        continue
    default:
        close(request.replyChannel)
        log.Error("Oh my God! Oh my God! Something bad happened!")
    }
}
Conclusion
If a language contains the following features:
1. Lightweight processes.
2. Channels to communicate between them.
3. Many simultaneous senders per channel.
4. Many simultaneous receivers per channel.
5. Ability to make channels buffered.
Then there is no need for a select statement.
Martin Sústrik, February 20th, 2016
Not all "different kinds of peers" cases can be linearised. A simple example: read-write storage with some processing on read and on write. The processing goroutine will look like this:
And there are use cases when those two channels should not be squashed into one: different contention policies on the channels, routing, exposing the channel(s) as part of an API, etc.
Another use case of select is delivering control signals to goroutines. The simplest case: stopping an intermediate goroutine without closing the original channel (i.e. when another goroutine transparently replaces this one); another case: reconfiguring a goroutine on the fly (e.g. adjusting throttling, flushing caches etc.). Sane out-of-band signalling is an important feature that should not be underestimated.
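That first case can be sketched as follows (all names are illustrative, not taken from the comment above): a forwarding goroutine relays values until an out-of-band stop channel is closed, which terminates the forwarder while leaving the upstream channel open for a replacement.

```go
package main

import "fmt"

// forward relays values from in to out until stop is closed.
// Closing stop stops the forwarder without closing in, so a
// replacement goroutine can take over the upstream channel.
func forward(in <-chan int, out chan<- int, stop <-chan struct{}) {
	for {
		select {
		case v, ok := <-in:
			if !ok {
				return
			}
			out <- v
		case <-stop:
			return
		}
	}
}

func main() {
	in := make(chan int)
	out := make(chan int)
	stop := make(chan struct{})
	go forward(in, out, stop)
	in <- 7
	fmt.Println(<-out) // prints 7
	close(stop)        // detach the forwarder; in stays open
}
```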
The former example seems to be easily parallelisable, i.e. can be split into two goroutines. Am I missing something?
The second case looks more interesting. Both examples cover basically the same scenario. There are two (or more) channels with different QoS connecting two goroutines. I wonder whether implementing a simple prioritised channel wouldn't solve the problem.
Thinking about it, the classic Go select{} statement doesn't seem to address this scenario either. It chooses one of the active channels at random, ignoring the fact that the two may have different QoS and that it should thus choose the one with higher priority. What's the idiomatic Go way to deal with this kind of stuff?
select is used when you want to share state between processing of more than one channel. Otherwise you need mutexes or atomic values, which I would argue are harder to get right than just using a select. I've created a fairly complex state machine that would never have been possible without select.
Can you be more specific about the scenario which can't be implemented without select?
The most common scenario is where I have a channel for data and a second channel for management commands. The management commands control how the data is processed, and the data is the data to be processed. I could implement an envelope for both the data and the management so that they can go over the same channel, but I lose the protection of having an internal-only management channel. I think this is still better than setting up various mutexes to guard an internal store for the management state.
Yes. That's the hand-crafted scheduling scenario. I've just written about it in the next blog post.
I think both Vladimir and Joshua are thinking about a stateful read-write interface like
Would you transform this pattern to pub-sub?
Another possible counterexample is in push/pull dataflow frameworks like gstreamer, where data can be either pushed (as a message) or pulled (as a request,reply pair). Of the four adapters between push,pull connections, three can be written without select (push->push, pull->pull, pull->push), but I believe the push->pull adapter (aka buffer) needs a select. It is useful for e.g. adapting between different sampling rates of audio vs video streams.
As for push->pull scenario, I am not sure I fully understand the question, but if buffer is needed, why not just use a channel to buffer the data?
Why are read and write requests passed on different channels? One would rather expect something like:
Anyway, even in the case of two separate channels, these can be funneled into a single channel by having two helper coroutines (read and write). They would write requests to a single request queue, which would then be serviced as shown above.
To put it in a different way: the commands (reads & writes) have to be sequentialised at some point. This is typically done in the client (the user can do only one command at a time). Still, you can have an advanced client which is capable of multiple overlapping operations, or, which is really the same thing, multiple independent clients. Even in that case the commands have to be sequentialised before reaching the DB engine. This can be done either by having a separate channel for each client and doing select{}, or simply by having a single channel that all the clients write to. The latter is, IMO, preferable from the point of view of readability and maintainability.
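That funneling can be sketched as follows. The request struct and its kind/key/reply fields are my own assumptions for illustration; two helper goroutines copy per-kind channels onto a single queue that the storage engine services sequentially.

```go
package main

import "fmt"

// request unifies reads and writes on a single queue.
// The field names are invented for this sketch.
type request struct {
	kind  string // "read" or "write"
	key   string
	value int
	reply chan int
}

// funnel copies requests from a per-kind channel onto the shared queue.
func funnel(in <-chan request, queue chan<- request) {
	for r := range in {
		queue <- r
	}
}

func main() {
	reads := make(chan request)
	writes := make(chan request)
	queue := make(chan request)
	go funnel(reads, queue)
	go funnel(writes, queue)

	// The storage engine services the single, sequentialised queue.
	go func() {
		store := map[string]int{}
		for r := range queue {
			switch r.kind {
			case "write":
				store[r.key] = r.value
				r.reply <- r.value // acknowledge the write
			case "read":
				r.reply <- store[r.key]
			}
		}
	}()

	ack := make(chan int)
	writes <- request{kind: "write", key: "x", value: 42, reply: ack}
	<-ack // wait for the write to be applied before reading
	reply := make(chan int)
	reads <- request{kind: "read", key: "x", reply: reply}
	fmt.Println(<-reply) // prints 42
}
```

Note that the write is acknowledged before the read is issued; without that, the two funnels would race for the queue and the read could overtake the write.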
"Go, the language that is closest to the original UNIX philosophy"
With all due respect: such a statement, without being properly laid out and profoundly substantiated, to me simply translates to "OK, we have a fanboy writing here"; it's basically a disqualification.
Don't get me wrong, I *do* think that languages are of high importance and that it's about time to have a profound (and even mistrusting) look at them (outside of academia). I *do* agree that some languages are better for the kind of job you're discussing here, that some are less attractive and that some - some major languages among them! - are outright evil and dangerous.
So, a statement like "X is closest to the original UNIX philosophy" or "Y is the best language for safe, reliable and performant network servers" may quite well be justified and valuable, even desirable. It must, however, be very carefully laid out and substantiated.
If it isn't I take it to be worthless blunder from a fanboy.
I don't really care whether it's closest. But Ken Thompson is one of the authors of Go, so shrug, whatever.
To me that statement about Go being the closest to the UNIX philosophy was self-evidently true the moment I read it.
It's actually insightful in a deep way, but I think rmoe is reading a different meaning into it, a more value-judgement-feely meaning instead of the common-pattern-identifying meaning, probably because the design decision parallels and similarities are somewhat "gated truths" - it's only obvious to me what you meant because I have spent a really large amount of time thinking about the relevant design considerations, the underlying unifying base principles on top of which a lot of systems and languages were built.
I think rmoe is right in the sense that obviously those who don't already "see it" are liable to misinterpret that statement or have a negative reaction to it, just like rmoe did. But it's not an absolute, and if you're writing for the audience of people who already get it or are willing to do good-faith idea-fitting on the statement, taking the time to justify it has limited value, and from experience I can say even counter-productive.
So I'm here to say the statement was good, right, and while I'm sympathetic to rmoe's critique, I am glad you just wrote it simply at face value and moved on - it worked well for me.
I think your argument is about structuring state machines, not so much about select. Without independently schedulable lightweight threads, one is forced to conflate logically hierarchical (or peer) state machines all into one. I agree with you that with goroutines, many state machines can be broken up into tinier, hierarchically related ones.
However, you still need select, since it is the equivalent of nondeterministic choice in CSP, and it is a primitive in the latter for a reason. So point #2 just before your conclusion is not always possible, because there is no linearisation there. Independent signals (timeouts, cancellations, peer failures, async I/O completion etc.) are all possibilities, and they may or may not happen. You have to account for a powerset of the possibilities in any execution.
The suggestion of putting all such signals into a single channel to avoid select is fraught with problems:
1. The receiver dictates the type of the channel to be the union of all incoming signals, so the source has to know details that it is completely unconcerned with.
2. Erlang has the model you suggest. At the same time, the language has built-in support for selective receive (which is select in a different form), yet head of line blocking gets in the way more often than you think. In Go, you have to build selective receive yourself, but the essential problem doesn't go away.
3. Flow control affects all sources. There are good arguments against unlimited buffering, and bounded buffered channels impose flow control: one busy source can basically keep out another source feeding to the same channel.
4. This scheme doesn't get rid of nondeterminism anyway, because the first thing the receiver must do is to demultiplex the message and process it. It is identical to the select code.
Two problems with your argument against select for timeouts:
1. Separation of concerns. When I'm writing code, I don't want to be thinking "oh I need to remember to add a timeout parameter to every function." I want to handle timeouts orthogonally to everything else.
2. Composition. If I want to compose N sub-operations into one operation governed by a single timeout, then without select I must record the running time of every sub-operation, subtract it from the time remaining, and feed that as the timeout to the next sub-operation. It's onerous and causes drift in the timeout that scales with the number of sub-operations.
I think these two problems are easily addressed:
1. If the underlying system deeply supports timeouts or cancellations - at every blocking operation or context switch for example - then you *never* have to remember to add a timeout parameter - callers can just impose timeouts on every operation within some scope wherever appropriate. Both this blog author's libdill in C and the very nice trio library in Python are examples of this - assuming the entire system is written on top of primitives that can be timedout or cancelled, timeouts come for free, everywhere.
Which actually gets you the timeouts handled orthogonally that you wanted, just like Go's select on a timeout channel does.
2. This is actually solved by the same way that I described in the previous point - if the underlying primitives can be timed out or cancelled in a scoped way then you get that feature for free.
But if you didn't, that's actually a really trivial pattern to abstract - I have a library that boils down to like ten lines of essential code which lets you do something like (Go-like pseudocode):
timeout := totaltimeout.Timeout(5 * time.Second)
someOperation1(…, timeout.TimeLeft())
someOperation2(…, timeout.TimeLeft())
…
someOperationN(…, timeout.TimeLeft())
and so long as your operations handle zero for the timeout as a "bail immediately" (which can be trivially ensured with a wrapper function if they don't), it works exactly as intended.
One case I don't see obviously cleanly handled by this model, either here or in the follow-up footnote post, is that of collating multiple data sources into different updates to the same state.
Actually I just thought of a solution when I came back after a few hours to write this comment, so this does fit into the model, but I'll write the rest of the post for anyone else who it helps.
For example, I have a program right now which receives information from many different data sources about some networked devices we have. (The data sources are various systems that directly or indirectly talk to our networked devices, or have insights into the devices or their activity for other reasons.)
You can picture each data source as a Go or libdill -style channel, where events arrive frequently - nearly constantly - but effectively at random.
Then each of these pieces of information needs to be parsed (different for each source) into a decision about how to modify the in-memory state for each of those networked devices (the state is shared).
At first glance, this is not a "semantically the same" data sources problem, so it cannot be handled as one channel.
At first glance, this can be trivially split into independent coroutines, but it relies on sharing of state and - assuming that the concurrency can also potentially be parallel at any moment, as in Go - requires bringing mutexes into the equation. That seems strictly worse for readability because now the state manipulations are not exactly local in the code.
So I kept feeling like a select on all of these data source channels was the cleanest way, that kept the semantically related code local in the source text.
However, it finally hit me:
1. Decision on if and how to mutate shared state
2. Mutating shared state
These are two semantically separate things, so they actually fit this no-select-statement model beautifully.
Separate coroutines can read each data source channel, transform the data into some representation of the state change to make, and then write that representation into a single shared state updates channel.
A single coroutine reads just that channel and applies each state change.
Evidence through consilience that this is actually a good idea: doing this automatically factors the logic into more easily testable pieces - the state changes are either data which can be fed into the logic that applies them, or functions which are given the state to update as an argument - each of which lends itself far more easily to unit testing, fuzzing, model-based testing, etc.
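The resulting shape, sketched with invented names (State, update, parseSourceA are all placeholders for illustration): each parser goroutine turns raw events into state-change functions on one shared updates channel, and a single goroutine applies them, so all state mutation is local to one place and no mutexes are needed.

```go
package main

import "fmt"

// State is the shared in-memory state; a device-to-status map
// is just an illustrative stand-in.
type State map[string]string

// An update is a function that mutates the state. Parsers produce
// these instead of touching the state directly.
type update func(State)

// parseSourceA turns raw events from one data source into updates.
func parseSourceA(events <-chan string, updates chan<- update) {
	for ev := range events {
		ev := ev // capture a copy for the closure
		updates <- func(s State) { s["deviceA"] = ev }
	}
	close(updates) // single producer in this sketch
}

func main() {
	events := make(chan string)
	updates := make(chan update)
	done := make(chan State)

	go parseSourceA(events, updates)

	// The single applier goroutine: the only place state is mutated.
	go func() {
		state := State{}
		for u := range updates {
			u(state)
		}
		done <- state
	}()

	events <- "online"
	close(events)
	fmt.Println((<-done)["deviceA"]) // prints "online"
}
```

With several data sources, each parser would write to the same updates channel and the applier (or a sync.WaitGroup) would own the close.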
You're onto something real and significant here Sustrik.