Previous: The Merit of AMQP (part I)
EINTR is one of the POSIX errors that you can get from various blocking functions (send, recv, poll, sem_wait etc.). The POSIX standard unhelpfully describes it as "Interrupted function." Googling for EINTR returns mainly random questions like "I am getting teh EINTR error. What now?" answered mostly by "Just restart the interrupted function."
None of this helps much when you want to handle EINTR correctly and actually understand what you are doing and why. In this blog post I'll try to explain what EINTR is good for and how to handle it in your code.
To understand the rationale behind EINTR, let's do a little coding exercise. Let's write a simple event loop that performs some action for every byte it receives from a socket. And let's pretend there's no EINTR and recv just continues waiting for data whatever happens:
#include <stdio.h>
#include <sys/socket.h>

void event_loop (int sock)
{
    while (1) {
        char buf [1];
        /* Blocks until one byte arrives. */
        recv (sock, buf, 1, 0);
        printf ("perform an action\n");
    }
}
The above program works great. However, interrupting the program using Ctrl+C kills it immediately, which may be a problem if we want to do some clean-up, for example, release some system-wide resources.
To handle Ctrl+C in a custom way we have to implement a signal handler:
#include <signal.h>
#include <stdio.h>
#include <sys/socket.h>

volatile int stop = 0;

/* Only records that Ctrl+C was pressed. */
void handler (int signum)
{
    stop = 1;
}

void event_loop (int sock)
{
    signal (SIGINT, handler);
    while (1) {
        if (stop) {
            printf ("do cleanup\n");
            return;
        }
        char buf [1];
        recv (sock, buf, 1, 0);
        printf ("perform an action\n");
    }
}
Ok. Looks good. What's the problem with that?
The problem is that recv is a blocking function. If Ctrl+C is pressed while the event loop is blocked in recv, you'll get a kind of deadlock: the signal handler is executed as expected and sets 'stop' to 1, but then execution resumes inside recv. The event loop is stuck there and has no opportunity to check whether 'stop' was set to 1.
The deadlock resolves only when new data arrives via the socket. Then 'stop' is checked and the program exits gracefully. However, there's no guarantee that new data will arrive in a reasonable time, so pressing Ctrl+C may seem to have no effect. The program is probably going to terminate at some later point, but at the moment it's just stuck.
Enter EINTR.
The POSIX specification states that when a signal (such as the SIGINT generated by Ctrl+C) is caught while recv is blocked, recv fails with the EINTR error. That allows the event loop to go around again and check the 'stop' variable:
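#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/socket.h>

volatile int stop = 0;

void handler (int signum)
{
    stop = 1;
}

void event_loop (int sock)
{
    signal (SIGINT, handler);
    while (1) {
        if (stop) {
            printf ("do cleanup\n");
            return;
        }
        char buf [1];
        int rc = recv (sock, buf, 1, 0);
        /* Interrupted by a signal: go back and re-check 'stop'. */
        if (rc == -1 && errno == EINTR)
            continue;
        printf ("perform an action\n");
    }
}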
The above code works more or less as expected. When you press Ctrl+C, the program performs the clean-up and then exits.
EDIT: Please note that on some operating systems, to make blocking functions like recv return EINTR, you may have to install the handler using sigaction() without the SA_RESTART flag instead of using signal().
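A minimal sketch of what that could look like, assuming a POSIX system. The function name install_sigint_handler is made up for this illustration, and the flag is declared as sig_atomic_t here rather than the plain int used in the listings above:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t stop = 0;

static void handler (int signum)
{
    (void) signum;
    stop = 1;
}

/* Hypothetical helper: installs the SIGINT handler so that interrupted
   blocking calls fail with EINTR instead of being restarted.
   Returns 0 on success, -1 on error. */
int install_sigint_handler (void)
{
    struct sigaction sa;
    memset (&sa, 0, sizeof sa);
    sa.sa_handler = handler;
    sigemptyset (&sa.sa_mask);
    sa.sa_flags = 0;    /* SA_RESTART deliberately left out */
    return sigaction (SIGINT, &sa, NULL);
}
```

Calling install_sigint_handler () in place of signal (SIGINT, handler) should give the listing above the EINTR behaviour it relies on.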
The moral of this story is that the common advice to just restart the blocking function when EINTR is returned doesn't quite work:
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/socket.h>

volatile int stop = 0;

void handler (int signum)
{
    stop = 1;
}

void event_loop (int sock)
{
    signal (SIGINT, handler);
    while (1) {
        if (stop) {
            printf ("do cleanup\n");
            return;
        }
        char buf [1];
        /* Blind restart: 'stop' is never re-checked inside this loop. */
        while (1) {
            int rc = recv (sock, buf, 1, 0);
            if (rc == -1 && errno == EINTR)
                continue;
            break;
        }
        printf ("perform an action\n");
    }
}
When Ctrl+C is pressed in this case, the signal handler is executed, 'stop' is set to 1, recv returns EINTR, but the program just calls recv again and blocks. 'stop' is thus not checked and the program gets stuck. Ouch.
Instead of remembering these intricacies you can just remember a simple rule of thumb: When handling an EINTR error, check any conditions that may have been altered by signal handlers. Then restart the blocking function.
Additionally, if you are implementing a blocking function yourself, take care to return EINTR when you encounter a signal.
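As a rough illustration of that last point, here is a sketch of a blocking library function that retries internally only for reasons of its own and passes EINTR straight up to the caller. The name lib_recv_byte is hypothetical and not taken from any real library:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical library call: blocks until one byte can be read from fd.
   Returns 1 on success, 0 on end of stream, -1 on error. When a signal
   interrupts the wait, it returns -1 with errno set to EINTR. */
ssize_t lib_recv_byte (int fd, char *out)
{
    for (;;) {
        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        if (poll (&pfd, 1, -1) == -1)
            return -1;      /* EINTR included: report it, don't loop */
        ssize_t rc = read (fd, out, 1);
        if (rc == -1 && errno == EAGAIN)
            continue;       /* non-blocking fd with nothing to read yet */
        return rc;          /* data, end of stream, or error (EINTR too) */
    }
}
```

The caller then applies the rule of thumb above: on EINTR, check its own flags and only then call the function again.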
To give you a real-world example of an incorrectly implemented blocking function, here's a problem we encountered with ZeroMQ a couple of years ago: Ctrl+C did not work when the ZeroMQ library was used from Python (via the pyzmq language binding). After some investigation, it turned out that the Python runtime works more or less like the examples above. If the Ctrl+C signal is caught, it sets a variable in the handler and continues the execution until it gets to a point where signal-induced conditions are checked.
However, the ZeroMQ library used to have a blocking recv function that (oops!) didn't return EINTR and simply ignored signals.
What happened was that the user called ZeroMQ's recv function from Python, which started waiting for incoming data. Then the user pressed Ctrl+C. Python's signal handler handled the signal by noting that the process should be terminated as soon as possible. However, the execution was blocked inside ZeroMQ's recv function, which never returned to the Python runtime, and thus the termination never happened.
Exiting the recv function with EINTR in case of signal solved the problem.
Finally, there are a few fine points to be aware of:
First, there's no EINTR on Windows. My assumption is that blocking functions cannot be combined with Ctrl+C and with decent clean-up on Windows. Maybe there's some esoteric API to handle this kind of situation, but I am personally not aware of it and I would be grateful for suggestions.
Second, even some POSIX blocking functions don't return EINTR in case of a signal. Specifically, this is the case for pthread_cond_wait and pthread_mutex_lock. pthread_mutex_lock is rarely a problem, as it is generally not used to block for an arbitrary amount of time. Mutexes are normally locked only for a very short time until some simple atomic operation is performed. pthread_cond_wait is more of a problem. My suggestion would be to use sem_wait (which returns EINTR) instead of pthread_cond_wait. Once again, if anybody knows how to perform a clean shutdown when pthread_cond_wait gets in the way, let me know!
Third, even EINTR is not completely watertight. Check the following code:
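A sketch of the sem_wait suggestion, assuming the waiting thread is the one that receives the signal and that the handler was installed without SA_RESTART. The function name wait_or_stop is invented for this example, and 'stop' stands for the flag set by the application's signal handler:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <semaphore.h>
#include <signal.h>

/* Assumed to be set to 1 by the application's signal handler. */
static volatile sig_atomic_t stop = 0;

/* Hypothetical helper: waits on the semaphore but gives up once 'stop'
   has been set. Returns 0 if the semaphore was acquired, 1 if a stop
   was requested, -1 on any other error. */
int wait_or_stop (sem_t *sem)
{
    for (;;) {
        if (stop)
            return 1;
        if (sem_wait (sem) == 0)
            return 0;
        if (errno != EINTR)
            return -1;
        /* EINTR: loop around and re-check 'stop' before waiting again. */
    }
}
```

Note that this still leaves a small window between the check of 'stop' and the call to sem_wait.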
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/socket.h>

volatile int stop = 0;

void handler (int signum)
{
    stop = 1;
}

void event_loop (int sock)
{
    signal (SIGINT, handler);
    while (1) {
        if (stop) {
            printf ("do cleanup\n");
            return;
        }
        /* What if signal handler is executed at this point? */
        char buf [1];
        int rc = recv (sock, buf, 1, 0);
        if (rc == -1 && errno == EINTR)
            continue;
        printf ("perform an action\n");
    }
}
As can be seen, if the signal handler is executed after checking the 'stop' variable and before invoking the recv, the program will still be stuck. However, this is not a serious problem. For starters, the period between the check and the recv is extremely short and it is thus very unlikely that the signal handler gets executed precisely at that point. And even if that happens, pressing Ctrl+C for the second time sorts the problem out.
EDIT: It was suggested by Justin Cormack that signalfd can be used to solve the last problem. However, keep in mind that the first two problems remain: it doesn't work on Windows and it doesn't work with pthread_cond_wait. Moreover, it makes your program non-portable (it works only on Linux). Finally, signalfd is not a good option when you are implementing a library. By using signalfd (or, for that matter, your own signal handler) you are messing with the signal handling of the main application. If, for example, the main application already has a signalfd handling the Ctrl+C signal, creating a new signalfd in the library causes the signal to be delivered alternately to the main application and to the library (the first Ctrl+C is sent to the library, the second one to the application, the third to the library etc.). Which, of course, brings the problem back: You have to press Ctrl+C twice to exit the program.
EDIT: Ambroz Bizjak suggests using pselect (and similar functions) to deal with the race condition above. The idea is that signals are blocked for a very short period of time before the blocking call. Once signals are blocked, the flags set by the handlers can be checked; pselect is then called and unblocks the signals in an atomic manner. This trick is even applicable in libraries. If the library exposes a blocking function, you can extend it to expose a p* variant of the function (for example, ZeroMQ could expose zmq_precv in addition to zmq_recv). The user of the library can use this function to handle signals in a race-free way.
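Here is how I understand the pselect trick, as a minimal sketch rather than a definitive implementation. The function precv_byte is hypothetical, and I've arbitrarily chosen to report a requested stop as -1 with errno set to EINTR:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Set by the application's SIGINT handler. */
volatile sig_atomic_t stop = 0;

void handler (int signum)
{
    (void) signum;
    stop = 1;
}

/* Hypothetical p* variant of a one-byte recv. Returns what recv returns,
   or -1 with errno == EINTR if 'stop' was set by the handler. */
ssize_t precv_byte (int sock, char *out)
{
    sigset_t block, orig, waitmask;
    sigemptyset (&block);
    sigaddset (&block, SIGINT);
    /* From here on SIGINT can be delivered only inside pselect. */
    if (sigprocmask (SIG_BLOCK, &block, &orig) == -1)
        return -1;
    waitmask = orig;
    sigdelset (&waitmask, SIGINT);

    ssize_t n;
    for (;;) {
        /* Safe check: the handler cannot run between here and pselect. */
        if (stop) {
            errno = EINTR;
            n = -1;
            break;
        }
        fd_set rfds;
        FD_ZERO (&rfds);
        FD_SET (sock, &rfds);
        /* Atomically unblocks SIGINT for the duration of the wait. */
        if (pselect (sock + 1, &rfds, NULL, NULL, NULL, &waitmask) == -1) {
            if (errno == EINTR)
                continue;   /* go back and re-check 'stop' */
            n = -1;
            break;
        }
        n = recv (sock, out, 1, 0);
        break;
    }
    /* Restore the caller's signal mask without clobbering errno. */
    int saved = errno;
    sigprocmask (SIG_SETMASK, &orig, NULL);
    errno = saved;
    return n;
}
```

A signal that arrives while SIGINT is blocked stays pending and is delivered as soon as pselect installs the wait mask, so the flag can't be missed.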
Martin Sústrik, November 5th, 2012
Shouldn't "stop" be sig_atomic_t?
Yep. I've tried to keep the examples simple. Cache coherence related questions are kind of out of scope of the article.
There's also pselect/ppoll/epoll_pwait, which can be used together with non-blocking I/O to catch signals without race conditions.
How exactly would you use it to deal with the Ctrl+C problem described in the article?
The magic here is that signals are blocked except during the pselect(), and pselect() atomically unblocks them when it's called and blocks them when it returns. As a result, the signal handler can only execute during the pselect(), so it is safe to only check the flag after pselect() returns.
From linux.die.net/man/2/pselect
The reason that pselect() is needed is that if one wants to wait for either a signal or for a file descriptor to become ready, then an atomic test is needed to prevent race conditions. (Suppose the signal handler sets a global flag and returns. Then a test of this global flag followed by a call of select() could hang indefinitely if the signal arrived just after the test but just before the call. By contrast, pselect() allows one to first block signals, handle the signals that have come in, then call pselect() with the desired sigmask, avoiding the race.)
As far as I understand, then the Ctrl+C would not work when we don't happen to be waiting in a blocking function. That seems even worse than doing it the other way round.
The only blocking function here is pselect(). If you get Ctrl+C during a pselect(), it will immediately return (because you have unblocked signals with the sigmask argument to pselect). On the other hand, if you get Ctrl+C while you're not in pselect(), the signal handler will not run until you call pselect() (because signals are blocked), and when you finally call pselect(), it will run and pselect() will return.
I think you're misinterpreting the meaning of "blocked". If signals are blocked, and a signal happens, it is queued, not discarded. When you unblock it, the handler will execute.
Yes. In the tight event loop it would work OK.
However, consider this example (pretty common ZeroMQ use case): User calls zmq_recv (a blocking function) to get some work. Then he processes the work. With the pselect solution, the program would be interruptible while waiting for a message (say 0.1 sec) but not interruptible while processing the work (say 1 hour).
I don't see a need to keep signals blocked outside zmq_recv(). When zmq_recv() is called:
1. block signals
2. check if signal is pending (this is atomic because signals are blocked)
3. pselect()
5. recv()
6. if we need more data, goto 2 (yes, 2, so we don't miss a signal)
7. else unblock signals and return data to user
It can never happen that a signal is received and we wait indefinitely in pselect().
By "check if signal is pending" I mean check the flag, not some posix functions to get the pending signals mask…
Nice! I wasn't aware of the trick. I'll mention it in the article.
Still, it doesn't work for libraries. Unless, of course, ZeroMQ would expose zmq_precv (…, const sigset_t *sigmask) in addition to standard zmq_recv.
Huh?? How is the 3rd listing supposed to behave differently from the 2nd one??
You've just added a check on the return value & errno, but if the recv call is exiting with EINTR, you should have exactly the same behaviour in both listings regardless of whether you check its return value or not.
Besides that, this doesn't seem to work anyway. Both on a Linux (Debian) and a Darwin (OSX) system, recv doesn't seem to exit with EINTR once a signal handler has been installed, so I still get the same "stuck-until-some-data-arrives" condition.
Am I missing something?
The assumption for the two first listings is "let's pretend there's no EINTR and recv just continues waiting for data whatever happens".
The first two listings are trying to explain why the EINTR error exists at all and explore what would happen if it didn't exist. EINTR is considered only from the 3rd listing on.
Ok, I think I found the problem: it seems that if you install a signal handler without setting the SA_RESTART flag to 0 (which you can only do if you use sigaction to install the handler) then at the exit of the signal handler, any interruptible syscall (such as recv) is restarted automatically instead of exiting with EINTR.
So, I think the example should be expanded showing how to install the SIGINT handler by using sigaction. In any case, I insist that listing2 and listing3 have no different behavior.
How would SA_RESTART help? The problem here is that you have to somehow force the blocking function to exit when signal happens. Setting SA_RESTART does the exact opposite.
Ah, sorry. I've misunderstood you. Yes, you are right. SA_RESTART has to be set to 0 to get the EINTR behaviour. I'll make a note in the article.
Another solution is to use the so-called "self-pipe trick" where you have the signal handler write a byte to a pipe. You use non-blocking sockets and in the select() you always monitor your sockets and the read end of the pipe. There is no race condition: when your signal happens, you will very soon be able to read from this pipe (the next time you call select, or immediately if select is being called). However, be sure to set both ends of the pipe to non-blocking. Otherwise the signal handler could block if the pipe buffer fills up and the program would deadlock.
But this is of course only applicable to your Python problem if there is a way to change what the signal handler does. Or if you're very evil you can try to override the signal handler temporarily.
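A minimal sketch of the self-pipe trick, assuming you control the signal handler. The names wake_pipe, self_pipe_init and wait_sock_or_signal are made up for this illustration:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <unistd.h>

static int wake_pipe [2] = { -1, -1 };

static void handler (int signum)
{
    (void) signum;
    int saved = errno;
    char b = 0;
    /* Non-blocking write: if the pipe is full the byte is dropped,
       which is fine because a wake-up is already pending. */
    ssize_t rc = write (wake_pipe [1], &b, 1);
    (void) rc;
    errno = saved;
}

/* Creates the pipe, makes both ends non-blocking, installs the handler.
   Returns 0 on success, -1 on error. */
int self_pipe_init (void)
{
    if (pipe (wake_pipe) == -1)
        return -1;
    for (int i = 0; i < 2; i++) {
        int flags = fcntl (wake_pipe [i], F_GETFL);
        if (flags == -1 ||
              fcntl (wake_pipe [i], F_SETFL, flags | O_NONBLOCK) == -1)
            return -1;
    }
    struct sigaction sa;
    memset (&sa, 0, sizeof sa);
    sa.sa_handler = handler;
    sigemptyset (&sa.sa_mask);
    return sigaction (SIGINT, &sa, NULL);
}

/* Returns 1 if sock is readable, 0 if a signal arrived, -1 on error. */
int wait_sock_or_signal (int sock)
{
    for (;;) {
        fd_set rfds;
        FD_ZERO (&rfds);
        FD_SET (sock, &rfds);
        FD_SET (wake_pipe [0], &rfds);
        int maxfd = sock > wake_pipe [0] ? sock : wake_pipe [0];
        if (select (maxfd + 1, &rfds, NULL, NULL, NULL) == -1) {
            if (errno == EINTR)
                continue;   /* the pipe will be readable on the retry */
            return -1;
        }
        if (FD_ISSET (wake_pipe [0], &rfds)) {
            char buf [16];
            while (read (wake_pipe [0], buf, sizeof buf) > 0)
                ;           /* drain the pending wake-up bytes */
            return 0;
        }
        return 1;
    }
}
```

Because the signal leaves a byte in the pipe, it doesn't matter whether it arrives before or during select.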
Yes, that would work if you are in control of the handler (send is in the POSIX list of signal-safe functions). Unfortunately, that's not the case for libraries.
write(2) a single byte to a pipe file descriptor from the signal handler and use whatever async I/O tech in your event loop (select(2), poll(2), libev, libevent, ….).
Or give up on portability and use some API that solves this problem, like, say, Solaris event ports (just post an event from the handler, or put the port into alert mode from the handler).
As for condition variables… if only pthread_mutex_lock() were async-signal-safe… then you could acquire the mutex, signal/broadcast on the cv, and drop the lock. But it's not, so you can't. What you can do is resort to having a thread waiting for events on a pipe written to by the signal handler (see above!) then have that thread signal/broadcast on your condition variable.
See? The trick is to transmogrify async signals into events that you can wait for with traditional async I/O APIs.
FYI, this technique goes back a long time. It's used in OpenSSH, for example, and has been for many years now.
Except that messing with signal handlers is not an option when you are implementing a library. Signal handlers are set by the main module and should not be changed randomly in the background by the libraries.
On Windows this is done with SetConsoleCtrlHandler() call.
Receive is the easy case.
What happens if you have a protocol where you call "write"?
Is there any guarantee that you cannot get EINTR if the packet has been transmitted?
If you retry sending the packet, then it is sent TWICE, which may confuse the other end.
True. As far as I know there is no guarantee that the operation hasn't [partially] succeeded when EINTR is returned.
If a write/send has already sent some data, then the function will return early with a short count (the same is usually true for read/recv too). I'm not sure if EINTR will be set in errno in such cases, but the caller is unlikely to check it anyway.
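One way to cope with short counts is to keep track of how much has already gone out, so that a retry after EINTR never sends a byte twice. A minimal sketch, with a made-up function name (send_resumable):

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: writes buf [*sent .. len) to fd and advances
   *sent by whatever was actually written. Returns 0 once everything
   is written, -1 on error (EINTR included). After -1 with EINTR the
   caller can check its flags and call again with the same *sent. */
int send_resumable (int fd, const char *buf, size_t len, size_t *sent)
{
    while (*sent < len) {
        ssize_t n = write (fd, buf + *sent, len - *sent);
        if (n == -1)
            return -1;      /* *sent records what already went out */
        *sent += (size_t) n;
    }
    return 0;
}
```

The caller initialises *sent to 0 before the first call and leaves it alone between retries.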
Alternatively, you can use fcntl (filedes, F_SETFL, new_flags) with the O_NONBLOCK flag. See: https://www.aquaphoenix.com/ref/gnu_c_library/libc_141.html
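A short sketch of that fcntl call; the helper name set_nonblocking is just for illustration:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: switches fd to non-blocking mode while keeping
   its other status flags. Returns 0 on success, -1 on error. */
int set_nonblocking (int fd)
{
    int flags = fcntl (fd, F_GETFL);
    if (flags == -1)
        return -1;
    return fcntl (fd, F_SETFL, flags | O_NONBLOCK);
}
```

After this, a read or recv with no data available fails with EAGAIN (or EWOULDBLOCK) instead of blocking, so the loop gets a chance to check its flags.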
Windows has some support for this, at least for the Winsock interface. There is an error code WSAEINTR 0x2714 "A blocking operation was interrupted by a call to WSACancelBlockingCall."
And, as you can guess, a Winsock API function WSACancelBlockingCall is used to indicate the "interrupt". closesocket (Winsock close for sockets) seems to use this function internally.