Running a shell script with a timeout

In my line of work, deadlock is always close at hand. Running multithreaded programs through prototype compilers, runtime systems, or simulators (and sometimes all three!), there is always the chance that a bug will trigger a deadlock in some layer of the system. There’s nothing like setting off a bunch of experiments, going to sleep, and waking up the next morning to find that the first experiment (or, inevitably, the experiment that ran right after I stopped checking on things) deadlocked and prevented everything else from running. After being burned by this particular variant of Murphy’s Law countless times, I learned to always run benchmarks with a hard timeout. Knowing that my experiments won’t hang forever is honestly right up there with fancy foam mattresses in terms of helping me sleep at night.

I looked around and couldn’t find any standard Unix commands or shell trickery that did what I wanted, so I wrote a little Python script called timeout.py (available via github) some years ago. Unbeknownst to me, in late 2008 GNU coreutils added a timeout program (documentation here) that is now available on many Unix systems. The basic idea behind timeout.py and GNU’s timeout is the same: run a command with a specified timeout, and 1) exit when the command exits normally or 2) send a signal to the command when the timeout expires.

timeout.py works by setting a SIGALRM signal to be delivered at the desired timeout period, and then forking a new child process to run the specified command. The SIGALRM handler then sends a specified signal (SIGKILL by default) to the child process.

One useful feature that I added to timeout.py (that GNU timeout doesn’t have) is the ability to place the child process in its own process group; when the timeout expires a signal is sent to the entire process group. This is helpful for cleaning up multiprocess programs. PARSEC‘s script for launching benchmarks (parsecmgmt), in particular, forks a bunch of processes to run a benchmark, and killing the top-level parsecmgmt process will not kill the actual benchmark process which lives a few layers deeper.

timeout.py is written in Python, so it’s easy to modify. Its functionality is also exported via a Python module called TimerTask that wraps the functionality of Python’s standard subprocess.Popen() call. One caveat with using the TimerTask module is that, since it uses SIGALRM, it may interfere with other uses of SIGALRM in client code.  In particular, you can’t have multiple simultaneous TimerTasks, each with their own timeouts. TimerTask tries to be safe and raises an exception if there’s an existing SIGALRM handler, but a better approach would be to fork a new child process to run each TimerTask. Maybe I’ll add this someday.

timeout.py is Unix-only due to the use of, e.g., the SIGALRM signal and the preexec_fn argument to subprocess.Popen(). The source code is available under a GPLv3 license at https://gist.github.com/1526975.

Advertisements

One thought on “Running a shell script with a timeout

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s