Still Trying to Get Control-C Right?

  • I thought it funny that one of the selling points for the Trio design is “Control-C works”! That is, if you run a Python program and try to kill it at the command line by typing Control-C, it will actually terminate the program cleanly. I found that especially amusing because when Sun released NFS in 1984, the initial beta implementation did not handle Ctrl-C (SIGINT) “correctly”. To make it behave as expected, transparently matching local file IO, they needed to ensure that any IO operation on the network file system returned EINTR immediately, so that the interrupt could then be handled by the surrounding context to eventually kill the program. So 33 years later, getting that cancellation behavior right is still a selling point for an async library. Cancellation is pretty hard.
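To make the EINTR point concrete, here is a minimal sketch (in plain Python, with a simulated syscall rather than a real one — the function names are hypothetical) of the classic retry-on-EINTR loop that libraries often write. By silently retrying, it prevents the interrupt from ever surfacing to the surrounding context, which is exactly the behavior that breaks Control-C:

```python
import errno

def careless_read(syscall):
    """The classic 'swallow EINTR' retry loop. It hides the interrupt
    from the caller -- the surrounding context never gets a chance to
    turn the SIGINT into program termination."""
    while True:
        try:
            return syscall()
        except OSError as e:
            if e.errno != errno.EINTR:
                raise
            # silently retrying here means the interrupt never propagates

# Simulated syscall: interrupted twice, then succeeds.
calls = []
def fake_syscall():
    calls.append(1)
    if len(calls) < 3:
        raise OSError(errno.EINTR, "interrupted system call")
    return b"data"
```

The caller sees only the successful result; from its point of view the two interrupts never happened, which is why cancellation has to be designed in at every layer rather than patched on at the top.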
  • It was especially interesting that one of the challenges for the various languages and frameworks that have tried to define consistent mechanisms is that many of the most basic system primitives do not provide the appropriate control to implement a rational higher-level model. This was a huge problem for Windows, since the most basic file handle and networking operations (like CreateFile) did not provide the right level of control to support flexible cancellation models. This is true for other languages and systems as well. The lack of appropriate control at the lowest levels tends to propagate up, so other libraries and middleware also ignore these issues. All this makes it tremendously hard for an app developer trying to take ultimate responsibility for their program’s behavior to get it all right. The consequences of this play out in your daily experience with misbehaving applications.
  • Smith’s discussion on composability was important — composable behavior is hugely important for these design issues — but I think he misses grounding the design in how failures happen in the real world and the consequences of that for his design. In particular, his discussion on the relative behavior of interval-based timeouts vs. absolute deadlines missed a key point. Most user-facing applications don’t want strict deadlines; they want to continue (and quite possibly provide feedback to the user) as long as progress is being made. The distinction between “I’m making (slow) progress” and “I have no connectivity at all” makes all the difference. This means that even if you have a tree of timeout-limited calls, it might very well make sense to apply the same timeout to each individual call rather than slice the timeout between the calls the way a strict deadline effectively would. This behavior is actually pretty hard to implement using the mechanism Trio provides, because the desire to support both composability and information hiding (so the internal implementation details are hidden from the caller) makes it difficult to reset the cancellation condition at the appropriate points.
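The progress-vs-deadline distinction can be sketched with simulated chunk arrival times rather than real IO (the function and its parameters are hypothetical, purely for illustration): an idle timeout that resets on every chunk tolerates slow-but-steady progress, while an absolute deadline aborts it.

```python
def transfer_ok(chunk_gaps, idle_timeout, total_deadline=None):
    """Simulate a transfer where chunk_gaps[i] is the seconds of waiting
    before chunk i arrives. A per-chunk idle_timeout fails only when no
    progress is made; an optional total_deadline fails on elapsed time
    regardless of progress."""
    elapsed = 0.0
    for gap in chunk_gaps:
        if gap > idle_timeout:
            return False  # no progress within the idle window: give up
        elapsed += gap
        if total_deadline is not None and elapsed > total_deadline:
            return False  # an absolute deadline would abort here
    return True

# Six chunks, one second apart: steady progress.
steady = [1.0] * 6
```

With a 2-second idle timeout the steady transfer succeeds even though it takes 6 seconds in total; adding a 5-second absolute deadline kills the same transfer mid-progress, which is rarely what a user-facing application wants.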
  • I’ll also note that if you’re using this in the context of a service / data center environment, your model for failure — and therefore how you want to use cancellation and timeouts — will be very different. The behavior you want if you have a flaky cell signal is very different than the behavior you want if you’re trying to contact a service sitting on the same rack in a data center.
  • He motivates using what is effectively a stack of global cancellation contexts, rather than an explicit context parameter, by pointing to the failure of so many previous systems with explicit contexts to propagate that context uniformly and consistently. I think this is a poor motivation. Those consistency failures are a direct result of the fact that it was, in general, impossible for these middleware layers to do the right thing anyway, because the underlying system primitives they would end up calling didn’t provide the appropriate control. So that failure is not really a fair test. I am a much bigger fan of both the control and the very clear semantics that an explicit parameter system provides. It is too easy for code that uses an implicit global context to simply fail to consider the design issues.
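A minimal sketch of the explicit-parameter style, assuming a hypothetical CancelToken class (Go’s context.Context works along similar lines): because the token appears in every signature, no layer can silently ignore that cancellation exists.

```python
class CancelToken:
    """Hypothetical explicit cancellation token, threaded through every
    call as a parameter rather than held in an implicit global stack."""
    def __init__(self):
        self._cancelled = False

    def cancel(self):
        self._cancelled = True

    def check(self):
        # Raise at an explicit, visible checkpoint chosen by the callee.
        if self._cancelled:
            raise RuntimeError("operation cancelled")

def double_items(token, items):
    out = []
    for item in items:
        token.check()  # the cancellation contract is part of the signature
        out.append(item * 2)
    return out
```

The cost is exactly the propagation burden Smith points to — every layer must pass the token along — but the benefit is that the cancellation contract is visible in every signature rather than being an implicit property of the dynamic call stack.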
  • For effectively the same reason, I did find the Trio “nursery” concept interesting. Most systems that define concurrency mechanisms essentially treat any new thread of execution as occurring at global scope. The Trio nursery requires a task to have an explicit scope. Of course the mechanism could be misused by creating all tasks within a single global nursery, but overall it has a number of useful consequences that he describes in his analysis.
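The core idea can be sketched outside Trio entirely; this hypothetical thread-based “nursery” (not Trio’s actual API, just the same scoping rule applied to ordinary threads) enforces that no task outlives the scope that created it:

```python
import threading
from contextlib import contextmanager

@contextmanager
def nursery():
    """Minimal sketch of the nursery idea using threads: every task
    started inside the with-block is joined before the block exits,
    so no thread of execution escapes its enclosing scope."""
    threads = []

    def start(fn, *args):
        t = threading.Thread(target=fn, args=args)
        t.start()
        threads.append(t)

    try:
        yield start
    finally:
        for t in threads:
            t.join()  # the scope does not exit until all tasks finish

# Usage: all appends are guaranteed complete once the block exits.
results = []
with nursery() as start:
    start(results.append, 1)
    start(results.append, 2)
```

Because the scope cannot exit with tasks still running, error handling and resource lifetimes compose in the way the rest of the language already expects from ordinary block structure.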
  • I also enjoyed the link to the discussion of edge- vs. level-triggered designs. I discussed this in “Defense in Depth or Hack?” in the context of event- vs. state-based APIs, which is the same issue in another guise.
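The distinction can be shown with a toy signal line (a made-up class, just for illustration): a level-triggered consumer re-reads current state whenever it likes, while an edge-triggered consumer sees only transitions — and misses nothing only if it never drops a notification.

```python
class Line:
    """Toy signal line contrasting the two observation styles."""
    def __init__(self):
        self.high = False   # level-triggered: current state, re-readable
        self.edges = []     # edge-triggered: one record per transition

    def set(self, high):
        if high != self.high:
            self.edges.append("rise" if high else "fall")
        self.high = high

line = Line()
line.set(True)   # transition: records "rise"
line.set(True)   # no transition: edge consumers see nothing new
line.set(False)  # transition: records "fall"
```

The event-based API hands you only the edges; the state-based API lets you ask “what is it now?” at any time, which is why a missed event is recoverable in one design and fatal in the other.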




Programmer, Ex-Microsoft Technical Fellow, Sometime Tech Blogger, Passionate Ultimate Frisbee Player

Terry Crowley
