Spooky Exit At A Distance
I am personally opposed to async, futures, promises; whatever you call it.
It is almost never appropriate for application or library development,
yet widely proposed as a good solution to problems. It also has an
almost amusingly terrible history of integration and transition into
ecosystems. I plan to explain my complaints properly in a future post.
But, we still use it. Let's look at a specific example, in node,
which I call "Spooky Exit At A Distance".
Here, we have possibly the simplest async node application,
with the "logging prelude" we're going to be using:
async function main() {
  return 5;
}
main()
  .then((r) => console.log('returned:', r))
  .catch((e) => console.error('erroh!', e))
  .finally(() => console.log('application complete!'));
This prints the return value (5), and the "application complete!" message.
(This "prelude" is here because you can't use await at the top level of
a (CommonJS) script in node,
which is mighty inconvenient here, but I'm sure they have their reasons.)
Let's add some "real" work to our example:
async function main() {
  const made = await new Promise((resolve, reject) => {
    // ... do some work ...
    resolve(2);
  });
  return made + 3;
}
This prints the same thing as the previous example, in a less direct way.
await causes us to hand off control from main to the Promise, and,
when resolve is called, we "unblock" and resume running main.
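To make that hand-off visible, here's a minimal sketch that records the order things happen in. The order array and its labels are mine, added purely for illustration; the rest is the example above.

```javascript
// Illustrative only: `order` is a hypothetical log of what ran when.
const order = [];

async function main() {
  const made = await new Promise((resolve, reject) => {
    // The executor runs synchronously, inside the Promise constructor.
    order.push('executor');
    resolve(2);
  });
  // We only get back here later, once the awaited promise has resolved.
  order.push('resumed');
  return made + 3;
}

const result = main();
// main() has already returned a pending promise by the time we get here.
order.push('after main() call');

result.then((r) => {
  order.push('returned: ' + r);
  // order: executor, after main() call, resumed, returned: 5
  console.log(order);
});
```

The point is that main is suspended at the await, and the only thing that brings it back is somebody calling resolve (or reject).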
But.. what happens if there's a bug in the "do some work" part, and we
don't call resolve?
async function main() {
  const made = await new Promise((resolve, reject) => {
    // (there's like four different bugs here)
    switch (Math.random(2)) {
      case 0:
        resolve(2);
        break;
      case 1:
        resolve(3);
        break;
    }
  });
  return made + 3;
}
% node a.js
%
...the app just vanishes. Our then(), catch(), and finally() are
not run. The rest of main isn't run either. The exit status is SUCCESS.
As far as node is concerned, there is no code to run, and no IO is
outstanding, so it's done. Bye!
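One way to at least notice this, sketched under assumptions: node emits an 'exit' event on process once the event loop has run dry, so a flag plus an 'exit' listener can tell whether main ever settled. runGuarded is a hypothetical helper name, not something node provides, and picking exit status 1 is my assumption about what you'd want.

```javascript
// Hypothetical helper (not part of node): run an async entry point and
// complain at exit if its promise never settled.
function runGuarded(main) {
  let settled = false;
  const done = main().finally(() => {
    settled = true;
  });

  // A pending promise alone does not keep the process alive, so 'exit'
  // is the last place we can notice that main() was abandoned.
  process.once('exit', () => {
    if (!settled) {
      console.error('main() never settled; exiting with failure instead');
      process.exitCode = 1; // assumption: a non-zero status is wanted
    }
  });

  return done;
}

async function main() {
  return 5;
}

runGuarded(main)
  .then((r) => console.log('returned:', r))
  .catch((e) => console.error('erroh!', e));
```

Swap in the buggy main from above and, instead of a silent SUCCESS, you should get the complaint and a failing exit status. It doesn't tell you where the promise got lost, only that it did.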
Note that this can happen anywhere in your entire application: deep within some library, on handling input, or only under certain load conditions.
Nobody would write code like that, you'd think. Unfortunately, much of the ecosystem forces you to write code like this; it's pretty much the only remaining reason to write explicit promises. For example, dealing with subprocesses:
await new Promise((resolve, reject) => {
  child.once('exit', () => resolve());
  child.once('error', () => reject());
});
What happens if neither of these events fires? Your app is gone.
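A defensive option, as a sketch rather than a recommendation: race the hand-rolled promise against a timer, so that "neither event fired" becomes a rejection you can actually see. withTimeout is a hypothetical helper, the 50ms limit is arbitrary, and the EventEmitter here is only a stand-in for a real child process.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: reject if `promise` has not settled within `ms`.
function withTimeout(promise, ms, label) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} did not settle within ${ms}ms`)),
      ms
    );
  });
  // Whichever settles first wins. Always clear the timer afterwards so it
  // cannot keep the process alive after a normal completion.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Stand-in for a child process whose events never fire.
const child = new EventEmitter();
const exited = new Promise((resolve, reject) => {
  child.once('exit', () => resolve());
  child.once('error', () => reject());
});

withTimeout(exited, 50, 'child exit')
  .then(() => console.log('child exited'))
  .catch((e) => console.error('erroh!', e.message));
```

As a side effect, the pending timer is something node considers outstanding, so the process stays around long enough for the rejection to reach catch().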
I hit this all the time. unzipper occasionally took down a service at work,
probably through a similar IO issue.
I hit the subprocess issue using the library in the simplest way I can imagine, reading the output of a command, then waiting for it to exit. Popular wrapper libraries have pretty much the same code.
The solution?
After consulting with a serious expert, we decided
that the events are probably missed (sometimes, under load) if the
listeners are not registered when the event happens. You might expect this; I didn't.
You can resolve this by moving the promise creation above other code,
and awaiting it later.
This relies on the (surprising to me!) execution order of Promise
constructor arguments.
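Here's roughly what that reordering looks like, as a sketch. The function passed to new Promise runs immediately, inside the constructor, so the listeners are attached before any other work gets a chance to yield. fakeChild, sleep and registerEarly are hypothetical names, and the EventEmitter is a stand-in for a real child process that exits quickly.

```javascript
const { EventEmitter } = require('events');

// Stand-in for a child process: it emits 'exit' almost immediately,
// well before the "other work" below has finished.
function fakeChild() {
  const child = new EventEmitter();
  setImmediate(() => child.emit('exit', 0));
  return child;
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function registerEarly() {
  const child = fakeChild();

  // Create the promise first: the executor runs right here, so the
  // listeners exist before we yield to anything else.
  const exited = new Promise((resolve, reject) => {
    child.once('exit', (code) => resolve(code));
    child.once('error', (err) => reject(err));
  });

  await sleep(20); // other work, e.g. reading the command's output

  // The event has already fired by now, but the promise remembers it.
  return exited;
}

registerEarly().then((code) => console.log('exit code:', code));
```

If the new Promise(...) were created after the sleep instead, 'exit' would already have been emitted with nobody listening, and the promise would never settle.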
You can also have great fun looking at execution order in your test case.
A row (in this picture, normally a column) is a job, which works from
1e (enter), to 8a (awaited).
This recording shows all of the workers completing the read in a row (6c),
then interleaving of the function completing (7x, 8a), with new workers
starting (1e, etc.). Note how some of the jobs 7x (exit) before they
6c (complete reading), which is probably our bug.