2017-03-21

find-deleted: checkrestart replacement

checkrestart, part of debian-goodies, checks what you might need to restart. You can run it after a system update, and it will find running processes using outdated libraries. If these can be restarted, and there are no new kernel updates, then you can save yourself a reboot.

However, checkrestart is pretty dated, and has some weird behaviour. It frequently reports that there are things needing a restart, but that it doesn't feel like telling you what they are. (This makes me moderately angry.) It's pretty bad at locating services on a systemd-managed system. It tries to look through Debian packages, making it Debian-specific (as well as unreliable). This is especially odd, because systemd knows which pid belongs to which unit, as does /proc, and...

Instead of fixing it, I have rewritten it from scratch.

find-deleted is a tool to find deleted files which are still in use, and to suggest systemd units to restart.

The default is to try and be helpful:

% find-deleted
 * blip
   - sudo systemctl restart mysql.service nginx.service
 * drop
   - sudo systemctl restart bitlbee.service
 * safe
   - sudo systemctl restart fail2ban.service systemd-timesyncd.service tlsdate.service
 * scary
   - sudo systemctl restart dbus.service lxc-net.service lxcfs.service polkitd.service
Some processes are running outside of units, and need restarting:
 * /bin/zsh5
  - [1000] faux: 7161 17338 14539
 * /lib/systemd/systemd
  - [1000] faux: 2082
  - [1003] vrai: 8551 8556

Here, it is telling us that a number of services need a restart. The services are categorised based on some patterns defined in the associated configuration file, deleted.yml.

For this machine, I have decided that restarting mysql and nginx will cause a blip in the service to users; I might do it at an off-peak time, or ensure that there's other replicas of the service available to pick up the load.

My other categories are:

  • drop: A loss of service will happen that will be annoying for users.
  • safe: These services could be restarted all day, every day, and nobody would notice.
  • scary: Restarting these may log you out, or cause the machine to stop functioning.
  • other: Things which don't currently have a classification.
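
The classification patterns live in deleted.yml. Its real format isn't shown in this post, so here's a purely hypothetical sketch of what such a file could look like, using the categories above (the layout and keys are invented for illustration):

```yaml
# hypothetical deleted.yml; unit name patterns, grouped by restart category
blip:
  - mysql.service
  - nginx.service
drop:
  - bitlbee.service
safe:
  - fail2ban.service
  - systemd-timesyncd.service
scary:
  - dbus.service
  - lxc.*
# anything unmatched is reported under "other"
```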

If you're happy with its suggestions, you can copy-paste the above commands, or you can run it in a more automated fashion:

systemctl restart $(find-deleted --show-type safe)

This can effectively be run through provisioning tools, on a whole selection of machines, if you trust your matching rules! I have done this with a much more primitive version of this tool at a previous employer.

It can also print the full state that it's working from, using --show-paths.
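
The detection itself doesn't need systemd or dpkg at all: when a mapped file has been unlinked, the kernel appends "(deleted)" to its path in /proc/&lt;pid&gt;/maps. A minimal, standalone sketch of that part (this is not find-deleted's actual code; the function is invented for illustration):

```python
import re

# A /proc/PID/maps line looks roughly like:
#   7f2c...-7f2c... r-xp 00000000 fd:01 123456  /usr/lib/libfoo.so.1 (deleted)
def deleted_files(maps_text):
    """Return the set of deleted file paths referenced by a /proc/PID/maps dump."""
    found = set()
    for line in maps_text.splitlines():
        # the path is everything from the first '/' up to the trailing marker
        m = re.search(r'(/\S.*) \(deleted\)$', line)
        if m:
            found.add(m.group(1))
    return found

sample = (
    "7f00-7f01 r-xp 00000000 fd:01 42  /lib/x86_64-linux-gnu/libssl.so.1.0.0 (deleted)\n"
    "7f02-7f03 r--p 00000000 fd:01 43  /lib/x86_64-linux-gnu/libc.so.6\n"
)
print(deleted_files(sample))  # {'/lib/x86_64-linux-gnu/libssl.so.1.0.0'}
```

Mapping each pid back to a unit is then a matter of reading /proc/&lt;pid&gt;/cgroup, where systemd records the owning unit's name.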


2017-02-21

Busy work: pasane and syscall extractor

Today, I wrote a load of Python and a load of C to work around pretty insane problems in Linux, my choice of OS for development.


pasane is a command-line volume control that doesn't mess up channel balance. This makes it unlike all of the other volume control tools I've tried that you can reasonably drive from the command line.

It's necessary to change the volume from the command line as I launch commands in response to hotkeys (e.g. to implement the volume+ button on my keyboard). It's also occasionally useful to change the volume over SSH. Maybe there's another option? Maybe something works via D-Bus? This seemed to make the most sense at the time, and isn't too awful.

Am I insane?


Next up: Some code to parse the list of syscalls from the Linux source tree.

It turns out that it's useful to have these numbers available in other languages: so you can pass them to tools, decode raw syscall numbers you've seen, or simply make the syscalls yourself.

Anyway, they are not available in the Linux source. What? Yes, for most architectures, this table is not available. It's there on i386, amd64, and arm (32-bit), but not for anything else. You have to.. uh.. build a kernel for one of those architectures, then compile C code to get the values. Um. What?

This is insane.

The Python code (linked above) does a reasonably good job of extracting them from this insanity, and generating the tables for a couple more arches.
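
For the architectures that do ship a table, the format is trivially parseable. A rough sketch in the same spirit (parse_tbl is mine, not from the linked extractor), against the "number abi name entrypoint" layout of x86's syscall_64.tbl:

```python
def parse_tbl(text):
    """Parse 'number abi name [entry point]' lines, syscall_64.tbl-style."""
    table = {}
    for line in text.splitlines():
        line = line.split('#', 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        number, abi, name = fields[0], fields[1], fields[2]
        if abi == 'x32':                       # skip the x32 ABI variants
            continue
        table[name] = int(number)
    return table

sample = """
# 64-bit system call numbers and entry vectors
0   common  read        sys_read
1   common  write       sys_write
512 x32     rt_sigaction    compat_sys_rt_sigaction
"""
print(parse_tbl(sample))  # {'read': 0, 'write': 1}
```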

I needed to do this so I can copy a file efficiently. I think. I've kind of lost track. Maybe I am insane.


2017-02-11

Playing with prctl and seccomp

I have been playing with low-level Linux security features, such as prctl no new privs and seccomp. These tools allow you to reduce the harm a process can do to your system.

They're typically deployed as part of systemd, although the default settings in many distros are not yet ideal. This is partly because it's hard to confirm what a service actually needs, and partly because many services support many more things than a typical user cares about.

For example, should a web server be able to make outgoing network connections? Probably not: it accepts network connections from people, maybe runs some code, then returns the response. However, maybe you're hosting some PHP that you want to be able to fetch data from the internet? Maybe you're running your web server as a proxy?

To address these questions, Debian/Ubuntu typically err on the side of "let it do whatever, so users aren't inconvenienced". CentOS/RHEL have started adding a large selection of flags you can toggle to tweak security (although through yet another mechanism, not the one we're talking about here...).


Anyway, let's assume you're paranoid, and want to increase the security of your services. The two features discussed here are exposed in systemd as NoNewPrivileges= and SystemCallFilter=.

The first, NoNewPrivileges=, prevents a process from getting privileges in any common way, e.g. by trying to change user, or trying to run a command which has privileges (e.g. capabilities) attached.

This is great. Even if someone knows your root password, they're still stuck:

% systemd-run --user -p NoNewPrivileges=yes --tty -- bash
$ su - root
Password:
su: Authentication failure

$ sudo ls
sudo: effective uid is not 0, is /usr/bin/sudo on a file system with the 'nosuid'
  option set or an NFS file system without root privileges?

The errors aren't great, as the tools have no idea what's going on, but at least it works!
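
Outside systemd, the same flag is a single prctl(2) call. A sketch via Python's ctypes (Linux-only; note the bit is irreversible for the calling process, so try it in a throwaway shell):

```python
import ctypes

PR_SET_NO_NEW_PRIVS = 38  # constants from <linux/prctl.h>
PR_GET_NO_NEW_PRIVS = 39

def set_no_new_privs():
    """Set the no_new_privs bit for this process and all its descendants."""
    libc = ctypes.CDLL("libc.so.6", use_errno=True)
    if libc.prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0:
        raise OSError(ctypes.get_errno(), "prctl(PR_SET_NO_NEW_PRIVS) failed")
    return libc.prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0)  # reads back as 1

# After this, setuid/setgid binaries (su, sudo, postdrop, ...) no longer elevate,
# exactly as in the transcript above.
```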

This seems like a big, easy win; I don't need my PHP application to become a different user... or do I? It turns out that the venerable mail command, on a Postfix system, eventually tries to run a setuid binary, which fails. And PHP defaults to sending mail via this route. Damn!

Let's try it out:

% systemd-run --user -p NoNewPrivileges=yes --tty -- bash
faux@astoria:~$ echo hi | mail someone@example.com
faux@astoria:~$

postdrop: warning: mail_queue_enter: create file maildrop/647297.680: Permission denied

Yep, postdrop is setgid (the weird s in the permissions string):

% ls -al =postdrop
-r-xr-sr-x 1 root postdrop 14328 Jul 29  2016 /usr/sbin/postdrop

It turns out that Debian dropped support for alternative ways to deliver mail. So, we can't use that!


Earlier I implied that NoNewPrivileges=, despite the documentation, doesn't remove all ways to gain some privileges. One way to do this is to enter a new user namespace (unprivileged user namespaces being widely enabled only on Ubuntu, as of today). e.g. we can get CAP_NET_RAW (and its associated vulnerabilities) through user namespaces:

% systemd-run --user -p NoNewPrivileges=yes --tty -- \
    unshare --map-root-user --net -- \
    capsh --print \
        | fgrep Current: | egrep -o 'cap_net\w+'
cap_net_bind_service
cap_net_broadcast
cap_net_admin
cap_net_raw

To harden against this, I wrote drop-privs-harder, which simply uses seccomp to break the ability of unshare (and its friend clone) to make new user namespaces.
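
A sketch of the same idea (this is not drop-privs-harder's actual code): install a classic-BPF seccomp filter that makes unshare(2) fail with EPERM. A real filter should also validate seccomp_data.arch and cover clone(2)'s CLONE_NEWUSER path; both are omitted here for brevity:

```python
import ctypes, os, platform, struct

# classic-BPF opcodes and seccomp constants, from <linux/filter.h> / <linux/seccomp.h>
BPF_LD_W_ABS = 0x20
BPF_JEQ_K    = 0x15
BPF_RET_K    = 0x06
SECCOMP_RET_ALLOW = 0x7fff0000
SECCOMP_RET_ERRNO = 0x00050000
PR_SET_NO_NEW_PRIVS, PR_SET_SECCOMP, SECCOMP_MODE_FILTER = 38, 22, 2
EPERM = 1
CLONE_NEWUSER = 0x10000000

# unshare's syscall number differs per architecture
NR_UNSHARE = {"x86_64": 272, "aarch64": 97}[platform.machine()]

class SockFprog(ctypes.Structure):
    _fields_ = [("len", ctypes.c_ushort), ("filter", ctypes.c_void_p)]

def deny_unshare(libc):
    """Install a seccomp filter making unshare(2) fail with EPERM."""
    insn = lambda code, jt, jf, k: struct.pack("HBBI", code, jt, jf, k)
    prog = b"".join([
        insn(BPF_LD_W_ABS, 0, 0, 0),                       # A = seccomp_data.nr
        insn(BPF_JEQ_K, 0, 1, NR_UNSHARE),                 # nr == unshare? fall through : skip
        insn(BPF_RET_K, 0, 0, SECCOMP_RET_ERRNO | EPERM),  # deny with EPERM
        insn(BPF_RET_K, 0, 0, SECCOMP_RET_ALLOW),          # allow everything else
    ])
    buf = ctypes.create_string_buffer(prog, len(prog))
    fprog = SockFprog(4, ctypes.cast(buf, ctypes.c_void_p))
    # unprivileged seccomp requires no_new_privs to be set first
    assert libc.prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) == 0
    assert libc.prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, ctypes.byref(fprog), 0, 0) == 0

def unshare_is_blocked():
    """In a forked child, check that unshare(CLONE_NEWUSER) now fails with EPERM."""
    pid = os.fork()
    if pid == 0:
        libc = ctypes.CDLL("libc.so.6", use_errno=True)
        deny_unshare(libc)
        blocked = libc.unshare(CLONE_NEWUSER) == -1 and ctypes.get_errno() == EPERM
        os._exit(0 if blocked else 1)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 0
```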


Unlike NoNewPrivileges=, SystemCallFilter= takes many more arguments, and requires significantly more research to work out whether a process is going to work. Additionally, systemd-run doesn't support SystemCallFilter=. I'm not sure why.

To assist people playing around with this (on amd64 only!), I wrote a tool named seccomp-tool and a front-end named seccomp-filter.

There's a binary of seccomp-tool available for anyone who doesn't feel like compiling it. It depends only on libseccomp2 (sudo apt install libseccomp2), and needs to be on your path as seccomp-tool.

seccomp-filter supports the predefined system call sets from the systemd documentation, in addition to an extra set, named @critical, which systemd seems to include silently, without telling you. Both of these tools set NoNewPrivileges=, so you will also be testing that.

Let's have a play:

% seccomp-filter.py @critical -- ls /
ls: reading directory '/': Function not implemented

Here, we're trying to run ls with only the absolutely critical syscalls enabled. ls, after starting, tries to call getdents() ("list the directory"), and gets told that it's not supported. Returning ENOSYS ("function not implemented") is the default behaviour for seccomp-filter.py.

We can have a permissions error, instead, if we like:

% seccomp-filter.py --exit-code EPERM @critical -- ls /
ls: reading directory '/': Operation not permitted
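
Those two messages are nothing seccomp-specific; they're just the C library's rendering of the errno value the filter returns:

```python
import errno, os

# ENOSYS is seccomp-filter.py's default; EPERM is what --exit-code EPERM returns
print(os.strerror(errno.ENOSYS))  # Function not implemented
print(os.strerror(errno.EPERM))   # Operation not permitted
```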

If we give it getdents, it starts working... almost:

% ./seccomp-filter.py --exit-code EPERM @critical getdents -- ls /proc
1
10
1001
11
1112

Why does the output look like it's been piped through a pager? ls has tried to talk to the terminal, has been told it can't, and is okay with that. This looks the same as:

seccomp-filter.py --blacklist ioctl -- ls /

If we also allow ioctl, ls pretty much works as expected, ignoring the fact that it segfaults during shutdown. systemd's @default group of syscalls is useful to include to remove this behaviour.


Next, I looked at what Java required. It turns out to be much better than I expected: the JVM will start up, compile things, etc. with just: @critical @default @basic-io @file-system futex rt_sigaction clone.

This actually works as a filter, too: if Java code tries to make a network connection, it is denied. Or, er, at least, something in that area is denied. Unfortunately, the JVM cra.. er.. "hard exits" for many of these failures, as they come through as unexpected asserts:

e.g.

Assertion 'sigprocmask_many(SIG_BLOCK, &t, 14,26,13,17,20,29,1,10,12,27,23,28, -1) >= 0' failed at ../src/nss-myhostname/nss-myhostname.c:332, function _nss_myhostname_gethostbyname3_r(). Aborting.

It then prints out loads of uninitialised memory, as it doesn't expect uname to fail. e.g.

Memory: 4k page, physical 10916985944372480k(4595315k free), swap 8597700727024688k(18446131672566297518k free)

uname: [box][box]s[box]


This demonstrates only one of the operation modes for seccomp. Note that, as of today, the Wikipedia page is pretty out of date, and the manpage is outright misleading. Consider reading man:seccomp_rule_add(3), part of libseccomp2, to work out what's available.


Summary: Hardening good, hardening hard. Run your integration test suite under seccomp-filter.py --blacklist @obsolete @debug @reboot @swap @resources and see if you can at least get to that level of security?


2016-10-11

HTTP2 slowed my site down!

At work, we have a page which asynchronously fetches information for a dashboard. This involves making many small requests back to the proxy, which is exactly the kind of thing that's supposed to be faster under HTTP2.

However, when we enabled HTTP2, the page went from loading in around two seconds, to taking over twenty seconds. This is bad. For a long time, I thought there was a bug in nginx's HTTP2 code, or in Chrome (and Firefox, and Edge..). The page visibly loads in blocks, with exactly five second pauses between the blocks.

The nginx config is simply:

resolver 8.8.4.4;
location ~ /proxy/(.*)$ {
  proxy_pass https://$1/some/thing;
}

.. where 8.8.4.4 is Google Public DNS.


It turns out that the problem isn't with HTTP2 at all. What's happening is that nginx is processing the requests successfully, and generating DNS lookups. It's sending these on to Google, and the first few are getting processed; the rest are being dropped. I don't know if this is due to the network (it's UDP, after all), or because Google think it's attack traffic. The remainder of the requests are retried by nginx's custom DNS resolver after 5s, and another batch gets processed.

So, why is this happening under HTTP2? Under http/1.1, the browser can't deliver requests quickly enough to trigger this flood protection. HTTP2 has sped it up to the point that there's a problem. Woo? On localhost, a custom client can actually generate requests quickly enough, even over http/1.1.
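
A toy model of the failure (all numbers invented): a token-bucket limiter on the resolver side, queries arriving either spaced out (http/1.1-ish) or all at once (HTTP2-ish), and nginx's fixed 5-second retry:

```python
def time_to_resolve(n_queries, burst_allowed, refill_per_sec, spacing_s, retry_s=5.0):
    """Simulate resolving n_queries against a token-bucket rate limiter.

    Queries finding no token are dropped and retried retry_s later.
    Returns the time at which the last query finally resolves."""
    tokens, last, finished = float(burst_allowed), 0.0, 0.0
    pending = [i * spacing_s for i in range(n_queries)]  # initial send times
    while pending:
        now = pending.pop(0)
        tokens = min(burst_allowed, tokens + (now - last) * refill_per_sec)
        last = now
        if tokens >= 1:
            tokens -= 1
            finished = max(finished, now)
        else:
            pending.append(now + retry_s)   # dropped; nginx retries in 5s
            pending.sort()
    return finished

# http/1.1-ish: queries trickle in; the bucket never empties
print(time_to_resolve(40, burst_allowed=10, refill_per_sec=20, spacing_s=0.1))  # ~3.9
# HTTP2-ish: all 40 arrive at once; three 5-second retry waves follow
print(time_to_resolve(40, burst_allowed=10, refill_per_sec=20, spacing_s=0.0))  # 15.0
```

Same workload, same limiter: only the delivery speed changes, and the fast case takes far longer end-to-end.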

The nginx developers recommend not using their custom DNS resolver over the internet, and I can understand why; I've had trouble with it before. To test, I deployed dnsmasq between nginx and Google:

dnsmasq -p 1337 -d -S 8.8.4.4 --resolv-file=/dev/null

dnsmasq generates identical (as far as I can see) traffic, and is only slightly slower (52 packets in 11ms, vs. 9ms), but I am unable to catch it being rate limited. In production, on a much smaller machine than the one I'm testing on, dnsmasq is significantly slower (100+ms), so it makes sense that it wouldn't trigger rate limiting. dnsmasq also has --dns-forward-max= (default 150), so there's a nice way out there.


In summary: When deploying HTTP2, or any upgrades, be aware of rate limits in your, or other people's, systems, that you may now be able to trigger.


2016-02-01

Fosdem 2016

I was at Fosdem 2016. Braindump:

  • Poettering on DNSSEC: Quick overview of some things Team Systemd is working on, but primarily about adding DNSSEC to systemd-resolved.
    • DNSSEC is weird, scary, and doesn't really have any applications in the real world; it doesn't enable anything that wasn't already possible. Still important and interesting for defence-in-depth.
    • Breaks in interesting cases, e.g. home routers inventing *.home or fritz.box, the latter of which is a real public name now. Captive portal detection (assuming we can't just make those go away).
    • systemd-resolved is a separate process with a client library; DNS resolution is too complex to put in libc, even if you aren't interested in caching, etc.
  • Contract testing for JUnit: Some library sugar for instantiating multiple implementations of an interface, and running blocks of tests against them. Automatically running more tests when anything returns a class with any interface that's understood.
    • I felt like this could be applied more widely (or perhaps exclusively); if you're only testing the external interface, why not add more duck-like interfaces everywhere, and test only those? Speaker disagreed, perhaps because it never happens in practice...
    • Unfortunately, probably only useful if you are actually an SPI provider, or e.g. Commons Collections. This was what was presented, though, so not a big surprise.
    • The idea of testing mock/stub implementations came up. That's one case where, at least, I end up with many alternative implementations of an interface that are supposed to test the same. Also discussed whether XFAIL was a good thing; Ruby people suggested WIP (i.e. must fail or test suite fails) works better.
  • Frida for closed-source interop testing: This was an incredibly ambitious project, talk and demo. ("We're going to try some live demos. Things are likely to catch fire. But, don't worry, we're professionals.").
    • Injects the V8 JS VM into arbitrary processes, and interacts with it from a normal JS test tool; surprisingly nice API, albeit using some crazy new JS syntaxes. Roughly: file.js: {foo: function() { return JVM.System.currentTimeMillis(); } } and var api = injectify(pid, 'file.js'); console.log(api.foo());.
    • Great bindings for all kinds of language runtimes and UI toolkits, to enable messing with them, and all methods are overridable in-place, i.e. you can totally replace the read syscall, or the iOS get me a url method. Lots of work on inline hooking and portable and safe call target rewriting.
    • Live demo of messing with Spotify's ability to fetch URLs... on an iPhone... over the network. "System dialog popped up? Oh, we'll just inject Frida into the System UI too...".
  • USBGuard: People have turned off the maddest types of USB attack (e.g. autorun), but there's still lots of ways to get a computer to do something bad with an unexpected (modified) USB stick; generate keypresses and mouse movements, even a new network adaptor that may be chosen to send traffic to.
    • Ask the user if they expect to see a new USB device at all, and whether they think it should have these device classes (e.g. "can be a keyboard"). Can only reject or accept as a whole; kernel limitation. UX WIP.
    • Potential for read-only devices; filter any data being exfiltrated at all from a USB stick, but still allow reading from them? Early boot, maybe? Weird trust model. Not convinced you could mount a typical FS with USB-protocol level filtering of writes.
    • Mentioned device signing; no current way to do identity for devices, so everything is doomed if you have any keyboards. Also mentioned CVEs in USB drivers, including cdc-wdm, which I reviewed during the talk and oh god oh no goto abuse.
  • C safety, and whole-system ASAN: Hanno decided he didn't want his computer to work at all, so has been trying to make it start with ASAN in enforce mode on for everything. Complex due to ASAN loading order/dependencies, and the fact that gcc and libc have to be excluded because they're the things being overridden.
    • Everything breaks (no real surprise there). bash, coreutils, man, perl, syslog, screen, nano. Like with the Fuzzing Project, people aren't really interested in fixing bugs they consider theoretical, even if they're really real under angrier C compilers, or will be in the future. Custom allocators have to be disabled, which is widely but not totally supported.
    • Are there times where you might want ASAN on in production? Would it increase security (trading off outages), or would it add more vulnerabilities, due to huge attack surface? ~2x slowdown at the moment, which is probably irrelevant.
    • Claimed ASLR is off by default in typical Linux distros; I believe Debian's hardening-wrapper enables this, but Lintian reports poor coverage, so maybe a reasonable claim.
  • SSL management: Even in 2014, when Heartbleed happened, Facebook had not really got control of what was running in their infrastructure. Public terminators were fine, but everything uses SSL. Even CAs couldn't cope with reissue rate, even if you could find your certs. Started IDSing themselves to try and find missed SSL services.
    • Generally, interesting discussion of why technical debt accumulates, especially in infra. Mergers and legacy make keeping track of everything hard. No planning for things that seem like they'll never happen. No real definition of ownership; alerts pointed to now-empty mailing lists, or to people who have left (this sounds very familiar). That service you can't turn off but nobody knows why.
    • Some cool options. Lots of ways to coordinate and monitor SSL (now), e.g. Lemur. EC certs are a real thing you can use on the public internet (instead of just me on my private internet), although I bet it needs Facebook's cert switching middleware. HPKP can do report-only.
    • Common SSL fails that aren't publicly diagnosed right now: Ticket encryption with long-life keys (bad), lack of OCSP stapling and software to support that.
  • Flight flow control: Europe-wide flight control is complex, lots of scarce resources: runways, physical sky, radar codes, radio frequencies. Very large, dynamic, safety-critical optimisation problem.
    • 35k flights/day. 4k available radar codes to identify planes. Uh oh. Also, route planning much more computationally expensive in 4D. Can change routes, delay flights, but also rearrange ATC for better capacity of bits of the sky.
    • Massive redundancy to avoid total downtime; multiple copies of the live system, archived plans stored so they can roll-back a few minutes, then an independently engineered fall-back for some level of capacity if live fails due to data, then tools to help humans do it, and properly maintained capacity information for when nobody is contactable at all.
    • Explained optimising a specific problem in Ada; no actual Ada tooling so stuck with binary analysis (e.g. perf, top, ..). Built own parallelisation and pipelining as there's no tool or library support. Ada codebase and live system important, but too complex to change, so push new work into copies of it on the client. Still have concurrency bugs due to shared state.
  • glusterfs and beyond glusterfs: Large-scale Hadoop alternative, more focused on reliability, and NFS/CIFS behaviour than custom API, but also offer object storage and others.
    • Checksumming considered too slow, so done out of band (but actually likely to catch problems), people don't actually want errors to block their reads (?!?). Lots more things are part of a distributed system than I would have expected, e.g. they're thinking of adding support for geographical clustering, so you can ensure parts of your data are in different DCs, or that there is a copy near Australia (dammit).
    • The idea that filesystems have to be in kernel mode is outdated; real perf is from e.g. user-mode networking stacks. Significantly lower development costs in user mode mean FSes are typically faster in user space, as the devs have spent more time (with more tooling) getting them to work properly: real speedups are algorithmic, not code.
    • Went on to claim that Python is fine, don't even need C or zero-copy (but do need to limit copies), as everything fun is offloaded anyway. Ships binaries with debug symbols (1-5% slowdown) as it's totally irrelevant. Team built out of non-FS, non-C people writing C. They're good enough to know what not to screw up.
    • Persistent memory is coming (2TB of storage between DRAM and SSD speed), and cache people are behind. NFS-Ganesha should be your API in all cases.
  • What even are Distros?: Distros aren't working, can't support things for 6mo, 5y, 10y or whatever as upstream and devs hate you. Tried to build hierarchies of stable -> more frequently updated, but failed; build deps at the lower levels; packaging tools keep changing; no agreement on promotion; upstreams hate you.
    • PPAs? They had invented yet another PPA host, and they're all still bad. Packaging is so hard to use, especially as non-upstreams don't use it enough to remember how to use it.
    • Components/Modules? Larger granularity installable which doesn't expose what it's made out of, allowing distros to mess it up inside? Is this PaaS, is this Docker? It really feels like it's not, and it's really not where I want to go.
    • Is containerisation the only way to protect users from badly packaged software? Do we want to keep things alive for ten years? I have been thinking about the questions this talk had for a while, but need a serious discussion before I can form opinions.
  • Reproducible Builds: Definitely another post some time.
    • Interesting questions about why things are blacklisted, and whether the aim is to support everything. Yes. Everything. EVERYTHING.
  • Postgres buffer manager: Explaining the internals of the shared buffers data structure, how it's aged poorly, what can be done to fix it up, or what it should be replaced with. Someone actually doing proper CS data-structures, and interested in cache friendliness, but also tunability, portability, future-proofing, etc.
    • shared_buffers tuning advice (with the usual caveat that it's based on workload); basically "bigger is normally always better, but definitely worth checking you aren't screwing yourself".
    • Also talked about general IO tuning on Linux, e.g. dirty_writeback, which a surprising number of people didn't seem to have heard of. Setting it to block earlier reduces maximum latency; numbers as small as 10MB were considered.
  • Knot DNS resolver: DNS resolvers get to do a surprising number of things, and some people run them at massive scale. Centralising caching on e.g. Redis is worthwhile sometimes. Scriptable in Lua so you can get it to do whatever you feel like at the time (woo!).
    • Implements some interesting things: Happy Eyeballs, QNAME minimisation, built-in serving of pretty monitoring. Some optimisations can break CDNs (possibly around skipping recursive resolving due to caching), didn't really follow.
  • Prometheus monitoring: A monitoring tool. Still unable to understand why people are so much more excited about it than anything else. Efficient logging and powerful querying. Clusterable through alert filtering.
    • "Pull" monitoring, i.e. multiple nodes fetch data from production, which is apparently contentious. I am concerned about credentials for machine monitoring, but the daemon is probably not that much worse.
  • htop porting: Trying to convince the community to help you port to other platforms is hard. If you don't, they'll fork, and people will be running an awful broken fork for years (years). Eventually resolved by adding the ability to port to BSD, then letting others do the OSX port.
  • API design for slow things: Adding APIs to the kernel is hard. Can't change anything ever. Can't optimise for the sensible case because people will misuse it and you can't change it. Can't "reduce" compile-time settings as nobody builds a kernel to run an app.
    • Lots of things in Linux get broken, or just never work to start with, due to poor test coverage, maybe the actually funded kselftest will help, but people can help themselves by making real apps before submitting apis, or at least real documentation. e.g. recvmsg timeout was broken on release. timerfd tested by manpage.
    • API versioning is hard when you can't turn anything off. epoll_create1, renameat2, dup3. ioctl, prctl, netlink, aren't a solution, but maybe seccomp is. Capabilities are hard; 1/3rd of things just check SYS_ADMIN (which is ~= root). Big argument afterwards about whether versioning can ever work, and what deprecation means. Even worse for this than for Java, where this normally comes up.
  • Took a break to talk to some Go people about how awful their development process is. CI is broken, packaging is broken, builds across people's machines are broken. Everything depends on github and maybe this is a problem.
  • Fosdem infra review: Hardware has caught up with demand, now they're just having fun with networking, provisioning and monitoring. Some plans to make the conference portable, so others can clone it (literally). Video was still bad but who knows. Transcoding is still awfully slow.
    • Fosdem get a very large temporary ipv4 assignment from IANA. /17? Wow. Maybe being phased out as ipv6 and nat64 kind of works in the real world now.
    • Argument about why they could detect how many devices there were, before we realised mac hiding on mobile is probably disabled when you actually connect, because that's how people bill.
  • HOT OSM tasking: Volunteers digitising satellite photos of disaster zones, and ways to allocate and parallelise that. Surprisingly fast; 250 volunteers did a 250k person city in five days, getting 90k buildings.
    • Additionally, provide mapping for communities that aren't currently covered, and train locals to annotate with resources like hospitals and water-flow information.
    • Interesting that sometimes they want to prioritise for "just roads", allowing faster mapping. Computer vision is still unbelievably useless; claiming 75% success rate at best on identifying if people even live in an area.
    • Lots of ethical concerns; will terror or war occur because there's maps? Sometimes they ask residents and they're almost universally in favour of mapping. Sometimes drones are donated to get better imagery, and residents jump on it.
  • Stats talk. Lots of data gathered; beer sold, injuries fixed, network usage and locations. Mostly mobile (of which, mostly android). Non-mobile was actually dominated by OSX, with Linux a close second. Ouch.

Take-aways: We're still bad at software, and at distros, and at safety, and at shared codebases.

Predictions: Something has to happen on distros, but I think people will run our current distros (without containers for every app) for a long time.


2015-12-23

BlobOperations: A JDBC PostgreSQL BLOB abstraction

BlobOperations provides a JdbcTemplate-like abstraction over BLOBs in PostgreSQL.

The README contains most of the technical details.

It allows you to deal with large files; ones for which you don't want to load the whole thing into memory, or even onto the local storage on the machine. Java 8 lambdas are used to provide a not-awful API:

blops.store("some key", os -> { /* write to the OutputStream */ });
blops.read("other key", (is, meta) -> { /* read the InputStream */ });

Here, the (unseekable) Streams are connected directly to the database, with minimal buffering happening locally. You are, of course, free to load the stream into memory as you go; the target project for this library does that in some situations.

In addition to not being so ugly, you get free deduplication and compression, and a place to put your metadata, etc. Please read the README for further details about the project.


And, some observations I had while writing it:

I continue to be surprised at how hard it is to find good advice on locking techniques and patterns for Postgres. For example,

SELECT * FROM foo WHERE pk=5 FOR UPDATE;

... does nothing if the row with pk=5 doesn't exist (yet). That is, there's no neat way to block until you know whether you can insert a record. Typically, you don't want to block, but if your code then progresses to do:

var a = generateReallySlowThing();
INSERT INTO foo (pk, bar) VALUES (5, a);
COMMIT;

...it seems a shame to have waited for that slow operation, and then have the INSERT explode on you. The "best" solution here appears to be: insert a blank record, commit, then lock the record, do your slow operation, and then update it. As far as I'm aware, none of the UPSERT-related changes in PostgreSQL 9.5 help with this case at all. I would love to link to a decent discussion of this... but I'm not aware of one.
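
Sketched out against the toy foo table above (plain SQL, with the slow work happening while the row is locked):

```sql
-- 1. reserve the key with a blank row, so it exists to be locked
--    (the blank insert can itself still race; catching unique_violation
--    covers that)
INSERT INTO foo (pk, bar) VALUES (5, NULL);
COMMIT;

-- 2. now that the row exists, FOR UPDATE actually blocks competing writers
BEGIN;
SELECT * FROM foo WHERE pk = 5 FOR UPDATE;
-- ... generateReallySlowThing() happens here, with the row locked ...
UPDATE foo SET bar = 'slow result' WHERE pk = 5;
COMMIT;
```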

A similar case comes up later, where I wish for INSERT ON CONFLICT DO NOTHING, which is in PostgreSQL 9.5. Soon.

