2015-12-23

BlobOperations: A JDBC PostgreSQL BLOB abstraction

BlobOperations provides a JdbcTemplate-like abstraction over BLOBs in PostgreSQL.

The README contains most of the technical details.

It allows you to deal with large files; ones for which you don't want to load the whole thing into memory, or even onto the local storage on the machine. Java 8 lambdas are used to provide a not-awful API:

blops.store("some key", os -> { /* write to the OutputStream */ });
blops.read("other key", (is, meta) -> { /* read the InputStream */ });

Here, the (unseekable) Streams are connected directly to the database, with minimal buffering happening locally. You are, of course, free to load the stream into memory as you go; the target project for this library does that in some situations.

In addition to not being so ugly, you get free deduplication and compression, and a place to put your metadata, etc. Please read the README for further details about the project.
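
For a feel of the shape of that API, here is a toy, in-memory Python stand-in. It is not the library and not its implementation: the names ToyBlobStore, store and read are made up to mirror the two calls above, and SHA-256 content hashing and zlib are only my assumptions about one way deduplication and compression could be done. Unlike the real thing, it buffers everything in memory; the point is just the callback-scoped stream.

```python
import hashlib
import io
import zlib


class ToyBlobStore:
    """In-memory stand-in: callback-scoped streams, dedup by hash, zlib compression."""

    def __init__(self):
        self._blobs = {}  # sha256 hex digest -> compressed bytes
        self._keys = {}   # key -> (digest, metadata dict)

    def store(self, key, writer, meta=None):
        buf = io.BytesIO()
        writer(buf)  # the callback writes into the stream we hand it
        data = buf.getvalue()
        digest = hashlib.sha256(data).hexdigest()
        # Identical content is kept once, however many keys point at it.
        self._blobs.setdefault(digest, zlib.compress(data))
        self._keys[key] = (digest, dict(meta or {}))

    def read(self, key, reader):
        digest, meta = self._keys[key]
        with io.BytesIO(zlib.decompress(self._blobs[digest])) as stream:
            # The stream is only valid for the duration of the callback.
            return reader(stream, meta)


blops = ToyBlobStore()
blops.store("some key", lambda os: os.write(b"hello"))
blops.store("other key", lambda os: os.write(b"hello"), meta={"type": "greeting"})
print(blops.read("other key", lambda stream, meta: (stream.read(), meta)))
```

Both keys end up pointing at the same stored blob, which is the "free deduplication" idea in miniature.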


And, some observations I had while writing it:

I continue to be surprised at how hard it is to find good advice on locking techniques and patterns for Postgres. For example,

SELECT * FROM foo WHERE pk=5 FOR UPDATE;

... does nothing if the row with pk=5 doesn't exist (yet). That is, there's no neat way to block until you know whether you can insert a record. Typically, you don't want to block, but if your code then progresses to do:

var a = generateReallySlowThing();
INSERT INTO foo (pk, bar) VALUES (5, a);
COMMIT;

...it seems a shame to have waited for that slow operation, and then have the INSERT explode on you. The "best" solution here appears to be to insert a blank record, commit, then lock the record, do your slow operation, and then update it. As far as I'm aware, none of the UPSERT related changes in PostgreSQL 9.5 help with this case at all. I would love to link to a decent discussion of this... but I'm not aware of one.
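
A rough sketch of that reserve-then-update ordering, in Python. To keep it runnable without a server it uses the standard library's sqlite3 as a stand-in, and SQLite has no SELECT ... FOR UPDATE, so the row-lock step is only marked with a comment. The function names reserve and fill are made up for illustration; the table is the foo (pk, bar) from above.

```python
import sqlite3


def reserve(conn, pk):
    """Insert a blank placeholder row and commit. Returns False if pk is taken."""
    try:
        conn.execute("INSERT INTO foo (pk, bar) VALUES (?, NULL)", (pk,))
    except sqlite3.IntegrityError:
        conn.rollback()
        return False
    conn.commit()
    return True


def fill(conn, pk, make_value):
    """Run the slow operation only once the key is known to be ours, then update."""
    # On PostgreSQL, this is where SELECT ... FOR UPDATE would lock the placeholder.
    value = make_value()
    conn.execute("UPDATE foo SET bar = ? WHERE pk = ?", (value, pk))
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (pk INTEGER PRIMARY KEY, bar TEXT)")
if reserve(conn, 5):
    fill(conn, 5, lambda: "slow result")
```

The conflict now surfaces in reserve, before any slow work has been done, rather than at the final INSERT.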

A similar case comes up later, where I wish for INSERT ON CONFLICT DO NOTHING, which is in PostgreSQL 9.5. Soon.


2015-11-18

xlines: stdin round-robiner

xlines is a combination of xargs and split. It takes a bunch of lines, and sends them to a number of child processes. Each line is seen by exactly one of the processes.

e.g.

seq 16 | xlines -c 'cat > $(mktemp)'

...will give you 8 temporary files (on an 8-core machine) containing:

1
9

and:

2
10

etc.
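
The distribution in that example can be sketched in a few lines of Python. This only models the dealing-out of lines as the output above suggests it happens; it says nothing about how xlines manages its child processes, and the name round_robin is made up here.

```python
def round_robin(lines, workers):
    """Deal lines out to `workers` buckets in turn, like cards to players."""
    buckets = [[] for _ in range(workers)]
    for index, line in enumerate(lines):
        buckets[index % workers].append(line)
    return buckets


buckets = round_robin([str(n) for n in range(1, 17)], 8)
print(buckets[0])  # ['1', '9']
print(buckets[1])  # ['2', '10']
```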

Why would you care?

You have a bunch of INSERT statements coming off a stream, but your database will only use a single core if you run them in series:

zcat sql.gz | xlines -P 32 -- psql

Some speed-up.

zcat sql.gz | xlines -P 32 -c 'buffer | psql'

Zoom.

A specific tool for a specific job. I still don't think it makes up for the lack of bounded parallelism in shell, however. Still thinking about that one...


2015-11-12

Teensy weensy crypto

As the UK's politicians continue to fail to understand what "strong cryptography" or "banning" even mean, I thought I would have a look at how simple strong cryptography can be.

nanorc4 is a working RC4 encryption and decryption implementation in 16-bit assembly. It will run on any 32-bit (or, presumably, 16-bit!) Windows machine (which, admittedly, are going out of fashion), and on dosbox:

uwACiB/+w3X6MckxwIjIih6AAP7L9vOI44qHgv6Iy4onAOAAxegvAP7Bdd8xybQIzSH+wYjLAi/o
HACIy4oXiOsCF4jTihcwwrQCzSG0C80hhMB12c0giMuKF4jrijeIF4jLiDfD

Yep, that's it. base64 encoded. 102 bytes, or 138 encoded. Fits in a tweet. Probably small enough to memorise. Certainly pretty hard to ban.

With this (and your computer) you can secure a message with a password in a way that's unbreakable. I can't break it, your government can't break it, other people's governments can't break it. Secure.

Why's it so small?

  1. The problem is (relatively) easy. This is known as "pre-shared key cryptography", or "symmetric cryptography", which is one of the easier problems in the science. Things get much harder when you don't have a good way to tell the target the key in advance.
  2. RC4 is surprisingly secure for how simple the code is.
  3. 16-bit assembly, and the COM "format" have no preamble: it's just the code. It just starts executing at the start. (And I hacked at it a bit.)
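
To show how little there is to RC4 itself, here is the textbook algorithm in Python rather than 16-bit assembly. It is a sketch of the standard key schedule and keystream loop, not a translation of nanorc4; the function name rc4 is mine. Like any bare RC4, it has no key derivation, no IV and no protection against modification.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt `data` with RC4; the operation is its own inverse."""
    # Key scheduling: start from the identity permutation and shuffle it with the key.
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]

    # Keystream generation: keep shuffling, XOR one keystream byte into each input byte.
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(out)


ciphertext = rc4(b"Key", b"Plaintext")
print(ciphertext.hex())         # bbf316e8d940af0ad3
print(rc4(b"Key", ciphertext))  # b'Plaintext'
```

The whole state is the 256-entry permutation plus the two indices i and j, which is why it squeezes into so few bytes.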

Demo!

> echo hi | one.com secure password>out ; in DOS (note: no trailing space)

$ make c && ./c 'secure password' <out  # on linux
hi

Should you use it? No. There are many important missing features that are present in proper symmetric encryption tools, such as proper key derivation, protection against modification, IVs, and fewer bugs. Yes, even this 102 byte program has some significant bugs I couldn't be bothered to fix.

Is RC4 secure? For this use-case, yes. For TLS, most certainly not. Even today there are many plausible attacks against RC4 in the TLS context, but none of them apply to this static-data world.

I was actually hoping to be able to fit RC4-drop-N in, which is probably secure in many more contexts, but I couldn't get the byte count down to the (tweet-derived) target. I guess this makes for a reasonable golf competition...

Development notes:

  • dosbox is pretty annoying, but so is cmd. The dosbox debugger is cool, but there doesn't seem to be any current documentation on it. That Forum Post is pretty wrong.
  • dosbox doesn't support pipes or <input redirection, so I couldn't debug with binary files, which is one of the reasons it doesn't work.
  • I have no idea what the actual semantics of the input interrupts are; all the useful documentation seems to have been lost to history, or was commercial (and/or paper) in the first place.
  • Everything fits in three 256-byte blocks, so the bh register == block number, and there's no use of memory segmentation (WOOO).
      • block 0: the PSP, which I couldn't overwrite as it has the key in (as the command-line argument).
      • block 1: the code segment.
      • block 2: the 256-byte state for RC4.
  • After the key setup, the bh is left at 2 forever.
  • cl and ch are used for the i and j state parts in RC4.

Update:

  • A number of people pointed me at Odzhan's RC4 implementation in normal x86(_64) which shows a much better understanding of actual assembly programming. For example, their "swap" implementation is amazing compared to mine.
  • Some people asked how much hacking it took to get the size down. It took about six hours, but it was great. I love golf competitions, even if they're just against myself.
  • There was some concern that people might actually accidentally run or incorporate the code without understanding the flaws, as there isn't a big enough warning on this page, or on github. These people additionally didn't read any of the rest of the article, where it is explained that it's broken, 16-bit x86 assembly which you actually can't run anywhere, even if you wanted to.

2015-10-04

Capturing users' ssh keys

Four years ago, I was working on a project that would require users to connect to it over ssh. At the time, asking typical users (even developers!) to send you an ssh public key was a bit of an involved operation.

The situation hasn't improved much.

For example, github suggests generating the keys manually, then using Windows' clip.exe or apt-get install xclip && xclip (from the command line) to get the key into the clipboard, then pasting it into their web-interface. Ugh.

The situation is a little better for PuTTYTray: it has built-in support for an SSH agent, and a reasonably streamlined way to get keys into the clipboard, but then we're still stuck with the clipboard-into-the-web-interface story. This was written in 2013-08, two years too late (although I'm sure the author could have been convinced to move the development forward).

For this project, I came up with a better way.

I realised I could simply ask the new user to ssh in, and capture their keys. To distinguish concurrent users, I could issue them a fake username, and ask them to ssh account-setup-for-USERNAME@my.service.com. When they do, I can capture their keys and automatically associate them with their account. No platform specific commands, no unnecessary messing around in the terminal.

This is possible due to how ssh authentication works:

  • Client sends the username.
  • Server replies: Sure, you can try logging in with keys, or with passwords if you want.
  • Client sends Public Key 1.
  • Server replies: Nope, but you can try other keys or passwords.
  • Client sends Public Key 2.
  • Server replies: ...

That is, the standard ssh client will just send you all the user's public keys.

Note that this isn't (normally) considered a security problem; the keys are public, after all, and the server isn't leaking any information by saying "nope".
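
The bookkeeping behind that idea is tiny. Below is a toy Python model of it, with no real SSH involved: keys are opaque strings, the class and method names (KeyCapture, invite, offer_key) are made up for illustration, and the step where a real server checks that the client actually holds the private key is left out entirely.

```python
import uuid


class KeyCapture:
    """Toy model of the account-setup flow; no real SSH, keys are opaque strings."""

    def __init__(self):
        self.pending = {}     # one-off setup username -> real username
        self.authorized = {}  # real username -> set of captured public keys

    def invite(self, user):
        """Issue a fake, single-use username for `user` to ssh in as."""
        token = str(uuid.uuid4())
        self.pending[token] = user
        return token

    def offer_key(self, username, public_key):
        """Called for each public key a client offers. Returns True to accept it."""
        if username in self.pending:
            # Setup login: remember the first key offered and retire the token.
            user = self.pending.pop(username)
            self.authorized.setdefault(user, set()).add(public_key)
            return True
        return public_key in self.authorized.get(username, set())


server = KeyCapture()
print(server.offer_key("john", "key-1"))  # False
token = server.invite("john")
print(server.offer_key(token, "key-1"))   # True
print(server.offer_key("john", "key-1"))  # True
```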

As I was already running a custom SSH server which practically required you to implement authentication yourself anyway, it was a simple step to add key capture to the account setup procedure. I've uploaded a stripped down version to github if you want to see how it works. For example,

Start the server:

server% git clone https://github.com/FauxFaux/ssh-key-capture.git
server% cd ssh-key-capture
server% ./gradlew -q run

The user can try to log in, but gets rejected (this isn't required):

john% ssh -p 9422 john@localhost
Permission denied (publickey).

Server logs from the (unnecessary) failed authentication:

KeyCapture - john trying to authenticate with RSA MIIBIjANBg...
KeyCapture - john trying to authenticate with EC MFkwEwYHKoZ...

Tell the server that john has signed up, or wants to add keys, or...

Enter a new user name, or blank to exit: john
Ask 'john' to ssh to '18a74d9f-5c7d-41d0-8369-bae4aaba9867@...'

John now adds his keys, and hence can login:

john% ssh -p 9422 18a74d9f-5c7d-41d0-8369-bae4aaba9867@localhost
Added successfully!  You can now log-in normally.
Connection to localhost closed.

john% ssh -p 9422 john@localhost
Hi!  You've successfully authenticated as john
Bye!
Connection to localhost closed.

Future work:

  • It could capture all of the user's keys (it currently just captures the first).
  • More meaningful behaviour after the first authentication, or during the admin part of the setup?
  • Some way to do this on top of OpenSSH, or other tools people actually run in the wild. PAM?

Update: There was some decent discussion on reddit's /r/netsec about this post.


2015-10-01

ghetto_json for Ansible

ansible-ghetto-json is an ansible module for making quick edits to JSON files.

Ansible has great built-in support for ini files, but a number of more modern applications are using JSON for config files.

ghetto_json lets you make some types of edits to JSON files, and remains simple enough that it's hopefully easier just to extend than to switch to a different module, and you won't feel too guilty just copy-pasting it into your codebase.

More details are in its README, which you can view on the above github link.

It offers an interesting opportunity to think about type conversion: JSON actually supports more types than you would normally think of; ints, floats, nulls, booleans, as well as the trusty string type. Python, which I still don't think of as a typed language, uses and honours these types in its JSON module, meaning you have to do conversion.

And, if it explicitly supports null, how do you do removals? I made up a new keyword, unset, which removes the key. Pretty ghetto.
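
A minimal Python sketch of both problems, the conversion and the removal. This is an illustration, not the module's code or its argument syntax: the helper names convert and edit and the dotted-path addressing are assumptions of mine; only the unset keyword comes from the module.

```python
import json


def convert(text):
    """Interpret a string as a JSON value if it parses, else keep it as a string."""
    try:
        return json.loads(text)
    except ValueError:
        return text


def edit(document, dotted_key, text):
    """Set document[a][b][c] for 'a.b.c', or delete it if text is 'unset'."""
    *parents, leaf = dotted_key.split(".")
    node = document
    for name in parents:
        node = node.setdefault(name, {})
    if text == "unset":
        node.pop(leaf, None)
    else:
        node[leaf] = convert(text)
    return document


config = {"server": {"port": 80, "debug": True}}
edit(config, "server.port", "8080")    # becomes the int 8080
edit(config, "server.name", "web1")    # not valid JSON, so stays a string
edit(config, "server.tls", "null")     # an explicit null, not a removal
edit(config, "server.debug", "unset")  # removes the key
print(json.dumps(config, sort_keys=True))
# {"server": {"name": "web1", "port": 8080, "tls": null}}
```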


2015-05-07

lxc-autostart for limited users, on systemd

lxc comes with a tool named lxc-autostart which can help you start your containers at boot; all you have to do is set lxc.start.auto = 1 in the config file and it will start your containers for you... if you're running your containers as root.

For convenience and security, I'm not running my containers as root. Normally, if I wanted to start something on boot, as a limited user (or possibly as a service), I'd use the cron @reboot hack:

$ crontab -l
@reboot /usr/bin/lxc-autostart

This, however, fails for lxc-autostart (and for lxc-start, for the same reason): cron runs your command in a bizarre environment which, importantly, doesn't have the user's cgroups set up properly. These are set up somewhere scary (pam?), and cron apparently doesn't do a proper log-in for your user. You can observe the failure with some:

* * * * * cat /proc/self/cgroup

...which will show you have junk cgroups, which makes lxc-start angry with terrible, terrible errors:

cgmanager[1041]: cgmanager:do_create_main: pid 5679 (uid 1000 gid 1000) may not create under /run/cgmanager/fs/blkio/system.slice/autostart.service
cgmanager[1041]: cgmanager:do_create_main: pid 5679 (uid 1000 gid 1000) may not create under /run/cgmanager/fs/cpu/system.slice/autostart.service
...
cgmanager[1041]: cgmanager: Invalid path /run/cgmanager/fs/blkio/system.slice/autostart.service/lxc/utopic
cgmanager[1041]: cgmanager:per_ctrl_move_pid_main: Invalid path /run/cgmanager/fs/blkio/system.slice/autostart.service/lxc/utopic
cgmanager[1041]: cgmanager: Invalid path /run/cgmanager/fs/cpu/system.slice/autostart.service/lxc/utopic
cgmanager[1041]: cgmanager:per_ctrl_move_pid_main: Invalid path /run/cgmanager/fs/cpu/system.slice/autostart.service/lxc/utopic
...
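
If you want to compare what cat /proc/self/cgroup prints under cron with what it prints in a normal login, a few lines of Python will pull the file apart. Each line has the form hierarchy-id:controllers:path. The function name parse_cgroups is made up, and the sample text is hypothetical, not captured from a real machine.

```python
def parse_cgroups(text):
    """Parse /proc/<pid>/cgroup content: one 'hierarchy-id:controllers:path' per line."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # The path may itself contain colons, so split at most twice.
        hierarchy, controllers, path = line.split(":", 2)
        names = controllers.split(",") if controllers else []
        entries.append((int(hierarchy), names, path))
    return entries


# Hypothetical sample, not captured from a real machine.
sample = """\
4:blkio:/system.slice/cron.service
3:cpu,cpuacct:/system.slice/cron.service
1:name=systemd:/user.slice/user-1000.slice/session-1.scope
"""

for hierarchy, names, path in parse_cgroups(sample):
    print(hierarchy, ",".join(names), path)
```

On a Linux box, feeding it the contents of /proc/self/cgroup from inside and outside cron should make the difference in paths obvious.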

The easiest way for a limited user to solve this is, as far as I'm aware, to ssh to localhost. Limited users can't configure sudo to be passwordless, and can't su without entering their password on a proper terminal, meaning neither works from cron.

$ ssh-keygen -t ed25519
$ ssh-copy-id localhost
$ crontab -l
@reboot /usr/bin/ssh me@localhost /usr/bin/lxc-autostart

This was working great, until the Ubuntu Vivid upgrade, which has brought the wonders of systemd.

Under systemd, the @reboot entries are sometimes processed before sshd has started, so the above massive hack fails.

$ crontab -l
@reboot sleep 10 && /usr/bin/ssh ...

NO. NO NO NO.

Under systemd, we can write a simple service file that does the auto-start. systemd understands cgroups, so if you ask it to run a service as a User=, it'll run the service in the user's cgroup, right? Nope: It runs everything in the service cgroup. Fair enough.

However, as the service is started as root, we can use su. A systemd service: /etc/systemd/system/autostart.service:

[Unit]
Description=lxc-autostart
After=network.target

[Install]
WantedBy=multi-user.target

[Service]
Type=oneshot
ExecStart=/bin/su me -c '/usr/bin/lxc-autostart'

And install it:

$ sudo systemctl enable autostart.service

This seems to work. I'm not sure if the After= is necessary; network.target is a complex beast but I still feel safer waiting for something to be alive.

