2015-11-12

Teensy weensy crypto

As the UK's politicians continue to fail to understand what "strong cryptography" or "banning" even mean, I thought I would have a look at how simple strong cryptography can be.

nanorc4 is a working RC4 encryption and decryption implementation in 16-bit assembly. It will run on any 32-bit (or, presumably, 16-bit!) Windows machine (admittedly, those are going out of fashion), and on dosbox:

uwACiB/+w3X6MckxwIjIih6AAP7L9vOI44qHgv6Iy4onAOAAxegvAP7Bdd8xybQIzSH+wYjLAi/o
HACIy4oXiOsCF4jTihcwwrQCzSG0C80hhMB12c0giMuKF4jrijeIF4jLiDfD

Yep, that's it. base64 encoded. 102 bytes, or 138 encoded. Fits in a tweet. Probably small enough to memorise. Certainly pretty hard to ban.
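
If you want to check those numbers yourself, here is a minimal sketch in Python. The variable names are mine and are not part of nanorc4; the only input is the base64 text above.

```python
import base64

# The two base64 lines from above, newline-separated as they appear.
ENCODED = (
    "uwACiB/+w3X6MckxwIjIih6AAP7L9vOI44qHgv6Iy4onAOAAxegvAP7Bdd8xybQIzSH+wYjLAi/o\n"
    "HACIy4oXiOsCF4jTihcwwrQCzSG0C80hhMB12c0giMuKF4jrijeIF4jLiDfD\n"
)

# By default b64decode skips characters outside the base64 alphabet,
# so the newlines do not need stripping first.
program = base64.b64decode(ENCODED)

print(len(program))  # 102: the raw COM image, in bytes
print(len(ENCODED))  # 138: the encoded text, counting both newlines
```

Writing `program` to a file in binary mode should give you the COM file; the demo further down calls it one.com.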

With this (and your computer) you can secure a message with a password in a way that's unbreakable. I can't break it, your government can't break it, other people's governments can't break it. Secure.

Why's it so small?

  1. The problem is (relatively) easy. This is known as "pre-shared key cryptography", or "symmetric cryptography", which is one of the easier problems in the science. Things get much harder when you don't have a good way to tell the target the key in advance.
  2. RC4 is surprisingly secure for how simple the code is.
  3. 16-bit assembly, and the COM "format" have no preamble: it's just the code. It just starts executing at the start. (And I hacked at it a bit.)
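
To give a sense of how little there is to RC4, here is a sketch of the textbook algorithm in Python. It is not a transcription of nanorc4 (which has its own bugs), and the function name `rc4` is just my choice. It assumes a non-empty key.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Textbook RC4. The same call both encrypts and decrypts."""
    # Key scheduling: start from the identity permutation of 0..255,
    # then shuffle it using the key bytes (key must be non-empty).
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]

    # Keystream generation: keep swapping, and XOR one keystream byte
    # into each byte of the input.
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(out)


# Round trip: running the ciphertext back through with the same key
# recovers the original message.
ciphertext = rc4(b"secure password", b"hi\n")
assert rc4(b"secure password", ciphertext) == b"hi\n"
```

The whole cipher is two loops over a 256-byte table, which is why it squeezes into so few instructions.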

Demo!

> echo hi | one.com secure password>out ; in DOS (note: no trailing space)

$ make c && ./c 'secure password' <out  # on linux
hi

Should you use it? No. It's missing many important features that proper symmetric encryption tools have, such as proper key derivation, protection against modification, IVs, and fewer bugs. Yes, even this 102-byte program has some significant bugs I couldn't be bothered to fix.

Is RC4 secure? For this use-case, yes. For TLS, most certainly not. Even today there are many plausible attacks against RC4 in the TLS context, but none of them apply to this static-data world.

I was actually hoping to be able to fit RC4-drop-N in, which is probably secure in many more contexts, but I couldn't get the byte count down to the (tweet-derived) target. I guess this makes for a reasonable golf competition...
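
For reference, RC4-drop-N is plain RC4 with the first N bytes of keystream thrown away before any data is encrypted. A minimal sketch in Python follows; the names `rc4_keystream` and `rc4_drop` are hypothetical, and the default of 768 is only an assumption (a commonly quoted choice), not something nanorc4 uses.

```python
def rc4_keystream(key: bytes):
    """Yield textbook RC4 keystream bytes forever (key must be non-empty)."""
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]
    i = j = 0
    while True:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        yield state[(state[i] + state[j]) % 256]


def rc4_drop(key: bytes, data: bytes, drop: int = 768) -> bytes:
    """RC4-drop-N: discard the first `drop` keystream bytes, then XOR."""
    stream = rc4_keystream(key)
    for _ in range(drop):
        next(stream)  # discarded; these early bytes are the most biased
    return bytes(b ^ k for b, k in zip(data, stream))


# Both ends must agree on the drop count, or decryption produces garbage.
secret = rc4_drop(b"secure password", b"hi\n")
assert rc4_drop(b"secure password", secret) == b"hi\n"
```

The extra cost over plain RC4 is one more loop around the keystream step, which is exactly the kind of thing that is cheap in Python and expensive when every byte counts.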

Development notes:

  • dosbox is pretty annoying, but so is cmd. The dosbox debugger is cool, but there doesn't seem to be any current documentation on it. That Forum Post is pretty wrong.
  • dosbox doesn't support pipes or <input redirection, so I couldn't debug with binary files, which is one of the reasons it doesn't work.
  • I have no idea what the actual semantics of the input interrupts are; all the useful documentation seems to have been lost to history, or was commercial (and/or paper) in the first place.
  • Everything fits in three 256-byte blocks, so the bh register == block number, and there's no use of memory segmentation (WOOO).
  • block 0: the PSP, which I couldn't overwrite as it has the key in (as the command-line argument).
  • block 1: the code segment
  • block 2: the 256-byte state for RC4.
  • After the key setup, the bh is left at 2 forever.
  • cl and ch are used for the i and j state parts in RC4.
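
The block layout in these notes can be modelled in a few lines of Python. This is only an illustration of the addressing trick, under the assumption that all offsets are relative to the program's single segment; the helper names `bx` and `inc8` are hypothetical and do not correspond to anything in the binary.

```python
PSP_BLOCK, CODE_BLOCK, STATE_BLOCK = 0, 1, 2  # the possible values of bh


def bx(bh: int, bl: int) -> int:
    """Combine the two 8-bit halves into the 16-bit bx offset."""
    return ((bh & 0xFF) << 8) | (bl & 0xFF)


def inc8(value: int) -> int:
    """An 8-bit increment, such as `inc bl`: it wraps from 255 back to 0."""
    return (value + 1) & 0xFF


# A COM program is loaded right after the 256-byte PSP, so code starts at 0x100.
assert bx(CODE_BLOCK, 0) == 0x100

# With bh fixed at 2, every value of bl addresses a byte of the RC4 state...
assert bx(STATE_BLOCK, 0x00) == 0x200
assert bx(STATE_BLOCK, 0xFF) == 0x2FF

# ...and stepping bl past the end wraps inside the block instead of leaving it.
assert bx(STATE_BLOCK, inc8(0xFF)) == 0x200
```

Because RC4 does all of its index arithmetic modulo 256, the 8-bit wrap-around gives the "mod 256" for free, with no masking instructions needed.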

Update:

  • A number of people pointed me at Odzhan's RC4 implementation in normal x86(_64), which shows a much better understanding of actual assembly programming. For example, their "swap" implementation is amazing compared to mine.
  • Some people asked how much hacking it took to get the size down. It took about six hours, but it was great. I love golf competitions, even if they're just against myself.
  • There was some concern that people might actually accidentally run or incorporate the code without understanding the flaws, as there isn't a big enough warning on this page, or on github. These people additionally didn't read any of the rest of the article, where it is explained that it's broken, 16-bit x86 assembly which you actually can't run anywhere, even if you wanted to.

Read more of Faux' blog