2011-04-07
This post is for documentation only. It was going to be a rebuttal to jwh's Mercurial fanboyism but I realised while writing it and re-reading his post that I have absolutely no idea what he's talking about, nor can I work out how to get hg to tell me.
Given a repo of:

My proposals for alternatives to this simple workflow are as follows. These all result in the same order of code going into the master branch, but have different histories. (Actually, I think there are still mistakes in there, but I'm tired of staring at the horrible procedural script that generates it, so it'll do.)
1. A flat history, made by rebasing everything on top of master instead of merging:

2. Only merges on master, giving you the illusion of a neat history...

...but, underneath, loads of ugly information available:

3. A hybrid, supercommits, whereby you keep a flat history but preserve where the branches were:

...or, with the history information visible:

Thoughts:
Serious concurrent projects like git itself use the 'only merges on master' approach. I strongly agree that they shouldn't be flattening the history; it's nice to be able to see groups of patches as they go into master, and to navigate the history of "feature commits", instead of the history of "developer changes".
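The 'only merges on master' style is easy to see for yourself in a throwaway repo: --no-ff forces a merge commit even when a fast-forward would be possible, and --first-parent then walks only the neat top-level history. Everything here (scratch directory, branch names) is illustrative.

```shell
# A throwaway demo of the 'only merges on master' style; stock git only.
set -e
dir=$(mktemp -d) && cd "$dir"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email you@example.com && git config user.name you
echo base > file && git add file && git commit -qm "base"

git checkout -qb feature
echo one > one && git add one && git commit -qm "feature: part 1"
echo two > two && git add two && git commit -qm "feature: part 2"

git checkout -q master
git merge -q --no-ff -m "Merge branch 'feature'" feature

git log --oneline --first-parent   # the illusion of a neat history
git log --oneline --graph          # the ugly information underneath
```

The first log shows two entries (base, then the merge); the second shows all four commits with the branch structure.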
The flat model seems to work much more like how I think about software development:
- You start working on something.
- You do your normal develop, commit, developer-test, commit cycle.
- Master moves on a bit while you're messing around. The fact that you happened to start developing before a specific commit on master, instead of just after it (so it would be the root of your branch) is reasonably irrelevant; you may as well move the branch-root (or branch point) up to the top of master.
- Additionally, if you do a real merge, you need to test your changes. This leads to more developer testing, and possibly more commits.
- What do you do with these commits? Assuming your merge with master is still local, you can fix it (this is git; you can fix anything), but it's inconvenient.
- When you've rebased, you can continue committing on your rebased branch like normal, with confidence that when you merge it'll all be fine (as it'll be a fast-forward merge).
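The list above, run end-to-end with stock git in a scratch repo (all names illustrative): the branch point is moved up to the tip of master with rebase, work continues, and the final merge is a guaranteed fast-forward.

```shell
# The flat, rebase-onto-master workflow from the list above.
set -e
dir=$(mktemp -d) && cd "$dir"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email you@example.com && git config user.name you
echo base > file && git add file && git commit -qm "base"

git checkout -qb topic               # start working on something
echo t1 > t1 && git add t1 && git commit -qm "topic: first cut"

git checkout -q master               # master moves on while you work
echo m1 > m1 && git add m1 && git commit -qm "master moves on"

git checkout -q topic
git rebase -q master                 # move the branch point to the tip
echo t2 > t2 && git add t2 && git commit -qm "topic: post-rebase fixes"

git checkout -q master
git merge -q --ff-only topic         # guaranteed fast-forward
git log --oneline                    # one straight line: a flat history
```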
'Supercommits' is a hybrid of these two; you can use the rebase-onto-master workflow from 'flat', but you can logically group your set of commits into a... family? I like this idea, but haven't really implemented it in practice so can't really comment.
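One way to read 'supercommits' in stock git terms (my interpretation; the post doesn't pin down an implementation) is: rebase the branch flat onto master first, then merge with --no-ff anyway, so a merge commit still records where the group of commits begins and ends.

```shell
# A guess at the 'supercommit' shape using stock git: rebase flat,
# then force a merge commit purely to mark the group. Names illustrative.
set -e
dir=$(mktemp -d) && cd "$dir"
git init -q && git symbolic-ref HEAD refs/heads/master
git config user.email you@example.com && git config user.name you
echo base > file && git add file && git commit -qm "base"
git checkout -qb topic
echo t1 > t1 && git add t1 && git commit -qm "topic: part 1"
echo t2 > t2 && git add t2 && git commit -qm "topic: part 2"
git checkout -q master
echo m1 > m1 && git add m1 && git commit -qm "master moves on"
git checkout -q topic && git rebase -q master       # keep the history flat
git checkout -q master
git merge -q --no-ff -m "Supercommit: topic" topic  # ...but mark the group
git log --oneline --graph
```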
(Apologies for screenshots of text; I'm lazy, ansifilter was NOT WORKING and it's prettier than gitk.)
2011-02-10
Many people find modern version control systems confusing. PatchPiler offers a new, simpler way to think about version control, offering you far more flexibility than certain other version control systems.
All software development is the creation of patches; small changes to the state of the software. PatchPiler's command-line tool, papi, lets you manage these patches efficiently.

Like in other version control systems, papi commit adds a patch (yellow) to the existing stack of patches (green). As with any modern version control system, you can have multiple outstanding patches on a pile.

Sometimes you want to be working on multiple things at the same time, say yellow and blue things. For this, there's papi new-pile.
It lets you name your new pile; let's call it blue. This means you can pile patches on "blue" while continuing to pile patches on "yellow". This is amazingly cool. In fact, given that "blue" is completely unrelated to "yellow", somebody else can be piling patches on "blue" while you continue working on "yellow".

Frequently you'll want to copy patches between piles.
papi copy lets you copy a patch from one pile to another.
Something cool has happened in "yellow" and you want in? Just copy it across! The patch is now in both piles, but this is okay, as they're currently unrelated.

papi re-pile allows you to catch-up with another pile's entire history.
It just "re-piles" your patches on top of the patches from the other pile. This doesn't affect the other pile; it's still a separate stream of development.
Note how it intelligently works out that the yellow patch in "blue"'s pile was already included earlier on, so it's no longer necessary to copy it in.

You'll notice that the pile named "blue" now has all of the outstanding patches; this brings us on to the last command: papi bless marks changes as complete and removes any unnecessary piles.
This results in a nice, clean pile of patches, leaving you ready to continue developing. The possibilities are endless!
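PatchPiler isn't real, of course, but the re-pile deduplication in particular has a stock-git analogue: rebase skips patches that are already upstream. A scratch-repo sketch (branch names borrowed from the piles above; this is plain git, not the papi tool):

```shell
# papi copy / papi re-pile, approximated with stock git: rebase skips
# patches that are already upstream. Names are illustrative.
set -e
dir=$(mktemp -d) && cd "$dir"
git init -q
git symbolic-ref HEAD refs/heads/yellow
git config user.email you@example.com && git config user.name you
echo base > base && git add base && git commit -qm "green: shared base"
git branch blue                       # papi new-pile blue
echo y > y && git add y && git commit -qm "yellow: new work"  # papi commit
git checkout -q blue
git cherry-pick yellow                # papi copy: patch now in both piles
echo b > b && git add b && git commit -qm "blue: more work"
git rebase -q yellow                  # papi re-pile: duplicate is dropped
git log --oneline                     # the copied patch appears only once
```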
Continue reading...
2011-01-29
As a continuation of bucko's progcomp chain, I set Progcomp 6.
The objective was to walk your way through a "maze" with huge constraints to make the problem simpler.
I believed the simplest solution to the problem was graph search with backtracking. My inefficient implementation (with debugging left in) is unbelievably fast; it can solve a 300x5000 map (>100 times the size of the original problem) in around 2 seconds. The map generator ensures no problem has a trivial solution (i.e. going in a straight line all the way down the map). I assumed that solving it by hand within the (6-second) time constraint would require far too much UI work, so keeping the problem size small enough to fit on the screen was fine.
I was wrong. So wrong.
bucko's Perl solution was first, after 15 minutes, with a tree search implementation. While easier to implement, this is much slower.
I'd carefully designed the page, submission process and timeout so that people who found curl / wget too much effort could submit the solution by copy-pasting; I could copy the result from the page (without the IE-only JS' assistance) into my solver, solve it, and copy the result back in ~5 seconds; hence the 6-second timeout. bucko's solution sometimes doesn't finish in time, even on these tiny maps. It'll never finish on the huge map.
fice was second, after 21 minutes. Algorithm / code unknown, although I'm guessing there was auto-submission as no User-Agent was set.
Queex (outsider!) was third, after about 50 minutes, with his Java solution. He decided the problem wasn't complicated enough to write a "proper" solution for, and went for just trying to jiggle away from the edges. Obviously it works fine, and I have no idea how to harden the map generator against it, except perhaps forcing you to go from the left edge to the right edge at least once. It can only solve some of the maps, but this isn't important as it was a "solve once" problem.
Next up was Afal's js solution. He (correctly) guessed that the map generator only generates large walls every 10 spaces, worked out which side it was, and jiggled away from it a bit. Apparently works most of the time. Again he uses an in-browser implementation to avoid having to do any page parsing or post rubbish.
Afal then decided to submit 62,000 other solutions, which has made my log huge and writing this a pain. Such a penis. Such a penis.
At this point an hour had passed and I turned the debugger on; when you die it tells you where and shows you what you submitted, etc.
james was next, with some more jiggling Java, and a shell-script wrapper.
MrWilson then submitted a travesty. It generates random solutions and assumes loads of things about the map. It works in nearly no cases. I don't know how he can show his face in public after this.
Connorhd duplicated bucko's solution in php.
sadiq, tom, Softly, Trencha and Steve Brandwood also had a correct answer.
Meta: I wrote my own solutions website instead of using bucko's. I got the logging all wrong. I got the map size all wrong (not realising how lazy everyone is).
It took me about 90 minutes to do the map generator and my solution with debugger, and verifier. Another 90 minutes was spent on the website. That is, a ~3 hour time investment for causing hours of suffering and an entire night of entertainment. Totally worth it.
Next up is bucko's progcomp chain, link 7, but tom still owes us a progcomp.
2011-01-18
I've been whining for a while about SpamFiles' speed on Windows. It creates and writes small amounts of data to hundreds of files, then deletes them all. It's orders of magnitude slower on Windows (all the way to Seven) than on Linux, due to NTFS.
It's just a synthetic benchmark though, right? That is, it's reasonably irrelevant. Or so I thought.
In a recent private project I was using Spring's JdbcTemplate with SQLite to write a couple of hundred rows to an empty table. JdbcTemplate defaults to autocommit, and it's non-trivial to convince it not to.
The relevant code and sqlitelulz.jar shows why this is a problem:
On Windows:
>java -jar sqlitelulz.jar 1000
Autocommit: 70.867652636 seconds
Manual commit: 0.107324493 seconds
On Linux:
$ java -jar sqlitelulz.jar 1000
Autocommit: 1.814235004 seconds
Manual commit: 0.075502495 seconds
Yes, that's 660 times slower on Windows (and only 25 times slower on non-NTFS). The time is entirely SQLite creating and deleting its journal file.
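The jar isn't needed to reproduce the effect; the sqlite3 shell alone shows it. In autocommit mode every statement is its own transaction, so the journal file is created and deleted once per INSERT, while a single BEGIN/COMMIT touches it once. (sqlitelulz presumably does the JDBC equivalent; this is just a sketch of the same behaviour, with illustrative file names.)

```shell
# Autocommit overhead with the sqlite3 CLI alone: one journal
# create+delete per INSERT, versus one for the whole transaction.
set -e
dir=$(mktemp -d) && cd "$dir"
rows=$(seq 1 200 | sed 's/.*/INSERT INTO t VALUES (&);/')

sqlite3 auto.db 'CREATE TABLE t (n INTEGER);'
time sqlite3 auto.db "$rows"                    # 200 journal files' worth

sqlite3 manual.db 'CREATE TABLE t (n INTEGER);'
time sqlite3 manual.db "BEGIN; $rows COMMIT;"   # one journal, total
```

The gap between the two timings is much larger on NTFS, for the reasons above.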
Sadness.
2010-12-14
People often complain that git's commit ids are too hard to remember and that they prefer the sequential ones generated by inferior version control systems.
Stock git doesn't have an option to pick the commit id for a commit; this seems like a grave omission. I've prepared a patch which offers git commit --set-commit-id.
For example, everyone knows that the base commit in a repository should have a low number:
$ git init
Initialized empty Git repository in ./.git/
$ git add -A
$ git commit --set-commit-id 0000000 -a -m "Base."
Searching: 46% (12593/26843), done.
[master (root-commit) 0000000] Base.
1 files changed, 1 insertions(+), 0 deletions(-)
create mode 100644 myfile
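The patch isn't part of stock git, but the "Searching: 46% (12593/26843)" line gives away how it must work: vary something in the commit and re-hash until the id starts with the wanted prefix. A one-hex-digit version of that search can be sketched in plain git by amending a nonce into the commit message (the nonce scheme here is my own illustration, not the patch's actual mechanism):

```shell
# Brute-forcing a commit id prefix with plain git: amend the message
# with a nonce until the hash matches. One hex digit ('0') keeps this
# to ~16 attempts; each extra digit multiplies the work by 16.
set -e
dir=$(mktemp -d) && cd "$dir"
git init -q && git config user.email you@example.com && git config user.name you
echo hi > myfile && git add myfile
git commit -qm "Base. (nonce 0)"

n=0
until git rev-parse HEAD | grep -q '^0'; do
  n=$((n + 1))
  git commit -q --amend -m "Base. (nonce $n)"   # new message, new hash
done
echo "found after $n attempts: $(git rev-parse --short HEAD)"
```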
If you've already messed up your repository, a handy fixing script is provided:
$ git lg
* fe5e2ee - (HEAD, master) work, work, work, it's all I do
* a2c1ec8 - work, work, work
* e580e5e - work, work
* a6ad5ee - work
* 0000000 - base
$ sequentialise.sh 0000000 6
Stopped at a6ad5ee... work
Searching: 39% (10468/26843), done.
[detached HEAD 0000010] work
1 files changed, 1 insertions(+), 0 deletions(-)
Stopped at e580e5e... work, work
Searching: 174% (46706/26843)
[...]
$ git lg
* 0000040 - (HEAD, master) work, work, work, it's all I do
* 0000030 - work, work, work
* 0000020 - work, work
* 0000010 - work
* 0000000 - base
Much more usable! This example repository is available for inspection. gitweb doesn't show the commit ids on the log screen, but you can mouse-over and see them in the URLs.
Needless to say, this takes "a while". sequentialise.sh defaults to 5 digits, i.e. enough for a million commits, and is reasonably fast on modern hardware. 6 digits is rather less tolerable.
2010-08-04
I have an application that scales well up to around five threads a core, due to the mix of IO and CPU that it does.
That is, you give it more threads, and the throughput increases; the overall time goes down.
The following graph shows, in blue, the time for Sun's java.util.zip.ZipFile to complete a set of unzips on an increasing number of threads:

Wait, what the cocking shit.
Continue reading...