Today marks the 10th anniversary of the most bizarre, and possibly the saddest, job I ever took.

The year was 2005. My interest in writing a content management system in Java for the company that bought our startup had been steadily draining away, while my real passion was working on compilers and other programming language infrastructure (mostly SBCL). One day I spotted a job advert looking for compiler people, which was a rare occurrence in that time and place. I breezed through the job interview, but did not ask the right questions and ignored a couple of warning signs. Oops.

It turned out to be a bit of an adventure in retrocomputing.

The bizarre

This was the former internal tools unit of a very large company, let's call them X. For some reason X had split off the unit and sold (given?) it to a moderately large consulting company, whom we shall call Y. I was going to work at Y. The reason they needed compiler people was that they were about to take over the maintenance of a C compiler suite (compiler, linker, assembler, etc). Except I'd misunderstood: I thought they were taking over the maintenance from X, but that wasn't the case. The compiler actually came from another very large company, Z, who were discontinuing all support for it. So X had bought the source code from Z for very significant $$$, and needed somebody (Y) to actually do something with it. In fact it wasn't even just one compiler suite as I'd initially understood, but two. Woo, double the compilers to play with!

I started in September, but some schedules had slipped and we wouldn't actually have anything to work with for a month or two. So I had plenty of time to acclimatize there. Which was good, because it was like I'd stepped into some strange parallel dimension where the 80s never ended. You know, the kind of place where you need access to some old documentation, and eventually find that it's stored in an ingenious in-house source control system built on top of RCS.

For example, on my first day I found that X was running what was supposedly the largest VAXcluster remaining in the world, for doing their production builds. Yes, dozens of VAXen running VMS, working as a cross-compile farm, producing x86 code. You might wonder a bit about the viability of the VAX as a computing platform in the year 2005, especially for something as CPU-bound as compiling. But don't worry: one of my new coworkers had as their current task evaluating whether this should be migrated to VMS/Alpha or to VMS/VAX running under a VAX emulator on x86-64! [0]

Why did this company need to maintain a specific C compiler anyway? Well, they had their own ingenious in-house programming language that you could think of as an imperative Erlang with a Pascal-like syntax, which was compiled to C source [1]. I have no real data on how much code was written in that language, but it'd have to be tens of millions of lines at a minimum.

The result of compiling this C code would then be run on an ingenious in-house operating system that was written in, IIRC, the late 80s. This operating system used the 386's segment registers to implement multitasking and message passing. For this, they needed a compiler with much more support for segment registers than normal. Now, you might wonder about the wisdom of relying heavily on segment registers in the year 2005. After all, the use of segment registers had been getting slower and slower with every generation of CPUs, and in x86-64 the segmentation support was essentially removed. But don't worry, there was a project underway to migrate all of this code to run on Solaris instead [2].
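To give a flavor of what "much more support for segment registers than normal" might look like in C, here's a minimal hypothetical sketch in the style of classic segmented-x86 compilers, where far pointers carry a segment selector alongside the offset. I don't recall the actual compiler's extensions, so the __far keyword and the fallback define below are illustrative assumptions, not its real syntax:

    /* Hypothetical sketch of segmented-pointer support, in the style of
     * classic segmented-x86 C compilers.  Not the actual compiler's syntax. */
    #ifndef __FAR_SUPPORTED
    #define __far       /* let the sketch compile on flat-memory compilers */
    #endif

    typedef struct message {
        unsigned short length;
        unsigned char  payload[256];
    } message;

    /* With one segment per task, passing a message between tasks amounts to
     * passing a selector.  Each dereference through a __far pointer may load
     * a segment register (say, ES) before the actual memory access -- the
     * expensive operation that comes up again later in this story. */
    void copy_message(message __far *dst, const message __far *src)
    {
        unsigned short i;
        dst->length = src->length;
        for (i = 0; i < src->length; i++)
            dst->payload[i] = src->payload[i];
    }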

After a couple of months of twiddling my thumbs and mostly reading up on all this mysterious infrastructure, a huge package arrived addressed to this compiler project. But... We were supposed to get a source dump. Why does the package need two men to carry it? Did somebody play a practical joke on us, and send the source as printouts?

Why, it's the server that we'll use for compiling one of the compiler suites once we get the source code! An Intel System/86 with a genuine 80286 CPU, running Intel Xenix 286 3.5. The best way to interface with all this computing power was over a 9600 bps serial port. Luckily the previous owners had been kind enough to pre-install Kermit on the spacious 40MB hard drive of the machine, so I didn't need to track down a floppy drive or a Xenix 286 Kermit or rz/sz binary. God, what primitive pieces of crap that machine and OS were.

You might wonder about the wisdom of using a 15-20 year old machine as the sole method of building a piece of software: it's dog slow, and obviously will break sooner or later. In fact I raised this very issue and suggested maybe imaging the hard drive and getting everything running virtualized. That idea was nixed: the machine was old and fragile, and we couldn't risk poking around in its insides. It would also be really hard to replace; when they had gone hunting for this machine among antique computer specialists, they only found two remaining working units [3].

This might be a good time to say that, computationally speaking, I was raised by wolves on a SunOS 4 server (which I ended up sysadmining for a few hundred users). My personal email was still going over UUCP in 2005. The highlight of my previous weekend (in 2015, as I write this) was finding what looks like a partial source repository for a Lisp implementation written before I was born, one which appeared to have been completely lost to time. It was on a copy of some old backup tapes from an ITS server, and I don't even remember how or when those ended up on my hard drive. Which is to say, I like old computer systems more than is reasonable.

But even by my standards this level of computational archeology was going a bit too far. And the rabbit hole went a little deeper still.

A couple of weeks later the source drop arrived. I'll talk about the other compiler later; let's first tackle the one that needed to be built on a 286.

So it was written in PL/M. (Wait, is that even a thing? That's not a thing, right?) And it was last modified in the mid 80s. I'd like to say the build instructions had been produced on a typewriter, but it could be that my memory is playing tricks on me there. Some of the components didn't build cleanly, and required various Makefile tweaks, with excruciating round trip times for every test. Because, you know, this is a 286.

The hard drive wasn't large enough for all of the components either, so the process of rebuilding everything would be:

  • Upload the linker source tarball over the 9600 bps serial connection from a Linux server acting as a frontend
  • Unpack it
  • Build
  • Download the linker binary back to safety
  • Remove the source and the build artifacts
  • Repeat the same for all five components of the system.

Just the data transfers for each component took an hour. But after a long time fighting with it, I had a script that, with a single keystroke, generated binaries bit-identical to the ones that had apparently been in use for almost the last 20 years.

I was pretty worried, though: it'd be really hard to actually make any use of this source. There was no documentation except for the build instructions, so we'd need to reverse engineer everything. There wouldn't be any training from company Z either; frankly, it'd be a miracle if anyone who originally worked on the software was still with the company. Nobody knew PL/M. The roundtrip time from making a change on the build machine to having a binary on a machine capable of actually running it was at least an hour. And we didn't have a source level debugger for this, so that'd mean an hour just to add a single debug printf. (Wait, not a debug printf of course. A debug whatever-it-is-that-PL/M-uses-for-io.) It'd be pure pain.

I expressed these concerns, and was told not to worry.
- " Oh, we'll never want to make changes to this compiler, not enough code is compiled with it these days for that to be worth it. The more modern suite is the important one."
- "Wait? I just spent a month elbow deep in PL/M and Xenix/286 over a 9600bps Kermit connection, and you're telling me we're never going to actually use any of this?!"
- "Right, we just needed to verify that we really got what we bought."

I didn't really know whether to be happy about not having to do any more work on that crap, or angry about the waste of time.

The sad

That concluded the bizarre retrocomputing part of the story. We now get to the part with sad dysfunctional corporate politics. If you're just reading this for the laughs, maybe just skip to the end.

The more modern compiler suite wasn't a spring chicken either. It had to be built specifically with Visual Studio 6. Again there were no design docs, and no tests. The lack of tests was explained as being due to third party IP concerns; for the lack of documentation we never got an answer.

Unlike the truly ancient compilers, this one was easy to build. But what could we possibly do with it? So I read through the compiler, tried to understand what each file did, did some experiments and wrote some notes.

We arranged a big meeting with senior engineers from all the relevant departments of X. The agenda was to figure out what improvements they'd want in the compiler. It was pretty dispiriting. Half of them seemed to think it'd be better not to touch it at all, since we'd probably just break it. Even those who weren't completely opposed to changes couldn't think of anything they really needed. Finally someone took pity on me, and noted that the compiler wasn't very smart about scheduling segment register loads, which were expensive operations. Maybe that could be improved?

After the meeting one of the managers told me that it was really our job to come up with projects that the customer wanted to buy, not the other way around. And it usually couldn't just be a general project for minor improvements; it'd need clear and ideally measurable goals. The projects would also need to be pretty large to justify all the overhead. It should go without saying that this is an absolutely insane way of doing platform development, but it's something that follows directly from the incentives of the two parties. How anyone at X thought that anything good would come out of this, I don't know.

But never mind that. Our initial project for taking over the compiler maintenance was well funded, and vague enough that it was easy to argue that demonstrating the capability to ship some kind of improvement was a core deliverable. We could at least proceed with the only improvement anyone had shown any interest in.

So I implemented a new peephole optimizer stage for the segment registers, and even got the code reviewed by the original authors of the compiler when they came over from Z to give us a training session. It seemed to work, but as mentioned above we didn't have a test suite and building one would take a long time and a lot of work. (Excellent! We can propose that as a project later!).
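To make that concrete, here's a minimal sketch of the kind of pass this was; the instruction representation and names are invented for illustration, not the actual compiler's internals. The idea is simply that a segment register load (say, MOV ES,AX) is expensive, so reloads of a value the register is already known to hold can be deleted:

    #include <stddef.h>
    #include <string.h>

    enum opcode { OP_SEG_LOAD, OP_LABEL, OP_CALL, OP_OTHER };

    struct insn {
        enum opcode op;
        int seg;        /* which segment register, for OP_SEG_LOAD  */
        int src;        /* id of the value being loaded into it     */
        int dead;       /* set by the pass when the insn is elided  */
    };

    #define NSEGS 6     /* ES, CS, SS, DS, FS, GS */

    void elide_redundant_seg_loads(struct insn *code, size_t n)
    {
        int known[NSEGS];                   /* value id in each seg reg */
        size_t i;

        memset(known, -1, sizeof known);    /* -1: contents unknown */
        for (i = 0; i < n; i++) {
            switch (code[i].op) {
            case OP_SEG_LOAD:
                if (known[code[i].seg] == code[i].src)
                    code[i].dead = 1;       /* register already holds it */
                else
                    known[code[i].seg] = code[i].src;
                break;
            case OP_LABEL:                  /* control flow merges here, */
            case OP_CALL:                   /* so forget everything      */
                memset(known, -1, sizeof known);
                break;
            default:
                break;                      /* assume seg regs untouched */
            }
        }
    }

A real pass would of course also have to invalidate its knowledge when the register feeding the load gets clobbered, but that's the basic shape of it.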

We couldn't even run any of the production code, since that would require the ingenious in-house operating system. The only way to get any performance numbers, and any confidence in the changes being correct, would be to schedule a load test in X's test lab. Unfortunately, weeks and weeks of discussion never managed to line up both the lab time and the people from their side who would have been needed. It's of course understandable: whether these compiler changes got released or not made no difference to those people, who had their own actual work to do. But it also made it very hard to see how we could ship this change. The justification would be improved performance, but with no numbers it'd be a hollow claim.

That's when it dawned on me that there was never going to be any real compiler work there. These special compilers would stop mattering once X migrated away from the custom OS, which would have to happen eventually. Oh, sure, they'd need to be "maintained" just in case a customer running 20 year old code needed a bugfix. But given the dysfunctional processes, it seemed pretty clear that the costs of any improvement would be massive relative to the short active development life these systems had remaining. They'd probably spent a seven-figure sum on this project as insurance, but actually doing something with the code? No way.

All of this infrastructure was just going to be on life support while it was being replaced by new systems, which would in turn be obsoleted in five years. But nothing would ever actually go away; all this cruft would just accumulate and accumulate, nominally supported forever. And you'd have these extremely good engineers doing this completely insane work, having been moved from a prestigious high tech company to a despised consulting firm.

And how do you even get out of a job like that? I imagined myself in a job interview in 2010, trying to explain how useful my extensive knowledge of Xenix, PL/M build systems, and VMS would be to my prospective new employer. There might be a time in your life to just stop keeping up with tech, but your 20s are really not that time :)

Coda

So I quit without arranging for another job first, assuming that something would probably turn up. In an amazing display of serendipity, during my notice period ITA Software posted to the SBCL mailing list that they wanted to pay somebody to work on SBCL improvements for them, which was pretty much my dream gig at the time [4]. Perfect timing.

Ok, that's all. You can now proceed to one-up me with stories of developing new production software on a physical IBM 1401 in this millennium, or something ;-)

Footnotes

[0] I don't know the outcome of that evaluation.

[1] No, transpiling is not a word no matter how much you people try to make it one.

[2] And that leads us to the question of whether Solaris is really what you want to be migrating to in 2005.

[3] Wait?! There are two machines in the world that you think can be used to build this software, and we only bought one of them? "Oh, yes. It would have been a pretty good idea to buy the second one as a spare. Let's do that now!"

[4] Of course if one worry with the job at Y was that I'd be unemployable due to only having worked on boring and obsolete technology, one might wonder about the long term career prospects in Common Lisp compilers. But look, in 2006 CL was really going places!