Phlogiston Blue - Porting a Large(ish) Program

This is the second of four articles published, or to be published, in the journal C Vu. For more details, please see the introduction to the first article in the series - 'Moving to the Web'.

Last issue I promised that I would go into a little more detail about the problems we had porting our main product - the game Federation - from AOL to the web...

The project raised some rather interesting issues.

I think perhaps the most instructive way to discuss these issues is to start way back when Federation was first written, with an explanation of its structure. Federation is an example of what in those days was called a multi-player game, but these days, when everything is 'multi-player', is called a 'massively' multi-player game, or a 'persistent world' game.

This means that the world in which it is set is persistent - when players log off, the same world is there for them when they log back on. It also means that the game is scalable in terms of the number of players it can hold. (In fact, game design limitations mean that Federation has a practical limit of about 800 simultaneous players.)

When the game was first written, the web was not even a twinkle in anyone's eye. There were BBS systems and a small number of commercial online systems. This meant that the game has never been a stand-alone system - it was designed to rely on others for services like billing, bulletin boards and customer acquisition. It also meant that it had to run on whatever operating system the host service provided. In short, it always had to be highly portable.

The original version was written for the UK Compunet service - a fairly successful commercial service for CBM64 owners that ran in the early and mid 1980s. The operating system they used was called OS9/68K. This was a real-time operating system that looked a little like Unix, but with somewhat fewer features. The main problems with OS9/68K were that it had no swap memory and no BSD-style socket/stream facilities.

The program was written as a main server which handled a command from a player, processed it through to completion, and then processed the next command. When a player logged onto the game, the host system fired up a small client program (which we called the 'driver' program) and connected stdin and stdout for the driver to the player's input and output streams. The driver then communicated with the server ('host') through shared memory.
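As a rough illustration of the host/driver split described above, the sketch below models one mailbox per player as a plain C struct. It is not the Federation code: the names Mailbox, driver_post and host_poll are invented for the example, the mailbox lives in ordinary memory rather than a real shared memory segment, and all locking is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CMD_MAX 80

/* One mailbox per connected player. In a real system this struct would
   sit in a shared memory segment; here it is ordinary memory (assumption). */
typedef struct {
    int  has_command;            /* set by the driver, cleared by the host */
    char command[CMD_MAX];       /* the player's input line */
    char reply[CMD_MAX + 16];    /* the host's response for the driver to print */
} Mailbox;

/* Driver side: hand one line of player input to the host.
   Returns 0 on success, -1 if the previous command is still pending. */
int driver_post(Mailbox *box, const char *line)
{
    assert(box != NULL && line != NULL);
    if (box->has_command)
        return -1;
    strncpy(box->command, line, CMD_MAX - 1);
    box->command[CMD_MAX - 1] = '\0';
    box->has_command = 1;
    return 0;
}

/* Host side: visit each mailbox in turn and process any pending command
   through to completion before moving on to the next one.
   Returns the number of commands handled on this pass. */
int host_poll(Mailbox *boxes, int count)
{
    int handled = 0;
    for (int i = 0; i < count; i++) {
        if (!boxes[i].has_command)
            continue;
        /* Stand-in for the real game logic: echo the command back. */
        snprintf(boxes[i].reply, sizeof boxes[i].reply,
                 "You typed: %s", boxes[i].command);
        boxes[i].has_command = 0;
        handled++;
    }
    return handled;
}
```

Because the host finishes one command before looking at the next, the game logic in this sketch never sees two commands in flight at once.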

This host/driver architecture proved to have a number of advantages from the game design point of view, and stayed around until very recently.

The first major test of the program's portability came when we needed to move it onto a system using Unix. This was a strict System V implementation, so it didn't have streams, which meant all interprocess communications had to go through shared memory.

Actually, this was a bit of a nightmare. Shared memory had to be 'mounted', just like a disk drive, and if you wanted to dynamically allocate memory - for instance with malloc() - then you had to unmount the shared memory first! One of the things this port did was to reduce the game's dependence on dynamic memory.

Other than this, the port went gratifyingly smoothly. I'd been careful to avoid some of the dodgier constructs of K&R C - such as bitfields - and all the player input and output for the program was concentrated in two functions.
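To show why concentrating the input and output pays off, here is a minimal sketch of the idea. The function names player_read and player_write are hypothetical, and standard C streams stand in for whatever the host system really provided; the point is only that the rest of the program would call these two functions and nothing else.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Read one line of player input into buf, stripping the line ending.
   Returns the length of the line, or -1 at end of input. */
int player_read(FILE *in, char *buf, size_t size)
{
    assert(in != NULL && buf != NULL && size > 0);
    if (fgets(buf, (int)size, in) == NULL)
        return -1;
    size_t len = strcspn(buf, "\r\n");   /* cut at the first CR or LF */
    buf[len] = '\0';
    return (int)len;
}

/* Send text to the player. Returns 0 on success, -1 on error. */
int player_write(FILE *out, const char *text)
{
    assert(out != NULL && text != NULL);
    if (fputs(text, out) == EOF)
        return -1;
    return fflush(out) == EOF ? -1 : 0;
}
```

With this arrangement, moving to a system with a different terminal or network interface means rewriting the bodies of two functions rather than hunting through the whole program.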

All in all we survived quite well though, and were reasonably satisfied.

Our next port was some nine months later when we did a deal with MicroLink, a fairly successful UK commercial service which had just been sold to AT&T's Istel network. Istel were a nightmare - but more about that shortly.

Istel wanted us running on one of their VAX machines, using its native VMS operating system connected to an X25 line. No one involved on our side had any experience of VMS (or X25 for that matter), but undaunted we set to. Istel loaned us an old VAX to do the development on. It was too big to get into the lift, so the movers had to carry it up the stairs to the fourth floor, and take all the doors off to get it in. When you switched it on all the lights dimmed, and we never had to have the heating on all winter. It was about as powerful as an Intel 386SX machine.

We had two major problems with the port to VMS - the file system and the X25 interface. When I wrote the original program, it never occurred to me that some systems might not support byte-oriented files. They seemed so obviously the way to go that I didn't realise that they were one of the key revolutionary concepts introduced by Unix.

The more aging hacks amongst you will have realised that I've never worked in a mainframe environment! Actually, most of the operating systems then extant had record-oriented file systems - i.e. the basic unit of storage in the file was not the byte but a record. And if you didn't use up a whole record, the file system padded it out. And if you used one and a bit records to store each element of your file there was a massive performance hit.

By very good fortune, and possibly because I like my code to be 'tidy', the code only interfaced with the file system at a small number of points. I have to say though, that it wasn't deliberate on my part - it had never occurred to me that the file system interface might be a porting problem. Eventually we sorted it out and stuffed our data structures into VMS records.
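The arithmetic behind the 'one and a bit records' problem can be sketched as follows. The record length of 512 bytes is an assumption made purely for illustration (the real VMS record size is not given here), and the function names are invented.

```c
#include <assert.h>
#include <string.h>

#define RECORD_SIZE 512   /* assumed record length, for illustration only */

/* Number of whole records needed to hold `bytes` bytes of data. */
size_t records_needed(size_t bytes)
{
    return (bytes + RECORD_SIZE - 1) / RECORD_SIZE;
}

/* Bytes of padding the file system adds to fill the last record. */
size_t padding_bytes(size_t bytes)
{
    return records_needed(bytes) * RECORD_SIZE - bytes;
}

/* Copy an item into a buffer made of whole records, zero-filling the
   unused tail. Returns the number of bytes occupied (a multiple of
   RECORD_SIZE), or 0 if the buffer is too small. */
size_t pack_records(void *dest, size_t dest_size,
                    const void *item, size_t item_size)
{
    assert(dest != NULL && item != NULL);
    size_t total = records_needed(item_size) * RECORD_SIZE;
    if (total > dest_size)
        return 0;
    memcpy(dest, item, item_size);
    memset((char *)dest + item_size, 0, total - item_size);
    return total;
}
```

Under this assumption, an element of 513 bytes occupies two records and wastes 511 bytes, which is why it pays to size data structures to fit the record length.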

The X25 problem was more complex. Eventually we figured out why the system was only putting one character into each X25 packet, and padding the rest with nulls. Fixing the problem was a doddle, because only two functions were talking out to the players.

For all the work, we sadly didn't last very long on MicroLink. The service died fairly rapidly after Istel took it over - they insisted on charging customers by the number of bytes sent out! A classic failure to understand the nature of the emerging consumer market.

With the next port we crossed the Atlantic to go onto General Electric's GEnie service, then at number three in the consumer online market. GEnie had a special reputation for providing multi-player games, so this would be the big one for us.

GEnie was the big one in more ways than one - porting to their system was the biggest challenge yet.

GEnie ran on Honeywell minis, using a proprietary operating system called Mark III. Among its more unusual features were 46-bit words and a per-process memory limit of one megabyte. The compiler was pretty primitive as well, the editor was virtually unusable, and the linker could only cope with 120 characters worth of file names!

We coped with the 46-bit word pretty well. The only assumption I'd made in writing the program was that an int would be at least 32 bits. Obviously binary data files had to be converted - but we wrote a utility to read in the files on the old system and write each field out in ASCII. A matching utility on the new system read in the ASCII fields and wrote out the new-style binaries. All very tedious to write, but essentially straightforward.
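The pair of conversion utilities might look something like the sketch below. The PlayerRecord layout, its field names and the one-record-per-line format are all invented for the example, and it assumes names contain no white space; the real data files were certainly more involved.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* A made-up record layout, standing in for the real data structures. */
typedef struct {
    char name[16];
    long cash;
    int  rank;
} PlayerRecord;

/* Old system: write each field out as text, one record per line.
   Returns 0 on success, -1 if the line buffer is too small. */
int record_to_ascii(const PlayerRecord *rec, char *line, size_t size)
{
    assert(rec != NULL && line != NULL);
    int n = snprintf(line, size, "%s %ld %d\n",
                     rec->name, rec->cash, rec->rank);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* New system: parse the text back into the native binary layout.
   Returns 0 on success, -1 if the line is malformed. */
int record_from_ascii(const char *line, PlayerRecord *rec)
{
    assert(rec != NULL && line != NULL);
    memset(rec, 0, sizeof *rec);
    int fields = sscanf(line, "%15s %ld %d",
                        rec->name, &rec->cash, &rec->rank);
    return fields == 3 ? 0 : -1;
}
```

Because only text crosses between the two machines, neither side needs to know the other's word length, byte order or structure padding.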

The 46-bit word did cause some problems with interprocess communication - especially communicating with processes on other machines with more conventional word lengths. We eventually solved this by only communicating in ASCII. Clumsy, but it worked.

Actually, the 46-bit word length is not quite as bizarre as it sounds - it started as 48 bits, but the operating system used two bits for its own affairs, leaving 46 for applications programmers. As a special bonus, the library contained a function to pack nine five-bit 'ASCII' characters into a word!
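For the curious, nine five-bit characters take 45 bits, which just fits. The sketch below shows one way such packing can work; the five-bit alphabet used here (0 for padding, 1 to 26 for the letters, 27 for space, 28 for anything else) is invented, since the actual encoding used by the Mark III library is not known to me, and a 64-bit integer stands in for the 46-bit word.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CHARS_PER_WORD 9
#define BITS_PER_CHAR  5

/* Invented five-bit alphabet: 0 = padding, 1..26 = letters (case is
   folded), 27 = space, 28 = any other character. */
static unsigned encode5(char c)
{
    if (c >= 'A' && c <= 'Z') return (unsigned)(c - 'A') + 1;
    if (c >= 'a' && c <= 'z') return (unsigned)(c - 'a') + 1;
    if (c == ' ') return 27;
    return 28;
}

static char decode5(unsigned code)
{
    if (code >= 1 && code <= 26) return (char)('A' + code - 1);
    if (code == 27) return ' ';
    return '?';
}

/* Pack up to nine characters into the low 45 bits of a word, first
   character in the highest bits. Shorter strings are zero-padded. */
uint64_t pack9(const char *text)
{
    assert(text != NULL);
    uint64_t word = 0;
    size_t len = strlen(text);
    for (size_t i = 0; i < CHARS_PER_WORD; i++) {
        unsigned code = (i < len) ? encode5(text[i]) : 0;
        word = (word << BITS_PER_CHAR) | code;
    }
    return word;
}

/* Unpack a word into a NUL-terminated string, stopping at padding. */
void unpack9(uint64_t word, char out[CHARS_PER_WORD + 1])
{
    size_t n = 0;
    for (int i = CHARS_PER_WORD - 1; i >= 0; i--) {
        unsigned code = (unsigned)((word >> (i * BITS_PER_CHAR)) & 0x1F);
        if (code == 0)
            break;
        out[n++] = decode5(code);
    }
    out[n] = '\0';
}
```

The price of five bits per character is obvious from the alphabet: there is no room for both upper and lower case, let alone digits and punctuation.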

The one meg process limit caused a lot of heartbreak. In order to pare down the size of the program to fit in, we had to strip out most of the server's extensive sanity checking code. The absence of this code cost us a lot of grief in the long run and is one of the things that I still regret - with all the other time pressures on us, few of the checks have been restored in the years since.

As our user base grew, even the stripped-down program could no longer run in a single one meg space, and so I rewrote part of it to act as a separate process - a data server. Whenever the main process required text to send out to a player, it asked the data server for it. By doing this I effectively gave the program a one meg code and stack space, and a one meg data segment.

Actually this whole business was a classic case of what's good for individual users not being good for the system as a whole. We discovered that the one meg process limit wasn't an operating system limit - it was a compiler limit. Someone in the distant past had decided that no one should have processes larger than one meg.

Obviously the compiler is not the correct tool to enforce resource usage - and even worse the limits were compiled into the compiler, so there was no way to change them. The net result of this was that people with large applications had to split them into multiple processes all communicating with one another - some people had as many as five processes.

You may ask why this was so bad. Well, there are a number of reasons. At the coding level it made each of the communicating programs more complex than a single program would have been, thus making them potentially more unstable. These programs collectively took up far more memory than a single program would have. But the worst thing was that they caused an enormous hit on the system's processing resources, because interprocess communications were being called instead of ordinary functions - not to mention all the extra context switching involved. Unfortunately, the programmers were left with no choice.

A classic case of individual solutions to a problem resulting in degradation of the overall environment.

Well this is proving to be a longer tale than I first realised, so I'm going to leave the move to AOL, how we got on with their system and the final move to the web to another article in the next issue.

Have fun programming.

Alan Lenton
