Electronic Tekumel Part Two

I left off the previous post with a promise to talk about data recovery of Apple ][ diskettes. Here's that story:

I was able to lend the Foundation a Platinum //e with two working UniDisk drives. This is a late-model Apple ][, with 128K of memory and an integrated 80-column card. I also tricked it out with the crucial component: an Apple Super Serial Card, which I've put into Slot 2.

There's a magnificent program called ADTPro, which is designed for exactly this sort of recovery project. On the Apple ][ side, it is a straightforward program that either reads a block from diskette and sends it over a serial, Ethernet (yes, there are Apple ][ Ethernet cards; a few rare ones from Back In The Day, and some modern ones as well), or cassette audio interface, or receives a block over that interface and writes it to diskette. On the modern-computer side, it's a Java program that does the same thing: it reads blocks of data from disk and pushes them over the wire, or receives blocks of data from the wire and writes them to disk.

We are of course going to be using this in its read-a-physical-diskette-and-transfer-it-over-the-wire mode. Serial is the best option for us, as Apple ][ Ethernet interfaces are not cheap.

While most modern computers don't come with RS-232 ports anymore, a USB-to-RS232C converter is about twenty bucks at Micro Center. Finding drivers for modern Macs is a little more problematic, but there's an open-source PL-2303 driver that works fine (the Prolific PL-2303 seems to be the USB-to-RS232 economy chipset of choice). As it happens, the Foundation's laptop is a Windows 7 machine, and of course all these converters come with Windows drivers. Linux support, as always, is comprehensive and unproblematic.

So that's the hardware needed: Apple ][, Super Serial Card, and a USB-to-RS232C serial converter. The software is ADTPro. After some initial experimentation, we were able to start transferring disk images. Early results suggest that about 60% of the disks are recoverable, which isn't too bad for 30-year-old media. Our goal is to work with the actual diskettes as little as possible: wherever we can extract data from them and then work with the extracted copy, that's what we want to do, in order to minimize the damage that reading the diskettes will inevitably do.

Somewhat strangely, Professor Barker's documents seemed to have a much better recovery rate than those diskettes containing programs. Some of this may be that those programs are copy-protected, and therefore not amenable to the simple track-by-track copying that ADTPro does.

However, there is certainly enough to keep us all busy for quite some time just working with the disk images we are able to transfer. Once we have the disk images, our work isn't yet done. I donated a Virtual II limited license to the Foundation, so we can fire up a virtual Apple //e and mount the disk images and use them. This may be necessary as we find documents in strange binary formats. If it becomes necessary, we can try a number of other recovery options--the Asimov Apple II Archives contain a vast collection of Apple II software of, admittedly, dubious legality. Armed with disk manipulation utilities and (almost certainly) the program that produced this data in the first place, we should be able to work with the data either under emulation or on the Apple //e itself.

However, the bulk of what we've recovered so far is just text files. Inspection reveals some word processing formatting codes, but for the most part, the contents are just ASCII text. For these cases, it will obviously be a great deal faster to simply extract the files without having to go through an emulator to view or "print" them (one suggestion was to install a printer driver that simply printed to a local file from the emulator). A little Googling suggested a solution.

AppleCommander is an open-source Java program designed to allow manipulation of a wide variety of Apple ][ diskette images. While it doesn't have an "extract everything" mode, it turned out to be very easy to write a little Perl wrapper to determine the diskette format, retrieve a listing, and then extract each file. I did a little bit of file-type association to append an appropriate suffix to each file name, and a little bit of filename mangling to make the files easier to work with in a Unix context. As it stands, the wrapper can work with DOS 3.3 or ProDOS images, and knows about text, Applesoft BASIC, and Integer BASIC files (everything else just gets extracted as unmodified binary).
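
To give a flavor of the file-type association and filename mangling, here is a minimal sketch in Python rather than the Perl of the actual wrapper. It is an illustration, not the wrapper's real code: the function name, the suffix table, and the choice of replacement character are all assumptions, and the step that calls AppleCommander to list and extract files is left out entirely.

```python
import re

# Hypothetical mapping from catalog type codes to Unix suffixes.
# DOS 3.3 catalogs use single letters (T, A, I); ProDOS uses
# three-letter codes (TXT, BAS, INT). The suffixes are made up here.
SUFFIXES = {
    "T": ".txt", "TXT": ".txt",   # text
    "A": ".bas", "BAS": ".bas",   # Applesoft BASIC
    "I": ".int", "INT": ".int",   # Integer BASIC
}
DEFAULT_SUFFIX = ".bin"           # everything else: unmodified binary


def unix_name(apple_name, file_type):
    """Turn an Apple ][ catalog name into a Unix-friendly file name."""
    stem = apple_name.strip().lower()
    # Collapse spaces, slashes, and anything else awkward in a shell
    # into single underscores, then trim underscores from the ends.
    stem = re.sub(r"[^a-z0-9._-]+", "_", stem).strip("_")
    if not stem:
        stem = "unnamed"
    suffix = SUFFIXES.get(file_type.strip().upper(), DEFAULT_SUFFIX)
    return stem + suffix
```

With this sketch, a DOS 3.3 text file cataloged as "CHAPTER 1 NOTES" would come out as chapter_1_notes.txt, and a file of any unrecognized type would simply get a .bin suffix.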

This tool will enable the Tekumel Foundation to quickly extract the word processing documents that represent much of Professor Barker's Tekumel work. Future enhancements may include a converter to interpret the formatting codes and transform the raw text into cleanly-formatted documents.
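
A first step toward such a converter might look like the sketch below. It assumes the text has been pulled off the image as raw bytes with no conversion applied; in that case Apple ][ text typically has the high bit set on each character and uses a carriage return as the line ending. The function name is hypothetical, and the word processor's own formatting codes are deliberately left untouched, since we haven't worked out what they mean yet.

```python
def apple_text_to_unix(raw):
    """Convert raw Apple ][ text bytes to a Unix-style string.

    Assumptions (not taken from the wrapper): characters may have the
    high bit set, lines end in carriage returns, and NUL bytes are
    only padding. Other control codes pass through unchanged.
    """
    chars = []
    for b in raw:
        c = b & 0x7F              # clear the high bit
        if c == 0x00:             # skip NUL padding
            continue
        chars.append("\n" if c == 0x0D else chr(c))
    return "".join(chars)
```

Feeding it bytes([0xC8, 0xC9, 0x8D]) gives back "HI" followed by a newline. Text that is already plain 7-bit ASCII passes through with only its line endings changed.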

However, even the capabilities we have now will enable us to (most importantly) get what data we can onto media with a longer shelf life, and (secondarily, but still significantly) fairly easily prepare text files so that Tekumel scholars can have access to some of the pre-publication stages of various Tekumel documents.

References:
Asimov Apple II Archives: ftp://ftp.apple.asimov.net/pub/apple_II/
ADTPro: http://adtpro.sourceforge.net/
AppleCommander: http://applecommander.sourceforge.net/