Transforming a Mac 68k Program into a Linux Daemon
Just a few days before the end of #MARCHintosh a call went out over Mastodon asking if anyone had an Apple IIe with an Apple Workstation Card. The hope was that someone could try to use it to join the #GlobalTalk network. While I have the hardware required, I had been reluctant to try to participate in GlobalTalk. Many posts had gone by in my feed about the steps required to join, and it looked like quite a bit of work. I replied that I had the hardware available but wasn't sure about being able to get it online because of the effort that would be involved. Joining GlobalTalk required using a very old program called Apple Internet Router, which only ran on System 7.1 on 68k Macs.
I quickly got a response from @europlus suggesting it might be possible to do it with an emulated Mac, which would at least save me the trouble of pulling one of my Macs out of storage, finding somewhere to set it up, and hoping the hardware still worked. After he provided all the documentation and files he had been using, I gave it a try with QEMU on my Linux box. Unfortunately it didn't work at all: the emulated Mac came up to a white screen and just hung there.
@europlus and @robdaemon worked while I slept, and by morning they had figured out how to build a custom version of QEMU which would actually boot. Because it required a bunch of extra libraries and downloads, I decided I'd rather do it inside Docker, mostly because I didn't want to accidentally contaminate my Linux host with a pile of possibly incompatible libraries and config. Getting the QEMU source to download and compile inside Docker was easy for me since @europlus and @robdaemon had already done all the heavy lifting and provided the list of required libraries and compile flags. Getting networking working inside the emulated Mac was another challenge though.
When I went into the Chooser and selected AppleShare I couldn't see the netatalk server I keep on my network. Going into MacTCP and configuring it manually didn't seem to work either: none of the other machines on my network could ping the IP that I had assigned inside of MacTCP. After too much fiddling around I discovered that the Network Control Panel was set to run AppleTalk over the serial port. Once I switched that over to Ethernet, the Chooser started showing the netatalk server on my network!
Getting TCP/IP working wasn't so simple. It turned out that even though QEMU had been configured for bridged networking, absolutely no TCP/IP traffic ever seemed to pass to or from the emulated Mac. Was there a problem with MacTCP? Was some other network stack extension needed for System 7.1? Was some option missing when the custom QEMU was compiled? I went round and round trying to figure it out, and since I had no idea what it could be, @wotsac jumped in to help troubleshoot.
Was there some setting in Linux that could be causing problems? I couldn't see how, since the emulated Mac clearly had network access: I could connect to the netatalk server on my LAN just fine. With @wotsac trying to replicate my problems I started to notice a pattern emerging and began to wonder if somehow Linux iptables was intervening and interfering with only the IP-based traffic. A bit of digging around the intertubes and it did seem like iptables might be the culprit. I found a command to tell iptables to stop dropping traffic going through the bridge, and suddenly the emulated Mac was responding to pings! Moments later @wotsac was able to see my netatalk server being routed over the internet! My emulated Mac running Apple Internet Router was working!
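If you hit the same wall, the fix comes down to keeping bridged frames out of the iptables FORWARD chain. Here's a sketch of the idea in Python; the particular sysctl and the alternative rule shown are the common ways of doing it, not necessarily the exact command I used.

```python
#!/usr/bin/env python3
# Sketch of keeping bridged frames out of iptables. When the br_netfilter
# module is loaded (Docker loads it), every frame crossing a Linux bridge is
# also run through the iptables FORWARD chain, which Docker sets to a DROP
# policy. Only IP traffic gets that treatment, which would explain why
# AppleTalk worked while TCP/IP silently vanished. Run as root.

def allow_bridged_ip_traffic():
    # Tell the kernel not to hand bridged IPv4 frames to iptables at all.
    # (This /proc entry only exists while br_netfilter is loaded.)
    with open("/proc/sys/net/bridge/bridge-nf-call-iptables", "w") as f:
        f.write("0\n")
    # Alternative: leave the sysctl alone and explicitly accept bridged
    # frames in the FORWARD chain instead, e.g. with
    #   iptables -I FORWARD -m physdev --physdev-is-bridged -j ACCEPT

if __name__ == "__main__":
    allow_bridged_ip_traffic()
```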
Now that I had a base to work from, the next thing to do was to see if there was a way I could reconfigure MacTCP and AIR from outside of the emulated Mac. In theory it should be possible since I have complete access to every file on the hard drive image and there should be a way to make changes to those files. But they are Mac files, which means resource forks. Is there a way to mount the hard drive image under Linux and reach the resource forks?
Being able to edit the resource forks of files inside the HFS file system turned out to be a little more cumbersome than it should have been. At one time Linux could mount an HFS file system and, through a little bit of odd path syntax, expose the resource forks and creator/type information. Apparently that ability had been dropped way back in the 2.6 kernel days, though.
More research turned up hfsutils, which can access the HFS file system directly. The hcopy command can copy a file out of the HFS file system, converting it to MacBinary and dropping it on the Linux file system. The resource fork was there, but in a form that was a little difficult to edit. I found another package called macutils with a command called macunpack which could split a MacBinary into its three pieces: data fork, resource fork, and Finder info.
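To make it concrete, here's roughly what that unpacking amounts to. This is just a sketch based on the MacBinary header layout, not the macutils source, and the file name is a placeholder.

```python
#!/usr/bin/env python3
# Split a MacBinary file (as produced by hcopy's -m mode) into its data
# fork, resource fork, and Finder type/creator, the same job macunpack does.
# Offsets follow the MacBinary II header layout; the file name is a
# placeholder.
import struct

def split_macbinary(path):
    with open(path, "rb") as f:
        blob = f.read()
    header = blob[:128]
    name_len = header[1]
    name = header[2:2 + name_len].decode("mac_roman")
    file_type = header[65:69]
    creator = header[69:73]
    data_len, rsrc_len = struct.unpack(">II", header[83:91])

    # Forks follow the header, each padded out to a 128-byte boundary.
    data_start = 128
    rsrc_start = data_start + ((data_len + 127) // 128) * 128

    data_fork = blob[data_start:data_start + data_len]
    rsrc_fork = blob[rsrc_start:rsrc_start + rsrc_len]
    return name, file_type, creator, data_fork, rsrc_fork

if __name__ == "__main__":
    name, ftype, creator, data, rsrc = split_macbinary("AIR Config.bin")
    print(name, ftype, creator, len(data), len(rsrc))
    with open(name + ".rsrc", "wb") as f:
        f.write(rsrc)
```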
Looking around more, I found a Python module called rsrcdump which was able to load in the resource fork file and convert it to JSON, allowing me to change options in configs within the Mac and then quickly see what parts of the resource fork changed. Another feature of rsrcdump is that it can be used in your own Python program and will unpack a resource fork into a collection that's easy to manipulate. After making changes to the collection it can be packed back up and written back out as a resource fork file. From there it is pretty easy to bundle all three forks back together into a MacBinary file and put that file back onto the HFS file system.
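For anyone curious what rsrcdump is actually picking apart, here's a rough sketch of walking a raw resource fork by hand with nothing but struct. This is not rsrcdump's API, just the classic Resource Manager fork layout, and the file name is a placeholder.

```python
#!/usr/bin/env python3
# Walk a raw resource fork and list each resource type, ID, and size; this
# is the structure a tool like rsrcdump turns into JSON. Layout is the
# classic Resource Manager fork format.
import struct

def list_resources(rsrc):
    data_off, map_off, data_len, map_len = struct.unpack(">IIII", rsrc[:16])
    rmap = rsrc[map_off:map_off + map_len]
    # Offsets to the type list and name list, relative to the map start.
    type_list_off, name_list_off = struct.unpack(">HH", rmap[24:28])
    type_list = rmap[type_list_off:]
    num_types = struct.unpack(">H", type_list[:2])[0] + 1

    resources = []
    for i in range(num_types):
        entry = type_list[2 + i * 8:2 + (i + 1) * 8]
        rtype, count_minus_1, ref_off = struct.unpack(">4sHH", entry)
        # Reference list offset is relative to the start of the type list.
        refs = type_list[ref_off:]
        for j in range(count_minus_1 + 1):
            ref = refs[j * 12:(j + 1) * 12]
            res_id, name_off = struct.unpack(">hH", ref[:4])
            # Low 24 bits: offset of this resource's data from the start of
            # the data area; the high byte holds attribute flags.
            attrs_and_off = struct.unpack(">I", ref[4:8])[0]
            res_data_off = attrs_and_off & 0x00FFFFFF
            res_len = struct.unpack(
                ">I", rsrc[data_off + res_data_off:data_off + res_data_off + 4])[0]
            resources.append((rtype.decode("mac_roman"), res_id, res_len))
    return resources

if __name__ == "__main__":
    with open("AIR Config.rsrc", "rb") as f:
        fork = f.read()
    for rtype, res_id, length in list_resources(fork):
        print(f"'{rtype}' ID {res_id}: {length} bytes")
```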
It didn't take too long to figure out which files to change and what resource IDs were involved by making changes to a setting, then dumping out the resource fork and comparing to the previous file. I was able to add command line flags to update the MacTCP config and set the IP address, netmask, gateway, and DNS servers. To create the AIR config I created a blank config file to use as a template and then added options for setting the zone name, zone number, and hosts.
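That dump-and-compare loop is easy to sketch: save the resource fork, change one setting on the Mac, dump it again, and list the byte ranges that differ. The helper below is an illustration of that workflow rather than my actual tool, and the file names are placeholders.

```python
#!/usr/bin/env python3
# List the byte ranges that differ between two dumps of the same resource
# fork (before and after changing one setting inside the emulated Mac).
# File names are placeholders.

def diff_ranges(old, new):
    """Yield (offset, old_bytes, new_bytes) for each run of differing bytes."""
    i, limit = 0, min(len(old), len(new))
    while i < limit:
        if old[i] == new[i]:
            i += 1
            continue
        start = i
        while i < limit and old[i] != new[i]:
            i += 1
        yield start, old[start:i], new[start:i]

if __name__ == "__main__":
    with open("MacTCP.before.rsrc", "rb") as f:
        before = f.read()
    with open("MacTCP.after.rsrc", "rb") as f:
        after = f.read()
    for offset, was, now in diff_ranges(before, after):
        print(f"0x{offset:06x}: {was.hex()} -> {now.hex()}")
```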
The challenge came when trying to get AIR to automatically load my config during boot. No matter what I did, AIR insisted the file didn't exist when it started up. Unsatisfied with the prospect of still needing to mess around inside the emulated Mac just to get one configuration setting working, I set about doing more investigation to find the root cause.
Instead of saving the full path to the config file, AIR saves the Catalog Node ID which identifies it. The Catalog Node ID, or CNID, is a unique 32-bit number which HFS assigns to every single file and directory. By using the CNID instead of a full path, an application can find a file even if the user moves it to another directory or renames it. I had verified through hfsutils that I was able to get the CNID of the AIR config file, and I was certain that I was setting the correct CNID in the Router resource. If I manually used the Set Startup menu in Router Manager and pointed it at the file, it worked perfectly! If I pulled my config file out after using Set Startup there were no changes to it. It was the same when I pulled out the Router extension and examined its resource fork: there were no differences between what I had put in the resource fork and what Router Manager had done. How was it possible for AIR to insist that the file didn't exist when I was updating the resource fork?
With no differences showing up in the files themselves, I tried making a snapshot of the hard drive after my program had configured it, and a snapshot of the hard drive after it had been "blessed" by Router Manager. Comparing the two hard drive images showed some differences, but what exactly were they? Using hls to do a recursive directory listing of the entire drive and comparing the listings between the two snapshots didn't show any differences either, other than a couple of timestamps. But which of the changed bytes were the important ones? I spent a day writing a program that would let me test the differences one byte at a time and narrow down what made AIR find the config file. It was tedious, and all I learned was that it had something to do with the HFS Catalog.
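The byte-at-a-time tester amounted to something like the sketch below: copy just the first few differing runs from the blessed snapshot onto a copy of the unblessed one, boot that copy, and see whether AIR still finds its config. The image names are placeholders, and this is an illustration of the approach rather than the program I actually wrote.

```python
#!/usr/bin/env python3
# Apply only the first `count` differing byte runs from the "blessed"
# snapshot to a copy of the unblessed one, then boot the result to see
# whether the change that matters has been copied yet.
import shutil

def diff_ranges(old, new):
    # Same idea as the resource-fork diff helper above.
    i, limit = 0, min(len(old), len(new))
    while i < limit:
        if old[i] == new[i]:
            i += 1
            continue
        start = i
        while i < limit and old[i] != new[i]:
            i += 1
        yield start, new[start:i]

def apply_partial_diff(unblessed, blessed, output, count):
    shutil.copyfile(unblessed, output)
    with open(unblessed, "rb") as f:
        old = f.read()
    with open(blessed, "rb") as f:
        new = f.read()
    with open(output, "r+b") as out:
        for n, (offset, replacement) in enumerate(diff_ranges(old, new)):
            if n >= count:
                break
            out.seek(offset)
            out.write(replacement)

if __name__ == "__main__":
    # Adjust `count` by hand (or binary search) and boot hd-test.img each time.
    apply_partial_diff("hd-unblessed.img", "hd-blessed.img",
                       "hd-test.img", count=4)
```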
Clearly the only path forward was to learn how the HFS Catalog is stored on disk, tear it down, and dump out all the information stored in it, including information that would never be presented to a user. That meant finding very old documentation and writing some tooling, which would mean several days of work. But being the stubborn person I am, I was not willing to let it go; I had to know exactly what Router Manager was doing to bless the file.
Apple used to have all the Macintosh Toolbox documentation on their site but it had been deleted a long time ago. Once upon a time I had done some Mac development and had a CD-ROM of Inside Macintosh around. After some trial and error I was able to get the CD-ROM to mount under Linux only to find that the documentation was in a file format that wasn't going to be easily readable on a modern operating system. Luckily archive.org came to the rescue and I was able to find the documentation I was looking for.
Slowly I started putting together a quick & dirty Python hack to read the B*-tree catalog structure. So many linked lists and keys and nodes and IDs and logical blocks and indexes and records and leaves and allocation blocks. It took quite a while just to figure out which sector on the disk was the starting point of the catalog. While guessing at which field pointed to what, I stumbled upon a B*-tree root node, but it never seemed to link to more than a few files. It turned out I had been misinterpreting some of the fields and that just happened to point me at some other B*-tree on the disk. Once I straightened that out I was at least pulling in the root node correctly.
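The starting point of that hack boils down to something like the sketch below: read the Master Directory Block to find where the catalog file begins, then read the catalog's B*-tree header node to find the root. The field offsets are the ones from Inside Macintosh: Files, the image name is a placeholder, and it assumes a bare HFS volume with no partition map in front of it.

```python
#!/usr/bin/env python3
# Find where the HFS catalog file starts on a disk image and read the
# catalog B*-tree's header node to locate the root node. Field offsets are
# from Inside Macintosh: Files.
import struct

SECTOR = 512

def read_catalog_info(image_path):
    with open(image_path, "rb") as f:
        disk = f.read()

    # The Master Directory Block lives in sector 2 of the volume.
    mdb = disk[2 * SECTOR:3 * SECTOR]
    sig, = struct.unpack(">H", mdb[0:2])
    assert sig == 0x4244, "not an HFS volume"        # 'BD' signature
    al_blk_size, = struct.unpack(">I", mdb[20:24])   # drAlBlkSiz, in bytes
    al_bl_st, = struct.unpack(">H", mdb[28:30])      # drAlBlSt, in sectors
    # drCTExtRec: the first extent of the catalog file (start block, count).
    cat_start_ab, cat_block_count = struct.unpack(">HH", mdb[150:154])

    cat_offset = al_bl_st * SECTOR + cat_start_ab * al_blk_size

    # Node 0 of the catalog B*-tree is the header node: a 14-byte node
    # descriptor followed by the header record.
    node0 = disk[cat_offset:cat_offset + SECTOR]
    fwd, back, ntype, height, nrecs = struct.unpack(">IIBBH", node0[:12])
    assert ntype == 1, "expected a header node"
    depth, root, leaf_recs, first_leaf, last_leaf, node_size = \
        struct.unpack(">HIIIIH", node0[14:34])
    return cat_offset, node_size, root, first_leaf

if __name__ == "__main__":
    cat_offset, node_size, root, first_leaf = read_catalog_info("hd.img")
    print(f"catalog at byte {cat_offset}, node size {node_size}, "
          f"root node {root}, first leaf node {first_leaf}")
```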
I kept picking away at things, but after the first day I still couldn't even walk the root directory and get a list of all the files I knew were in it. While trying to understand how the search keys worked, something that kept nagging at me was that there didn't seem to be a way to look up a file by CNID. All the catalog indexing I was seeing was based on the parent directory CNID and the name of the file, not the CNID of the file itself.
The next day I finally made some progress, no doubt because I was wearing the binary tree T-shirt my sister had given me! By the afternoon I was able to walk the entire catalog, moving from one index node to another and down to the leaf nodes, decoding the directory and file information. That was enough to print out all the internal B*-tree details for both the unblessed and blessed snapshots and do a diff. Suddenly I could see what the difference was. It was something hidden away that normal directory tools never show, something users would never even know about: the File Thread Record. This was the missing piece that enables finding a file by its CNID.
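Decoding a single leaf node looks roughly like the sketch below, and the thread records are what finally show up in it. Again the layouts are from Inside Macintosh: Files; the image name, catalog offset, and node number are placeholders rather than values from my actual disk.

```python
#!/usr/bin/env python3
# Decode the records in one catalog leaf node: catalog keys are
# (parent CNID, name), and each record is a directory, file, or thread
# record.
import struct

REC_TYPES = {1: "directory", 2: "file", 3: "dir thread", 4: "file thread"}

def dump_leaf_node(disk, cat_offset, node_num, node_size=512):
    start = cat_offset + node_num * node_size
    node = disk[start:start + node_size]
    nrecs, = struct.unpack(">H", node[10:12])
    for r in range(nrecs):
        # Offsets to each record are u16s packed at the end of the node,
        # stored in reverse order.
        rec_off, = struct.unpack(">H", node[node_size - 2 * (r + 1):
                                            node_size - 2 * r])
        key_len = node[rec_off]
        par_id, = struct.unpack(">I", node[rec_off + 2:rec_off + 6])
        name_len = node[rec_off + 6]
        name = node[rec_off + 7:rec_off + 7 + name_len].decode("mac_roman")
        # The data record begins after the key, padded to a word boundary;
        # its first byte is the record type.
        data_off = rec_off + 1 + key_len
        data_off += data_off & 1
        rec_type = node[data_off]
        line = f"key=({par_id}, {name!r}): {REC_TYPES.get(rec_type, rec_type)}"
        if rec_type in (3, 4):
            # Thread records map a CNID (the key's parent field) back to the
            # real parent directory and name stored in the record body.
            thd_par, = struct.unpack(">I", node[data_off + 10:data_off + 14])
            thd_len = node[data_off + 14]
            thd_name = node[data_off + 15:
                            data_off + 15 + thd_len].decode("mac_roman")
            line += f" for parent {thd_par}, name {thd_name!r}"
        print(line)

if __name__ == "__main__":
    with open("hd.img", "rb") as f:
        disk = f.read()
    # In practice cat_offset comes from the MDB and the node number from
    # walking down from the root; hard-coded here purely for illustration.
    dump_leaf_node(disk, cat_offset=0x12000, node_num=3)
```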
Knowing what the missing piece was, I went looking to see if there was a way to create a File Thread Record using existing HFS tools. Nope. Not only that, but most documentation that mentioned File Thread Records said that they weren't actually necessary!
At this point I was starting to feel a bit defeated. Yes, I could probably make something to create the necessary entry in the HFS catalog, but that would mean far, far more work and having to write some very complicated code for allocating and managing records, nodes, blocks, and sectors. There was too much risk of getting that wrong since, as I had already learned, the whole system was quite complex. I had also only scratched the surface of the complexity; there was a bunch of stuff about "overflow" that I hadn't needed to worry about in order to print the B*-tree.
After sleeping on it, I remembered that there was something that always had thread records: directories. Perhaps there was some way I could hack my hack so that it didn't have to do the heavy lifting of actually inserting new records into the HFS catalog? I played around with using hmkdir to make a directory, then modifying my little hack so I could find the associated Directory Thread Record and the sector it was in. With that information I pulled the disk image into a hex editor and changed the Directory Thread Record into a File Thread Record and changed its CNID to point to the config file. I then booted the emulated Mac up and it worked! AIR found the config file and loaded it and brought the routing online!
I quickly modified my HFS catalog dumping hack so that I could have it automatically do the same thing I had done manually. It worked perfectly! I could now alter the settings in the config file and have AIR find it and start it completely automatically. No need to ever connect to the screen of the emulated Mac to get AIR configured and running. It's still a horrible hack, but it works!
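The automated patch itself is tiny, something like the sketch below. The offsets and CNID are placeholders that in practice come from the catalog-dumping hack and hfsutils, and it only touches the two things described above, the record type byte and the CNID in the key; the record body, which holds the parent directory and name the lookup resolves to, is left alone here.

```python
#!/usr/bin/env python3
# Turn the thread record that hmkdir created for a dummy directory into a
# File Thread Record for the config file, in place on the disk image.
# rec_abs is the absolute byte offset of the record's key within the image
# (found by walking the catalog as above); values here are placeholders.
import struct

def patch_thread_record(image_path, rec_abs, config_cnid):
    with open(image_path, "r+b") as img:
        img.seek(rec_abs)
        key_len = img.read(1)[0]

        # The thread record's key holds the CNID it answers lookups for;
        # point it at the config file instead of the dummy directory.
        img.seek(rec_abs + 2)
        img.write(struct.pack(">I", config_cnid))

        # The record body starts after the key, padded to a word boundary.
        # Its first byte is the record type: 3 = directory thread,
        # 4 = file thread.
        data_abs = rec_abs + 1 + key_len
        data_abs += data_abs & 1
        img.seek(data_abs)
        img.write(bytes([4]))

if __name__ == "__main__":
    patch_thread_record("hd.img", rec_abs=0x12340, config_cnid=4242)
```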
If you're looking for an easier way to join GlobalTalk, hopefully this will get you started. I've put everything up here: https://github.com/FozzTexx/globaltalk