Do You Object?
By Peter Naulls. Published: 13th Dec 2004, 21:34:52
Looking at a new object and executable file format for RISC OS
-- "The Queen can go suck an elf" - The 10th Kingdom
RISC OS and RISC OS developers make use of various file formats for developing programs. It is now over a decade since Acorn drew up the specifications for these formats, and they have become decidedly dated. Times have changed, and RISC OS users and developers deserve a more useful format. This article looks at the existing formats in use, and puts forward the case for moving to a new one. We start by looking at object files, the building blocks of programs.
An object file is an intermediate binary file generated from a human-readable form: usually C, C++ or assembler (although any language can in principle produce object files). Inside an object file is the actual code for program functions (ARM code in the case of RISC OS), along with information about the other functions it refers to and how to put it together with other object files.
Ultimately, object files are put together with library files (which are themselves just collections of object files), usually including a C library, to produce a final working program, or executable.
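To make the idea of putting objects and libraries together a little more concrete, here is a toy model in Python. It is only a sketch: the `ObjectFile` class, the `link` function and the file and symbol names are all invented for illustration, and real linkers also have to relocate code and lay out memory. It does show the basic rule that every listed object goes into the program, while a library member is only pulled in if it defines something that is still missing.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectFile:
    """A toy object file: its name, the symbols it defines, the symbols it needs."""
    name: str
    defines: set = field(default_factory=set)
    needs: set = field(default_factory=set)


def link(objects, library):
    """Return the names of the object files that end up in the program.

    Every explicitly listed object is included. Library members are added
    only while they define a symbol that is still unresolved.
    """
    included = list(objects)
    defined, undefined = set(), set()
    for obj in included:
        defined |= obj.defines
        undefined |= obj.needs
    undefined -= defined

    progress = True
    while undefined and progress:
        progress = False
        for member in library:
            if member in included:
                continue
            if member.defines & undefined:
                included.append(member)
                defined |= member.defines
                undefined |= member.needs
                undefined -= defined
                progress = True

    if undefined:
        raise LookupError("undefined symbols: " + ", ".join(sorted(undefined)))
    return [obj.name for obj in included]


# Hypothetical program: two objects of our own plus a tiny "C library".
main_o = ObjectFile("main.o", {"main"}, {"printf", "helper"})
helper_o = ObjectFile("helper.o", {"helper"}, {"strlen"})
libc = [
    ObjectFile("printf.o", {"printf"}, {"write"}),
    ObjectFile("strlen.o", {"strlen"}),
    ObjectFile("write.o", {"write"}),
    ObjectFile("qsort.o", {"qsort"}),
]

program = link([main_o, helper_o], libc)
print(program)
```

Note that `qsort.o` never makes it into the program, because nothing asked for `qsort`.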
Although important to C programmers, none of this is terribly exciting by itself, but as we'll see shortly, the exact choice of format for the object files can be important.
Under RISC OS, object files are in a format known as AOF, or Acorn Object Format. We also see ALF, or Acorn Library Format, which are collections of AOFs. Finally, most non-Basic RISC OS programs are AIF, or Acorn Image Format - these are files with RISC OS filetype &ff8. There are other program filetypes too - mostly modules and the RISC OS Utility formats, but we'll come to those later.
Under Unix, traditional object and executable formats include COFF and a.out, and some systems still use those, although the vast majority, including Linux, now use ELF (Executable and Linking Format). I won't go too much into the specifics of ELF; the main point here is its wide acceptance, and that it has learnt from the issues experienced with previous formats. You can read more about it in the description of the ELF binary format linked at the end of this article.
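For the curious, the very start of an ELF file is simple enough to pick apart by hand. The following Python sketch reads the identification bytes and the first two header fields. The function name `describe_elf` and the sample header are my own inventions for illustration; the byte offsets and values are the standard ones from the ELF specification (magic `0x7F 'E' 'L' 'F'`, class and data encoding at bytes 4 and 5, `e_type` and `e_machine` at offsets 16 and 18, with machine 40 meaning ARM).

```python
import struct

ELF_MAGIC = b"\x7fELF"
ELF_CLASSES = {1: "32-bit", 2: "64-bit"}
ELF_BYTE_ORDERS = {1: ("little-endian", "<"), 2: ("big-endian", ">")}
ELF_TYPES = {1: "relocatable object", 2: "executable", 3: "shared object", 4: "core dump"}
EM_ARM = 40


def describe_elf(header: bytes):
    """Return a small description of an ELF header, or None if it isn't ELF."""
    if len(header) < 20 or header[:4] != ELF_MAGIC:
        return None
    ei_class, ei_data = header[4], header[5]
    if ei_class not in ELF_CLASSES or ei_data not in ELF_BYTE_ORDERS:
        return None
    order_name, order_code = ELF_BYTE_ORDERS[ei_data]
    # e_type and e_machine are two 16-bit fields straight after the 16 ident bytes.
    e_type, e_machine = struct.unpack_from(order_code + "HH", header, 16)
    return {
        "class": ELF_CLASSES[ei_class],
        "byte_order": order_name,
        "type": ELF_TYPES.get(e_type, "other"),
        "is_arm": e_machine == EM_ARM,
    }


# A made-up header for a 32-bit little-endian ARM object file.
sample = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9) + struct.pack("<HH", 1, EM_ARM)
info = describe_elf(sample)
print(info)
```

Anything that doesn't start with the magic bytes simply comes back as `None`, which is about as much format detection as most tools need before handing the file to something cleverer.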
AOF vs ELF
An obvious issue with AOF is that it's clearly a minority format. Not exactly a novel situation for RISC OS, but hardly a particularly useful one. Compared to ELF, AOF has poor (in practice, no) support for shared libraries. ELF's handling of weak symbols (symbols which are optionally defined) is much better, and AOF's handling of some types of constructs required for C++ is poor in a variety of obscure ways.
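If weak symbols are new to you, the rules an ELF linker follows can be modelled in a few lines. This Python sketch is a toy, not linker code: `resolve_definitions`, `resolve_reference` and the object names are hypothetical. It encodes three rules: a global (strong) definition overrides a weak one, two global definitions of the same symbol are an error, and a weak reference to something nobody defines quietly resolves to nothing instead of failing the link.

```python
def resolve_definitions(definitions):
    """Pick one owner per symbol from (symbol, binding, owner) tuples.

    binding is "global" or "weak". The first weak definition is kept
    unless a global definition of the same symbol turns up.
    """
    chosen = {}
    for symbol, binding, owner in definitions:
        current = chosen.get(symbol)
        if current is None:
            chosen[symbol] = (binding, owner)
        elif binding == "global" and current[0] == "global":
            raise ValueError("duplicate definition of " + symbol)
        elif binding == "global" and current[0] == "weak":
            chosen[symbol] = (binding, owner)  # strong beats weak
        # otherwise a weak definition arrived late and is ignored
    return {symbol: owner for symbol, (_binding, owner) in chosen.items()}


def resolve_reference(symbol, binding, table):
    """Resolve a reference; an undefined weak reference yields None."""
    if symbol in table:
        return table[symbol]
    if binding == "weak":
        return None
    raise LookupError("undefined symbol " + symbol)


# Hypothetical inputs: a library offers a weak malloc, the program overrides it.
table = resolve_definitions([
    ("malloc", "weak", "libc.o"),
    ("malloc", "global", "mymalloc.o"),
    ("main", "global", "main.o"),
])
print(table)
print(resolve_reference("optional_hook", "weak", table))
```

The second `print` is the "optionally defined" case: the program can test for the hook at run time and carry on without it.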
If that were all, then probably we wouldn't be too concerned. But the larger issue is that turning assembly code (whether output by RISC OS compilers or hand-written) into programs requires a variety of tools. All these tools are RISC OS specific, since they have to deal with AOF, which is also RISC OS specific. These tools include the assembler (which turns assembler into AOF), the linker (which combines library and object files to produce programs), libfile (which creates libraries) and other sundries.
Maintaining these tools as part of GCCSDK development is extra work for a very limited number of developers. For dealing with ELF, there are equivalent programs - this package is called 'binutils'. Clearly, the solution is to move RISC OS development to binutils and ELF.
Moving to ELF
The suggestion that moving to ELF for RISC OS was a good idea was made more than a year ago, but efforts have been focused on other projects which have demanded more immediate attention. Despite that, there has already been quite a bit of activity with ELF on RISC OS:
- 'link' in the Castle C/C++ tools knows about ELF, as people are just a bit too quick to mention when the topic comes up. Although indeed, 'link' does usefully support ELF, this is the only tool in the suite that does so, and there's no sensible way to directly create ELF object files. But for the purposes of testing, link can input ELF object files and output static ELF binaries.
- John-Mark Bell's ELFLoader module - this is a key component, because it lets us take the all-important step of running ELF binaries. Although there are 3rd party utilities that do the following for AIF, one important point here is that ELFLoader knows precisely how much memory an ELF program will initially need. This means that explicit WimpSlot setting is no longer needed. ELFLoader can only deal with static binaries, however. We'll come to the issue of dynamic binaries shortly.
- Some time ago, I produced a full ELF toolchain for RISC OS - that is, GCC plus a binutils port to demonstrate that the concept was possible. You can download this from the GCCSDK site. I don't recommend you use this compiler for any serious work because it's an old version of GCC and Unixlib, but it may be interesting if you wish to experiment.
- ELF support in 'as', the assembler bundled with GCC. Around the same time, ELF output support was added to the GCCSDK assembler. The reason for this might not be immediately obvious - after all, the assembler in binutils already knows how to generate ELF (and is superior for a variety of other reasons), but one fly in the ointment is that the assembler format used by RISC OS tools is the traditional ARM format which is not 100% compatible with the 'gas' (GNU assembler) format used by binutils. In particular, Unixlib contains quite an amount of assembler of this format. GCC itself outputs assembler, although for an ELF setup, it would be configured to output 'gas' format.
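On the point about memory: the reason a loader can work out how much memory a static ELF program initially needs is that the file's program headers list every loadable segment with its address and in-memory size. The Python sketch below shows one way such a calculation could be done for a little-endian 32-bit image. I'm not claiming this is how ELFLoader does it; `initial_memory_span` is a hypothetical name, the sample image is made up, and heap and stack are not accounted for.

```python
import struct

PT_LOAD = 1
# Standard ELF32 layouts: 52-byte file header, 32-byte program header.
ELF32_HEADER = struct.Struct("<16sHHIIIIIHHHHHH")
ELF32_PHDR = struct.Struct("<IIIIIIII")


def initial_memory_span(image: bytes):
    """Return (lowest address, bytes needed) covering all PT_LOAD segments."""
    fields = ELF32_HEADER.unpack_from(image, 0)
    ident = fields[0]
    if ident[:4] != b"\x7fELF" or ident[4] != 1 or ident[5] != 1:
        raise ValueError("not a little-endian 32-bit ELF image")
    e_phoff, e_phentsize, e_phnum = fields[5], fields[9], fields[10]

    lowest, highest = None, 0
    for index in range(e_phnum):
        (p_type, _offset, p_vaddr, _paddr,
         _filesz, p_memsz, _flags, _align) = ELF32_PHDR.unpack_from(
            image, e_phoff + index * e_phentsize)
        if p_type != PT_LOAD:
            continue
        lowest = p_vaddr if lowest is None else min(lowest, p_vaddr)
        # p_memsz can exceed the size in the file; the extra is zero-filled data.
        highest = max(highest, p_vaddr + p_memsz)
    if lowest is None:
        raise ValueError("no loadable segments")
    return lowest, highest - lowest


# A made-up image: code at &8000, then data whose memory size exceeds its file size.
ident = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9)
header = ELF32_HEADER.pack(ident, 2, 40, 1, 0x8000, 52, 0, 0, 52, 32, 2, 0, 0, 0)
code = ELF32_PHDR.pack(PT_LOAD, 0x0000, 0x8000, 0x8000, 0x1200, 0x1200, 5, 0x1000)
data = ELF32_PHDR.pack(PT_LOAD, 0x1200, 0xA000, 0xA000, 0x0100, 0x0400, 6, 0x1000)
base, size = initial_memory_span(header + code + data)
print(hex(base), hex(size))
```

With AIF, by contrast, the user or a third-party utility typically has to supply a suitable WimpSlot figure up front.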
So, if ELF is so great, why aren't we using it already? Well, it's probably true that I could recompile all my code to ELF, and start distributing ELF programs and it would be just fine. The issue is that immediately discarding AOF isn't a terribly helpful thing to do. binutils doesn't know about AOF, so it's a repeat of the 32-bit situation, where things have to be recompiled or they can't be used under the new system.
There are plenty of examples where it's impractical or not sensible to recompile everything in one go, and some things - e.g., libraries from ROL and Castle - don't have source available to the unwashed. In any case, for most RISC OS usage, the AOF format doesn't have any intrinsic problems and contains perfectly good code, and there's no reason to throw the format away out of hand.
Interworking ELF and AOF
As demonstrated by 'link', it is perfectly possible to produce binaries from both ELF and AOF objects. Of course, 'link' isn't freely available and adding ELF support to the RISC OS GCC linker 'drlink' would involve a very considerable amount of rewriting. Besides, maintaining our own tools is something we want to avoid or minimise.
The solution is the 'bfd' library (binary file descriptor). This library is used by the tools in binutils, and is responsible for understanding a variety of object formats, including the ELF, COFF and a.out formats previously mentioned, as well as plenty of other obscure ones. Adding AOF support to this is the obvious step. This isn't entirely straightforward, but is certainly possible given some effort. As of writing, I have already made significant progress on this.
Adding AOF support to the bfd library has some interesting side effects. Apart from the whole issue of making ELF sensible on RISC OS, some or all of the following are possible with sufficiently complete support:
- objdump, nm, strip - tools from binutils for examining object files will work on AOF files (as well as ELF)
- It might be possible to run GDB (GNU debugger) because GDB uses bfd to understand files it loads.
- 'ld' (the binutils linker) will work transparently on ELF and AOF. Indeed, it certainly wouldn't care if it was used just for AOF if that's what you were dealing with. 'ld' also has rather more features than drlink.
- 'gas' can produce AOF files, although that's perhaps of more limited use.
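The first thing a format-neutral library has to do is work out what kind of file it has been handed. As a rough illustration of the idea (and emphatically not bfd's actual code), here is a Python sketch that tells ELF from AOF and ALF. It relies on assumptions you should check against Acorn's documentation: that AOF and ALF files are chunk files beginning with the word `0xC3CBC6C5`, followed by two count words and then 16-byte chunk entries whose 8-byte names start with `OBJ_` for objects and `LIB_` for libraries. The function name `sniff_format` is hypothetical.

```python
import struct

# Assumption: chunk-file marker as given in Acorn's AOF/ALF documentation.
CHUNK_FILE_ID = 0xC3CBC6C5
CHUNK_HEADER_SIZE = 12   # assumed: file id, max chunks, chunks in use
CHUNK_ENTRY_SIZE = 16    # assumed: 8-byte name, file offset, size


def sniff_format(data: bytes) -> str:
    """Guess whether data is ELF, AOF, ALF or something else."""
    if data[:4] == b"\x7fELF":
        return "ELF"
    if len(data) >= CHUNK_HEADER_SIZE:
        file_id, max_chunks, _in_use = struct.unpack_from("<III", data, 0)
        if file_id == CHUNK_FILE_ID:
            # Never trust the count beyond what the data can actually hold.
            available = (len(data) - CHUNK_HEADER_SIZE) // CHUNK_ENTRY_SIZE
            for index in range(min(max_chunks, available)):
                start = CHUNK_HEADER_SIZE + index * CHUNK_ENTRY_SIZE
                name = data[start:start + 8]
                if name.startswith(b"OBJ_"):
                    return "AOF"
                if name.startswith(b"LIB_"):
                    return "ALF"
            return "chunk file"
    return "unknown"


# Made-up minimal headers, just enough to exercise each branch.
fake_aof = struct.pack("<III", CHUNK_FILE_ID, 1, 1) + b"OBJ_HEAD" + struct.pack("<II", 28, 0)
fake_alf = struct.pack("<III", CHUNK_FILE_ID, 1, 1) + b"LIB_DIRY" + struct.pack("<II", 28, 0)
fake_elf = b"\x7fELF" + bytes(12)
print(sniff_format(fake_aof), sniff_format(fake_alf), sniff_format(fake_elf))
```

Once the format is known, the rest of the tool (a linker, say, or objdump) can work in terms of sections and symbols without caring which format they came from, which is exactly what makes mixing ELF and AOF objects in one link plausible.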
Dynamic Linking and Shared Libraries
Most of what we've written above relates to static linking - that is, an executable contains all the code that it requires to run (apart from OS calls). We previously did a whole article on this.
As I suggested earlier, AOF isn't terribly suited to dynamic linking, but ELF is, and 'ld' knows precisely what to do to generate dynamically linked executables. This doesn't immediately give us shared libraries on RISC OS, but going to ELF is certainly at least half the problem. To complete the picture, ELFLoader, and probably Unixlib, need to be extended to form a full shared library manager. This part is still an unanswered question, but hopefully won't remain so for long.
Other RISC OS Formats
ELF executables perhaps don't lend themselves well to being RISC OS modules (although that could be proved wrong). Of course, GCC for RISC OS has never been capable of producing modules anyway, so it's slightly academic; although some of the issues in producing shared libraries are the same as producing modules, so the solution might end up being to ask 'ld' to do something slightly different when producing the binary.
The current plan for GCC and GCCSDK is to do a full release very shortly. There are a variety of fixes we would like to make available, two of which are yet more issues with the linker, and this should be the last AOF-oriented release. GCC 4.0 is also being looked at (the current version is a pre-release of 3.4.4). Given this, and allowing for testing of my bfd changes, there should be some programs for testing in January. See you then.
Description of the ELF binary format
GCCSDK and GCC for RISC OS