Drobe logo

Do You Object?


Published on 13th Dec 2004, 21:34:52, source is drobe.co.uk
By Peter Naulls

Looking at a new object and executable file format for RISC OS


-- "The Queen can go suck an elf" - The 10th Kingdom

AOF vs. ELF motifIntroduction
RISC OS and RISC OS developers make use of various file formats for developing programs. It is now over a decade since Acorn drew up the specifications for these formats, and they have become decidedly dated. Times have changed, and RISC OS users and developers deserve a more useful format. This article looks at the existing formats in use, and puts forward the case for moving to a new one. We start by looking at object files, the building blocks of programs.

Object Files
An object file is an intermediate binary file generated from a human-readable form: usually, C, C++ or assembler (although any language can in principle produce object files). Inside an object file is the actual code for program functions (ARM code in the case of RISC OS), information about other functions and how to put it together with other object files.

Ultimately, object files are put together with library files (which are just collections of object files themselves), usually including a C library to produce a final working program or executable.

Although important to C programmers, none of this is terribly exciting by itself, but as we'll see shortly, the exact choice of format for the object files can be important.

File Types
Under RISC OS, object files are in a format known as AOF, or Acorn Object Format. We also see ALF, or Acorn Library Format, which are collections of AOFs. Finally, most non-Basic RISC OS programs are AIF, or Acorn Image Format - these are files with RISC OS filetype &ff8. There are other program filetypes too - mostly modules and the RISC OS Utility formats, but we'll come to those later.

Under Unix, traditional object and executable formats include COFF and a.out formats, and some systems still use those, although the vast majority, including Linux, now use ELF (Executable and Linking Format). I won't go too much into the specifics of ELF, except that the main point here is its wide acceptance, having learnt from the experience of issues with previous formats. You can read more about it here.

AOF vs ELF
An obvious issue with AOF is that it's clearly a minority format. Not exactly a novel situation for RISC OS, but hardly a particularly useful one. Compared to ELF, AOF has poor (in practice, none) support for shared libraries, ELF's handling of weak symbols (symbols which are optionally defined) is much better, and AOF's handling of some types of constructs required for C++ is poor in a variety of obscure ways.

If that were all, then probably we wouldn't be too concerned. But the larger issue is that to turn assembly code into programs (output by RISC OS compilers and from hand written assembler) requires a variety of tools. All these tools are RISC OS specific, since they have to deal with AOF, which is also RISC OS specific. These tools include the assembler (which turns assembler into AOF), the linker (which combines library and object files to product programs), libfile (which creates libraries) and other sundries.

Maintaining these tools as part of GCCSDK development is extra work for a very limited number of developers. For dealing with ELF, there are equivalent programs - this package is called 'binutils'. Clearly, the solution is to move RISC OS development to binutils and ELF.

Moving to ELF
The suggestion that moving to ELF for RISC OS was a good idea was made more than a year ago, but efforts have been focused on other projects which have demanded more immediate attention. Despite that, there has already been quite a bit of activity with ELF on RISC OS:



ELF already?
So, if ELF is so great, why aren't we using it already? Well, it's probably true that I could recompile all my code to ELF, and start distributing ELF programs and it would be just fine. The issue is that immediately discarding AOF isn't a terribly helpful thing to do. binutils doesn't know about AOF, so it's a repeat of the 32-bit situation, where things have to be recompiled or they can't be used under the new system.

There's plenty of examples where it's impractical or not sensible to recompile everything in one go, and some things - e.g., libraries from ROL and Castle - don't have source available to the unwashed. In any case, for most RISC OS usage, the AOF format doesn't have any intrinsic problems and contains perfectly good code, and there's no reason to throw the format away out of hand.

Interworking ELF and AOF
As demonstrated by 'link', it is perfectly possible to product binaries from both ELF and AOF objects. Of course, 'link' isn't freely available and adding ELF support to the RISC OS GCC linker 'drlink' would involve a very considerable amount of rewriting. Besides, maintaining our own tools is something we want to avoid or minimise.

The solution is the 'bfd' library (binary file descriptor). This library is used by the tools in binutils, and is responsible for understanding a variety of object formats, including the ELF, COFF and a.out formats previously mentioned, as well as plenty of other obscure ones. Adding AOF support to this is the obvious step. This isn't entirely a straightforward, but is certainly possible given some effort. As of writing, I have already made significant progress on this.

Adding AOF support to the bfd library has some interesting side effects. Apart from the whole issue of making ELF sensible on RISC OS, some or all of the following are possible with sufficiently complete support:


Dynamic Linking and Shared Libraries
Most of what we've written above relates to static linking - that is, an executable contains all the code that it requires to run (apart from OS calls). We previously did a whole article on this, which you can read.

As I suggested earlier, AOF isn't terribly suited to dynamic linking, but ELF is, and 'ld' knows precisely what to do to generate dynamically linked executables. This doesn't immediately give us shared libraries on RISC OS, but going to ELF is certainly at least half the problem. To complete the picture, ELFLoader, and probably Unixlib need to be extended to be a full shared library manager. This part is still an unanswered question, but hopefully won't remain so for long.

Other RISC OS Formats
ELF executables perhaps don't lend themselves well to being RISC OS modules (although that could be proved wrong). Of course, GCC for RISC OS has never been capable of producing modules anyway, so it's slightly academic; although some of the issues in producing shared libraries are the same as producing modules, so the solution might end up being to ask 'ld' to do something slightly different when producing the binary.

What now?

The current plan for GCC and GCCSDK is to do a full release very shortly - there are a variety of fixes we would like to make available - two of these are yet more issues with the linker - this should be the last AOF oriented release. GCC 4.0 is also being looked at (the current version is a pre-release of 3.4.4). Given this, and allowing for testing of my bfd changes, there should be some programs for testing in January. See you then.

Links
Description of the ELF binary format
GCCSDK and GCC for RISC OS
GNU binutils
ELFLoader

Related articles
How productive are you on RISC OS?
Tell UPP what games you want for Christmas
"Thank you Roy Heslop"

This article has been linked to, or is available in the following formats:  
 
 
 
 
 
[Printable] [Digg this] [Blog search]


Design and concept (c) Fudgecake Design, 1999 - 2001. Content (c) The Drobe Team, 1999 - 2006. See www.drobe.co.uk for more information. For non-commercial personal use only.