Drobe :: The archives

Do You Object?

By Peter Naulls. Published: 13th Dec 2004, 21:34:52

Looking at a new object and executable file format for RISC OS


-- "The Queen can go suck an elf" - The 10th Kingdom

Introduction
RISC OS and RISC OS developers make use of various file formats for developing programs. It is now over a decade since Acorn drew up the specifications for these formats, and they have become decidedly dated. Times have changed, and RISC OS users and developers deserve a more useful format. This article looks at the existing formats in use, and puts forward the case for moving to a new one. We start by looking at object files, the building blocks of programs.

Object Files
An object file is an intermediate binary file generated from a human-readable form: usually C, C++ or assembler (although any language can in principle produce object files). Inside an object file is the actual code for program functions (ARM code in the case of RISC OS), along with information about the other functions it refers to and how to put it together with other object files.

Ultimately, object files are put together with library files (which are just collections of object files themselves), usually including a C library, to produce a final working program or executable.
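
As a rough illustration of the linking step just described, here is a toy model in Python. It is not how drlink, 'link' or 'ld' are actually implemented; the `ObjectFile` class and the `link` function are hypothetical names invented for this sketch, and it tracks symbol names only, not code. It shows the basic bookkeeping: each object defines some symbols and refers to others, and a library member is pulled in only when it satisfies an outstanding reference.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectFile:
    """Toy stand-in for an object file: symbol names only, no code."""
    name: str
    defines: set = field(default_factory=set)
    needs: set = field(default_factory=set)


def link(objects, library):
    """Return the names of the objects that end up in the program.

    `objects` are always included. `library` members are included only
    if they define a symbol that is still unresolved.
    """
    included = list(objects)
    defined = set()
    undefined = set()
    for obj in included:
        defined |= obj.defines
        undefined |= obj.needs
    undefined -= defined

    progress = True
    while undefined and progress:
        progress = False
        for member in library:
            if member in included:
                continue
            if member.defines & undefined:
                # This member satisfies a reference, so pull it in and
                # take on whatever it needs in turn.
                included.append(member)
                defined |= member.defines
                undefined |= member.needs
                undefined -= defined
                progress = True

    if undefined:
        raise ValueError("undefined symbols: " + ", ".join(sorted(undefined)))
    return [obj.name for obj in included]
```

In this model, a library member that nothing refers to is simply left out of the final program, which is the main practical difference between handing the linker a library and handing it a list of object files.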

Although important to C programmers, none of this is terribly exciting by itself, but as we'll see shortly, the exact choice of format for the object files can be important.

File Types
Under RISC OS, object files are in a format known as AOF, or Acorn Object Format. We also see ALF, or Acorn Library Format, which are collections of AOFs. Finally, most non-Basic RISC OS programs are AIF, or Acorn Image Format - these are files with RISC OS filetype &ff8. There are other program filetypes too - mostly modules and the RISC OS Utility formats, but we'll come to those later.

Under Unix, traditional object and executable formats include COFF and a.out, and some systems still use those, although the vast majority, including Linux, now use ELF (Executable and Linking Format). I won't go too much into the specifics of ELF, except to say that the main point here is its wide acceptance, its design having learnt from the problems of previous formats. You can read more about it in the description of the ELF binary format listed under Links.
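
For readers who want something concrete, the following Python sketch reads the first few fields of an ELF header. The field offsets and constant values are taken from the generic ELF specification; the function name `describe_elf` and the wording of the returned summary are made up for this illustration, and nothing here is specific to RISC OS.

```python
import struct

ELF_MAGIC = b"\x7fELF"
CLASSES = {1: "32-bit", 2: "64-bit"}                # e_ident[4]
ENCODINGS = {1: "little-endian", 2: "big-endian"}   # e_ident[5]
TYPES = {1: "relocatable object", 2: "executable", 3: "shared object"}
EM_ARM = 40  # e_machine value for ARM


def describe_elf(data):
    """Summarise the first 20 bytes of an ELF file, or return None."""
    if len(data) < 20 or data[:4] != ELF_MAGIC:
        return None
    ei_class, ei_data = data[4], data[5]
    if ei_data not in ENCODINGS:
        return None
    byte_order = "<" if ei_data == 1 else ">"
    # e_type and e_machine are 16-bit fields straight after e_ident.
    e_type, e_machine = struct.unpack_from(byte_order + "HH", data, 16)
    return {
        "class": CLASSES.get(ei_class, "unknown"),
        "encoding": ENCODINGS[ei_data],
        "type": TYPES.get(e_type, "other"),
        "is_arm": e_machine == EM_ARM,
    }
```

Note that the same header layout serves object files, executables and shared objects alike; only the type field differs.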

AOF vs ELF
An obvious issue with AOF is that it's clearly a minority format. Not exactly a novel situation for RISC OS, but hardly a particularly useful one. Compared to ELF, AOF has poor (in practice, no) support for shared libraries; ELF's handling of weak symbols (symbols which are optionally defined) is much better; and AOF's handling of some types of constructs required for C++ is poor in a variety of obscure ways.
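
To make the weak symbol point a little more concrete, here is a toy Python model of the usual ELF rules: a strong definition overrides a weak one, two strong definitions clash, and a weak reference that nothing defines quietly resolves to zero. The `resolve` function and its tuple-based inputs are hypothetical, invented for this sketch; a real linker works on symbol tables, not Python lists.

```python
def resolve(definitions, references):
    """Resolve references against definitions using weak/strong rules.

    definitions: list of (name, address, binding) tuples
    references:  list of (name, binding) tuples
    binding is either "strong" or "weak" in both cases.
    """
    table = {}
    for name, address, binding in definitions:
        current = table.get(name)
        if current is None:
            table[name] = (address, binding)
        elif binding == "strong" and current[1] == "strong":
            raise ValueError("duplicate definition of " + name)
        elif binding == "strong":
            table[name] = (address, binding)  # strong replaces weak
        # Otherwise this is a weak definition of a symbol we already
        # have, and it never replaces the existing one.

    resolved = {}
    for name, binding in references:
        if name in table:
            resolved[name] = table[name][0]
        elif binding == "weak":
            resolved[name] = 0  # optional symbol that is simply absent
        else:
            raise ValueError("undefined symbol " + name)
    return resolved
```

This is what lets a library supply a default implementation that a program can override, or lets code test at run time whether an optional function was linked in.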

If that were all, then probably we wouldn't be too concerned. But the larger issue is that turning assembly code (whether output by RISC OS compilers or written by hand) into programs requires a variety of tools. All these tools are RISC OS specific, since they have to deal with AOF, which is also RISC OS specific. These tools include the assembler (which turns assembler into AOF), the linker (which combines library and object files to produce programs), libfile (which creates libraries) and other sundries.

Maintaining these tools as part of GCCSDK development is extra work for a very limited number of developers. For dealing with ELF, there are equivalent programs - this package is called 'binutils'. Clearly, the solution is to move RISC OS development to binutils and ELF.

Moving to ELF
The suggestion that moving to ELF for RISC OS was a good idea was made more than a year ago, but efforts have been focused on other projects which have demanded more immediate attention. Despite that, there has already been quite a bit of activity with ELF on RISC OS:
  • 'link' in the Castle C/C++ tools knows about ELF, as people are just a bit too quick to mention when the topic comes up. Although indeed, 'link' does usefully support ELF, this is the only tool in the suite that does so, and there's no sensible way to directly create ELF object files. But for the purposes of testing, link can input ELF object files and output static ELF binaries.

  • John-Mark Bell's ELFLoader module - this is a key component, because it lets us take the all-important step of running ELF binaries. Although there are 3rd party utilities that do the following for AIF, one important point here is that ELFLoader knows precisely how much memory an ELF program will initially need. This means that explicit WimpSlot setting is no longer needed. ELFLoader can only deal with static binaries, however. We'll come to the issue of dynamic binaries shortly.

  • Some time ago, I produced a full ELF toolchain for RISC OS - that is, GCC plus a binutils port to demonstrate that the concept was possible. You can download this from the GCCSDK site. I don't recommend you use this compiler for any serious work because it's an old version of GCC and Unixlib, but it may be interesting if you wish to experiment.

  • ELF support in 'as', the assembler bundled with GCC. Around the same time, ELF output support was added to the GCCSDK assembler. The reason for this might not be immediately obvious - after all, the assembler in binutils already knows how to generate ELF (and is superior for a variety of other reasons), but one fly in the ointment is that the assembler format used by RISC OS tools is the traditional ARM format which is not 100% compatible with the 'gas' (GNU assembler) format used by binutils. In particular, Unixlib contains quite an amount of assembler of this format. GCC itself outputs assembler, although for an ELF setup, it would be configured to output 'gas' format.
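
The point above about a loader knowing how much memory a program needs can be sketched as follows. This is not ELFLoader's actual code, which is not shown in this article; `memory_needed` is a hypothetical function that simply walks the program headers of a 32-bit little-endian ELF image, using offsets from the generic ELF specification, and measures the span of the loadable segments.

```python
import struct

PT_LOAD = 1  # program header type for a loadable segment


def memory_needed(image):
    """Return (lowest address, bytes spanned) for a 32-bit LE ELF image."""
    if image[:4] != b"\x7fELF" or image[4] != 1 or image[5] != 1:
        raise ValueError("not a 32-bit little-endian ELF image")
    # In the 32-bit header, e_phoff is at offset 28, and e_phentsize
    # and e_phnum are at offsets 42 and 44.
    (e_phoff,) = struct.unpack_from("<I", image, 28)
    e_phentsize, e_phnum = struct.unpack_from("<HH", image, 42)

    low, high = None, 0
    for index in range(e_phnum):
        base = e_phoff + index * e_phentsize
        p_type, _offset, p_vaddr, _paddr, _filesz, p_memsz = struct.unpack_from(
            "<6I", image, base
        )
        if p_type != PT_LOAD:
            continue
        # p_memsz, unlike p_filesz, includes zero-initialised data.
        low = p_vaddr if low is None else min(low, p_vaddr)
        high = max(high, p_vaddr + p_memsz)
    if low is None:
        raise ValueError("no loadable segments")
    return low, high - low
```

The figure this produces covers code and static data only. Any allowance for heap and stack would be a matter of loader or run-time library policy, which this sketch does not attempt to model.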


ELF already?
So, if ELF is so great, why aren't we using it already? Well, it's probably true that I could recompile all my code to ELF, and start distributing ELF programs and it would be just fine. The issue is that immediately discarding AOF isn't a terribly helpful thing to do. binutils doesn't know about AOF, so it's a repeat of the 32-bit situation, where things have to be recompiled or they can't be used under the new system.

There are plenty of examples where it's impractical or not sensible to recompile everything in one go, and some things - e.g., libraries from ROL and Castle - don't have source available to the unwashed. In any case, for most RISC OS usage, the AOF format doesn't have any intrinsic problems and contains perfectly good code, and there's no reason to throw the format away out of hand.

Interworking ELF and AOF
As demonstrated by 'link', it is perfectly possible to produce binaries from both ELF and AOF objects. Of course, 'link' isn't freely available and adding ELF support to the RISC OS GCC linker 'drlink' would involve a very considerable amount of rewriting. Besides, maintaining our own tools is something we want to avoid or minimise.

The solution is the 'bfd' library (binary file descriptor). This library is used by the tools in binutils, and is responsible for understanding a variety of object formats, including the ELF, COFF and a.out formats previously mentioned, as well as plenty of other obscure ones. Adding AOF support to this is the obvious step. This isn't entirely straightforward, but is certainly possible given some effort. At the time of writing, I have already made significant progress on this.

Adding AOF support to the bfd library has some interesting side effects. Apart from the whole issue of making ELF sensible on RISC OS, some or all of the following are possible with sufficiently complete support:
  • objdump, nm, strip - tools from binutils for examining object files will work on AOF files (as well as ELF)
  • It might be possible to run GDB (GNU debugger) because GDB uses bfd to understand files it loads.
  • 'ld' (the binutils linker) will work transparently on ELF and AOF. Indeed, it certainly wouldn't care if it was used just for AOF if that's what you were dealing with. 'ld' also has rather more features than drlink.
  • 'gas' can produce AOF files, although that's perhaps of more limited use.
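
The idea behind bfd, one front end that asks each known format in turn whether it recognises a file, can be sketched in a few lines of Python. This is only an illustration of the dispatch idea and bears no relation to bfd's real C interface. The chunk file identifier (0xC3CBC6C5), the 12-byte header followed by 16-byte chunk entries, and the "OBJ_" and "LIB_" chunk name prefixes are assumptions recalled from Acorn's published AOF and ALF descriptions and should be checked against the specification; all the function names are invented.

```python
import struct

# Assumed values, to be checked against Acorn's AOF/ALF documentation.
CHUNK_FILE_ID = 0xC3CBC6C5
CHUNK_HEADER_SIZE = 12   # id, maximum chunks, chunks in use
CHUNK_ENTRY_SIZE = 16    # 8-byte name, file offset, size


def chunk_names(data):
    """Return the 8-byte chunk names from a chunk file header."""
    if len(data) < CHUNK_HEADER_SIZE:
        return []
    file_id, max_chunks, _used = struct.unpack_from("<III", data, 0)
    if file_id != CHUNK_FILE_ID:
        return []
    names = []
    for index in range(max_chunks):
        start = CHUNK_HEADER_SIZE + index * CHUNK_ENTRY_SIZE
        if start + CHUNK_ENTRY_SIZE > len(data):
            break
        names.append(data[start:start + 8])
    return names


def is_elf(data):
    return data[:4] == b"\x7fELF"


def is_aof(data):
    return any(name.startswith(b"OBJ_") for name in chunk_names(data))


def is_alf(data):
    return any(name.startswith(b"LIB_") for name in chunk_names(data))


# One entry per supported format, tried in order.
TARGETS = [("elf", is_elf), ("aof", is_aof), ("alf", is_alf)]


def identify(data):
    """Return the name of the first format that recognises the data."""
    for name, matches in TARGETS:
        if matches(data):
            return name
    return "unknown"
```

Once every format sits behind the same interface like this, tools built on top of it do not need to care which one they have been handed, which is why AOF support in bfd would benefit objdump, nm, 'ld' and the rest at a stroke.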


Dynamic Linking and Shared Libraries
Most of what we've written above relates to static linking - that is, an executable contains all the code that it requires to run (apart from OS calls). We previously did a whole article on this, which you can read.

As I suggested earlier, AOF isn't terribly suited to dynamic linking, but ELF is, and 'ld' knows precisely what to do to generate dynamically linked executables. This doesn't immediately give us shared libraries on RISC OS, but going to ELF is certainly at least half the problem. To complete the picture, ELFLoader, and probably Unixlib need to be extended to be a full shared library manager. This part is still an unanswered question, but hopefully won't remain so for long.
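
One practical consequence is that a loader limited to static binaries needs a way to tell the two kinds apart. A minimal Python sketch, assuming a 32-bit little-endian image and using the program header types from the generic ELF specification, might look like this. The function name `is_dynamic` is hypothetical, and this is not a description of how ELFLoader itself behaves.

```python
import struct

PT_DYNAMIC = 2  # segment holding dynamic linking information
PT_INTERP = 3   # segment naming the run-time linker


def is_dynamic(image):
    """True if a 32-bit little-endian ELF image asks for dynamic linking."""
    if image[:4] != b"\x7fELF" or image[4] != 1 or image[5] != 1:
        raise ValueError("not a 32-bit little-endian ELF image")
    # e_phoff is at offset 28; e_phentsize and e_phnum at 42 and 44.
    (e_phoff,) = struct.unpack_from("<I", image, 28)
    e_phentsize, e_phnum = struct.unpack_from("<HH", image, 42)
    for index in range(e_phnum):
        (p_type,) = struct.unpack_from("<I", image, e_phoff + index * e_phentsize)
        if p_type in (PT_DYNAMIC, PT_INTERP):
            return True
    return False
```

A static binary has only ordinary loadable segments, so it can be copied into memory and started. A dynamic one needs something to read the dynamic segment, find the libraries it names and bind the symbols, and that something is the missing shared library manager.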

Other RISC OS Formats
ELF executables perhaps don't lend themselves well to being RISC OS modules (although that could be proved wrong). Of course, GCC for RISC OS has never been capable of producing modules anyway, so it's slightly academic; although some of the issues in producing shared libraries are the same as producing modules, so the solution might end up being to ask 'ld' to do something slightly different when producing the binary.

What now?

The current plan for GCC and GCCSDK is to do a full release very shortly. There are a variety of fixes we would like to make available, two of which are for yet more issues with the linker, and this should be the last AOF-oriented release. GCC 4.0 is also being looked at (the current version is a pre-release of 3.4.4). Given this, and allowing for testing of my bfd changes, there should be some programs for testing in January. See you then.

Links


Description of the ELF binary format
GCCSDK and GCC for RISC OS
GNU binutils
ELFLoader


Discussion


Hi Peter,

Thanks for a really nice, informative programming article. Stuff like this is really interesting to hear; I hadn't really considered 'behind the scenes of the compiler' previously myself. Thanks :).

md0u80c9 on 13/12/04 9:56PM

Peter,

Have you looked at using the ARM EABI? Doing so would be a significant ABI change due to the software FP calling convention, but the increased performance and convergence with a standard implementation may be worth it.

cbcbcb on 13/12/04 10:45PM

Not recently, and not in detail. I do try to follow these developments, as they affect ARM Linux the most. Producing such code shouldn't be a problem since we use essentially the same backend code, but the migration problems are not something I care to tackle presently, especially on a platform where not everything is open source.

mrchocky on 13/12/04 11:09PM

GCC rules

It might be a good idea to change to using an extension at the same time as moving to ELF. An ELF object file would be myprog.part/o and the AOF file would still be myprog.o.part1. This would allow binutils to look for the ELF file in the conventional way and would allow easier copying of cross-compiled files.

Jaco on 13/12/04 11:17PM

Jaco: I think you're confused. Whether or not a file appears as a RISC OS or Unix format filename depends upon the system it appears on, not whether it would be an AOF or ELF.

All the utilities look for files in "the conventional way" - that is, using Unix filenames. I'm not really sure what you're trying to get at here.

mrchocky on 13/12/04 11:24PM

Interesting article :) and important for the future of the GCCSDK.

Is the idea that ELF object files would be used to produce AIF executables, or that ELF files would be executed directly?

If it's the former, I don't fully understand the significance of ELFLoader; if the latter, does this mean that the ELFLoader module would need distributing with all applications?

Apologies if my confusion is missing the point.

flypig on 14/12/04 12:31AM

I don't understand technical Jargon much, but this is well explained, interesting to read and easy to follow.

Thanks for explaining it Peter.

Sawadee on 14/12/04 6:49AM

flypig: The latter and not necessarily - it would be a similar kind of resource as the SCL/SUL/other module used by lots of programs. A couple of points about ELFLoader: 1) The manner in which it claims ELF binaries (registered as type &e1f) isn't entirely satisfactory, as this can be modified by other applications. 2) Ideally, it would be rewritten in something other than assembler (the first one to mention BASIC gets shot) to make it more maintainable. Additionally, it would be less painful implementing dynamic linking in C, afaics. I have a version of the module written in C, but there's a few bugs to iron out yet.

jmb on 14/12/04 7:37AM

<body armor on> BASIC!!! </body armor off> ;-)

JGZimmerle on 14/12/04 7:57AM

:mrchocky "All the utilities look for files in "the conventional way" - that is, using Unix filenames."

That's nice! I really didn't like having to change the directory structure to use 'o' and 'c' directories.

Thanks!

Jaco on 14/12/04 10:33AM

I think you are still confused - it hasn't helped that you really didn't explain yourself properly.

There is no change (and no planned change) to the way source and object file names are handled. The way I've described is (more or less) how the tools in RISC OS have always worked.

"having to change the directory structure". But there are tools to automate this. If you really must, you can turn it off, but you will cause problems for yourself, unless all your source and header files are in the same format.

mrchocky on 14/12/04 10:47AM

Thanks for an interesting article. In the "AOF vs ELF" section a number of benefits of the ELF format were mentioned. I'm curious to know (as a non-programmer) are there no corresponding benefits to the AOF format?

jms on 14/12/04 12:44PM

I.m.h.o. it would be a good thing to remove the special handling of source and object file names especially from GCC. But you work with it every day and don't see this so forget it.

Jaco on 14/12/04 12:53PM

I don't forget it. I'm very much aware of the issue. But I think that an across the board change would cause a great deal of pointless upheaval just to appease a few users, not to mention be completely different to Norcroft.

It's not just GCC that acts this way, it's all Unixlib programs (optionally). There may be some provision I can make in GCC for accessing files in the non-RISC OS way, but it must be carefully done, as there's lots of potential ambiguity and confusion.

If you are really interested in this issue, then I recommend you investigate all the specific issues of how things work, then draw up a recommendation of what should be done.

mrchocky on 14/12/04 1:13PM

jms: the main benefit of AOF is already mentioned in the article - it is well supported on RISC OS and lots of code is already in this format.

mrchocky on 14/12/04 1:15PM

jmb:

Thanks for the explanation; that helps a lot.

After the wholesale changeover of GCC to ELF, will it still be possible to produce AIFs using ld at all, or will this have to be done with link?

Whilst I can see the benefit in terms of the maintenance of GCC, (and *possibly* in terms of dlls ;) ), the generation of AIFs is something that I personally would miss greatly.

flypig on 14/12/04 2:19PM

flypig: That's up to the GCCSDK developers. I doubt, however, that it would be sensible to completely drop AIF support (at least in the short term). Of course, this could simply be provided by the existing GCC, rather than future releases. Peter is better placed to comment on this than I am.

jmb on 14/12/04 2:52PM

Unless there's any practical reason, then in general we'd probably want to generate ELF executables. There's no pressing reason we particularly need to stick to AIF, unless you have some tools that only work on them (say, !DDT: DeskDebug OTOH is getting ELF support).

I expect that adding AIF output to 'ld' wouldn't be especially hard, but it's extra effort. One of the things about ELF binaries is that the various tools also work on them as well as object files. This isn't the case for AIF.

mrchocky on 14/12/04 11:35PM

It might just be that I've had a depressing day, but all this seems like futile attempts to pull RO around. I was hoping that Castle or RISC OS would choose to go the same route as Apple did: put their old system on ice, and develop a new one using a BSD operating system, building a vastly superior system with emulation for backwards compatibility.

I'll admit that the dated RO approaches have been good (where else can you poke the operating system in weird and wonderful ways out of a BASIC program?), but I do worry there's not a strong future in it, and mixing the large reliance on SWI's and dynamic linking will only make the situation more of a headache.

But a move to ELF and dynamically linked libraries for things which would currently be done via SWI's might be one way to allow for smoother transitions later on, if it ever happens.

NoMercy on 16/12/04 3:04AM

NoMercy: sorry, what? Please don't use this forum as a dumping ground for your depressing and irrelevant comments.

This approach is most certainly not "futile". If it were, then why do so many OSes use ELF?

"a new BSD operating system .. with emulation". You're aware that BSDs use ELF, right? And precisely why is emulation of any kind required?

"and mixing the large reliance on SWI's (sic) and dynamic linking". Where is this mentioned? I don't get it. SWIs (correct spelling) are an intrinsic way of calling OS functionality, I hardly see the worth in getting worked up about them.

"if it ever happens". Not a very helpful comment, is it? It clearly already is happening.

Just what was your point? Did you have anything constructive to add - comments like yours really aren't at all inspiring.

mrchocky on 16/12/04 9:07AM

I don't think NoMercy has a problem with using ELF, I think he is questioning the logic of tweaking and advancing areas of an OS which is extremely dated in other ways. Is it really the best way forward to gradually improve RISC OS, or would we be better off 'doing an Apple' like NoMercy mentions, and start afresh from a modern base?

thegman on 16/12/04 10:30AM

Here are some ramblings for posterity that this article brought back from distant memory of when I was one of a small group of people working on the Norcroft toolchain for RISC OS. (E&OE!) Please forgive the parts where I drop into hideously detailed techie-talk! :-)

The history of ELF on RISC OS goes back at least 6 years to when I modified link to support a subset of ELF functionality. This first became public when we made a beta release of the 32-bit toolchain, although as I recall, we did not exactly trumpet it from the rooftops. Additionally, support was added for symbol definition files (files that the linker would output listing all the symbols in the generated output files and the values assigned to them), which was probably more useful for us to make building ROM images simpler than to anybody outside Acorn. So in the end, the linker could combine AOF, ELF and symdef files (COFF support was compiled out, IIRC). It could write out ELF relocatable object files and ELF program files (in addition to AIF and modules, of course)

I also made significant changes to libfile. libfile was also updated to support ELF object files and symdef files. Furthermore, I broke the implicit dependency that ALF files could only contain AOF files (and COFF objects only in AR-format library files): any library file format can contain an arbitrary mixture of any type of object file (including symdef files).

IIRC, the beta tools were distributed with ReadELF (or something with a similar name) to help dissect ELF files. If I don't recall correctly, I just had it on my machine to help me debug the linker - sorry about that.

The major headache for object file interchanging is the relocation data. The ARM ELF spec was missing some required types of relocation, and the AOF specification includes relocations that require you to examine the instruction being modified and change behaviour based on the instruction you find - and, worse, allows you to modify several instructions! This just doesn't happen in ELF, since all ELF relocations must be described in a static table describing a mechanical unconditional alteration to the instruction.

I was quite surprised to see a module called ELFLoader pop up that *didn't* come out of Castle. I thought I'd registered that module name and SWI chunk, perhaps I just called my module "ELF" and gave it the title "ELF Loader". I certainly received the filetype allocation that I specifically asked for (&E1F - what else ;-) and made the changes to FileSwitch to support direct execution of ELF program files (i.e. you could *Run it just like files of type &FF8). Now that was one of the last things I did, so that change might never have made it out. My ELF module would load the binary and start it as an application correctly in exactly the same way that FileSwitch launches AIF images.

I never got around to modifying CMHG - it wasn't really urgent, because the linker would happily link the AOF file generated by CMHG with the other ELF object files.

The one important facility that was missing was automatic run-time decompression of ELF program files. This is not something that the wider non-RISC OS community is particularly interested in, I discovered. There are technical reasons for this (typically other systems like to memory map code directly from disc into memory - so you wouldn't want to have the code compressed), but some hijacking of platform-dependent parts of the ELF header could no doubt have been used.

One problem at the time which may well have been sorted out now is that GNU bfd's ARM ELF differed in subtle, but irritatingly incompatible, ways from ARM's ARM ELF (for example, differing meanings for flag bits). I considered compatibility with ARM's version to be more important (and indeed, we were looking at the way ARM did shared libraries in their ADS toolchain (the successor to ARM SDT - which is what Norcroft was, putting it simplistically)). That work had not really progressed very far, although the beginnings of it were in place in the C compiler in that I removed the crusty way that module code was generated and replaced it by a much cleaner, less invasive (to the core compiler code) implementation. [Aside: this is the stuff that loaded data address constants by just blindly stuffing the equivalent of "LDR rX, address:LDR r12,[r10,=-0]:ADD rX, r12, rX" (plus a relocation for the second LDR); somebody once found a case where the compiler used R12 for rX. This made the program concerned go bang! This was replaced using virtual register constructs that the compiler could actually recognise and optimise properly. End aside.] Once that had been tidied, the idea was to generate the shared library data loading code instead of just whacking the "LDR rY, [r10, #=-0]: ADD rZ, rY, rX" sequence into the code (which was the tidyup - because once real virtual registers are used and rY can be declared constant, a lot of peephole optimisations become possible that were not possible before when effectively two completely opaque instructions were inserted).

We never got around to modifying the assembler or the C compiler to generate ELF. It would have been a fiddly job, but not particularly difficult provided you could concentrate on it for however long it would take and do it properly.

I remember thinking at the time that the best way to do modules would be to generate relocatable ELF object files with program tables, which probably wouldn't have been too hard. This code would probably have been loaded into a separate dynamic area instead of the RMA, kept write protected and the data instantiated in another companion area or the RMA. With an appropriate loader, you could even have multiple versions of the same library loaded at once. The SharedCLibrary module would instead become a ResourceFS wrapper with a large ELF file in it (although there are interesting mutual-dependency problems if you go on to write your loader in C - and the way that GNU ld gets around this problem is based around not writing code that triggers any library calls until you can get the C library linked).

Then one day, just like that, RISC OS was canned. The End.

stewart on 21/1/05 12:55AM

© 1999-2009 The Drobe Team. Some rights reserved.