Drobe :: The archives

Of C Libraries and RISC OS

By Peter Naulls. Published: 3rd Jan 2005, 14:35:35

Unixlib versus the World

In my latest article, I'll again talk about some of the obscure, but very important, issues related to porting work I've been doing which are also crucial to many RISC OS programs. I've chosen to discuss the issues involved with C and C libraries on RISC OS. This is an even longer article than my usual offerings, but if you can hang in there, I hope you'll come out with a much improved understanding.

RISC OS and C
As you may be aware, the majority of new RISC OS programs, and a considerable proportion of older programs, are written using the C programming language. Indeed, much of RISC OS itself is written in C. I'm not going to discuss C in great detail, since you'll probably be aware of it already at some level, and the details of programming it are mostly irrelevant to this article. What's most important to understand about C is that to ultimately do anything useful, programs need to either call functions which do some processing, or call the operating system services in some way (possibly by a SWI or assembler wrapper on RISC OS).

Apart from code provided by the program itself, these functions are stored in libraries, and are combined with the program when it is built: header files describe the library functionality to the compiler during a compile, and the final executable is created from program code and library code at link time.

Unixlib and RISC OS C Libraries
There's an important distinction to be had here. In the first instance, there are libraries written in C (and sometimes with small amounts of assembler). DeskLib and OSLib are both examples of these. These types of libraries will mostly not concern us for the purposes of this article. The other type of library, whilst still written in C, contains functionality that is required to be there for any C implementation. The two libraries on RISC OS that fulfil this role are the SharedCLibrary and Unixlib. I'll refer to them, and others in this article as "a C library".

The functionality required to be present in a C library is covered by various specifications including the ANSI C specification and C99. The behaviour of the C compiler is also crucial in ensuring that specifications are met. There are also further degrees of functionality that may be met by a given implementation. For example, Unix systems will try to meet requirements laid out by POSIX and several related specifications. There are also GNU extensions, and under RISC OS, specific behaviour Acorn decided was required, and therefore implemented by RISC OS C libraries.

RISC OS C libraries
One of the questions you might ask is why RISC OS has two libraries. The question is rather more complex than it might first seem because of various versions of the libraries and ways of generating code that uses them.

The short(er) answer is that they meet two different sets of requirements. The SharedCLibrary was originally developed by Acorn to support C (and other high level language) programs under RISC OS. Its primary aim is to fully and correctly implement the ANSI C specification (and later C99), and little more - although it has had a few accretions over time. It's called the SharedCLibrary because it's a module and hence shared - programs talk to it via a small amount of code called stubs. This sharing means that it doesn't have to be combined with each and every C program on RISC OS, which would make them bigger, and also has the benefit that if bugs are found, a new version can be soft loaded. The SharedCLibrary has appeared in ROM in RISC OS machines for a long time, although many RiscPCs will now be running a soft-loaded version provided by Castle. We'll talk a little about this later on.
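To make the idea of stubs more concrete, here is a minimal sketch in C of a small table of function pointers, linked into the program, that forwards calls to code living elsewhere. It is purely illustrative: the names shared_entry_table, stub_init, stub_strlen and stub_strcmp are hypothetical, and the initialisation step here simply points at the host C library rather than at a RISC OS module, so this is not how the real stubs are written.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table of entry points. In this sketch it stands in for
   the table that the shared code would fill in at program start-up. */
struct shared_entry_table {
    size_t (*strlen_entry)(const char *s);
    int    (*strcmp_entry)(const char *a, const char *b);
};

/* The table itself lives in the program, not in the shared code. */
static struct shared_entry_table table;

/* Stand-in for the one-off initialisation step. Here the entries are
   simply pointed at the host C library (an assumption of this sketch). */
void stub_init(void)
{
    table.strlen_entry = strlen;
    table.strcmp_entry = strcmp;
}

/* The stubs: tiny functions linked into every program that do nothing
   but forward the call to the shared implementation. */
size_t stub_strlen(const char *s)
{
    assert(table.strlen_entry != NULL);   /* stub_init() must run first */
    return table.strlen_entry(s);
}

int stub_strcmp(const char *a, const char *b)
{
    assert(table.strcmp_entry != NULL);   /* stub_init() must run first */
    return table.strcmp_entry(a, b);
}
```

The point of the arrangement is that only the table and the forwarding functions are linked into each program; replacing the shared code changes what the table points at without the program being rebuilt.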

The SharedCLibrary has filled these roles admirably over the years, providing services to countless RISC OS C programs. The problem is that when it comes to converting programs from other platforms, (i.e. porting) the SharedCLibrary may let you down. In fairness, Acorn did provide TCPIPLibs, which is an implementation of BSD sockets, used by Unix networking programs. They also did provide their own "unixlib", which had a handful of functions often used by Unix programs, and plenty of programs have been ported by using the SharedCLibrary, but it's a long way from a satisfactory and complete solution.

To facilitate porting on a larger scale, and address issues it would be hard to resolve by simply using the SharedCLibrary or building onto it, it was decided that the best solution was to start from scratch around 1995. In truth, this wasn't entirely why it came about - Unixlib was originally created to support the GCC 2.4.5 port to RISC OS - but the result was the same. Unixlib became an alternative to the SharedCLibrary. Both are compliant with the ANSI C and C99 specifications, and except for a small number of cases due to requirements of greater functionality, Unixlib will work exactly the same when used to compile a program that also works under the SharedCLibrary.

As an aside, the Cygwin DLL under Windows performs much the same role as Unixlib does in RISC OS. On most systems, but certainly not all, the norm is to have only one C library.

Why Unixlib?
The point behind Unixlib becomes obvious if you've ever tried to compile a program written for a Unix system on RISC OS. Whilst it's true that there are many programs which are well behaved and don't use any non-ANSI features and will work as expected using the SharedCLibrary, there are many more that won't compile - they use non-standard features, GNU extensions, or functions simply not specified in standard C. Indeed, Unixlib offers a whole host of advantages over the SharedCLibrary:
  • Filename translation - Many Unix programs assume filename formats quite different to those in use in RISC OS. We covered this in detail in an earlier article. The upshot is that the default behaviour for Unixlib is to treat all filenames as Unix format unless they are unambiguously a RISC OS filename, and only translate at the lowest interface to RISC OS, that is, the SWI call to OS_File and friends.

  • File and Socket handling - Unlike the separation of SCL and its sockets library, Unixlib's implementation is integrated with the open/close/read/write functions like a real Unix system and includes handling of various special files without any special RISC OS code in the program.

  • Extra headers and functions - Probably the most obvious. Unixlib contains much of the functionality specified by POSIX and other specifications. Some of these functions are dummies, some aren't complete, and some don't make sense to implement on RISC OS. It is not comprehensive by any means, but it is an extensive and heroic effort, and sufficient for a very large number of Unix programs to work with little or no modification. Very useful.

  • Open Source - Any Tom, Dick and Harry can have a go at modifying Unixlib to fix bugs, add features, or generally make a hash of it. In fairness, there isn't much call to modify the SharedCLibrary since it is complete and essentially bug free, but potential modifications to Unixlib are to features the SCL doesn't provide.
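As a rough illustration of the filename translation point above, the sketch below swaps the roles of '/' and '.' in a relative Unix-style path, since RISC OS uses '.' as its directory separator. This is a deliberately simplified assumption and not Unixlib's actual algorithm: the function name unix_to_riscos_simple is hypothetical, and it makes no attempt to handle absolute paths, already-RISC-OS filenames, or any of the other cases the real translation has to deal with.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified illustration only: copy src to dst, turning the Unix
   directory separator '/' into the RISC OS separator '.', and turning
   '.' (typically before an extension) into '/'.
   Returns 0 on success, or -1 if dst is too small for the result. */
int unix_to_riscos_simple(const char *src, char *dst, size_t dst_size)
{
    size_t i;

    assert(src != NULL && dst != NULL);

    for (i = 0; src[i] != '\0'; i++) {
        if (i + 1 >= dst_size)
            return -1;              /* no room for this char plus NUL */
        if (src[i] == '/')
            dst[i] = '.';
        else if (src[i] == '.')
            dst[i] = '/';
        else
            dst[i] = src[i];
    }
    if (i >= dst_size)
        return -1;                  /* no room for the terminator */
    dst[i] = '\0';
    return 0;
}
```

Under these assumptions, "src/main.c" becomes "src.main/c". Doing the conversion in one place, just before the operating system is called, is what lets the rest of a ported program carry on using Unix-style names unchanged.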
And the down sides? Not so much of an issue nowadays, but Unixlib adds a minimum of 100k to your program to implement the basic level of C functionality and Unix compatibility. This is because Unixlib is linked, like most RISC OS libraries, statically to your program. In contrast to the SharedCLibrary, it's not a module so the executable must contain all the code it might potentially run. This is partly because GCC has never been able to generate module code, and partly because turning Unixlib into a module would be extremely involved. In any case, Unixlib is often dominated in size by much larger, and also statically linked, ported libraries, so there is limited motivation to do this. A much more useful solution would be a RISC OS shared library system which would allow creation of a shared Unixlib with little extra effort, and with the considerable benefit of automatically propagating any fixes instead of old binaries containing potentially disastrous bugs.

What is SharedUnixLibrary?
I want to say very little about this, since it's been a huge cause of confusion for no good reason. Its one and only purpose is to catch a callback that RISC OS makes when an application quits. RISC OS helpfully insists on calling this (for a variety of reasons) when the listening application might be paged out, thereby jumping to random code and quickly causing your computer to freeze or crash some unrelated program. This code is therefore in a module and only passes the message on to Unixlib when it is safe to do so. The issue has been fixed in Select/Adjust, but Unixlib has not been modified to avoid this on machines running those versions of RISC OS.

Compilers and C Libraries
To complete the picture, I'd like to briefly discuss the interaction between the two main RISC OS compilers - Norcroft (Acorn C/C++) and GCC - and the two C libraries.

By default, GCC will compile your program against Unixlib and Unixlib headers, but this is not set in stone, and with a command line switch, you can compile with the alternate stubs and headers for the SharedCLibrary provided with GCC.

Norcroft, by contrast, unsurprisingly compiles by default with the SharedCLibrary stubs and headers it is bundled with. This, too, can be modified with some switches. These options are detailed on my C programming site.

Because Norcroft is a strict C compiler, and doesn't understand various GNU extensions implemented by GCC and has a different range of warnings, maintaining the ability to use Unixlib, let alone compile it, with Norcroft has become problematic. Not because it's impossible to modify Unixlib to perform sensibly, but because of the maintenance factor: Unixlib has had considerable changes in the last two years, and it may be that the current version is the last it is possible to build with Norcroft, especially in light of changes I'll discuss in the next section. Fortunately, the number of people doing this is quite limited.

To round this off, there's one final complexity. The advent of the 32-bit SharedCLibrary means that a new version of stubs, the glue library that talks to the SCL module, has been required. That's well and good, and of course a new version came with the Castle C/C++ suite and GCC's stubs have also been updated. The problem is that once you've built a program against these stubs, you have to have the 32-bit SharedCLibrary loaded all the time. Generally, this isn't a big deal - on RISC OS 5, it's in ROM, and on most other systems it's loaded by the first program that requires it, or during the boot sequence.

But there are some instances where this isn't practical, or you'd prefer not to load the 32-bit SharedCLibrary, or it's a system where it simply hasn't been installed. To avoid this, you can use RISCOS Ltd's "StubsG" instead of the regular stubs provided with Castle's C/C++ or older versions of Acorn C/C++, and your program will work equally well on systems that do and do not have the 32-bit SCL loaded.

GCC doesn't provide an equivalent of StubsG, because making one is quite an involved task, but there's nothing stopping you using StubsG with GCC and instructions are provided for doing so.

The problem with Unixlib
We've made great progress with Unixlib - and in 2004 we had no fewer than 70 ChangeLog entries. Whilst many of these changes were very important to improving Unix compatibility and generally fixing bugs, they mostly fall into two kinds. The first is in RISC OS functionality and the crucial translation between Unix behaviour and RISC OS behaviour: the magic that makes Unixlib work as a sensible system. The other kind is code which, whilst required to be in a C library, contains little or no RISC OS specific code. In many cases, this code has been taken from other systems - sometimes modified to fit into Unixlib - and sometimes used verbatim. In other instances, the code has been written from scratch. Because Unixlib exists on a minority platform, and has only been widely used as a result of programs converted to RISC OS in the last few years, it's not always the case that code in Unixlib has been widely tested with the huge range of inputs that result from running a large number of programs.

Many instances of this problem rear their heads much earlier on. Whilst spotting programming problems earlier on almost always saves time and effort, they are no less annoying. The issue is the layout and content of C header files - the files that advertise the behaviour of the C library. These must be laid out with certain declarations appearing (or not appearing), and they must also advertise all the functionality that is present, and none that isn't. Faults in header files will result in programs either not compiling at all - if some unusual sequence of includes doesn't advertise the right behaviour, then the program might not be buildable - or being configured incorrectly, resulting in different problems during compile or when running.
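A small sketch may help show why headers matter so much. Below, a hypothetical library header advertises a function through a feature macro, and the program chooses its code path from that macro. All of the names here (MYLIB_HAVE_STRNLEN, mylib_strnlen, bounded_length) are invented for illustration and are not taken from Unixlib or any real header.

```c
#include <assert.h>
#include <stddef.h>

/* --- What a hypothetical library header might contain ------------- */

/* The header should advertise only what is really implemented. */
#define MYLIB_HAVE_STRNLEN 1

#if MYLIB_HAVE_STRNLEN
size_t mylib_strnlen(const char *s, size_t max);
#endif

/* --- What a portable program does with that information ----------- */

size_t bounded_length(const char *s, size_t max)
{
    assert(s != NULL);
#if MYLIB_HAVE_STRNLEN
    /* The header says the function exists, so use it. */
    return mylib_strnlen(s, max);
#else
    /* Fallback used when the header does not advertise the function. */
    size_t n = 0;
    while (n < max && s[n] != '\0')
        n++;
    return n;
#endif
}

/* --- The library's implementation --------------------------------- */

size_t mylib_strnlen(const char *s, size_t max)
{
    size_t n = 0;
    while (n < max && s[n] != '\0')
        n++;
    return n;
}
```

If the header set the macro but the library lacked the function, the program would fail at link time; if the header forgot the declaration for a function that does exist, the program would quietly take the fallback path or fail to compile. Both are header faults of the kind described above.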

It's true that there are tools to check the behaviour of header files, and that these issues can be fixed given sufficient effort, but it's a huge amount of time not available to a limited number of developers. There are still further bugs in Unixlib that don't even have entries in the bugs database because it's not even clear what's causing them. Such bugs have prevented programs like GNUChess and bash being released for RISC OS.

Beyond Unixlib
I mentioned that Unixlib contains code taken from elsewhere. In particular, Unixlib contains code taken from the GNU C Library. The GNU C Library, or glibc, is certainly not the only free Unix C library, but it is probably the most accessible and widely used (it's used on almost all Linux systems), and it's written to support use on multiple operating systems. Indeed, with reference to the previous section, Unixlib uses a small number of header files taken from glibc. With the assumption that glibc contains far fewer bugs than Unixlib and that the programs we're trying to make work on RISC OS have long ago been built against glibc and are known to work, this is clearly a step in the right direction for compatibility.

You might therefore be asking why Unixlib doesn't use more of glibc. Or perhaps, why RISC OS isn't using a port of glibc instead of Unixlib. If we were starting from scratch, a port of glibc would probably be the correct way to go, but 10 years ago, when Unixlib was begun, things were quite different. And right now, we have a great deal of code in Unixlib which is the crucial translation between Unix and RISC OS and simply wouldn't exist in any other C library.

As for using more of glibc in Unixlib, this is a solution which I am now actively investigating, but to import much more code than we have presently means a radical alteration of the way Unixlib is put together. glibc's files are arranged quite differently in order to support multiple systems, and it has a proliferation of public and private header files which must be used in the precise manner dictated by glibc's build system. To compound the issue, glibc uses some structures differently to Unixlib - notably, the one containing information about open files - so we must be very careful in mixing files. glibc also contains much more extensive wide character and locale handling and this causes its own set of problems, especially since locale handling in Unixlib makes use of the Territory module. I've already had to modify GCC to support the GNU alias attributes - something that glibc makes extensive use of.
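For readers who haven't met the GNU alias attribute mentioned above, here is a minimal sketch of what it does: one function body is given a second, public name. The function names are hypothetical and the example assumes GCC (or a compatible compiler) targeting an object format that supports symbol aliases, such as ELF; it is not code from glibc or Unixlib.

```c
#include <assert.h>

/* Internal name, as a library might use privately. */
int mylib_internal_add_one(int x)
{
    return x + 1;
}

/* Public name, declared as an alias of the internal function. No second
   copy of the code is generated; both symbols refer to the same body.
   The alias target must be defined in the same translation unit. */
int add_one(int x) __attribute__((alias("mylib_internal_add_one")));

/* Both names reach the same code, so they must agree for any input. */
void check_alias(int x)
{
    assert(add_one(x) == mylib_internal_add_one(x));
}
```

A library can use this to keep a private name for its own internal calls while exporting the standard name to programs, which is why a compiler lacking the attribute cannot build such code unmodified.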

There are other reasons to use parts of glibc too - it contains a considerable portion of the code required to support use of shared libraries. Ultimately, the point is a bug and maintenance avoidance exercise. It may indeed prove that the best solution is to integrate RISC OS parts from Unixlib into glibc. Whilst this might mean we have a less free hand in modifying code, it has the same benefit as other RISC OS ports - improvements and bug fixes (at least, to generic code) are made by an army of people with little or no knowledge of RISC OS.

Licensing
To round off this feature, I will briefly mention some parallel issues that have been considered during Unixlib's development. As with much open source software, Unixlib is governed by well known licences. In particular, most of Unixlib is covered by the BSD licence and by the GNU Lesser General Public Licence (LGPL), although there are some other files that are covered by some more obscure but compatible licences. Unixlib in the last few years has also managed to acquire some code that is not LGPL, but GPL. I won't go into the implications of this here, but the number of files is small, and we intend, in the next maintenance release of GCC for RISC OS, to replace these with versions under alternate licences in the hope that clarifying the legal standing of Unixlib might encourage greater usage. For reference, glibc mostly uses LGPL code and a variety of BSD-style licences.

Onwards
Whilst my present work with glibc is a long way from being anything but a research exercise, and I won't be giving anyone any programs using it for a long time, it's clear that continuing to patch and improve Unixlib piecemeal can only go so far. What the precise solution is likely to be for RISC OS is going to come down to a good deal of debate, experimentation, and plenty of persistence. It must also complement the related work on bringing ELF to RISC OS as well as the effort to port GCC 4.0 to RISC OS. In particular, this last effort relies heavily on the way the C library works, so this must be done correctly.

In conclusion, it looks like we have our work cut out. If you've got any input of your own, we'd be glad to hear from you.

Links
GCCSDK and Unixlib
GNU C Library
RISCOS Ltd's StubsG
Castle's C/C++ Development Suite
RISC OS C Programming


Discussion


WRT Licensing:

If you've got GPL code in the library then the whole thing is GPL (section 2, including the description beneath the page break). Once the whole distribution is under the GPL, you can't just remove those files that originally introduced the GPL. If you want to return the library to non-GPL status then you're going to have to roll back your CVS changes of the entire library to the point at which the GPL components were added and develop from there. You can't just 'take them out again', because by including the GPL components you've placed the entire distribution under the GPL.

The standing of a library the moment that you incorporate a GPL component into it is very clear - it and everything that links against it is under the GPL. Once you've started to use the GPL, the fact that there are LGPL and BSD license components in it is pretty much irrelevant. You still have to say that there are parts under those licenses (because that's what those licenses say you must do), but the freedom to do what you like under those licenses is subsumed under the all-encompassing "thou shalt honour no god before me" GPL (section 4).

Whilst I'm sure that the discussion of the issue of incorporating GPL sections into the library was long and difficult between the various developers and users of UnixLib, it's hardly a trivial thing. I don't use UnixLib, so maybe I missed it, but with such a large change in the license for a library like that I would have thought that I would have at least heard about it...

Gerph on 3/1/05 4:06PM

s h a l t = shalt?

hutchies on 3/1/05 8:11PM

Yes. Ian's anti-swear filter thought it was something else.

Chris.

diomus on 3/1/05 8:43PM

Just dropping in from the BeOS side of the OS wars here...

There is a way you can legally include GPL'ed code in a BSD/LGPL/MIT/Whatever licenced library - taint it at runtime.

Assuming you have either dlopen() or a native totally dynamic (not dynamic linking) addon system, you can have the library dlopen() the GPL'ed objects at runtime. This means that the user has tainted it, not the developer, so its legal. This is used to keep the FFMpeg libavcodec (core of mplayer and VLC on UNIX, etc) under the LGPL - if the user provides the GPL'ed objects, it will load them at runtime and taint the code then and only then.

Of course, if RISCOS doesn't have a native add-ons system, and dlopen() is in UNIXLib, then you've got a problem....

MYOB on 4/1/05 1:22AM

Though I have no idea if it's there or not, I highly doubt there's a dlopen() inside Unixlib.

With luck and the hard work of some great minds, we'll have a dynamic linking system in a few years time, and there won't be much worry about a LGPL licenced library, for now I suspect most things built with unixlib are not a worry, and until then there's stubs.

*dreams of cross-language dynamically-loaded shared libraries* yeah I can dream like the best of em :)

NoMercy on 4/1/05 4:00AM

Getting into a debate about licensing of Unixlib was really the last thing I wanted to do. I've already received abuse and grief because of my insistence on trying to clarify the situation for the benefit of RISC OS users and developers. As they say, no good turn goes unpunished. The clarification of the Unixlib licensing was already well in hand before I wrote this article and will be finished in time for the next release.

It's a real shame that this is the _only_ bit of my article that's been picked up, when what I really wanted to do was encourage constructive debate about what the future of Unixlib should be. Perhaps after 3300 words, I really left nothing else to say.

NoMercy: "...we'll have dynamic linking system in a few years (sic) time". What an odd comment - we _already_ have a dynamic linking system, courtesy of rink. If you really meant shared libraries, as suggested by your later comment, then what makes you think they will take so long? I don't think you would have made that comment if you'd read all the material I referred to on shared libraries. If someone was really really keen to get (ELF) shared libraries on RISC OS above and beyond all other considerations, it could probably be done inside a month.

I hope someone else might have some more positive comments including thanking the early Unixlib and GCC developers for what turned into an extraordinary amount of work, which has in turn enabled myself and many others to help RISC OS users.

mrchocky on 4/1/05 12:11PM

mrchocky: "I hope someone else might have some more positive comments " I don't think you need to be defesive - I'd sugest that the lack of comments (excl. licensing) implies that it was a thorough and accurate article. (Of course it is also a tehnical article which immediately limits the number of people who might reply...) Adam

adamr on 4/1/05 3:24PM

Hmm, sorry about the poor spelling!

adamr on 4/1/05 3:25PM

Like "adamr" said, too technical for me!

Peter, does the article mean that we are in a far better position now than ever before to "modernise" RISC OS in a big way, from a technical aspect?

It sounds like we (or RO developers) need to decide how we are going to correctly standardise our OS programmes, and like conforming to where we think RO should be?

It looks a well written article thanks, but I'm still reading it to gain a better understanding!

Sawadee on 5/1/05 7:34AM

Yes, I think that's it. It is a well written article with well reasoned arguments that are hard for anyone else to argue against. In particular, I doubt that anyone else knows Unixlib as well as Peter. His comments on its problems and future development strategy are taken from experience that he alone has. I think what Peter is doing is fishing for ideas, just in case he has missed something that will simplify the task ahead. For this reason he has explained things from first principles, to try to bring in non-programmers. But it is difficult to argue with the strategy of building on what is already there and lifting code from elsewhere when this can save effort.

mrtd on 5/1/05 9:11AM

Well I'll admit I wasn't totally aware, and looking at rosh's work, it looks like a system of shared libraries. I'm still not optimistic on the time scale before we see it, or a similar system in common use.

NoMercy on 5/1/05 10:56PM

Care to provide even the most basic of reasoning for your time prediction? If you want to contradict my evidence, that's fine, but please actually provide some of your own. If not, we'll have to dismiss your statement as speculative pessimism.

As it is, and as I've said, the development of a shared library system is well in hand, but due diligence must be done so we have a good solution.

mrchocky on 19/1/05 5:19PM

I had missed the interesting article about bringing ELF to RISC OS that is mentioned at the bottom of this article ([link]). I've added my own personal historical comments to that article - perhaps they should have been attached to this article instead.

People are bandying around timescales for implementing shared libraries + dynamic linking. One important question is: does that include adapting the SharedCLibrary module itself?

Ideally, the dynamic linker would underpin the whole OS, with the built-in ROM libraries being no different to any other library (except for having fixed library numbers - because you wouldn't be able to have a dynamic number for ROM libraries as you couldn't patch them!) If you are going to ignore the SharedCLibrary, then it becomes a much easier job and I concur that it could be done in a month - if you only concern yourself with the technical aspects of the work.

Then, assuming that you would load libraries from System:Libraries.libname.version, you could work on the changes to allow all the libraries to be managed with !SysMerge or whatever the modern equivalent is nowadays :-) Then you need documentation and guides for developers.

On the bright side, you could probably get away with releasing the system to run on top of RISC OS 5 - particularly if the FileSwitch hooks I mention in the other article comments are still there in FileSwitch - before committing it as the basis of the next major revision of the OS (if there is one)

stewart on 21/1/05 1:16AM

Stewart: Indeed. Ideally, yes, it would (the dynamic linker). Alas, neither Castle nor ROD have publicly stated interest in shared libraries to any degree (who knows, internally, they might be dead keen on the idea), so integration into the OS is going to be limited.

As for the SCL, having it as a Shared Library is of rather limited value to myself and my colleagues, even though it might be of academic interest, since we use Unixlib almost exclusively. In any case, we simply don't have access to its source.

One thing I failed to mention about Shared libraries is that it's quite possible to have multiple versions loaded - this is mostly useful when testing development or debug versions loaded concurrently without interfering with the operation of other programs using another version.

Variants on this behaviour seen in Linux include overriding of symbols for memory checking (electric fence) or other unusual behaviour (fakeroot).

mrchocky on 21/1/05 3:23PM
