RISC OS Memory Protection
By Peter Naulls. Published: 25th Apr 2005, 12:20:20
Corruption at the lowest levels
One common complaint or feature request for RISC OS improvement is to add "memory protection". This is largely a result of the relative ease with which single programs can take out the entire operating system, combined with a misunderstanding of what precisely memory protection is.
In this article, I'll try and cover some of the issues around memory protection, and why RISC OS is often so susceptible to breakage and some of the measures which can be taken to improve the situation.
RISC OS and Memory Protection
Of course, even a basic grounding in computer architecture will teach you that RISC OS does indeed have memory protection, and that it's essentially the same mechanism that's used in most other operating systems with any degree of complexity. I won't get into the nitty gritty of the memory protection system and the difference between physical and logical memory (sometimes called virtual memory, which is not to be confused with swap space), but it should be sufficient to say that a running application cannot see the memory used in wimpslots of other applications. It also cannot see some parts of memory used by RISC OS itself. Other parts of memory will be readable only, and some parts will be freely readable and writable by all programs.
Some simple operating systems found in embedded devices do not have any mechanism of this type, either because the CPU does not support the required functionality or because the software is not complex enough to warrant it: they use what is called a flat memory model, where all the memory is visible to the entire system. It's only these systems that you could really say lack memory protection.
The Problem With RISC OS
There are several issues that can lead to instability in RISC OS. The root cause of many of these is the ability to read and write certain areas of memory with impunity, and they can be categorised into three main problems:
Low Memory Access. In the C programming language, a null memory reference is represented by a value of zero, meaning a value which does not refer to anywhere. There are instances where zero value memory references are used in BASIC, but that tends to occur less often because of things like BASIC's comprehensive string handling.
Problems arise when programs erroneously try to access memory at location zero, or small increments after it. This memory contains nothing of interest to normal applications, and it is a program bug to try and access it. On most systems, this would correctly cause an immediate crash. But not on RISC OS - instead, the values (often, ARM instructions) are read, and the program may continue on for a time until it later tries to use the values, resulting in a crash well removed from the point of the program fault. Or, it might try and display the values because it thought it was looking at a message - this is precisely the cause of the "ofla" bug seen in some RISC OS programs.
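As a rough illustration of how such a bug comes about, here is a minimal C sketch of a message lookup that returns a null reference for an unknown token. The table, the tokens and the function names (lookup_message, message_or_token) are all hypothetical and are not taken from any RISC OS library. A caller that passes the unchecked result straight to a display routine would end up reading from location zero; the second function shows the check that avoids this.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical message table, for illustration only. */
typedef struct {
    const char *token;
    const char *text;
} message;

static const message messages[] = {
    { "Save", "Save file" },
    { "Quit", "Quit application" },
};

/* Returns the text for a token, or NULL when the token is unknown.
   A caller that forgets the NULL case will go on to read location zero. */
const char *lookup_message(const char *token)
{
    for (size_t i = 0; i < sizeof messages / sizeof messages[0]; i++) {
        if (strcmp(messages[i].token, token) == 0)
            return messages[i].text;
    }
    return NULL;
}

/* Checked variant: never hands a null reference on to the caller.
   It falls back to the token itself when no message is found. */
const char *message_or_token(const char *token)
{
    const char *text = lookup_message(token);
    return text != NULL ? text : token;
}
```

Falling back to the token is only one possible choice here; reporting the missing message as an error would be equally reasonable. The point is that the null case is handled where it arises, rather than much later.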
Note that it's rarely useful for a program to try and stagger on after a memory access failure - not only does the programmer want to know about any failures as soon as possible, but further indiscretions with the bogus data could lead to other crashes elsewhere in the system or data corruption in the document the program is currently handling.
The situation on RISC OS prior to RISC OS 4 was even worse: low memory areas were writable as well as readable. But the bottom line is that there shouldn't be any reason at all for an application to be reading or writing memory below its start address (the 32k between zero and the 0x8000 wimpslot address), and any attempt to do so should result in an immediate signal being sent to the application (which normally means the application immediately quits - some programs may attempt to die more gracefully).
At present, as mentioned earlier, programs attempting to read low memory will succeed, with a value that will often cause grief much later in the program - it's these bugs that have caused a huge waste of developer time. Being able to immediately pinpoint the point of failure can be very important.
So why isn't this the case? Well, it can be, but there are a number of minor obstacles. Some SWIs such as OS_GetEnv return workspace in this memory, and the ShareFS filer also has memory it validly accesses here. Of course, these could be moved, and it's very likely that there are a number of other issues that would have to be addressed before this could become the default in a version of RISC OS. In the meantime, Adrian Lees has written a program "Prot1K", which can be fetched from his pages. This prevents RISC OS applications reading the first 1KB of memory - this will still allow OS_GetEnv and friends to work, but it will crash the ShareFS filer. Using Prot1K allowed me to immediately spot some long-standing low memory accesses in Unixlib and some of my other code. I encourage you to try it on your own programs.
The RISC OS module model. RISC OS modules are very powerful; they run with the same status as the rest of the operating system (which is mostly built from modules itself), which means that any badly written module has free rein to trash whatever it likes. On top of that, the module area, where modules themselves live and from which their workspace is allocated, is readily accessible from RISC OS applications. This has advantages - it means that modules can directly pass back references to buffers they have, avoiding a copying mechanism. This design accounts for much of the responsiveness of RISC OS on relatively low specification hardware, but is clearly also wide open to abuse by misbehaving applications or other modules.
There's not a great deal that can be done about this - changing the system would require huge changes in RISC OS and many applications. One mitigating suggestion that has been mentioned is to add a module flag that could be used by new modules (or set in old modules for which new versions can be released). This flag would indicate that the module code ought to be loaded into memory that is read only to the module and the rest of the OS, and completely inaccessible to RISC OS applications. Buffers can still be passed back to programs via module workspace, which many modules claim anyway, and is readable by everyone.
One of the reasons that modules are so pervasive in RISC OS is that there are a number of OS facilities that simply can't be accessed by regular programs. To get access to certain data or be informed of certain types of events, you need to have a module. Clearly there is room here for improvement in RISC OS to provide new APIs to access such services, and reduce the need for modules.
Garbage in, Garbage out. Passing nonsense to a SWI has traditionally been an excellent way to take out the system. Often the reason for this is really the same as the above - the ability of modules to access anywhere in memory. For speed, or perhaps just laziness, parts of RISC OS don't do as much checking as they should, and likewise many third party modules and applications.
This is a situation which can be, and has been, improved. Many of the improvements in Select, although often seemingly minor, have been to considerably improve the robustness of RISC OS, part of which is to include such checking. This is why Adjust/Select is generally considered much more stable than previous versions.
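To give a flavour of the kind of checking meant here, the following is a minimal C sketch of a routine that validates its arguments before touching the caller's buffer. It is not a real SWI handler: the function read_name, the status codes and the string it returns are all invented for the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes for the example. */
typedef enum {
    CALL_OK = 0,
    ERR_BAD_POINTER,
    ERR_BUFFER_TOO_SMALL
} call_status;

/* Copies a fixed name into the caller's buffer, but only after checking
   that the buffer exists and is large enough. If 'needed' is not NULL it
   always receives the required size, including the terminator. */
call_status read_name(char *buffer, size_t buffer_len, size_t *needed)
{
    static const char name[] = "ExampleModule";

    if (needed != NULL)
        *needed = sizeof name;

    if (buffer == NULL)
        return ERR_BAD_POINTER;      /* refuse rather than write to zero */
    if (buffer_len < sizeof name)
        return ERR_BUFFER_TOO_SMALL; /* refuse rather than overrun */

    memcpy(buffer, name, sizeof name);
    return CALL_OK;
}
```

The checks cost a couple of comparisons per call, which is the speed trade-off mentioned above. Note that a test against NULL only catches the most common rubbish value; it cannot tell whether an arbitrary non-null pointer is valid.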
As for avoiding passing rubbish around - either within an application itself, or to parts of the OS, there are freely available tools to help C/C++ programmers.
One of these is Fortify, which provides wrappers for the memory allocation facilities used in C - the calls to malloc, free, et al. Fortify is not written for RISC OS, but works perfectly well with it, and has been invaluable in spotting buffer overruns, naughty use of memory, and other bad behaviour. Providing very similar facilities, but arguably better tuned for RISC OS, is MalCheck.
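The general idea behind such wrappers can be sketched in a few lines of C. This is not Fortify's or MalCheck's actual code or API; the names chk_malloc, chk_check and chk_free are hypothetical, and the sketch only detects writes past the end of a block, whereas the real tools check a good deal more.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_SIZE 8      /* bytes of padding placed after each block */
#define GUARD_BYTE 0xA5   /* pattern that an overrun will disturb */

/* Bookkeeping stored in front of the caller's data. The union pads the
   header so the pointer handed back keeps malloc's alignment (C11). */
typedef union {
    size_t size;
    max_align_t align;
} chk_header;

/* Allocates 'size' usable bytes followed by a guard region. */
void *chk_malloc(size_t size)
{
    unsigned char *raw = malloc(sizeof(chk_header) + size + GUARD_SIZE);
    if (raw == NULL)
        return NULL;

    ((chk_header *)raw)->size = size;
    unsigned char *user = raw + sizeof(chk_header);
    memset(user + size, GUARD_BYTE, GUARD_SIZE);
    return user;
}

/* Returns 1 if the guard after the block is intact, 0 if it was overwritten. */
int chk_check(const void *ptr)
{
    const unsigned char *user = ptr;
    const chk_header *hdr = (const chk_header *)(user - sizeof(chk_header));

    for (size_t i = 0; i < GUARD_SIZE; i++) {
        if (user[hdr->size + i] != GUARD_BYTE)
            return 0;
    }
    return 1;
}

/* Checks the guard, releases the block, and reports what the check found. */
int chk_free(void *ptr)
{
    if (ptr == NULL)
        return 1;
    int intact = chk_check(ptr);
    free((unsigned char *)ptr - sizeof(chk_header));
    return intact;
}
```

A checking library would typically ship a header that redirects malloc and free to wrappers like these, so that existing code picks them up without being edited.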
These tools can prove to be very important, but one of the down sides in using them is that you need to recompile your entire program, and make sure that the memory check header is used in all files where allocation functions are used. If libraries the program uses expect your program to free memory they've allocated or similar, then these libraries will need to be compiled with the headers too.
Recognising this issue, I've made some changes recently to GCCSDK. This makes use of an obscure linker feature whereby the names of symbols are changed when the program executable is generated, so that the memory allocation functions used in all parts of the program can be redirected via the memory checking functions if a single option is specified. This means that only the single step of linking is required to generate a binary either with or without memory checking (which naturally degrades performance), and also means that there's no penalty when it's turned off. You can read more about this in my post to the GCCSDK mailing list. This has already proved to be valuable in finding issues that would otherwise be very hard to spot.
There's perhaps quite a bit more that could be said, but I think I've covered the main issues, and you'll understand my glossing over some points in the name of brevity. But now you know; the next time someone asks for memory protection in RISC OS, point them at this article.
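For readers unfamiliar with link-time symbol redirection, here is a minimal sketch of one mechanism that behaves this way: the --wrap option of GNU ld. Whether this is the exact feature GCCSDK uses is an assumption on my part, and the USE_LD_WRAP macro and the call counter are invented for the example. With --wrap=malloc, every call to malloc in every linked object is resolved to __wrap_malloc, and __real_malloc refers to the original function.

```c
#include <stddef.h>
#include <stdlib.h>

/* Assumed build line for the redirected version (GNU ld):
     gcc -DUSE_LD_WRAP app.c wrap.c -Wl,--wrap=malloc
   Without the option nothing is redirected and malloc is used directly. */
#ifdef USE_LD_WRAP
void *__real_malloc(size_t size);   /* resolved by the linker to malloc */
#else
#define __real_malloc malloc        /* fallback so the sketch builds alone */
#endif

static unsigned long alloc_calls;   /* hypothetical bookkeeping */

/* Receives every malloc call once the linker option is in effect.
   A real checker would add guard regions or logging at this point. */
void *__wrap_malloc(size_t size)
{
    alloc_calls++;
    return __real_malloc(size);
}

/* Reports how many allocations went through the wrapper. */
unsigned long wrapped_alloc_count(void)
{
    return alloc_calls;
}
```

Because the redirection happens in the linker, the object files and libraries themselves are identical in both builds, which is why nothing needs recompiling.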