I've not heard the term "block multithreading" before. Under UNIX and similar systems, a process only really has one extra entry point (the signal handler); everything else is done by system calls blocking and letting another process run until the caller can be unblocked (data loaded from disc, data arriving on a socket, a file changing, etc.). There's no reason the RISC OS API couldn't be adapted to do the same; it's just that loads of software wouldn't work, because it expects to be the only thing running between Wimp_Poll calls. Now, extending the Wimp_Poll API is one solution, but old software would still block the entire system; and given that most software is old, and the likelihood of new software being written is astonishingly tiny, there's little point.
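To make the "block in a system call and let something else run" idea concrete, here's a rough POSIX sketch. The helper name `wait_for_child_message` and the one-second delay are made up for illustration; only `pipe`, `fork`, `read`, `write` and `waitpid` are real calls. The parent never polls: it sits in `read()` and the kernel is free to schedule anything else until the child delivers the data.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not from any real API): the parent blocks in read()
 * until the child writes. Returns the number of bytes received, or -1. */
ssize_t wait_for_child_message(char *buf, size_t len)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: pretend to be a slow device, then deliver the data. */
        close(fds[0]);
        sleep(1);
        const char msg[] = "data ready";
        /* sizeof msg includes the trailing NUL, so the parent gets a C string. */
        if (write(fds[1], msg, sizeof msg) < 0)
            _exit(1);
        close(fds[1]);
        _exit(0);
    }

    /* Parent: no polling loop. read() puts this process to sleep and the
     * kernel runs other processes until data arrives on the pipe. */
    close(fds[1]);
    ssize_t n = read(fds[0], buf, len);
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return n;
}
```

Calling `wait_for_child_message(buf, sizeof buf)` from `main` returns once the child has written; during the wait the parent uses no CPU time, which is exactly what a program spinning between Wimp_Poll calls can't offer.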
As for the ARM not containing multithreading hardware: it doesn't contain anything like Hyperthreading, but it has everything that any other OS uses quite effectively: a high-resolution timer, an atomic swap instruction (which was introduced explicitly for multithreading), different modes for kernel and user land, and virtual memory. You don't need anything more.
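As a rough illustration of why an atomic swap is enough to build locking on, here's a minimal spinlock sketch. It uses C11's `atomic_exchange_explicit` rather than ARM assembler, on the assumption that the compiler maps it to whatever the target provides (SWP on older ARM cores, LDREX/STREX on later ones). The `swap_lock` type and function names are invented for this example.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type for illustration: 0 = free, 1 = held. */
typedef struct {
    atomic_int held;
} swap_lock;

/* Atomically store 1 and fetch the previous value in one step.
 * If the previous value was 0, nobody held the lock and it is now ours. */
bool swap_lock_try(swap_lock *l)
{
    return atomic_exchange_explicit(&l->held, 1, memory_order_acquire) == 0;
}

/* Spin until the swap reports that the lock was free. A real kernel
 * would yield or sleep here rather than burn CPU. */
void swap_lock_acquire(swap_lock *l)
{
    while (!swap_lock_try(l)) {
        /* busy-wait */
    }
}

/* Releasing is a plain atomic store of 0. */
void swap_lock_release(swap_lock *l)
{
    atomic_store_explicit(&l->held, 0, memory_order_release);
}
```

The point is that the test and the set happen as one indivisible operation, so two threads can never both see the lock as free. Everything else (mutexes, semaphores, queues) can be layered on top of that plus a timer interrupt to force context switches.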