Right, I've just run some benchmarks using nbench with the two compilers.
The result? There seems to be only a very marginal speed increase with the 3.3 compiler. No doubt there are some cases where the generated code is much better, but those are probably pretty limited.
The output from GCC for ARM is actually pretty good already, so I would be surprised by a substantial overall improvement. Having said that, optimising specifically for XScale or StrongARM may well show a marked improvement.
Moral of the story: please check your technical claims.