It involves sending a parameterised command to a drawing engine, which then does the drawing all by itself. The podule bus is only involved in transferring the command (which is, say, 8 words long). In other words, the drawing speed is dominated by the memory bandwidth available to the drawing engine (e.g. 1.2 GB/sec in the Radeon's case), not by the bandwidth of the podule bus (8 MB/sec with DMA).
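To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It uses the figures quoted above (8 MB/sec bus, 1.2 GB/sec engine, 8-word command) and assumes 4-byte words, MB = 10^6 bytes, and that a fill costs one write per pixel. The function names are made up for illustration, and real hardware will not hit these peak rates, so treat the result as an idealised estimate only.

```python
# Idealised timing model: figures from the post, everything else assumed.
BUS_BYTES_PER_SEC = 8e6        # podule bus with DMA (8 MB/sec)
ENGINE_BYTES_PER_SEC = 1.2e9   # drawing engine memory bandwidth (1.2 GB/sec)
COMMAND_BYTES = 8 * 4          # 8-word command, assuming 4-byte words


def unaccelerated_seconds(width, height, bytes_per_pixel):
    """Time if every pixel has to cross the podule bus."""
    return width * height * bytes_per_pixel / BUS_BYTES_PER_SEC


def accelerated_seconds(width, height, bytes_per_pixel):
    """Time if only the command crosses the bus and the engine does the fill."""
    command_time = COMMAND_BYTES / BUS_BYTES_PER_SEC
    draw_time = width * height * bytes_per_pixel / ENGINE_BYTES_PER_SEC
    return command_time + draw_time


if __name__ == "__main__":
    slow = unaccelerated_seconds(1024, 768, 4)
    fast = accelerated_seconds(1024, 768, 4)
    print(f"over the bus: {slow * 1000:.1f} ms")
    print(f"accelerated:  {fast * 1000:.2f} ms")
    print(f"ratio:        {slow / fast:.0f}x")
```

For a full-screen 1024x768 fill at 32 bpp, the command transfer takes a few microseconds, so under these assumptions the total is essentially the engine's own fill time.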
This is true for all the drawing operations that benefit most from acceleration, i.e. those where the drawing engine draws a lot of pixels, like rectangle copy (used for moving windows around the desktop) and rectangle fill (used for clearing window and menu backgrounds). Since ViewFinder also caches sprites in the video card's memory, it even applies to sprite plotting (if the sprite data were not cached, it would have to cross the podule bus on every draw).
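The effect of the sprite cache on bus traffic can be sketched the same way. This is a toy model, not ViewFinder's actual code: the class and function names are hypothetical, the cache never evicts anything, and each plot is assumed to cost one 8-word (32-byte) command.

```python
# Toy model of bus traffic for repeated sprite plots (all names hypothetical).
COMMAND_BYTES = 8 * 4  # 8-word plot command, assuming 4-byte words


class CachedSpritePlotter:
    """Counts bytes crossing the bus when sprites are cached on the card."""

    def __init__(self):
        self.cached = set()   # names of sprites already in video memory
        self.bus_bytes = 0    # running total of bytes sent over the bus

    def plot(self, name, sprite_bytes):
        if name not in self.cached:
            # First use: the sprite data is uploaded once.
            self.bus_bytes += sprite_bytes
            self.cached.add(name)
        # Every plot still needs a command telling the engine what to draw.
        self.bus_bytes += COMMAND_BYTES


def uncached_bus_bytes(sprite_bytes, plots):
    """Bytes crossing the bus if the sprite data is sent on every plot."""
    return sprite_bytes * plots


if __name__ == "__main__":
    plotter = CachedSpritePlotter()
    for _ in range(100):
        plotter.plot("icon", 4096)
    print("cached:  ", plotter.bus_bytes)
    print("uncached:", uncached_bus_bytes(4096, 100))
```

With a 4096-byte sprite plotted 100 times, the cached path moves 7296 bytes over the bus in this model, against 409600 bytes when the data is resent each time.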