I've been poking around inside the WD Live using ssh. It's a pretty vanilla Linux system with almost everything installed, though unfortunately not tcpdump.
What I see is that when it's going very slowly, the problem is competing reads on the drive.
Some of these reads appear to be coming in over the AppleTalk connection that Time Machine uses. I suspect the problem is that Time Machine is forcing the lower software layers to do read-modify-write operations, which are dead slow.
To understand the problem, here's how Time Machine works with a NAS.
If you back up to a share, it creates a directory hierarchy in it called {yourhostname}.sparsebundle.
This set of files is in a known format and can be mounted as a group as a DISK IMAGE.
This image type is special because it is made out of thousands of small files in the "bands" subdirectory. I believe the reason for this is that if you do an incremental backup of your share, hopefully only some of those bands will have changed and the backup will finish quickly. (There may also be a parallelism reason if you use all Apple hardware, but I'm speculating here.)
It also means that Apple can format the image regardless of what filesystem the NAS itself uses. They need a specific HFS+ format level before Time Machine can use it.
These bands can be thought of like "block numbers" on a regular disk. Each band (on my machine) is 8388608 bytes long, i.e. 16384 (2^14) 512-byte blocks.
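To make the band arithmetic concrete, here's a small sketch (function name and the hex naming of band files are my own assumptions about the bands/ directory; the sizes are the ones from my machine) mapping a 512-byte block number to the band file that holds it:

```python
BAND_SIZE = 8388608                          # bytes per band on my machine
BLOCK_SIZE = 512                             # classic disk block
BLOCKS_PER_BAND = BAND_SIZE // BLOCK_SIZE    # 16384 == 2**14

def band_for_block(block_num):
    """Map a 512-byte 'disk' block number to (band index, byte offset within that band)."""
    band = block_num // BLOCKS_PER_BAND
    offset = (block_num % BLOCKS_PER_BAND) * BLOCK_SIZE
    return band, offset

# Band files appear to be named in hex inside the bands/ subdirectory.
band, offset = band_for_block(100000)
print(f"bands/{band:x} at offset {offset}")   # bands/6 at offset 868352
```

So touching a single 512-byte block drags in whichever 8MB band file it lands in.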
The trouble seems to be that in some situations, the Time Machine software (or maybe the disk image mounting layer) gets into a mode where it wants to read each band before it writes it, so it is reading data as well as writing it.
I'm guessing that in some cases the entire band is not written at once, so in order to add the new data it has to read in the entire band, then write it back out again with the new data inserted.
Obviously, if writing 256K (watching the TCP stream with tcpdump on the Apple end suggests it does at least some operations in 256K chunks, I think due to the TCP window size) requires you to read 8MB and then write 8MB, that is not very efficient.
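Back-of-the-envelope arithmetic on that write amplification (the worst-case assumption, and it is only my assumption, is that the whole band gets read and rewritten for every 256K write):

```python
BAND_SIZE = 8 * 1024 * 1024        # 8MB band file
WRITE_CHUNK = 256 * 1024           # 256K chunks seen on the wire

def bytes_moved_per_write(band_size=BAND_SIZE):
    """Worst-case read-modify-write: read the whole band, then write the whole band back."""
    return 2 * band_size

amplification = bytes_moved_per_write() / WRITE_CHUNK
print(f"{bytes_moved_per_write()} bytes moved for a {WRITE_CHUNK}-byte write "
      f"({amplification:.0f}x amplification)")   # 64x
```

64 times the disk traffic the write itself needed, which would explain the crawl.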
This doesn't all quite ring true yet, but I think the theory is heading in the right direction. The MBlive has 256MB of RAM in it according to /proc/meminfo, so it can cache a few of these 8MB bands, but not enough of them.
It seems to be thrashing, reading them into and out of memory. Maybe memory contention is also a problem. That may explain why a reboot sometimes helps; it might point to a memory leak as well, I guess.
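Rough arithmetic on the caching, assuming (purely a guess on my part) that about half of the 256MB is actually free for page cache after the OS and its daemons take their share:

```python
RAM = 256 * 1024 * 1024            # total RAM per /proc/meminfo
BAND_SIZE = 8 * 1024 * 1024        # 8MB band file

usable = RAM // 2                  # guess: half of RAM left over for page cache
bands_cached = usable // BAND_SIZE
print(f"roughly {bands_cached} bands fit in cache")   # roughly 16 bands
```

Sixteen-ish bands is nothing against a backup image spanning thousands of them, so any non-local write pattern would evict bands constantly, i.e. thrash.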
In addition, I also kill the mediascanner process that sits there reading blocks and contending for the disk. That helps a bit.