For tgz, what we found is that the slowness comes mostly from the gz (compression) part, and most cloud VPSes are not very fast at this. The whole tgz pipeline is also, by its nature, single core.
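To get a rough feel for how fast a given VPS can gzip on one core, something like the sketch below can help. It is not what the backup code does internally. It only times Python's standard `gzip.compress` on a buffer, and the function name `gzip_throughput_mb_s` is made up for this example. Random bytes are used when no file is given, which is a worst case for compression, so real backup data will behave differently.

```python
import gzip
import os
import sys
import time


def gzip_throughput_mb_s(data, level=6):
    """Compress data once on the calling thread and return MiB/s of input consumed."""
    start = time.perf_counter()
    gzip.compress(data, compresslevel=level)
    elapsed = time.perf_counter() - start
    return len(data) / (1024 * 1024) / elapsed


if __name__ == "__main__":
    if len(sys.argv) > 1:
        # Use a sample file from your own data for a more realistic number.
        with open(sys.argv[1], "rb") as f:
            sample = f.read()
    else:
        # Assumption: 8 MiB of random bytes as a stand-in sample.
        sample = os.urandom(8 * 1024 * 1024)
    print(f"{gzip_throughput_mb_s(sample):.1f} MiB/s on a single core")
```

If the number is well below what the network or disk can sustain, the gz step is the likely bottleneck.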
For rsync, parallelism and buffer size were indeed a constraint, but both are now configurable in 5.5 and 5.6. Note that the concurrency here also depends a lot on the storage backend. For example, DO can handle only 20 at a time, but S3 can handle 1000s at a time. One has to experiment with the values a bit to find the right number. Unfortunately, most S3 connectors don't publish ideal sizes and concurrency.
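As a rough illustration of that kind of experiment, here is a minimal sketch that times the same batch of uploads at several concurrency levels. Everything in it is an assumption for the sake of the example: `fake_upload` is a placeholder that just sleeps, and `time_uploads` and `sweep` are hypothetical helpers, not part of any backup tool. To measure a real backend, swap `fake_upload` for your own upload call.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def time_uploads(upload, items, concurrency):
    """Call upload(item) for every item using `concurrency` threads; return elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # list() drains the iterator so any exception raised by upload() surfaces here.
        list(pool.map(upload, items))
    return time.perf_counter() - start


def sweep(upload, items, candidates):
    """Time the same batch at each candidate concurrency level."""
    items = list(items)
    return {c: time_uploads(upload, items, c) for c in candidates}


def fake_upload(item):
    # Placeholder for a real object-store upload (assumption: ~10 ms per request).
    time.sleep(0.01)


if __name__ == "__main__":
    results = sweep(fake_upload, range(200), [1, 5, 10, 20, 50])
    for concurrency, seconds in results.items():
        print(f"concurrency={concurrency:>3}  {seconds:.2f}s")
```

Against a real backend, the elapsed time usually stops improving (or errors start appearing) past some concurrency level, and that is roughly the value to configure.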