-
I'm trying to use an SSHFS volume and noticing some speed issues.
With SSHFS, I can write to the remote server at 50 MB/s, but I can only read at 7 MB/s.
When using SCP with the same remote server, I can read and write at 50 MB/s.
The network link, local disk, and remote disk speeds all exceed 50 MB/s.
Has anyone else experienced this, and possibly have a fix?
-
Hello @djxx
sshfs and scp can't be compared the way you do. scp is just a tool, aka secure copy (remote file copy program), while sshfs is a filesystem client based on ssh. scp can only copy and read data, while sshfs provides a whole filesystem with permissions and so on.
Still, there might be some parameters in Cloudron's SSHFS setup that could be improved.
Could you disclose how you did your testing, and also which provider you are using for SSHFS? -
@james Thanks for your response. Yes, I'm aware the tools are different, and that sshfs will have more overhead. But the scp speed shows a maximum throughput of 50 MB/s, and I cannot believe that a properly configured sshfs connection has an 85% performance hit vs scp.
The remote server I'm mounting through SSHFS is a server in my LAN and my cloudron server is with Hetzner. My home connection is gigabit fiber.
I did my testing by mounting via SSHFS and just seeing how quickly I could read/write with this:
dd if=/dev/zero of=tempfile bs=1M count=100 conv=fdatasync
dd if=tempfile of=/dev/null bs=1M count=100
For my SCP test, I just copied the same file both directions.
I tried different sizes and counts for dd; the writing (of=tempfile) is always 50 MB/s, and the reading (if=tempfile) is always 7 MB/s or less. -
A drop in speed like that seems quite huge indeed. Overall though, sshfs on Linux is implemented via a userspace filesystem (FUSE) and can't really be compared with scp. The read performance being slower is especially curious though; I would have expected the reverse.
In case you find any sshfs/fuse mount options which improve the performance, the current mountpoint systemd config looks like:
... Options=allow_other,port=23,IdentityFile=/home/yellowtent/platformdata/sshfs/id_rsa...,StrictHostKeyChecking=no,reconnect Type=fuse.sshfs
So basically Cloudron sets up the mountpoint with all defaults, so maybe there is some room for improvement. I will run some tests on my sshfs volume in a bit.
-
I'm curiously following this post, because my backups use sshfs/rsync from a NetCup server to a Hetzner Storagebox, and I've noticed that backups are really, really fast but restores take hours or days!
-
So I ran a couple of tests now, and I get pretty consistent speeds from my home server to a Hetzner SSH storage box. Both scp and sshfs perform pretty much the same:
nebulon@saturn:/mnt/cloudronbackup/test$ dd if=/dev/zero of=tempfile bs=1M count=100 conv=fdatasync
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 20.1171 s, 5.2 MB/s
nebulon@saturn:/mnt/cloudronbackup/test$ dd if=tempfile of=/dev/null bs=1M count=100
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 10.0484 s, 10.4 MB/s
....
scp -i /home/yellowtent/platformdata/sshfs/id_rsa_... -P 23 tempfile u193526@u193526.your-storagebox.de:./test/
tempfile                         100%  100MB   4.9MB/s   00:20
So I can't say I can reproduce those results.
Regarding the Cloudron backup: once a snapshot has been uploaded, the rotation and uploading of only the changed data is very fast. The backup rotation is special-cased for sshfs to run the filesystem operations that create hardlinks directly on the storage server, which makes it fast. During a restore, all the data has to be downloaded, which will of course take much longer than just uploading the changed data every day.
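In rough terms, the rotation idea looks something like this (a simplified illustration only, not the actual backup code; host and paths are placeholders):
# Simplified sketch: the new snapshot starts as hardlinks of the previous one, created directly
# on the storage server, so only cheap metadata operations cross the network.
ssh -p 23 user@storage.example.com "cp -al backups/snapshot-prev backups/snapshot-new"
# Afterwards, only files that actually changed need to be written into snapshot-new.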
-
As a data point, I was also using a Hetzner storage box previously: 10 MB/s writes, 1.5 MB/s reads. The speeds used to be faster, but I moved my server to the US and storage boxes aren't offered in the US. Moving the data to my own server gave me roughly a 5x write speedup and a ~5x read speedup, but I still think that a 7 MB/s read vs a 50 MB/s write points to some configuration issue.
Since SSHFS is barely maintained, are there other volume options coming? From what I've read, CIFS could work if we override the user ID at the mount configuration level.
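For example, something along these lines is what I mean by overriding the user ID (an untested sketch; the share, credentials file, and user/group are placeholders):
# Hypothetical /etc/fstab entry: force everything on the CIFS share to appear owned by the given user
//storage.example.com/backup  /mnt/cloudronbackup  cifs  credentials=/root/.smbcredentials,uid=yellowtent,gid=yellowtent,_netdev  0  0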
-
Another datapoint. I tried using sshfs on my lan to the data server, and I got 112 MB/s write and 117 MB/s read - both of which are right at the theoretical limit of the gigabit connection. Not to mention it's taunting me with the read speed actually being faster than the write speed.
I decided to do another test with my laptop <-> hetzner server. So now we're transferring between the SSD on the VPS and the SSD on my laptop. The speeds are the same:
writing from laptop to VPS: 50 MB/s
reading from VPS to laptop: 7 MB/s
I checked and both my laptop and Cloudron are using the same versions of SSHFS and FUSE:
SSHFS version 3.7.3
FUSE library version 3.14.0
using FUSE kernel interface version 7.31
fusermount3 version: 3.14.0
To get the best picture possible of the traffic, I put a fast watch on the connection:
watch -n 0 ss -t state established dst <server_ip>/24
With this, I can see that the write sizes are ~10x bigger than the read sizes. I'm wondering if this is why the performance difference only shows up in WAN situations: 10x the round trips hurts a lot more on the WAN than on the LAN. And for those of us with storage boxes in Europe and servers outside of Europe, 10x the round trips really hurts.
I finally stumbled across this article (https://www.admin-magazine.com/HPC/Articles/Sharing-Data-with-SSHFS) which does some pretty detailed performance testing and tuning with SSHFS.
The options they suggest for the sshfs mount didn't help much. I did notice that using direct_io can sometimes make the read speeds go up to 20 MB/s, but it's not reliable.
I'm wondering if we're now into the realm of TCP configuration (the article's option #2), i.e. increasing the TCP buffer sizes. This would be a server-wide change, and it's out of my depth. What are your thoughts, @nebulon?
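For reference, the kind of server-wide TCP tuning the article talks about would look roughly like this (illustrative values only; I haven't tested whether they actually help here):
# Raise the maximum TCP buffer sizes (min default max, in bytes) - illustrative values
sudo sysctl -w net.core.rmem_max=16777216
sudo sysctl -w net.core.wmem_max=16777216
sudo sysctl -w net.ipv4.tcp_rmem="4096 131072 16777216"
sudo sysctl -w net.ipv4.tcp_wmem="4096 131072 16777216"
# To persist across reboots, the same keys would go into a file under /etc/sysctl.d/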
-
To me this points to the kernel buffering writes much more than reads, and the reads may have additional checks that slow them down further. That would also explain SCP file transfers being fast, since scp just streams data of a known size down the pipe.
The following links may provide more hints:
- https://stackoverflow.com/questions/75574954/read-write-syscall-in-fuse
- https://www.reddit.com/r/linuxquestions/comments/jmtn0k/can_you_help_me_understand_sshscpsshfs_better/
- https://superuser.com/questions/344255/faster-way-to-mount-a-remote-file-system-than-sshfs
And here are some insights on FUSE from someone who built their own FS on FUSE:
https://blog.gistre.epita.fr/posts/angelina.kuntz-2024-09-22-how_to_implement_our_own_file_system/ -
@robi - I wonder if it still does any buffering when writing to /dev/null? Since that's what the dd read command above does. In any case, this suggestion caused me to revisit the direct_io option. It says it disables the kernel paging cache, which does seem to give the most consistent performance improvement.
Yet Another Data Point - I did a lot more testing today, and I think I'm as far as I can go. The good news: I can consistently get 16-25 MB/s read speeds.
TL;DR: using this command gives me the best read performance (2x-3x improvement):
nice -n -10 sshfs -s -o direct_io,compression=no
Why I'm using these options:
direct_io
direct_io disables caching, and had quite an interesting effect on reads. Using the -f -d options I was able to watch the packets going through. I was wrong before about the writes being bigger than the reads; they're not. But the writes are being done more in parallel than the reads.
Before direct_io:
[01315] READ
[01308] DATA 32781bytes (31ms)
[01309] DATA 32781bytes (31ms)
[01310] DATA 32781bytes (31ms)
[01311] DATA 32781bytes (31ms)
[01316] READ
[01317] READ
[01318] READ
[01319] READ
[01312] DATA 32781bytes (31ms)
[01313] DATA 32781bytes (31ms)
[01314] DATA 32781bytes (31ms)
[01315] DATA 32781bytes (31ms)
READ requests 4 chunks at a time, waits for them, and then requests 4 more.
[05895] WRITE
[05827] STATUS 28bytes (34ms)
[05828] STATUS 28bytes (34ms)
[05829] STATUS 28bytes (35ms)
[05830] STATUS 28bytes (35ms)
[05831] STATUS 28bytes (35ms)
[05832] STATUS 28bytes (34ms)
[05833] STATUS 28bytes (34ms)
[05834] STATUS 28bytes (34ms)
[05835] STATUS 28bytes (34ms)
[05896] WRITE
[05897] WRITE
WRITE requests at least 60 chunks at a time, and sometimes I saw over 100 chunks pending.
After turning on direct_io, the reads look more like the writes:
[06342] READ
[06343] READ
[06344] READ
[06313] DATA 32781bytes (31ms)
[06314] DATA 32781bytes (31ms)
[06315] DATA 32781bytes (31ms)
[06316] DATA 32781bytes (31ms)
[06317] DATA 32781bytes (32ms)
[06318] DATA 32781bytes (32ms)
[06319] DATA 32781bytes (32ms)
[06320] DATA 32781bytes (32ms)
[06321] DATA 32781bytes (33ms)
[06322] DATA 32781bytes (35ms)
[06323] DATA 32781bytes (35ms)
[06324] DATA 32781bytes (36ms)
[06325] DATA 32781bytes (36ms)
[06326] DATA 32781bytes (36ms)
[06327] DATA 32781bytes (37ms)
Note the difference in the chunk IDs and you can see it's allowing at most 31 chunks to be pending before requesting more.
I think this is the primary reason for the speed increase.
-s for single threading
I noticed that running it on a single thread made the degradation of repeated file reads less pronounced. Instead of dropping back to 8 MB/s after a few reads, it does 25 MB/s for at least 5-6 reads (500-600 MB) before dropping down to 16 MB/s. Also, it recovers back to 25 MB/s over time, whereas with multi-threading I needed to restart the SSHFS connection to get 25 MB/s speeds again.
nice
Since there seems to be an element of CPU bottleneck (as suggested by the single-threaded results), I also wanted to give this process priority. It seems to help the session get more 25 MB/s reads before slowing down.
compression=no
Because we're now on one thread and hogging lots of CPU time, I disabled compression. I didn't notice a difference in throughput with it on, but turning it off helps reduce CPU load.
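Putting it together, the full mount command I'm testing with looks roughly like this (host and mount point are placeholders; I've kept the reconnect/port options from the Cloudron config):
# Single-threaded sshfs mount with caching and compression disabled, run at higher priority
nice -n -10 sshfs -s -o direct_io,compression=no,reconnect,port=23 \
    user@storage.example.com:/home /mnt/data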
Next Steps:
I will run this test a few more times, and probably even adjust my mount for the volume manually to see if it helps performance.
There is definitely some element of throttling / filling up, because repeated reads in the same session can get slower, and starting a new session can help the speed go back up. I'm not sure if this is on the client side or the server side. Any insights would be greatly appreciated.
Even though I wish there was a clearer answer, I'll be happy if the 2x boost to read speed works.
P.S. - I even tried a "high performance SSH" binary, hpnssh, and it did not make a noticeable difference in my tests. -
-
@djxx said in SSHFS read speed significantly slower than scp with the same target:
I wonder if it still does any buffering when writing to /dev/null ?
If it's going through FUSE, it certainly does. It likely also does going through the FS for /dev/null.
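One way to take the client-side page cache out of the picture when testing (assuming the FUSE mount accepts O_DIRECT) would be something like:
# Read through the sshfs mount with O_DIRECT so the client's page cache is bypassed
dd if=/mnt/cloudronbackup/test/tempfile of=/dev/null bs=1M count=100 iflag=direct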
-
@djxx said in SSHFS read speed significantly slower than scp with the same target:
@nebulon Can you tell me how / where to edit this so it uses the options -o direct_io,compression=no? And is it safe to do so, and how long will the change persist?
Hello @djxx
You can find it in the systemd mount unit for the Cloudron backup.
systemctl status mnt-cloudronbackup.mount
● mnt-cloudronbackup.mount - cloudronbackup
     Loaded: loaded (/etc/systemd/system/mnt-cloudronbackup.mount; enabled; preset: enabled)
     Active: active (mounted) since Sat 2025-05-17 07:04:02 UTC; 2 weeks 1 day ago
      Where: /mnt/cloudronbackup
       What: REDACTED@REDACTED.your-storagebox.de:/home
      Tasks: 11 (limit: 9212)
     Memory: 9.9M (peak: 18.1M swap: 2.5M swap peak: 11.1M)
        CPU: 25min 37.199s
     CGroup: /system.slice/mnt-cloudronbackup.mount
             ├─ 905 /sbin/mount.fuse.sshfs REDACTED@REDACTED.your-storagebox.de:/home /mnt/cloudronbackup -o rw,allow_oth>
             └─9487 ssh -x -a -oClearAllForwardings=yes -oport=23 -oIdentityFile=/home/yellowtent/platformdata/sshfs/id>
Notice: journal has been rotated since unit was started, output may be incomplete.
So the file in question would be:
/etc/systemd/system/mnt-cloudronbackup.mount
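Editing it would look roughly like this (a sketch only; keep in mind Cloudron may rewrite the unit when the volume or backup configuration changes, so I would not rely on the change persisting):
# /etc/systemd/system/mnt-cloudronbackup.mount (excerpt) - extra options appended to the existing line
[Mount]
Options=allow_other,port=23,IdentityFile=/home/yellowtent/platformdata/sshfs/id_rsa...,StrictHostKeyChecking=no,reconnect,direct_io,compression=no
Type=fuse.sshfs

# Apply the change (restarting the unit unmounts and remounts the volume)
systemctl daemon-reload
systemctl restart mnt-cloudronbackup.mount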