sftp: break large writes into multiple requests
ClosedPublic

Authored by sitter on May 11 2020, 2:04 PM.

Details

Summary

servers have arbitrary limits that we should stay below. to ensure this
happens use MAX_XFER_BUF_SIZE as maximum size per request. if we read
data large than that, break it apart into multiple requests

(I am mostly guessing here, the rfc doesn't seem to impose any size
constraint on the write requests themselves, so probably packet
constraints apply by default)

BUG: 404890
FIXED-IN: 20.04.2

Test Plan
  • fallocate -l 128M file
  • sftp to localhost
  • copy file from one dir to another
  • compare checksums match
  • open xls file (via kio-fuse)
  • change
  • save
  • close
  • open again
  • changes are still there

Diff Detail

Repository
R320 KIO Extras
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
sitter created this revision.May 11 2020, 2:04 PM
Restricted Application added projects: Dolphin, Frameworks. · View Herald TranscriptMay 11 2020, 2:04 PM
Restricted Application added subscribers: kfm-devel, kde-frameworks-devel. · View Herald Transcript
sitter requested review of this revision.May 11 2020, 2:04 PM
feverfew added a subscriber: feverfew.EditedMay 11 2020, 4:18 PM

Seems like something similar should also occur in FileJob::write?

Yeah the RFC is weird, seems like we have a guarantee (i.e. use of the word "should") so SFTP servers should be happy with 32kb at a minimum, hence it'll be safe to have that as our maximum.

sftp/kio_sftp.cpp
1841–1842

AFAICT the size of the buffer never changes so this will easily cause a buffer overrun if I'm not mistaken?

Say for example you have a buffer with buffer.size() == MAX_XFER_BUF_SIZE + 1. Then on the second iteration of the while loop (assuming bytesWritten == MAX_XFER_BUF_SIZE) you'll do a sftp_write() pointing to a char buffer of size 1, but which incorrectly states that the size is MAX_XFER_BUF_SIZE.

meven requested changes to this revision.May 12 2020, 8:03 AM
meven added a subscriber: meven.
meven added inline comments.
sftp/kio_sftp.cpp
58

Why not change it now to 32 * 1024 then ?
I guess you tested this value works at least with openssh.

I guess the best solution would be to ask/figure out the server buffer size.

What does gvfs, or other libs ?

1841

const auto length = qMin<int>(MAX_XFER_BUF_SIZE, buffer.size() - offset);

This revision now requires changes to proceed.May 12 2020, 8:03 AM
meven added inline comments.May 12 2020, 8:09 AM
sftp/kio_sftp.cpp
1841–1842

Maybe we can deduce the server buffer size based on the bytesWritten value : at init serv_buffer_size =MAX_XFER_BUF_SIZE; and then if (length > bytesWritten) { serv_buffer_size = bytesWritten } and use serv_buffer_size instead of MAX_XFER_BUF_SIZE in the loop.

feverfew added inline comments.May 12 2020, 10:44 AM
sftp/kio_sftp.cpp
1841–1842

I'm not sure if that would help. MAX_XFER_BUF_SIZE would be the upper bound of this approach, and if the server buffer size is less I imagine we'll crash anyway (as detailed in the bug report). If we simply write less then yes, we can use this to "calibrate" but seeming as it crashes instead then unfortunately I don't think this will work.

sitter added inline comments.May 12 2020, 11:47 AM
sftp/kio_sftp.cpp
58

The reason I am not changing it is because it is out of scope for the bug fix and requires research. I'm also kinda leaning towards keeping it until there's a bug report for a server that doesn't support it. The way the RFC is written reads incredibly open ended to the point where there may be no maximum or it may be even smaller than 32kb.

1841–1842

What Alex said.

I guess we could try to infer if a write failed because of buffer size but it seems a waste of time in the grand scheme of things (and also has lots of pits to fall into since we'd have to hook onto a callback and then carry out a mid-flight session reset).

In the long run we aren't loosing much by going with a static size and request queuing like we do for read() already. From what I have seen in the past the small requests of sftp become largely irrelevant with queuing and efficient (threaded) encryption.

meven added a comment.May 13 2020, 6:13 AM

Well apart from the length of data needed to be reduced as we progress, all this makes sense.

Btw gvfs uses 32768 as MAX_BUFFER_SIZE for sftp https://gitlab.gnome.org/GNOME/gvfs/-/blob/master/daemon/gvfsbackendsftp.c#L94

sitter updated this revision to Diff 82723.May 13 2020, 11:37 AM
  • rejigger write segmentation into new sftpWrite function used by both the put() and write()
  • fix length calculation
  • refine buf size comment
sitter edited the test plan for this revision. (Show Details)May 13 2020, 11:38 AM
meven accepted this revision.May 13 2020, 1:38 PM
This revision is now accepted and ready to land.May 13 2020, 1:38 PM
feverfew accepted this revision.May 13 2020, 2:46 PM

I imagine something similar should be done for FileJob::write?

ngraham accepted this revision.May 13 2020, 5:19 PM

Nice work.

I imagine something similar should be done for FileJob::write?

Yeah.

meven added a comment.May 13 2020, 6:08 PM

Nice work.

I imagine something similar should be done for FileJob::write?

Yeah.

I guess you meant FileProtocol::write.
There is no need there, it uses QIODevice::write directly.

Nice work.

I imagine something similar should be done for FileJob::write?

Yeah.

I guess you meant FileProtocol::write.
There is no need there, it uses QIODevice::write directly.

Sorry, I meant kio_sftp's implementation of this: https://api.kde.org/frameworks/kio/html/classKIO_1_1FileJob.html#a481871536fb9471ccb64929792f31165

meven added a comment.May 14 2020, 7:45 AM

Nice work.

I imagine something similar should be done for FileJob::write?

Yeah.

I guess you meant FileProtocol::write.
There is no need there, it uses QIODevice::write directly.

Sorry, I meant kio_sftp's implementation of this: https://api.kde.org/frameworks/kio/html/classKIO_1_1FileJob.html#a481871536fb9471ccb64929792f31165

I believe this is taken care of this patch in SFTPInternal::write

This revision was automatically updated to reflect the committed changes.