Remove a partially copied file if copyjob was cancelled in the middle of file copying
AbandonedPublic

Authored by dmitrio on Feb 19 2018, 4:43 PM.

Details

Reviewers
dfaure
ngraham
Group Reviewers
Frameworks
Summary

BUG: 125102
FIXED-IN: 5.46

Remove a partially copied file if copyjob was cancelled in the middle of file copying.

File is considered to be in the process of being copied after some data block is actually written (or, to be more precise, slotProcessedSize is called). This should help us avoid cleaning up by mistake files that existed before the operation. This also means that in some very rare occasions when user cancels copying in the very beginning of the operation cleaning up procedure may not work — which is, I believe, the best result we can obtain without more significant code changes.

Unlike D10635, my option does not include handling of the full disk or any other errors. In its current state it applies only to the case when job gets killed by user.

Test Plan

While I am not sure about automated testing, various manual testing scenarios can be suggested:

I.

  1. Start copying a large file(s) to empty directory
  2. Cancel copying
  3. File which copying was cancelled should not be present at the destination.

II.

  1. Start copying a large file(s) to the directory with existing files with colliding names
  2. Cancel copying (not via "File exists" dialog)
  3. Original file should be present at the destination

One may also want to test behavior in situation when it is not possible to delete the file.
For example:

  1. Start copying a file to some network folder
  2. Turn off network connection
  3. Cancel copying
  4. File deletion UI should appear in the notifications area. You should now be able to cancel deletion or wait a bit for an error message.

Diff Detail

Repository
R241 KIO
Lint
Lint Skipped
Unit
Unit Tests Skipped
dmitrio requested review of this revision.Feb 19 2018, 4:43 PM
dmitrio created this revision.
ngraham edited the summary of this revision. (Show Details)
meven added a subscriber: meven.Feb 19 2018, 5:52 PM

Add this to your commit message
FIXED-IN: 5.44

anthonyfieroni edited reviewers, added: dfaure; removed: Dolphin.Feb 19 2018, 5:54 PM
anthonyfieroni added a subscriber: anthonyfieroni.

It looks D10635 is duplicate.

src/core/copyjob.cpp
559

Why only first not all?

meven added a comment.Feb 19 2018, 6:02 PM

It was suggested on IRC to hide this behavior behind a flag such as the jobFlag enum, so that it can be opt-in/opt-out in applications.

We could consider changing the job state during the clean up, something like

d->state == STATE_CLEANING_INCOMPLETE_FILE
src/core/copyjob.cpp
559

I believe this is because files are copied in sequence, so at any given time only the first file is being copied.

Maybe you could use m_currentSrcURL to avoid accessing the iterator.

dmitrio updated this revision to Diff 27582.Feb 19 2018, 11:03 PM
dmitrio edited the summary of this revision. (Show Details)

Add a flag which turns off cleaning up on job cancel.
Use m_currentDestURL instead of obtaining the destination file name from an iterator.

dmitrio marked an inline comment as done.Feb 19 2018, 11:26 PM

Add this to your commit message
FIXED-IN: 5.44

Thank you, added it to the description.

It looks D10635 is duplicate.

In fact, this revision is a duplicate of D10635. I am a bit surprised why this one has a smaller number :)

It was suggested on IRC to hide this behavior behind a flag such as the jobFlag enum, so that it can be opt-in/opt-out in applications.

I agree with this proposal, but I can suggest that the added flag should rather turn off the new behavior, so the default behavior will be to do this cleanup on job cancel.
However, if more people think that it should not be turned on by default, it would be easy to change this diff to achieve an opposite behavior.

We could consider changing the job state during the clean up, something like

d->state == STATE_CLEANING_INCOMPLETE_FILE

For now, I see no point in this change, since, to avoid breaking a logic of doKill() method, the deletion job is currently launched not as subjob, but as a separate job. So, after doKill() execution the CopyJob does nothing anymore (as it is supposed to be) and has no need in tracking its state. Feel free to correct me if I am wrong.

src/core/copyjob.cpp
559

You are right, updated a diff.

On the question on deleting first file only, yes, we need to delete only the file that was being copied before the user canceled the job. Perhaps using m_currentDestURL would make this intention more clear.

I would prefer to have the cleanup behavior happen by default, if we end up making this optional.

It would be nice if we could also also handle the KIO::ERR_DISK_FULL case.

dmitrio planned changes to this revision.Feb 23 2018, 2:29 AM

It seems that the proposed patch in its current state does not fully prevent possible problems with move operation (did not catch any problems while testing, but looking at the code again I don't think that all is good with it). I suppose I am better to revise it again.

dfaure requested changes to this revision.Feb 25 2018, 7:13 PM

I don't like the idea of a flag for this in the API. It just moves the problem (of whether it's safe / a good idea to clean up) to the applications, who are not in a better place to decide about this. Better make it happen all the time, and better make it work right. A flag almost sounds like a excuse for a half-hearted feature ("if it works badly in case XYZ, then apps can just opt out"). I don't see why the app would care, really. It wants to copy A to B, and wants to know if it worked or not; details like cleaning up on cancel are best handled by the KIO library.

I don't see any provision for the case I mentioned, where the destination file already exists, and should therefore NOT be deleted?

I'm also missing a unittest for this. Now it should be quite easy to unittest. Copy a dir with two files, connect to copyingDone() to know when the first file was copied, then kill the job in the slot [possibly as a Queued invocation]. Dest dir should have only the first file. Then the same when connecting to copying(), i.e. before the copy of the first file. Dest dir should be empty. And then unittests with existing files at destination.

Thanks for your work on this.

Better make it happen all the time, and better make it work right. A flag almost sounds like a excuse for a half-hearted feature ("if it works badly in case XYZ, then apps can just opt out"). I don't see why the app would care, really. It wants to copy A to B, and wants to know if it worked or not; details like cleaning up on cancel are best handled by the KIO library.

+1

dmitrio abandoned this revision.Feb 26 2018, 11:50 PM

I don't see any provision for the case I mentioned, where the destination file already exists, and should therefore NOT be deleted?

In fact, this is exactly the case that I tried to deal with here. In this proposal the destination file gets deleted only if some data has been written into it, so all the previous data are already lost regardless of our cleanup.

What is not handled here is the case of moving to another partition, where subjob may launch original file deletion just after successful file copying and before emitting result, which may lead to cleanup being done when the original file can potentially be deleted. I also expect some trouble with job pause feature when user may pause this job, eventually start copying the same file with another tool (be it something like cp or another instance of CopyJob), and then abort this job. For this case we should probably add something like a check on whether size&timestamp of the destination file has changed since our last change of that file. So it seems that to work properly this feature needs some more complex solution including, probably, some separate job class dealing with cleanup and all the necessary safety checks. For now, I can only apologize for underestimating the problem and rolling out such a raw solution for it.

I am probably better to close it, if I am able to come up with better solution I will reopen this request (if it is possible) or open a new one.

"the destination file gets deleted only if some data has been written into it" ... ah I see, that's a great solution indeed, I like it.

About moving: if the file was fully copied, (but the source not yet deleted), good catch. This requires a similar solution in FileCopyJob, i.e. one layer below (with a similar "in progress" flag that is only true while the file copy is in progress, and false as soon as it's done, i.e. before the DeleteJob). I suggest starting with that, and then in CopyJob when killing a FileCopyJob we just let it do the cleanup. As you found out, CopyJob is too high-level for this.

Pausing and doing nasty things will never be safe, I don't think we need to worry about that.

I think all you have to do is go one more step down the stack :-)

@dmitrio: you can click on the "Plan Changes" action, no need to abandon this revision and create a new one.

@dmitrio, are you still planning to work on this patch, or a new version of it?

dmitrio reclaimed this revision.Mar 26 2018, 12:25 AM

Well, sorry, I did not expect you waiting for me. I had some new version of this patch, but thought about working on unit tests for it and never finished them. I'll upload now what I have to date, hope it will be useful.

Concerning the ERR_DISK_FULL case which was mentioned earlier, it is easy to add some handling for this case in the new version of patch . However, I should note that at least file ioslave for Unix-like systems also tries to remove partially copied file in case of full disk (see e.g. this). Not sure how it would interfere with more high-level treatment in FileCopyJob and whether we should keep or remove that code in case we decide to handle this case in FileCopyJob.

This revision now requires changes to proceed.Mar 26 2018, 12:25 AM
dmitrio updated this revision to Diff 30570.Mar 26 2018, 12:26 AM
ngraham edited the summary of this revision. (Show Details)Apr 17 2018, 10:11 PM

@dmitrio are you ready for review, or still working on this?

Yes, this code is ready for review.
As I described in the previous comment, there is still a question concerning handling the full disk error case: it is being handled currently in some ioslaves (at least in file_unix slave), and it needs to be decided whether such handling should be moved from the slaves code to FileCopyJob level or it should be left as is. But I would anyway ask for developers' opinion on this prior to making such changes.

ngraham requested changes to this revision.Apr 26 2018, 8:20 PM

When you cancel an overwrite operation in the middle with this patch, the destination file is destroyed.

This revision now requires changes to proceed.Apr 26 2018, 8:20 PM

Then I probably didn't understand what was proposed in the discussed bug report. When you start writing to the existing file the data that have been in the destination file before the operation become lost. If we want to be able to restore content of destination files after overwrite operation has started then we need to change the whole mechanism of file copying: we should copy source file to some temporary file, delete the destination file an rename temporary file to the destination file name. If this is what was meant by this bug report then I am probably trying to fix the wrong issue. Is it so, or what is the desired behavior in this case?

Then I probably didn't understand what was proposed in the discussed bug report. When you start writing to the existing file the data that have been in the destination file before the operation become lost. If we want to be able to restore content of destination files after overwrite operation has started then we need to change the whole mechanism of file copying: we should copy source file to some temporary file, delete the destination file an rename temporary file to the destination file name.

That's what it already does. Without your patch, canceling an overwrite is harmless because only the temporary file is deleted. In other words, the case in the bug is already handled for overwrite operations.

The case that is NOT handled is for normal move/copy operations, where the destination file is not cleaned up after the job is canceled in the middle. Essentially, we want your patch to only delete the destination file when it's not an overwrite, because overwrites already clean up properly.

Sorry, but I was not able to locate such a behavior. When cancelling an overwrite in the middle of the process I got only a partially copied file at the destination path. Tested in the latest git-stable KDE Neon, the procedure was a copying of the source file to the other folder containing the file with the same name. Perhaps you told about some other kind of an overwrite operation?

Here, let me show you.

Without your patch:

With your patch:

anthonyfieroni added inline comments.May 2 2018, 3:43 AM
src/core/filecopyjob.cpp
320 ↗(On Diff #30570)
&& metadata(QStringLiteral("overwrite")) != QLatin1String("true")

If I understand correctly what has happened, without this patch the file at destination path existed after the operation was cancelled, and it might even not be corrupted. Still I believe that it is the source file that may remain after the cancellation, but not the one existed at the destination path. Try the same test with different files (I mean, with different content) at the source and destination paths. After cancelling an overwrite operation you will have either a copy of the source file or a corrupted file if the source file was sufficiently large (more than 1 GB works in my case, but this probably depends on memory size and disk I/O speed).

I agree though that disappearing of the destination file may be confusing to user, so some special handling of this case may be useful. But currently I do not see any sign of this case being already handled by KIO, neither in the source code nor in real tests.

You're right, the destination file was indeed silently corrupted in the "before" case. Yikes, that's bad. Really, both options are bad. I can see now that your patch doesn't actually regress anything--it just exposes a dormant problem that was previously hidden but becomes very visible with the patch. That's a good sign that we ought to resolve that issue first. Perhaps for overwrites, if there's enough space on the destination filesystem, we should copy to a temp file, then if that completes, swap the target file and the temp file, and then finally delete the old target file. What do you think?

Restricted Application added a subscriber: kde-frameworks-devel. · View Herald TranscriptMay 12 2018, 2:55 AM

Well, I think it would be logical to implement something like that if we intend to make copying cancelable. But we still will have to spoil the destination file if there is no enough space left (which is not always determined correctly, in fact. One example of this is a remote sftp filesystem with a per-user quota set up on a remote machine). So the idea sounds good, but it requires careful implementation and adds a bit of inconsistency to the program behavior. But perhaps in this case it doesn't matter — a user anyway gets a warning when trying to overwrite files.

Right, I think that makes sense.

Any progress on that, dmitrio?

dmitrio abandoned this revision.Jun 28 2018, 9:02 PM

No, didn't work on that, sorry. I think I am better to close the revision.