kio_trash: Add size, modification, access and create date for trash:/
ClosedPublic

Authored by meven on Oct 19 2019, 9:07 AM.

Diff Detail

Repository
R241 KIO
Branch
arcpatch-D24773_1
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 25161
Build 25179: arc lint + arc unit
There are a very large number of changes, so older changes are hidden. Show Older Changes
src/ioslaves/trash/trashimpl.cpp
1088

Is this really needed? We just created entry.

1091

coding style: local variables never start with an uppercase.

meven marked 2 inline comments as done.Oct 20 2019, 2:00 PM
meven added inline comments.
src/ioslaves/trash/trashimpl.cpp
1088

It was done previously in void TrashProtocol::createTopLevelDirEntry to save some memory, so I just kept it here and line 500.
Although it is not very useful.

1091

TrashedFileInfoList is the type of TrashedFileInfoList, no issue with codding style.

meven marked 3 inline comments as done.Oct 20 2019, 2:07 PM
dfaure requested changes to this revision.Oct 20 2019, 3:39 PM
dfaure added inline comments.
src/ioslaves/trash/kio_trash.cpp
502

why?

src/ioslaves/trash/trashimpl.cpp
1085

This method could be const, right?

1101

This looks very slow. We have a cache for the size usage. See TrashImpl::trashSpaceInfo.

But anyway, we never return recursive directory size in the UDS_SIZE field, in no ioslave.
If someone wants to know the size of a directory, they can use the properties dialog, which has a calculate button.

1108

Why replace? This is inserting into a fresh new entry, it could use fastInsert

(same for the other calls below)

This revision now requires changes to proceed.Oct 20 2019, 3:39 PM

Works great. Not sure creation date makes sense to show for the trash (if it's available) though. Can we suppress that?

meven edited the summary of this revision. (Show Details)Oct 21 2019, 7:01 AM
meven updated this revision to Diff 68402.Oct 21 2019, 7:19 AM

Use TrashSizeCache to file directory size, do not add creation date to the UDSEntry, better use UDSEntry reserve/clear

meven marked 3 inline comments as done.Oct 21 2019, 7:27 AM
meven added inline comments.
src/ioslaves/trash/trashimpl.cpp
1085

No becauseof the call to list() that cannot be const.

1101

I missed the feature of trashSpaceInfo, thank you for pointing it out.

We need to add the UDS_SIZE field here : this is a missing feature for a special common use case : Finding out what amount of space is occupied by the trashed files ?
The trash size should be easy to find out for users. Plus we have a cache file to get this information cheaply.

UDS_SIZE may not be used that way currently but it is semantically correct nonetheless.
In most other cases directory size calculation is not used because it is too resource intensive but here we have the value pre-computed.

And by the way the trash:/ has not Properties entry in its context menu, so it is not even possible to get the trash size currently.

dfaure requested changes to this revision.Oct 21 2019, 8:21 AM

I meant in the Dolphin view. RMB / Properties and calculating the size is available for trash:/. See http://www.davidfaure.fr/2019/dolphin_properties_dialog_for_trash.png

I don't like abusing UDS_SIZE for this, because KIO is a virtual filesystem, trying to work like a normal filesystem. When you ::stat() a local directory, you get the size used by the directory entry, NOT the recursive size of all of its contents. There are ways to ask for that, by recursing, but explicitly, not as part of normal directory listing.

There is indeed a fast way to get this particular one, but I'm worried that this creates a precedent where the information panel starts expecting that all kioslaves fill UDS_SIZE with the full recursive size rather than just the size of the directory entry. And users expect to see the full recursive size there. I mean, after this goes in, the user goes to his ~/Music folder, the information panel still says Size: 4123 items, they'll report a bug that it should instead show the full disk usage of that folder, right?

I'm also concerned that now UDS_SIZE would have different meanings depending on the URL (sometimes the size of the directory entry, sometimes the full recursive size). This is how bad architecture starts, and unsolvable bugs - or ugly workarounds - tend to appear.

To fix this issue, I would much rather that the information panel uses DirectorySizeJob (possibly on demand, like the Properties dialog); that class could have a fast path for trash: (e.g. making a special() call to the ioslave). Or if you don't want a button in that panel, special-case trash, call DirectorySizeJob only for trash, implement the fast path there - and change the label in the information panel to "Disk space usage" so it's not confused with the usual Size field.

This revision now requires changes to proceed.Oct 21 2019, 8:21 AM
meven marked 2 inline comments as done.Oct 21 2019, 11:09 AM

I meant in the Dolphin view. RMB / Properties and calculating the size is available for trash:/. See http://www.davidfaure.fr/2019/dolphin_properties_dialog_for_trash.png

This works from the folder view, but not from the places context menu where I tried it before.
This is another issue.

I don't like abusing UDS_SIZE for this, because KIO is a virtual filesystem, trying to work like a normal filesystem. When you ::stat() a local directory, you get the size used by the directory entry, NOT the recursive size of all of its contents. There are ways to ask for that, by recursing, but explicitly, not as part of normal directory listing.
I'm also concerned that now UDS_SIZE would have different meanings depending on the URL (sometimes the size of the directory entry, sometimes the full recursive size). This is how bad architecture starts, and unsolvable bugs - or ugly workarounds - tend to appear.

Maybe we should add another fileld to UDSEntry like UDS_CONTENT_SIZE for this use case then to keep the two use cases clearly separate ?

There is indeed a fast way to get this particular one

The way that exists necessitates two clicks and a load ; and it is available from only a particular place as mentioned up but that is not the issue here. But the value is in cache in fact for trash:/ we can shortcut the load.
It is inconvenient and not what users would expect, some may just not find it.

, but I'm worried that this creates a precedent where the information panel starts expecting that all kioslaves fill UDS_SIZE with the full recursive size rather than just the size of the directory entry. And users expect to see the full recursive size there. I mean, after this goes in, the user goes to his ~/Music folder, the information panel still says Size: 4123 items, they'll report a bug that it should instead show the full disk usage of that folder, right?

Concerning recursive dir size, it is already a feature wish, that I would argue we should implement in some way sometimes (not default, not everywhere but still we would need something along this line). This is a legitimate feature in 2019 IMO :
https://bugs.kde.org/show_bug.cgi?id=48434
https://forum.kde.org/viewtopic.php?f=22&t=87452
https://bugs.kde.org/show_bug.cgi?id=158090
So I would argue that we need a way to surface directories content size to users.

But here this does not concern size of dirs in general but only the size of trash:/ that will only be displayed in the information panel.
So it might acceptable here.

To fix this issue, I would much rather that the information panel uses DirectorySizeJob (possibly on demand, like the Properties dialog); that class could have a fast path for trash: (e.g. making a special() call to the ioslave) Or if you don't want a button in that panel, special-case trash, call DirectorySizeJob only for trash, implement the fast path there - and change the label in the information panel to "Disk space usage" so it's not confused with the usual Size field.

I would argue that "Disk space usage" and "Size" might be confusing to users. A folder size is the size of his content idiomatically.
That is our current usage of Size field for folders that is weird : My Image folder Size is "42 elements".
That is a paper cut that all new Plasma users have to learn "Size for folders is a content description" : not very intuitive.
We may want to in fact rename this field for folders to "Content" and keep Size for actual disk usage size (except in details view Size column of course).

The fast path in the information panel is completely viable here.

I would appreciate some feedback from VDG as to what course of action is recommendable for a user experience perspective. @ngraham ?

I would argue that "Disk space usage" and "Size" might be confusing to users. A folder size is the size of his content idiomatically.
That is our current usage of Size field for folders that is weird : My Image folder Size is "42 elements".
That is a paper cut that all new Plasma users have to learn "Size for folders is a content description" : not very intuitive.
We may want to in fact rename this field for folders to "Content" and keep Size for actual disk usage size (except in details view Size column of course).

Personally I would prefer it if we could actually calculate the total of a folder's contents: https://bugs.kde.org/show_bug.cgi?id=158090

I tried to do this once and found that it was possible, but would need a lot of work to ensure adequate performance, and probably a lot of caching. I come from the macOS world where this is implemented and has no appreciable performance penalty whatsoever so I know it's possible.

However this is probably unrelated to the current patch.

Regarding this patch, I think the "Size" field for the trash should always show the total on-disk size of what's in the trash, regardless of the above, because the question that the user is looking to answer is, "How much space am I going to get back if I empty the trash right now?" For that, we need to always show the on-disk size.

meven added a comment.Oct 26 2019, 7:30 AM

I would argue that "Disk space usage" and "Size" might be confusing to users. A folder size is the size of his content idiomatically.
That is our current usage of Size field for folders that is weird : My Image folder Size is "42 elements".
That is a paper cut that all new Plasma users have to learn "Size for folders is a content description" : not very intuitive.
We may want to in fact rename this field for folders to "Content" and keep Size for actual disk usage size (except in details view Size column of course).

Personally I would prefer it if we could actually calculate the total of a folder's contents: https://bugs.kde.org/show_bug.cgi?id=158090

I tried to do this once and found that it was possible, but would need a lot of work to ensure adequate performance, and probably a lot of caching. I come from the macOS world where this is implemented and has no appreciable performance penalty whatsoever so I know it's possible.

However this is probably unrelated to the current patch.

Agreed

Regarding this patch, I think the "Size" field for the trash should always show the total on-disk size of what's in the trash, regardless of the above, because the question that the user is looking to answer is, "How much space am I going to get back if I empty the trash right now?" For that, we need to always show the on-disk size.

Agreed

Your thought @dfaure ?

I would argue that "Disk space usage" and "Size" might be confusing to users. A folder size is the size of his content idiomatically.
That is our current usage of Size field for folders that is weird : My Image folder Size is "42 elements".
That is a paper cut that all new Plasma users have to learn "Size for folders is a content description" : not very intuitive.
We may want to in fact rename this field for folders to "Content" and keep Size for actual disk usage size (except in details view Size column of course).

Personally I would prefer it if we could actually calculate the total of a folder's contents: https://bugs.kde.org/show_bug.cgi?id=158090

I tried to do this once and found that it was possible, but would need a lot of work to ensure adequate performance, and probably a lot of caching. I come from the macOS world where this is implemented and has no appreciable performance penalty whatsoever so I know it's possible.

However this is probably unrelated to the current patch.

Regarding this patch, I think the "Size" field for the trash should always show the total on-disk size of what's in the trash, regardless of the above, because the question that the user is looking to answer is, "How much space am I going to get back if I empty the trash right now?" For that, we need to always show the on-disk size.

@dfaure any feedback ?
Others maybe ?

You can do whatever you want in the GUI -- I'll happily step out of that part of the discussion --, as long as you don't abuse UDS_SIZE for what it was not meant for, thus creating an ambiguous meaning for it.

My suggestion would be to add a UDS_RECURSIVE_SIZE instead.
The problem with that, however, is that the kioslave can't know if the application actually wants the information. So in most cases we would be wasting a lot of time for nothing -- even here it's not fully for free.
One solution would be to use the "details" metadata for this.

  • 0 (as set by DeleteJob) means bare minimum (filename and type).
  • 1 isn't used anymore, but it adds uid, gid, atime, mtime, btime
  • 2 is the default, which is the above plus ACL data
  • 3 requests in addition the device and inode number (from kio_file), so that DirectorySizeJob can check for hardlinks
  • we could add that 4 means "I want UDS_RECURSIVE_SIZE too".

I don't know if this will be useful in any other kioslave, but at least here it would allow skipping this stuff when not needed.

Sample code, like in kio_file and kio_ftp:

const QString sDetails = metaData(QLatin1String("details"));
const int details = sDetails.isEmpty() ? 2 : sDetails.toInt();
meven added a comment.Oct 27 2019, 1:42 PM

You can do whatever you want in the GUI -- I'll happily step out of that part of the discussion --, as long as you don't abuse UDS_SIZE for what it was not meant for, thus creating an ambiguous meaning for it.

My suggestion would be to add a UDS_RECURSIVE_SIZE instead.
The problem with that, however, is that the kioslave can't know if the application actually wants the information. So in most cases we would be wasting a lot of time for nothing -- even here it's not fully for free.

Some ioslave could provide it by default in some situation like for kio_trash, this make sense for /.

One solution would be to use the "details" metadata for this.

  • 0 (as set by DeleteJob) means bare minimum (filename and type).
  • 1 isn't used anymore, but it adds uid, gid, atime, mtime, btime
  • 2 is the default, which is the above plus ACL data
  • 3 requests in addition the device and inode number (from kio_file), so that DirectorySizeJob can check for hardlinks
  • we could add that 4 means "I want UDS_RECURSIVE_SIZE too".

Good idea, could be a base to fix https://bugs.kde.org/show_bug.cgi?id=158090 later.

This KIO::stat details parameter begs to be a bitmask...
I am thinking about adding a KF6 TODO about this.

I don't know if this will be useful in any other kioslave, but at least here it would allow skipping this stuff when not needed.

Sample code, like in kio_file and kio_ftp:

const QString sDetails = metaData(QLatin1String("details"));
const int details = sDetails.isEmpty() ? 2 : sDetails.toInt();

All dependent of the implementations and the users.
But for sure UDS_RECURSIVE_SIZE cannot be a default.

About https://bugs.kde.org/show_bug.cgi?id=158090
I would imagine something where the job recurse at most N levels, or M number of directory before stopping with N = 4 and M = 300.
Returning incomplete results, but enough to give good information in most usercases.

meven updated this revision to Diff 68829.Oct 27 2019, 1:43 PM

Add KIO::UDSEntry::UDS_RECURSIVE_SIZE to store recursive size of folders

This looks good. I think I failed to notice the last update, sorry for the delay.

Can you rebase (I suspect a few conflicts) and update version numbers to 5.70? Then it'll be good to go.

Ah, and the documentation for the KFileItem getter needs to say that this is generally NOT available and right now only implemented for trash:/.

meven edited the summary of this revision. (Show Details)Apr 11 2020, 5:51 AM
meven updated this revision to Diff 79804.Apr 11 2020, 6:20 AM
meven marked 3 inline comments as done.

Rebase, update @since, add precision to KFileItem comment

dfaure added inline comments.Apr 11 2020, 9:09 AM
src/core/kfileitem.h
349

typo: Initially

src/ioslaves/trash/trashimpl.cpp
1097

Reusing info is more expensive (and less readable) than creating it here everytime (price of assignment)

const QFileInfo info(trashPath);
1098

same here

1105

Why not move the whole code of createTopLevelDirEntry into this method?

It's weird to fill this in two separate steps.

OK this would require moving m_userName and m_groupName into this class...

meven updated this revision to Diff 79813.Apr 11 2020, 9:30 AM
meven marked 4 inline comments as done.

Fix typo, avoid allocations, improve variable naming

dfaure accepted this revision.Apr 11 2020, 4:14 PM
This revision is now accepted and ready to land.Apr 11 2020, 4:14 PM
meven planned changes to this revision.Apr 11 2020, 4:56 PM

The modification date is not what to expect currently, working on it.

meven updated this revision to Diff 79942.Apr 12 2020, 5:28 PM

Improve latest modification time of trash implementation, fix case when a symlink is in the trash returning incorrect trash size

This revision is now accepted and ready to land.Apr 12 2020, 5:28 PM
meven updated this revision to Diff 79943.Apr 12 2020, 5:30 PM

Improve naming of struct CacheData to SizeAndModTime

dfaure requested changes to this revision.Apr 13 2020, 10:51 AM
dfaure added inline comments.
src/ioslaves/trash/trashimpl.cpp
1096

typo: latest

src/ioslaves/trash/trashsizecache.cpp
155

This calls lastModified().toMSecsSinceEpoch() twice.

How about a lambda for (at least)

if (lastModTime > max_mtime) {
     max_mtime = lastModTime;
}

and then you can call that lambda with lastModified().toMSecsSinceEpoch()?
I was thinking about suggesting such a lambda anyway since that logic exists 4 times in this code.

In fact, checking that the trashinfo file exists is something that should be done in all code paths, no? So there's a lot more code that can be factored out, I would think.

173

orphaned sounds like "no .trashinfo file pointing to that dir".

But that's not the case here. It's just items that have been added to the trash without being added to the cache file (many implementations do that, the cache is optional).

So the directory is not listed in the cache, or is listed with a too old mtime i.e. it wasn't updated when adding another item into it.

176–178

Not technically correct, this is the mtime of the directory, while there could be much more recent items inside that directory.

But I'm wondering how much effort we want to put into finding the latest mtime, this is starting to sound like a very expensive operation.

So I'm actually find with this known bug, maybe with a comment about how this is a heuristic -- in a fallback for bad implementations.
When kio_trash takes care of the trashing, we should never end up in this code path (right?)

This revision now requires changes to proceed.Apr 13 2020, 10:51 AM
meven updated this revision to Diff 80008.Apr 13 2020, 1:26 PM
meven marked 3 inline comments as done.

Add lambdas to factorize code, comments update, move a couple of DiscSpaceUtil::sizeOfPath to where there are needed

meven added a comment.Apr 13 2020, 1:26 PM

But I'm wondering how much effort we want to put into finding the latest mtime, this is starting to sound like a very expensive operation.

I agree let's not got too deep down the rabbit hole.
This is not a very common use case

When kio_trash takes care of the trashing, we should never end up in this code path (right?)

AFAIK

dfaure requested changes to this revision.Apr 13 2020, 1:41 PM
dfaure added inline comments.
src/ioslaves/trash/trashsizecache.cpp
131 ↗(On Diff #80008)

Typo: checkMaxTime with a 'k'?

131 ↗(On Diff #80008)

Interesting. One benefit of lambdas is that they can work on local variable; I would have captured max_mtime by reference and modified it inside the lambda.
Written this way (which I guess more "pure functional programming" because no side effects), it could be a static helper function ;)

Wait, this doesn't work, does it? It makes a copy of mtime right now, while it's still 0.

I'm pretty sure you want to capture by [&] instead
(and then, unless you insist on pure functions, I'd suggest just modifying it here, and returning void). It removes the "max_mtime = " duplication ;)

This revision now requires changes to proceed.Apr 13 2020, 1:41 PM
meven updated this revision to Diff 80016.Apr 13 2020, 2:29 PM
meven marked 2 inline comments as done.

Fix, typo, pass max_mtime by ref to lambda

meven added inline comments.Apr 13 2020, 2:29 PM
src/ioslaves/trash/trashsizecache.cpp
131 ↗(On Diff #80008)

It probably didn't work, thanks

meven marked an inline comment as done.Apr 13 2020, 2:29 PM
dfaure accepted this revision.Apr 13 2020, 4:12 PM
This revision is now accepted and ready to land.Apr 13 2020, 4:12 PM
This revision was automatically updated to reflect the committed changes.

TestTrash::renameDirInTrash makes the dir cache invalid (it still lists trashDirFromHome). I think because of the optimization that we don't always care about the size... but this had the side effect of keeping the cache uptodate.... I wonder if the test is too strict and an old cache isn't better than keeping it coherent at extra cost... I'll dig further.

src/ioslaves/trash/trashimpl.cpp
411

This was done unconditionally...

dfaure added inline comments.Apr 15 2020, 9:03 PM
src/ioslaves/trash/trashimpl.cpp
1093 ↗(On Diff #80008)

It's actually this call which triggers the unittest failure.

I wonder if maybe this happens at the wrong time....

Yep, that's it. Renaming trash:/A to trash:/B calls KIO::moveAs which stats trash:/ [useful to resolve desktop:/ to a local dir for instance], which now updates the on-disk cache... with 'A', before it gets renamed to 'B'.

I think I know what the best solution is. Using StatDetails to skip the (expensive) size stuff when CopyJob calls stat(). Bugfix and performance improvement :)