KFileItem: call stat() on demand, add SkipMimeTypeDetermination option
ClosedPublic

Authored by hoffmannrobert on Mar 19 2019, 4:38 PM.

Details

Summary

This changes the initialization of KFileItem. Constructors won't call init() which might do an unnecessary QT_LSTAT/stat() system call. Instead, only if a method which needs the information from stat() is called, init() and stat() are run.

In addition this adds a KFileItem constructor for passing an enum MimeTypeDetermination parameter. If set to SkipMimeTypeFromContent,
KFileItem::isDir(), KFileItem::determineMimeType() (via isDir()), KFileItem::currentMimeType() and KFileItem::iconName() (via isDir()) won't call init(), QT_LSTAT or read the file.

See also https://phabricator.kde.org/D19784

Diff Detail

Repository
R241 KIO
Branch
add_skipStat
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 10154
Build 10172: arc lint + arc unit
hoffmannrobert created this revision.Mar 19 2019, 4:38 PM
Restricted Application added a project: Frameworks. · View Herald TranscriptMar 19 2019, 4:38 PM
Restricted Application added a subscriber: kde-frameworks-devel. · View Herald Transcript
hoffmannrobert requested review of this revision.Mar 19 2019, 4:38 PM
dfaure requested changes to this revision.Mar 21 2019, 8:32 AM

The use case for this needs to be documented in the constructor. I am not sure I am in favour of this because it breaks mimetype determination from contents, which is part of the shared mime info spec.
But actually if that's the only thing that this breaks, and there's a good use case for it, then the parameter should be SkipMimetypeDetermination. Tell the user of the API what they'll miss, not what internal implementation detail this changes.
Everyone wants to skip a syscall, but they need to realize what this means in terms of feature loss.

src/core/kfileitem.cpp
198

move it so it's with the other bits above

203

depending on the use case, wouldn't it be enough to implement this TODO instead?

824

missing "else" after this line?

1626

same here -- duplicated code => move to helper function

src/core/kfileitem.h
123

booleans parameters are bad, please make it an enum.
https://wiki.qt.io/API_Design_Principles#The_Boolean_Parameter_Trap

+ docu missing including @since tag.

This revision now requires changes to proceed.Mar 21 2019, 8:32 AM

OK so I read about the use case in https://phabricator.kde.org/D19784 and it confirms my suspicion: the right fix is to do the stat on demand, called from those methods that need the information it gathers -- with a boolean to check that we only do it once of course.

  • On-demand stat(), helper function, SkipMimeTypeDetermination
hoffmannrobert retitled this revision from Proposal for KFileItem to skip stat() to KFileItem: call stat() on demand, add SkipMimeTypeDetermination option.Mar 26 2019, 4:27 PM
hoffmannrobert edited the summary of this revision. (Show Details)
hoffmannrobert marked 5 inline comments as done.Mar 26 2019, 4:29 PM
dfaure requested changes to this revision.Mar 27 2019, 8:48 AM

Thanks.

Did you also run the kio unittests to ensure they still pass?

src/core/kfileitem.cpp
203

Remove the TODO :-)

491

So, we don't really skip it. We just use a "fast and less precise" mode.

Sounds like this should be called DetermineMimeTypeFromExtension (though that's incorrect for http),
or FastMimeTypeDetermination (but people then think it's a good idea to call this in all cases...)
or SkipMimeTypeFromContent -- maybe better?

1019

It reads like a bit of a hack here, because reading .directory is unrelated to mimetype determination.

But I think your idea is that isDir() is what we don't want to call right?
Another way to do this, then, would be to change isDir() to have an early return false if mimetype determination is skipped?

1605

my suggestion would simplify this here too

This revision now requires changes to proceed.Mar 27 2019, 8:48 AM
  • Fix cmp(), SkipMimeTypeFromContent rename, change isDir()
hoffmannrobert marked 4 inline comments as done.
  • Fix cmp() #2

KFileItemTest::testCmp() failed - fixed.

JobTest also fails, but this fails in master, too. Something with FileCopyJob.

dfaure requested changes to this revision.Mar 28 2019, 9:21 PM
dfaure added inline comments.
src/core/kfileitem.cpp
86

Remove this comment, no longer true (only keep the first line, remove the other 3)

362

What about the other way around? I think this needs the symmetrical test to call item.init() if needed
(and the corresponding unittest, write it first)

731

This use of d->m_entry needs a call to init(), no?

767

This kind of method (which only uses d->m_entry in one place) could be simplified by just doing

return entry().stringValue(....);

Then the init() would happen inside entry().

This would work in user() just above, too.

This revision now requires changes to proceed.Mar 28 2019, 9:21 PM
  • Fix cmp() #3, add test for cmp()/init(), simplifications
hoffmannrobert edited the summary of this revision. (Show Details)Mar 29 2019, 11:38 AM
hoffmannrobert marked 4 inline comments as done.
hoffmannrobert added inline comments.
src/core/kfileitem.cpp
362

You're right, fixed, unittest added.

731

It does, now there via entry(). And ACL() needs init(), too, it's in hasExtendedACL() there.

767

And in defaultACL(), setLocalPath(), linkDest(), hasExtendedACL() and overlays().

dfaure requested changes to this revision.Mar 29 2019, 12:41 PM
dfaure added inline comments.
src/core/kfileitem.cpp
366

There is one thing I've been wondering back and forth about: the alternative way of doing this, which is to call init() unconditionally and testing the bool first thing in there.
The benefit is that it would reduce the code size overall, and greatly simplify the top of this method (two calls to init, done).
The downside is that for the reader, a call to init() might in fact do nothing (and it reads like it's doing something, until going there to check). So it's maybeInit(), urgh.

Ah, in another code base (QMimeType) I wrote ensureLoaded(). This could be ensureInitialized() ?

Sorry for not suggesting this earlier. Do you agree that it would make the code better?

613

No no, this won't work. entry() returns a copy, not a reference.

671

here entry() would work, no?

1078

It's more fragile this way here, because the reader of this code won't see that d->m_bLink is initialized indirectly via entry() calling init().

This is why I had only suggested to use entry() in a few places, not in those other places which need init() for other reasons anyway.

By fragile it means, it works today, but any further work/refactoring of this code is likely to break.

Same problem in linkDest().

This revision now requires changes to proceed.Mar 29 2019, 12:41 PM
hoffmannrobert marked 3 inline comments as done.
  • Add ensureInitialized(), entry() changes
hoffmannrobert marked 4 inline comments as done.Mar 29 2019, 2:33 PM
hoffmannrobert added inline comments.
src/core/kfileitem.cpp
366

Yes, I agree, but here you still need the two ifs, and the init() still is needed because of refresh().

613

Ah, sorry, yes.

671

No, it's KFileItemPrivate.

dfaure accepted this revision.Mar 29 2019, 2:55 PM

Thanks! Looks nice now.

I don't see you in the list of developer accounts, so I guess you need someone to push this for you?

This revision is now accepted and ready to land.Mar 29 2019, 2:55 PM
hoffmannrobert marked 3 inline comments as done.Mar 29 2019, 2:57 PM
This revision was automatically updated to reflect the committed changes.