- User Since
- Feb 14 2017, 10:36 AM (143 w, 2 d)
@bruns: I've missed D16593: [ExtractorCollection] Use only best matching extractor plugin, and had in mind previous situation where we've matched all extractors based on inheritance. In that case, "Secondly" part indeed does not seem to apply anymore.
(as for my previous answer: I misunderstood you, thought you were asking about the case where PlainTextExtractor did not match & matched afterwards)
Oct 4 2019
Minor optimization: change check order (filesize / extractor property)
Sep 25 2019
Address raised issue: fetch file size only once
Sep 8 2019
Aug 18 2019
AFAIK canonicalFilePath() cannot return path without slash. The result is either empty (if path does not exist; this is covered by first if), or a proper path. The second check is redundant.
Aug 8 2019
Aug 7 2019
Aug 5 2019
Jul 22 2019
Jul 19 2019
Jul 17 2019
Jul 14 2019
Jul 11 2019
Apparently, it does fix bug 409257, which is pretty serious one (db corruption, after all).
Jun 30 2019
Note that it somewhat duplicates work done inside QueryParser. Which is almost not used anywhere - most of the parsing is done by AdvancedQueryParser.
Not entirely sure, but bug 409257 might be caused by that (at least in my case, it looked the same).
Jun 28 2019
I've found a way to reproduce a related issue:
$ mkdir ~/test $ balooctl config add includeFolders ~/test $ balooctl stop <make some changes with ~/test, i.e. add a tag> $ balooctl start
Jun 21 2019
Jun 16 2019
Rebase on master.
As the limit is somewhat arbitrary, maybe we can just limit the QString? I don't think this has any serious side effects.
Actually, there is an issue with that code right now, which I wanted to fix, but forgot.
The trimming part finalArr = finalArr.mid(0, maxTermSize); actually should be performed on QString instead of QByteArray - unicode symbols inside term can consist of two bytes, and cutting at maxTermSize bytes can actually cut half of last symbol. I end up with terms like тождественно� inside balooshow -x.
Not to mention that russian terms end up being pretty small.
Jun 10 2019
Apart from small nitpick, I think it's fine.
@bruns what do you think about it?
Can we also cover this case with tests?
That's a nice catch!
I thought about it myself. I googled it a bit (i.e. here) and saw that there might be some quite unwanted runtime overhead because of using std::function. It might be negligible (since we're doing some costly DB operations inside anyways), but I'd prefer if we did some profiling to make sure it's OK.
Makes sense to me
Jun 4 2019
I didn't know QFileInfo fetches information on demand (and caches it). That is the reason for this change, right?
I think it would be nice to elaborate on that in summary, or maybe as a brief comment in the code.
Jun 3 2019
Use single temp dir
Fixed comment with directory structure
Moved m_nameChanged check inside separate block
Split test to three separate test functions, which cover different test cases.
May 31 2019
May 29 2019
May 27 2019
May 25 2019
I've spent some time to implement structure I propose (the code is available in private clone).
Most notable changes:
Apr 7 2019
I didn't realize LMDB does not modify pages; and if we change one, it creates a new one, copies the data from the old page, modifies it and marks old one as "dirty".
In that case we'll most likely end up modifying a single page in both cases.
Apr 5 2019
Just because you hide the RMW, it does not mean it does not happen. For LMDB, duplicate keys are just a plain array, see e.g. MDB_APPENDDUP in http://www.lmdb.tech/doc/group__mdb.html#ga4fa8573d9236d54687c61827ebf8cac0.
The data entries are sorted by LMDB as soon as you do a mdb_put, and this is of course a RMW cycle.
I thought the structure behind MDB_DUPSORT is a bit more clever that just a plain array, i.e. it's still a search tree.
And insertion still should cost much less compared to what is done now (fetch whole list + insert + put it back).
Of course it still might do some RMW work (tree rebalancing, for example), my point is that it should be more efficient.
Apr 4 2019
I'd like to revive this discussion.
Mar 20 2019
Mar 19 2019
Mar 17 2019
Feb 23 2019
I've looked through the patch (quite large indeed), apart from the single note I think it's good to go.
Feb 15 2019
Forgot to define fname inside EventMoveTo
Updated comment, removed duplicated QFile::decodeName
Feb 12 2019
Feb 6 2019
Something like that? I've decided not to emit created signal from inside the function, just to have a bit less branching in the code (and documented this behavior, since it might be a bit confusing)
Added recursive iteration over all contents for Create event as well
Feb 5 2019
Added code to work with first entry that pops from FilteredDirIterator (that is the directory itself)
Test still works; but we should emit created() signal for it as well.
I am not sure if I understood your description correctly, but I am quite sure this race condition does not exist - the files/folders inside the moved folder are not created/moved one by one, but the containing folder ist just "renamed" - it is unlinked from the old parent and linked into the new one, atomically.
Sure. That concern corresponded only to the second note - if moving from another device, system has to do actual copy/move.
Feb 4 2019
Explained the race condition in summary, expanded test to check if watches were installed correctly.
Feb 3 2019
Feb 2 2019
Nice! I like it, it's definitely much better than Q_ASSERT_X macros that are just silently ignored in non-debug builds.
Dec 18 2018
Nov 21 2018
I believe can do something better here.
I think if we stick to canonical paths everywhere, and resolve symlinks ASAP (but still follow them), that might solve all the problems.