poboiko (Igor Poboiko)
User

Projects

User does not belong to any projects.

User Details

User Since
Feb 14 2017, 10:36 AM (126 w, 6 d)
Availability
Available

Recent Activity

Yesterday

poboiko committed R293:5990d360a1ad: [IndexCleaner] ignore non-existent entries inside config (authored by poboiko).
[IndexCleaner] ignore non-existent entries inside config
Mon, Jul 22, 5:21 PM
poboiko closed D22557: [IndexCleaner] ignore non-existent entries inside config.
Mon, Jul 22, 4:44 PM · Baloo, Frameworks

Fri, Jul 19

poboiko updated the summary of D22502: [FileIndexerConfig] skip invalid entries from included/excludedFolders.
Fri, Jul 19, 11:42 AM · Baloo, Frameworks
poboiko requested review of D22557: [IndexCleaner] ignore non-existent entries inside config.
Fri, Jul 19, 11:37 AM · Baloo, Frameworks

Wed, Jul 17

poboiko added a comment to D22502: [FileIndexerConfig] skip invalid entries from included/excludedFolders.

The correct fix is to check the returned/calculated ID in the IndexCleaner, otherwise it's racy.

Wed, Jul 17, 5:53 PM · Baloo, Frameworks
poboiko requested review of D22502: [FileIndexerConfig] skip invalid entries from included/excludedFolders.
Wed, Jul 17, 9:18 AM · Baloo, Frameworks

Sun, Jul 14

poboiko added a comment to D21427: Always skip trailing slashes in FilderedDirIterator.

Ping!

Apparently, it does fix bug 409257, which is a pretty serious one (db corruption, after all).

The DB corruption is already fixed in KF5.60.

Sun, Jul 14, 3:43 PM · Baloo, Frameworks

Thu, Jul 11

poboiko requested review of D22392: [balooctl/baloo_file_extractor] Consolidate code that performs actual indexing.
Thu, Jul 11, 8:07 AM · Baloo, Frameworks
poboiko added a comment to D21427: Always skip trailing slashes in FilderedDirIterator.

Apparently, it does fix bug 409257, which is a pretty serious one (db corruption, after all).

Thu, Jul 11, 7:05 AM · Baloo, Frameworks
poboiko updated the summary of D21427: Always skip trailing slashes in FilderedDirIterator.
Thu, Jul 11, 7:04 AM · Baloo, Frameworks

Sun, Jun 30

poboiko added a comment to D22166: [AdvancedQueryParser] Introduce support for phrase queries.

Note that it somewhat duplicates work done inside QueryParser, which is hardly used anywhere: most of the parsing is done by AdvancedQueryParser.

Sun, Jun 30, 3:43 PM · Baloo, Frameworks
poboiko requested review of D22166: [AdvancedQueryParser] Introduce support for phrase queries.
Sun, Jun 30, 3:38 PM · Baloo, Frameworks
poboiko updated the summary of D21427: Always skip trailing slashes in FilderedDirIterator.
Sun, Jun 30, 12:09 PM · Baloo, Frameworks
poboiko added a reviewer for D21427: Always skip trailing slashes in FilderedDirIterator: ngraham.

Not entirely sure, but bug 409257 might be caused by that (at least in my case, it looked the same).

Sun, Jun 30, 12:08 PM · Baloo, Frameworks

Fri, Jun 28

poboiko added a comment to D21427: Always skip trailing slashes in FilderedDirIterator.

I've found a way to reproduce a related issue:

$ mkdir ~/test
$ balooctl config add includeFolders ~/test
$ balooctl stop
<make some changes inside ~/test, e.g. add a tag>
$ balooctl start
Fri, Jun 28, 12:22 PM · Baloo, Frameworks

Jun 21 2019

poboiko committed R363:299be18ab485: Merge branch 'Applications/19.04' (authored by poboiko).
Merge branch 'Applications/19.04'
Jun 21 2019, 3:49 PM
poboiko closed D21962: [PrinterSortProxyModel] Make filter case-insensitive.
Jun 21 2019, 3:47 PM
poboiko committed R363:f21dc8a091d3: [PrinterSortProxyModel] Make filter case-insensitive (authored by poboiko).
[PrinterSortProxyModel] Make filter case-insensitive
Jun 21 2019, 3:47 PM
poboiko requested review of D21962: [PrinterSortProxyModel] Make filter case-insensitive.
Jun 21 2019, 12:38 PM

Jun 16 2019

poboiko updated the summary of D21509: [UnIndexedFileIteratorTest] Add tests.
Jun 16 2019, 6:19 PM · Baloo, Frameworks
poboiko updated the diff for D21509: [UnIndexedFileIteratorTest] Add tests.

Rebase on master.

Jun 16 2019, 6:17 PM · Baloo, Frameworks
poboiko added a comment to D21839: [TermGenerator] Use UTF-8 ByteArray for termList.

As the limit is somewhat arbitrary, maybe we can just limit the QString? I don't think this has any serious side effects.

Jun 16 2019, 5:26 PM · Baloo, Frameworks
poboiko added a comment to D21839: [TermGenerator] Use UTF-8 ByteArray for termList.

Actually, there is an issue with that code right now, which I wanted to fix, but forgot.
The trimming part finalArr = finalArr.mid(0, maxTermSize); should actually be performed on the QString instead of the QByteArray: Unicode symbols inside a term can consist of two bytes, and cutting at maxTermSize bytes can cut the last symbol in half. I end up with terms like тождественно� inside balooshow -x.
Not to mention that Russian terms end up being pretty small.
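
The byte-versus-character problem described above can be reproduced outside of Baloo. The following is a minimal Python sketch, not Baloo's code: the function names are hypothetical, and the limit of 23 bytes is chosen only so that the cut falls in the middle of a two-byte Cyrillic letter.

```python
def truncate_bytes_naive(term: str, max_bytes: int) -> bytes:
    """Cut the UTF-8 encoding at a fixed byte count (may split a character)."""
    return term.encode("utf-8")[:max_bytes]


def truncate_utf8_safe(term: str, max_bytes: int) -> bytes:
    """Cut at a byte count, then drop an incomplete trailing sequence, if any."""
    raw = term.encode("utf-8")[:max_bytes]
    # The input was valid UTF-8, so only the tail can be broken by the cut.
    return raw.decode("utf-8", errors="ignore").encode("utf-8")


term = "тождественно"  # 12 Cyrillic letters, 2 bytes each in UTF-8
broken = truncate_bytes_naive(term, 23)
clean = truncate_utf8_safe(term, 23)

# The naive cut leaves a dangling lead byte that decodes to U+FFFD.
print(broken.decode("utf-8", errors="replace"))
# The safe cut ends on a character boundary.
print(clean.decode("utf-8"))
```

Either trimming the string before encoding or trimming the bytes at a character boundary avoids the half-cut symbol; which of the two fits the term size limit better is a separate question.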

Jun 16 2019, 5:13 PM · Baloo, Frameworks

Jun 10 2019

poboiko accepted D21673: [FileIndexScheduler] Ensure indexer is not run in suspended state.

Apart from a small nitpick, I think it's fine.

Jun 10 2019, 10:21 AM · Baloo, Frameworks
poboiko added a comment to T9805: Overhaul Baloo database scheme.

@bruns what do you think about it?

Jun 10 2019, 10:17 AM · Baloo
poboiko accepted D21705: [NewFileIndexer] Use correct mimetype for folders, check excludeFolders.

Can we also cover this case with tests?

Jun 10 2019, 9:41 AM · Baloo, Frameworks
poboiko accepted D21671: [FileIndexScheduler] Stop the indexer when quit() is called via DBus.

That's a nice catch!

Jun 10 2019, 9:30 AM · Baloo, Frameworks
poboiko accepted D21697: [BasicIndexingJob] Skip lookup of baloo document type for directories.
Jun 10 2019, 9:27 AM · Baloo, Frameworks
poboiko accepted D21704: [FirstRunIndexer] Use correct mimetype for folders.
Jun 10 2019, 9:26 AM · Baloo, Frameworks
poboiko accepted D21698: Move invariant IndexingLevel out of the loop.
Jun 10 2019, 9:25 AM · Baloo, Frameworks
poboiko added inline comments to D21706: [ModifiedFileIndexer] Use correct mimetype for folders, delay until needed.
Jun 10 2019, 9:10 AM · Baloo, Frameworks
poboiko added a comment to D21703: [Transaction] Replace template for functor with std::function.

I thought about it myself. I googled it a bit (e.g. here) and saw that there might be some quite unwanted runtime overhead because of using std::function. It might be negligible (since we're doing some costly DB operations inside anyway), but I'd prefer if we did some profiling to make sure it's OK.

Jun 10 2019, 9:03 AM · Baloo, Frameworks
poboiko accepted D21672: [PowerStateMonitor] Be conservative when determining power state.

Makes sense to me

Jun 10 2019, 8:57 AM · Baloo, Frameworks

Jun 4 2019

poboiko added inline comments to D21579: [FilteredDirIterator] Avoid RegExp overhead for exact matches.
Jun 4 2019, 5:37 PM · Baloo, Frameworks
poboiko accepted D21578: [UnindexedFileIterator] Delay mimetype determination until it is needed.
Jun 4 2019, 5:29 PM · Baloo, Frameworks
poboiko added a comment to D21577: [UnindexedFileIndexer] Skip filetime checks for new files.

I didn't know QFileInfo fetches information on demand (and caches it). That is the reason for this change, right?
I think it would be nice to elaborate on that in the summary, or maybe as a brief comment in the code.

Jun 4 2019, 5:25 PM · Baloo, Frameworks
poboiko accepted D21576: [UnindexedFileIndexer] Do not try to add nonexistant file to index.
Jun 4 2019, 5:06 PM · Baloo, Frameworks

Jun 3 2019

poboiko added inline comments to D21509: [UnIndexedFileIteratorTest] Add tests.
Jun 3 2019, 11:14 PM · Baloo, Frameworks
poboiko updated the diff for D21509: [UnIndexedFileIteratorTest] Add tests.

Use single temp dir

Jun 3 2019, 11:09 PM · Baloo, Frameworks
poboiko updated the diff for D21509: [UnIndexedFileIteratorTest] Add tests.

Fixed comment with directory structure

Jun 3 2019, 2:37 PM · Baloo, Frameworks
poboiko added inline comments to D21509: [UnIndexedFileIteratorTest] Add tests.
Jun 3 2019, 2:35 PM · Baloo, Frameworks
poboiko updated the diff for D21509: [UnIndexedFileIteratorTest] Add tests.

Moved m_nameChanged check inside separate block

Jun 3 2019, 2:35 PM · Baloo, Frameworks
poboiko updated the diff for D21509: [UnIndexedFileIteratorTest] Add tests.

Split test to three separate test functions, which cover different test cases.

Jun 3 2019, 2:21 PM · Baloo, Frameworks

May 31 2019

poboiko requested review of D21509: [UnIndexedFileIteratorTest] Add tests.
May 31 2019, 9:02 AM · Baloo, Frameworks

May 29 2019

poboiko added a comment to D21440: Delay running UnindexedFileIndexer and IndexCleaner.

The idle tracking is only in the extractor process, not in baloo_file itself. On startup it runs the UnindexedFileIndexer and iterates all the folders looking for files to re-index, consuming a considerable amount of CPU time, spending most of its time doing regexp matching, mime type determination, and date time processing. Only after that may it run the extractor process, when there are new files to be indexed.
So I think starting baloo_file later is safe since it checks all the files anyway? Otherwise/additionally, we should look into making the UnindexedFileIndexer start delayed.

May 29 2019, 1:25 PM · Baloo, Frameworks

May 27 2019

poboiko updated the summary of D21427: Always skip trailing slashes in FilderedDirIterator.
May 27 2019, 10:45 AM · Baloo, Frameworks
poboiko updated the summary of D21427: Always skip trailing slashes in FilderedDirIterator.
May 27 2019, 10:44 AM · Baloo, Frameworks
poboiko requested review of D21427: Always skip trailing slashes in FilderedDirIterator.
May 27 2019, 10:44 AM · Baloo, Frameworks

May 25 2019

poboiko added a comment to T9805: Overhaul Baloo database scheme.

Update.
I've spent some time implementing the structure I propose (the code is available in a private clone).
Most notable changes:

May 25 2019, 4:26 PM · Baloo

Apr 7 2019

poboiko added a comment to T9805: Overhaul Baloo database scheme.

I didn't realize LMDB does not modify pages in place; if we change one, it creates a new one, copies the data from the old page, modifies it and marks the old one as "dirty".
In that case we'll most likely end up modifying a single page in both cases.

Apr 7 2019, 6:30 PM · Baloo

Apr 5 2019

poboiko added a comment to T9805: Overhaul Baloo database scheme.

Just because you hide the RMW, it does not mean it does not happen. For LMDB, duplicate keys are just a plain array, see e.g. MDB_APPENDDUP in http://www.lmdb.tech/doc/group__mdb.html#ga4fa8573d9236d54687c61827ebf8cac0.

The data entries are sorted by LMDB as soon as you do a mdb_put, and this is of course a RMW cycle.

I thought the structure behind MDB_DUPSORT is a bit more clever than just a plain array, i.e. it's still a search tree.
And insertion still should cost much less compared to what is done now (fetch whole list + insert + put it back).
Of course it still might do some RMW work (tree rebalancing, for example), my point is that it should be more efficient.

Apr 5 2019, 4:47 PM · Baloo

Apr 4 2019

poboiko added a comment to T9805: Overhaul Baloo database scheme.

I'd like to revive this discussion.

Apr 4 2019, 10:04 AM · Baloo

Mar 20 2019

poboiko committed R293:e58804bf9eb8: React to config updates inside indexer (authored by poboiko).
React to config updates inside indexer
Mar 20 2019, 9:02 AM
poboiko closed D15983: React to config updates inside indexer.
Mar 20 2019, 9:01 AM · Baloo, Frameworks

Mar 19 2019

poboiko added a comment to D15983: React to config updates inside indexer.

Can you rebase this on master again? Sorry for the radio silence. :(

Mar 19 2019, 10:58 AM · Baloo, Frameworks
poboiko added a comment to D15983: React to config updates inside indexer.

Ping?

Mar 19 2019, 8:47 AM · Baloo, Frameworks
poboiko added reviewers for D15983: React to config updates inside indexer: bruns, ngraham.
Mar 19 2019, 8:47 AM · Baloo, Frameworks

Mar 17 2019

poboiko accepted D17162: Harmonize handling of underscore in query parser.
Mar 17 2019, 4:34 PM · Baloo, Frameworks

Feb 23 2019

poboiko added a comment to D18664: Baloo engine: treat every non-success code as a failure.

I've looked through the patch (quite large indeed), apart from the single note I think it's good to go.

Feb 23 2019, 2:46 PM · Baloo, Frameworks

Feb 15 2019

poboiko committed R293:5e1add922ab2: [baloo/KInotify] Notify if folder was moved from unwatched place (authored by poboiko).
[baloo/KInotify] Notify if folder was moved from unwatched place
Feb 15 2019, 4:04 PM
poboiko closed D18698: [baloo/KInotify] Notify if folder was moved from unwatched place.
Feb 15 2019, 3:42 PM · Baloo, Frameworks
poboiko updated the diff for D18698: [baloo/KInotify] Notify if folder was moved from unwatched place.

Forgot to define fname inside EventMoveTo

Feb 15 2019, 3:41 PM · Baloo, Frameworks
poboiko updated the diff for D18698: [baloo/KInotify] Notify if folder was moved from unwatched place.

Updated comment, removed duplicated QFile::decodeName

Feb 15 2019, 11:52 AM · Baloo, Frameworks

Feb 12 2019

poboiko added inline comments to D18698: [baloo/KInotify] Notify if folder was moved from unwatched place.
Feb 12 2019, 3:58 PM · Baloo, Frameworks
poboiko added inline comments to D18698: [baloo/KInotify] Notify if folder was moved from unwatched place.
Feb 12 2019, 11:12 AM · Baloo, Frameworks

Feb 6 2019

poboiko added a comment to D18698: [baloo/KInotify] Notify if folder was moved from unwatched place.

Something like that? I've decided not to emit created signal from inside the function, just to have a bit less branching in the code (and documented this behavior, since it might be a bit confusing)

Feb 6 2019, 10:33 PM · Baloo, Frameworks
poboiko updated the diff for D18698: [baloo/KInotify] Notify if folder was moved from unwatched place.

Added recursive iteration over all contents for Create event as well

Feb 6 2019, 10:24 PM · Baloo, Frameworks

Feb 5 2019

poboiko updated the diff for D18698: [baloo/KInotify] Notify if folder was moved from unwatched place.

Cosmetics

Feb 5 2019, 4:59 PM · Baloo, Frameworks
poboiko updated the diff for D18698: [baloo/KInotify] Notify if folder was moved from unwatched place.

Added code to work with first entry that pops from FilteredDirIterator (that is the directory itself)
Test still works; but we should emit created() signal for it as well.

Feb 5 2019, 11:25 AM · Baloo, Frameworks
poboiko added a comment to D18698: [baloo/KInotify] Notify if folder was moved from unwatched place.

I am not sure if I understood your description correctly, but I am quite sure this race condition does not exist - the files/folders inside the moved folder are not created/moved one by one, but the containing folder is just "renamed" - it is unlinked from the old parent and linked into the new one, atomically.

Sure. That concern corresponded only to the second note - if moving from another device, the system has to do an actual copy/move.

Feb 5 2019, 9:36 AM · Baloo, Frameworks

Feb 4 2019

poboiko updated the diff for D18698: [baloo/KInotify] Notify if folder was moved from unwatched place.

Explained the race condition in summary, expanded test to check if watches were installed correctly.

Feb 4 2019, 11:43 AM · Baloo, Frameworks

Feb 3 2019

poboiko requested review of D18698: [baloo/KInotify] Notify if folder was moved from unwatched place.
Feb 3 2019, 3:44 PM · Baloo, Frameworks
poboiko abandoned D15637: Make DBusMenu work correctly with dynamically generated menus.
Feb 3 2019, 1:20 PM · Plasma
poboiko closed D18688: Check Exiv2::ValueType::typeId before converting it to rational.
Feb 3 2019, 9:10 AM · Baloo, Frameworks
poboiko committed R286:6e449d44bb5d: Check Exiv2::ValueType::typeId before converting it to rational (authored by poboiko).
Check Exiv2::ValueType::typeId before converting it to rational
Feb 3 2019, 9:10 AM

Feb 2 2019

poboiko added a comment to D18664: Baloo engine: treat every non-success code as a failure.

Nice! I like it, it's definitely much better than Q_ASSERT_X macros that are just silently ignored in non-debug builds.

Feb 2 2019, 10:21 PM · Baloo, Frameworks
poboiko requested review of D18688: Check Exiv2::ValueType::typeId before converting it to rational.
Feb 2 2019, 9:48 PM · Baloo, Frameworks

Dec 18 2018

poboiko added a comment to D15960: Don't check if file is directory based on mime-type.

@poboiko You broke the build, please fix it.

Dec 18 2018, 8:44 PM · Baloo, Frameworks
poboiko committed R293:c7416a41ddef: Fix mistakes introduced in a632a72a (authored by poboiko).
Fix mistakes introduced in a632a72a
Dec 18 2018, 8:42 PM
poboiko committed R293:a632a72a354e: Don't check if file is directory based on mime-type (authored by poboiko).
Don't check if file is directory based on mime-type
Dec 18 2018, 2:11 PM
poboiko added a comment to D15960: Don't check if file is directory based on mime-type.

Sorry, fell through the cracks - give a ping next time something is blocked for no apparent reason ...

Dec 18 2018, 2:07 PM · Baloo, Frameworks
poboiko closed D15960: Don't check if file is directory based on mime-type.
Dec 18 2018, 2:06 PM · Baloo, Frameworks

Nov 21 2018

poboiko added a comment to D16878: Resolve symlinks in exclude folders.

I believe we can do something better here.
I think if we stick to canonical paths everywhere, and resolve symlinks ASAP (but still follow them), that might solve all the problems.
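
As an illustration of the "canonical paths everywhere" idea, here is a minimal Python sketch. It is not Baloo's implementation: is_excluded and demo are hypothetical names, and the directory layout is built in a temporary folder purely for the example.

```python
import os
import tempfile


def is_excluded(path, exclude_folders):
    """Compare canonical paths so a symlinked route into an excluded folder is caught."""
    real_path = os.path.realpath(path)
    for folder in exclude_folders:
        real_folder = os.path.realpath(folder)
        # commonpath equals the excluded folder only if real_path is it or lies below it.
        if os.path.commonpath([real_path, real_folder]) == real_folder:
            return True
    return False


def demo():
    """Build <tmp>/storage/stuff and a symlink <tmp>/home/link that points at it."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "storage", "stuff")
        os.makedirs(target)
        home = os.path.join(tmp, "home")
        os.makedirs(home)
        link = os.path.join(home, "link")
        os.symlink(target, link)
        via_link = os.path.join(link, "file.txt")
        # First value: file reached through the symlink; second: the unrelated folder.
        return is_excluded(via_link, [target]), is_excluded(home, [target])


if __name__ == "__main__":
    print(demo())
```

Because both sides of the comparison are resolved first, it does not matter whether the user wrote the symlink or its target into the exclude list.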

Nov 21 2018, 3:28 PM · Baloo, Frameworks

Nov 15 2018

poboiko added a comment to D16878: Resolve symlinks in exclude folders.

IMHO we should just disallow specifying symlinks in both include/excludeFolders. The user can just use excludeFolders = /storage/stuff if they want to exclude it.

Nov 15 2018, 9:40 AM · Baloo, Frameworks
poboiko added a comment to D16876: [balooctl] Add possibility to create a copy of the index without freelist.

You are replicating mdb_copy -c here.
[...]

Nov 15 2018, 9:30 AM · Baloo, Frameworks
poboiko added a comment to D16498: [KFileMetaData] Add extractor for DSC conforming (Encapsulated) Postscript.

It seems like you've pushed something that was not intended to be pushed (XML extractor parts)

Nov 15 2018, 9:23 AM · Baloo, Frameworks

Nov 14 2018

poboiko accepted D16498: [KFileMetaData] Add extractor for DSC conforming (Encapsulated) Postscript.

Apart from a trivial comment, this looks fine. I've tested it on my setup (with a bunch of (e)ps files), and randomly chosen files seem to be indexed nicely. It also reduced the size of the index by almost 50MB, because those are not indexed as plaintext anymore :)
Yet I would also vote for replacing it (eventually) with a full-featured extractor based on libspectre
(I'm not a security specialist in any way, but that CVE doesn't look too harmful, and from my point of view it's not worth abandoning full support of (E)PS because of it)

Nov 14 2018, 3:50 PM · Baloo, Frameworks
poboiko requested review of D16878: Resolve symlinks in exclude folders.
Nov 14 2018, 3:14 PM · Baloo, Frameworks
poboiko requested review of D16876: [balooctl] Add possibility to create a copy of the index without freelist.
Nov 14 2018, 2:10 PM · Baloo, Frameworks
poboiko updated the diff for D15983: React to config updates inside indexer.

It's a bad idea to removeRecursively starting from the root of the tree (documentid 0).
If the user has indexed the /home/username folder, there is also an index entry for /home (that's how IdTreeDB works).
However, /home should not be indexed, according to the checks (because it's not in includeFolders, while /home/username is).
This will lead to a removeRecursively("/home") call, which will wipe the index for /home/username as well.
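
The hazard can be shown with a toy model. This is a minimal Python sketch under stated assumptions: the index is modelled as a plain set of paths rather than IdTreeDB, every function name is hypothetical, and the ancestor guard in guarded_cleanup is only one possible way to avoid the problem, not necessarily what the patch does.

```python
def should_index(path, include_folders):
    """True if path is an included folder or lies inside one."""
    return any(path == inc or path.startswith(inc + "/") for inc in include_folders)


def remove_recursively(index, path):
    """Return the index without path and everything below it."""
    return {p for p in index if p != path and not p.startswith(path + "/")}


def naive_cleanup(index, include_folders):
    """Walk from the top and wipe the subtree of any entry that fails the check."""
    for path in sorted(index):
        if path in index and not should_index(path, include_folders):
            index = remove_recursively(index, path)
    return index


def guarded_cleanup(index, include_folders):
    """Same walk, but keep entries that are ancestors of an included folder."""
    def is_ancestor(path):
        return any(inc.startswith(path + "/") for inc in include_folders)

    for path in sorted(index):
        if path in index and not should_index(path, include_folders) and not is_ancestor(path):
            index = remove_recursively(index, path)
    return index


index = {"/home", "/home/username", "/home/username/notes.txt", "/home/other"}
include = ["/home/username"]

print(sorted(naive_cleanup(index, include)))    # /home fails the check first, so everything goes
print(sorted(guarded_cleanup(index, include)))  # only /home/other is dropped
```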

Nov 14 2018, 1:47 PM · Baloo, Frameworks
poboiko updated the diff for D15983: React to config updates inside indexer.

Rebase on master

Nov 14 2018, 1:11 PM · Baloo, Frameworks

Oct 30 2018

poboiko accepted D15826: [Balooshow] Avoid out-of-bounds access when accessing corrupt db data.

Yep, fine by me

Oct 30 2018, 12:26 PM · Baloo, Frameworks
poboiko added a comment to D16523: [Extractor] Replace homegrown IO handler with QDataStream, catch HUP.

That's nice! I'll test it a little.

Oct 30 2018, 8:13 AM · Baloo, Frameworks
poboiko added a comment to T9595: [KAddressbook] Use KPeople model for contact list.

Sorry, I've postponed this one for some time :(

Oct 30 2018, 7:31 AM · KDE PIM
poboiko updated the summary of D15960: Don't check if file is directory based on mime-type.
Oct 30 2018, 7:29 AM · Baloo, Frameworks

Oct 20 2018

poboiko added a comment to T9805: Overhaul Baloo database scheme.

BTW, is there any reason why, e.g. inside PostingDB, Baloo stores key -> encoded list of values, instead of using MDB_DUPSORT | MDB_DUPFIXED (which allow storing multiple values per key)?

Oct 20 2018, 7:57 PM · Baloo
poboiko abandoned D15959: Wait for the extraction process to finish before scheduling.

Dropped in favor of D16265: [Scheduler] Use flag to track when a runner is going idle, which handles this problem better.

Oct 20 2018, 1:08 PM · Baloo, Frameworks
poboiko added a comment to D16265: [Scheduler] Use flag to track when a runner is going idle.

I like it, it's better than D15959: Wait for the extraction process to finish before scheduling.
And it seems to be working, as far as I can see :)

Oct 20 2018, 1:07 PM · Baloo, Frameworks
poboiko added inline comments to D16266: [Extractor] Make extractor crash resilient.
Oct 20 2018, 9:16 AM · Baloo, Frameworks

Oct 17 2018

poboiko added inline comments to D16266: [Extractor] Make extractor crash resilient.
Oct 17 2018, 1:16 PM · Baloo, Frameworks