Reduce balooctl index/clear memory usage
Needs Review · Public

Authored by davispuh on Mar 9 2019, 7:43 PM.

Details

Summary

When balooctl index/clear runs on a large number of files, it uses a lot of memory because the DB transaction is committed only at the end. This prevents lmdb from reclaiming memory, so memory use only grows.

For example, running balooctl index ./* on a folder with ~650 files made baloo use 9 GiB of RAM; m_pendingOperations.size() was 20M.

Test Plan

Run balooctl index ./* on a large folder and check that it no longer uses that much memory.

Diff Detail

Repository
R293 Baloo
Branch
master
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 9401
Build 9419: arc lint + arc unit
davispuh created this revision. Mar 9 2019, 7:43 PM
Restricted Application added projects: Frameworks, Baloo. Mar 9 2019, 7:43 PM
Restricted Application added subscribers: Baloo, kde-frameworks-devel.
davispuh requested review of this revision. Mar 9 2019, 7:43 PM
bruns added a subscriber: bruns. Mar 9 2019, 7:46 PM

A transaction cannot be split up, that's the reason it is a transaction ...

davispuh edited the summary of this revision. Mar 9 2019, 8:07 PM
davispuh edited the test plan for this revision.
davispuh added reviewers: dhaumann, bruns, ngraham.

A transaction cannot be split up, that's the reason it is a transaction ...

It can, and they must be split into much smaller transactions. Currently baloo sometimes uses transactions that are far too big, which causes memory usage to spike.

bruns requested changes to this revision. Mar 9 2019, 8:57 PM

Do not create a new transaction per file, that's costly.

src/tools/balooctl/main.cpp:225

You are discarding the write transaction here without calling commit() or abort().
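
One common way to avoid dropping a write transaction without commit() or abort() is a small RAII guard that aborts on scope exit unless a commit has happened. The sketch below only illustrates that idea: `FakeTransaction`, `TransactionGuard`, and `runWithGuard` are hypothetical names made up for this example and are not Baloo's actual classes.

```cpp
#include <cassert>

// Hypothetical stand-in for a write transaction; NOT Baloo's real Transaction.
enum class TxnState { Open, Committed, Aborted };

struct FakeTransaction {
    TxnState state = TxnState::Open;

    void commit() {
        assert(state == TxnState::Open); // finishing a transaction twice is a bug
        state = TxnState::Committed;
    }
    void abort() {
        assert(state == TxnState::Open);
        state = TxnState::Aborted;
    }
};

// Guard that guarantees the transaction is finished one way or the other:
// if nobody committed before the guard leaves scope, it aborts.
class TransactionGuard {
public:
    explicit TransactionGuard(FakeTransaction &txn) : m_txn(txn) {}
    ~TransactionGuard() {
        if (m_txn.state == TxnState::Open) {
            m_txn.abort();
        }
    }
    TransactionGuard(const TransactionGuard &) = delete;
    TransactionGuard &operator=(const TransactionGuard &) = delete;

    void commit() { m_txn.commit(); }

private:
    FakeTransaction &m_txn;
};

// Returns the final state of a transaction that was either committed
// explicitly (succeed == true) or simply left to go out of scope.
TxnState runWithGuard(bool succeed) {
    FakeTransaction txn;
    {
        TransactionGuard guard(txn);
        if (succeed) {
            guard.commit();
        }
    } // guard destructor runs here and aborts if still open
    return txn.state;
}
```

With this shape, an early return or a forgotten commit() ends in an explicit abort instead of a silently discarded transaction. Whether the patch should commit or abort at that point is a separate question for the author.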

This revision now requires changes to proceed. Mar 9 2019, 8:57 PM
davispuh updated this revision to Diff 53549. Mar 9 2019, 10:25 PM
davispuh edited the summary of this revision.
davispuh edited the test plan for this revision.

Don't create a transaction for every file

Do not create a new transaction per file, that's costly.

Still less expensive than a single transaction for 1k files.
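
The trade-off discussed in this thread (one transaction per file versus one huge transaction) can be illustrated with a middle ground: commit after every N files. This is a minimal sketch, not the code from this revision. `FakeDb`, `indexInBatches`, and the batch sizes are hypothetical. The model also simplifies by counting one pending operation per file, whereas the summary reports far more operations per file in practice.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a database with one open write transaction;
// it only counts operations and is NOT Baloo's or lmdb's API.
struct FakeDb {
    std::size_t pending = 0;     // operations buffered in the open transaction
    std::size_t peakPending = 0; // high-water mark of buffered operations
    std::size_t commits = 0;     // number of commits performed
    std::size_t stored = 0;      // operations made durable so far

    void add(const std::string &path) {
        (void)path; // a real indexer would read and index the file here
        ++pending;
        if (pending > peakPending) {
            peakPending = pending;
        }
    }

    void commit() {
        stored += pending;
        pending = 0; // buffered work is released once committed
        ++commits;
    }
};

// Index all files, committing whenever batchSize operations are pending.
void indexInBatches(FakeDb &db, const std::vector<std::string> &files,
                    std::size_t batchSize) {
    assert(batchSize > 0);
    for (const std::string &file : files) {
        db.add(file);
        if (db.pending >= batchSize) {
            db.commit();
        }
    }
    if (db.pending > 0) {
        db.commit(); // flush the final partial batch
    }
}
```

In this model, a batch size of 1 corresponds to a transaction per file (many commits, minimal buffering), while a batch size larger than the file count corresponds to a single transaction (one commit, everything buffered until the end). A suitable value in between would have to be measured on the real indexer.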