While the overall design of Baloo seems well thought out, and it already uses almost every available kernel feature to reduce its priority, there are still some issues where it negatively impacts real-world configurations.
This task should track some of the observed issues and help in understanding them:
- IO thrashing in low-mem situations D24540
- Re-indexing files on every reboot due to unstable DocId: this affects multiple filesystems (e.g., btrfs); some filesystems even have unstable inode numbers, and Baloo should blacklist such filesystems (it already does for tmpfs) BUG:404057 T9805 T11861
- Kernel memory management overhead due to huge mmap and IO behavior (but there's no real way around it currently) D24540 T9873
- High-impact transaction behavior (40 files/trx may have a high memory overhead, while fewer files/trx increase the fsync/fdatasync overhead); more dynamic handling could fix this T9873 BUG:400704 BUG:356357
- Database design issues (that's always a difficult balance between DB size, lookup latency, and memory usage at both index time and search time) T9805
- General memory overhead or leaks which can accumulate over time and cause crashes D24502 T9873
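
To make the "more dynamic handling" idea from the transaction item concrete, here is a minimal sketch of one possible approach: commit a transaction when either an estimated memory budget or a file-count cap is reached, whichever comes first. This is not Baloo code. The class `AdaptiveBatcher`, its parameters `max_bytes` and `max_files`, and the per-file size estimate are all hypothetical names chosen for illustration, and a real indexer would write to and sync the database where this sketch only records the batch size.

```python
from dataclasses import dataclass, field


@dataclass
class AdaptiveBatcher:
    """Hypothetical batcher: commits when a byte budget or a file cap is hit."""

    max_bytes: int = 16 * 1024 * 1024  # assumed memory budget per transaction
    max_files: int = 40                # upper bound, mirrors the 40 files/trx above
    pending: list = field(default_factory=list)
    pending_bytes: int = 0
    commits: list = field(default_factory=list)  # number of files in each commit

    def add(self, path: str, estimated_bytes: int) -> None:
        # Flush first if adding this file would exceed the memory budget.
        if self.pending and self.pending_bytes + estimated_bytes > self.max_bytes:
            self.commit()
        self.pending.append(path)
        self.pending_bytes += estimated_bytes
        # Also flush once the file-count cap is reached.
        if len(self.pending) >= self.max_files:
            self.commit()

    def commit(self) -> None:
        if not self.pending:
            return
        # A real indexer would write the batch and sync the database here.
        self.commits.append(len(self.pending))
        self.pending.clear()
        self.pending_bytes = 0


if __name__ == "__main__":
    batcher = AdaptiveBatcher(max_bytes=100, max_files=3)
    for name, size in [("a", 10), ("b", 10), ("c", 10), ("d", 90), ("e", 20), ("f", 5)]:
        batcher.add(name, size)
    batcher.commit()  # flush whatever is left
    print(batcher.commits)
```

With this scheme, many small files share one sync, while a single large file forces an early commit instead of inflating the transaction. Whether a byte estimate is a good enough proxy for the actual transaction memory cost is an open question for the linked tasks.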
Discussion of the individual problems already exists in some other tasks or changesets which can be linked here. This should mostly be seen as an entry point into all the related topics so we hopefully avoid any duplicate efforts.
New ideas and references can be added to the list above and followed up in individual threads.