The problem:
frameworks/baloo/src/engine/idutils.h:106

  inline quint64 statBufToId(const QT_STATBUF& stBuf)
  {
      // We're loosing 32 bits of info, so this could potentially break
      // on file systems with really large inode and device ids
      return devIdAndInodeToId(static_cast<quint32>(stBuf.st_dev),
                               static_cast<quint32>(stBuf.st_ino));
  }
This id is central and used to access info in several databases.
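To make the information loss concrete, here is a minimal sketch of how two different files can end up with the same id. The names `packId`, `truncatedId` and `demonstrateCollision` are made up for illustration, `std::uint64_t`/`std::uint32_t` stand in for `quint64`/`quint32`, and the bit layout (device id in the high half) is an assumption that may not match what `devIdAndInodeToId` really does.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for devIdAndInodeToId(): device id in the high
// 32 bits, inode in the low 32 bits. The real layout may differ.
inline std::uint64_t packId(std::uint32_t devId, std::uint32_t inode)
{
    return (static_cast<std::uint64_t>(devId) << 32) | inode;
}

// Mirrors the truncating casts in statBufToId(), with plain 64-bit
// integers standing in for the st_dev and st_ino fields.
inline std::uint64_t truncatedId(std::uint64_t stDev, std::uint64_t stIno)
{
    return packId(static_cast<std::uint32_t>(stDev),
                  static_cast<std::uint32_t>(stIno));
}

// Two distinct 64-bit inodes on the same device collapse to one id,
// because they differ only in the bits that the cast throws away.
inline void demonstrateCollision()
{
    const std::uint64_t inodeA = 5;
    const std::uint64_t inodeB = inodeA + (std::uint64_t{1} << 32);
    assert(inodeA != inodeB);
    assert(truncatedId(1, inodeA) == truncatedId(1, inodeB));
}
```

Any file system that hands out inode numbers above 2^32 could in principle trigger this, which is why the id type needs to be revisited.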
The backend database (LMDB) takes keys and values as MDB_val structs:

  typedef struct MDB_val {
      size_t mv_size; /**< size of the data item */
      void *mv_data;  /**< address of the data item */
  } MDB_val;

and an id is passed as a key e.g. like this:

  MDB_val key;
  key.mv_size = sizeof(quint64);
  key.mv_data = static_cast<void*>(&docId);
This needs to change; see T7860.
Suggested approach:
- Aliases:
  - using DocId = quint64;
  - using Inode = quint32;
  - using DeviceId = quint32;
- Replace quint64 with DocId (~528 occurrences)
- Substitute quint32 with Inode or DeviceId
- #include idutils.h where needed
This diff will be huge, but easy to review.
These changes should be neutral. Are they really?
- Create class DocId
  - 2 members: DeviceId and Inode
  - multiple constructors and operator implementations
- Remove the alias using DocId = quint64;
This should still be neutral. Is it really? Does this change affect performance?
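As a rough sketch of what such a class could look like (not a proposal for the final API): the member names, `fromRaw`/`toRaw`, and the bit layout are all assumptions, and `std::uint32_t`/`std::uint64_t` stand in for `quint32`/`quint64` so the snippet compiles without Qt. A real implementation would have to reproduce exactly the layout that `devIdAndInodeToId` writes today.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

using Inode = std::uint32_t;    // stand-in for quint32
using DeviceId = std::uint32_t; // stand-in for quint32

class DocId
{
public:
    DocId() = default;
    DocId(DeviceId device, Inode inode) : m_device(device), m_inode(inode) {}

    // Rebuild a DocId from the 64-bit value stored in existing databases.
    // Assumed layout: device id in the high half, inode in the low half.
    static DocId fromRaw(std::uint64_t raw)
    {
        return DocId(static_cast<DeviceId>(raw >> 32),
                     static_cast<Inode>(raw & 0xFFFFFFFFu));
    }

    // Produce the 64-bit value to store, using the same assumed layout.
    std::uint64_t toRaw() const
    {
        return (static_cast<std::uint64_t>(m_device) << 32) | m_inode;
    }

    DeviceId deviceId() const { return m_device; }
    Inode inode() const { return m_inode; }

    friend bool operator==(const DocId& a, const DocId& b)
    {
        return a.m_device == b.m_device && a.m_inode == b.m_inode;
    }
    friend bool operator!=(const DocId& a, const DocId& b) { return !(a == b); }
    // Ordering follows the raw 64-bit value, so sorted containers keep
    // the order they had with a plain integer id (under the assumed layout).
    friend bool operator<(const DocId& a, const DocId& b)
    {
        return a.toRaw() < b.toRaw();
    }

private:
    DeviceId m_device = 0;
    Inode m_inode = 0;
};

// Cheap compile-time hints that the class is no heavier than a quint64.
static_assert(sizeof(DocId) == sizeof(std::uint64_t), "DocId must stay 8 bytes");
static_assert(std::is_trivially_copyable<DocId>::value, "DocId must be memcpy-safe");
```

The two `static_assert`s only show that the object has the same size and copy semantics as the integer it replaces. Whether the change is really performance-neutral would still need to be measured.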
- Centralize reading and writing of ids in the DocId class
- Hopefully this will offer the opportunity to change e.g. Inode to 64 bits and to use a better DeviceId, while retaining compatibility with existing databases that use 64-bit DocIds. I have the feeling that this is possible, but no idea how yet. Theoretically, DocIds of different sizes can coexist within a database. However, trouble comes with code like this:
  MDB_val val;
  int rc = mdb_get(m_txn, m_dbi, &key, &val);
  QVector<quint64> list(val.mv_size / sizeof(quint64));
  memcpy(list.data(), val.mv_data, val.mv_size);
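The snippet above hard-codes the id width at the call site. One way to centralize it might be a pair of helpers that own the encoding, so a later change of id size touches a single place. The names `RawId`, `decodeIdList` and `encodeIdList` are hypothetical, and the functions take a plain pointer and size (what `mv_data` and `mv_size` would supply) so the sketch compiles without the LMDB or Qt headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

using RawId = std::uint64_t; // stand-in for the current 64-bit DocId

// Hypothetical helper: decode a value buffer into a list of ids.
// Returns false, leaving `out` untouched, if the size is not a whole
// number of ids, e.g. because the value was written with another width.
inline bool decodeIdList(const void* data, std::size_t size,
                         std::vector<RawId>& out)
{
    assert(data != nullptr || size == 0);
    if (size % sizeof(RawId) != 0) {
        return false;
    }
    out.resize(size / sizeof(RawId));
    if (size > 0) {
        std::memcpy(out.data(), data, size);
    }
    return true;
}

// Hypothetical counterpart: encode a list of ids into the byte layout
// that decodeIdList() expects (host byte order, tightly packed).
inline std::vector<unsigned char> encodeIdList(const std::vector<RawId>& ids)
{
    std::vector<unsigned char> bytes(ids.size() * sizeof(RawId));
    if (!bytes.empty()) {
        std::memcpy(bytes.data(), ids.data(), bytes.size());
    }
    return bytes;
}
```

The size check does not by itself let ids of different widths coexist, but it turns a silent misread into a detectable error, which seems like a prerequisite for any migration scheme.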
Finally, in the distant future:
- Change DocId creation so that ids are really unique.
There's a dilemma lurking here. Diffs should be small and digestible, but the whole change should be atomic.