Changeset View
Standalone View
src/file/filecontentindexer.cpp
Show First 20 Lines • Show All 48 Lines • ▼ Show 20 Line(s) | |||||
49 | 49 | | |||
50 | void FileContentIndexer::run() | 50 | void FileContentIndexer::run() | ||
51 | { | 51 | { | ||
52 | ExtractorProcess process; | 52 | ExtractorProcess process; | ||
53 | connect(&process, &ExtractorProcess::startedIndexingFile, this, &FileContentIndexer::slotStartedIndexingFile); | 53 | connect(&process, &ExtractorProcess::startedIndexingFile, this, &FileContentIndexer::slotStartedIndexingFile); | ||
54 | connect(&process, &ExtractorProcess::finishedIndexingFile, this, &FileContentIndexer::slotFinishedIndexingFile); | 54 | connect(&process, &ExtractorProcess::finishedIndexingFile, this, &FileContentIndexer::slotFinishedIndexingFile); | ||
55 | 55 | | |||
56 | m_stop.store(false); | 56 | m_stop.store(false); | ||
57 | auto batchSize = m_batchSize; | ||||
58 | | ||||
57 | while (m_provider->size() && !m_stop.load()) { | 59 | while (m_provider->size() && !m_stop.load()) { | ||
58 | // | 60 | // | ||
59 | // WARNING: This will go mad, if the Extractor does not commit after N=m_batchSize files | 61 | // WARNING: This will go mad, if the Extractor does not commit after N=m_batchSize files | ||
60 | // cause then we will keep fetching the same N files again and again. | 62 | // cause then we will keep fetching the same N files again and again. | ||
61 | // | 63 | // | ||
62 | QElapsedTimer timer; | 64 | QElapsedTimer timer; | ||
63 | timer.start(); | 65 | timer.start(); | ||
64 | 66 | | |||
65 | QVector<quint64> idList = m_provider->fetch(m_batchSize); | 67 | QVector<quint64> idList = m_provider->fetch(batchSize); | ||
66 | if (idList.isEmpty() || m_stop.load()) { | 68 | if (idList.isEmpty() || m_stop.load()) { | ||
67 | break; | 69 | break; | ||
68 | } | 70 | } | ||
69 | QEventLoop loop; | 71 | QEventLoop loop; | ||
70 | connect(&process, &ExtractorProcess::done, &loop, &QEventLoop::quit); | 72 | connect(&process, &ExtractorProcess::done, &loop, &QEventLoop::quit); | ||
71 | 73 | | |||
74 | bool hadErrors = false; | ||||
apol: `= false;` | |||||
75 | connect(&process, &ExtractorProcess::failed, &loop, [&hadErrors, &loop]() { hadErrors = true; loop.quit(); }); | ||||
Is it OK to use QObject::connect inside a while loop? Those are not Qt::UniqueConnection, won't they fire multiple times (more and more, actually)? poboiko: Is it OK to use `QObject::connect` inside a `while` loop? Those are not `Qt::UniqueConnection`… | |||||
given loop lives on the stack it is destroyed when the scope is left and the connection severed broulik: given `loop` lives on the stack it is destroyed when the scope is left and the connection… | |||||
76 | | ||||
72 | process.index(idList); | 77 | process.index(idList); | ||
73 | loop.exec(); | 78 | loop.exec(); | ||
74 | 79 | | |||
75 | // QDbus requires us to be in object creation thread (thread affinity) | 80 | // QDbus requires us to be in object creation thread (thread affinity) | ||
76 | // This signal is not even exported, and yet QDbus complains. QDbus bug? | 81 | // This signal is not even exported, and yet QDbus complains. QDbus bug? | ||
77 | QMetaObject::invokeMethod(this, "newBatchTime", Qt::QueuedConnection, Q_ARG(uint, timer.elapsed())); | 82 | QMetaObject::invokeMethod(this, "newBatchTime", Qt::QueuedConnection, Q_ARG(uint, timer.elapsed())); | ||
83 | | ||||
84 | if (hadErrors && !m_stop.load()) { | ||||
85 | if (idList.size() == 1) { | ||||
86 | auto failedId = idList.first(); | ||||
87 | m_provider->markFailed(failedId); | ||||
88 | batchSize = m_batchSize; | ||||
89 | } else { | ||||
90 | batchSize = idList.size() / 2; | ||||
91 | } | ||||
Do I understand correctly, that's a binary-search-like way to find the file actually causing extractor to crash, and it will reindex some of the files in a batch ~log2(batchsize) times? poboiko: Do I understand correctly, that's a binary-search-like way to find the file actually causing… | |||||
Yes, its a binary search. bruns: Yes, its a binary search.
Because we only have IDs here, and the progress reporting works on… | |||||
There is Transaction::documentId(const QByteArray& path), which can resolve it using DocumentUrlDB. poboiko: There is `Transaction::documentId(const QByteArray& path)`, which can resolve it using… | |||||
But it is racy - if the file is replaced in the meantime, inode and filename no longer match. This is not completely unlikely when dealing with temporary files. It would make the code also significantly more complex, and I want it simple here. It should only be hit in exceptional cases. Also, it is not to uncommon to have a batch of size one from the start, i.e. when adding/modifying files. bruns: But it is racy - if the file is replaced in the meantime, inode and filename no longer match. | |||||
OK, we can simply count how many times we got finishedIndexingFile, and just go to the corresponding position in the batch. poboiko: OK, we can simply count how many times we got `finishedIndexingFile`, and just go to the… | |||||
Counting would give a good hint which one failed, but still no guarantee. Files may be added or removed during the indexer run, so the position is only approximate. The indexer may also have sent a "finished" message, and crash afterwards in a destructor call. Doing a binary search is straight forward and avoids any dependencies or assumptions about other parts of the code. Also, this code should be only temporary anyway - if the extractor is run in a separate process which only receives the file using a readonly file descriptor (for sandboxing) and passes back the result, the problematic documents id is known by the parent process. bruns: Counting would give a good hint which one failed, but still no guarantee. Files may be added or… | |||||
92 | process.start(); | ||||
93 | } | ||||
78 | } | 94 | } | ||
79 | QMetaObject::invokeMethod(this, "done", Qt::QueuedConnection); | 95 | QMetaObject::invokeMethod(this, "done", Qt::QueuedConnection); | ||
80 | } | 96 | } | ||
81 | 97 | | |||
82 | void FileContentIndexer::slotStartedIndexingFile(const QString& filePath) | 98 | void FileContentIndexer::slotStartedIndexingFile(const QString& filePath) | ||
83 | { | 99 | { | ||
84 | m_currentFile = filePath; | 100 | m_currentFile = filePath; | ||
85 | if (!m_registeredMonitors.isEmpty()) { | 101 | if (!m_registeredMonitors.isEmpty()) { | ||
Show All 32 Lines |
= false;