Fulltext search
Closed, Resolved, Public

Description

We want to be able to run fulltext searches over mails.
Searching the subject only would be a good start, but more is welcome.

Options are:

  • xapian
  • sqlite
  • baloo?

There is also whoosh, which is a fulltext index on top of lmdb. It is written in Python, but could provide inspiration.

Xapian is likely an easy candidate, as it supports single-writer, multi-reader semantics. We could just dump the full text blobs in there, and then do any additional filtering (flags, ...) later on using the indexes we have in lmdb or the actual data available.
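A minimal sketch of that two-stage approach, in Python for brevity. The `TextIndex` class is a hypothetical in-memory stand-in for the fulltext store, and the `flags` dict stands in for the existing lmdb indexes. Neither reflects the actual Xapian or lmdb API.

```python
from collections import defaultdict


class TextIndex:
    """Hypothetical stand-in for the fulltext store (not the Xapian API)."""

    def __init__(self):
        self._postings = defaultdict(set)  # term -> ids of mails containing it

    def add(self, mail_id, text):
        for term in text.lower().split():
            self._postings[term].add(mail_id)

    def search(self, query):
        """Return ids of mails containing every term of the query."""
        terms = query.lower().split()
        if not terms:
            return set()
        result = set(self._postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self._postings.get(term, set())
        return result


def search_mails(index, flags, query, required_flag=None):
    """Stage 1: fulltext match. Stage 2: filter on a separate flag index."""
    matches = index.search(query)
    if required_flag is not None:
        matches = {m for m in matches if required_flag in flags.get(m, set())}
    return sorted(matches)


index = TextIndex()
index.add("mail1", "Release plan for Kube")
index.add("mail2", "Kube release notes")
index.add("mail3", "Lunch plan")
flags = {"mail1": {"unread"}, "mail2": set(), "mail3": {"unread"}}
```

Calling `search_mails(index, flags, "kube release", "unread")` first collects the fulltext matches and then narrows them by flag, so the fulltext store never needs to know about flags.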

cmollekopf triaged this task as Low priority. May 17 2016, 1:12 PM
mbohlender moved this task from Milestore: MailClient to 0.2 on the Kube board. Nov 8 2016, 1:28 PM
mbohlender edited projects, added Kube (0.2); removed Kube.
cmollekopf raised the priority of this task from Low to Normal. Nov 21 2016, 8:58 PM
cmollekopf moved this task from 0.2 to Backlog on the Kube board. Feb 21 2017, 12:40 PM
cmollekopf edited projects, added Kube; removed Kube (0.2).
cmollekopf merged a task: Restricted Maniphest Task. Feb 22 2017, 4:57 PM
cmollekopf moved this task from Backlog to 0.3 on the Kube board. Feb 28 2017, 5:14 PM
cmollekopf edited projects, added Kube (0.3); removed Kube.

Search will work along these lines:

  • A search query is issued, i.e. a simple fulltext query matching "anything" (meaning we'll have to define which properties that entails exactly).
  • The search is executed against the local search store, resulting in some immediate results. This is of course only possible for already downloaded content (that is also being indexed), so worst case this will yield no results.
  • Simultaneously the search is also sent to the resource in the form of a search command. This will trigger a search in the backend.
  • The results from the backend are transported back to the query and the download of any match is triggered (so we have the subject etc. available).
  • As soon as the result is downloaded with sufficient data it can become part of the visible query result.
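The flow above could be sketched roughly as follows with Python's asyncio. All function names, the dict-based stores and the substring matching are assumptions made for illustration, not the actual resource or search command interface.

```python
import asyncio


async def local_search(query, local_store):
    """Match against already downloaded (and indexed) mails only."""
    return [mid for mid, subject in local_store.items() if query in subject.lower()]


async def backend_search(query, server_store):
    """Hypothetical search command: the backend returns matching ids only."""
    await asyncio.sleep(0)  # stands in for the network round trip
    return [mid for mid, subject in server_store.items() if query in subject.lower()]


async def download(mail_id, server_store, local_store):
    """Fetch enough data (here: the subject) to display the match."""
    await asyncio.sleep(0)
    local_store[mail_id] = server_store[mail_id]


async def search(query, local_store, server_store, on_result):
    query = query.lower()
    # Start the backend search first so it runs while we search locally.
    backend = asyncio.create_task(backend_search(query, server_store))
    seen = set()
    for mail_id in await local_search(query, local_store):
        seen.add(mail_id)
        on_result(mail_id)  # immediate results
    for mail_id in await backend:
        if mail_id not in seen:
            await download(mail_id, server_store, local_store)
            seen.add(mail_id)
            on_result(mail_id)  # becomes visible once downloaded


server = {"a": "Kube release", "b": "Lunch", "c": "Release notes"}
local = {"a": "Kube release"}
results = []
asyncio.run(search("release", local, server, results.append))
```

Here mail "a" is reported straight from the local store, while "c" only shows up after the backend match has been downloaded.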

Some considerations:

  • One key problem will be to get consistent results online and offline. It's okay to get fewer results offline than online, but given all the data, the results should be consistent.
  • While we can do as-you-type searching in the local store, this will probably not work for the backend search.
  • Started searches that become irrelevant (because the query was changed etc.) will need to be aborted to avoid getting stuck on outdated searches (we will only be able to execute a limited number of searches in parallel).
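Aborting outdated searches could look roughly like this, again as an asyncio sketch. `SearchSession` and `slow_search` are hypothetical names, and the sleep merely stands in for the backend round trip.

```python
import asyncio


class SearchSession:
    """Keeps at most one backend search in flight (hypothetical helper)."""

    def __init__(self, backend_search):
        self._backend_search = backend_search  # coroutine function: query -> results
        self._task = None

    def start(self, query):
        # A new query makes the previous search irrelevant, so abort it.
        if self._task is not None and not self._task.done():
            self._task.cancel()
        self._task = asyncio.create_task(self._backend_search(query))
        return self._task


async def demo():
    finished = []

    async def slow_search(query):
        await asyncio.sleep(0.05)  # stands in for the backend round trip
        finished.append(query)
        return [query]

    session = SearchSession(slow_search)
    first = session.start("kub")
    await asyncio.sleep(0)  # let the first search start running
    second = session.start("kube")  # the query changed: "kub" is cancelled
    results = await second
    return first.cancelled(), finished, results


outcome = asyncio.run(demo())
```

Only the latest query runs to completion, so a changed query never waits behind an outdated search.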
cmollekopf moved this task from 0.3 to 0.4 on the Kube board. Apr 19 2017, 11:44 AM
cmollekopf edited projects, added Kube (0.4); removed Kube (0.3).
cmollekopf moved this task from 0.4 to Backlog on the Kube board. Apr 20 2017, 8:37 AM
cmollekopf edited projects, added Kube; removed Kube (0.4).

We'll likely start off with fulltext queries only when connected to the server. Locally you can still filter by subject, sender, etc.

cmollekopf moved this task from Backlog to 0.6 on the Kube board. Feb 3 2018, 1:01 PM
cmollekopf edited projects, added Kube (0.6); removed Kube.

We now have local fulltext search based on xapian.

cmollekopf moved this task from Backlog to Done on the Kube (0.6) board. Feb 19 2018, 4:56 PM
cmollekopf closed this task as Resolved. Jul 5 2018, 3:02 PM
cmollekopf claimed this task.
cmollekopf closed subtask T8038: Conversation view search as Resolved.