lxr.kde.org seems to have incomplete index for General Search
Closed, Resolved · Public

Description

While @nalvarez updated the git repo URLs yesterday, replacing the no-longer-working git://anongit.kde.org with ones that should work, and https://lxr.kde.org/source/ now at least shows the latest daily code snapshots, something else seems to be broken in the general search index.

See how searching for "KAboutData" yields only 5 results:
https://lxr.kde.org/search?_filestring=&_string=KAboutData&_casesensitive=1
Even 0 results for "qmlRegisterType":
https://lxr.kde.org/search?_filestring=&_string=qmlRegisterType&_casesensitive=1

These results have been the same for a few days now. IIRC things still worked on Monday, when I also grepped for "qmlRegisterType", unless my memory fools me.

kossebau created this task. Jul 3 2020, 3:47 PM
Restricted Application added a subscriber: sysadmin. Jul 3 2020, 3:47 PM

Looking at the scripts, it seems that Tuesday is reindex day, so possibly something went wrong there.
Investigating currently.

For the record: The daily lxr update tasks had been broken for a while. The kdesrc-build clone on halono was using git://, so the script failed very early when running git pull on it. I fixed the remote URL on the kdesrc-build clone and re-ran the update_kf5.sh script, and it started doing what it usually does daily. It updated to the new kdesrc-build code, which then automatically fixed the git URLs on the rest of the repositories.
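A minimal sketch, in Python, of how one could spot clones that still point at the retired git:// endpoint. The helper names are hypothetical and are not part of kdesrc-build or the lxr scripts. It only inspects `git remote -v` output and rewrites nothing, since the correct replacement URL is not stated in this task.

```python
import subprocess
from urllib.parse import urlsplit

# Host taken from the task description; its git:// service no longer works.
DEAD_HOST = "anongit.kde.org"


def uses_dead_remote(url: str) -> bool:
    """Return True if url is a git:// URL on the retired anongit host."""
    parts = urlsplit(url)
    return parts.scheme == "git" and parts.hostname == DEAD_HOST


def stale_remotes(remote_v_output: str) -> list:
    """Parse `git remote -v` output and return the names of stale remotes."""
    stale = []
    for line in remote_v_output.splitlines():
        fields = line.split()
        # Expected shape of each line: "<name> <url> (fetch|push)"
        if len(fields) >= 2 and uses_dead_remote(fields[1]):
            if fields[0] not in stale:
                stale.append(fields[0])
    return stale


def check_clone(path: str) -> list:
    """Run `git remote -v` in the clone at `path` (hypothetical helper)."""
    out = subprocess.run(
        ["git", "-C", path, "remote", "-v"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return stale_remotes(out)
```

Any remote this flags can then be repointed by hand with `git remote set-url <name> <new-url>`.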

It's possible that the search issue being reported here would fix itself on the next Tuesday reindex, after I fixed the above. But then it's weird that it worked before this Monday, rather than breaking the Tuesday after anongit was gone, so there's probably more to it...

I've done a manual reindex and that did not resolve the issue.

Looking through everything, it seems that genxref isn't picking up the majority of the source code files. Not sure why, though.
Any ideas as to why this would happen @dfaure?

bcooksley changed the visibility from "Custom Policy" to "Public (No Login Required)". Jul 4 2020, 9:12 AM
bcooksley changed the edit policy from "Custom Policy" to "All Users".
dfaure added a comment. Jul 4 2020, 9:42 AM

One thing that used to work better was that I got an email when the cron job failed.
But now it says
"
A message that you sent could not be delivered to one or more of its
recipients. This is a permanent error. The following address(es) failed:

faure@kde.org
  Mailing to remote domains not supported

"
and the mails stay in /var/mail/lxr, invisible until next ssh login.
Any chance for emails to be sent again?

It looks like when we provisioned the system we never ran dpkg-reconfigure exim4-config to configure email, and since nothing else on the system needed email, it was never noticed.

I've now done that, so you should be able to send cron emails again.

Thanks.

Here are my findings.

"general" search is powered by glimpse. I wrote a test script ~/bin/debug_glimpse.sh which reproduced the issue with a direct glimpse call, to eliminate the whole HTML + perl layers on top.
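The contents of debug_glimpse.sh are not shown in this task. As a rough sketch of what a direct query could look like, the Python snippet below builds and runs a glimpse command against an index directory passed in by the caller. It assumes glimpse's `-H` (index directory), `-y` (no prompting) and `-i` (case-insensitive) options. The function names are hypothetical.

```python
import shutil
import subprocess
from typing import List, Optional


def glimpse_command(index_dir: str, pattern: str,
                    case_sensitive: bool = True) -> List[str]:
    """Build the argument list for a direct glimpse query."""
    cmd = ["glimpse", "-H", index_dir, "-y"]
    if not case_sensitive:
        # glimpse matches case-sensitively unless -i is given
        cmd.append("-i")
    cmd.append(pattern)
    return cmd


def count_matching_lines(index_dir: str, pattern: str) -> Optional[int]:
    """Return the number of output lines, or None if glimpse is not installed."""
    if shutil.which("glimpse") is None:
        return None
    result = subprocess.run(
        glimpse_command(index_dir, pattern),
        capture_output=True,
        text=True,
    )
    return len(result.stdout.splitlines())
```

Comparing the count for a common identifier such as "KAboutData" with a plain recursive grep over the same source tree would show whether the index itself is incomplete, independent of the web front end.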

In my experience glimpse indexes can get corrupted, so I just did a rm -rf ~/glimpse-db and I'm rerunning ./update_genxref.sh kf5-qt5 in screen now. I think this is actually what's supposed to happen every Tuesday, but we'll see.

I remember debugging a glimpse problem some time ago, but I think that problem was specific to some desktop files with encodings that confused glimpse. This problem seems much more generic. However, the point is: if I need to dig further into glimpse, I will. It won't be my first time :)

Thanks for the fix David. Out of curiosity, what was the cause of the issue?

dfaure added a comment. Jul 4 2020, 9:59 PM

Not fully sure. I thought it was corruption of the glimpse databases, but every Tuesday they get deleted and recreated, AFAICS.

BTW ~/src/stable-qt4 is almost empty. I could look into fixing that, but I'm not sure it's worth it. It's time to delete the qt4 index, I guess?

Given that Qt 6 is just around the corner, I think we can safely say nobody has much interest in Qt 4 any more, so yes, it's probably safe to fully remove support for it now.

bcooksley closed this task as Resolved. Jul 5 2020, 10:09 AM
bcooksley claimed this task.

Thanks for sorting that out David.