Fix crash when cups returns jobs with duplicate id
ClosedPublic

Authored by aacid on Nov 29 2019, 11:56 PM.

Details

Summary

For some reason my CUPS was giving me two withheld jobs with id 33 and two with id 40.

That made the JobModel code crash, because it went like this:

  • First job with id 33 found
  • insertRow with 0 called
  • Row 0 inserted
  • Job at row 0 updated (from inside insertRow)
  • Second job with id 33 found
  • The "oh, I already have this job" code triggers: it updates the job, then takeRow(0) removes the row and tries to re-insert it at row 1, which fails because QStandardItemModel doesn't like getting a row inserted at 1 when it is empty
  • First job with id 40 found
  • insertRow with 2 called
  • Row 2 insert fails: QStandardItemModel doesn't like getting a row added at 2 when it is empty
  • Job at row 2 updated (from inside insertRow)
  • Crash because there's no row 2 in the model
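The sequence above can be replayed with a minimal stand-in (plain C++, not the real JobModel code; `Model`, `insertRow`, and `takeRow` are illustrative names). A `std::vector` plays the role of the row storage, and inserting past the end is rejected the same way QStandardItemModel rejects an out-of-range row:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model: 'rows' holds one job id per row.
struct Model {
    std::vector<int> rows;

    // Mirrors QStandardItemModel: inserting past rowCount() is rejected.
    bool insertRow(std::size_t at, int jobId) {
        if (at > rows.size())
            return false;
        rows.insert(rows.begin() + at, jobId);
        return true;
    }

    // Mirrors takeRow(): remove the row at 'at' and hand back its job id.
    int takeRow(std::size_t at) {
        int jobId = rows[at];
        rows.erase(rows.begin() + at);
        return jobId;
    }
};

// Replaying the buggy flow: once duplicate id 33 takes row 0 out and
// tries to re-insert it at row 1, the model is empty, so that insert
// and the later insertRow(2, 40) both fail -- leaving no row 2 for the
// "update job at row 2" step to touch, hence the crash.
```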

BUGS: 414083

Test Plan

Doesn't crash anymore with my weird cups list of pending jobs

Diff Detail

Repository
R363 Print Manager
Branch
release/19.12
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 19291
Build 19309: arc lint + arc unit
aacid created this revision.Nov 29 2019, 11:56 PM
Restricted Application added a subscriber: kde-utils-devel. · View Herald TranscriptNov 29 2019, 11:56 PM
aacid requested review of this revision.Nov 29 2019, 11:56 PM

Hmmm, I think this may not be correct; I need to figure out how rows are supposed to be removed.

aacid updated this revision to Diff 70594.Nov 30 2019, 12:18 AM

update code

Now it should be good. The problem is I ended up doing "cancel -a" to clear my list of pending jobs, so now my queue is empty and I can't be 100% sure it fixes the crash.

But AFAICS the logic is good.

fvogt added a subscriber: fvogt.Nov 30 2019, 10:45 AM

I wonder whether the job duplication can just be ignored like this or whether there has to be a "virtual" unique id instead to be able to address duplicated jobs as well.
When cancelling a duplicated job with this patch, are both gone or just one of them?

I can't play with it anymore because I fixed my jobs queue already, but:

lpstat -R listed the jobs twice (exact same info for each of the 2 pairs)

The CUPS web interface listed 4 jobs too.

With this fix the print-manager UI only lists them once; cancelling through the print-manager UI did "nothing" at the UI level, because all it did was cancel one of the two jobs, so the other job was still there.

After cancelling one job, trying to cancel it again (the remaining one) through the print-manager UI gave an error saying "this job is already cancelled".

I seem to remember that at that point trying to cancel through the CUPS web interface also gave an error (but I can't be sure here).

I had to cancel the jobs with a wholesale "cancel -a" that cleaned up everything.

In my mind this is a CUPS fuckup and the best we can do is try not to crash.

But if someone had steps to reproduce the issue in CUPS, we could see if there's a better way out of it.

So I got access to another machine with the broken situation, and I can confirm it has duplicate ids too and my patch fixes the crash.

dantti added a comment.Dec 1 2019, 7:19 PM

Oh, now I see the problem with this. I tried many times to reproduce it but never managed to see the issue with the duplicate IDs.

I think perhaps line 205 (insertRow()) could be fixed to always check whether 'i' is greater than rowCount();
that would perhaps make for a smaller patch.
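A hedged sketch of that suggestion (a plain-C++ stand-in, not the actual print-manager code; the names are illustrative): clamp the requested index to rowCount() so an out-of-range insert becomes an append instead of a failure.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Clamp the target row before inserting, so a request past the end
// turns into an append rather than a rejected insert.
void insertRowClamped(std::vector<int>& rows, std::size_t i, int jobId) {
    i = std::min(i, rows.size());  // never insert past rowCount()
    rows.insert(rows.begin() + i, jobId);
}
```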

dantti added a comment.Dec 1 2019, 7:20 PM

But the remove-old-jobs code would also need fixing to account for the dups.

dantti added a comment.Dec 1 2019, 7:25 PM

Well, I think your approach is probably the best for this; I'd only change the sanitize loop to use iterators and erase, as AFAIR they are a little faster.

aacid added a comment.Dec 1 2019, 10:25 PM

Well, I think your approach is probably the best for this; I'd only change the sanitize loop to use iterators and erase, as AFAIR they are a little faster.

In a QList, direct index access is not really slower than an iterator (that would be the case for QLinkedList).

Also, even if it were, we're speaking of lists of what, 10 jobs? Given that I can no longer reproduce the issue, I would very much prefer not to rewrite the code.
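For illustration, a sanitize pass like the one under discussion can be sketched as follows (plain C++ over a contiguous container; the function name is made up, not the patch's actual code). On a contiguous container, the index loop costs essentially the same as iterators, which is the point being made:

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Keep only the first job carrying each id; later duplicates are dropped.
std::vector<int> dedupeJobIds(const std::vector<int>& ids) {
    std::set<int> seen;
    std::vector<int> unique;
    for (std::size_t i = 0; i < ids.size(); ++i)  // plain index access
        if (seen.insert(ids[i]).second)           // true only on first sight
            unique.push_back(ids[i]);
    return unique;
}
```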

Perhaps this is helpful. It appears that the CUPS queue (or a cached version of it) is stored in /var/cache/cups/job.cache. By editing it manually, you can remove duplicate entries but also create them for testing.

My queue with duplicate ids, which led to plasma crashing, looked like this:

$ lpstat -R
  0 MyPrinter-148         marcel           60416 2016-05-24T16:35:53 CEST
  1 MyPrinter-148         marcel           60416 2016-05-24T16:35:53 CEST
  2 MyPrinter-188         marcel           72704 2017-08-29T17:49:47 CEST
  3 MyPrinter-188         marcel           72704 2017-08-29T17:49:47 CEST

Both job 148 and 188 had duplicated sections in the job.cache file. While CUPS was not running, I edited the file and removed the duplicate sections. After restarting CUPS, only a single copy of each job remained in the lpstat -R output.

To reproduce the duplicate ids, print a document but keep it in "held" state so that the job stays in the queue: lpr -PMyPrinter -q document.pdf. Find out the job id (with lpstat or similar). Then stop CUPS, edit job.cache, and duplicate the section of that file corresponding to the just-printed job (probably at the end of the file). This is how one of those sections looks:

<Job 257>
State 4
Created 1575280039
Priority 50
Username marcel
Name document.pdf
Destination MyPrinter
DestType 0
KOctets 393
NumFiles 1
File 1 application/pdf 0
</Job>

After starting CUPS, that job will also appear duplicated when inspecting the queue with lpstat -R.

I still don’t know how this can happen without manually mucking around with job.cache, but the above at least reproduces the symptom.
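The reproduction steps above can be condensed into a shell session. This is a hedged sketch: the systemd service name and editor invocation assume a typical Linux install and may differ on your distribution; the cache path is the one quoted above.

```shell
# Sketch of the reproduction steps described above (assumes systemd).
lpr -PMyPrinter -q document.pdf       # print, but keep the job held
lpstat -o                             # note the job id, e.g. MyPrinter-257
sudo systemctl stop cups              # CUPS must not be running
# Duplicate the matching "<Job N> ... </Job>" section (usually at the
# end of the file) with your editor of choice:
sudoedit /var/cache/cups/job.cache
sudo systemctl start cups
lpstat -R                             # the held job now shows up twice
```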

dantti accepted this revision.Dec 2 2019, 1:51 PM

LGTM

This revision is now accepted and ready to land.Dec 2 2019, 1:51 PM
This revision was automatically updated to reflect the committed changes.