Adding the error handling strategy caused a regression such that running jobs where a constraint is not met are nevertheless restarted. Additionally, setting them to "COMPLETED" as soon as a constraint is not met, is misleading. With this diff, jobs are set to state IDLE as soon as a constraint is no longer met. Jobs where a constraint is not met are scheduled to a future date where the constraint is met again.
Details
Test case #1: Create a job with a constraint - e.g. the altitude - is met and that is infinitely repeated. Start the job and check whether
- the job is postponed as soon as the constraint is not longer valid
- as soon as the constraint is valid again, the job is restarted
Test case #2: Add a job in front of the job from test case #1 that is only executed once. Start the schedule and check that the completed job will not be restarted as soon as the scheduler wakes up.
Diff Detail
- Repository
- R321 KStars
- Branch
- scheduler_stop_idle
- Lint
No Linters Available - Unit
No Unit Test Coverage - Build Status
Buildable 17281 Build 17299: arc lint + arc unit
Still in progress, so far so good.
This morning I tested a very simple Simulator observation schedule using Deneb with default constrains, and a SimClock interval of 5 minutes.
Deneb starts after dusk, proceeds to focus and align, is cancelled when dawn is there and is rescheduled to the next night.
That situation repeats indefinitely as if a multi-night schedule.
It would be quite important to use pre-emptive shutdown here, as the Mount Simulator probably wrapped cables up multiple times during my test... (I spotted a wrong warning there, I'll fix that)
Proceeded with a test with two schedule jobs, one that is able to finish inside one night, and one that cannot.
First job runs after dusk, and manages to finish before dawn, and is marked Completed.
Second job starts to run, is preempted by dawn, and is marked Scheduled for the next night while the first remains Completed.
So far so good.
But when Scheduler wakes up on next dusk, the first job is reset to Scheduled and is run again.
I'm not convinced this is a regression from D22446, I'm having a look.
OK so as far as I have investigated this issue doesn't seem to be a regression from D22446: when Scheduler wakes up, Scheduler::wakeUpScheduler() calls Scheduler::start(), which calls Scheduler::startJobEvaluation(), which resets all jobs to JOB_IDLE.
Scheduler::startJobEvaluation() is supposed to be called from the UI when the end-user really presses the play button.
Here state is SCHEDULER_IDLE, and the wake up procedure doesn't seem to care whether Scheduler is actually more or less already running.
That needs a bit more time to check.
All in all, test #2 is KO, but something older is at work there.
I think I got the thing solved. I'm running further tests and I will update this differential.
- Fix log crash when scaling SimClock without current job.
- Assert prerequisite of valid current job when testing whether Scheduler should sleep or not.
- Update altitude each time Scheduler timer elapses while running.
- Fix preemptive shutdown warning when preemptive shutdown is enabled.
- Adjust altitude-related tooltip in queue table.
- Fix jobs being reset when Scheduler wakes up.
So after this change, what's the meaning exactly of IDLE and ABORTED? when is the job made into ABORTED?
@wreissenberger Please check the PR
kstars/ekos/scheduler/scheduler.cpp | ||
---|---|---|
1564 | Maybe update the comments? | |
3110 | for (auto job: jobs) |