Fix issue with Meridian Flip Retry
ClosedPublic

Authored by murveit on May 2 2020, 7:34 AM.

Details

Summary

The retry timing if fixed so that it waits at least 4 minutes
Mount waits if Capture is in a state where it won't accept the meridian flip request.

Test Plan

In a situation where the mount doesn't flip (e.g. it needs a larger HA angle before going to the other side of the pier)
this will wait at least the full 4 minutes.
Also, if the capture module is still busy with tasks from the previous failed flip, this should wait until capture is ready to rety.

Diff Detail

Repository
R321 KStars
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
murveit created this revision.May 2 2020, 7:34 AM
Restricted Application added a project: KDE Edu. · View Herald TranscriptMay 2 2020, 7:34 AM
Restricted Application added a subscriber: kde-edu. · View Herald Transcript
murveit requested review of this revision.May 2 2020, 7:34 AM

I can't comment on the change to the way that the capture meridian flip state and the mount meridian flip status state machines interact because there's no description of how the capture state machine works, nor what these changes do and why.

It looks as if some additional signals and slots have been added to pass state information between the modules. This is in addition to the polling that both modules do to monitor and respond to each other's state. Is this mixture a good idea?

Could we have some more description of what this very complex code is intending to achieve? And future developers will thank you if someone could document how the state machines should work.

kstars/ekos/capture/capture.cpp
3520–3521

Could do with some curly brackets here to improve readability.

3523

What does this do and why?

kstars/ekos/mount/mount.cpp
1293–1296

No issues with doing this, I'd missed that it could take some time before the flip is attempted and fails.
How will this test work if the mount doesn't report pier side? pierside could be PIER_UNKNOWN
From what I can see it will use ha + 12 which is the case for the flip as the mount is crossing the meridian under the pole at about 12h.
Maybe it would be better to check for PIER_EAST and swap the if and else statements.

chrisrowland added inline comments.May 2 2020, 3:21 PM
kstars/ekos/mount/mount.cpp
1293–1296

Tried and there doesn't seem to be a problem. The PIER_UNKNOWN case has already been trapped so it can't get here.

murveit marked 4 inline comments as done.May 2 2020, 7:22 PM

Chris,

I agree whole-heartedly with the need for doc for the state machines, but I am not on top of them, nor have I made any real changes to them. If you like, we can collaborate on some doc for mount and capture, but I think that's outside the scope of this PR.

I do want to explain what I did with respect to the new emit in capture.cpp. When capture emits newMeridianFlipStatus, mount's meridianFlipStatusChanged() method gets called. Those two things are tied together by the "connect" in ekos/manager.cpp line 3369. I believe that is the only effect of that emit.

Unfortunately, mount's meridianFlipStatusChanged() method also got called inside mount.cpp at line 805. That is why I added the ...Internal method(), so that the internal mount call wouldn't execute the 4 new lines I added 1378-1384, but rather would do what it used to do.

So the net effect of the new capture emit, is simply to update the new Mount::m_CaptureMFStatus class variable. I believe that's all it changes.

Now, the new class variable is just used in one place, line 1225, where it may delay mount moving from FLIP_NONE to FLIP_PLANNED. The reason I delay this, is because if mount moves to FLIP_PLANNED, it emits newMeridianFlipStatus(FLIP_PLANNED), which (via the connect in manager.cpp line 3371) calls Capture::meridianFlipStatusChanged(FLIP_PLANNED), and that will ignore the message unless the capture state is MF_NONE. So now mount doesn't send it until capture won't ignore it. If mount sent it earlier, the flip wouldn't happen, as I described in my email bug report. I did try to comment this with the comment on line 1223. If you feel this comment isn't clear enough, please suggest how you'd like it worded, assuming it is clear to you now.

Like you, I don't fully understand the mount/capture state machines enough to make significant changes to them, so I tried to keep things as they were, except for this one delay in change of state.

kstars/ekos/capture/capture.cpp
3520–3521

I removed the dead code, and the IMHO obvious comment.
I think without them, the readability is fine, but if you want, I can add braces here.

3523

Please see my general comment on this update.

kstars/ekos/mount/mount.cpp
1293–1296

I did what you suggested and reversed if/else and tested for east. As you say, unknown doesn't get here.

murveit updated this revision to Diff 81754.May 2 2020, 7:24 PM
murveit marked 2 inline comments as done.

Comments and cosmetic changes

Thanks for the update. Can this be merged now while the state machine discussion + documentation be tackled later on? Would this introduce any further regressions?

The general comments are a great help in understanding what this does but don't address what I think is needed, more description in the code of what is being done and why. The comments here are ephemeral and in six months time when somebody else needs to work on this code they will not be available.

Can some of the description of what this does be added to the source? Thge comments hereIt will then be of use to the next person who needs to work on this.

As for the overall design of this fix what seems to have happened was that in some cases the capture module would cancel a request from the mount to pause acquisition instead of leaving the request in place until the capture was in a position to pause. The fix is to add a sort of half state in the mount where a pause request is delayed if the capture module is not in a state to accept it, then, when the capture module is in a state to accept a pause request, to send it. This sort of merges the mount meridian flip and capture meridian flip state machines.

I am uneasy about this, it seems to be adding a special case to manage another special case, a band aid on a band aid. Is there a way that the original problem - that Capture had decided to cancel a meridian flip request - be fixed?

I don't feel that I'm competent to decide that this is OK. The capture and mount state machines are so complex that I can't tell. I understand that it hasn't failed any testing so far and maybe that's the best that we can say.

No reason to delay.

mutlaqja accepted this revision.May 4 2020, 10:16 AM

I agree. I think it would be a good idea to freeze the Ekos code after this merge until the state machine issue is managed better and more documentation is added.

This revision is now accepted and ready to land.May 4 2020, 10:16 AM
This revision was automatically updated to reflect the committed changes.