I'm seeing some strange failures to recover from errors that I could do with some help on. It's not urgent, but it would be useful to know whether you can help with a 'quick fix'. I'm running ACP 8.0 build 6 service release 1 and Maxim DL 6.13.
I am seeing an issue where my filter wheel occasionally locks up. I'm pretty sure it's a dodgy DC cable, though it might be a dodgy driver. Nothing you guys can do about that!
However, if the lock-up happens, then ACP gets into a very weird state that needs a complete restart to recover from.
The cascade of events that ends up with me needing to restart is:
- ACP is working along just fine and tries to take an image (usually a pointing update, I guess because the dodgy cable works loose after a slew to a new target).
- Maxim tries to move the filter wheel and fails.
- Maxim looks OK after this failure, but any attempt to take images with it after this gives a filter wheel error. Power cycling the wheel usually fixes Maxim, but sometimes Maxim needs to be restarted to recover.
Now, that's a bit painful and I'm troubleshooting it separately. I don't imagine it's anything to do with ACP though. Hopefully if I sort out my filter wheel power, then I won't hit this again!
My worry is that ACP doesn't seem to handle this failure very well, and I'm wondering if you've got any suggestions:
1) ACP doesn't notice the failure and just sits waiting forever for the exposure to finish, leaving the mount tracking. I'm not sure if that's an ACP issue or Maxim failing to return the right status. The ACP log is attached, but there's no detail in it that I can see: there is an entry at 12:07:08 UTC for taking the pointing exposure, then nothing. The next entry is at 16:09:46 UTC, around 4 hours later, when I manually stopped and restarted Maxim to try to recover:
16:09:46 **Pointing update error from Microsoft VBScript runtime error:
The remote server machine does not exist or is unavailable: 'Camera.CameraStatus'
I imagine this means that Maxim simply never returned an error or a completion status to ACP, but it would be really nice if ACP had some sort of timeout to detect this rather than sitting there forever with the mount tracking. ACP knows that it's trying to take a 10 second exposure, so if that still hasn't returned long afterwards (4 hours, in this case), presumably you could run a timer that fires and takes recovery action? Even if that's just to shut down, anything would be better than letting the mount continue tracking and possibly hitting the pier. Could you look at adding something like this?
2) The other symptom is that if I manually stop and restart Maxim to recover from the filter wheel lock-up, ACP doesn't notice that the old instance of Maxim has gone and doesn't try to re-attach to the new instance. Any attempt to take images results in the same 'remote server machine does not exist or is unavailable' error. In fact, even if you go to the ACP UI and try to 'disconnect' the camera, you get the same error message in a popup window. Effectively, once Maxim has terminated and restarted, ACP is stuck unable to communicate with it, and you can't even disconnect/reconnect from within the ACP UI to recover, so I then need to restart ACP as well. I'm not sure what logs would be useful to help troubleshoot this, but do you have any suggestions?
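To make the two requests a bit more concrete, here is a rough Python sketch of the behaviour I have in mind. It is not ACP or Maxim code, and I don't know how ACP actually polls the camera internally: `wait_for_exposure`, `ExposureTimeout`, and `CameraLink` are names I made up for illustration, the 60 second margin is an arbitrary placeholder, and `connect` / `is_stale_error` stand in for whatever ACP really uses to attach to Maxim and to recognise the 'remote server machine does not exist' error.

```python
import time


class ExposureTimeout(Exception):
    """Raised when an exposure does not finish within its deadline."""


def wait_for_exposure(is_done, exposure_s, margin_s=60.0, poll_s=1.0,
                      clock=time.monotonic, sleep=time.sleep):
    """Poll is_done() until it returns True or the deadline passes.

    The deadline is the exposure time plus a margin for filter moves
    and download. The margin value here is a guess, not an ACP setting.
    """
    deadline = clock() + exposure_s + margin_s
    while not is_done():
        if clock() >= deadline:
            # The caller can now take recovery action, e.g. stop tracking.
            raise ExposureTimeout(
                f"exposure not finished after {exposure_s + margin_s:.0f} s")
        sleep(poll_s)


class CameraLink:
    """Holds a camera handle and re-attaches once if the handle is stale."""

    def __init__(self, connect, is_stale_error):
        self._connect = connect                # hypothetical: returns a new handle
        self._is_stale_error = is_stale_error  # hypothetical: classifies an error
        self._camera = connect()

    def call(self, action):
        """Run action(camera); on a stale-handle error, reconnect and retry once."""
        try:
            return action(self._camera)
        except Exception as err:
            if not self._is_stale_error(err):
                raise
            # The old handle points at a dead instance, so fetch a fresh one.
            self._camera = self._connect()
            return action(self._camera)
```

With something like this, a hung 10 second exposure would raise `ExposureTimeout` after about 70 seconds instead of waiting 4 hours, and a restarted Maxim would be picked up on the next camera call rather than needing an ACP restart.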
Thanks in advance!