10micron ascom driver crashes

**Manoj Koushik** · Nov 13, 2019, 02:50

Hi all, wondering if there is anyone else here who has a similar setup as mine and has *ever* seen the issue I am seeing with alarming consistency.

I have a 10micron GM 3000 HPS mount which is being controlled by ACP. The ascom driver version is 1.5.1.0 (the latest that's posted on their forum).

Every so often it just randomly crashes, leaving ACP with a "hardware error", which causes ACP to crash. Unfortunately, it does not look like the 10micron ascom driver has any logging. When I look in Windows Event Viewer, every time this happens there is an event logged with event type CLR20r3 (basically an unhandled .NET exception) against the 10micron ascom driver.

I have, to eliminate all other sources of issues, disabled wifi on the controller box and also removed the ethernet connection (this way, no router mis-behavior should affect the mount). The only physical connection is RS232 and the ascom driver is configured as such. Even then it crashes as mentioned above.

So, there is some polling sequence that ACP is doing that is causing an unhandled exception in the 10micron ascom driver (this is not an ACP problem. If you expose an API, then no matter how it's called, crashing out is not the calling program's fault).

I have noticed this happening over the last few days, while I am collecting darks. Basically the mount is stopped and there is not much happening except for constant polling. Which is surprising because it's the same set of calls! Not sure why the 100th call to get something would be different from the 1st. Maybe an out-of-memory type issue?

On Bob's suggestion, since there is no other logging, I am running the driver through ascom device hub, which does provide logging to see if there is a specific call that is causing this. Hoping this provides more info.

I have contacted 10micron (via Ed Thomas@deepspaceproducts) but have not heard back much in terms of investigation or discovery.

In any case, just want to poll to see if anyone else is using a GM3000HPS with the 1.5.1.0 ascom driver and has seen anything remotely similar to this.

Any help will be much appreciated. This is bugging the crap out of me and making me reconsider my (expensive) purchase. Maybe Astro-Physics would have been a better bet.

**Eric Dose** · Nov 13, 2019, 14:56

I've seen random GEM driver crashes happen for two reasons:

- The all-too-common USB Suspend problem from Windows 7 and 10. At least one Windows Update has overwritten many users' (correctly) chosen disabling of USB Selective Suspend to re-enable it, leading to USB crashes. The mount usually crashes first, as it polls fastest. Not sure if USB is involved in your mount setup.
- For mounts with fragile physical USB interfaces (like Paramount's), any nearby electrical interference can reset its USB. My own instance was caused by a thermostat 6 feet away whose contacts sent enough of an EM pulse to disrupt my previous Paramount MX+. This problem will be more common in the crowding of hosting facilities. I put ferrites at each end of each USB line to dampen common-mode noise, and that helps, but sparks from mechanical switches and relays can still do harm. Happily, I've never even once noticed such a problem from Digital Logger's web switches--good parts and metal enclosures certainly help.

Or, as you've guessed, it could be software, but the above are 2 things to check in the meantime.

Well, that's what I've seen. Best of luck with it.

**Bob Denny** · Nov 13, 2019, 15:27

Thanks Eric, I keep forgetting to tell people about USB suspend. It's evil.

**Manoj Koushik** · Nov 14, 2019, 00:35

I have made sure that USB suspend is disabled. So that can't be it. The driver ran last night through the device hub and did crash faithfully sometime in the morning. Attached is the trace from the device hub. It crashed when trying to set tracking to false (although the same command had succeeded in the past in the same log). So it's not a fundamental problem with the call itself. Something else is going on.

I figured it might be something like a memory leak, or maybe an issue with caching where too many requests arriving too frequently cause it to crash, so I created this simple script to stress it out. I ran it for 500 iterations with no delay, and it does not crash. In Task Manager I don't notice the memory ballooning either, so it does not seem like a memory leak. Now, I could run this for many more cycles, but I'm not sure that's productive.

// Arguments: number of runs, delay between runs in milliseconds
var numRuns = parseInt(WScript.Arguments.Item(0), 10);
var delay = parseInt(WScript.Arguments.Item(1), 10);

WScript.Echo("Doing " + numRuns + " runs...");
var mount = new ActiveXObject("ASCOM.tenmicron_mount.Telescope");

mount.Connected = true;

// ASCOM exposes these as properties, so they are read without call parentheses.
var value;
for (var i = 0; i < numRuns; i++) {
    WScript.Echo(i);
    value = mount.CanSetTracking;
    value = mount.RightAscension;
    value = mount.Declination;
    value = mount.AtPark;
    value = mount.Slewing;
    value = mount.Tracking;
    value = mount.SideOfPier;
    mount.Tracking = false;
    WScript.Sleep(delay);
}
mount.Connected = false;
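A variant of the loop above that records which step fails might be more informative than a bare loop, since the driver itself has no logging. This is only a sketch: `pollOnce` and `stressMount` are hypothetical helper names, not part of ASCOM or ACP, and the code sticks to old-style JavaScript so it should also run under Windows Script Host.

```javascript
// Hypothetical helper: read each polled property once, then set Tracking to false.
// Returns null on success, or a description of the step that threw.
function pollOnce(mount) {
    var props = ["CanSetTracking", "RightAscension", "Declination",
                 "AtPark", "Slewing", "Tracking", "SideOfPier"];
    var step = "";
    try {
        for (var p = 0; p < props.length; p++) {
            step = "get " + props[p];
            var unused = mount[props[p]];   // property read, value discarded
        }
        step = "set Tracking";
        mount.Tracking = false;
        return null;
    } catch (e) {
        return step + ": " + (e.message || e);
    }
}

// Hypothetical helper: repeat the poll and collect a message for every failed run.
function stressMount(mount, numRuns) {
    var failures = [];
    for (var i = 0; i < numRuns; i++) {
        var err = pollOnce(mount);
        if (err !== null) {
            failures.push("run " + i + " failed at " + err);
        }
    }
    return failures;
}
```

With the real driver, `mount` would be the connected `ActiveXObject("ASCOM.tenmicron_mount.Telescope")` from the script above, and the returned array could be echoed at the end of the run. Whether a crash of the driver surfaces as a catchable error in the script is an assumption that would need checking.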

So it looks like there is something unique going on between the 10micron ascom driver and ACP that is causing this.

Running out of ideas on how to debug this.

On other forums and the interwebs in general I have not found any mention of this sort of problem, and I have been working with Ed, who also mentions that he has not seen this with others!

My system is plain vanilla. I can't think of anything that is super unique about it or the installation so far. So not sure why I am the statistical outlier here. Such a bummer.

**Bob Denny** · Nov 14, 2019, 15:15

Good morning from San Jose.... this will be a result of the patterns of activity coming from ACP and something like a latent timing bug in the 10 Micron driver, maybe an unprotected shared variable being accessed from multiple threads etc. Those kinds of things are almost impossible to trigger with test code, you’re shooting in the dark trying to reproduce the timing and patterns.

The best/fastest way to zero in on it, since your ACP activity is able to reliably reproduce it, is to get together with the 10 Micron developers and get a debug version or one that’s instrumented to do more logging or something, to get right to the code that’s failing. Ideally, if you have Visual Studio 2017/2019 on that system, and the driver is a C# app (I’m 99% sure it is, having talked with the late Per Frejvall, the original author), you could catch it right at the point of failure and they could determine what’s getting jammed up or (likely) seeing trash from being in the process of being modified from another thread or ???
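To make the "unprotected shared variable" idea concrete, here is a toy illustration of a check-then-use race on a cached value. It is emphatically not the 10 Micron driver's code (which nobody in this thread has seen); `cache`, `refreshCache`, `readTrackingUnsafe`, `readTrackingSafe` and `interleave` are all invented names, and generator functions stand in for threads so the bad interleaving can be forced deterministically.

```javascript
// Illustration only: a cached mount state shared by a background
// refresher and a client request, with no protection between them.
const cache = { state: { tracking: true } };

// Background poller: clears the cache, then refills it after a "serial read".
function* refreshCache() {
    cache.state = null;                  // cache briefly holds no value
    yield;                               // simulated hardware round trip
    cache.state = { tracking: false };
}

// Client request: checks the cache, is interrupted, then uses it.
function* readTrackingUnsafe() {
    if (cache.state !== null) {          // check
        yield;                           // "preempted" between check and use
        return cache.state.tracking;     // use: state may be null by now
    }
    return undefined;
}

// Safer variant: take one snapshot and only ever use the snapshot.
function* readTrackingSafe() {
    const snapshot = cache.state;
    yield;
    return snapshot !== null ? snapshot.tracking : undefined;
}

// Force the order: reader checks, refresher clears, reader uses.
function interleave(makeReader) {
    cache.state = { tracking: true };
    const reader = makeReader();
    const refresher = refreshCache();
    reader.next();                       // reader runs up to its yield
    refresher.next();                    // refresher clears the cache
    try {
        return reader.next().value;      // reader resumes and uses the cache
    } catch (e) {
        return "threw " + e.name;
    } finally {
        refresher.next();                // refresher finishes the refill
    }
}
```

Calling `interleave(readTrackingUnsafe)` hits the failure on every run because the ordering is forced; with real threads the same window might be open for microseconds and only be hit by one particular pattern of client calls, which would be consistent with a tight test loop never triggering it.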

**Manoj Koushik** · Nov 14, 2019, 20:17

You are absolutely right Bob. Just wanted to eliminate other easier to identify issues like memory leaks etc.

I have narrowed it down to a point where, at least, every time it's happening, it is happening in AcquireImages.js when tracking is turned off after the sequence is done (based on the logs from the DeviceHub, this is consistently the behavior when it crashes). I am currently only noticing this when acquiring darks where the mount is stopped etc. Otherwise nightly runs, including morning flats have not been a problem.

if(Telescope.CanSetTracking)
{
    Console.PrintLine(" (turning tracking off for safety)");
    Telescope.Tracking = false;
    Util.WaitForMilliseconds(1000); // **TODO** Remove after this is done inside ACP
}

This happens after the last image is done. And I can see that this is where the crash happens and the last image is not saved. If I change this to

if(Telescope.CanSetTracking && Telescope.Tracking)
{
    Console.PrintLine(" (turning tracking off for safety)");
    Telescope.Tracking = false;
    Util.WaitForMilliseconds(1000); // **TODO** Remove after this is done inside ACP
}

I stop noticing the problem (or I have not yet done enough runs for it to happen). But it's perplexing because Telescope.Tracking = false by itself is not a problem even if tracking is currently stopped. So unfortunately, like you said, it's probably a timing issue and making it hard to reliably reproduce.

I have asked the 10micron folks to provide me with an instrumented driver, or I can even debug for them if they provide sources. Let's see. For now, I have modified AcquireImages.js to have that additional check for Telescope.Tracking, which avoids that call and hence the problem.

**Bob Denny** · Nov 14, 2019, 23:20

Ha ha ha!!! So testing whether it is tracking right before turning tracking off...... That really looks like a timing bug!! They might even be able to spot it in the code if you tell them just that. Note that, in general, the test for tracking needs to be there because of some other misbehaving mount that I have long ago forgotten about (I can likely find it in the deep code tracking history but it is harmless so why bother?) or to simply avoid logging “(turning off tracking)” if the tracking is off already.

My question is how in the hell did you come up with removing that tracking test as a “shot in the dark”?????????

**Manoj Koushik** · Nov 15, 2019, 00:22

Bob!! First off, you brought clouds with you to San Jose! :-p Welcome BTW, looking forward to seeing you tomorrow.

On the issue itself, it's the other way around. Right now in AcquireImages.js the check does not exist and it will always try to turn tracking off at the end of the sequence if the mount supports setting tracking.

I put the check in, which basically skips setting of the tracking to off if it is already off.

In any case this was a red herring. Basically here's what happened:
- I managed to get to a run of calibration frames which was short enough that I could repeatedly reproduce the issue (2 frames of 10 sec darks)
- I ran ACP-->Ascom Device Hub-->10micron Ascom driver
- From the Ascom Device Hub logs, I noticed that it would always crash when trying to set tracking off after the last image
- Started looking at AcquireImages.js and figured I would add the check so that it skips that particular call if the tracking is already stopped

And this did fix the issue. BUT, I then went back to ACP-->10micron Ascom driver. And the problem happened again. I realized that the chaining of Ascom Device Hub to 10micron Ascom driver was probably changing up the timing enough that it was behaving differently than when connected directly to 10micron Ascom driver.

So I restored AcquireImages.js to what it was before. And went back to chaining the drivers. Again I could repro the issue with that minimal set of darks.

Now, I suspected the issue is probably related to timing in how the driver caches values and when clients request them. So I disabled caching in the 10micron ascom driver and the problem went away. I turned caching back on to make sure I could go back to reproducing the issue, and I no longer can!

For now I will turn off caching, connect ACP to 10micron driver directly and see what happens tonight.

**Bob Denny** · Nov 19, 2019, 17:00

OK, thanks for doing a great job diagnosing and finding the problem. It was great to meet you at the conference! I made you a Wizard here :-) :-)

**Manoj Koushik** · Nov 19, 2019, 21:07

Thank YOU Bob!
