Safely handling bitmaps from webcam service.
Hi all,
I hope you'll bear with me on this question. I come from the world of pthreads, where I had a pretty clear understanding of thread safety, race conditions, etc. The CCR, not to mention GDI, is all a bit new to me. :-)
I have a service that's subscribed to the webcam service and polling it for images, very much like the soccer player sample. When I receive an image I copy the bytes to a bitmap using a memorystream and make another copy so I can close the stream:
Code Snippet
if (rsp.Frame != null)
using (MemoryStream stream = new MemoryStream(rsp.Frame, false))
{
Bitmap tempFrame = new Bitmap(stream);
_lastFrame = new Bitmap(tempFrame);
tempFrame.Dispose();
}
Subsequently, I send it to another service which displays it in a Winforms widget (I might operate on the image first, but my bug presents itself under this simple setup).
I find that when I run my service under the debugger everything works great, but when I unleash it from the command prompt, it will run for a while before crashing with a buffer overrun. If I attach a debugger I can see a handful of GDI+ calls in the stack trace but nothing otherwise informative to me. (I've pasted the salient bits of the stack trace below).
There seem to be two possible spots for race conditions. One is in receiving images from the webcam. I assume that the byte[] Frame is going to be safe until the QueryFrameResponse is destroyed, and I explicitly copy it to a new bitmap, so I don't think the problem is there.
The bitmap will be cloned (I think) in passing the message to the display service, and at the receiving end I pass it to a handler through a WinFormsServicePort.FormInvoke() call. The receiving form makes its own copy and assigns it to the form's image widget, and then calls Invalidate() on the form:
Code Snippet
if (update.Body.Image != null)
{
WinFormsServicePort.FormInvoke(
delegate()
{
_TrackerForm.UpdateFrame(update.Body.Image);
}
);
}
...
public void UpdateFrame(Bitmap frame)
{
Bitmap drframe = new Bitmap(frame);
this.picImageDisplay.Image = drframe;
this.Invalidate();
}
My best guess is that update.Body.Image is destroyed before the UpdateFrame handler is scheduled. However, I have tried creating a new bitmap before calling FormInvoke() and the same crash happens.
Alternatively, perhaps the form's previous bitmap is destroyed with the assignment of the next frame, while the form is still drawing the previous frame.
I'm operating under the assumption that 'new Bitmap(bmp)' actually copies the bits of bmp to a new pixel buffer. I am interested to know if I can/should be doing anything more to try to protect the image until it's drawn.
Here's the stack trace:
msvcr80.dll!_crt_debugger_hook(int _Reserved=) Line 65 C
mscorwks.dll!7a262e12()
[Frames below may be incorrect and/or missing, no symbols loaded for mscorwks.dll]
mscorwks.dll!79e7c8de()
...
mscorwks.dll!79e75c77()
mscorlib.ni.dll!793d7b5c()
mscorwks.dll!79e771e9()
...
mscorwks.dll!79f10b04()
ntdll.dll!7c910732()
mscorwks.dll!79f2fa76()
...
mscorlib.ni.dll!7947192c()
ntdll.dll!7c90dec2()
kernel32.dll!7c801a7d()
kernel32.dll!7c801ae8()
mscorwks.dll!79e74b61()
...
mscorwks.dll!79e74b32()
ntdll.dll!7c90d57d()
kernel32.dll!7c80a049()
mscorwks.dll!79e74b94()
...
mscorwks.dll!79f1141b()
> GdiPlus.dll!4ed1d994()
mscorwks.dll!79f10529()
...
mscorwks.dll!79f134b2()
GdiPlus.dll!4ecad00c()
mscorwks.dll!79e751aa()
...
mscorwks.dll!79e7e7a4()
mscorlib.ni.dll!7947149f()
...
mscorlib.ni.dll!796566f9()
System.Drawing.ni.dll!7ae324f2()
System.Drawing.ni.dll!7ae324c0()
mscorwks.dll!79f1ef33()
...
mscorwks.dll!79f1d940()
GdiPlus.dll!4ed83c69()
...
GdiPlus.dll!4ec83938()
mscorwks.dll!79f025b2()
System.Drawing.ni.dll!7ae09130()
...
System.Drawing.ni.dll!7ae08f9d()
mscorlib.ni.dll!79596c4c()
...
mscorlib.ni.dll!793d7b5c()
mscorwks.dll!79e88f63()
...
mscorwks.dll!79e88db3()
mscorwks.dll!79e88dc3()
ntdll.dll!7c91056d()
mscorwks.dll!79e783ca()
mscorwks.dll!79e783e6()
ntdll.dll!7c910732()
mscorwks.dll!79e783e6()
...
mscorwks.dll!7a07d585()
ntdll.dll!7c91657e()
...
ntdll.dll!7c9106eb()
kernel32.dll!7c80e4a4()
...
kernel32.dll!7c80e62b()
ntdll.dll!7c91056d()
mscorwks.dll!79e783ca()
mscorwks.dll!79e783e6()
ntdll.dll!7c910732()
mscorwks.dll!79e783e6()
...
mscorwks.dll!79ecb00b()
ntdll.dll!7c919d27()
...
ntdll.dll!7c919aeb()
oleaut32.dll!771215f8()
oleaut32.dll!77121629()
ntdll.dll!7c9106eb()
PhysXCore.dll!07095dc2()
...
PhysXCore.dll!0708e2b0()
xinput1_3.dll!08e67843()
ntdll.dll!7c9011a7()
...
ntdll.dll!7c918dfa()
mscorwks.dll!79e9ea54()
ntdll.dll!7c918dfa()
...
ntdll.dll!7c90eacf()
mscorwks.dll!79e9ea54()
...
mscorwks.dll!79e9ea54()
kernel32.dll!7c80b683()
mscorwks.dll!79e9ea54()
One more thought on this issue- do the PhysXCore calls indicate that the problem is in the simulator? I am running the simulator with a differential drive and polling it for robot pose information. Otherwise I am not interacting with it at all- i.e. no motion commands are sent. I have three cameras defined but never toggle between them.
cheers,
R
Hi, I don't think this is a CCR/threading issue; it looks more like a GC issue. Also, your stack can't be parsed since the symbols are not there, so we can't really tell what is going on. Having PhysXCore in there, however, makes us think this might have nothing to do with your bitmaps.
I did notice, however, that you are doing a lot of extra copies that I don't think are necessary. Another issue might be the "using" statement around the MemoryStream:
using (MemoryStream stream = new MemoryStream(rsp.Frame, false))
{
Bitmap tempFrame = new Bitmap(stream);
_lastFrame = new Bitmap(tempFrame);
tempFrame.Dispose();
}
This will destroy the underlying stream immediately. You will need to check the Bitmap class documentation to see whether it expects the underlying stream to remain valid. If it does, don't use the "using" statement. Also, you might be able to just pass the byte[] directly to the Bitmap class.
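For reference, the remarks on the Bitmap(Stream) constructor state that the stream must be kept open for the lifetime of the Bitmap, so making a second copy before the stream is disposed is the safe pattern. A minimal sketch of that pattern, assuming the frame bytes are an encoded image (BMP, JPEG, etc.); FrameDecoder and DecodeDetached are hypothetical names for illustration, not part of MSRS:

```csharp
using System.Drawing;
using System.IO;

// Hypothetical helper, not part of MSRS.
public static class FrameDecoder
{
    // Decodes an encoded image and returns a Bitmap that no longer
    // depends on the source stream or on the intermediate bitmap.
    public static Bitmap DecodeDetached(byte[] encodedFrame)
    {
        using (MemoryStream stream = new MemoryStream(encodedFrame, false))
        using (Bitmap decoded = new Bitmap(stream))
        {
            // new Bitmap(Image) draws the source into a freshly allocated
            // bitmap, so the result stays valid after both disposals.
            return new Bitmap(decoded);
        }
    }
}
```

The caller owns the returned Bitmap and is responsible for disposing it.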
Hi George,
Wrt the stack trace- there are no symbols because these are all system dlls. In fact there aren't any calls from my code in there- that's what makes me suspect it's the simulator or a system thread dedicated to the winform.
The MemoryStream issue: the destruction of the memory stream is the reason for the extra copy before it goes out of scope. I'm assuming the Bitmap constructor does a deep copy. One site suggests that MemoryStreams don't suffer from the same FileStream bug, but I haven't tested that theory:
http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=1390605&SiteID=1
The only other time I make a copy is in passing the image to the form. Maybe I can remove that. One other copy happens implicitly when I pass the message to the display service. I don't think I can get around that one.
Not sure you have seen this example, but the SimpleDashboard uses bitmaps and displays them in a winform. It might shed some light on what is going wrong.
Hi Rob,
My concern here is with passing bitmaps between services. As I'm sure you know, it is important, especially when handling a lot of bitmaps over a short time frame, to always dispose bitmaps when they are no longer needed. For example, when you set the image field of a picture box, in my experience it is a good idea to dispose the old image (if any) after setting the new one.
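A minimal sketch of that dispose-after-swap idea, assuming it runs on the UI thread (for example inside the FormInvoke delegate) and that the picture box owns its image; PictureBoxHelper and SwapImage are hypothetical names:

```csharp
using System.Drawing;
using System.Windows.Forms;

// Hypothetical helper for illustration only.
public static class PictureBoxHelper
{
    // Assigns the new image first, then disposes the one it replaced.
    // Assumption: no other code still holds a reference to the old image.
    public static void SwapImage(PictureBox box, Image next)
    {
        Image previous = box.Image;
        box.Image = next;
        if (previous != null && !object.ReferenceEquals(previous, next))
        {
            previous.Dispose();
        }
    }
}
```

Assigning before disposing means the control never points at an already disposed image, even briefly.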
In general I don't think that it is a good idea to send Bitmap objects in messages. I would think about only using GDI+ objects at the ends of the process, and handling the raw data everywhere in between. So if, for example, you are using the webcam service, query for raw frames (by using the empty guid as the format) and pass around the byte array. When you need to use this as a bitmap you can copy the bits back into a Bitmap using the LockBits method on the Bitmap class.
Hope this Helps
Paul
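A rough sketch of the LockBits copy suggested above. The raw frame layout is an assumption here (24 bits per pixel, BGR byte order, top-down rows, no row padding) and should be checked against what the webcam service actually returns; RawFrameConverter and FromBgr24 are hypothetical names:

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.Runtime.InteropServices;

// Hypothetical helper for illustration only.
public static class RawFrameConverter
{
    // Copies tightly packed 24bpp BGR pixel data into a new Bitmap.
    public static Bitmap FromBgr24(byte[] raw, int width, int height)
    {
        if (raw == null) throw new ArgumentNullException("raw");
        int rowBytes = width * 3;
        if (raw.Length < rowBytes * height)
            throw new ArgumentException("Buffer too small for the given size.", "raw");

        Bitmap bmp = new Bitmap(width, height, PixelFormat.Format24bppRgb);
        BitmapData data = bmp.LockBits(
            new Rectangle(0, 0, width, height),
            ImageLockMode.WriteOnly,
            PixelFormat.Format24bppRgb);
        try
        {
            // Copy row by row: the bitmap's stride may include padding
            // that the tightly packed source buffer does not have.
            for (int y = 0; y < height; y++)
            {
                IntPtr dest = new IntPtr(data.Scan0.ToInt64() + (long)y * data.Stride);
                Marshal.Copy(raw, y * rowBytes, dest, rowBytes);
            }
        }
        finally
        {
            bmp.UnlockBits(data);
        }
        return bmp;
    }
}
```

With this approach only the byte array travels in messages, and the Bitmap is created at the point where it is displayed.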
Thanks Paul, George,
I think you're right about passing around the raw data. That's something I can change easily enough. I'm not entirely sure it will solve the problem, though, since we haven't even concluded that the bitmaps are the source of the crash (I assumed so at first but now I'm less certain). I'll let you know what happens...
Hi guys,
Ok, I've narrowed down this bug. It looks like the same as this one:
http://connect.microsoft.com/roboticsstudio/feedback/ViewFeedback.aspx?FeedbackID=257747
That bug is marked resolved/closed. Any idea what the solution is?
It has nothing to do with stuff on my end. It manifests at random intervals when I just query images continuously from the webcam service:
Steps to repro:
MSRS 1.0, Microsoft Lifecam VX6000. Dual core CPU (if that matters)
The code is very similar to the soccer player sample. See below for the initialization/query code. The crash seems to be more likely to happen if I set the timer to a small value (faster than the camera's full frame rate), but I haven't checked this extensively. Under these circumstances I can see that the query responses contain repeats of the same image. One other thing I haven't tested is whether other request.format values produce the same result.
Sorry about the lack of tabification...
Code Snippet
protected override void Start()
{
// Listen on the main port for requests and call the appropriate handler.
ActivateDsspOperationHandlers();
ConnectWebCam();
LogInfo(LogGroups.Console, "PFTracker Service uri: ");
}
private void ConnectWebCam()
{
_camPort.Subscribe(_camNotifyPort);
Microsoft.Robotics.Services.MultiDeviceWebCam.Proxy.Format format = new Microsoft.Robotics.Services.MultiDeviceWebCam.Proxy.Format();
// must match exactly one of the supported formats of the webcam...
format.Width = 320;
format.Height = 240;
format.MinFramesPerSecond = 5;
format.MaxFramesPerSecond = 30;
camera.WebCamState state = null;
System.Threading.Thread.Sleep(8000);
Activate(
Arbiter.Choice(
_camPort.UpdateFormat(format),
delegate(DefaultUpdateResponseType response)
{
MainPortInterleave.CombineWith(
new Interleave(
new TeardownReceiverGroup(),
new ExclusiveReceiverGroup(),
new ConcurrentReceiverGroup(
Arbiter.Receive<DateTime>(true, _timerPort, GetFrame)
)
));
_timerPort.Post(DateTime.Now);
},
delegate(Fault fault)
{
LogError("Unable to set webcam format", fault);
}
));
}
void StartWebCamTimer()
{
TaskQueue.EnqueueTimer(TimeSpan.FromMilliseconds(0.1), _timerPort);
}
private void GetFrame(DateTime timestamp)
{
webcam.QueryFrame query = new webcam.QueryFrame();
webcam.QueryFrameRequest request = new webcam.QueryFrameRequest();
request.Format = System.Drawing.Imaging.ImageFormat.Bmp.Guid;
query.Body = request;
_camPort.Post(query);
Activate(Arbiter.Choice(
query.ResponsePort,
delegate(webcam.QueryFrameResponse rsp)
{
SpawnIterator(ProcessFrame);
},
delegate(Fault f)
{
LogError(LogGroups.Console, "Could not query webcam frame", f);
StartWebCamTimer();
}));
}
public IEnumerator<ITask> ProcessFrame()
{
StartWebCamTimer();
yield break;
}
cheers,
R
The repeat of images when you query faster than the frame rate is by design; we simply provide a cached copy. We will try this repro, thank you. Also, does it reproduce when you use the web page on the webcam service and set the frame rate (polling interval) high?
Yep, it sure does.
My hardware: MS Lifecam VX 6000. AMD Athlon 64 X2 Dual Core 3800+
MSRS 1.0
Repro:
1. Start up the webcam service: dsshost -p:50000 -m:"samples\config\webcam.manifest.xml"
2. Open the webcam dashboard in IE.
3. With the lifecam I have to explicitly set the size 320x240 and click 'change' due to the framegrabber bug I reported last week. This ensures we get a 30fps framerate from the camera (otherwise it does only 5 or 6 fps).
3b. I can produce the crash with either JPG or BMP format selected. I haven't tried other formats.
4. Set the refresh rate to 1ms (I have also repro'ed the crash with refresh set to 40ms, so I'm not sure this matters.)
5. Click start and wait a few minutes (anywhere from 30s to 5 mins).
6. ?
7. Crash. Debugging in VS always indicates a buffer overrun. The stack trace is usually similar to what appears above- some GDIPlus calls, a lot of system calls. Here are the few symbols that appear in a trace from this morning:
...
msvcr80.dll!_fputwc_nolock(wchar_t ch=L'=', _iobuf * str=0x00000100) Line 154 + 0xd bytes C
msvcr80.dll!write_char(wchar_t ch=L'', _iobuf * f=0x00000000, int * pnumwritten=0x00000000) Line 2440 + 0xa bytes C++
...
msvcr80.dll!_VEC_memcpy(void * dst=0x79e74ea0, void * src=0x00000000, int len=1280) + 0xb4 bytes C
... [ Several GDIPLus.dll calls in here, but no symbols ]
msvcr80.dll!__addlocaleref(threadlocaleinfostruct * ptloci=0x71ab130b) Line 248 + 0xe bytes C
...
The addlocaleref symbol is one that I've seen consistently in the past week.
At 1ms refresh, the dashboard shows a nominal framerate of about 36fps. (I'm impressed that IE can do a roundtrip HTTP request and refresh that fast!)
I hope you find this useful. I'm happy to provide as much info as I can. We are working on some vision-based services and it would help us a lot to have a stable webcam driver. :-)
One more update: I can reproduce this bug simply by starting the webcam service and never connecting/querying an image.
Yet another update:
This bug repros in CTP 1.5. I'm not sure how long it takes as I left it running overnight. Unfortunately I closed the crash dialog before I thought to debug it. If I can get it to happen again I'll post any useful info I can find.
As an aside, is there any value in clicking 'send report' when dsshost crashes? I usually don't bother.
cheers,
R
One more update: this same bug appears using a Logitech QuickCam 5000.
cheers,
R
If you're getting an AV, ...
+ first: is the AV getting translated to a managed System.AccessViolationException? Is the CCR catching that exception?
+ second: have you tried running DssHost.exe under AppVerifier? It's a great way to find-and-diagnose memory issues. Pair this with !heap -a, and you should be able to narrow down the line of code at fault in minutes.
#aaron
Hi Aaron,
It's a buffer overrun that's not caught by CCR. The stack trace often indicates an origin in GDI. My suspicion is that there's a problem in the framegrabber class (which I don't have symbols or source for). I've spent the last week implementing my own framegrabber class and learning first-hand about the unpredictable (lack of) thread-safety in GDI.
I'll see what AppVerifier has to say.
cheers,
R