Mystery function in Game.Run using more CPU than Update+Render
I implemented multi-light shadow maps in my XNA game engine and I am experiencing a pretty significant slowdown. I checked with PIX and I seem to be at a reasonable number of DrawPrim (under 50) and SetRenderTarget (single digit) calls per frame. The shadow map shader is based on the D3D9 sample. I had the Task Manager open and noticed my game was only using about 60% CPU (I have a single-core processor).
My profiler shows this:
48.92 % Game.Run... - 10384* ms - 0 calls
13.82 % ?function?* - 2933 ms - 987 calls - ?class?.?function?*(?parameters?)
7.34 % Draw - 1557 ms - 329 calls - SharpX.Game.Draw(GameTime)
4.94 % Update - 1050 ms - 620 calls - SharpX.Test.TestGame.Update(GameTime)
1.71 % MouseSubClassFunc - 363 ms - 401 calls - .MouseSubClassFunc(HWND__ *, UInt32, UInt32, Int32)
(The 0 calls for "Game.Run" is because I started the profiler after the game was already running)
The profiler says that ?function? is an unmanaged function. What could cause it to take up such a disproportionately large share of CPU time? I think I am CPU bound and the game loop is idling, but I'm not sure what to do.
Thanks!
You can't be CPU bound and have the game loop idle at the same time.
Typically, XNA will have vsync (waiting for vblank) enabled. This means that, as long as each frame takes 16 ms or less, you'll get 60 Hz; however, as soon as you go over that limit (to 17 ms), you'll drop down to 30 Hz, and the driver will likely spend a lot of time inside the kernel, just waiting for the vsync to happen and free up the next back buffer. That may be what you're seeing. You could run VTune (for Intel) or CodeAnalyst (for AMD) on your application to get a good view of what it looks like from the native point of view.
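To make the 16 ms versus 17 ms point concrete, here is a minimal sketch of the arithmetic. It assumes a 60 Hz display and strict double-buffered vsync, where a finished frame always waits for the next vertical retrace. The function name `vsync_frame_rate` is made up for illustration and is not part of XNA or Direct3D.

```python
import math

def vsync_frame_rate(frame_ms, refresh_hz=60.0):
    """Effective frame rate when every frame waits for the next retrace.

    Assumption: strict double-buffered vsync on a display refreshing
    at refresh_hz. Real drivers (e.g. with triple buffering) may differ.
    """
    refresh_ms = 1000.0 / refresh_hz
    # A frame occupies a whole number of refresh intervals, at least one.
    intervals = max(1, math.ceil(frame_ms / refresh_ms))
    return refresh_hz / intervals

for ms in (10, 16, 17, 34):
    print(ms, "ms per frame ->", vsync_frame_rate(ms), "Hz")
```

Under these assumptions the rate can only land on 60, 30, 20, 15 and so on, so a frame rate that wanders through arbitrary values would point at something other than vsync quantization alone.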
The problem occurs with VSync on or off, but the framerate is neither 30 nor 60. It varies from single digits to the 60 cap.
I played with it some more, and you are probably right: it can't be the CPU, because the slowdown depends on what I am looking at, even though I am not doing any culling yet.
I'm sending a very reasonable number of polys and batches, with only two objects even textured (512x512 textures right now), using vertex shaders 1.4 with pixel shaders 2.0, and the shaders aren't that crazy either (as I said, they are based on the D3D Shadow Mapping sample).
The code is a from-memory re-write of my work with Managed Direct3D and the art assets are very similar, but the MD3D version shows no slowdown. I'm not blaming XNA, I'm sure I messed something up in the translation.
What could be causing the slowdown? I realize that is a hard question to answer without seeing the code, but where should I start looking?