Optimizations for rendering ROAM
I am implementing ROAM for procedural planetary terrain. I am getting decent framerates for the actual rendering, but would like to know if anyone has DirectX-specific advice, since most ROAM implementations are done with OpenGL.
The mesh currently consists of 64000 triangles, which I render at 30 FPS when they are all in view. This is a wildly triangulated ball which fills the entire 500x500 pixel window in wireframe.
The graphics card is an NVIDIA GF6800 MX or something in that neighbourhood; I forget the exact model.
What I do is allocate an array of the needed vertices, roughly 160000 of them, and then I alter the triangles and create new ones by changing the vertices in the array.
When it is time to render, I lock a vertex buffer and copy the vertices from my array into it.
I do the same for the index buffer: lock the buffer and write the indices, which are contained in my triangle objects.
That is about all the DirectX-specific rendering I am doing. The question, however, is whether there is a better way to go about this than keeping my changing mesh in an array and doing the lock, copy, unlock each frame.
I create my buffers like this:
vertexBuffer = new VertexBuffer(typeof(CustomVertex.PositionColored),maxVertices, device, Usage.Dynamic | Usage.WriteOnly, CustomVertex.PositionColored.Format, Pool.Default);
indexBuffer = new IndexBuffer(typeof(short), maxTriangles*3, device, Usage.Dynamic | Usage.WriteOnly, Pool.Default);
Some hardware has lower performance when using vertex buffers with more than a certain number of vertices in them. I would try slicing that buffer into several buffers, each with < 30000 vertices in it. Or, at least, issue multiple DrawPrimitives calls, each specifying a single sub-range of the vertex buffer that's < 30000 vertices.
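To make the multiple-draw-call suggestion concrete, here is a minimal sketch in plain C# of how an index array could be cut into batches. The names `DrawBatch` and `DrawBatcher` are made up for illustration and are not part of Managed DirectX. For each batch the sketch works out the start index, the primitive count, and the vertex range the batch touches, which are the kind of values an indexed draw call expects. It assumes a triangle list with three indices per triangle.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical container for the parameters of one indexed draw call.
public struct DrawBatch
{
    public int StartIndex;      // first index in the index buffer
    public int PrimitiveCount;  // number of triangles in this batch
    public int MinVertexIndex;  // lowest vertex index referenced
    public int NumVertices;     // size of the referenced vertex range
}

public static class DrawBatcher
{
    // Splits a triangle-list index array into batches of at most
    // maxTrianglesPerCall triangles and records the vertex range of each.
    public static List<DrawBatch> Split(int[] indices, int maxTrianglesPerCall)
    {
        if (indices == null) throw new ArgumentNullException(nameof(indices));
        if (indices.Length % 3 != 0)
            throw new ArgumentException("Expected three indices per triangle.", nameof(indices));
        if (maxTrianglesPerCall <= 0)
            throw new ArgumentOutOfRangeException(nameof(maxTrianglesPerCall));

        var batches = new List<DrawBatch>();
        int maxIndicesPerCall = maxTrianglesPerCall * 3;

        for (int start = 0; start < indices.Length; start += maxIndicesPerCall)
        {
            int end = Math.Min(start + maxIndicesPerCall, indices.Length);

            // Scan the batch once to find the vertex range it references.
            int lo = int.MaxValue;
            int hi = int.MinValue;
            for (int i = start; i < end; i++)
            {
                lo = Math.Min(lo, indices[i]);
                hi = Math.Max(hi, indices[i]);
            }

            batches.Add(new DrawBatch
            {
                StartIndex = start,
                PrimitiveCount = (end - start) / 3,
                MinVertexIndex = lo,
                NumVertices = hi - lo + 1
            });
        }
        return batches;
    }
}
```

Each batch would then be drawn with its own indexed draw call. Note that limiting the triangle count only keeps the vertex range small if triangles that sit close together in the index buffer also use nearby vertices. With ROAM vertices scattered across the array, that may require sorting or regrouping first, which this sketch does not attempt.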
Another thing I would try is to turn off V-sync, and profile the application. You may be bound on the CPU writing into the triangle array, rather than on graphics.
I will try the smaller subset of triangles and see what it does. I was absolutely not aware that this could be an issue, so it is a good hint.
I did profile, and the copying to the vertex array is taking up 70% of the time for each frame. That is partly why I am asking if it is even the right thing to do: working in an array and then copying to the vertex buffer. I was thinking about copying only the vertices which have changed, but since they are spread out over the entire array, the checking would most likely be too costly to give any gain. Working directly on the vertex buffer would (as far as I am informed) be extremely slow.
Any pointers regarding this would be helpful.
Btw, I am using Managed DirectX with C# and a standard array.
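One cheap way to copy only changed vertices without checking every element is to track a single dirty range: each time a vertex is written, its index widens a min/max pair, and the upload copies just that span. Below is a minimal sketch in plain C#. `DirtyRange` and `VertexUpload` are hypothetical names, and an ordinary array stands in for the locked vertex buffer, so this illustrates the bookkeeping only, not the Managed DirectX calls.

```csharp
using System;

// Hypothetical tracker for the smallest index span covering all changes.
public sealed class DirtyRange
{
    private int min = int.MaxValue;
    private int max = -1;

    public bool IsEmpty { get { return max < min; } }
    public int Start { get { return min; } }
    public int Count { get { return IsEmpty ? 0 : max - min + 1; } }

    // Call whenever the vertex at 'index' is modified.
    public void Mark(int index)
    {
        if (index < min) min = index;
        if (index > max) max = index;
    }

    public void Clear()
    {
        min = int.MaxValue;
        max = -1;
    }
}

public static class VertexUpload
{
    // Copies only the dirty span from source to destination, then resets
    // the tracker. Returns the number of elements copied.
    public static int CopyDirty<T>(T[] source, T[] destination, DirtyRange dirty)
    {
        if (dirty.IsEmpty) return 0;

        int count = dirty.Count;
        Array.Copy(source, dirty.Start, destination, dirty.Start, count);
        dirty.Clear();
        return count;
    }
}
```

Whether this helps depends on how the changes are distributed. If a frame touches vertices at both ends of the array, the span covers nearly everything and nothing is saved. It also assumes the destination keeps its contents between frames, which is not the case when a dynamic buffer is locked with a discard flag.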