Pixel shader, strange result!
float4 PS(float3 Tex0 : TEXCOORD0) : COLOR
{
    float4 ret;
    ret = tex3D(Sampler3D, Tex0);
    int dd = (int)(ret.r * 255);
    int mark = dd;
    int markModul = 0;
    int i = 0;
    for (i = 0; i < 8; i++)
    {
        markModul = mark % 2;
        mark = (int)(mark / 2);
        if (markModul == 1)
        {
        }
    }
    //return float4(0.12f,0.34f,0.743f,0.84f);
    return ret;
}
I use the pixel shader above to achieve an effect I want; the resulting frame rate is about 10 fps.
The strange thing is:
- If I replace "ret.r" (line 5) with a floating-point constant between 0.0f and 1.0f, such as 1.0f, or replace "dd" (line 6) with any integer constant between 0 and 255, the frame rate rises to about 20 fps.
- If I enable the third-to-last line ("return float4(0.12f,0.34f,0.743f,0.84f);"), where the components of the float4 can be any floating-point numbers between 0.0f and 1.0f, the frame rate goes up to 70 fps.
I'm new to DirectX, but I would have thought that nothing changes for the GPU. Could someone please give me an explanation? Thanks a lot! My GPU is an Nvidia 7900GT, and the technique code is listed below:
technique TVertexAndPixelShader
{
    pass P0
    {
        ALPHAREF         = 0x0000000A;
        ALPHATESTENABLE  = TRUE;
        ALPHAFUNC        = GREATEREQUAL;
        ZENABLE          = TRUE;
        CULLMODE         = NONE;
        ALPHABLENDENABLE = TRUE;
        BLENDOP          = ADD;
        DESTBLEND        = INVSRCALPHA;
        SRCBLEND         = SRCALPHA;
        VertexShader = compile vs_3_0 VS();
        PixelShader  = compile ps_3_0 PS();
    }
}
The HLSL optimizer is very smart.
The reason you get a better frame rate when you change the return value to a constant is that the rest of the code is simply eliminated: the entire pixel shader becomes a single instruction that moves a constant into the output color.
The reason it gets faster when you replace the sampled texture value with a constant is that sampling takes time. When the sampled value is not actually used, the sampling doesn't happen, and the pixel shader runs faster.
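To make the constant-return case concrete, here is a rough hand-written equivalent of what the optimizer is left with once that line is enabled. This is only a sketch, not actual compiler output, and the name PS_ConstantReturn is made up for illustration.

```hlsl
// Hand-written equivalent of the shader with the constant return enabled.
// PS_ConstantReturn is a hypothetical name, not part of the original effect.
float4 PS_ConstantReturn(float3 Tex0 : TEXCOORD0) : COLOR
{
    // The output no longer depends on the tex3D fetch or on the loop,
    // so the optimizer is free to drop both.
    return float4(0.12f, 0.34f, 0.743f, 0.84f);
}
```

Comparing the compiled assembly of the two variants (for example in PIX) is a direct way to check how much of the original code actually survives.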
I don't know exactly how large your 3D texture is, how many pixels you're shading (what's your overdraw?), and so on, so it's hard to say whether 10 fps is expected or a problem in your case. You should look into the specific performance using the PIX tool that comes with the DirectX SDK. However, it looks as if you are simply bound by pixel shading and texture reads.
Possible solutions: Draw fewer pixels (make sure to draw front-to-back to get good Z reject). Do less in the pixel shader (find ways to fake what you actually want to do). Use a smaller 3D texture, or at least use a 3D texture with MIP maps (to make sampling hit the cache better).
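As a sketch of the MIP map suggestion, the sampler in the effect file could be declared roughly as follows. The original post does not show its sampler declaration, so the texture name VolumeTex and the addressing modes are assumptions; the volume texture itself also has to be created with a mip chain on the application side for this to have any effect.

```hlsl
// Hypothetical declarations; the original effect file is not shown in full.
texture VolumeTex;   // assumed name; bind the mipmapped volume texture here

sampler3D Sampler3D = sampler_state
{
    Texture   = <VolumeTex>;
    MinFilter = LINEAR;
    MagFilter = LINEAR;
    MipFilter = LINEAR;   // NONE would disable mipmapping and always read the top level
    AddressU  = CLAMP;    // addressing modes are an assumption for a volume slice setup
    AddressV  = CLAMP;
    AddressW  = CLAMP;
};
```

With a mip filter enabled, minified slices read from smaller levels, which is what lets the texture cache work better.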
Thank you very much!
The 3D texture in my volume rendering project is about 512*512*512 bytes.
I tried drawing in front-to-back order with the technique below, but nothing appeared on the screen!
technique TVertexAndPixelShader
{
    pass P0
    {
        ALPHAREF         = 0x0000000A;
        ALPHATESTENABLE  = TRUE;
        ALPHAFUNC        = GREATEREQUAL;
        ZENABLE          = TRUE;
        CULLMODE         = NONE;
        ALPHABLENDENABLE = TRUE;
        BLENDOP          = ADD;
        DESTBLEND        = DESTALPHA;
        SRCBLEND         = INVDESTALPHA;
        VertexShader = compile vs_3_0 VS();
        PixelShader  = compile ps_3_0 PS();
    }
}