Author Topic: render global shader speed question  (Read 10351 times)
June 22, 2011, 01:11:39 am
Hi, for a while now I've been wondering if render global shader could speed up fps in my projects by reducing draw calls. I thought I'd try a straightforward test: take a simple shader (diffuse texture only, fullbright, about as simple as you can get) and compare it to the equivalent regular Quest3D material.

I guess my question is: is my test flawed? I didn't get the result I expected.

I created a for loop that rendered 2000 objects (the Quest3D fish). Then I rendered the same for loop using render global shader, and the result was slower with the global shader.

My guess is that rendering isn't the bottleneck in this scenario, and maybe the global shader logic takes a little more CPU than rendering a regular material. I also noticed that the framerate jumps when the normally rendered objects go offscreen, but the global shader fps stays low. Maybe I'm encountering a different bottleneck in each case?

So my second question, I guess, is what *is* a good test to compare the two?

June 22, 2011, 06:47:29 am
My guess is that the problem is the for loop. A fair test would compare them both without the for loop and with a complex shader (8 or more textures and a higher instruction count).

ali-rahimi.net
June 22, 2011, 08:12:33 am
What are you calling under the global shader? The object, the surface or the 3dobjectdata? This does make a difference.
June 22, 2011, 08:27:34 am
Post your test file.
June 22, 2011, 08:38:38 am
The basic idea is to call the global shader once per frame and do all the calculations in a loop under the global shader. Ideally you call a global shader command only when something changes. So when rendering all surfaces into a depth map, you call the command only once. You don't even need to do a commit if your surfaces can use a zero matrix.


But yes, source code would help.

June 22, 2011, 08:43:20 am
You have set up the wrong test!
To really see a difference, you need to connect many different shaders with different textures.
June 22, 2011, 08:49:18 am
Jos, what is the difference? If I call 3dobjectdata it works with my G-Buffer global shader, but it doesn't work with PSSM. Really, what is going on behind a global shader? It's not documented.

ali-rahimi.net
June 22, 2011, 09:28:57 am
Global shader allows you to render without making state changes (material settings, world matrix etc) unless you tell it to with a commit command. It speeds up rendering by reducing the commands that need to be set on the 3d card.

It works best if you organize your data to reduce the number of commit calls you make.

There is no point using it with a surface channel, because that will make the state changes for every object data anyway!

It has no effect on draw calls which are done once per object data.
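
To make the point above concrete, here is a toy C++ model, not Quest3D or Direct3D code. `Drawable`, `countPerObject` and `countWithCommits` are made-up names, and the model assumes that regular rendering sets render states once per object data while the global shader path sets them only on a commit. It only counts how often states get set and how many draw calls happen:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one object data; only its material matters here.
struct Drawable {
    int materialId;
};

struct RenderCounts {
    std::size_t stateSets = 0;  // times render states were pushed to the card
    std::size_t drawCalls = 0;  // one per object data in both approaches
};

// Regular rendering as described in the thread: states are set for every object data.
RenderCounts countPerObject(const std::vector<Drawable>& objects) {
    RenderCounts c;
    c.stateSets = objects.size();
    c.drawCalls = objects.size();
    return c;
}

// Global-shader style: commit only when the material differs from the last committed one.
RenderCounts countWithCommits(const std::vector<Drawable>& objects) {
    RenderCounts c;
    bool committed = false;
    int current = 0;
    for (const Drawable& o : objects) {
        if (!committed || o.materialId != current) {
            current = o.materialId;
            committed = true;
            ++c.stateSets;  // models one commit
        }
        ++c.drawCalls;      // the draw call itself is never saved
    }
    assert(c.stateSets <= c.drawCalls);
    return c;
}
```

With three objects on material 1 followed by two on material 2, the per-object path counts 5 state sets and the commit path counts 2, while both make 5 draw calls. That matches the statement that draw calls are unaffected.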
June 22, 2011, 09:43:45 am
Thanks Tom. So we must use 3D object data only, with a one-time command for the global shader? Does that mean it doesn't have much effect on animated objects? Am I right?

ali-rahimi.net
June 22, 2011, 11:15:46 am
It will have some effect for animated objects. Every time you set a state on the graphics card it takes time, so if you only set the world matrix per object data, it is still faster than if you set render states per object data as well.

You should organize your object data per material so you set render states as infrequently as possible and loop through them under the global shader channel. This means you must use OO rendering.
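
A minimal sketch of the "organize per material" advice, in plain C++ rather than Quest3D channels. `RenderItem`, `commitsNeeded` and `groupByMaterial` are hypothetical names, and the sketch assumes one commit is needed each time the material changes while walking the list in order:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record: which material an object data uses, plus an id to track it.
struct RenderItem {
    int materialId;
    int objectId;
};

// Commits needed if we commit whenever the material changes along the list.
std::size_t commitsNeeded(const std::vector<RenderItem>& items) {
    std::size_t commits = 0;
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (i == 0 || items[i].materialId != items[i - 1].materialId) {
            ++commits;
        }
    }
    assert(commits <= items.size());
    return commits;
}

// Group items by material, keeping the original order inside each group.
std::vector<RenderItem> groupByMaterial(std::vector<RenderItem> items) {
    std::stable_sort(items.begin(), items.end(),
                     [](const RenderItem& a, const RenderItem& b) {
                         return a.materialId < b.materialId;
                     });
    return items;
}
```

For a list whose materials alternate 2, 1, 2, 1, walking it as-is needs 4 commits. After grouping it needs 2, one per material.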

If you are rendering a depth buffer then you don't need to set any render states so this is really fast.

If your geometry is static and you can offset the object data by their world matrices, then you don't need to set anything, and that is fast. I use this method with draw call batching and it is as fast as can be. Actually, I use vertex colors for diffuse and specular, put an index into the texcoords, set a bunch of matrices on the shader, and use the index to get the world matrix for animated geometry too, but this needs some custom C++. I think Quest3D 5.0 will have some similar functionality. Then the rendering is as fast as it can be and it all depends on the graphics card.
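
One way to picture the two tricks above is a CPU-side sketch in C++. It is not the poster's actual code, and all names (`StaticMesh`, `bakeStaticBatch`, `IndexedVertex`, `applyPaletteMatrix`) are invented for illustration. The first function pre-transforms static vertices by their world matrix so the whole batch can be drawn with no per-object matrix. The second mimics what a vertex shader could do with an index stored in a spare texcoord component, assuming a small array of matrices has been set on the shader:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// 4x4 matrix in the Direct3D row-vector convention: v' = v * M,
// with the translation stored in the fourth row.
struct Mat4 { float m[4][4]; };

Mat4 translation(float tx, float ty, float tz) {
    return Mat4{{{1.0f, 0.0f, 0.0f, 0.0f},
                 {0.0f, 1.0f, 0.0f, 0.0f},
                 {0.0f, 0.0f, 1.0f, 0.0f},
                 {tx,   ty,   tz,   1.0f}}};
}

Vec3 transformPoint(const Vec3& v, const Mat4& w) {
    return Vec3{
        v.x * w.m[0][0] + v.y * w.m[1][0] + v.z * w.m[2][0] + w.m[3][0],
        v.x * w.m[0][1] + v.y * w.m[1][1] + v.z * w.m[2][1] + w.m[3][1],
        v.x * w.m[0][2] + v.y * w.m[1][2] + v.z * w.m[2][2] + w.m[3][2]};
}

// Hypothetical static object: local-space vertices plus a world matrix.
struct StaticMesh {
    std::vector<Vec3> vertices;
    Mat4 world;
};

// Static case: bake each world matrix into the vertices and merge into one batch.
std::vector<Vec3> bakeStaticBatch(const std::vector<StaticMesh>& meshes) {
    std::vector<Vec3> batch;
    for (const StaticMesh& mesh : meshes) {
        for (const Vec3& v : mesh.vertices) {
            batch.push_back(transformPoint(v, mesh.world));
        }
    }
    return batch;
}

// Animated case: each vertex carries an index (stored as a float, as a texcoord would be).
struct IndexedVertex {
    Vec3 position;
    float matrixIndex;
};

// Emulates the vertex shader lookup: pick the world matrix by index, then transform.
Vec3 applyPaletteMatrix(const IndexedVertex& v, const std::vector<Mat4>& palette) {
    const std::size_t i = static_cast<std::size_t>(v.matrixIndex);
    assert(i < palette.size());
    return transformPoint(v.position, palette[i]);
}
```

In the static case nothing needs to be set per object at draw time. In the animated case only the matrix array changes per frame, and the geometry buffer stays untouched.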
June 22, 2011, 12:22:46 pm
This is how your test should look. The result of using the global shader properly is clearly visible even with such a simple shader.

* Global_shader_test.cgr (57.43 KB - downloaded 324 times.)

Upgrade to PSSM is available for SSAO customers. Check http://www.3dvrm.com/shadows_solution/ for details.
June 22, 2011, 12:28:05 pm
To me it looks strange that commit changes and set new motion matrix are in one channel. It doesn't make much sense if you just want to bind different textures and other input values to a bunch of static 3dobjectdatas that have the same "zero" motion matrix.

June 22, 2011, 12:53:30 pm
I agree. Maybe internally it checks if there are changes. Actually I think ID3DXEffect does this automatically.
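
If it does check internally, the idea would be something like the sketch below. This is only a guess at the general technique (redundant state filtering). `StateCache` is a made-up class, and nothing in this thread confirms that Quest3D or ID3DXEffect works this way:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical cache that remembers the last value sent for each named state
// and skips the (imaginary) device call when the value has not changed.
class StateCache {
public:
    // Returns true if the value was forwarded, false if it was filtered out.
    bool set(const std::string& name, int value) {
        ++requests_;
        auto it = last_.find(name);
        if (it != last_.end() && it->second == value) {
            return false;  // unchanged: nothing is sent to the card
        }
        last_[name] = value;
        ++deviceCalls_;
        assert(deviceCalls_ <= requests_);
        return true;
    }

    std::size_t requests() const { return requests_; }
    std::size_t deviceCalls() const { return deviceCalls_; }

private:
    std::map<std::string, int> last_;
    std::size_t requests_ = 0;
    std::size_t deviceCalls_ = 0;
};
```

With a filter like this, a commit that re-sends the same texture or matrix costs only a lookup, which would explain why putting commit changes and the motion matrix in one channel might not hurt much.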
June 22, 2011, 12:57:54 pm
Wow, thanks Viktor, a very enlightening example as always. I'm a bit confused, though. Since these things are not documented, can we conclude once and for all how we should set up a global shader for 1000 static meshes with a zero motion matrix? Should I just connect the 3D Object, the surface, or the 3D object data? Or does it even matter?

ali-rahimi.net
June 22, 2011, 01:02:41 pm
Thanks again Tom, but I don't understand some parts of it.

1. (offset the object data by their world matrices) Offset them in the vertex shader based on each geometry ID?

2. (I use this method with draw call batching) What is draw call batching? Is it related to OO?

3. (Actually I use vertex colors for diffuse and specular) Do you mean doing diffuse and specular in the vertex shader? But then how do you use a normal map?

4. (put an index into the texcoords) I still don't fully understand how to do this. Could you please describe it in more detail? It's very important to me.

Also, I still don't get it: when we are using a global shader for static meshes, why should I use a loop with OO? What is the difference?

ali-rahimi.net