More on software rendering and Direct3D 10
Writing a software renderer is quickly associated with revolutionizing things. On the more conservative side, one real advantage is the freedom to optimize as needed, rather than having to guess what a driver is doing behind the scenes (maybe write your own driver ?!).
Graphics drivers do a great deal of work and can make a big difference. They can be quite smart at guessing resource usage and what to prioritize on which basis, but they can't be smarter than the whole application.
APIs such as Direct3D and OpenGL don't have a concept of an object (OK, OpenGL supports display lists at least), so they miss potentially useful hints. For example, if you know your object's bounding box, you can tell right away whether or not the object requires clipping, and you can tell in advance which textures you need and the maximum level of detail needed for them (if the bounding box covers 512x512 pixels on screen, you can't possibly need a 1024x1024 texture 8).
Based on that knowledge, one could completely avoid per-polygon clipping tests and could also avoid loading full-size textures into video RAM, because a smaller mip level would be sufficient.
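A minimal sketch of that idea, in plain C++ with no graphics API involved. The names ScreenBox, fullyInsideViewport and finestMipNeeded are made up for illustration, and it assumes the bounding box has already been projected to screen pixels, that the object is entirely in front of the near plane, and that the texture is stretched roughly once across the object (a tiled texture would break the estimate).

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical screen-space bounding box in pixels (not a Direct3D type).
struct ScreenBox { float minX, minY, maxX, maxY; };

// True when the box lies fully inside the viewport, in which case none of the
// object's polygons need to be clipped against the screen edges.
bool fullyInsideViewport(const ScreenBox& b, float viewW, float viewH) {
    return b.minX >= 0.0f && b.minY >= 0.0f && b.maxX <= viewW && b.maxY <= viewH;
}

// Index of the smallest mip level (0 = full size) that is still at least as
// large as the longest side of the box. Assumes a square texture of side
// texSize with mipCount levels, each level half the size of the previous one.
uint32_t finestMipNeeded(const ScreenBox& b, uint32_t texSize, uint32_t mipCount) {
    const float extent = std::max(b.maxX - b.minX, b.maxY - b.minY);
    uint32_t level = 0;
    while (level + 1 < mipCount && level + 1 < 32) {
        const uint32_t next = texSize >> (level + 1);
        if (next == 0 || static_cast<float>(next) < extent) break;
        ++level;
    }
    return level;
}
```

With a 512x512 box and a 1024x1024 texture this picks level 1 (the 512x512 mip), so the full-size level never needs to be resident.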
Another big problem in dealing with a separate hardware graphics system is communication. Every state change can be a big deal. Drivers will probably cache things, but not necessarily.
Recently, I wrote an immediate rendering library to draw debug primitives. I had a small pool of vertex buffers that I would rotate as I called Draw() several times per frame. It turned out to be a big slowdown, so big that I had to switch to using one vertex buffer per frame (actually 2 or 3 buffers that rotate once per frame, not at each Draw()).
To do that, of course, I had to keep track of the logical draw calls issued by the application, so that I could unmap/unlock the vertex buffer at the end and issue all the Draw() calls at once, remembering for each one the primitive type, the number of vertices and the associated draw state.
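The bookkeeping could look roughly like the sketch below. It is not my actual library and it makes no Direct3D calls: ImmediateBatcher, DrawCmd and the stateId field are hypothetical names, and the upload and submit steps are passed in as callbacks so the example stays self-contained. Merging consecutive calls that share type and state is an extra assumption, and it is only valid for list-type primitives.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

enum class PrimType { Lines, Triangles };  // list primitives only

// Single vertex format: position, color, texture coordinates, no normals.
struct Vertex { float x, y, z; uint32_t color; float u, v; };

// One recorded logical draw call.
struct DrawCmd { PrimType type; uint32_t stateId; size_t firstVertex; size_t vertexCount; };

class ImmediateBatcher {
public:
    // Record a draw instead of touching the GPU buffer right away.
    void draw(PrimType type, uint32_t stateId, const Vertex* verts, size_t count) {
        if (count == 0) return;
        const size_t first = vertices_.size();
        vertices_.insert(vertices_.end(), verts, verts + count);
        // Same primitive type and state as the previous call: extend it.
        if (!cmds_.empty() && cmds_.back().type == type && cmds_.back().stateId == stateId) {
            cmds_.back().vertexCount += count;
        } else {
            cmds_.push_back({type, stateId, first, count});
        }
    }

    // End of frame: one upload for all vertices, then replay the recorded calls.
    template <class Upload, class Submit>
    void flush(Upload upload, Submit submit) {
        if (!vertices_.empty()) upload(vertices_.data(), vertices_.size());
        for (const DrawCmd& c : cmds_) submit(c);
        vertices_.clear();
        cmds_.clear();
    }

    size_t pendingCommands() const { return cmds_.size(); }
    size_t pendingVertices() const { return vertices_.size(); }

private:
    std::vector<Vertex> vertices_;
    std::vector<DrawCmd> cmds_;
};
```

In a real Direct3D 10 version, the upload callback would be where the dynamic vertex buffer gets mapped, filled and unmapped once, and the submit callback would set the state and issue the actual Draw().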
This all comes down to having to deal with separate architectures. I have my vertex buffer, the card has its vertex buffer.. collect here, copy there, avoid touching this buffer or that buffer.
Small things that show the cat and mouse kind of job of having to optimize rendering using an API such as Direct3D 10.
Lastly, recently I managed to crash nVidia's driver on Vista 8)
I think it has to do with.. big polygons, or polygons going way off.. possibly a clipping bug ? All I know is that my screen goes blank and boom !
The window comes back and it's all black, while I get a balloon message from the sys-tray that says that the driver's process has crashed. From then on I can't do 3D unless I reboot 8)
ehhhhhhhhhhh


have you tried using
Have you tried using DrawPrimitiveUP, or whatever its equivalent is in DX10, instead?
If using DrawPrimitiveUP runs fast, that means you had something wrong in your buffer rotation/filling logic and were causing a stall with the lock/unlock. Or maybe an indexing or a data size issue? Maybe you were copying more data than you thought you were?
I don't think there is a
I don't think there is a direct equivalent of DrawPrimitiveUP in DX10.
The thing is that I was using the immediate rendering for drawing a menu. In the first version I would call a Draw (one quad) for every character. Then I changed that to one Draw per string, plus one for every time I changed primitive type, etc etc.
Alternating only a few buffers and doing all those map/unmap_ (lock/unlock) per frame turned out to be a big performance hit. I assume that's because, by the time I try to overwrite a buffer that is already scheduled to be drawn, DX first flushes the accumulated draws and only then locks it for writing again. That's even though I was locking the buffer as write-discard, which should make it easier to allocate another buffer in the background, as I wasn't really asking for any physical tie to the contents.
Now I have a single buffer that all "immediate" draw-calls share (I only use a single vertex format including color and texture coords but no normals).
I lock it at the beginning of the frame and I unlock it at the end, right before drawing the accumulated draw calls.
In this case I actually alternate between 3 buffers (maybe 2 would be sufficient) so that the next buffer being locked isn't the one I just asked to draw primitives from (basically double buffering the vertex buffer).
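The rotation itself is tiny. Here is a sketch with a made-up BufferRing template, where Buffer stands in for whatever handle wraps the real vertex buffer. In Direct3D 10, beginFrame() would be followed by ID3D10Buffer::Map with D3D10_MAP_WRITE_DISCARD, and Unmap would come right before replaying the draw calls.

```cpp
#include <array>
#include <cstddef>

// Rotates through N buffers, advancing once per frame (not once per Draw()),
// so the buffer being written is never the one most recently submitted.
template <class Buffer, std::size_t N>
class BufferRing {
    static_assert(N > 0, "need at least one buffer");
public:
    // Call once at the start of each frame to pick the buffer to fill.
    Buffer& beginFrame() {
        current_ = (current_ + 1) % N;
        return buffers_[current_];
    }
    std::size_t currentIndex() const { return current_; }

private:
    std::array<Buffer, N> buffers_{};
    std::size_t current_ = N - 1;  // so the first frame uses buffer 0
};
```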
It's a bit of a pain to have to build those immediate rendering functionalities from scratch. I wish DX10 had a better system to deal with buffers from system memory without having to think in terms of creating resources.
It's funny to see those examples (even in DX9) that actually create a 4-vertex vertex buffer to draw a full-screen quad for some post processing... give me a break !
P.S. Somehow Drupal doesn't let me post this reply without putting a character after unmap_ so I put the underbar 8)
No DrawPrimitiveUP??? That's
No DrawPrimitiveUP??? That's sad.... It means any dynamically generated geometry has to be double buffered to avoid stalls... That's an extra memory and complexity requirement for the user app... Having to double buffer something like a 4-point fullscreen quad is really stupid in my opinion... So far I haven't heard anything good about DX10... Have you had a chance to try geometry shaders yet?
Don't worry... M$ will learn
Don't worry... M$ will learn and soon they will launch an all-new DX11 that will be DX9 compatible or something. Somehow nobody has said good things about DX10 yet.
All those APIs are
All those APIs are bullshit.
The general interface is meant to try to guess what one wants to do, and the drivers try to guess things in real-time.
Guess this guess that.. just let me program the thing !
DirectX is not direct and OpenGL is not open.. it's all old bullshit with sugar coating.