More on software rendering and Direct3D 10

Davide's picture

Writing a software renderer is quickly associated with revolutionizing things, but on the more conservative side, one advantage would really be that one has the freedom to optimize things as needed rather than having to try and guess what a driver is doing behind the scenes (maybe write your own driver ?!).
Graphics drivers do a great deal of work and can make a big difference. They can be quite smart at guessing resource usage and what to prioritize on which basis, but they can't be smarter than a whole application.

APIs such as Direct3D and OpenGL don't have a concept of an object (ok, OpenGL supports "lists" at least), so they miss potentially useful hints. For example, if you know your object's bounding box, you can tell right away whether or not the object requires clipping, which textures you need, and the maximum level of detail needed for those textures (if the bounding box covers 512x512 pixels on screen, you can't possibly need a 1024x1024 texture 8).
Based on that knowledge, one could completely avoid per-polygon clipping tests and could also avoid loading full-size textures in video RAM, because a smaller mip level would be sufficient.
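
As a rough illustration of the bounding box idea, here is a minimal C++ sketch. The names `ScreenBox`, `classify` and `finestMipNeeded` are made up for this example and are not part of any API. It also assumes the texture is stretched roughly once across the object, which is a simplification (tiling or strong perspective would change the answer).

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical screen-space bounding box of an object, in pixels.
// x1/y1 are exclusive, so the box is (x1 - x0) wide and (y1 - y0) tall.
struct ScreenBox { int x0, y0, x1, y1; };

enum class ClipClass { Inside, Outside, Straddling };

// Classify the box against a viewport of width x height pixels.
ClipClass classify(const ScreenBox& b, int width, int height) {
    if (b.x1 <= 0 || b.y1 <= 0 || b.x0 >= width || b.y0 >= height)
        return ClipClass::Outside;      // the object can be skipped entirely
    if (b.x0 >= 0 && b.y0 >= 0 && b.x1 <= width && b.y1 <= height)
        return ClipClass::Inside;       // no per-polygon clipping test needed
    return ClipClass::Straddling;       // fall back to per-polygon clipping
}

// Finest mip level worth loading for a square texture of textureSize texels.
// Level 0 is the full-size texture; each following level halves the size.
int finestMipNeeded(int textureSize, const ScreenBox& b) {
    assert(textureSize > 0);
    int extent = std::max(b.x1 - b.x0, b.y1 - b.y0);
    extent = std::max(extent, 1);
    int level = 0;
    // Keep dropping levels while the next smaller one still covers the box.
    while ((textureSize >> (level + 1)) >= extent)
        ++level;
    return level;
}
```

With a 1024x1024 texture and a 512x512 box, `finestMipNeeded` returns level 1, which is the 512x512 mip, so the full-size level never has to be uploaded.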

Another big problem in dealing with a separate hardware graphics system is communication. Every state change can be a big deal. Drivers will probably cache things, but not necessarily.
Recently, I wrote an immediate rendering library to draw debug primitives. I had a small pool of vertex buffers that I would rotate through as I called Draw() several times per frame. It turned out to be a big slowdown, so big that I had to switch to using one vertex buffer per frame (actually 2 or 3, rotated at each frame, not at each Draw()).
To do that, of course, I had to keep track of the logical draw calls issued by the application, so that I could finally unmap/unlock the vertex buffer at the end of the frame and issue all the Draw() calls at once, remembering the primitive type, the number of vertices and the draw state associated with each draw call.
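
The bookkeeping described above could look something like the following sketch. It is not the actual library: `ImmediateBatcher`, `DrawRecord`, `PrimType` and `stateId` are hypothetical names, and the vertices are staged in a `std::vector` instead of being written straight into a mapped buffer. The `upload` and `submit` callbacks stand in for the real map/copy/unmap and draw calls. Merging consecutive calls is only valid for list primitives (line lists, triangle lists), which is what this sketch assumes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical types for illustration; none of these are Direct3D names.
enum class PrimType { Lines, Triangles };

struct Vertex { float x, y, z; unsigned color; float u, v; };

struct DrawRecord {
    PrimType    type;
    std::size_t firstVertex;
    std::size_t vertexCount;
    int         stateId;   // stands in for whatever draw state the call used
};

class ImmediateBatcher {
public:
    // Called many times per frame; only records, never touches the GPU buffer.
    void draw(PrimType type, const Vertex* verts, std::size_t count, int stateId) {
        bool canMerge = !records_.empty() &&
                        records_.back().type == type &&
                        records_.back().stateId == stateId;
        if (canMerge)
            records_.back().vertexCount += count;   // extend the previous record
        else
            records_.push_back({type, vertices_.size(), count, stateId});
        vertices_.insert(vertices_.end(), verts, verts + count);
    }

    // Called once at the end of the frame: one upload, then all recorded draws.
    template <class Upload, class Submit>
    void flush(Upload upload, Submit submit) {
        if (!vertices_.empty())
            upload(vertices_.data(), vertices_.size());
        for (const DrawRecord& r : records_)
            submit(r);
        vertices_.clear();
        records_.clear();
    }

    std::size_t pendingDraws() const { return records_.size(); }

private:
    std::vector<Vertex>     vertices_;
    std::vector<DrawRecord> records_;
};
```

The point of the design is that the buffer is mapped and unmapped once per frame, no matter how many times the application calls `draw`.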

This all comes down to having to deal with separate architectures. I have my vertex buffer, the card has its vertex buffer.. collect here, copy there, avoid touching this buffer or that buffer.
Small things that show the cat and mouse kind of job of having to optimize rendering using an API such as Direct3D 10.

Lastly, recently I managed to crash nVidia's driver on Vista 8)
I think it has to do with.. big polygons, or polygons going way off.. possibly a clipping bug ? All I know is that my screen goes blank and boom !
The window comes back and it's all black, while I get a balloon message from the sys-tray that says that the driver's process has crashed. From then on I can't do 3D unless I reboot 8)

ehhhhhhhhhhh

have you tried using

have you tried using DrawPrimitiveUP or whatever its equivalent is in dx10 instead?

if using DrawPrimitiveUP runs fast, that means you had something wrong in your buffer rotation/filling logic and were causing a stall with the lock/unlock. or maybe an indexing or a data size issue? Maybe you were copying more data than you thought you were?

I don't think there is a

Davide's picture

I don't think there is a direct equivalent of DrawPrimitiveUP in DX10.
The thing is that I was using the direct rendering for drawing a menu. In the first version I would call a Draw (one quad) for every character. Then I changed that for every string, but also for every time I changed primitive type, etc etc.
Alternating between only a few buffers and doing all those map/unmap_ (lock/unlock) calls per frame turned out to be a big performance hit. I assume that's because, by the time I try to overwrite a buffer that is already scheduled to be drawn, DX first flushes the accumulated draws and only then actually locks it for writing again. That said, I was locking the buffer as write-discard, which should make it easier for the driver to allocate another buffer in the background, as I wasn't really asking for any physical tie to the contents.

Now I have a single buffer that all "immediate" draw-calls share (I only use a single vertex format including color and texture coords but no normals).

I lock it at the beginning of the frame and I unlock it at the end, right before drawing the accumulated draw calls.
In this case I actually alternate between 3 buffers (maybe 2 would be sufficient) so that the next buffer being locked isn't the one I just asked to draw primitives from (basically double buffering the vertex buffer).
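
The per-frame rotation can be sketched roughly as follows. `FrameBufferRing` is a made-up helper, and `Buffer` is a placeholder for whatever vertex buffer handle the renderer uses. The only point it makes is that the index advances once per frame, not once per draw call.

```cpp
#include <array>
#include <cstddef>

// Rotates through N buffers, advancing once per frame rather than once per
// draw call. Buffer is a placeholder for a real vertex buffer handle.
template <class Buffer, std::size_t N>
class FrameBufferRing {
    static_assert(N >= 2, "need at least two buffers to rotate between");
public:
    // The buffer to lock and fill during the current frame.
    Buffer& current() { return buffers_[index_]; }
    std::size_t currentIndex() const { return index_; }

    // Call after the frame's draws are issued, so that the next lock
    // targets a different buffer than the one just drawn from.
    void endFrame() { index_ = (index_ + 1) % N; }

private:
    std::array<Buffer, N> buffers_{};
    std::size_t index_ = 0;
};
```

With N = 3, the buffer being filled is never the one handed to the most recent draw, nor the one before it. Whether 2 is enough depends on how many frames the driver queues up, which this sketch does not try to determine.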

It's a bit of a pain to have to build those immediate rendering functionalities from scratch. I wish DX10 had a better system to deal with buffers from system memory without having to think in terms of creating resources.

It's funny to see those examples (even in DX9) that actually create a 4-vertex vertex buffer to draw a full-screen quad for some post processing... give me a break !

P.S. Somehow Drupal doesn't let me post this reply without putting a character after unmap_ so I put the underbar 8)

No DrawPrimitiveUP??? Thats

No DrawPrimitiveUP??? That's sad.... Means any dynamically generated geometry has to be double buffered to avoid stalls... That's an extra memory and complexity requirement for the user app... Having to double buffer something like a 4 point fullscreen quad is really stupid in my opinion... So far I haven't heard anything good about dx10... Have you had a chance to try geometry shaders yet?

Don't worry... M$ will learn

Duddie's picture

Don't worry... M$ will learn and soon they will launch all new DX11 that will be DX9 compatible or something. Somehow nobody yet said good things about DX10.

All those APIs are

Davide's picture

All those APIs are bullshit.
The general interface is meant to try to guess what one wants to do, and the drivers try to guess things in real-time.
Guess this guess that.. just let me program the thing !
DirectX is not direct and OpenGL is not open.. it's all old bullshit with sugar coating.
