New version of Ab3d.DXEngine comes with DirectX command list caching and many new features

by abenedik 1. June 2020 11:48

I am proud to inform you that a new major release of the Ab3d.DXEngine has been released. There is also a new minor version of Ab3d.PowerToys.

Let's start with a question: what makes a rendering engine faster? The answer is simple: a faster rendering engine will render the 3D scene quicker and achieve higher frames per second. This can be achieved in two ways. The first option is to provide special rendering techniques to the end-user that can optimize the rendering (for example, object instancing). The second option is to organize the data for the 3D scene so that it can be sent to the graphics card as quickly as possible and in a way that allows the graphics card to render the scene the most optimally.

This part of the job is done internally by the DXEngine. This process was already highly optimized for CPU, memory access and can be executed using multiple threads. Still, a 3D scene with a few thousand objects may require a few milliseconds of CPU time to render. In this case, a rendering engine written in C++ or another low-level language may be faster because its code can be better optimized.

But with the introduction of DirectX commands caching in the new version of Ab3d.DXEngine, this advantage of low-level languages is significantly reduced. In case when only camera or lights are changed (the majority of cases in a typical business or scientific 3D application), then the new version of Ab3d.DXEngine can render the new frame with only updating the camera and lights data and then "instructing" the graphics card to re-render the new frame with using the DirectX commands that were recorded in one of the previous frames. This requires so little code to be executed on the CPU, that even if the engine were written in C++, the performance difference would not be noticeable.

Therefore I would like to conclude the first part of this post with a statement that in cases when DirectX commands caching can be used (in a typical business 3D application this is most of the cases) the Ab3d.DXEngine is as fast as if it would be written in C++. 

Let me show you that in action:

Ab3d.DXEngine with DirectX command list caching

As seen from this screenshot, in case of rendering 160.000 individual 3D objects, the time required to send all that data to the graphics card is only 0.14 ms (see DrawRenderTime which shows time to execute 160.000 draw calls with all required state changes). This way, it would be possible to achieve almost 3000 FPS. When the command list caching would be disabled, then the DrawRenderTime would be around 16 ms. This is still a fantastic result and can be achieved using multi-threading and many other optimizations.

Another area that received some great new features is rendering transparent objects. Rendering transparent objects can be hard in 3D graphics. The reason for that is that when a transparent object is rendered, the colors of the already rendered objects are blended with the color of the transparent object. This means that if we want to see the objects through transparent objects, they need to be rendered before the transparent object. Usually (when the objects are not self-intersecting), this can be solved with sorting the objects so that the objects that are farther away from the camera are rendered first. To help you with sorting, you can use the TransparencySorter class from the Ab3d.PowerToys library. But this class work only for WPF 3D objects. What is more, sorting WPF 3D objects can be a very slow process and in case of complex hierarchy, it may not be possible to sort all the objects correctly.

The new version of Ab3d.DXEngine provides a better way of sorting transparent objects. With setting the DXScene.IsTransparencySortingEnabled property to true (by default it is set to false to make the engine work as in the previous version), the engine automatically sorts all the objects in the TransparentRenderingQueue. Because the objects there are defined in a flat queue, there is no problem with the hierarchical organization of objects. Also, sorting is highly optimized and uses multi-threading, so it is much faster than TransparencySorter. And finally, the DXEngine's SceneNodes can also be sorted in this way.

Another new feature regarding transparent objects is the added support for alpha-clipping and alpha-to-coverage. Those two techniques can be used to render textures that have opaque and also transparent pixels (for example, text with a transparent background or threes). An advantage of those two techniques is that they do not require the objects to be sorted by camera distance. But they can produce some artifacts on the areas where the transparent part of the texture is transitioned to the opaque part. If this transition is small, then the results can be very good. In the case of using alpha-clipping, the user can select an alpha clip threshold - this is a value that specifies at which alpha value the pixels are clipped. When using alpha-to-coverage, it is possible to use MSAA (multi-sample anti-aliasing) to provide a more accurate level of transparency with making some sub-pixel samples transparent and some opaque. See the comments in the new sample for more information.

There are also improvements in how textures are loaded. First, all textures that are created from WPF BitmapImage objects are checked if they actually contain transparent pixels. In the previous version, only the file format was checked and if it supported transparency, then the texture was considered transparent. In this case, the object that used that texture was put into the TransparentRenderingQueue - requiring alpha blending (more GPU memory transfer), transparency sorting and having no rendering optimizations. In the version, the DXEngine "knows" if the texture does not have any transparent pixels so its object can be put into the StandardGeometryRenderingQueue. This way, it gets support for multi-threaded rendering and DirectX commands caching.

The last new feature related to transparency is that the TextureLoader.LoadShaderResourceView method has been significantly improved. The method loads a texture into a DirectX resource and now also sets a new TextureInfo class that describes the loaded texture: image size, dpi, format, has transparent pixels. That information can help you correctly set the properties in the StandardMaterial or any other material that is showing the texture.

There are also some other great new features. Let me quickly mention some of them.

Object instancing also received a few improvements. The first improvement is that it is now possible to quickly hide (discard) some instances when setting their alpha color to 0. In the previous version, such instances were still rendered - they were not visible because of alpha value 0, but their depth values were still written and therefore this prevented rendering of all objects behind them. So in the previous version, you needed to re-create the InstancesData array with removing some items from the array.

The next new feature is that now you can render instanced objects with a single color without any light shading. Another instancing improvement is that it is now possible to specify the size of each instance in screen-space units. This means that even when you zoom in or out, the size of the instances on the screen will remain the same (the size calculations are done on the graphics card).

Another performance-related improvement is that now multi-threaded rendering is also supported for 3D lines. Before, it was supported only for standard and opaque 3D objects. It is still better to combine many 3D lines into a single 3D line object (for example, MultiLineVisual3D or even better ScreenSpaceLineNode), but if you happen to have many individual 3D line objects, then the new version could now render at more then 4-times the speed.

If you are an advanced DXEngine user, you will be interested in that the GeometryRenderingQueue has been split into 4 different rendering queues: 

  • ComplexGeometryRenderingQueue - used for instanced objects and very complex meshes (with more than 100.000 triangle indices or 20.000 lines) - the idea is to send such objects to GPU as soon as possible,
  • StandardGeometryRenderingQueue - highly optimized for rendering standard 3D objects - supports multi-threading and command list caching,
  • OtherGeometryRenderingQueue - used for other 3D objects with non-standard effects,
  • LineGeometryRenderingQueue - used for opaque and solid color 3D lines - support multi-threading),

The old GeometryRenderingQueue is still defined in the DXScene class. But it is marked as obsolete and points to the OtherGeometryRenderingQueue. If you are using the GeometryRenderingQueue, please change that to any of the 4 new rendering queues instead.

Advanced users will be also happy to hear that there are some interesting improvements in the Diagnostics project that is also used in the DXEngineSnoop tool (see also Diagnostics Guide). Now you can render an object-id bitmap of the 3D scene - there the colors in the bitmap represent the ids of the rendered objects. You can also disable and enable rendering of some RenderingQueues and RenderingSteps. With those new features, you can diagnose some low-level rendering problems and get some additional understanding of how the rendering engine works.

 

As always, there are many other new features, improvements and bug fixes. This is also true for Ab3d.PowerToys. To get the full list of changes see Ab3d.DXEngine versions history and Ab3d.PowerToys versions history.

From the number of support requests, I have seen that the corona crises have significantly slowed down the development worldwide. This lasted for around a month and a half. Then suddenly, there was a burst of activity that highly exceeded the time before the pandemic. So it looks like during the quarantine, you had time to gather many great new ideas. So, when the development started again, you wanted to try them right away. I was pleased to see that. I hope that the newly published versions will allow you to realize your ideas even more quickly and have better results.

Tags: , , , ,

Ab3d.PowerToys | DXEngine