-
Notifications
You must be signed in to change notification settings - Fork 265
Refacatored support for dynamic ViewportState to avoid the CPU overhead assocaited with using it. #1431
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ng dedicated view related state.
…dual sizing of the State::stateStacks and viewStateStacks.
…urceHints/ResourceRequirements.
|
This seems to work with the application we've discussed that originally had problems with resized viewports and sparked the dynamic viewport work. |
|
However, the lazy state tracking does seem to have exacerbated the problem with disturbed descriptors. I didn't come up with a simple repro when I was investigating it earlier, and thought I'd found a typo in the mitigation attempt I mentioned in #1394 rather than it being a problem with this PR, but just realised that there's probably a new problem. I'll try and make a demo for it tomorrow. |
|
@AnyOldName3 I am consider another approach for State::stateStack + viewStateStack, as I'm not happy with ViewportState being on slot 0 by default, as this might cause problems down the line if folks start putting ViewportState on the vsg::StateGroup. The change I'm thinking of would be to have just State::stateStack, set the ViewportState::slot to something like 8, then have maxSlot number for standard subgraphs, a maxSlot for view releated state, and potentially another higher up for RenderGraph/CommandGraph level state. The later is just a possibility. The crux with this approach is whether I can keep the overhead down. With the other bug(s) mentioned in #1394 is it worth me trying/merging out the vsgincompatiblepipelinelayouts example at this point? |
|
I have checked in my refactor and it's even faster than this branch, and simpler, so all good :-) I have generated a PR for these changes: #1433 I will do a code review and continue with other checks if that all goes smoothly I'll close this PR and focus on the new approach. |
|
stateStackPointer-validation-errors.zip Here's a It contains a group whose children are:
The PBR shaderset uses more descriptor sets than the text shaderset, so for one of the slots, the PBR material is still the most recently bound state command, and when it's popped (after the first node's recorded) and repushed (so the third node can be recorded), that slot's marked as dirty on master, but on this branch, that mechanism's scrapped and it still passes the pointer equality test. Binding the text pipeline in between disturbed the descriptor, though, so it must be rebound like it is on master. |
|
Thanks for the test model. When I run it I see a blue cube, when I zoom in I see "some text", as far as I can tell there isn't visual glitches. Should I be expecting some? When I run: vsgviewer stateStackPointer-validation-errors.vsgt -dI get lots of output in the form: Objects: 2
[0] 0xcb1c7c000000001b, type: 19, name: NULL
[1] 0xd10d270000000018, type: 17, name: NULL
VUID-vkCmdDrawIndexed-None-08600(ERROR / SPEC): msgNum: 941228658 - Validation Error: [ VUID-vkCmdDrawIndexed-None-08600 ] Object 0: handle = 0xcb1c7c000000001b, type = VK_OBJECT_TYPE_PIPELINE; Object 1: handle = 0xd10d270000000018, type = VK_OBJECT_TYPE_PIPELINE_LAYOUT; | MessageID = 0x381a0272 | vkCmdDrawIndexed(): The VkPipeline 0xcb1c7c000000001b[] (created with VkPipelineLayout 0xd10d270000000018[]) statically uses descriptor set 1 which is not compatible with the currently bound descriptor set's pipeline layout (VkPipelineLayout 0xd10d270000000018[])
The set (1) is out of bounds for the number of sets bound (1)
. The Vulkan spec states: For each set n that is statically used by a bound shader, a descriptor set must have been bound to n at the same pipeline bind point, with a VkPipelineLayout that is compatible for set n, with the VkPipelineLayout used to create the current VkPipeline or the VkDescriptorSetLayout array used to create the current VkShaderEXT , as described in Pipeline Layout Compatibility (https://vulkan.lunarg.com/doc/view/1.3.290.0/linux/1.3-extensions/vkspec.html#VUID-vkCmdDrawIndexed-None-08600)This is with viewMaxSlot branch derived from this PR's branch. Both this and the viewMaxSlot branch use the pointer based lazy state updating so will assume the problem exists there as well in the same way. From what you've written it sounds like VSG master works fine with this test model. Is that correct? Is this issue related to other one that you've been looking into? |
|
From the fact that we've not seen major problems in the client app we've discussed, where even with the master branch, descriptor-disturbance-related validation errors are emitted for most draw calls if you disable the ten-of-each-validation-error limit that Vulkan's validation layer has by default, I think it's a safe bet that both Nvidia's and AMD's desktop drivers are more leniant with this particular abuse of the API than the specification requires them to be. On a driver that was stricter, the duplicate cube (which doesn't contribute to any fragments as I didn't bother giving it its own transform) would potentially access uninitialised memory for the material descriptor, so could end up entirely the wrong colour or could even crash the driver and take down the whole system. I don't know the prevalence of implementations that have symptoms, whether they're just minor visual glitches or major consequences like kernel crashes. To confirm, the master branch generates no validation errors with that test data and this branch generates the ones you've seen. The issue here is related to #1394 in that in both cases, we're running afoul of the descriptor set disturbance rules, but it's not entirely the same thing as this one's a regression caused by not rebinding things when we used to with obviously sensible scenegraphs, whereas the scenegraphs that trigger #1394 are structured in a way that looks more fragile once you know the disturbance rules, although it would be reasonable for someone who's got scenegraph experience (especially with the OSG) and hasn't been looking into this class of problem recently to expect them to work. |
|
As for the vsgincompatiblepipelinelayouts example, there are potential fixes for the regression from this branch which would also make it work, but they're invasive and would likely have more of a performance cost than you'd be happy with. For example, we could add pipeline layout tracking to |
|
I will need to spend some time understanding the issue properly. As this lazy state updating results in a Vulkan validation error I will try to resolve the validation error being merging with VSG master. FYI, the pointer based lazy state updating approach I've implemented can be reset for each slot or for all slots at any time by resetting the StateStack.stack[0] entry to nullptr. This essentially implements a dirty() do the next call with force a rebind. So.. I guess when the text subgraph pops it'd need to selectively dirty the StateStack's that have been "disturbed." It's curious that NVidia and AMD drivers aren't been "distribued" by what is happening at present, but the Validation layer is. As you say the Validation layer is following specs, rather than going out of it's way to do something as sensible fallback like NVidia and AMD drivers are doing. |
|
The wording of the spec suggests they expect some drivers to have a stack that holds push constants at the top, then descriptor sets in numerical order, and when something gets bound, things have to be popped off the stack to make the thing that's being bound be at the top, so then everything in higher slots needs binding again. If a driver didn't use a stack and used something like a vector that allowed arbitrary access to things in the middle, then rebinding a low slot wouldn't discard anything bound to a higher slot. |
|
I'm currently thinking of looking at dirtying/resetting the State::stateStacks last applied StateCommand when a GraphicsPipeline/ComputePipeline is bound. The lazy state updating should make sure that pipelines are only bounded when they change. Potentially putting the last applied PipelineLayout into State might be a way of make it possible to use different GraphicsPipeline without dirtying the State::stateStacks. |
|
@AnyOldName3 I have changes that fix the Vulkan validation error in the dataset you provided. There is a performance overhead on the models I'm testing, so I'll keep testing and reviewing the code to see if I can lower the overhead. Changes are not too intrusive as part of the functionality was already in place for tracking current pipline layout. |
|
I have checked in my fix for the change of pipeline causing Vulkan validation errors: bc4ce60 I have got the performance overhead down enough that I'm happy to merge with VSG master. We are about 24% faster than VSG master, and between 20% and 67% faster than v1.1.10. Static ViewportState is still faster for the model with lots of trees in the transparent bin, no matter what optimization I've applied I can't get rid of this overhead, we still have a 2.4% overhead. What it looks like to me is the BindGraphicsPipeline is forcing the driver to set up the pipeline each time and when using dynamic ViewportState there is a small overhead. Once a pipeline is bound there doesn't seem to be little or any overhead. The irony is my trying to get rid of this 2.4% overhead didn't result in fixing it, but I ended up fixing other bottlenecks that affected the static and dynamic ViewportState usage to an extent that both are now significantly faster than v1.0.8 and v.1.10. Unfortunately, it's taken me an extra months work to do it... but hey 20-60% performance improvement has to be worth it. |
|
That fix might throw away more performance than it needs to. I've not made a PR yet and it looks like it doesn't build in CI due to either a missing include or not fencing off a check for an extension feature behind conditional compilation to see if the machine's Vulkan SDK has that extension, but last night I implemented this function to determine how compatible two pipeline layouts are here master...AnyOldName3:VulkanSceneGraph:partial-compatibility. If the |
|
@AnyOldName3 I've done my final code review and testing, so far everything looks like with my latest changes, and they resolve the Vulkan validation error in the stateStackPointer-validation-errors.vsgt model. As I'm happy I've gone ahead and merge the new branch with VSG master #1433. This moves the need for testing on any applications that may be of interest. Fingers crossed the PipelineLayout change forcing a State::stateStack::dirty() will force all the state to rebuild that require it. |
A compatibility function is potentially useful regardless of whether it's used at runtime. I have plenty of other work that's been held up by trying to fix the original performance regressions so now we are significantly ahead I want move on. |
Changes also improve performance when using static ViewportState so everyone wins :-)