Async GL shader compilation #576
This is still a WIP, so no worries about incomplete doc comments, undefined constants for GLES targets, formatting, etc. 😄 While implementing the proposed changes for Shader&Program is no more than splitting code into two functions, handling compile states requires some planning. Compilation states, I suppose, should be only data bags. All logic should be in I came up with the following solution and implemented it for FlatGL:
This solution is compact and doesn't change the way Things to consider when generalizing for more than FlatGL:
What are your thoughts on this? |
Hi, apologies for a delayed response here -- I'm stuck deep in a certain other feature and it's taking way more time than expected. Thanks for the code, and especially for the detailed description of your thought process. Very appreciated 👍 I'll post a more detailed reply once I manage to ship that other feature (hopefully later this week), but after a brief look at your initial changes it seems my original proposal wasn't exactly great in terms of the amount of boilerplate inflicted on subclasses. (My fault, sorry.) I have to think more about that. The ideal state would be that subclasses wouldn't need to implement that much extra apart from what they did before (because all duplicated state would also mean twice as much testing effort), only splitting the initial setup somewhat, perhaps at the cost of some safety in case the async compilation is used. Which is fine I guess, it's an advanced feature after all. I have a vague idea that the construction could get split via a chain of delegating constructors, which would either directly call into each other in the straightforward case, or an object gets "partially constructed", waited upon, and then moved to a fully-constructed state once the compilation finishes. I'll outline that better once I get my hands free. Until then, in 74f1778 I added support for |
Just throwing this in as a suggestion before you get back to it.
template<typename FooGL>
struct CompileState {
    private: // no access to shader until compiled
        FooGL asp; // moved in from FooGL::compile
        Shader vert, frag;
        optional<Shader> geom; // or parameterize by N - number of shaders?
};
then FooGL(CompileState<FooGL>&& cs) blocks and finishes construction. |
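To make the suggestion above concrete, here is a minimal, self-contained sketch of a templated compile state. It does not use Magnum or OpenGL: MockShader, MockFlatGL, isReady() and exampleUsage() are hypothetical stand-ins invented for illustration, and "compilation" is faked by checking that the source is non-empty. It only shows the ownership flow: compile() moves the program and its shaders into the state, and the constructor taking the state finishes the job.

```cpp
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for GL::Shader, not the Magnum class
struct MockShader {
    std::string source;
    // Pretend that any non-empty source compiles successfully
    bool checkCompile() const { return !source.empty(); }
};

template<class FooGL> class CompileState {
    public:
        CompileState(FooGL&& program, MockShader vert, MockShader frag,
            std::optional<MockShader> geom = std::nullopt):
            _program(std::move(program)), _vert(std::move(vert)),
            _frag(std::move(frag)), _geom(std::move(geom)) {}

    private:
        friend FooGL; // only the shader class itself may look inside
        FooGL _program; // moved in from FooGL::compile()
        MockShader _vert, _frag;
        std::optional<MockShader> _geom;
};

// Hypothetical shader class standing in for FlatGL
class MockFlatGL {
    public:
        // Kicks off "compilation" and returns immediately
        static CompileState<MockFlatGL> compile(std::string vert, std::string frag) {
            return CompileState<MockFlatGL>{MockFlatGL{},
                MockShader{std::move(vert)}, MockShader{std::move(frag)}};
        }

        // Takes over the partially constructed program, then checks the shaders
        explicit MockFlatGL(CompileState<MockFlatGL>&& state):
            MockFlatGL(std::move(state._program))
        {
            _ready = state._vert.checkCompile() && state._frag.checkCompile() &&
                (!state._geom || state._geom->checkCompile());
        }

        bool isReady() const { return _ready; }

    private:
        MockFlatGL() = default;
        bool _ready = false;
};

// Usage: start compilation, do other work, then finish construction
inline bool exampleUsage() {
    CompileState<MockFlatGL> state = MockFlatGL::compile("vert", "frag");
    MockFlatGL shader{std::move(state)};
    return shader.isReady();
}
```

Because the state's members are private and only the shader class is a friend, user code cannot touch the program before the finishing constructor has run, which is the point of the "no access to shader until compiled" comment.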
Okay, that actually seems like the best middle ground :) The delegating constructors from my comment above wouldn't work on their own because there would be nowhere to store the temporary
class FlatGL::CompileState: public FlatGL {
    private:
        friend FlatGL;
        using FlatGL::FlatGL; // low-effort way to get access to the NoCreate constructor to populate this
        GL::Shader frag, vert;
        GL::Version version; // + whatever else the constructor needs to preserve to avoid duplicated code
};
It's publicly derived, so the getter APIs like Then, there would be Additionally:
Does this make sense? I hope I didn't forget something critical again :) And apologies for the delay. |
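As a rough illustration of the derived-state idea, the sketch below mocks it without Magnum or OpenGL. MockFlatGL, MockShader, NoCreateT, flags() and isLinked() are hypothetical names for this example only, and "linking" is faked by checking both shader sources are non-empty. Unlike the snippet above it uses an explicit private constructor instead of inheriting constructors, purely to keep the access rules easy to follow.

```cpp
#include <string>
#include <utility>

struct NoCreateT {}; // hypothetical tag type for the example

// Hypothetical stand-in for GL::Shader
struct MockShader {
    std::string source;
    bool checkCompile() const { return !source.empty(); }
};

class MockFlatGL {
    public:
        class CompileState;

        // Sets up all state and submits work, but doesn't wait for it
        static CompileState compile(unsigned flags, std::string vert, std::string frag);

        // Finishes construction from a previously returned state
        explicit MockFlatGL(CompileState&& state);

        // Straightforward blocking variant, delegates to the two above
        MockFlatGL(unsigned flags, std::string vert, std::string frag);

        unsigned flags() const { return _flags; }
        bool isLinked() const { return _linked; }

    protected:
        explicit MockFlatGL(NoCreateT) {}

    private:
        unsigned _flags = 0;
        bool _linked = false;
};

// Publicly derived, so getters such as flags() work on the state as well
class MockFlatGL::CompileState: public MockFlatGL {
    private:
        friend MockFlatGL;
        explicit CompileState(NoCreateT): MockFlatGL{NoCreateT{}} {}
        MockShader _vert, _frag;
};

inline MockFlatGL::CompileState MockFlatGL::compile(unsigned flags, std::string vert, std::string frag) {
    CompileState state{NoCreateT{}};
    state._flags = flags;
    state._vert = MockShader{std::move(vert)};
    state._frag = MockShader{std::move(frag)};
    return state;
}

// The cast slices off the base part. Passing std::move(state) directly
// would pick this same constructor again and recurse forever.
inline MockFlatGL::MockFlatGL(CompileState&& state):
    MockFlatGL(static_cast<MockFlatGL&&>(state))
{
    _linked = state._vert.checkCompile() && state._frag.checkCompile();
}

inline MockFlatGL::MockFlatGL(unsigned flags, std::string vert, std::string frag):
    MockFlatGL(compile(flags, std::move(vert), std::move(frag))) {}
```

With this layout the subclass keeps a single setup path: the blocking constructor is just compile() followed immediately by the finishing constructor, so there is no duplicated state to test twice.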
Makes sense and should work 😄 I'll try this out next week |
I sketched the proposed changes. If it looks acceptable, I'll move on to tidying and implementing the same for other shaders.
protected:
/** @brief Construct without running shader compilation and linking */
FlatGL(NoInitT, Flags flags, etc.) that will create the ASP and set variables, but not trigger compilation (or else we're stuck in a loop 😆).
I saw the containers and it made me not go for the split initially. There is a todo to implement stack based allocation, I can try this out in this PR. |
Amazing, thank you. I think the scaffolding essentials are quite close to completion, so I commented more thoroughly -- hope that's okay :)
What's left (apart from eventually stamping this out for the other shader variants once we have the base finalized) is some testing. I imagine this could be sufficient:
- In GL/Test/ShaderGLTest.cpp there's a compile() test case. I'd split it into compile(), compileFailure(), compileAsync() and compileAsyncFailure(), verifying that isCompileFinished() is always true after compile() or checkCompile(), and that the failure message is never printed after submitCompile() but only after checkCompile(). Given there would be the suggested fallback for drivers without KHR_parallel_shader_compile, the test wouldn't really need to behave differently depending on extension presence. On the other hand, drivers can expose the extension but not actually parallelize anything, so I don't see a way to verify if isCompileFinished() actually ever returns false. Though if you have any idea, feel free to try that out.
- Similarly in AbstractShaderProgramGLTest.cpp for the linking case.
- For the flat shader, there's compile() and compileUniformBuffers() in Shaders/Test/FlatGLTest.cpp. Those are instanced and test a ton of combinations. I don't think it's needed to test all that again, I'd only add compileAsync() and compileUniformBuffersAsync() that check a single case with non-trivial input arguments, use the CompileState workflow there, verify that the flags/material/draw count gets properly propagated right from the start and verify that isLinkFinished() returns true after.
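The invariants listed above can be sketched against a mock, without a GL context or the Corrade test suite. MockShader below is a hypothetical class written for this example: it borrows the method names from the discussion, but its constructor, the pollsUntilDone parameter and the error message are assumptions, and a source containing the word "error" is simply treated as a compile failure.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical mock, only the method names follow the discussion above
class MockShader {
    public:
        // pollsUntilDone == 0 models the fallback path where the driver
        // cannot report progress and the query always says "finished"
        MockShader(std::string source, std::ostream& log, int pollsUntilDone):
            _source(std::move(source)), _log(&log), _remaining(pollsUntilDone) {}

        // Only submits the work, never prints anything
        void submitCompile() { _submitted = true; }

        // Non-blocking query, each call lets the mock "driver" progress a bit
        bool isCompileFinished() {
            if(_remaining > 0) --_remaining;
            return _remaining == 0;
        }

        // Blocks until done, the failure message is printed only here
        bool checkCompile() {
            assert(_submitted);
            _remaining = 0;
            const bool ok = _source.find("error") == std::string::npos;
            if(!ok) *_log << "MockShader: compilation failed\n";
            return ok;
        }

        bool compile() { submitCompile(); return checkCompile(); }

    private:
        std::string _source;
        std::ostream* _log;
        int _remaining;
        bool _submitted = false;
};

// What a compileAsyncFailure()-style test could verify
inline bool failureIsDeferred() {
    std::ostringstream log;
    MockShader shader{"error", log, 2};
    shader.submitCompile();
    const bool silentAfterSubmit = log.str().empty();
    const bool failed = !shader.checkCompile();
    return silentAfterSubmit && failed && !log.str().empty() &&
        shader.isCompileFinished();
}
```

A real test would capture the library's error output instead of passing a stream to the shader. The mock takes the stream explicitly only to stay self-contained.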
Tests can be enabled with MAGNUM_BUILD_TESTS and MAGNUM_BUILD_GL_TESTS in CMake, then for easy execution they're all put into Debug/bin or Release/bin. I'm pretty sure you'll manage to get around, but just in case here's a small tutorial for the test suite and here's how debug/warning/error output can be captured and verified. You could use for example Compare::StringHasPrefix to check that the output is reasonable without having to worry about random differences in the output across platforms.
When you push, the GLES2 and GLES3 CI jobs as well as the Android GLES2 job run the GL tests with SwiftShader. I'm not sure if SwiftShader implements this extension (probably not?), but in any case most of the code gets verified. The other jobs don't run any GL tests, at the very least they'll check that the code compiles.
Just minor things, and I'm tired so apologies if my sentences don't make sense :D
I'll give it another look when you get to the tests. Thank you!
As it turns out, querying This means that the following is possible:
FlatGL2D s{{}, 1,1}; // calls checkLink internally
s.isLinkFinished(); // returns false
I've made a small example that works in this odd way on my PC:
I'm not sure if this is the norm, because I couldn't find anything in the Khronos docs on whether it can't be used after linking. In my case, it even returns false when the shader is being used - it seems like the driver just needs some time to notify its completion query mechanism 🤔 Update: I've made a raw OpenGL example to make sure the result is not influenced by Magnum in any way. Same behaviour: LINK_STATUS returns GL_TRUE, but querying completion does not |
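The observation above, a completion query that lags behind an already successful link, can be modelled with a small mock. This is not OpenGL code: MockProgram, its completionLag field and waitUntilComplete() are hypothetical names for illustration, and the lag is simulated by a counter rather than by a real driver. The sketch shows one way calling code could cope, by polling with a short sleep instead of assuming the query is true right after a successful link check.

```cpp
#include <chrono>
#include <thread>

// Hypothetical model of a driver whose completion query lags behind
struct MockProgram {
    bool linkStatus = true; // the link itself already succeeded
    int completionLag = 3;  // number of queries that still report "not done"

    bool queryCompletion() {
        if(completionLag > 0) {
            --completionLag;
            return false;
        }
        return true;
    }
};

// Polls until the completion query reports true. Returns the number of
// unsuccessful polls before that, or -1 if maxPolls was exhausted.
inline int waitUntilComplete(MockProgram& program,
    std::chrono::milliseconds interval, int maxPolls)
{
    for(int i = 0; i != maxPolls; ++i) {
        if(program.queryCompletion()) return i;
        std::this_thread::sleep_for(interval);
    }
    return -1;
}
```

Whether a real driver behaves like this counter is exactly the open question in the comment above, so the sketch only demonstrates the polling pattern, not driver behaviour.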
@mosra just a friendly ping in case you didn't see my message above. No hurry if you're busy 😅 😓 I implemented the proposed test changes. If they look ok, I'll move to the other shaders. However, there is a small issue with the query correctness. I've described it above |
I saw the notifications but didn't find time yet, sorry for the wait again. I'll try to check this later today or tomorrow. |
I'll be away until Monday, but I think that apart from these few comments this is ready to be replicated for other shaders.
Great work, thank you 👍
Ah, nice, so the sleep was enough to fix that strange issue :)
It's just DistanceFieldVector left, right? Awesome. You also managed to match the library coding style quite well, if there are any minor issues left I'll happily fix them post-merge. And I can also write the overview docs / example snippet myself after -- don't want to put all the burden on your first contribution ;)
The linux-nondeprecated build is failing because it seems you're accidentally using a deprecated alias (MeshVisualizer2D instead of MeshVisualizerGL2D, the former is a relic of the past and not present on the non-deprecated build). Easiest way to discover all such cases is if you disable the MAGNUM_BUILD_DEPRECATED CMake option in your local build, then you get the same errors as on the CI.
Well, it does it for the tests 😅 The API still doesn't work "as expected", but I've left a warning in the doc comments
I'll have a pass at the end and try to fix as much as I can. It's just that every time I forget how something is supposed to be aligned, I open a random file and look how it's done there 😄 I know there is Corrade's style guide, but it doesn't cover cases like line-breaking a function with 5 arguments and 3 ifdefs
Just didn't get to it. Thanks for the build option, I have been fishing them out of the logs. I'm more worried about AppVeyor... It spits out like 2k lines of warnings in other parts of Magnum and then the linker fails. There seems to be something wrong with the templated shaders (because MeshVisualizerGL2D is not part of the error). Specifically, the I don't have a dev-ready Windows PC right next to me to test on, so it'd be cool if you could tell me the cause if you know it right away. In case this requires a little investigation, you can leave it on me. |
🙇 thank you! For line breaking, it's comments on 79 cols (mainly just to make them easy to read, unless there's an excessively long identifier or expression that can't really be broken). For other stuff, having the whole signature as a single long line is fine. Preprocessor If I may, the only other thing I noticed is a space before
It complains about a TL;DR: try to deinline the I don't have a Windows machine around either, so I usually just abuse AppVeyor, pushing random tries until it starts cooperating. Feel free to do the same, haha. |
I guess we don't need full checks in constructors
FooGL(CompileState&&) {
    CORRADE_INTERNAL_ASSERT_OUTPUT(checkLink());
    CORRADE_INTERNAL_ASSERT_OUTPUT(cs._vert.checkCompile());
    CORRADE_INTERNAL_ASSERT_OUTPUT(cs._frag.checkCompile());
}
because in case the first check fails, the other two will be skipped. In case the first is successful, the others are guaranteed to be the same |
Thank you! Except for one 32-bit-specific issue, all tests are passing, so that's great.
I didn't have time to make a final review pass the past weekend, but on a quick look everything looks alright. Your comment about CORRADE_INTERNAL_ASSERT_OUTPUT(cs._vert.checkCompile()); etc. makes sense, keep just the checkLink() assert there.
I'll hopefully find time to merge everything the following weekend. No further review rounds necessary I think, I just want to go over everything one last time locally. Thanks a lot for a top-notch contribution! 👍
Okay, finally -- merged this as 96f97d4 and eb17d77, and wrote some introductory docs in 4580c30 and 48326ac. What I only realized when testing locally was that the linker error message alone doesn't contain anything useful if the compilation failed -- somehow I falsely remembered that it also contains the compilation errors. So to provide the full context in case of a failure, as of c9d7365 the But those are all just minor additions on top. Thanks again for this work, and sorry for the extreme delays! 👍 |
#534