Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convergence test and validation experiment Buildkite pipeline #1223

Merged
merged 47 commits into from
Jan 29, 2021

Conversation

ali-ramadhan
Copy link
Member

@ali-ramadhan ali-ramadhan commented Nov 27, 2020

This PR finally sets up a CI pipeline to run convergence tests and validation experiments for CPU and GPU on Buildkite.

Don't think this should run on every push like the main pipeline does and I couldn't figure out how to trigger it via a GitHub comment (see CliMA/slurm-buildkite#13).

We can trigger this pipeline manually from Buildkite and I've scheduled it to run every night at 3am EST (on the master branch).

Cool thing is that it uploads the convergence plots as artifacts so we can view them from Buildkite!

image

Resolves #1216

@ali-ramadhan
Copy link
Member Author

ali-ramadhan commented Nov 27, 2020

Quite concerning that some of the convergence tests do not pass on the GPU...

This is probably due to #1170 since the simpler convergence tests that do not rely on a pressure solver seem to pass (also the fact that it passes forced flow free slip with doubly periodic (x, z) but not with a wall-bounded dimension (x, y)). Hmmm, but the Taylor-Green one is doubly periodic...

image

image

Copy link
Member

@glwagner glwagner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great. Some minor comments:

  • test/test_convergence.jl is rather boiler-plate-y. Is there any way to use a list of convergence tests scripts? This will make it easier to add new tests in the future...

  • Could make sense to raise an issue about converting convergence tests plots to Plots.jl or CairoMakie.jl or whatever when this PR is merged, so we don't have to wrestle PyPlot in the test scripts.

@ali-ramadhan
Copy link
Member Author

  • test/test_convergence.jl is rather boiler-plate-y. Is there any way to use a list of convergence tests scripts? This will make it easier to add new tests in the future...

Yeah it could be more automated. I'll give it a try once the tests are all passing.

  • Could make sense to raise an issue about converting convergence tests plots to Plots.jl or CairoMakie.jl or whatever when this PR is merged, so we don't have to wrestle PyPlot in the test scripts.

I don't think it's worth spending much effort on switching plotting libraries (especially since the plots work now) but I'll use CairoMakie.jl for future plots. CairoMakie.jl doesn't support log axes yet (gotta hack it in for now) but hopefully it'll get added soon.

@navidcy
Copy link
Collaborator

navidcy commented Jan 28, 2021

Some tests fail..

@ali-ramadhan
Copy link
Member Author

All tests should pass (see https://buildkite.com/clima/oceananigans-validation-experiments/builds/96) but Tartarus went down so a lot of builds died leading to failing tests.

@ali-ramadhan ali-ramadhan marked this pull request as draft January 28, 2021 16:16
@ali-ramadhan ali-ramadhan marked this pull request as ready for review January 28, 2021 16:16
xlabel(L"N_x")
ylabel("\$L\$-norms of \$ | c_\\mathrm{sim} - c_\\mathrm{analytical} |\$")
removespines("top", "right")
legend = legend(loc="upper right", bbox_to_anchor=(1.4, 1.0), prop=Dict(:size=>6))
lgd = legend(loc="upper right", bbox_to_anchor=(1.4, 1.0), prop=Dict(:size=>6))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👁️

@ali-ramadhan
Copy link
Member Author

44 successful checks 🚀

I think this PR is ready to be merged. All the existing convergence tests are now in CI and pass, and the plots are uploaded as Buildkite artifacts. This PR might conflict with @francispoulin's PR #1276 (I'm happy to fix conflicts no matter which PR is merged first).

There are a few tests that take a long time. This can be shortened in a future PR as it might involve some trial and error and fiddling with rate of convergence tolerances. Other validation experiments should also be added.

Right now I have set the validation pipeline to only run on master (and every night at 3am ET).

@francispoulin
Copy link
Collaborator

44 successful checks

I think this PR is ready to be merged. All the existing convergence tests are now in CI and pass, and the plots are uploaded as Buildkite artifacts. This PR might conflict with @francispoulin's PR #1276 (I'm happy to fix conflicts no matter which PR is merged first).

There are a few tests that take a long time. This can be shortened in a future PR as it might involve some trial and error and fiddling with rate of convergence tolerances. Other validation experiments should also be added.

Right now I have set the validation pipeline to only run on master (and every night at 3am ET).

Please go ahead and don't wait for me. I will work around whatever you have done here in my PR.

@ali-ramadhan ali-ramadhan merged commit d5171c3 into master Jan 29, 2021
@ali-ramadhan ali-ramadhan deleted the ali/validation-pipeline branch January 29, 2021 15:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Run convergence tests on GPU
4 participants