Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Subdomain support for Interfaces + AllocCheck tests for iterators + Sparsitypattern bugfix #1108

Open
wants to merge 28 commits into
base: master
Choose a base branch
from

Conversation

AbdAlazezAhmed
Copy link
Collaborator

@AbdAlazezAhmed AbdAlazezAhmed commented Nov 7, 2024

fixes #1102, #1103, #1110, and maybe #850
Also a bit better performance

some random benchmark:
julia> function foo(dh, top)
           for i in InterfaceIterator(dh, top)
           end
           end
foo (generic function with 1 method)

julia> using BenchmarkTools

julia> grid = generate_grid(Hexahedron, (20,20,20))
Grid{3, Hexahedron, Float64} with 8000 Hexahedron cells and 9261 nodes

julia> topology = ExclusiveTopology(grid);

julia> ip = DiscontinuousLagrange{RefHexahedron, 1}();

julia> dh = DofHandler(grid);

julia> add!(dh, :u, ip);

julia> close!(dh);

julia<050f226>> @benchmark foo($dh, $topology)
BenchmarkTools.Trial: 2131 samples with 1 evaluation.
 Range (min  max):  1.960 ms   28.916 ms  ┊ GC (min  max): 0.00%  89.13%
 Time  (median):     2.151 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.341 ms ± 739.360 μs  ┊ GC (mean ± σ):  1.66% ±  4.48%

   ▅▆▆█▅▄▃▁▁ ▂▃▄▃▅▅▄▃▃▃▂▂▁▁▁▁                                 ▁
  ▇█████████▇█████████████████▆▇▅▇▅▆▅▅▆▆▇▇█▆██▇▅▅▅▅▁▁▁▄▄▄▆▆▄▅ █
  1.96 ms      Histogram: log(frequency) by time      3.74 ms <

 Memory estimate: 1.86 MiB, allocs estimate: 42.

julia<master>> @benchmark foo($dh, $topology)
BenchmarkTools.Trial: 1605 samples with 1 evaluation.
 Range (min  max):  2.567 ms  66.080 ms  ┊ GC (min  max): 0.00%  93.99%
 Time  (median):     2.784 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   3.109 ms ±  1.686 ms  ┊ GC (mean ± σ):  4.78% ±  9.39%

  ▇█▇▆▆▄▁   ▃▅▅▅▄▄▃▃▁ ▁                  ▁▂▁                 ▁
  █████████▇█████████▇█▇▄▆▄▆▆▄▆▄▅▁▅▇▆█▇███████▇▆▅▄▄▄▆▆▆▇▆▆▆▅ █
  2.57 ms      Histogram: log(frequency) by time     5.18 ms <

 Memory estimate: 2.99 MiB, allocs estimate: 124993.

Copy link

codecov bot commented Nov 8, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.69%. Comparing base (3b081be) to head (a38ed02).
Report is 1 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1108      +/-   ##
==========================================
- Coverage   93.57%   92.69%   -0.89%     
==========================================
  Files          39       39              
  Lines        6071     6143      +72     
==========================================
+ Hits         5681     5694      +13     
- Misses        390      449      +59     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Project.toml Outdated Show resolved Hide resolved
Copy link
Collaborator Author

@AbdAlazezAhmed AbdAlazezAhmed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some points for feedback

@@ -12,7 +12,7 @@ Access some grid representation for the dof handler.
"""
get_grid(dh::AbstractDofHandler)

mutable struct SubDofHandler{DH} <: AbstractDofHandler
mutable struct SubDofHandler{DH, CT} <: AbstractDofHandler
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes us avoid dynamic dispatch for getting nodes/coords

"""
An `InterfaceIndex` wraps an (Int, Int, Int, Int) and defines an interface by pointing to a (cell_here, facet_here, cell_there, facet_there).
"""
struct InterfaceIndex <: BoundaryIndex
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be unnecessary but I think it's convenient to have it

@@ -165,6 +165,7 @@ function _create_boundaryset(f::Function, grid::AbstractGrid, top::ExclusiveTopo
cell_idx = ff_nh_idx[1]
facet_nr = ff_nh_idx[2]
cell = getcells(grid, cell_idx)
length(facets(cell)) < facet_nr && continue
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix for #1110

flags::UpdateFlags = UpdateFlags()
)
grid = gridordh isa DofHandler ? get_grid(gridordh) : gridordh
set = Set(create_boundaryfacetset(grid, topology, _ -> true))
Copy link
Collaborator Author

@AbdAlazezAhmed AbdAlazezAhmed Nov 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm using sets because ordered sets can allocate while iterating for some reason
And they do something a bit strange I'll have to check it again to make sure it's actually happening not just me hallucinating
They allocate on x86 but not on Arm :D

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @fredrikekre any idea what is going on here?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds very strange.. Needs some kind of repro to understand.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe they rehash (if necessary) before iteration.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So It turns out the alloc deference between arm and x86 was because I was testing the constructor too.
There's quite a big difference in nallocs for some reason with OrderedSet

a = Set(rand(Int, 1000));
@benchmark OrderedSet($a)

Arm (Snapdragon X Elite - X1E80100) WSL2 with the old kernel - generic arm cpu build:

BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min  max):  19.900 μs  115.683 ms  ┊ GC (min  max):  0.00%  99.66%
 Time  (median):     29.100 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   58.872 μs ±   1.163 ms  ┊ GC (mean ± σ):  21.40% ±  1.69%

  █▇▆▂▁ ▅▆▆▆▃▂                                                 ▂
  █████████████▇▇▇▅▅▆▃▄▄▄▃▄▄▃▅▁▃▄▁▁▁▁▃▃▁▃▄▄▃▁▄▃▄▄▅▇▆▆▆▆▅▆▇▇▇▆▆ █
  19.9 μs       Histogram: log(frequency) by time       298 μs <

 Memory estimate: 51.96 KiB, allocs estimate: 20.

x86 (i5-13500):

BenchmarkTools.Trial: 10000 samples with 1 evaluation.
Range (min  max): 24.046 μs   14.743 ms ┊ GC (min  max):  0.00%  95.28%
Time (median):    26.610 μs              ┊ GC (median):    0.00%
Time (mean ± σ):  34.303 μs ± 229.372 μs ┊ GC (mean ± σ): 11.95% ± 1.82%
▄▆██▇▅▄▄▄▄▄▃▂▂▁▂▂▁▁▁▁                   ▁▁▁▁▁ ▁▁▁▁▁      ▂
███████████████████████▇▇█▇▇▇█▆▆▆▅▅▅▅▅▆▇█████████████████▇▅▆ █
24 μs        Histogram: log(frequency) by time     55.5 μs < 

Memory estimate: 113.98 KiB, allocs estimate: 1052.

and it scales with the number of elements (for 10k elements -> arm:28 , x86: 10079)
And yes, rehashing seems to be the issue for static analysis :'D

@@ -200,4 +200,21 @@
addboundaryfacetset!(grid, topology, "test_boundary_facetset", filter_function)
@test getfacetset(grid, "test_boundary_facetset") == Ferrite.create_boundaryfacetset(grid, topology, filter_function)
end

@testset "addboundaryset Mixed grid" begin
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test for #1110

Comment on lines +209 to +220
for (sdh_here_idx, sdh_here) in enumerate(dh.subdofhandlers)
for (sdh_there_idx, sdh_there) in enumerate(dh.subdofhandlers)
sdh_there_idx < sdh_here_idx && continue
iv = InterfaceValues(
qr_collection[sdh_here_idx], ip_collection[sdh_here_idx],
qr_collection[sdh_there_idx], ip_collection[sdh_there_idx]
)
test_interfacevalues(
iv, sdh_here, sdh_there
)
end
end
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking this might be the way to go for subdomains + interfaces? -until we get multicell/facet/interfacevalues-

@AbdAlazezAhmed AbdAlazezAhmed changed the title [WIP] Subdomain support for InterfaceIterator Subdomain support for Interfaces + AllocCheck tests for iterators + Sparsitypattern bugfix Nov 13, 2024
@AbdAlazezAhmed AbdAlazezAhmed marked this pull request as ready for review November 13, 2024 19:51
@AbdAlazezAhmed
Copy link
Collaborator Author

#1100 should be merged first iirc. This one needs some feedback first esp with using Set instead of OrderedSet for performance

@termi-official termi-official added the awaiting review PR is finished from the authors POV, waiting for feedback label Nov 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting review PR is finished from the authors POV, waiting for feedback
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support subdomains in InterfaceIterator
4 participants