
bundle computation of sets of ndmeasure functions #231

Open
unidesigner opened this issue May 20, 2021 · 7 comments
@unidesigner

If I would like to compute several of the ndmeasure functions over my image data (similar to scikit-image's `regionprops`), is there a way to do that, i.e. to save compute and I/O? I haven't seen this discussed here or elsewhere. Thank you!

@jakirkham
Member

We've discussed adding regionprops in other issues and offline, so thanks for surfacing this as a proper issue. It doesn't exist today, but I agree this would be very valuable and seems worthwhile to include.

One thing worth noting that may help in the short term: under the hood we use `labeled_comprehension` to implement these different measurement functions, so it can be a good way to shoehorn in one's own label-based computation.
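As a minimal sketch of that shoehorning approach, here is `labeled_comprehension` computing a custom per-label statistic. This uses `scipy.ndimage`, whose signature `dask_image.ndmeasure.labeled_comprehension` mirrors; the toy array and the chosen statistic are illustrative assumptions, not from the issue:

```python
import numpy as np
from scipy import ndimage

# Toy image and label mask (illustrative data).
image = np.array([[1.0, 2.0, 0.0],
                  [3.0, 4.0, 0.0],
                  [0.0, 0.0, 9.0]])
labels = np.array([[1, 1, 0],
                   [1, 1, 0],
                   [0, 0, 2]])

# Apply a custom per-label statistic (here: peak-to-peak range)
# to labels 1 and 2 in a single labeled_comprehension call.
ptp = ndimage.labeled_comprehension(
    image, labels, [1, 2],
    lambda v: v.max() - v.min(),
    np.float64, 0.0)
print(ptp)  # [3. 0.]
```

Swapping `scipy.ndimage` for `dask_image.ndmeasure` and NumPy arrays for Dask arrays gives the chunked, lazy version of the same computation.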

cc @jni (who also likely has thoughts here 🙂)

@GenevieveBuckley
Collaborator

I think some of the holdup for a version of regionprops in dask-image had to do with the `find_objects` function.

But as I recall, there are a fair few 'easy' properties we could support, several tricky ones, and a few very hard ones. We shouldn't let the hard stuff completely block progress on supporting the easier things.

I was surprised I didn't find more notes about those old discussions with @jni in the issue threads, so thank you for making a dedicated issue for this, @unidesigner.

@unidesigner
Author

Thanks for the feedback. When trying to run a single ndmeasure function, I ran into another issue. I load a large 3d zarr image and want to process a large subvolume. The processing used up all the memory and was killed by the OS, so I am not sure how to proceed and/or where to ask for help. I did not find documentation on how to limit threads/CPU/memory usage or similar to control dask-image's parallel operations. (Sorry for asking this here off-topic.)
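On the resource-limiting part of the question, one knob lives in Dask itself rather than in dask-image: the scheduler configuration. A minimal sketch, assuming the default local threaded scheduler is in use:

```python
import dask

# Cap the local threaded scheduler at 2 worker threads for everything
# computed inside this context. (Assumption: the default "threads"
# scheduler; with dask.distributed you would instead size the
# LocalCluster, e.g. via n_workers, threads_per_worker, memory_limit.)
with dask.config.set(scheduler="threads", num_workers=2):
    assert dask.config.get("num_workers") == 2
    # result = some_ndmeasure_computation.compute()  # hypothetical
```

Note this caps parallelism, not peak memory; per-worker memory limits require the distributed scheduler.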

@jakirkham
Member

Currently the ndmeasure functions work by loading all the values for a particular label into a single chunk of memory. So I guess the question is: how large is each of your labels?

@unidesigner
Author

My entire 3d zarr array is about 120k x 70k x 7k voxels (`dask.array.from_zarr`), and I slice into a 10k x 10k x 2000 voxel subvolume (regular slicing syntax), so it's quite a lot of tasks. Individual labels have no more than approx. 1000 voxels, but there are many of them in the subvolume. Would this explain the out-of-memory behavior?
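For scale, a back-of-the-envelope estimate of the subvolume size (the dtype here is an assumption; substitute whatever the zarr array actually stores):

```python
# Subvolume dimensions from the comment above.
voxels = 10_000 * 10_000 * 2_000           # 2e11 voxels
bytes_per_voxel = 4                        # assumption: e.g. uint32 labels
total_gb = voxels * bytes_per_voxel / 1e9
print(f"{total_gb:.0f} GB")                # 800 GB if fully materialized
```

So even a modest dtype puts the subvolume in the hundreds of gigabytes; chunked, out-of-core execution only helps if no single step needs to materialize the whole thing at once.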

@GenevieveBuckley
Collaborator

This blogpost and links in it might give you a few suggestions for looking at where all your memory is going:

@unidesigner
Author

Thanks @GenevieveBuckley I will have a look.
