Use xarray data structures inside models #141
Comments
Thanks for the reply @benbovy.
I have spent little time in the package code, but it seems that one could quite easily expose the underlying

@xs.process
class Foo:
    # Kwarg in `xsimlab.variable` triggers call to `utils.variable_dict` or similar
    as_attr = xsimlab.variable(..., as_attr=True)

    def initialize(self):
        assert isinstance(self.as_attr, attr.Attribute)

Caveat of above: I don't believe that

@xs.process
class Foo:
    # Retrieve the value separately
    bar = xsimlab.variable(...)
    # This function would now return an item from `utils.variable_dict()`
    bar_metadata = xsimlab.variable_info(..., verbose=False)
    # ...or just a new function
    bat_metadata = xsimlab.variable_metadata(...)

I agree that the second (non-breaking) option is preferable here. A user-side opinion about this feature: I will almost always prefer to write my own process
Mmm I'm not sure that we really need something like
I'm not sure either if it's a good idea to manually create

@xs.process
class Foo:
    bar = xsimlab.variable(dims=('x', 'y'))

    def initialize(self):
        self.bar = xarray.DataArray(..., dims=('y', 'z'))

From your example in #140, it would definitely be possible to automatically create a DataArray for your
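The mismatch in the example above (a variable declared with dims ('x', 'y') but assigned a value labelled ('y', 'z')) is the kind of error an automatic check could catch. The following is only a rough sketch of such a check: `check_dims` is a hypothetical helper, not part of xarray-simlab, and it works on plain tuples of labels so that it runs with the standard library alone.

```python
def check_dims(declared, assigned):
    """Return the declared order if `assigned` uses the same labels, else raise.

    Hypothetical helper for illustration; not an xarray-simlab function.
    """
    declared, assigned = tuple(declared), tuple(assigned)
    # Compare labels regardless of order: a pure reordering can be fixed
    # by transposing, but a different set of labels cannot.
    if sorted(assigned) != sorted(declared):
        raise ValueError(
            f"dimensions {assigned} do not match declared dimensions {declared}"
        )
    return declared


# Same labels in a different order: acceptable, the value could be transposed.
print(check_dims(("x", "y"), ("y", "x")))

# The mismatch from the example above: declared ('x', 'y'), assigned ('y', 'z').
try:
    check_dims(("x", "y"), ("y", "z"))
except ValueError as err:
    print(err)
```

With real DataArray objects, the returned order could presumably be passed to `DataArray.transpose` to align the value with the declaration.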
Good point. I overlooked the fact that
I see your point here. Once the
Again, thanks for your responsiveness and great work on this project!
Thanks!
Yes please! This is very much appreciated!
Thinking again about this, a possible API would be:

# Get the value of `self.var` as a DataArray. If `self.var` is not a DataArray,
# construct a new DataArray on the fly by retrieving metadata and coordinates
# from the model. If it is already a DataArray, simply return it.
value = xsimlab.getattr_as_dataarray(self, 'var')

# Set `self.var` with `value` coerced into a DataArray. If `value` is not a DataArray,
# try creating a new one by retrieving metadata and coordinates from the model.
# If it is already a DataArray, perform some sanity checks to ensure that dimensions are
# compatible and add missing coordinates / attributes.
xsimlab.setattr_as_dataarray(self, 'var', value)

The user still has full control over the values assigned to 'inout'/'out' variables, but using the functions above provides a way to get/set values that is both convenient and safe. I think it's safer to let xarray-simlab handle coercing values into DataArray objects -- i.e., infer dimension labels from the shape of the (unlabelled) input array, maybe transpose the dimensions of the input DataArray, etc. -- rather than let the user do it manually. It can all be done automatically.

We could expose a global option in xarray-simlab so that the two functions above are implicitly called in xsimlab variables' getter/setter properties, respectively. With this option activated, the code below would have the exact same behavior as the code above:

value = self.var
self.var = value
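As a rough illustration of the "infer dimension labels from the shape" step mentioned above, here is a minimal sketch. It assumes that a variable may declare several alternative dims and that the model knows the size of each dimension; `infer_dims` and its arguments are hypothetical names, not xarray-simlab API. It uses plain tuples and dicts so that it runs without xarray installed.

```python
def infer_dims(shape, declared_dims, dim_sizes):
    """Pick the declared dims tuple whose sizes match `shape`.

    shape: shape of the unlabelled array, e.g. (3, 4)
    declared_dims: candidate dims tuples, e.g. [(), ("x",), ("x", "y")]
    dim_sizes: mapping of dimension label -> size known from the model

    Hypothetical helper for illustration; not an xarray-simlab function.
    """
    shape = tuple(shape)
    matches = [
        tuple(dims)
        for dims in declared_dims
        if len(dims) == len(shape)
        and all(dim_sizes.get(d) == n for d, n in zip(dims, shape))
    ]
    # Refuse to guess when nothing matches or when several candidates do.
    if len(matches) != 1:
        raise ValueError(f"cannot infer dimensions for shape {shape}: {matches}")
    return matches[0]


sizes = {"x": 3, "y": 4}
candidates = [(), ("x",), ("x", "y")]
print(infer_dims((3, 4), candidates, sizes))
print(infer_dims((), candidates, sizes))
```

The inferred labels could then be given as the `dims` argument of `xarray.DataArray`. Raising on ambiguity (for example two candidate dimensions of the same size) is a design choice in this sketch; an actual implementation might resolve it differently.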
Or maybe the right place to expose this option is the
A concrete example I am working on now is a model in which a process would obtain, say, ISRIC soil data using OWSLib's WebCoverageService and then load it with rioxarray. Then if you have an input rioxarray raster, convenience functions such as
from @eho-tacc's https://github.com/benbovy/xarray-simlab/issues/140#issuecomment-709586253
Even though I haven't had any use case for this yet, accessing variable metadata from inside process class methods would probably make sense indeed. Right now xarray data structures are used for the model "outer" interface only, but I've been wondering if it would make sense to also leverage it inside models.
Since xsimlab.variable attributes contain all the metadata needed to wrap their values as xarray variables in model inputs/outputs, nothing prevents doing the same in process classes too, i.e.,
We could probably look model-wise for xsimlab.index variables to automatically populate xarray.DataArray coordinates.
That said, I'm not sure if the example above should be the default behavior (this would be a breaking change). Maybe an option or flag exposed somewhere? I don't know where exactly... Or maybe an explicit function? E.g.,
I like the latter option, although it might be quite verbose if we want this as the default behavior.