What are the expected semantics of changing a method in a class? #35

ltratt · 2020-05-31T07:15:46Z

Consider this program:

mutate_methods = (
    run = (
        mutate_methods methods at:2 put: #a.
        self f.
    )

    f = ( ^#b. )
)

[There is obviously a fragile assumption here that the f is the second method in methods, but let's not worry about that for the time being.]

On Java SOM it crashes on the self f call:

Exception in thread "main" java.lang.ClassCastException: class som.vmobjects.SSymbol cannot be cast to class som.vmobjects.SInvokable (som.vmobjects.SSymbol and som.vmobjects.SInvokable are in unnamed module of loader 'app')

which makes sense. However this seemingly inconsequential variation:

mutate_methods = (
    run = (
        self f.
        mutate_methods methods at:2 put: #a.
        self f.
    )

    f = ( ^#b. )
)

does not crash, presumably because the first self f call populates an inline cache which is not invalidated when the methods Array is mutated.

I assume that this is probably not the intended semantics for this second program (or perhaps it is?), but I'm not sure if e.g. a) SOM should notice the method change b) forbid chnages to the methods Array (e.g. via some sort of "this Array can't be altered?"). Any thoughts?

The text was updated successfully, but these errors were encountered:

smarr · 2020-05-31T10:08:14Z

Thoughts: this is an unsatisfactory situation and reflection is considered unsafe in all SOM implementations.

I believe it is an example of "We know" how to make it fast and correct, and thus, noone ever bothered with it...

In Smalltalk, one would invalidate the caches. At least some Smalltalks assume they are used by "responsible programmers" and put the burden on the programmer by having an explicit API for invalidation: a flushCache primitive http://www.mirandabanda.org/bluebook/bluebook_chapter29.html
It's typically called by the "compiler" or method dictionary implementation.
Another reason why Smalltalk is an inherently unsafe language...

I am not entirely sure who added the lookup caching in SOM. Looking at the available code history it may have involved @mhaupt or @krono, but in the end, I am the one to blame for the current situation.

The "easy" thing to do would be to wrap the method array with something that invalidates all caches on write. In fast systems, one wants a more fine-grained approach, e.g. only removing relevant cache entries.

Currently, none of the SOMs do that. Reflection is simply not used in a mutating way.

ltratt · 2020-05-31T10:22:59Z

I guess there are two questions that we can tease apart here:

What are the intended semantics. I think you're saying that SOMs should automatically crash on the second of the two examples I gave without the user having to do anything else? [i.e. I don't think you're saying that flushCache is a good idea?]
How to make the intended semantics fast.

smarr · 2020-05-31T10:37:00Z

In an ideal world, reflective changes would be safe.

This may include an exception when putting a non-callable object into the method array.
It should likely automatically invalidate lookup caches, as class loading in Java does. Java being an example for a safe language here...

ltratt · 2020-05-31T10:43:04Z

Can we choose one of those models so that I can implement it? :p

smarr · 2020-05-31T10:52:18Z

Sure!

ltratt · 2020-05-31T11:17:23Z

I probably deserved that ;)

I will definitely invalidate caches. I'm not sure about the "raise an exception when something incorrect is put in there", because that makes Arrays appear to behave differently in ways that users can't reflect on (i.e. if I pass you an array, you can't tell if putting a symbol in it will bork until you put a symbol in it). But if you tell me that's the SOM way then I will do it!

smarr · 2020-05-31T11:38:27Z

Yes, invalidating caches is the best first step :)

krono · 2020-06-02T07:43:15Z

For the record, CSOM did not have any lookup cache and both snippets crash at the second invocation of #f.

129: Call the System initialize: method r=ptersilie a=ltratt SOM runs programs by calling the `initialize:` method in `System.som` and having it load the user's class. This PR makes SOM do just this (f942f27). To get to that point, we have to do various bits of preparatory work, notably supporting the `methods` (c64f237) and `signature` (926cd5e) primitives. Note that the `methods` primitive exposes a mutable array and we are thus currently subject to the same flaw as Java SOM in this regard (see SOM-st/SOM#35). Fixing that is for a later PR. Co-authored-by: Laurence Tratt <[email protected]>

130: Invalidate inline caches on method mutation r=ptersilie a=ltratt This fixes the problem identified in SOM-st/SOM#35 and which, AFAIK, affects all SOM implementations: mutating a method in a class causes incorrect behaviour if an inline cache exists for that call (e.g. yksom (before this PR) and Java SOM could "ignore" the change if a cache had been created; and C SOM apparently crashes!). This PR fixes the problem for yksom. It does so in a brute force way: when a method in any class is changed, every inline cache in the VM is reset. This makes correctness easy to reason about, though clearly it is less efficient than more nuanced approaches. The way we do this is sort-of interesting from a Rust perspective. There are two obvious ways to model this: * Have every SOM `Array` have a `bool` that says "if I'm mutated, flush all inline caches". * Subclass the `Array` struct (in Rust) so that only instances of the `MethodArray` subclass flush inline caches on mutation. The former is ugly, the latter seems impossible: Rust doesn't have subclasses, and traits only partially model them. In this case, the `bool` would probably be most efficient, but it's just really ugly. So I tried to make the latter case work, even though it doesn't appear to be possible. In particular, and perhaps surprisingly (though there are good reasons for this), a trait object `T1 + T2` cannot be coerced into a trait object of either `T1` or `T2`. This PR uses a simple way to solve this problem: 1. We define a trait `Array` 2. We define two structs `NormalArray` (which is the old `Array` struct renamed) and `MethodsArray` (which is only used for the arrays of methods in `Class`es). 3. We add a `to_array` function to `Obj` which returns (simplified) `&dyn Array`. Given a `&dyn Obj` we can then turn it into a `&dyn Array` by calling `to_array` *if and only if* an implementation of `to_array` is provided. Only `NormalArray` and `MethodsArray` implement this. Using this, we can get from a `Val` to a `&dyn Array` by calling (simplified) `val.tobj().to_array()`. Yes, it means we have two dynamically dispatched function calls, but it means that we have a relatively nice way of separating out the behaviour of normal arrays from method arrays. Most of the PR is thus shuffling code around (and duplicating some code): the real meat is in 0889b12. Co-authored-by: Laurence Tratt <[email protected]>

smarr added language design This issue requires design decisions spec Needs specification, or is specification related. bug Fixes an issue, incorrect implementation help wanted Pull requests very welcome (usually easier than other issues) labels May 31, 2020

ltratt changed the title ~~What are the expected semantics of chaning a method in a class?~~ What are the expected semantics of changing a method in a class? Jun 2, 2020

ltratt mentioned this issue Jun 2, 2020

Call the System initialize: method softdevteam/yksom#129

Merged

ltratt mentioned this issue Jun 4, 2020

Invalidate inline caches on method mutation softdevteam/yksom#130

Merged

smarr mentioned this issue Jun 5, 2020

What should happen when a class's fields are mutated? #40

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

What are the expected semantics of changing a method in a class? #35

What are the expected semantics of changing a method in a class? #35

ltratt commented May 31, 2020

smarr commented May 31, 2020

ltratt commented May 31, 2020

smarr commented May 31, 2020

ltratt commented May 31, 2020

smarr commented May 31, 2020

ltratt commented May 31, 2020

smarr commented May 31, 2020

krono commented Jun 2, 2020

What are the expected semantics of changing a method in a class? #35

What are the expected semantics of changing a method in a class? #35

Comments

ltratt commented May 31, 2020

smarr commented May 31, 2020

ltratt commented May 31, 2020

smarr commented May 31, 2020

ltratt commented May 31, 2020

smarr commented May 31, 2020

ltratt commented May 31, 2020

smarr commented May 31, 2020

krono commented Jun 2, 2020